CUDA-4.4

Do not vary launch bounds for a global function

Required inputs: IR

CUDA 4.4 [kernel.launch_bounds] Do not vary launch bounds for a global function

The launch bounds specified on a global function shall be the same for all other declarations of the same function in the same device program.

Scope: Host, Device.
Audience: CUDA C++, CUDA Libraries.
Category: Mandatory.
Hardware Applicability: All Compute Capabilities.
Rationale

The implementation can track only one set of launch bounds per function. If different declarations provide different bounds, it is unknown which values take effect.

Example: 1 (Bad)
// translation unit a.cu:

__global__ void __launch_bounds__(1,1) f();

#if __CUDA_ARCH__ == 600
__global__ void __launch_bounds__(1,1) g();
#endif

// translation unit b.cu:

__global__ void __launch_bounds__(2,1) f();
  // non-conformant: the launch bounds of the function 'f' differ across
  // translation units.

#if __CUDA_ARCH__ == 700
__global__ void __launch_bounds__(2,1) g();
  // conformant: the launch bounds of the function 'g' differ across translation
  // units, but for different device programs.
#endif
Excerpt from NVIDIA CUDA C++ Guidelines for robust and safety-critical programming, Version 3.0.1, Copyright (C) 2018-2023 NVIDIA Corporation.
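
One way to keep every declaration of a kernel in agreement is to spell the bounds once in a shared header and reuse that spelling everywhere. The following is a minimal sketch under stated assumptions, not part of the guideline: the header name, the macro names (F_MAX_THREADS_PER_BLOCK, F_MIN_BLOCKS_PER_SM, F_LAUNCH_BOUNDS), the values 256 and 2, and the helper block_size_ok are all hypothetical. The fallback definitions at the top only exist so the sketch also parses with a plain host C++ compiler.

```cuda
// Hypothetical shared header (for example kernels.cuh), included by a.cu and b.cu.
// All names and values below are illustrative assumptions.

// Fallback so the sketch also parses with a plain host C++ compiler;
// nvcc defines __CUDACC__ and supplies the real keywords.
#ifndef __CUDACC__
#define __global__
#define __launch_bounds__(...)
#endif

// Single source of truth for the launch bounds of kernel f.
#define F_MAX_THREADS_PER_BLOCK 256
#define F_MIN_BLOCKS_PER_SM 2
#define F_LAUNCH_BOUNDS \
    __launch_bounds__(F_MAX_THREADS_PER_BLOCK, F_MIN_BLOCKS_PER_SM)

// Every declaration of f spells its bounds through the same macro, so the
// declarations seen by different translation units cannot drift apart.
__global__ void F_LAUNCH_BOUNDS f();  // as seen by a.cu
__global__ void F_LAUNCH_BOUNDS f();  // as seen by b.cu

// Host-side helper that checks a requested block size against the same constant.
constexpr bool block_size_ok(int threads_per_block) {
    return threads_per_block > 0 &&
           threads_per_block <= F_MAX_THREADS_PER_BLOCK;
}
```

With this layout, changing the bounds for f means editing one macro, and host code that chooses a block size can consult the same constant instead of repeating a literal.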

Possible Messages

Key: launch_bounds_mismatch
Text: Mismatch in bounds specified by __launch_bounds__ across different declarations
Severity: None
Disabled: False

Options