CUDA-1.1¶

Use CUDA after host program initiation and before host program termination

Required inputs: IR

CUDA 1.1 [device-dependent] Use CUDA after host program initiation and before host program termination

Perform the following operations only after host program initiation and before host program termination (not before or after main):

Call device-dependent CUDA runtime or driver interfaces.
Launch CUDA kernels.
ODR-use a managed storage duration object (e.g. __managed__, CUDA C++ only).

Most CUDA runtime and driver interfaces are device-dependent, with some exceptions such as:

cudaGetErrorString and cudaGetErrorName.
cuGetErrorString and cuGetErrorName.
cuDriverGetVersion.

Scope: Host.

Audience: CUDA C++, CUDA Libraries.

Category: Mandatory.

Hardware Applicability: All Compute Capabilities.

Rationale

Most CUDA runtime and driver interfaces are dependent on the existence of valid internal device state, which is lazily initialized during host program initiation and destroyed during host program termination. The CUDA runtime and driver cannot detect if this device state is invalid, so using any of these interfaces (implicitly or explicitly) during program initiation or termination will result in undefined behavior. This means that most uses of the CUDA runtime or driver in the global constructors or destructors of C++ classes in the host program is undefined behavior.

Example 1 (Bad)

# include <cstdio>
# include <cassert>
# include <cuda.h>

__global__ void hello_world() {
  printf("hello world\n");
}

struct launch_hello_world {
  __host__ launch_hello_world() {
    hello_world<<<1, 1>>>();
    cudaError_t const error0 = cudaGetLastError();
    assert(cudaSuccess == error0);
  }
  __host__ ~launch_hello_world() {
    hello_world<<<1, 1>>>();
    cudaError_t const error0 = cudaGetLastError();
    assert(cudaSuccess == error0);
  }
};
// This object's constructor is run during program initiation and attempts to
// launch a kernel, which is invalid. Likewise, the object's destructor is run
// during program termination and attempts to launch a kernel, which is invalid.
launch_hello_world launcher;

int main() {}

Example 2 (Bad)

# include <cassert>
# include <cuda.h>

struct device_storage {
  char* ptr;
  __host__ device_storage(int64_t bytes) {
    cudaError_t const error = cudaMalloc(&ptr, bytes);
    assert(cudaSuccess == error);
  }
  __host__ ~device_storage() {
    cudaError_t const error = cudaFree(ptr);
    assert(cudaSuccess == error);
  }
};
// This object's constructor is run during program initiation and attempts to
// call an initialization-dependent CUDA runtime function, which is invalid.
// Likewise, the object's destructor is run during program termination and
// attempts to call an initialization-dependent CUDA runtime function, which is
// invalid.
device_storage scratch(128);

int main() {}

Example 3 (Bad)

__managed__ int32_t global_object_count = 0;

struct A {
  __host__ A() {
    ++global_object_count;
  }
  __host__ ~A() {
    --global_object_count;
  }
};
// This object's constructor is run during program initiation and attempts to
// increment a `__managed__` object, which is invalid. Likewise, the object's
// destructor is run during program termination and attempts to decrement a
// `__managed__` object, which is invalid.
A a;

int main() {}

Possible Messages

Key	Text	Severity	Disabled
cuda_api_outside_main	Calling a device-dependent CUDA runtime or driver interface in initialization or finalization	None	False
cuda_managed_outside_main	CUDA managed object odr-used in initialization or finalization	None	False
kernel_launch_outside_main	CUDA kernel launch in initialization or finalization	None	False

Options¶

This rule shares the following common options: exclude_in_macros, exclude_messages_in_system_headers, excludes, extend_exclude_to_macro_invocations, includes, justification_checker, languages, post_processing, provider, report_at, severity
The following places define options that affect this rule: Stylechecks, Analysis-GlobalOptions

allowed_cuda_functions¶

allowed_cuda_functions

Type: set[bauhaus.analysis.config.QualifiedName]

Default: {'cuDriverGetVersion', 'cuGetErrorName', 'cuGetErrorString', 'cudaGetErrorName', 'cudaGetErrorString'}

Names of functions in the CUDA API which can be called during global initialization and finalization.

report_at_call¶

report_at_call : bool = False

If set to True, the error is reported at the call-sites of routines using CUDA (e.g., at the global variables where some constructor/destructor is called) rather than at the use itself.

Axivion Suite 7.12.2-public

Navigation