CUDADirective-1.4

Do not use CUDA within CUDA callbacks

Required inputs: IR

CUDA DIRECTIVE 1.4 [device-dependent.callback] Do not use CUDA within CUDA callbacks

In callbacks which happen in stream order, e.g. those created via cudaStreamAddCallback, cuStreamAddCallback, cudaLaunchHostFunc, cuLaunchHostFunc, cudaGraphAddHostNode, or cuGraphAddHostNode, do not:

  • Call device-dependent CUDA runtime or driver interfaces.
  • Launch CUDA kernels.
  • ODR-use a managed storage duration object (e.g. __managed__, CUDA C++ only).
Scope: Host.
Audience: CUDA C++, CUDA Libraries.
Hardware Applicability: All Compute Capabilities.
Rationale

Callbacks registered with CUDA which happen in stream order are executed in unspecified threads of exe- cution which are owned by the CUDA runtime and driver. Calling a CUDA operation within one of these callbacks may potentially need to synchronize with the stream which the callback is in, which would lead to a deadlock, as the callback would be waiting for itself to complete. Thus, calling CUDA operations from stream-order callbacks has undefined behavior.

Example 1 (Bad)
# include <cassert>
# include <cuda.h>
# include <cuda_runtime_api.h>

__host__ void callback(void*) {
  // This call attempts to synchronize with all streams, but the callback
  // itself is executing in a stream, so this would deadlock. Thus,
  // `cudaErrorNotPermitted` is returned instead.
  cudaError_t const error = cudaDeviceSynchronize();
  assert(cudaSuccess == error);
}

int main() {
  cudaError_t const error0 = cudaLaunchHostFunc(nullptr, &callback, nullptr);
  assert(cudaSuccess == error0);
  cudaError_t const error1 = cudaDeviceSynchronize();
  assert(cudaSuccess == error1);
}
Excerpt from NVIDIA CUDA C++ Guidelines for robust and safety-critical programming, Version 3.0.1, Copyright (C) 2018-2023 NVIDIA Corporation.

Possible Messages

Key

Text

Severity

Disabled

cuda_within_cuda

Use of CUDA ‘{}’ detected in CUDA callback.

None

False

Options

functions

functions

Type: set[bauhaus.analysis.config.QualifiedName]

Default: {'cuGraphAddHostNode', 'cuLaunchHostFunc', 'cuStreamAddCallback', 'cudaGraphAddHostNode', 'cudaLaunchHostFunc', 'cudaStreamAddCallback'}

Set of functions that create callbacks in stream order.