CUDA-1.3

Check for errors after calling any CUDA library interface

Required inputs: IR

CUDA 1.3 [error.interface] Check for errors after calling any CUDA library interface

When calling a CUDA library interface that returns a cudaError_t or CUresult, check whether the return value indicates that an error occurred.

Scope: Host, Device.
Audience: CUDA C++, CUDA Libraries.
Category: Advisory.
Hardware Applicability: All Compute Capabilities.
Rationale

CUDA libraries report errors by returning numeric error codes from interfaces. These codes must be inspected by programs using CUDA to determine if an error has occurred. There are two classes of errors that an interface can return:

  • Synchronous errors, which are the immediate result of the call to the interface that returned the error, and
  • Asynchronous errors, which are the result of some unspecified previous asynchronous operation.

Asynchronous errors are "sticky", meaning that they will continue to be returned by subsequent calls to the CUDA runtime or driver. Error codes report the class of error that occurred, but do not contain any information about which operation caused the error. Because of the existence of asynchronous errors, almost any CUDA library interface can return an error, even if the interface itself cannot produce an error, because a "sticky" asynchronous error may be returned. If a program fails to check whether an interface call returned an error, either:

  • No diagnostic will be produced for the error if the error was synchronous.
  • A subsequent interface call will return the error if the error was asynchronous.

This may make it unclear where the asynchronous error originated.

Example 1 (Bad)
# include <cassert>

int main() {
  int32_t device = -1;
  cudaDeviceProp prop = {};
  // The below call fails with `cudaErrorInvalidValue`, because `prop` is not
  // a valid device property. The returned error code is never checked, so the
  // error is silently ignored.
  cudaChooseDevice(&device, &prop);
  // Because the above call failed, `device` is still `-1`, which is not valid
  // as a parameter to `cudaSetDevice`, so the below call fails, returning code
  // `cudaErrorInvalidDevice`. This may be confusing, as the root of the error
  // is the previous failed call.
  cudaError_t const error0 = cudaSetDevice(device);
  assert(cudaSuccess == error0);
}
Example 2 (Good)
# include <cassert>

int main() {
  int32_t device = -1;
  cudaDeviceProp prop = {};
  // The below call fails with `cudaErrorInvalidValue`, because `prop` is not
  // a valid device property. The returned error code is checked, so the
  // failure is caught.
  cudaError_t const error0 = cudaChooseDevice(&device, &prop);
  assert(cudaSuccess == error0);
  // Because the above call failed, this code is never reached.
  cudaError_t const error1 = cudaSetDevice(device);
  assert(cudaSuccess == error1);
}
Example 3 (Bad)
# include <memory>
# include <cassert>
# include "testTerminate.h"

__global__ void fail() {
  *(int*)0 = 0;
}

__global__ void pass() {}

int main() {
  fail<<<1, 1>>>(); // Fails.
  // Catch synchronous launch failures
  cudaError_t const error0 = cudaGetLastError();
  testTerminate(error0);
  pass<<<1, 1>>>(); // Never executed because the above failure was fatal.
  cudaError_t const error1 = cudaGetLastError();
  testTerminate(error1);
  // This synchronization and any subsequent CUDA runtime or driver interface
  // calls fail with the asynchronous error `cudaErrorAssert`. The failure
  // reported by this call is silently ignored.
  cudaDeviceSynchronize();
  // This unrelated memory allocation and any subsequent CUDA runtime or
  // driver interface calls fail with the asynchronous error `cudaErrorAssert`.
  auto deleter = [] (int32_t* p) {
    cudaError_t const error0 = cudaFree(p);
    assert(cudaSuccess == error0);
  };
  int32_t* raw_u;
  cudaError_t const error2 = cudaMalloc(&raw_u, sizeof(int32_t));
  assert(cudaSuccess == error2);
  std::unique_ptr<int32_t, decltype(deleter)> u(raw_u, deleter);
}
Example 4 (Good)
# include <memory>
# include <cassert>
# include "testTerminate.h"

__global__ void fail() {
  *(int*)0 = 0;
}

__global__ void pass() {}

int main() {
  fail<<<1, 1>>>(); // Fails.
  // Catch synchronous launch failures
  testTerminate(cudaGetLastError());
  // Never executed because the above failure was fatal.
  pass<<<1, 1>>>();
  testTerminate(cudaGetLastError());
  // This synchronization and any subsequent CUDA runtime or driver interface
  // calls fail with the asynchronous error `cudaErrorAssert`.
  cudaError_t const error2 = cudaDeviceSynchronize();
  assert(cudaSuccess == error2);
}
Excerpt from NVIDIA CUDA C++ Guidelines for robust and safety-critical programming, Version 3.0.1, Copyright (C) 2018-2023 NVIDIA Corporation.

Possible Messages

Key

Text

Severity

Disabled

discarded_return

Missing check for errors after calling a CUDA library interface

None

False

Options

error_types

error_types : set[bauhaus.analysis.config.ShortTypeName] = {'CUresult', 'cudaError_t'}

User defined error type names (if empty, all ignored int values are reported).
 

inspect_template_instances

inspect_template_instances : bool = False

Whether calls in template instances should be reported.
 

whitelist

whitelist : dict[bauhaus.analysis.config.FileGlobPattern, list[bauhaus.analysis.config.GlobPattern]] = {}

Dictionary of header globbing to (list of) function names whose return codes can be ignored.