CUDA-1.4¶

Check for errors after launching a kernel by calling cudaGetLastError

Required inputs: IR

CUDA 1.4 [error.kernel-launch] Check for errors after launching a kernel by calling cudaGetLastError

After launching a kernel with the CUDA C++ kernel launch syntax, call cudaGetLastError and check whether its return value indicates that a synchronous error occurred before the next CUDA API call.

Scope: Host, Device.

Audience: CUDA C++.

Category: Advisory.

Hardware Applicability: All Compute Capabilities.

Rationale

CUDA C++ kernel launches return void, but reports errors by setting the CUDA global error state, which can be checked with cudaGetLastError. There are two classes of errors that an interface can return:

Synchronous errors, which are the immediate result of the call to the interface that returned the error, and
Asynchronous errors, which are the result of some unspecified previous asynchronous operation.

Asynchronous errors are "sticky", meaning that they will continue to be returned by subsequent calls to the CUDA runtime or driver. Error codes report the class of error that occurred, but do not contain any information about which operation caused the error. If a program fails to check whether a kernel launch returned an error, either:

No diagnostic will be produced for the error if the error was synchronous.
A subsequent interface call will return the error if the error was asynchronous.

This may make it unclear where the asynchronous error originated from.

Example 1 (Bad)

# include <cassert>
# include "testTerminate.h"

__global__ void fail() {}

__global__ void pass() {}

int main() {
  fail<<<0, 1>>>(); // Fails synchronously due to invalid grid dimensions.
  pass<<<1, 0>>>(); // Succeeds, overwriting the previously error which was
// synchronous and thus is not "sticky".
  cudaError_t const error0 = cudaGetLastError();
  testTerminate(error0);
  // This synchronization succeeds. The failure of the first kernel is silently
  // ignored.
  cudaError_t const error1 = cudaDeviceSynchronize();
  assert(cudaSuccess == error1);
}

Example 2 (Good)

# include <cassert>
# include "testTerminate.h"

__global__ void fail() {}

__global__ void pass() {}

int main() {
  // Fails synchronously due to invalid grid dimensions.
  fail<<<0, 1>>>();
  // The kernel launch failure is detected by checking the CUDA global error
  // state.
  cudaError_t const error0 = cudaGetLastError();
  testTerminate(error0);
  pass<<<1, 0>>>(); // Succeeds, overwriting the previously error which was
// synchronous and thus is not "sticky".
  cudaError_t const error1 = cudaGetLastError();
  testTerminate(error1);
  cudaError_t const error2 = cudaDeviceSynchronize();
  assert(cudaSuccess == error2);
}

Possible Messages

Key	Text	Severity	Disabled
missing_error_check_after_kernel	Missing check for errors (by calling cudaGetLastError) after launching a kernel	None	False

Options¶

This rule shares the following common options: exclude_in_macros, exclude_messages_in_system_headers, excludes, extend_exclude_to_macro_invocations, includes, justification_checker, languages, post_processing, provider, report_at, severity
The following places define options that affect this rule: Stylechecks, Analysis-GlobalOptions

error_checking_functions¶

error_checking_functions : set[bauhaus.analysis.config.FunctionName] = {'cudaGetLastError', 'cudaPeekAtLastError'}

Names of functions that are sufficient to check for errors after launching a kernel.

Axivion Suite 7.12.2-public

Navigation