CUDA-1.3¶
Check for errors after calling any CUDA library interface
Required inputs: IR
CUDA 1.3 [error.interface] Check for errors after calling any CUDA library interface
When calling a CUDA library interface that returns a cudaError_t or CUresult, check whether the return
value indicates that an error occurred.
Scope: Host, Device.
Audience: CUDA C++, CUDA Libraries.
Category: Advisory.
Hardware Applicability: All Compute Capabilities.
Rationale
CUDA libraries report errors by returning numeric error codes from interfaces. These codes must be inspected by programs using CUDA to determine if an error has occurred. There are two classes of errors that an interface can return:
- Synchronous errors, which are the immediate result of the call to the interface that returned the error, and
- Asynchronous errors, which are the result of some unspecified previous asynchronous operation.
Asynchronous errors are "sticky", meaning that they will continue to be returned by subsequent calls to the CUDA runtime or driver. Error codes report the class of error that occurred, but do not contain any information about which operation caused the error. Because of the existence of asynchronous errors, almost any CUDA library interface can return an error, even if the interface itself cannot produce an error, because a "sticky" asynchronous error may be returned. If a program fails to check whether an interface call returned an error, either:
- No diagnostic will be produced for the error if the error was synchronous.
- A subsequent interface call will return the error if the error was asynchronous.
This may make it unclear where the asynchronous error originated.
Example 1 (Bad)
# include <cassert> int main() { int32_t device = -1; cudaDeviceProp prop = {}; // The below call fails with `cudaErrorInvalidValue`, because `prop` is not // a valid device property. The returned error code is never checked, so the // error is silently ignored. cudaChooseDevice(&device, &prop); // Because the above call failed, `device` is still `-1`, which is not valid // as a parameter to `cudaSetDevice`, so the below call fails, returning code // `cudaErrorInvalidDevice`. This may be confusing, as the root of the error // is the previous failed call. cudaError_t const error0 = cudaSetDevice(device); assert(cudaSuccess == error0); }
Example 2 (Good)
# include <cassert> int main() { int32_t device = -1; cudaDeviceProp prop = {}; // The below call fails with `cudaErrorInvalidValue`, because `prop` is not // a valid device property. The returned error code is checked, so the // failure is caught. cudaError_t const error0 = cudaChooseDevice(&device, &prop); assert(cudaSuccess == error0); // Because the above call failed, this code is never reached. cudaError_t const error1 = cudaSetDevice(device); assert(cudaSuccess == error1); }
Example 3 (Bad)
# include <memory> # include <cassert> # include "testTerminate.h" __global__ void fail() { *(int*)0 = 0; } __global__ void pass() {} int main() { fail<<<1, 1>>>(); // Fails. // Catch synchronous launch failures cudaError_t const error0 = cudaGetLastError(); testTerminate(error0); pass<<<1, 1>>>(); // Never executed because the above failure was fatal. cudaError_t const error1 = cudaGetLastError(); testTerminate(error1); // This synchronization and any subsequent CUDA runtime or driver interface // calls fail with the asynchronous error `cudaErrorAssert`. The failure // reported by this call is silently ignored. cudaDeviceSynchronize(); // This unrelated memory allocation and any subsequent CUDA runtime or // driver interface calls fail with the asynchronous error `cudaErrorAssert`. auto deleter = [] (int32_t* p) { cudaError_t const error0 = cudaFree(p); assert(cudaSuccess == error0); }; int32_t* raw_u; cudaError_t const error2 = cudaMalloc(&raw_u, sizeof(int32_t)); assert(cudaSuccess == error2); std::unique_ptr<int32_t, decltype(deleter)> u(raw_u, deleter); }
Example 4 (Good)
# include <memory> # include <cassert> # include "testTerminate.h" __global__ void fail() { *(int*)0 = 0; } __global__ void pass() {} int main() { fail<<<1, 1>>>(); // Fails. // Catch synchronous launch failures testTerminate(cudaGetLastError()); // Never executed because the above failure was fatal. pass<<<1, 1>>>(); testTerminate(cudaGetLastError()); // This synchronization and any subsequent CUDA runtime or driver interface // calls fail with the asynchronous error `cudaErrorAssert`. cudaError_t const error2 = cudaDeviceSynchronize(); assert(cudaSuccess == error2); }Excerpt from NVIDIA CUDA C++ Guidelines for robust and safety-critical programming, Version 3.0.1, Copyright (C) 2018-2023 NVIDIA Corporation.
Possible Messages
Key |
Text |
Severity |
Disabled |
|---|---|---|---|
discarded_return |
Missing check for errors after calling a CUDA library interface |
None |
False |
Options¶
This rule shares the following common options: exclude_in_macros, exclude_messages_in_system_headers, excludes, extend_exclude_to_macro_invocations, includes, justification_checker, languages, post_processing, provider, report_at, severity
The following places define options that affect this rule: Stylechecks, Analysis-GlobalOptions
error_types¶
error_types : set[bauhaus.analysis.config.ShortTypeName] = {'CUresult', 'cudaError_t'}
inspect_template_instances¶
inspect_template_instances : bool = False
whitelist¶
whitelist : dict[bauhaus.analysis.config.FileGlobPattern, list[bauhaus.analysis.config.GlobPattern]] = {}