CUDA-1.2

Synchronize with all devices before termination

Required inputs: IR

CUDA 1.2 [synchronize.termination] Synchronize with all devices before termination

Before the start of normal host program termination, synchronize with all outstanding work on all devices.

Scope: Host.

Audience: CUDA C++, CUDA Libraries.

Category: Advisory.

Hardware Applicability: All Compute Capabilities.

Rationale

Many CUDA interfaces are asynchronous: the work they enqueue runs on devices concurrently with the host program. The CUDA runtime and driver do not implicitly synchronize with outstanding work on any device during host program termination. If a CUDA C++ program does not synchronize with all outstanding work on all devices before normal host program termination (returning from main, calling std::exit, etc.), no diagnostic is produced for asynchronous errors that occur while that work executes.
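The requirement covers every device, not only the current one. The following is a minimal sketch of one way to do that with the CUDA runtime API, assuming all outstanding work was submitted through the runtime. The names synchronize_all_devices, keep_first_error and store are hypothetical helpers introduced for illustration; they are not part of CUDA or of this rule. Note that calling cudaSetDevice on a device the program has not used before may initialize that device as a side effect.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: keeps the first error seen and ignores later ones.
cudaError_t keep_first_error(cudaError_t first, cudaError_t next) {
  return first != cudaSuccess ? first : next;
}

// Hypothetical helper (not a CUDA API): waits for outstanding work on every
// visible device and returns the first error reported, if any.
cudaError_t synchronize_all_devices() {
  int device_count = 0;
  cudaError_t result = cudaGetDeviceCount(&device_count);
  if (result != cudaSuccess) return result;

  int original_device = 0;
  result = cudaGetDevice(&original_device);
  if (result != cudaSuccess) return result;

  for (int device = 0; device < device_count; ++device) {
    cudaError_t error = cudaSetDevice(device);
    if (error == cudaSuccess) error = cudaDeviceSynchronize();
    result = keep_first_error(result, error);
  }
  // Restore the device that was current on entry.
  return keep_first_error(result, cudaSetDevice(original_device));
}

__global__ void store(int* p) { *p = 17; }

int main() {
  store<<<1, 1>>>(nullptr);  // Invalid write: the error is asynchronous.
  cudaError_t const launch_error = cudaGetLastError();
  // Synchronize with all devices before returning from main.
  cudaError_t const sync_error = synchronize_all_devices();
  cudaError_t const error = keep_first_error(launch_error, sync_error);
  if (error != cudaSuccess) {
    std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(error));
    return EXIT_FAILURE;
  }
  return EXIT_SUCCESS;
}
```

The loop continues after a failing device so that work on the remaining devices is still waited for; only the first error is reported.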

Example 1 (Bad)

#include <cassert>

__global__ void fail() {
  *(int*)0 = 0;
}

int main() {
  fail<<<1, 1>>>();
  // This call succeeds, because the kernel launch did not have a synchronous
  // error.
  cudaError_t const error0 = cudaGetLastError();
  assert(cudaSuccess == error0);
  // The invalid write in the kernel leads to an asynchronous error, which is
  // silently ignored because the program terminates without synchronizing.
}

Example 2 (Good)

#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

__global__ void fail(int n, float m, float* y) {
  y[n] = m;  // y is null in this example, so this write is invalid.
}

int main() {
  fail<<<1, 1>>>(1000, 0, nullptr);
  // This call succeeds, because the kernel launch did not have a synchronous
  // error.
  cudaError_t const error1 = cudaGetLastError();
  if (error1 != cudaSuccess) {
    printf("launch of fail<<<1, 1>>>(1000, 0, nullptr) failed: %s\n",
           cudaGetErrorString(error1));
  }
  // This call waits for the kernel and returns the asynchronous error caused
  // by the invalid write, so the assert below reports it.
  cudaError_t const error2 = cudaDeviceSynchronize();
  assert(error2 == cudaSuccess);
}

Example 3 (Bad)

#include <cassert>
#include "testTerminate.h"

__global__ void fail(int* i) {
  *i = 17;
}

int main() {
  fail<<<1, 1>>>(nullptr); // The null pointer write is an invalid device memory access.
  cudaError_t const error0 = cudaGetLastError();
  testTerminate(error0);
  // The invalid write in the kernel leads to an asynchronous error, which is
  // silently ignored because the program terminates without synchronizing.
}

Example 4 (Good)

#include <cassert>
#include "testTerminate.h"

__global__ void fail(int* i) {
  *i = 17;
}

int main() {
  fail<<<1, 1>>>(nullptr);
  // This call succeeds, because the kernel launch did not have a synchronous
  // error.
  cudaError_t const error0 = cudaGetLastError();
  testTerminate(error0);
  cudaError_t const error1 = cudaDeviceSynchronize();
  assert(cudaSuccess == error1);
}

Possible Messages

Key	Text	Severity	Disabled
cuda_synchronize_termination	Synchronize with all devices before termination	None	False

Options

This rule shares the following common options: exclude_in_macros, exclude_messages_in_system_headers, excludes, extend_exclude_to_macro_invocations, includes, justification_checker, languages, post_processing, provider, report_at, severity.
The following places define options that affect this rule: Stylechecks, Analysis-GlobalOptions.

terminate_functions

Type: dict[bauhaus.analysis.config.FileGlobPattern, list[bauhaus.analysis.config.GlobPattern]]

Default:
{
   '*abort.h': ['abort'],
   '*stdlib.h': ['exit', 'abort', 'quick_exit', '_Exit'],
   '*terminate.h': ['terminate'],
   'exception': ['terminate']
}

A table that maps glob patterns for header file names to the functions declared in those headers that this rule treats as terminating the program.
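Functions such as exit end the program without waiting for device work, so each call site needs a synchronization in front of it. As a sketch only, a program might route its exits through a single wrapper that synchronizes first. The names synchronize_then_exit, exit_status_for and store are hypothetical, and whether the analysis accepts a synchronization performed inside such a wrapper is an assumption that should be checked against the tool.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical mapping from a CUDA status to a process exit status.
int exit_status_for(cudaError_t error) {
  return error == cudaSuccess ? EXIT_SUCCESS : EXIT_FAILURE;
}

// Hypothetical wrapper: synchronizes with every visible device, reports the
// first asynchronous error, and only then calls std::exit.
[[noreturn]] void synchronize_then_exit(int requested_status) {
  int device_count = 0;
  cudaError_t first_error = cudaGetDeviceCount(&device_count);
  for (int device = 0; device < device_count; ++device) {
    cudaError_t error = cudaSetDevice(device);
    if (error == cudaSuccess) error = cudaDeviceSynchronize();
    if (first_error == cudaSuccess) first_error = error;
  }
  if (first_error != cudaSuccess) {
    std::fprintf(stderr, "CUDA error before exit: %s\n",
                 cudaGetErrorString(first_error));
  }
  // A failure status requested by the caller wins; otherwise the exit status
  // reflects the result of the synchronization.
  std::exit(requested_status != EXIT_SUCCESS ? requested_status
                                             : exit_status_for(first_error));
}

__global__ void store(int* p) { *p = 17; }

int main() {
  int* value = nullptr;
  if (cudaMalloc(&value, sizeof(int)) != cudaSuccess) {
    synchronize_then_exit(EXIT_FAILURE);
  }
  store<<<1, 1>>>(value);
  synchronize_then_exit(EXIT_SUCCESS);
}
```

Keeping the synchronization in one place means a new exit path cannot forget it, at the cost of hiding the call from readers of the call site.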

Axivion Suite 7.12.2-public
