CUDA-2.3
All involved threads should be included in the warp collective
Required inputs: IR, StaticSemanticAnalysis
CUDA 2.3 [collective.warp.include_self] All involved threads should be included in the warp collective
The encountering thread should always be included in the mask when evaluating a warp-level intrinsic. When reading data from another device thread in a warp-level intrinsic, the source thread must be active and included in the mask.
Scope: Device.
Audience: CUDA C++.
Category: Mandatory.
Hardware Applicability: All Compute Capabilities.
Rationale
The mask is used to determine which threads need to arrive before execution can continue. All threads involved in a warp collective must use the same mask, so if a thread makes the call without being in the mask, the result is undefined. Likewise, if the source thread of a shuffle sync operation is not present, the result of the instruction is undefined.
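When only part of a warp takes part in a collective, one common way to satisfy both requirements is to compute the mask with a vote while the warp is still converged and to pick the source lane from that same mask. The following is a minimal sketch under stated assumptions, not a definitive implementation: the kernel name `broadcast_among_even_lanes`, the host helper `run_broadcast_among_even_lanes`, and the single-warp launch configuration are hypothetical and chosen for illustration only.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: only even lanes take part in the shuffle.
// Assumes a launch with exactly one warp (32 threads) per block.
__global__ void broadcast_among_even_lanes(int *out) {
    const unsigned lane = threadIdx.x % 32;
    const bool participates = (lane % 2 == 0);

    // All 32 lanes are still converged here, so the full mask is valid for the vote.
    const unsigned mask = __ballot_sync(0xffffffffu, participates);

    out[lane] = -1; // marker for lanes that do not take part
    if (participates) {
        // Lowest set bit of the mask: a lane that is both active and in the mask.
        const int src = __ffs(mask) - 1;
        // Every caller is named in `mask`, and so is the source lane.
        out[lane] = __shfl_sync(mask, static_cast<int>(lane) + 1000, src);
    }
}

// Hypothetical host helper: launches one warp and copies the results back.
std::vector<int> run_broadcast_among_even_lanes() {
    int *d_out = nullptr;
    cudaMalloc(&d_out, 32 * sizeof(int));
    broadcast_among_even_lanes<<<1, 32>>>(d_out);
    std::vector<int> h_out(32);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(h_out.data(), d_out, 32 * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}
```

In this sketch the mask evaluates to the even lanes only, so the source lane is lane 0 and every even lane receives the value contributed by lane 0, while odd lanes never reach the shuffle.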
Example 1 (Bad)
__global__ void test() {
    __syncwarp(1u); // Non-compliant: thread-lane 0 invokes the synchronization
                    // without being included in the mask.
}
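For contrast, a compliant use of `__syncwarp` might look like the sketch below, where every lane that reaches the call is also named in the mask. This is an illustration under assumptions rather than part of the guideline: the kernel `neighbor_read`, the helper `run_neighbor_read`, and the one-warp launch are hypothetical.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical compliant kernel. Assumes a launch with exactly one full warp,
// so all 32 lanes reach the __syncwarp call.
__global__ void neighbor_read(int *out) {
    __shared__ int buf[32];
    const unsigned lane = threadIdx.x % 32;

    buf[lane] = static_cast<int>(lane) * 10;

    // Full mask: every calling lane, including this one, is named in it.
    // The call also orders the shared-memory write above before the read below.
    __syncwarp(0xffffffffu);

    out[lane] = buf[(lane + 1) % 32];
}

// Hypothetical host helper: launches one warp and copies the results back.
std::vector<int> run_neighbor_read() {
    int *d_out = nullptr;
    cudaMalloc(&d_out, 32 * sizeof(int));
    neighbor_read<<<1, 32>>>(d_out);
    std::vector<int> h_out(32);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(h_out.data(), d_out, 32 * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}
```

Each lane ends up with the value written by its right-hand neighbour, wrapping around from lane 31 to lane 0.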
Example 2 (Bad)
__global__ void kernel() {
    if (threadIdx.x == 0)
        return;
    else
        __shfl_sync(~1u, 1, /* srcLane = */ 0); // Non-compliant: source lane is
                                                // not included in the mask.
    int x = __shfl_sync(~0u, 1, /* srcLane = */ 0); // Non-compliant: the source thread has terminated.
}

Excerpt from NVIDIA CUDA C++ Guidelines for robust and safety-critical programming, Version 3.0.1, Copyright (C) 2018-2023 NVIDIA Corporation.
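A compliant variant of the early-exit pattern could look like the following sketch, which is not taken from the guideline: lane 0 still returns early, but the remaining lanes use a mask that names exactly themselves and read from a source lane that is both active and in that mask. The kernel `broadcast_from_lane1`, the helper `run_broadcast_from_lane1`, and the one-warp launch are hypothetical.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical compliant kernel. Assumes a launch with exactly one warp.
__global__ void broadcast_from_lane1(int *out) {
    const unsigned lane = threadIdx.x % 32;
    if (lane == 0) {
        out[0] = -1; // lane 0 leaves before the collective
        return;
    }

    // Mask names lanes 1..31, which are exactly the lanes that reach the call.
    constexpr unsigned kMask = ~1u;
    const int value = 100 + static_cast<int>(lane);

    // Source lane 1 is still active and is named in the mask.
    out[lane] = __shfl_sync(kMask, value, /* srcLane = */ 1);
}

// Hypothetical host helper: launches one warp and copies the results back.
std::vector<int> run_broadcast_from_lane1() {
    int *d_out = nullptr;
    cudaMalloc(&d_out, 32 * sizeof(int));
    broadcast_from_lane1<<<1, 32>>>(d_out);
    std::vector<int> h_out(32);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(h_out.data(), d_out, 32 * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}
```

Lanes 1 to 31 all receive the value contributed by lane 1, and no lane reads from the terminated lane 0.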
Possible Messages
| Key | Text | Severity | Disabled |
|---|---|---|---|
| participating_lane_not_in_mask | Lane(s) {lanes} participate but are not named in the sync mask 0b{mask:032b} in context(s) {contexts} | None | False |
| participating_lane_potentially_not_in_mask | Lane(s) {lanes} potentially participate but are not named in the sync mask 0b{mask:032b} in context(s) {contexts} | None | True |
| source_lane_not_in_mask | Source lane(s) {src_lanes} are not in the sync mask 0b{mask:032b} for “dest-lane:=src-lane”-reads {bad_accesses} in context(s) {contexts} | None | False |
Options
This rule shares the following common options: exclude_in_macros, exclude_messages_in_system_headers, excludes, extend_exclude_to_macro_invocations, includes, justification_checker, languages, post_processing, provider, report_at, severity
The following places define options that affect this rule: Stylechecks, Analysis-GlobalOptions
This rule has no individual options.