CUDA-2.2

Use a power of 2 width less than or equal to the warp size with warp collective shuffle operations

Required inputs: IR, StaticSemanticAnalysis

CUDA 2.2 [collective.warp.shuffle.width] Use a power of 2 width less than or equal to the warp size with warp collective shuffle operations

When a group of device-threads participate in a warp collective shuffle operation, the width parameter should be a power of 2 and less than or equal to the warp size. The following are warp collective shuffle operations:

  • __shfl_sync.
  • __shfl_up_sync.
  • __shfl_down_sync.
  • __shfl_xor_sync.
Scope: Device.
Audience: CUDA C++.
Category: Mandatory.
Hardware Applicability: All Compute Capabilities.
Rationale

Warp wide collective shuffle operations have undefined behavior if the width parameter is not a power of 2 or is greater than the warp size.
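One way to guard against the undefined behavior described above is to validate the width at compile time before it reaches a shuffle call. The following is a minimal sketch, not part of the guideline or of the CUDA toolkit: the names kWarpSize, is_valid_shuffle_width, and segment_sum are hypothetical, the warp size is assumed to be 32 (matching the values listed in this rule's message), and the full mask 0xffffffff assumes that all 32 lanes of the warp execute the call. The device part is guarded by __CUDACC__ so that the predicate can also be compiled and tested as plain host C++.

```cpp
// Hypothetical helper names; not part of the CUDA toolkit or of the guideline.
constexpr int kWarpSize = 32;  // assumption: warp size of 32 lanes

// True when width is a power of two in the range [1, kWarpSize].
constexpr bool is_valid_shuffle_width(int width) {
    return width > 0 && width <= kWarpSize && (width & (width - 1)) == 0;
}

static_assert(is_valid_shuffle_width(16), "16 is a power of two <= 32");
static_assert(!is_valid_shuffle_width(24), "24 is not a power of two");
static_assert(!is_valid_shuffle_width(64), "64 exceeds the warp size");

#ifdef __CUDACC__
// Device-side use, compiled only by nvcc: sums `value` across each
// Width-lane segment of the warp; the first lane of every segment
// ends up holding the segment total.
template <int Width>
__device__ int segment_sum(int value) {
    // Reject non-compliant widths at compile time.
    static_assert(is_valid_shuffle_width(Width), "invalid shuffle width");
    for (int offset = Width / 2; offset > 0; offset /= 2) {
        // Assumes all 32 lanes of the warp participate (full mask).
        value += __shfl_down_sync(0xffffffffu, value, offset, Width);
    }
    return value;
}
#endif
```

Passing the width as a template parameter is a design choice of this sketch: it lets static_assert reject a non-compliant width during compilation. A width that is only known at run time would need an equivalent run-time check instead.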

Excerpt from NVIDIA CUDA C++ Guidelines for robust and safety-critical programming, Version 3.0.1, Copyright (C) 2018-2023 NVIDIA Corporation.

Possible Messages

Key: width_is_not_power_of_two
Text: Width parameter must be power of two not larger than warp width (i.e., 1, 2, 4, 8, 16, or 32). Lane(s) {lanes} have width {width} in context(s) {contexts}
Severity: None
Disabled: False

Options