CUDASecurity-CON05

Avoid deadlock by locking in a predefined order

Required inputs: IR, StaticSemanticAnalysis

Avoid deadlocks caused by incorrect thread synchronization.

Scope: Host, Device.
Audience: CUDA C++, CUDA Libraries.
Category: Required.
Hardware Applicability: All Compute Capabilities.
Rationale

When using synchronization techniques such as semaphores or __syncthreads, make sure the order and execution of locking / synchronization is appropriate to prevent deadlocks.

For semaphores, this means ensuring that there are no lock cycles in which an adverse order of locking can lead the system to deadlock.

For thread synchronization primitives such as __syncthreads, this means ensuring that the synchronization points are unconditionally reached by all participating threads.

Example 1 (Bad)
#include 

__device__ cuda::binary_semaphore s1(1);
__device__ cuda::binary_semaphore s2(1);

__global__ void t1(){
  s1.acquire();
  s2.acquire();

  // critical section here

  s2.release();
  s1.release();
}

__global__ void t2(){
  // locks in reverse order compared to t1
  // thus can lead to deadlock
  s2.acquire();
  s1.acquire();

  // critical section here

  s1.release();
  s2.release();
}
Example 2 (Good)
#include 

__device__ cuda::binary_semaphore s1(1);
__device__ cuda::binary_semaphore s2(1);

__global__ void t1(){
  s1.acquire();
  s2.acquire();

  // critical section here

  s2.release();
  s1.release();
}

__global__ void t2(){
  s1.acquire();
  s2.acquire();

  // critical section here

  s2.release();
  s1.release();
Excerpt from NVIDIA CUDA C++ Guidelines for robust and safety-critical programming, Version 3.0.1, Copyright (C) 2018-2023 NVIDIA Corporation.

Possible Messages

Key

Text

Severity

Disabled

deadlock

Unfavorable locking sequence could lead to deadlock.

None

False

thread_synchronization_in_divergent_region

Implicit thread block synchronization function is called in possibly divergent code.

None

False

Options

additional_lock_api_callables

additional_lock_api_callables

Type: typing.Callable[[bauhaus.ir.Graph], typing.Iterable[tuple[bauhaus.rules.parallelism.locksets.LockApiCallableKind, str]]] | None

Default: <function bauhaus.ir.cuda.cuda_library_names.cuda_atomic_semaphore_lock_api_functions_for_graph(_ir_graph: 'ir.Graph') -> 'typing.Iterable[tuple[locksets.LockApiCallableKind, str]]'>

Python callable that generates additional [enter|exit]_additional_functions from the ir.Graph.
 

enter_critical_functions

enter_critical_functions : set[bauhaus.analysis.config.QualifiedName] = set()

Set of function names to enter a critical region.
 

enter_critical_macros

enter_critical_macros : set[bauhaus.analysis.config.MacroName] = set()

Set of macro names to enter a critical region (macros must expand to asm() statement).
 

exit_critical_functions

exit_critical_functions : set[bauhaus.analysis.config.QualifiedName] = set()

Set of function names to exit a critical region.
 

exit_critical_macros

exit_critical_macros : set[bauhaus.analysis.config.MacroName] = set()

Set of macro names to exit a critical region (macros must expand to asm() statement).
 

nested_critical_regions

nested_critical_regions : bool = True

If set to true, critical regions nest; if set to false, a single exit-critical-region terminates all open critical regions.
 

synchronization_functions

synchronization_functions

Type: set[bauhaus.analysis.config.QualifiedName]

Default: {'__syncthreads', '__syncthreads_and', '__syncthreads_count', '__syncthreads_or', 'cooperative_groups::__v1::thread_block::sync'}

Names of synchronization functions for the purposes of this rule. Qualified names can be used.