CUDASecurity-CON05¶

Avoid deadlock by locking in a predefined order

Required inputs: IR, StaticSemanticAnalysis

Avoid deadlocks caused by incorrect thread synchronization.

Scope: Host, Device.

Audience: CUDA C++, CUDA Libraries.

Category: Required.

Hardware Applicability: All Compute Capabilities.

Rationale

When using synchronization techniques such as semaphores or __syncthreads, make sure the order and execution of locking / synchronization is appropriate to prevent deadlocks.

For semaphores, this means ensuring that there are no lock cycles in which an adverse order of locking can lead the system to deadlock.

For thread synchronization primitives such as __syncthreads, this means ensuring that the synchronization points are unconditionally reached by all participating threads.

Example 1 (Bad)

#include <cuda/semaphore>

__device__ cuda::binary_semaphore<cuda::thread_scope_device> s1(1);
__device__ cuda::binary_semaphore<cuda::thread_scope_device> s2(1);

__global__ void t1(){
  s1.acquire();
  s2.acquire();

  // critical section here

  s2.release();
  s1.release();
}

__global__ void t2(){
  // locks in reverse order compared to t1
  // thus can lead to deadlock
  s2.acquire();
  s1.acquire();

  // critical section here

  s1.release();
  s2.release();
}

Example 2 (Good)

#include <cuda/semaphore>

__device__ cuda::binary_semaphore<cuda::thread_scope_device> s1(1);
__device__ cuda::binary_semaphore<cuda::thread_scope_device> s2(1);

__global__ void t1(){
  s1.acquire();
  s2.acquire();

  // critical section here

  s2.release();
  s1.release();
}

__global__ void t2(){
  s1.acquire();
  s2.acquire();

  // critical section here

  s2.release();
  s1.release();

Possible Messages

Key	Text	Severity	Disabled
deadlock	Unfavorable locking sequence could lead to deadlock.	None	False
thread_synchronization_in_divergent_region	Implicit thread block synchronization function is called in possibly divergent code.	None	False

Options¶

This rule shares the following common options: exclude_in_macros, exclude_messages_in_system_headers, excludes, extend_exclude_to_macro_invocations, includes, justification_checker, languages, post_processing, provider, report_at, severity
The following places define options that affect this rule: Stylechecks, Analysis-GlobalOptions

additional_lock_api_callables¶

additional_lock_api_callables

Type: typing.Callable[[bauhaus.ir.Graph], typing.Iterable[tuple[bauhaus.rules.parallelism.locksets.LockApiCallableKind, str]]] | None

Default: <function bauhaus.ir.cuda.cuda_library_names.cuda_atomic_semaphore_lock_api_functions_for_graph(_ir_graph: 'ir.Graph') -> 'typing.Iterable[tuple[locksets.LockApiCallableKind, str]]'>

Python callable that generates additional [enter|exit]_additional_functions from the ir.Graph.

enter_critical_functions¶

enter_critical_functions : set[bauhaus.analysis.config.QualifiedName] = set()

Set of function names to enter a critical region.

enter_critical_macros¶

enter_critical_macros : set[bauhaus.analysis.config.MacroName] = set()

Set of macro names to enter a critical region (macros must expand to asm() statement).

exit_critical_functions¶

exit_critical_functions : set[bauhaus.analysis.config.QualifiedName] = set()

Set of function names to exit a critical region.

exit_critical_macros¶

exit_critical_macros : set[bauhaus.analysis.config.MacroName] = set()

Set of macro names to exit a critical region (macros must expand to asm() statement).

nested_critical_regions¶

nested_critical_regions : bool = True

If set to true, critical regions nest; if set to false, a single exit-critical-region terminates all open critical regions.

synchronization_functions¶

synchronization_functions

Type: set[bauhaus.analysis.config.QualifiedName]

Default: {'__syncthreads', '__syncthreads_and', '__syncthreads_count', '__syncthreads_or', 'cooperative_groups::__v1::thread_block::sync'}

Names of synchronization functions for the purposes of this rule. Qualified names can be used.

Axivion Suite 7.12.2-public

Navigation