CUDASecurity-ARR01

Do not form or use out-of-bounds array subscripts for CUDA kernels

Required inputs: IR, StaticSemanticAnalysis

This rule prevents accessing array elements beyond their allocated bounds within CUDA kernels.

Scope: Host, Device.
Audience: CUDA C++, CUDA Libraries.
Category: Required.
Hardware Applicability: All Compute Capabilities.
Rationale

Out-of-bounds array access in CUDA kernels can lead to undefined behavior, memory corruption, segmentation faults, or silent data corruption. The rule checks that array subscripts are properly validated against array dimensions before use, especially when using thread indices (threadIdx, blockIdx, blockDim, gridDim) to calculate array positions.

Example 1 (Good)
__global__ void vectorAdd(float* a, float* b, float* c, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;

    // Proper bounds checking before array access
    if (idx >= 0 && idx < n)
    {
        c[idx] = a[idx] + b[idx];
    }
}

int main()
{
    const int N = 1000;
    float *d_a, *d_b, *d_c;

    // Allocate GPU memory
    cudaMalloc(&d_a, N * sizeof(float));
    cudaMalloc(&d_b, N * sizeof(float));
    cudaMalloc(&d_c, N * sizeof(float));

    // Launch kernel with proper grid/block sizing
    int blockSize = 256;
    int numBlocks = (N + blockSize - 1) / blockSize;
    vectorAdd<<>>(d_a, d_b, d_c, N);

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    return 0;
}
Example 2 (Bad)
__global__ void vectorAdd(float* a, float* b, float* c, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;

    // BAD: No bounds checking - may access beyond array limits
    c[idx] = a[idx] + b[idx];
}

__global__ void processArray(int* data, int size)
{
    int tid = threadIdx.x;

    // BAD: Assumes array has at least 1024 elements
    data[tid + 1024] = data[tid] * 2;

    // BAD: Using hardcoded offset without validation
    if (tid > 0)
    {
        data[tid - 1] = data[tid + 512]; // May exceed bounds
    }
}

int main()
{
    const int N = 1000;
    float *d_a, *d_b, *d_c;
    int* d_data;

    cudaMalloc(&d_a, N * sizeof(float));
    cudaMalloc(&d_b, N * sizeof(float));
    cudaMalloc(&d_c, N * sizeof(float));
    cudaMalloc(&d_data, 500 * sizeof(int)); // Only 500 elements allocated

    // BAD: Grid might create more threads than array elements
    int blockSize = 256;
    int numBlocks = 8; // Could launch 2048 threads for 1000 elements
    vectorAdd<<>>(d_a, d_b, d_c, N);

    // BAD: Kernel assumes 1024+ elements but only 500 allocated
    processArray<<<1, 256>>>(d_data, 500);

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    cudaFree(d_data);
    return 0;
}
Excerpt from NVIDIA CUDA C++ Guidelines for robust and safety-critical programming, Version 3.0.1, Copyright (C) 2018-2023 NVIDIA Corporation.

Possible Messages

Key

Text

Severity

Disabled

arithmetic_out_of_bounds

Pointer arithmetic on {node0} might create pointer outside array bounds of {name0}

None

False

out_of_bounds

Access into array is out of bounds

None

False

out_of_bounds_nonarray

Pointer arithmetic on {node0} possibly operates on non-array target {name0}

None

False

out_of_bounds_nonarray_arithmetic

Pointer arithmetic on {node0} possibly creates pointer outside of non-array target {name0}

None

False

out_of_bounds_nonarray_arithmetic_undereferenced

Pointer arithmetic on {node0} possibly creates pointer one past non-array target {name0} (but not dereferenced)

None

False

possible_indirect_out_of_bounds

Pointer-indirect access through {node0} might be out of bounds accessing {name0}

None

False

possible_out_of_bounds

Access into array might be out of bounds

None

False

undereferenced_arithmetic_out_of_bounds

Pointer arithmetic on {node0} might create pointer one past the end of {name0} (but not dereferenced)

None

False

undereferenced_out_of_bounds

Access is one past the end of the array (but not dereferenced)

None

False

undereferenced_possible_indirect_out_of_bounds

Pointer-indirect access through {node0} might be one past the end accessing {name0} (but not dereferenced)

None

False

undereferenced_possible_out_of_bounds

Access might be one past the end of the array (but not dereferenced)

None

False

Options

abstract_interpretation_out_of_bounds

abstract_interpretation_out_of_bounds : bool = False

Use additional "symbolic expression analysis" as postprocessing step. This can remove false positives, but might require more time. Option is automatically active if option StaticSemanticAnalysis/performance.general.enhanced_analysis is active.
 

distinguish_nonarray_access_kinds

distinguish_nonarray_access_kinds : bool = False

If false, the rule reports every finding involving a non-array target using messages of kind out_of_bounds_nonarray. If true, it differentiates these findings using the message kinds out_of_bounds_nonarray_arithmetic (for cases where possibly an invalid pointer is formed), out_of_bounds_nonarray_arithmetic_undereferenced (pointer one past target object is formed, but not dereferenced), and out_of_bounds_nonarray (remaining cases where valid pointers are formed).
 

exclude_very_high_indices

exclude_very_high_indices : bool = True

Enables heuristic to detect false positives: When index used for array access is very high in comparison to the array's size, assume false positive.
 

nonarray_only_report_invalid_pointers

nonarray_only_report_invalid_pointers : bool = False

If true, the rule only reports findings where invalid pointers are formed, using either messages of kind out_of_bounds_nonarray (if option distinguish_nonarray_access_kinds is false) or messages of kind out_of_bounds_nonarray_arithmetic (otherwise).
 

report_unbounded_arrays

report_unbounded_arrays : bool = False

If true, accesses into arrays with unknown bound are reported as being potentially outside the allowed range. This affects arrays like extern char buf[];.
 

report_undereferenced_one_past_the_end

report_undereferenced_one_past_the_end : bool = False

If true, report accesses one past the end of an array even if there is no dereference of the resulting pointer.
 

report_unknown_index

report_unknown_index : bool = False

If false, do not report possible out-of-bound findings for which the analysis was not able to infer any restricting information about the array index (this can lead to excluding both false positives and true findings).