CWE-119¶
Improper Restriction of Operations within the Bounds of a Memory Buffer. [Improper-Control-Of-A-Resource-Through-Its-Lifetime, Top25-2024-20]
Required inputs: IR, StaticSemanticAnalysis
Certain languages allow direct addressing of memory locations and do not automatically ensure that these locations are valid for the memory buffer that is being referenced. This can cause read or write operations to be performed on memory locations that may be associated with other variables, data structures, or internal program data.
As a result, an attacker may be able to execute arbitrary code, alter the intended control flow, read sensitive information, or cause the system to crash.
Demonstrative Examples
Example 1
This example takes an IP address from a user, verifies that it is well formed and then looks up the hostname and copies it into a buffer.
Example Language: C
void host_lookup(char *user_supplied_addr){
    struct hostent *hp;
    in_addr_t *addr;
    char hostname[64];
    in_addr_t inet_addr(const char *cp);
    /*routine that ensures user_supplied_addr is in the right format for conversion */
    validate_addr_form(user_supplied_addr);
    addr = inet_addr(user_supplied_addr);
    hp = gethostbyaddr( addr, sizeof(struct in_addr), AF_INET);
    strcpy(hostname, hp->h_name);
}
This function allocates a 64-byte buffer to store the hostname, but nothing guarantees that the hostname fits in 64 bytes. If an attacker specifies an address that resolves to a very long hostname, the function may overwrite sensitive data or even hand control flow to the attacker.
Note that this example also contains an unchecked return value (CWE-252) that can lead to a NULL pointer dereference (CWE-476).
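As a rough sketch of how the copy in this example could be bounded, the helper below rejects any source string that does not fit the destination. The name copy_hostname and its return convention are assumptions made for illustration; they are not part of the CWE text.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst, whose capacity is dst_size bytes
 * including the terminator. Returns 0 on success and -1 if an argument is
 * unusable or src does not fit. */
int copy_hostname(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || dst_size == 0 || src == NULL) {
        return -1;
    }
    size_t len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';              /* leave dst as a valid empty string */
        return -1;
    }
    memcpy(dst, src, len + 1);      /* len + 1 also copies the terminator */
    return 0;
}
```

In host_lookup, a caller would presumably first check that hp is not NULL and then call copy_hostname(hostname, sizeof hostname, hp->h_name), treating a return value of -1 as an error.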
Example 2
This example applies an encoding procedure to an input string and stores it into a buffer.
Example Language: C
char * copy_input(char *user_supplied_string){
    int i, dst_index;
    char *dst_buf = (char*)malloc(4*sizeof(char) * MAX_SIZE);
    if ( MAX_SIZE <= strlen(user_supplied_string) ){
        die("user string too long, die evil hacker!");
    }
    dst_index = 0;
    for ( i = 0; i < strlen(user_supplied_string); i++ ){
        if( '&' == user_supplied_string[i] ){
            dst_buf[dst_index++] = '&';
            dst_buf[dst_index++] = 'a';
            dst_buf[dst_index++] = 'm';
            dst_buf[dst_index++] = 'p';
            dst_buf[dst_index++] = ';';
        }
        else if ('<' == user_supplied_string[i] ){
            /* encode to &lt; */
        }
        else dst_buf[dst_index++] = user_supplied_string[i];
    }
    return dst_buf;
}
The programmer attempts to encode the ampersand character in the user-controlled string, but the length of the string is validated before the encoding is applied. Furthermore, the programmer assumes that encoding expands a character by at most a factor of 4, while encoding an ampersand expands it by a factor of 5 (one '&' becomes the five characters "&amp;"). As a result, an attacker who provides a string of many ampersands can overflow the destination buffer.
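One way to avoid this mismatch is to size the destination for the worst-case expansion actually used by the encoder. The following is a minimal sketch under stated assumptions: encode_input, the value of MAX_SIZE and MAX_EXPANSION are hypothetical, and only the ampersand case is handled because the original leaves the '<' branch elided.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_SIZE 16        /* assumed limit; the original does not define it */
#define MAX_EXPANSION 5    /* "&amp;" is the longest replacement shown */

/* Returns a newly allocated encoded copy of input, or NULL if input is NULL,
 * too long, or allocation fails. The caller frees the result. */
char *encode_input(const char *input)
{
    if (input == NULL) {
        return NULL;
    }
    size_t len = strlen(input);
    if (len >= MAX_SIZE) {
        return NULL;
    }
    /* Worst case: every character expands to MAX_EXPANSION bytes, plus '\0'. */
    char *dst = malloc(len * MAX_EXPANSION + 1);
    if (dst == NULL) {
        return NULL;
    }
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        if (input[i] == '&') {
            memcpy(dst + out, "&amp;", 5);
            out += 5;
        } else {
            dst[out++] = input[i];
        }
    }
    dst[out] = '\0';
    return dst;
}
```

Unlike the original, this sketch also terminates the output string and validates the length before allocating anything.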
Example 3
The following example asks a user for an offset into an array to select an item.
Example Language: C
int main (int argc, char **argv) {
    char *items[] = {"boat", "car", "truck", "train"};
    int index = GetUntrustedOffset();
    printf("You selected %s\n", items[index-1]);
}
The programmer allows the user to specify which element in the list to select, but an attacker can provide an out-of-bounds offset, resulting in a buffer over-read (CWE-126).
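A bounds-checked variant of this selection might look like the sketch below. The function name select_item and the choice to return NULL for an invalid selection are assumptions for illustration only.

```c
#include <stddef.h>

static const char *const items[] = {"boat", "car", "truck", "train"};

/* Returns the item for a 1-based selection, or NULL when index is out of range. */
const char *select_item(int index)
{
    const int count = (int)(sizeof items / sizeof items[0]);
    if (index < 1 || index > count) {
        return NULL;
    }
    return items[index - 1];
}
```

The caller would then print the item only when the result is not NULL.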
Example 4
In the following code, the method retrieves a value from an array at the index that is passed to the method as an input parameter.
Example Language: C
int getValueFromArray(int *array, int len, int index) {
    int value;
    // check that the array index is less than the maximum
    // length of the array
    if (index < len) {
        // get the value at the specified index of the array
        value = array[index];
    }
    // if array index is invalid then output error message
    // and return value indicating error
    else {
        printf("Value is: %d\n", array[index]);
        value = -1;
    }
    return value;
}
However, this method only verifies that the given array index is less than the maximum length of the array; it does not check for the minimum value (CWE-839). A negative value is therefore accepted as the array index, which results in an out-of-bounds read (CWE-125) and may allow access to sensitive memory. The input array index should be checked to verify that it is within both the maximum and the minimum range required for the array (CWE-129). In this example the if statement should be modified to include a minimum range check, as shown below.
Example Language: C
...
// check that the array index is within the correct
// range of values for the array
if (index >= 0 && index < len) {
...
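For completeness, a full version of the function with both range checks could be sketched as follows. Note one deliberate difference from the original: the error branch here does not read array[index], since that read would itself be out of bounds. The error message text is an assumption.

```c
#include <stdio.h>

/* Returns array[index] when index is within [0, len), otherwise -1. */
int getValueFromArray(const int *array, int len, int index)
{
    if (array != NULL && index >= 0 && index < len) {
        return array[index];
    }
    /* Report the invalid index without touching the array. */
    fprintf(stderr, "Index %d is out of range [0, %d)\n", index, len);
    return -1;
}
```

Because -1 can also be a legitimate element value, a caller that needs to tell the two cases apart would need a separate status result; that refinement is left out of this sketch.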
Example 5
Windows provides the _mbs family of functions to perform various operations on multibyte strings. When these functions are passed a malformed multibyte string, such as a string containing a valid leading byte followed by a single null byte, they can read or write past the end of the string buffer, causing a buffer overflow. The following functions all pose a risk of buffer overflow: _mbsinc, _mbsdec, _mbsncat, _mbsncpy, _mbsnextc, _mbsnset, _mbsrev, _mbsset, _mbsstr, _mbstok, _mbccpy, _mbslen.
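The malformed case described here, a lead byte directly followed by the terminator, can be detected before a string is handed to such functions. The portable sketch below is illustrative only: mbs_is_well_terminated and is_lead_byte are hypothetical names, and the lead-byte ranges are an assumption that matches code page 932 (Shift-JIS). On Windows, the C runtime's own lead-byte test for the active code page would normally be used instead.

```c
#include <stddef.h>

/* Assumption: lead-byte ranges of code page 932 (Shift-JIS). */
static int is_lead_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Returns 1 if s is non-NULL and no lead byte is directly followed by the
 * terminating null byte; returns 0 otherwise. */
int mbs_is_well_terminated(const unsigned char *s)
{
    if (s == NULL) {
        return 0;
    }
    while (*s != '\0') {
        if (is_lead_byte(*s)) {
            if (s[1] == '\0') {
                return 0;       /* dangling lead byte at end of string */
            }
            s += 2;             /* skip lead byte and trail byte */
        } else {
            s += 1;
        }
    }
    return 1;
}
```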
Excerpts from CWE [https://cwe.mitre.org], Copyright (C) 2006-2026, the MITRE Corporation. See section 9.4. "3rd-Party Licenses" in the documentation for full details.
Possible Messages
| Key | Text | Severity | Disabled |
|---|---|---|---|
| arithmetic_out_of_bounds | Pointer arithmetic on {node0} might create pointer outside array bounds of {name0} | None | False |
| assigned_to_pointer_to_const | Assigning the address of a partially initialized variable to some pointer-to-const | None | False |
| double_free | Dynamic memory released here was already released earlier | None | False |
| out_of_bounds | Access into array is out of bounds | None | False |
| pass_as_pointer_to_const_param | Passing uninitialized variable by pointer as function parameter with pointer-to-const type | None | False |
| possible_double_free | Dynamic memory released here possibly already released earlier | None | False |
| possible_indirect_out_of_bounds | Pointer-indirect access through {node0} might be out of bounds accessing {name0} | None | False |
| possible_invalid_call_argument | Call to {} with string buffer argument {} that possibly has no valid null delimiter character. | None | False |
| possible_out_of_bounds | Access into array might be out of bounds | None | False |
| possible_return_value_uninit | Function return value is potentially not initialized | None | False |
| possible_uninit | Use of possibly uninitialized variable | None | False |
| possible_use_after_free | Dynamic memory possibly used after it was previously released | None | False |
| possible_write_beyond_argument | Call to {} might result in a write access beyond the bounds of argument {}, since argument {} might be too large. | None | False |
| possibly_initialized | Use of possibly uninitialized variable (previous call {node0} might have initialized the variable) | None | False |
| return_value_uninit | Function return value is not initialized | None | False |
| undereferenced_arithmetic_out_of_bounds | Pointer arithmetic on {node0} might create pointer one past the end of {name0} (but not dereferenced) | None | False |
| undereferenced_out_of_bounds | Access is one past the end of the array (but not dereferenced) | None | False |
| undereferenced_possible_indirect_out_of_bounds | Pointer-indirect access through {node0} might be one past the end accessing {name0} (but not dereferenced) | None | False |
| undereferenced_possible_out_of_bounds | Access might be one past the end of the array (but not dereferenced) | None | False |
| uninit | Use of uninitialized variable | None | False |
| use_after_free | Dynamic memory used after it was previously released | None | False |
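Several of the messages above concern released memory (double_free, use_after_free and their possible_ variants). As a rough illustration of one common defensive pattern, the sketch below frees a block through a pointer to the caller's pointer and clears it afterwards. The name release_buffer is hypothetical, and which messages an analysis run actually emits for code like this depends on the tool and its configuration.

```c
#include <stdlib.h>

/* Hypothetical helper: frees *pp and clears the caller's pointer, so that a
 * second call or a later NULL check cannot touch the released block. */
void release_buffer(char **pp)
{
    if (pp == NULL) {
        return;
    }
    free(*pp);      /* free(NULL) is a no-op */
    *pp = NULL;
}
```

Calling release_buffer twice on the same variable is harmless, because the second call only frees a NULL pointer.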
Options¶
This rule shares the following common options: exclude_in_macros, exclude_messages_in_system_headers, excludes, extend_exclude_to_macro_invocations, includes, justification_checker, languages, post_processing, provider, report_at, severity
The following places define options that affect this rule: Stylechecks, Analysis-GlobalOptions
abstract_interpretation_out_of_bounds¶
abstract_interpretation_out_of_bounds : bool = False
additional_local_array_check¶
additional_local_array_check : bool = True
int example()
{
    int a[10];
    int b[20];
    int uninit_var;
    for (int i = 0; i < 10; ++i)
    {
L1:     a[i] = uninit_var; // use of uninit_var reported
        b[i] = i;
    }
    int result = a[3]; // not reported, since already reported at L1
    result += b[15]; // reported; b[] is not (completely) initialized
    return result;
}
assume_globals_are_initialized¶
assume_globals_are_initialized : bool = True
check_array_access_with_unknown_index¶
check_array_access_with_unknown_index : bool = False
Whether an array access a[i] with a non-literal index i should be checked as well.
concat_operations¶
concat_operations
Names of buffer-concatenating functions that are relevant as call targets for this check, each mapped to the position of the argument pointing to the destination buffer and the position of the argument referencing the buffer that is appended to the end of the destination buffer.
Type: dict[bauhaus.analysis.config.QualifiedName, typing.Tuple[int, int]]
Default:
{ 'strcat': (0, 1) }
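In the default entry, strcat is mapped to (0, 1): argument 0 is the destination buffer and argument 1 is the buffer appended to it. The sketch below shows what a bounded concatenation with the same argument roles could look like. The name append_checked and the extra dst_size parameter are assumptions for illustration and are not part of the tool or of the C library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends src to the string in dst, whose total capacity
 * is dst_size bytes. Returns 0 on success, -1 if dst holds no terminator
 * within dst_size bytes or if the result would not fit. */
int append_checked(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0) {
        return -1;
    }
    const char *end = memchr(dst, '\0', dst_size);
    if (end == NULL) {
        return -1;                      /* dst is not null-terminated */
    }
    size_t used = (size_t)(end - dst);
    size_t add = strlen(src);
    if (add >= dst_size - used) {
        return -1;                      /* no room for src plus terminator */
    }
    memcpy(dst + used, src, add + 1);   /* add + 1 copies the terminator */
    return 0;
}
```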
copy_operations¶
copy_operations
Names of buffer copy functions that are relevant as call targets for this check, each mapped to the position of the destination argument and the position of the source argument of the copy operation.
Type: dict[bauhaus.analysis.config.QualifiedName, typing.Tuple[int, int]]
Default:
{ 'strcpy': (0, 1) }
delimiter_of_arguments¶
delimiter_of_arguments
Names of functions that are relevant as call targets for this check, each mapped to the positions of the parameters whose referenced buffers should be checked for proper termination by a null terminator.
Type: dict[bauhaus.analysis.config.QualifiedName, set[int]]
Default:
{ 'strcat': {0, 1}, 'strchr': {0}, 'strcmp': {0, 1}, 'strcoll': {0, 1}, 'strcpy': {1}, 'strcspn': {0, 1}, 'strlen': {0}, 'strpbrk': {0, 1}, 'strrchr': {0}, 'strspn': {0, 1}, 'strstr': {0, 1}, 'strtok': {0, 1} }
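A typical way for a buffer to lose its null terminator is a strncpy call whose source is at least as long as the given count. As a minimal sketch, the hypothetical helper copy_terminated below truncates and always terminates the destination, so that the buffer can afterwards be passed safely to functions such as strlen or strcmp from the list above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies at most dst_size - 1 bytes of src into dst and
 * always writes a terminator. Longer input is silently truncated. */
void copy_terminated(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0) {
        return;
    }
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';   /* strncpy does not terminate on truncation */
}
```

Silent truncation is a design choice of this sketch; code that must not lose data would report an error instead.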
exclude_from_pointer_to_const_param_check¶
exclude_from_pointer_to_const_param_check : set[bauhaus.analysis.config.QualifiedName] = {'__builtin_object_size'}
exclude_very_high_indices¶
exclude_very_high_indices : bool = True
exclude_warnings_for_unknown_arguments¶
exclude_warnings_for_unknown_arguments : bool = False
functions_with_ignored_deallocators¶
functions_with_ignored_deallocators : set[str] = set()
ignore_calls_in_functions¶
ignore_calls_in_functions : set[bauhaus.analysis.config.QualifiedName] = set()
report_freed_this_at_call¶
report_freed_this_at_call : bool = False
report_read_pointer_args_in_calls_to_undefined¶
report_read_pointer_args_in_calls_to_undefined : bool = True
report_unbounded_arrays¶
report_unbounded_arrays : bool = False
Example of an array declared without a bound: extern char buf[];
report_undereferenced_one_past_the_end¶
report_undereferenced_one_past_the_end : bool = False
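For context, C permits forming and comparing a pointer one past the last element of an array, as long as that pointer is never dereferenced. The sketch below shows this idiom; sum_array is a hypothetical example function, and whether the analysis reports such code is assumed to depend on this option.

```c
#include <stddef.h>

/* Sums n elements using a one-past-the-end pointer as the loop bound. */
long sum_array(const int *a, size_t n)
{
    const int *end = a + n;     /* one past the end: formed but never dereferenced */
    long total = 0;
    for (const int *p = a; p != end; ++p) {
        total += *p;            /* p differs from end here, so it points into the array */
    }
    return total;
}
```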
report_unknown_index¶
report_unknown_index : bool = False
resources¶
resources
Set of resources to be checked (selection of rules in the Resources group).
Type: set[str]
Default:
{'C++ArrayHeapMemory', 'C++HeapMemory', 'CudaAsyncMemory', 'CudaDeviceMemory', 'CudaDriverAsyncMemory', 'CudaHostMemory', 'CudaManagedMemory', 'FileHandle', 'HeapMemory', 'UniquePtrHeapMemory'}
track_conditional_initialization¶
track_conditional_initialization : bool = True
use_semantic_analysis¶
use_semantic_analysis : bool = True
witness_paths¶
witness_paths : bool = True
writing_into_pointer_to_const¶
writing_into_pointer_to_const
Names of routines, each mapped to the index (starting at 0) of a parameter that is declared as pointer-to-const although the routine still writes into the pointee.
Type: dict[bauhaus.analysis.config.QualifiedName, int]
Default:
{ 'cudaMemcpyToSymbol': 0 }