CWE-120

Buffer Copy without Checking Size of Input (‘Classic Buffer Overflow’). [Memory-Buffer-Errors, Improper-Control-Of-A-Resource-Through-Its-Lifetime]

Required inputs: IR, StaticSemanticAnalysis

The product copies an input buffer to an output buffer without verifying that the size of the input buffer is less than the size of the output buffer, leading to a buffer overflow. A buffer overflow condition exists when a product attempts to put more data in a buffer than it can hold, or when it attempts to put data in a memory area outside of the boundaries of a buffer. The simplest type of error, and the most common cause of buffer overflows, is the "classic" case in which the product copies the buffer without restricting how much is copied. Other variants exist, but the existence of a classic overflow strongly suggests that the programmer is not considering even the most basic of security protections.
Demonstrative Examples
Example 1

The following code asks the user to enter their last name and then attempts to store the value entered in the last_name array.

Example Language: C
    char last_name[20];
    printf ("Enter your last name: ");
    scanf ("%s", last_name);

The problem with the code above is that it does not limit the size of the name entered by the user. If the user enters "Very_very_long_last_name", which is 24 characters long, a buffer overflow occurs because the array holds only 20 characters, including the terminating null character.
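
One possible fix, sketched here rather than taken from the CWE entry, is to give the %s conversion a maximum field width that is one less than the array size. The helper name read_last_name, the LAST_NAME_SIZE macro, and the FILE * parameter (added so the routine works on any stream) are assumptions made for illustration.

```c
#include <stdio.h>

#define LAST_NAME_SIZE 20   /* hypothetical name for the array size used above */

/* Reads one whitespace-delimited word into last_name, which must provide
 * LAST_NAME_SIZE bytes. The field width 19 leaves one byte for the
 * terminating null character. Returns 1 on success, 0 otherwise. */
int read_last_name(FILE *in, char *last_name)
{
    return fscanf(in, "%19s", last_name) == 1;
}
```

Called as read_last_name(stdin, last_name), an over-long name is cut off after 19 characters, and the remaining characters stay in the stream for the next read.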

Example 2

The following code attempts to create a local copy of a buffer to perform some manipulations to the data.

Example Language: C
    void manipulate_string(char * string){
        char buf[24];
        strcpy(buf, string);
        ...
    }

However, the programmer does not ensure that the size of the data pointed to by string will fit in the local buffer and copies the data with the potentially dangerous strcpy() function. This may result in a buffer overflow condition if an attacker can influence the contents of the string parameter.
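
A minimal sketch of a checked variant, assuming the caller can handle a rejected input: the length of the source is compared against the local buffer before anything is copied. The name manipulate_string_checked and the 0 / -1 return convention are hypothetical, and string is assumed to be a non-null, null-terminated pointer.

```c
#include <string.h>

/* Hypothetical checked variant of manipulate_string().
 * Returns 0 on success, or -1 if the input does not fit into buf. */
int manipulate_string_checked(const char *string)
{
    char buf[24];
    size_t len = strlen(string);

    if (len >= sizeof buf) {
        return -1;                  /* would not fit together with the '\0' */
    }
    memcpy(buf, string, len + 1);   /* copies the terminating null as well */

    /* ... manipulate buf ... */
    return 0;
}
```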

Example 3

The code below calls the gets() function to read data from standard input.

Example Language: C
    char buf[24];
    printf("Please enter your name and press <Enter>\n");
    gets(buf);
    ...

However, gets() is inherently unsafe because it reads from standard input up to a newline or end-of-file and writes everything into the buffer without checking its size. This allows the user to provide a string that is larger than the buffer, resulting in an overflow condition. The function was removed from the C standard in C11.
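
A common replacement is fgets(), which takes the buffer size as an argument. The following is only a sketch: the helper name read_line and its return convention are assumptions, not part of the CWE entry.

```c
#include <stdio.h>
#include <string.h>

/* Reads at most size - 1 characters of one line from in into buf and
 * removes a trailing newline, if present.
 * Returns 1 on success, 0 on end-of-file, read error, or invalid size. */
int read_line(FILE *in, char *buf, int size)
{
    if (size <= 0 || fgets(buf, size, in) == NULL) {
        return 0;
    }
    buf[strcspn(buf, "\n")] = '\0';   /* strip the newline kept by fgets() */
    return 1;
}
```

With char buf[24], a call such as read_line(stdin, buf, (int)sizeof buf) stores at most 23 characters; any excess input stays in the stream.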

Example 4

In the following example, a server accepts connections from a client and processes the client request. After accepting a client connection, the program obtains client information using the gethostbyaddr() function, copies the hostname of the connected client to a local variable, and writes that hostname to a log file.

Example Language: C
    ...
        struct hostent *clienthp;
        char hostname[MAX_LEN];

        // create server socket, bind to server address and listen on socket
        ...

        // accept client connections and process requests
        int count = 0;
        for (count = 0; count < MAX_CONNECTIONS; count++) {
            int clientlen = sizeof(struct sockaddr_in);
            int clientsocket = accept(serversocket, (struct sockaddr *)&clientaddr, &clientlen);

            if (clientsocket >= 0) {
                clienthp = gethostbyaddr((char*) &clientaddr.sin_addr.s_addr, sizeof(clientaddr.sin_addr.s_addr), AF_INET);
                strcpy(hostname, clienthp->h_name);
                logOutput("Accepted client connection from host ", hostname);

                // process client request
                ...
                close(clientsocket);
            }
        }
        close(serversocket);

    ...

However, the hostname of the connected client may be longer than the size allocated for the local hostname variable. In that case, copying the client hostname to the local variable with strcpy() results in a buffer overflow.
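
Because the hostname is only logged here, truncating it may be acceptable. The sketch below assumes that; the helper name copy_truncated is hypothetical, and it treats a null source (for example, when the lookup failed) as an empty string.

```c
#include <string.h>

/* Copies src into dst, truncating to dst_size - 1 characters, and always
 * null-terminates dst. A null src is stored as an empty string.
 * Returns the number of characters stored, excluding the terminator. */
size_t copy_truncated(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    if (dst_size == 0) {
        return 0;                 /* no room even for the terminator */
    }
    if (src == NULL) {
        src = "";
    }
    len = strlen(src);
    if (len > dst_size - 1) {
        len = dst_size - 1;       /* truncate to what fits */
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

In the server loop, the strcpy() call could then become copy_truncated(hostname, sizeof hostname, clienthp ? clienthp->h_name : NULL).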

Functional Areas
  • Memory Management
Excerpts from CWE [https://cwe.mitre.org], Copyright (C) 2006-2026, the MITRE Corporation. See section 9.4. "3rd-Party Licenses" in the documentation for full details.

Possible Messages

Key                             Severity  Disabled  Text
------------------------------  --------  --------  ----
buffer_too_small                None      False     {} may write up to {} characters to buffer of size {}.
forbidden_libheader_symbol_use  None      False     Potential buffer overflow.
maybe_too_small                 None      False     Target buffer may be too small. Use snprintf() instead.
possible_invalid_call_argument  None      False     Call to {} with string buffer argument {} that possibly has no valid null delimiter character.
possible_write_beyond_argument  None      False     Call to {} might result in a write access beyond the bounds of argument {}, since argument {} might be too large.
too_small                       None      False     Target buffer has {} characters, but sprintf() may write up to {} characters (including null terminator).
unknown_buffer_size             None      False     Potential buffer overflow: {} used with buffer of unknown size.
unlimited_read                  None      False     Potential buffer overflow: {} has no limit on amount of characters read.
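
The too_small and maybe_too_small messages concern sprintf() calls whose output may exceed the target buffer, and maybe_too_small suggests snprintf() as the replacement. A minimal sketch of that replacement follows; the function name format_id and the "id=%d" format are hypothetical, and no claim is made about which exact calls the rule reports. For comparison, sprintf(buf, "id=%d", value) into an 8-byte buffer can write up to 15 bytes, assuming a 32-bit int.

```c
#include <stdio.h>

/* Formats "id=<value>" into buf without writing more than size bytes.
 * Returns 0 if the whole text fit, -1 if it was truncated or an
 * encoding error occurred. */
int format_id(char *buf, size_t size, int value)
{
    int n = snprintf(buf, size, "id=%d", value);

    /* snprintf() returns the length the full output would have had. */
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```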

Options

allow_extra_args

allow_extra_args : bool = False

Whether to allow additional arguments that are not used by the format string.
 

allow_gnu_extensions

allow_gnu_extensions : bool = False

Whether to allow the GNU extensions to format specifications.
 

allow_unknown_specs

allow_unknown_specs : bool = False

Whether to allow unknown format specifications. It may be necessary to set this option when using implementation-specific extensions. Arguments are not checked when the format string contains unknown format specifications.
 

concat_operations

concat_operations

Type: dict[bauhaus.analysis.config.QualifiedName, typing.Tuple[int, int]]

Default:

{
   'strcat': (0, 1)
}
Names of buffer-concatenating functions that are relevant call targets for this check. Each name maps to a pair: the position of the argument pointing to the destination buffer, and the position of the argument referencing the buffer that is appended to the end of the destination buffer.
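
For strcat, the default (0, 1) names the destination as argument 0 and the appended buffer as argument 1. A bounded alternative to such a call might look like the following sketch; the helper name append_checked is hypothetical, and both arguments are assumed to be null-terminated.

```c
#include <string.h>

/* Appends src to the string in dst only if the result, including the
 * terminating null character, fits into dst_size bytes.
 * Returns 0 on success, -1 if dst would overflow (dst is left unchanged). */
int append_checked(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);
    size_t add = strlen(src);

    if (used >= dst_size || add >= dst_size - used) {
        return -1;
    }
    memcpy(dst + used, src, add + 1);   /* includes the terminator */
    return 0;
}
```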
 

copy_operations

copy_operations

Type: dict[bauhaus.analysis.config.QualifiedName, typing.Tuple[int, int]]

Default:

{
   'strcpy': (0, 1)
}
Names of buffer copy functions that are relevant call targets for this check. Each name maps to a pair: the position of the destination argument and the position of the source argument of the copy operation.
 

delimiter_of_arguments

delimiter_of_arguments

Type: dict[bauhaus.analysis.config.QualifiedName, set[int]]

Default:

{
   'strcat': {0, 1},
   'strchr': {0},
   'strcmp': {0, 1},
   'strcoll': {0, 1},
   'strcpy': {1},
   'strcspn': {0, 1},
   'strlen': {0},
   'strpbrk': {0, 1},
   'strrchr': {0},
   'strspn': {0, 1},
   'strstr': {0, 1},
   'strtok': {0, 1}
}
Names of functions that are relevant call targets for this check. Each name maps to the positions of the parameters whose referenced buffers are checked for proper termination by a null character.
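
A typical way to end up with an unterminated buffer is strncpy(), which writes no null character when the source has at least as many characters as the given limit. The sketch below shows one way to guarantee termination before the buffer reaches functions such as strlen or strcmp; the helper name copy_field is hypothetical, and whether the rule reports a particular call is not claimed here.

```c
#include <string.h>

/* Copies at most dst_size - 1 characters of src into dst and then
 * terminates dst explicitly, since strncpy() alone may not do so. */
void copy_field(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0) {
        return;
    }
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```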
 

exclude_warnings_for_unknown_arguments

exclude_warnings_for_unknown_arguments : bool = False

Exclude warnings for cases where nothing at all is known about the arguments of an operation, caused e.g. by using return values of external routines.
 

functions

functions

Type: dict[bauhaus.analysis.config.QualifiedName, typing.Tuple[str, int, typing.Optional[int]]]

Default:

{
   'fscanf': ('scanf', 1, 2),
   'scanf': ('scanf', 0, 1),
   'sscanf': ('scanf', 1, 2),
   'vfscanf': ('scanf', 1, None),
   'vscanf': ('scanf', 0, None),
   'vsscanf': ('scanf', 1, None)
}
A dictionary mapping the name of each function to check to a triple (function_kind, fmt_param_index, arg_start_index), where function_kind is either printf or scanf, fmt_param_index is the index of the format-string parameter, and arg_start_index is the index of the first variadic argument. In the defaults, arg_start_index is None for the va_list variants (vfscanf, vscanf, vsscanf), which have no variadic arguments.
 

ignore_calls_in_functions

ignore_calls_in_functions : set[bauhaus.analysis.config.QualifiedName] = set()

Qualified names of function definitions in which calls to relevant functions are ignored for this check.
 

included_headers

included_headers : bool = True

Whether the rule should also look in headers included by the configured symbol_header.
 

symbol_header

symbol_header : set[str] = set()

Names of the system or user header files whose symbols should not be used.
 

symbols

symbols : set[bauhaus.analysis.config.QualifiedName] = {'gets'}

Names of symbols which are forbidden.
 

system_symbol_header

system_symbol_header : set[str] = {'stdio', 'wchar'}

Names of the system header files whose symbols should not be used.
 

translate_header_name

translate_header_name : bool = True

Whether to auto-translate the symbol_header (e.g. stdlib{.h} -> cstdlib).
 

use_static_semantic_analysis

use_static_semantic_analysis : bool = True

Whether the rule should use the results of StaticSemanticAnalysis to check buffer sizes. Findings can differ because the sizes of buffer arguments are computed more accurately. This option does not force StaticSemanticAnalysis to be enabled; if it is disabled, the results are less accurate.
 

user_symbol_header

user_symbol_header : set[str] = set()

Names of the user header files whose symbols should not be used.