CWE-194¶
Unexpected Sign Extension. [Improper-Control-Of-A-Resource-Through-Its-Lifetime]
Required inputs: IR, StaticSemanticAnalysis
Demonstrative Examples
Example 1
The following code reads a maximum size and performs a sanity check on that size. It then performs a strncpy, assuming it will not exceed the boundaries of the array. While the use of "short s" is forced in this particular example, short ints are frequently used within real-world code, such as code that processes structured data.
Example Language:C
int GetUntrustedInt () {
    return(0x0000FFFF);
}
void main (int argc, char **argv) {
    char path[256];
    char *input;
    int i;
    short s;
    unsigned int sz;
    i = GetUntrustedInt();
    s = i;
    /* s is -1 so it passes the safety check - CWE-697 */
    if (s > 256) {
        DiePainfully("go away!\n");
    }
    /* s is sign-extended and saved in sz */
    sz = s;
    /* output: i=65535, s=-1, sz=4294967295 - your mileage may vary */
    printf("i=%d, s=%d, sz=%u\n", i, s, sz);
    input = GetUserInput("Enter pathname:");
    /* strncpy interprets s as unsigned int, so it's treated as MAX_INT
       ( CWE-195 ), enabling buffer overflow ( CWE-119 ) */
    strncpy(path, input, s);
    path[255] = '\0'; /* don't want CWE-170 */
    printf("Path is: %s\n", path);
}
This code first exhibits an example of CWE-839, allowing "s" to be a negative number. When the negative short "s" is converted to an unsigned integer, it becomes an extremely large positive integer. When this converted integer is used by strncpy() it will lead to a buffer overflow (CWE-119).
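The conversion chain in the example can be reproduced in isolation. The following is a minimal sketch and not part of the CWE excerpt: the helper names `via_short` and `length_fits` are hypothetical, and the sketch assumes a 16-bit `short` and a compiler that wraps on narrowing (guaranteed since C++20, implementation-defined before).

```cpp
#include <limits>

static_assert(sizeof(short) == 2, "sketch assumes a 16-bit short");

// Mirrors `s = i; sz = s;` from the example: narrow the int to short,
// then widen the (possibly negative) short to unsigned int.
constexpr unsigned int via_short(int i) {
    return static_cast<unsigned int>(static_cast<short>(i));
}

// Hypothetical safer check: validate the untrusted value as an int,
// including the lower bound, before any narrowing takes place.
constexpr bool length_fits(int untrusted, int capacity) {
    return untrusted >= 0 && untrusted < capacity;
}
```

With these assumptions, `via_short(0x0000FFFF)` yields the largest `unsigned int` value, whereas `length_fits(0x0000FFFF, 256)` rejects the input before it can be sign-extended.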
Excerpts from CWE [https://cwe.mitre.org], Copyright (C) 2006-2026, the MITRE Corporation. See section 9.4. "3rd-Party Licenses" in the documentation for full details.
Possible Messages
| Key | Text | Severity | Disabled |
|---|---|---|---|
| cast_truncate | Possible sign extension in conversion from signed to unsigned type. | None | False |
| cast_underflow | Possible sign extension in conversion from signed to unsigned type. | None | False |
| certain_shift_amount_negative | Shift by a negative bit count (undefined behavior) | None | False |
| certain_shift_amount_too_large | Shift by the integer width or more (undefined behavior) | None | False |
| certain_shift_right_negative | Right shift with negative left-hand-side | None | False |
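The two shift-count conditions named in the messages above (a negative count, and a count equal to or larger than the integer width) can be guarded against before shifting. The following is a minimal sketch; `shift_count_in_range` and `shl_or` are hypothetical helpers, and whether the rule stays silent on code guarded this way is an assumption, not documented behavior.

```cpp
#include <limits>

// Hypothetical helper: true if `amount` is a valid shift count for type T,
// i.e. non-negative and strictly smaller than the bit width of T.
// Intended for types at least as wide as int (narrower types are promoted).
template <typename T>
constexpr bool shift_count_in_range(int amount) {
    // digits excludes the sign bit of signed types, so add it back.
    return amount >= 0 &&
           amount < std::numeric_limits<T>::digits +
                        (std::numeric_limits<T>::is_signed ? 1 : 0);
}

// Shift left only if the count is in range; otherwise return `fallback`.
constexpr unsigned int shl_or(unsigned int value, int amount, unsigned int fallback) {
    return shift_count_in_range<unsigned int>(amount) ? value << amount : fallback;
}
```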
Options¶
This rule shares the following common options: exclude_in_macros, exclude_messages_in_system_headers, excludes, extend_exclude_to_macro_invocations, includes, justification_checker, languages, post_processing, provider, report_at, severity
The following places define options that affect this rule: Stylechecks, Analysis-GlobalOptions
abstract_interpretation_maximal_tracked_array_index¶
abstract_interpretation_maximal_tracked_array_index : int = 10
The number of explicit indices in array expressions per routine tracked by the "symbolic expression analysis". For example, consider the following program.
extern signed char a[6];
int main()
{
    if (a[2] < 0)
    {
        a[2]++;
    }
    if (a[3] < 0)
    {
        a[3]++;
    }
    if (a[4] < 0)
    {
        a[4]++;
    }
    return 0;
}
If the value of this option is set to 2, the first two array index expressions
encountered in the routine are tracked. Hence, the analysis can use the facts
a[2] < 0 and a[3] < 0 to infer that a[2]++
and a[3]++ do not overflow, but it will not track the third array
access in this routine.
A higher value increases the memory and time consumed by the analysis.
abstract_interpretation_overflow¶
abstract_interpretation_overflow : bool = False
abstract_interpretation_overflow_unrolling_level¶
abstract_interpretation_overflow_unrolling_level : int = 0
check_signed¶
check_signed : bool = False
check_unsigned¶
check_unsigned : bool = True
suppress_well_defined_findings¶
suppress_well_defined_findings : SuppressionMode = 'NONE'
Some overflows have well-defined semantics in all C/C++ standard
versions. The typical example is UINT_MAX+1 which is
well-defined as 0 via wraparound. This differs from
INT_MAX+1 which is either undefined or implementation-defined
depending on the considered standard version. Most CPUs will compute
INT_MIN but this wraparound is not guaranteed by any C/C++
standard.
Both cases are overflows and are reported by this rule. However, one might want to suppress messages for the well-defined cases. To suppress them, set this option to a mode other than NONE.
Different C and C++ standard versions differ in what is well-defined, implementation-defined, or undefined. Luckily, if we only consider well-defined and do not discern between implementation-defined and undefined, we end up with only two groups: pre-C++20 and since-C++20.
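The contrast between the two cases (UINT_MAX+1 versus INT_MAX+1) can be sketched as follows. The helper names are hypothetical; `signed_add_overflows` is one common way to test for signed overflow without performing the overflowing addition, and it is not something this rule prescribes.

```cpp
#include <limits>

// Unsigned addition wraps modulo 2^N in every C and C++ standard version,
// so UINT_MAX + 1 is well-defined and yields 0.
constexpr unsigned int wrapping_add(unsigned int a, unsigned int b) {
    return a + b;
}

// Hypothetical pre-check for signed addition: returns true if a + b would
// leave the range of int. The check itself never overflows.
constexpr bool signed_add_overflows(int a, int b) {
    return (b > 0 && a > std::numeric_limits<int>::max() - b) ||
           (b < 0 && a < std::numeric_limits<int>::min() - b);
}
```

A caller would evaluate `signed_add_overflows(a, b)` first and perform `a + b` only if it returns false.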
Option Types¶
These types are used by options listed above:
SuppressionMode¶
An enumeration.
NONE
Suppress nothing.
PRE_CPP2020
Suppress findings that are well-defined before C++20. These are:
- Over- and underflows of unsigned integers during addition, subtraction, and multiplication
- Conversions from unsigned to unsigned integers
- Wrap-around caused by left-shifting of unsigned integers
CPP2020
Suppress findings that are well-defined since C++20. These are:
- Over- and underflows of unsigned integers during addition, subtraction, and multiplication
- Conversions between signed and unsigned integers
- Wrap-around caused by left-shifting
- Shifting negative integers
Surprising mechanics of C++20 signed narrow integers
Since C++20, casts between signed and unsigned are defined as two's-complement wrap-around. Overflows of signed integers are still undefined behavior and are reported by this rule. However, due to integer promotion rules, certain expressions are computed using wider integer types. Because no overflow findings are reported for these expressions, this can give the false impression that signed overflow is no longer undefined.
Suppose that the code is compiled on a platform where short
is smaller than or equal to half the size of an int. Commonly
the sizes are 2 and 4 bytes, so this assumption holds on many
platforms.
In this case, narrow signed integer types such as short or
signed char are first implicitly promoted to int
before the arithmetic operation is executed. Because of this promotion, the
actual operation does not overflow and is thus well-defined. After the
operation, an implicit cast is performed to the narrower type. This cast is
well-defined in C++20 as wrapping around.
Consider the following snippet:
static_assert(sizeof(short) == 2);
static_assert(sizeof(int) == 4);
short a = 0x1000;
short b = 0x1001;
short c = a*b;
C++20 defines c as 0x1000. The reason is that
a*b is implicitly promoted to
static_cast<int>(a)*static_cast<int>(b). After the promotion, the
multiplication does not overflow and yields a well-defined
0x1001000. This number is then implicitly cast to
0x1000 which is also a well-defined operation.
An analogous effect can be observed for signed short addition and
multiplication. Another effect is that it is well-defined to shift by any
count smaller than the bit width of int, even if the shifted integer has
fewer bits.
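The promotion steps described above can be made explicit. The following is a minimal sketch with hypothetical function names; it assumes a 2-byte short and a 4-byte int, and that narrowing wraps (guaranteed since C++20, implementation-defined in earlier versions).

```cpp
static_assert(sizeof(short) == 2, "sketch assumes a 16-bit short");
static_assert(sizeof(int) == 4, "sketch assumes a 32-bit int");

// The multiplication as it is actually evaluated: both operands are
// promoted to int, so the product is computed in 32 bits.
constexpr int promoted_product(short a, short b) {
    return a * b;
}

// The same product after conversion back to short, which keeps only
// the low 16 bits.
constexpr short narrowed_product(short a, short b) {
    return static_cast<short>(a * b);
}

// Shifting a short also happens in int, so counts of 16 or more are
// fine as long as they stay below the width of int.
constexpr int shifted(short value, int amount) {
    return value << amount;
}
```

With a = 0x1000 and b = 0x1001, `promoted_product` gives 0x1001000 and `narrowed_product` gives 0x1000, matching the snippet above.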
DERIVE_FROM_IR
Derive the language version from the IR compilation flags and suppress findings accordingly.