CWE-194¶

Unexpected Sign Extension. [Improper-Control-Of-A-Resource-Through-Its-Lifetime]

Required inputs: IR, StaticSemanticAnalysis

The product performs an operation on a number that causes it to be sign extended when it is transformed into a larger data type. When the original number is negative, this can produce unexpected values that lead to resultant weaknesses.
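As a minimal sketch of the effect described above, the following C++ fragment widens the same 8-bit value once through its signed type and once through the unsigned type of the same width. The function names are illustrative only and are not part of any tool or library; the fragment assumes the fixed-width types from `<cstdint>` are available.

```cpp
#include <cstdint>

// Widen directly from the signed type: the sign bit is replicated into
// the upper bits of the wider result.
constexpr std::uint32_t widen_sign_extended(std::int8_t value) {
    return static_cast<std::uint32_t>(value);
}

// Convert to the unsigned type of the same width first: the upper bits
// of the wider result are zero.
constexpr std::uint32_t widen_zero_extended(std::int8_t value) {
    return static_cast<std::uint32_t>(static_cast<std::uint8_t>(value));
}

static_assert(widen_sign_extended(-1) == 0xFFFFFFFFu, "sign extension fills the upper 24 bits");
static_assert(widen_zero_extended(-1) == 0x000000FFu, "zero extension keeps only the low 8 bits");
static_assert(widen_sign_extended(5) == widen_zero_extended(5), "non-negative values agree");
```

Only negative inputs differ between the two paths, which is why the problem tends to surface only with unexpected input.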

Demonstrative Examples

Example 1

The following code reads a maximum size and performs a sanity check on that size. It then performs a strncpy, assuming it will not exceed the boundaries of the array. While the use of "short s" is forced in this particular example, short ints are frequently used within real-world code, such as code that processes structured data.

Example Language: C
    int GetUntrustedInt () {
        return(0x0000FFFF);
    }

    void main (int argc, char **argv) {
        char path[256];
        char *input;
        int i;
        short s;
        unsigned int sz;

        i = GetUntrustedInt();
        s = i;
        /* s is -1 so it passes the safety check - CWE-697 */
        if (s > 256) {
            DiePainfully("go away!\n");
        }

        /* s is sign-extended and saved in sz */
        sz = s;

        /* output: i=65535, s=-1, sz=4294967295 - your mileage may vary */
        printf("i=%d, s=%d, sz=%u\n", i, s, sz);

        input = GetUserInput("Enter pathname:");

        /* strncpy interprets s as unsigned int, so it's treated as MAX_INT
        (CWE-195), enabling buffer overflow (CWE-119) */
        strncpy(path, input, s);
        path[255] = '\0'; /* don't want CWE-170 */
        printf("Path is: %s\n", path);
    }

This code first exhibits an example of CWE-839, allowing "s" to be a negative number. When the negative short "s" is converted to an unsigned integer, it becomes an extremely large positive integer. When this converted integer is used by strncpy() it will lead to a buffer overflow (CWE-119).
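One way to break the chain described above is to validate the untrusted value in its original wide type, rejecting both negative and oversized values, before it is narrowed or used as a length. The following C++ sketch illustrates the idea; `is_valid_length` and `kPathSize` are hypothetical names introduced here and are not part of the CWE example.

```cpp
#include <cstddef>

// Hypothetical buffer size, mirroring char path[256] in the example.
constexpr std::size_t kPathSize = 256;

// Accepts only lengths in [0, kPathSize - 1], leaving room for the
// terminating '\0'. The check runs on the int value, before any
// narrowing to short or conversion to an unsigned type.
constexpr bool is_valid_length(int untrusted) {
    return untrusted >= 0 && static_cast<std::size_t>(untrusted) < kPathSize;
}

static_assert(!is_valid_length(0x0000FFFF), "65535 is rejected while still an int");
static_assert(!is_valid_length(-1), "negative lengths are rejected");
static_assert(is_valid_length(255), "largest accepted length");
```

Unlike the single comparison `s > 256`, this check has a lower bound, so a value that has already become negative cannot pass it.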

Possible Messages

Key	Text	Severity	Disabled
cast_truncate	Possible sign extension in conversion from signed to unsigned type.	None	False
cast_underflow	Possible sign extension in conversion from signed to unsigned type.	None	False
certain_shift_amount_negative	Shift by a negative bit count (undefined behavior)	None	False
certain_shift_amount_too_large	Shift by the integer width or more (undefined behavior)	None	False
certain_shift_right_negative	Right shift with negative left-hand-side	None	False
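The shift-related messages above concern shift amounts that are negative or at least as large as the operand width. As an illustration only (this is not how the rule works internally, and `safe_shift_left` is a hypothetical helper), a shift can be guarded so that neither condition arises:

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: shifts only when the amount is non-negative and
// smaller than the bit width of the operand; otherwise yields the fallback.
constexpr std::uint32_t safe_shift_left(std::uint32_t value, int amount,
                                        std::uint32_t fallback = 0) {
    return (amount >= 0 && amount < std::numeric_limits<std::uint32_t>::digits)
               ? (value << amount)
               : fallback;
}

static_assert(safe_shift_left(1u, 31) == 0x80000000u, "largest valid shift amount");
static_assert(safe_shift_left(1u, 32) == 0u, "shift by the full width is refused");
static_assert(safe_shift_left(1u, -1) == 0u, "negative shift amount is refused");
```

The operand is unsigned here, so bits shifted out at the top are simply discarded.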

Options¶

This rule shares the following common options: exclude_in_macros, exclude_messages_in_system_headers, excludes, extend_exclude_to_macro_invocations, includes, justification_checker, languages, post_processing, provider, report_at, severity
The following places define options that affect this rule: Stylechecks, Analysis-GlobalOptions

abstract_interpretation_maximal_tracked_array_index¶

abstract_interpretation_maximal_tracked_array_index : int = 10

The number of explicit indices in array expressions per routine tracked by the "symbolic expression analysis". For example, consider the following program.

extern signed char a[6];
int main()
{
    if (a[2] < 0)
    {
        a[2]++;
    }
    if (a[3] < 0)
    {
        a[3]++;
    }
    if (a[4] < 0)
    {
        a[4]++;
    }
    return 0;
}

If the value of this option is set to 2, the first two array index expressions encountered in the routine are tracked. Hence, the analysis can use the facts a[2] < 0 and a[3] < 0 to infer that a[2]++ and a[3]++ do not overflow, but it will not track the third array access in this routine.

A higher value of the option can cause more consumption of memory and time for the analysis.

abstract_interpretation_overflow¶

abstract_interpretation_overflow : bool = False

Use abstract-interpretation-based "symbolic expression analysis" as an additional postprocessing step.

abstract_interpretation_overflow_unrolling_level¶

abstract_interpretation_overflow_unrolling_level : int = 0

How many levels of conditions are traversed to compute additional constraints for the "symbolic expression analysis".

check_signed¶

check_signed : bool = False

Whether issues for signed integer operations should be reported. For casts including implicit conversions, the target type of the cast is used.

check_unsigned¶

check_unsigned : bool = True

Whether wrap-around for unsigned integer operations should be reported. For casts including implicit conversions, the target type of the cast is used.

suppress_well_defined_findings¶

suppress_well_defined_findings : SuppressionMode = 'NONE'

Some overflows have well-defined semantics in all C/C++ standard versions. The typical example is UINT_MAX+1 which is well-defined as 0 via wraparound. This differs from INT_MAX+1 which is either undefined or implementation-defined depending on the considered standard version. Most CPUs will compute INT_MIN but this wraparound is not guaranteed by any C/C++ standard.

Both cases are overflows and are reported by this rule. However, one might want to suppress messages for the well-defined cases. To do so, set this option to a value other than 'NONE'.

Different C and C++ standard versions differ in what is well-defined, implementation-defined, or undefined. Luckily, if we only consider well-defined and do not discern between implementation-defined and undefined, we end up with only two groups: pre-C++20 and since-C++20.
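The well-defined UINT_MAX+1 case mentioned above can be checked at compile time. This is a small sketch rather than tool output; `wrapping_increment` is a hypothetical name, and the fragment relies only on `std::numeric_limits` from the standard library.

```cpp
#include <limits>

constexpr unsigned int kUintMax = std::numeric_limits<unsigned int>::max();

// Unsigned arithmetic is performed modulo 2^N, so incrementing the
// maximum value wraps to 0.
constexpr unsigned int wrapping_increment(unsigned int value) {
    return value + 1u;
}

static_assert(wrapping_increment(kUintMax) == 0u, "UINT_MAX + 1 wraps to 0");
static_assert(0u - 1u == kUintMax, "0 - 1 wraps to UINT_MAX");

// The signed counterpart, INT_MAX + 1, is deliberately not evaluated here
// because its result is not guaranteed by the language.
```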

Option Types¶

These types are used by options listed above:

SuppressionMode¶

An enumeration.

NONE

Suppress nothing.

PRE_CPP2020

Suppress findings that are well-defined before C++20. These are:

Over- and underflows of unsigned integers during addition, subtraction, and multiplication
Conversions from unsigned to unsigned integers
Wrap-around caused by left-shifting of unsigned integer

CPP2020

Suppress findings that are well-defined since C++20. These are:

Over- and underflows of unsigned integers during addition, subtraction, and multiplication
Conversions between signed and unsigned integers
Wrap-around caused by left-shifting
Shifting negative integers

Surprising mechanics of C++20 signed narrow integers

Since C++20, casts between signed and unsigned are defined as two's-complement wrap-around. Overflows of signed integers are still undefined behavior and are reported by this rule. However, due to the integer promotion rules, certain expressions are computed in a wider integer type and therefore produce no overflow findings, which can give the false impression that signed overflow is no longer undefined.

Suppose that the code is compiled on a platform where short is at most half the size of int. Very commonly the sizes are 2 and 4 bytes, so this assumption holds on many platforms.

In this case, narrow signed integer types such as short or signed char are first implicitly promoted to int before the arithmetic operation is executed. Because of this promotion, the actual operation does not overflow and is thus well-defined. After the operation, an implicit cast is performed to the narrower type. This cast is well-defined in C++20 as wrapping around.

Consider the following snippet:

    static_assert(sizeof(short) == 2);
    static_assert(sizeof(int) == 4);
    short a = 0x1000;
    short b = 0x1001;
    short c = a*b;

C++20 defines c as 0x1000. The reason is that a*b is implicitly promoted to static_cast<int>(a)*static_cast<int>(b). After the promotion, the multiplication does not overflow and yields a well-defined 0x1001000. This number is then implicitly cast to 0x1000 which is also a well-defined operation.

An analogous effect can be observed for addition and subtraction of narrow signed integers. Another effect is that it is well-defined to shift by up to as many bits as int has even if the shifted integer has fewer bits.
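The promotion behavior described above can be sketched as a compile-time check. The function names are hypothetical, and the fragment assumes a 32-bit int as in the snippet above, using `std::int16_t` to make the 16-bit width explicit. Before C++20 the result of the narrowing conversion is implementation-defined, although mainstream compilers wrap in the same way.

```cpp
#include <cstdint>

static_assert(sizeof(int) == 4, "sketch assumes a 32-bit int");

// Both operands are promoted to int, the arithmetic happens in int without
// overflow, and only the final conversion narrows the result to 16 bits.
constexpr std::int16_t multiply_narrow(std::int16_t a, std::int16_t b) {
    return static_cast<std::int16_t>(a * b);
}

constexpr std::int16_t add_narrow(std::int16_t a, std::int16_t b) {
    return static_cast<std::int16_t>(a + b);
}

// The left operand is promoted to int, so shifting by more than 15 bits
// is fine as long as the result fits into int.
constexpr int shift_narrow(std::int16_t value, int amount) {
    return value << amount;
}

static_assert(multiply_narrow(0x1000, 0x1001) == 0x1000, "0x1001000 narrows to 0x1000");
static_assert(add_narrow(0x7FFF, 1) == -0x8000, "0x8000 narrows to the minimum value");
static_assert(shift_narrow(1, 20) == 0x100000, "shift is computed in int");
```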

DERIVE_FROM_IR

Derive the language version from the IR compilation flags and suppress findings accordingly.

Axivion Suite 7.12.2-public
