CertC++-STR34
Cast characters to unsigned char before converting to larger integer sizes
Required inputs: IR
Signed character data must be converted to unsigned char before being assigned or converted to a larger signed type. This rule applies to both signed char and (plain) char characters on implementations where char is defined to have the same range, representation, and behaviors as signed char.
However, this rule is applicable only in cases where the character data may contain values that can be interpreted as negative numbers. For example, if the char type is represented by a two's complement 8-bit value, any character value greater than +127 is interpreted as a negative value.
This rule is a generalization of STR37-C (Arguments to character-handling functions must be representable as an unsigned char).
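The sign extension described above can be illustrated with a minimal sketch. The helper names widen_directly and widen_via_unsigned_char are hypothetical and not part of the rule; the parameter is declared as signed char so that the behavior does not depend on whether plain char is signed on a given implementation.

```c
#include <limits.h>

/* Hypothetical helper: converts straight to int, so a negative
   character value stays negative after sign extension. */
static int widen_directly(signed char ch) {
  return ch;
}

/* Hypothetical helper: converts to unsigned char first, so the
   result is always in the range 0..UCHAR_MAX. */
static int widen_via_unsigned_char(signed char ch) {
  return (unsigned char)ch;
}
```

For a character whose value is -1, the first helper yields -1 while the second yields UCHAR_MAX. For characters in the basic character set, such as 'A', both helpers yield the same value.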
Noncompliant Code Example
This noncompliant code example is taken from a vulnerability in bash versions 1.14.6 and earlier that led to the release of CERT Advisory CA-1996-22. This vulnerability resulted from the sign extension of character data referenced by the c_str pointer in the yy_string_get() function in the parse.y module of the bash source code:
static int yy_string_get(void) {
  register char *c_str;
  register int c;

  c_str = bash_input.location.string;
  c = EOF;

  /* If the string doesn't exist or is empty, EOF found */
  if (c_str && *c_str) {
    c = *c_str++;
    bash_input.location.string = c_str;
  }
  return (c);
}
The c_str variable is used to traverse the character string containing the command line to be parsed. As characters are retrieved from this pointer, they are stored in a variable of type int. For implementations in which the char type is defined to have the same range, representation, and behavior as signed char, this value is sign-extended when assigned to the int variable. For character code 255 decimal (-1 in two's complement form), this sign extension results in the value -1 being assigned to the integer, which is indistinguishable from EOF.
Noncompliant Code Example
This problem can be repaired by explicitly declaring the c_str variable as unsigned char:
static int yy_string_get(void) {
  register unsigned char *c_str;
  register int c;

  c_str = bash_input.location.string;
  c = EOF;

  /* If the string doesn't exist or is empty, EOF found */
  if (c_str && *c_str) {
    c = *c_str++;
    bash_input.location.string = c_str;
  }
  return (c);
}
This example, however, violates STR04-C (Use plain char for characters in the basic character set).
Compliant Solution
In this compliant solution, the result of the expression *c_str++ is cast to unsigned char before assignment to the int variable c:
static int yy_string_get(void) {
  register char *c_str;
  register int c;

  c_str = bash_input.location.string;
  c = EOF;

  /* If the string doesn't exist or is empty, EOF found */
  if (c_str && *c_str) {
    /* Cast to unsigned type */
    c = (unsigned char)*c_str++;
    bash_input.location.string = c_str;
  }
  return (c);
}
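Because yy_string_get() depends on bash internals, the same technique can be tried out in isolation with a self-contained sketch. The string_input struct and the string_input_get function are hypothetical stand-ins for bash_input.location.string and yy_string_get(); they are not taken from the bash sources.

```c
#include <stdio.h>  /* EOF */

/* Hypothetical stand-in for bash_input.location.string. */
struct string_input {
  const char *cursor;
};

/* Returns the next character as a non-negative int, or EOF when the
   string is missing or exhausted. The cast through unsigned char keeps
   a byte such as 0xFF from being sign-extended to a negative value. */
static int string_input_get(struct string_input *in) {
  int c = EOF;

  if (in->cursor && *in->cursor) {
    c = (unsigned char)*in->cursor++;
  }
  return c;
}
```

With an input string that starts with the byte 0xFF, the first call returns 255 on an implementation with 8-bit characters, so the caller can still tell that byte apart from the EOF returned once the string is exhausted.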
Noncompliant Code Example
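In this noncompliant code example, the cast of *s to unsigned int can result in a value in excess of UCHAR_MAX because of integer promotions: where char is signed, a negative character value is sign-extended and then converted to a very large unsigned int. Using that value as an index into table is a violation of ARR30-C (Do not form or use out-of-bounds pointers or array subscripts):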
#include <limits.h>
#include <stddef.h>

static const char table[UCHAR_MAX + 1] = { 'a' /* ... */ };

ptrdiff_t first_not_in_table(const char *c_str) {
  for (const char *s = c_str; *s; ++s) {
    if (table[(unsigned int)*s] != *s) {
      return s - c_str;
    }
  }
  return -1;
}
Compliant Solution
This compliant solution casts the value of type char to unsigned char before the implicit promotion to a larger type:
#include <limits.h>
#include <stddef.h>

static const char table[UCHAR_MAX + 1] = { 'a' /* ... */ };

ptrdiff_t first_not_in_table(const char *c_str) {
  for (const char *s = c_str; *s; ++s) {
    if (table[(unsigned char)*s] != *s) {
      return s - c_str;
    }
  }
  return -1;
}
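The difference between the two casts can be made concrete by looking at the index each one produces. The following is a minimal sketch with hypothetical helper names; the parameter is declared as signed char so that the result does not depend on the signedness of plain char.

```c
#include <limits.h>

/* Hypothetical helper mirroring the noncompliant cast: a negative
   character value is converted modulo UINT_MAX + 1, so the result
   can be far larger than UCHAR_MAX. */
static unsigned int index_via_unsigned_int(signed char ch) {
  return (unsigned int)ch;
}

/* Hypothetical helper mirroring the compliant cast: the result is
   always in the range 0..UCHAR_MAX and therefore a valid index into
   an array of UCHAR_MAX + 1 elements. */
static unsigned int index_via_unsigned_char(signed char ch) {
  return (unsigned char)ch;
}
```

For a character whose value is -1, the first helper yields UINT_MAX, which is far outside the bounds of table, while the second yields UCHAR_MAX, the last valid index.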
Risk Assessment
Conversion of character data resulting in a value in excess of UCHAR_MAX is an often-missed error that can result in a disturbingly broad range of potentially severe vulnerabilities.
| Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
|---|---|---|---|---|---|
| STR34-C | Medium | Probable | Medium | P8 | L2 |
Bibliography
| [xorl 2009] | CVE-2009-0887: Linux-PAM Signedness Issue |
Possible Messages
| Key | Text | Severity | Disabled |
|---|---|---|---|
| cast_from_char_to_larger_type | Cast characters to unsigned char before converting to larger integer sizes | None | False |
Options
This rule shares the following common options: exclude_in_macros, exclude_messages_in_system_headers, excludes, extend_exclude_to_macro_invocations, includes, justification_checker, languages, post_processing, provider, report_at, severity
The following places define options that affect this rule: Stylechecks, Analysis-GlobalOptions
ignored_typedefs
ignored_typedefs : set[str] = set()
only_arguments_of
only_arguments_of : set[str] = set()
show_operand_in_entity
show_operand_in_entity : bool = False
type_system
type_system : bauhaus.ir.common.types.type_systems.TypeSystem = <bauhaus.ir.common.types.type_systems.CompilerTypeSystem object at 0x7f6f1c5fd510>