CertC-STR37¶
Arguments to character-handling functions must be representable as an unsigned char
Required inputs: IR
According to the C Standard, 7.4 [ISO/IEC 9899:2011],
The header <ctype.h> declares several functions useful for classifying and mapping characters. In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.
See also undefined behavior 113.
This rule is applicable only to code that runs on platforms where the char data type is defined to have the same range, representation, and behavior as signed char.
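Whether a given platform falls under this condition can be checked from <limits.h>. The following is a minimal sketch, not part of the rule itself; the helper name plain_char_is_signed is hypothetical and chosen only for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not defined by the rule or the C Standard):
 * reports whether plain char behaves like signed char here. */
static bool plain_char_is_signed(void) {
  /* CHAR_MIN is 0 when plain char is unsigned and equals SCHAR_MIN
   * (a negative value) when plain char is signed. */
  return CHAR_MIN < 0;
}
```

Because the signedness of plain char can also depend on compiler options, portable code is usually better served by applying the cast unconditionally than by relying on such a check.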
This rule addresses the following character classification and mapping functions:
isalnum()
isalpha()
isascii() (XSI)
isblank()
iscntrl()
isdigit()
isgraph()
islower()
isprint()
ispunct()
isspace()
isupper()
isxdigit()
toascii() (XSI)
toupper()
tolower()
XSI denotes an X/Open System Interfaces Extension to ISO/IEC 9945-POSIX. These functions are not defined by the C Standard.
This rule is a specific instance of STR34-C. Cast characters to unsigned char before converting to larger integer sizes.
Noncompliant Code Example
On implementations where plain char is signed, this code example is noncompliant because the argument to isspace(), *t, has type const char, and its value might not be representable as an unsigned char:
#include <ctype.h>
#include <string.h>

size_t count_preceding_whitespace(const char *s) {
  const char *t = s;
  size_t length = strlen(s) + 1;
  while (isspace(*t) && (t - s < length)) {
    ++t;
  }
  return t - s;
}
The argument to isspace() must be EOF or representable as an unsigned char; otherwise, the behavior is undefined.
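The requirement above can be written out as a range check. The sketch below is only an illustration; the helper names is_valid_ctype_argument, promoted_without_cast, and promoted_with_cast are hypothetical and do not come from the C Standard or from this rule.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: true if value is acceptable as an argument to a
 * <ctype.h> function, that is, EOF or within 0..UCHAR_MAX. */
static bool is_valid_ctype_argument(int value) {
  return value == EOF || (value >= 0 && value <= UCHAR_MAX);
}

/* Hypothetical helper: the int a <ctype.h> function receives when a
 * plain char is passed directly. Negative if char is signed and the
 * stored value is negative. */
static int promoted_without_cast(char c) {
  return c;
}

/* Hypothetical helper: the int a <ctype.h> function receives when the
 * char is first cast to unsigned char. Always within 0..UCHAR_MAX. */
static int promoted_with_cast(char c) {
  return (unsigned char)c;
}
```

On a typical implementation with a signed 8-bit char, a byte such as 0xE9 is promoted to a negative int when passed without a cast, so it fails this check, whereas the cast version always passes it.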
Compliant Solution
This compliant solution casts the character to unsigned char before passing it as an argument to the isspace() function:
#include <ctype.h>
#include <string.h>

size_t count_preceding_whitespace(const char *s) {
  const char *t = s;
  size_t length = strlen(s) + 1;
  while (isspace((unsigned char)*t) && (t - s < length)) {
    ++t;
  }
  return t - s;
}
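One way to avoid repeating the cast at every call site is to route calls through small wrappers. This is a sketch of that idea, not something the rule prescribes; the names safe_isspace and safe_toupper are hypothetical.

```c
#include <ctype.h>

/* Hypothetical wrapper: takes a plain char and applies the cast to
 * unsigned char in exactly one place before calling isspace(). */
static int safe_isspace(char c) {
  return isspace((unsigned char)c);
}

/* Hypothetical wrapper: same idea for a mapping function. The result
 * of toupper() is converted back to char for the caller. */
static char safe_toupper(char c) {
  return (char)toupper((unsigned char)c);
}
```

Note that such wrappers must not be used for values that may be EOF (for example, the int returned by getchar()), because converting EOF to char loses the distinction between EOF and a valid character.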
Risk Assessment
Passing values that cannot be represented as an unsigned char to character-handling functions is undefined behavior.
| Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
|---|---|---|---|---|---|
| STR37-C | Low | Unlikely | Low | P3 | L3 |
Related Guidelines
| Taxonomy | Taxonomy item | Relationship |
|---|---|---|
| CERT C Secure Coding Standard | STR34-C. Cast characters to unsigned char before converting to larger integer sizes | Prior to 2018-01-12: CERT: Unspecified Relationship |
| ISO/IEC TS 17961 | Passing arguments to character-handling functions that are not representable as unsigned char [chrsgnext] | Prior to 2018-01-12: CERT: Unspecified Relationship |
| CWE 2.11 | CWE-704, Incorrect Type Conversion or Cast | 2017-06-14: CERT: Rule subset of CWE |
Bibliography
| [ISO/IEC 9899:2011] | 7.4, "Character Handling <ctype.h>" |
| [Kettlewell 2002] | Section 1.1, "<ctype.h> and Characters Types" |
Possible Messages
| Key | Text | Severity | Disabled |
|---|---|---|---|
| cast_from_char_to_larger_type | Arguments to character-handling functions must be representable as an unsigned char | None | False |
Options¶
This rule shares the following common options: exclude_in_macros, exclude_messages_in_system_headers, excludes, extend_exclude_to_macro_invocations, includes, justification_checker, languages, post_processing, provider, report_at, severity
The following places define options that affect this rule: Stylechecks, Analysis-GlobalOptions
ignored_typedefs¶
ignored_typedefs : set[str] = set()
only_arguments_of¶
only_arguments_of
Can be used to provide a set of function/macro names; only arguments to them are then considered.
Type: set[str]
Default:
{'isalnum', 'isalpha', 'isascii', 'isblank', 'iscntrl', 'isdigit', 'isgraph', 'islower', 'isprint', 'ispunct', 'isspace', 'isupper', 'isxdigit', 'toascii', 'tolower', 'toupper'}
show_operand_in_entity¶
show_operand_in_entity : bool = False
type_system¶
type_system : bauhaus.ir.common.types.type_systems.TypeSystem = <bauhaus.ir.common.types.type_systems.CompilerTypeSystem object at 0x7f6f1c5fd510>