CertC-FIO47
Use valid format strings
Required inputs: IR
The formatted output functions (fprintf() and related functions) convert, format, and print their arguments under control of a format string. The C Standard, 7.21.6.1, paragraph 3 [ISO/IEC 9899:2011], specifies:
The format shall be a multibyte character sequence, beginning and ending in its initial shift state. The format is composed of zero or more directives: ordinary multibyte characters (not %), which are copied unchanged to the output stream; and conversion specifications, each of which results in fetching zero or more subsequent arguments, converting them, if applicable, according to the corresponding conversion specifier, and then writing the result to the output stream.
Each conversion specification is introduced by the % character followed (in order) by
- Zero or more flags (in any order), which modify the meaning of the conversion specification
- An optional minimum field width
- An optional precision that gives the minimum number of digits, the maximum number of digits, or the maximum number of bytes, etc. depending on the conversion specifier
- An optional length modifier that specifies the size of the argument
- A conversion specifier character that indicates the type of conversion to be applied
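As a minimal sketch of how these parts combine, the following example builds one conversion specification that uses every component. The helper name format_reading and its parameters are hypothetical and chosen only for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper: formats a value with the specification "%+08.3Lf".
 *   +   flag: always print a sign
 *   0   flag: pad the field with leading zeros
 *   8   minimum field width
 *   .3  precision: three digits after the decimal point
 *   L   length modifier: the argument is a long double
 *   f   conversion specifier: fixed-point notation
 * Returns the value returned by snprintf().
 */
int format_reading(char *buf, size_t size, long double value) {
  return snprintf(buf, size, "%+08.3Lf", value);
}
```

In the default "C" locale, calling format_reading() with the value 3.14159L stores +003.142 in the buffer.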
Common mistakes in creating format strings include
- Providing an incorrect number of arguments for the format string
- Using invalid conversion specifiers
- Using a flag character that is incompatible with the conversion specifier
- Using a length modifier that is incompatible with the conversion specifier
- Mismatching the argument type and conversion specifier
- Using an argument of type other than int for width or precision
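The last mistake in this list often occurs when a length stored in a size_t is passed for a * width or precision. The following sketch shows one way to avoid it, assuming that rejecting lengths above INT_MAX is acceptable for the caller. The helper name format_prefix is hypothetical.

```c
#include <limits.h>
#include <stdio.h>

/*
 * Hypothetical helper: copies at most len bytes of data into out.
 * The '*' precision consumes an argument of type int, so the size_t
 * length is range-checked and then converted explicitly.
 * Returns the snprintf() result, or -1 if len does not fit in an int.
 */
int format_prefix(char *out, size_t out_size, const char *data, size_t len) {
  if (len > INT_MAX) {
    return -1;
  }
  return snprintf(out, out_size, "%.*s", (int)len, data);
}
```

Passing len directly, without the cast, would supply a size_t where an int is expected, which is undefined behavior.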
The following table summarizes the compliance of various conversion specifications. The first column contains one or more conversion specifier characters. The next four columns consider the combination of the specifier characters with the various flags (the apostrophe ['], -, +, the space character, #, and 0). The next eight columns consider the combination of the specifier characters with the various length modifiers (h, hh, l, ll, j, z, t, and L).
Valid combinations are marked with a type name; arguments matched with the conversion specification are interpreted as that type. For example, an argument matched with the specifier %hd is interpreted as a short, so short appears in the cell where d and h intersect. The last column denotes the expected types of arguments matched with the original specifier characters.
Valid and meaningful combinations are marked by the (✓) symbol (save for the length modifier columns, as described previously). Valid combinations that have no effect are labeled N/E. Using a combination marked by the (×) symbol, using a specification not represented in the table, or using an argument of an unexpected type is undefined behavior. (See undefined behaviors 153, 155, 157, 158, 161, and 162.)
| Conversion Specifier Character | ' XSI | - + SPACE | # | 0 | h | hh | l | ll | j | z | t | L | Argument Type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| d, i | (✓) | (✓) | (×) | (✓) | short | signed char | long | long long | intmax_t | size_t | ptrdiff_t | (×) | Signed integer |
| o | (×) | (✓) | (✓) | (✓) | unsigned short | unsigned char | unsigned long | unsigned long long | uintmax_t | size_t | ptrdiff_t | (×) | Unsigned integer |
| u | (✓) | (✓) | (×) | (✓) | unsigned short | unsigned char | unsigned long | unsigned long long | uintmax_t | size_t | ptrdiff_t | (×) | Unsigned integer |
| x, X | (×) | (✓) | (✓) | (✓) | unsigned short | unsigned char | unsigned long | unsigned long long | uintmax_t | size_t | ptrdiff_t | (×) | Unsigned integer |
| f, F | (✓) | (✓) | (✓) | (✓) | (×) | (×) | N/E | N/E | (×) | (×) | (×) | long double | double or long double |
| e, E | (×) | (✓) | (✓) | (✓) | (×) | (×) | N/E | N/E | (×) | (×) | (×) | long double | double or long double |
| g, G | (✓) | (✓) | (✓) | (✓) | (×) | (×) | N/E | N/E | (×) | (×) | (×) | long double | double or long double |
| a, A | (✓) | (✓) | (✓) | (✓) | (×) | (×) | N/E | N/E | (×) | (×) | (×) | long double | double or long double |
| c | (×) | (✓) | (×) | (×) | (×) | (×) | wint_t | (×) | (×) | (×) | (×) | (×) | int or wint_t |
| s | (×) | (✓) | (×) | (×) | (×) | (×) | NTWS | (×) | (×) | (×) | (×) | (×) | NTBS or NTWS |
| p | (×) | (✓) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | void* |
| n | (×) | (×) | (×) | (×) | short* | char* | long* | long long* | intmax_t* | size_t* | ptrdiff_t* | (×) | Pointer to integer |
| C XSI | (×) | (✓) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | wint_t |
| S XSI | (×) | (✓) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | NTWS |
| % | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | None |
SPACE: The space (" ") character
N/E: No effect
NTBS: char* argument pointing to a null-terminated character string
NTWS: wchar_t* argument pointing to a null-terminated wide character string
XSI: ISO/IEC 9945-2003 XSI extension
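To illustrate the length modifier columns, the following sketch pairs each argument with the modifier that matches its type. The helper name format_sizes and its parameter names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: each length modifier matches its argument type.
 *   %zu   -> size_t
 *   %td   -> ptrdiff_t
 *   %jd   -> intmax_t
 *   %.2Lf -> long double
 * Returns the value returned by snprintf().
 */
int format_sizes(char *buf, size_t size, size_t count, ptrdiff_t offset,
                 intmax_t total, long double ratio) {
  return snprintf(buf, size, "%zu %td %jd %.2Lf", count, offset, total, ratio);
}
```

Printing the same arguments with plain %d or %f would fall under the mismatches that the table marks as undefined behavior whenever the types differ from int and double.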
The formatted input functions (fscanf() and related functions) use similarly specified format strings and impose similar restrictions on their format strings and arguments.
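For the input functions, the arguments are pointers, and the length modifier selects the pointed-to type. A minimal sketch, using a hypothetical helper parse_record and a hypothetical record layout, might look like this:

```c
#include <stdio.h>

/*
 * Hypothetical helper: parses "<id> <weight> <name>" from a line.
 *   %hd  -> short*
 *   %lf  -> double*  (in fscanf(), plain %f expects a float*)
 *   %15s -> char array with room for 15 characters plus the terminator
 * Returns the number of items assigned, as reported by sscanf().
 */
int parse_record(const char *line, short *id, double *weight, char name[16]) {
  return sscanf(line, "%hd %lf %15s", id, weight, name);
}
```

Note the difference from the output functions: the l modifier has no effect on %f in fprintf(), but in fscanf() it distinguishes double* from float*.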
Do not supply an unknown or invalid conversion specification or an invalid combination of flag character, precision, length modifier, or conversion specifier to a formatted IO function. Likewise, do not provide a number or type of argument that does not match the argument type of the conversion specifier used in the format string.
Format strings are usually string literals specified at the call site, but they need not be. However, they should not contain tainted values. (See FIO30-C. Exclude user input from format strings for more information.)
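The following sketch shows a format string that is not a literal at the call site but is still chosen from a fixed set of literals, with the externally supplied text passed only as an argument. The helper name format_greeting and its parameters are hypothetical.

```c
#include <stdio.h>

/*
 * Hypothetical helper: the format string is selected at run time from
 * two fixed literals. The user-controlled text is only ever an argument
 * matched with %s, so conversion specifications inside it are copied
 * verbatim and never interpreted.
 * Returns the value returned by snprintf().
 */
int format_greeting(char *buf, size_t size, int formal, const char *user_name) {
  const char *fmt = formal ? "Dear %s," : "Hi %s!";
  return snprintf(buf, size, fmt, user_name);
}
```

Because both candidate formats contain exactly one %s, the single char* argument matches whichever one is selected.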
Noncompliant Code Example
Mismatches between arguments and conversion specifications may result in undefined behavior. Compilers may diagnose type mismatches in formatted output function invocations. In this noncompliant code example, the error_type argument to printf() is incorrectly matched with the s specifier rather than with the d specifier. Likewise, the error_msg argument is incorrectly matched with the d specifier instead of the s specifier. These usages result in undefined behavior. One possible result of this invocation is that printf() will interpret the error_type argument as a pointer and try to read a string from the address that error_type contains, possibly resulting in an access violation.
#include <stdio.h>

void func(void) {
  const char *error_msg = "Resource not available to user.";
  int error_type = 3;
  /* ... */
  printf("Error (type %s): %d\n", error_type, error_msg);
  /* ... */
}
Compliant Solution
This compliant solution ensures that the arguments to the printf() function match their respective conversion specifications:
#include <stdio.h>

void func(void) {
  const char *error_msg = "Resource not available to user.";
  int error_type = 3;
  /* ... */
  printf("Error (type %d): %s\n", error_type, error_msg);
  /* ... */
}
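Compiler diagnostics for such mismatches can also be extended to user-defined wrappers. The following sketch assumes GCC or Clang, whose non-standard format function attribute requests printf-style checking. The macro PRINTF_LIKE and the function format_error are hypothetical names, and on other compilers the macro expands to nothing.

```c
#include <stdarg.h>
#include <stdio.h>

/* Assumption: the format attribute is a GCC/Clang extension, not ISO C. */
#if defined(__GNUC__) || defined(__clang__)
#define PRINTF_LIKE(fmt_idx, arg_idx) \
  __attribute__((format(printf, fmt_idx, arg_idx)))
#else
#define PRINTF_LIKE(fmt_idx, arg_idx)
#endif

/*
 * Hypothetical wrapper: parameter 3 is the format string and the
 * variadic arguments start at parameter 4 (the attribute counts from 1).
 */
int format_error(char *buf, size_t size, const char *fmt, ...)
    PRINTF_LIKE(3, 4);

int format_error(char *buf, size_t size, const char *fmt, ...) {
  va_list args;
  int written;

  va_start(args, fmt);
  written = vsnprintf(buf, size, fmt, args);
  va_end(args);
  return written;
}
```

With the attribute in place, a call such as format_error(buf, size, "Error (type %s): %d", error_type, error_msg) can be diagnosed at compile time in the same way as a direct printf() call.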
Risk Assessment
Incorrectly specified format strings can result in memory corruption or abnormal program termination.
| Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
|---|---|---|---|---|---|
| FIO47-C | High | Unlikely | Medium | P6 | L2 |
Related Guidelines
| Taxonomy | Taxonomy item | Relationship |
|---|---|---|
| CERT C | FIO00-CPP. Take care when creating format strings | Prior to 2018-01-12: CERT: Unspecified Relationship |
| ISO/IEC TS 17961:2013 | Using invalid format strings [invfmtstr] | Prior to 2018-01-12: CERT: Unspecified Relationship |
| CWE 2.11 | CWE-686, Function Call with Incorrect Argument Type | 2017-06-29: CERT: Partial overlap |
| CWE 2.11 | CWE-685 | 2017-06-29: CERT: Partial overlap |
Bibliography
| [ISO/IEC 9899:2011] | Subclause 7.21.6.1, "The fprintf Function" |
Possible Messages
| Key | Text | Severity | Disabled |
|---|---|---|---|
| arg_type_mismatch | {} expects argument of type ‘{}’, but argument {} has type ‘{}’ | None | False |
| invalid_conversion | Invalid or non-standard conversion specification | None | False |
| matching_arg_expected | {} expects a matching ‘{}’ argument | None | False |
| precision_for_conversion | Precision must not be used with %{} conversion specifier | None | False |
| too_many_args | Too many arguments for format. | None | False |
| unsupported_assignment_suppression | %n does not support assignment suppression | None | False |
| unsupported_field_width | %n does not support field width | None | False |
| unsupported_flags | %n does not support flags | None | False |
| unsupported_flags_modifiers | Cannot use any flags or modifiers with ‘%%’ | None | False |
| unsupported_hash | %{} does not support the ‘#’ flag | None | False |
| unsupported_i_flag | %{} does not support the ‘I’ flag | None | False |
| unsupported_length_modifier | %{} does not support the ‘{}’ length modifier | None | False |
| unsupported_tick | %{} does not support the “’” flag | None | False |
| unsupported_zero | %{} does not support the ‘0’ flag | None | False |
Options
This rule shares the following common options: exclude_in_macros, exclude_messages_in_system_headers, excludes, extend_exclude_to_macro_invocations, includes, justification_checker, languages, post_processing, provider, report_at, severity
The following places define options that affect this rule: Stylechecks, Analysis-GlobalOptions
allow_extra_args
allow_extra_args : bool = False
allow_gnu_extensions
allow_gnu_extensions : bool = True
allow_unknown_specs
allow_unknown_specs : bool = False
functions
functions
A dictionary mapping the names of the functions to check to a triple (function_kind, fmt_param_index, arg_start_index), where function_kind is either printf or scanf, fmt_param_index is the index of the format-string parameter, and arg_start_index is the index of the first variadic argument.
Type: dict[bauhaus.analysis.config.QualifiedName, typing.Tuple[str, int, typing.Optional[int]]]
Default:
{
  '_printf_l': ('printf', 1, 3),
  'fprintf': ('printf', 1, 2),
  'fscanf': ('scanf', 1, 2),
  'printf': ('printf', 0, 1),
  'scanf': ('scanf', 0, 1),
  'snprintf': ('printf', 2, 3),
  'sprintf': ('printf', 1, 2),
  'sscanf': ('scanf', 1, 2),
  'vfprintf': ('printf', 1, None),
  'vfscanf': ('scanf', 1, None),
  'vprintf': ('printf', 0, None),
  'vscanf': ('scanf', 0, None),
  'vsnprintf': ('printf', 2, None),
  'vsprintf': ('printf', 1, None),
  'vsscanf': ('scanf', 1, None)
}
use_static_semantic_analysis
use_static_semantic_analysis : bool = True
This rule does not require StaticSemanticAnalysis to be enabled, but will produce less accurate results if it is not.