CertC-FIO47

Use valid format strings

Required inputs: IR

The formatted output functions (fprintf() and related functions) convert, format, and print their arguments under control of a format string. The C Standard, 7.21.6.1, paragraph 3 [ISO/IEC 9899:2011], specifies:

The format shall be a multibyte character sequence, beginning and ending in its initial shift state. The format is composed of zero or more directives: ordinary multibyte characters (not %), which are copied unchanged to the output stream; and conversion specifications, each of which results in fetching zero or more subsequent arguments, converting them, if applicable, according to the corresponding conversion specifier, and then writing the result to the output stream.

Each conversion specification is introduced by the % character followed (in order) by

  • Zero or more flags (in any order), which modify the meaning of the conversion specification
  • An optional minimum field width
  • An optional precision that gives the minimum number of digits, the maximum number of digits, or the maximum number of bytes, etc. depending on the conversion specifier
  • An optional length modifier that specifies the size of the argument
  • A conversion specifier character that indicates the type of conversion to be applied
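As an illustration of the parts listed above, the following sketch formats two values with fully specified conversion specifications. The helper name format_demo and the chosen layout are assumptions made for this example only.

```c
#include <stdio.h>

/*
 * Hypothetical helper: writes two formatted fields into out.
 *
 *   %+08.3f : flags '+' and '0', width 8, precision 3, specifier f
 *   %-6ld   : flag '-', width 6, length modifier l, specifier d
 */
int format_demo(char *out, size_t out_size, double ratio, long count) {
  return snprintf(out, out_size, "[%+08.3f] [%-6ld]", ratio, count);
}
```

Calling format_demo(buf, sizeof buf, 3.14159, 42L) writes [+003.142] [42    ] into buf: the first field is zero-padded to eight characters with an explicit sign, and the second is left-justified in six characters.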

Common mistakes in creating format strings include

  • Providing an incorrect number of arguments for the format string
  • Using invalid conversion specifiers
  • Using a flag character that is incompatible with the conversion specifier
  • Using a length modifier that is incompatible with the conversion specifier
  • Mismatching the argument type and conversion specifier
  • Using an argument of type other than int for width or precision
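The last mistake in the list is easy to make when a length is held in a size_t. The following is a minimal sketch of one way to handle it; the helper name format_prefix and its error convention (returning -1) are assumptions for illustration.

```c
#include <limits.h>
#include <stdio.h>

/*
 * Hypothetical helper: copies at most max_chars characters of text into out.
 * The '*' precision consumes an argument of type int, so the size_t value is
 * range-checked and converted explicitly before it is passed.
 */
int format_prefix(char *out, size_t out_size, const char *text,
                  size_t max_chars) {
  if (max_chars > INT_MAX) {
    return -1; /* cannot be represented as an int precision */
  }
  return snprintf(out, out_size, "%.*s", (int)max_chars, text);
}
```

Passing max_chars directly, without the conversion to int, would supply an argument of the wrong type for the asterisk.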

The following table summarizes the compliance of various conversion specifications. The first column contains one or more conversion specifier characters. The next four columns consider the combination of the specifier characters with the various flags (the apostrophe ['], -, +, the space character, #, and 0). The next eight columns consider the combination of the specifier characters with the various length modifiers (h, hh, l, ll, j, z, t, and L).

Valid combinations are marked with a type name; arguments matched with the conversion specification are interpreted as that type. For example, an argument matched with the specifier %hd is interpreted as a short, so short appears in the cell where d and h intersect. The last column denotes the expected types of arguments matched with the original specifier characters.

Valid and meaningful combinations are marked by the (✓) symbol (save for the length modifier columns, as described previously). Valid combinations that have no effect are labeled N/E. Using a combination marked by the (×) symbol, using a specification not represented in the table, or using an argument of an unexpected type is undefined behavior. (See undefined behaviors 153, 155, 157, 158, 161, and 162.) 

| Conversion Specifier Character | ' XSI | - + SPACE | # | 0 | h | hh | l | ll | j | z | t | L | Argument Type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| d, i | (✓) | (✓) | (×) | (✓) | short | signed char | long | long long | intmax_t | size_t | ptrdiff_t | (×) | Signed integer |
| o | (×) | (✓) | (✓) | (✓) | unsigned short | unsigned char | unsigned long | unsigned long long | uintmax_t | size_t | ptrdiff_t | (×) | Unsigned integer |
| u | (✓) | (✓) | (×) | (✓) | unsigned short | unsigned char | unsigned long | unsigned long long | uintmax_t | size_t | ptrdiff_t | (×) | Unsigned integer |
| x, X | (×) | (✓) | (✓) | (✓) | unsigned short | unsigned char | unsigned long | unsigned long long | uintmax_t | size_t | ptrdiff_t | (×) | Unsigned integer |
| f, F | (✓) | (✓) | (✓) | (✓) | (×) | (×) | N/E | N/E | (×) | (×) | (×) | long double | double or long double |
| e, E | (×) | (✓) | (✓) | (✓) | (×) | (×) | N/E | N/E | (×) | (×) | (×) | long double | double or long double |
| g, G | (✓) | (✓) | (✓) | (✓) | (×) | (×) | N/E | N/E | (×) | (×) | (×) | long double | double or long double |
| a, A | (✓) | (✓) | (✓) | (✓) | (×) | (×) | N/E | N/E | (×) | (×) | (×) | long double | double or long double |
| c | (×) | (✓) | (×) | (×) | (×) | (×) | wint_t | (×) | (×) | (×) | (×) | (×) | int or wint_t |
| s | (×) | (✓) | (×) | (×) | (×) | (×) | NTWS | (×) | (×) | (×) | (×) | (×) | NTBS or NTWS |
| p | (×) | (✓) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | void* |
| n | (×) | (×) | (×) | (×) | short* | char* | long* | long long* | intmax_t* | size_t* | ptrdiff_t* | (×) | Pointer to integer |
| C XSI | (×) | (✓) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | wint_t |
| S XSI | (×) | (✓) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | NTWS |
| % | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | (×) | None |

     SPACE: The space ( " ") character
     N/E: No effect
     NTBS: char* argument pointing to a null-terminated character string
     NTWS: wchar_t* argument pointing to a null-terminated wide character string
     XSI: ISO/IEC 9945-2003 XSI extension
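To make the length-modifier columns concrete, the following sketch pairs several modifiers with arguments of the types the table lists. The helper name format_sizes and its parameter set are assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: each length modifier matches the type of its argument.
 *   %hhu -> unsigned char   %zu -> size_t
 *   %td  -> ptrdiff_t       %jd -> intmax_t
 *   %Lf  -> long double
 */
int format_sizes(char *out, size_t out_size, unsigned char flag, size_t count,
                 ptrdiff_t offset, intmax_t total, long double scale) {
  return snprintf(out, out_size, "%hhu %zu %td %jd %.1Lf", flag, count, offset,
                  total, scale);
}
```

Swapping any of these modifiers, for example printing count with %d or scale with %f, would produce one of the mismatches that the table marks as undefined behavior.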

The formatted input functions ( fscanf() and related functions) use similarly specified format strings and impose similar restrictions on their format strings and arguments.
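For the input side, a minimal sketch follows; the helper name parse_record and the record layout are assumptions for this example. Note that the input functions expect pointers, and that %lf denotes a double* for the scanf family, whereas the l modifier has no effect on f for the printf family.

```c
#include <stdio.h>

/*
 * Hypothetical helper: parses "<short> <int> <double>" from text.
 * Returns the number of fields that sscanf() assigned.
 */
int parse_record(const char *text, short *id, int *count, double *weight) {
  /* %hd needs short *, %d needs int *, %lf needs double * */
  return sscanf(text, "%hd %d %lf", id, count, weight);
}
```

Callers should compare the return value with the number of expected fields (3 here) before using the outputs.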

Do not supply an unknown or invalid conversion specification or an invalid combination of flag character, precision, length modifier, or conversion specifier to a formatted IO function. Likewise, do not provide a number or type of argument that does not match the argument type of the conversion specifier used in the format string.
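One common source of type mismatches is a fixed-width integer type whose underlying type differs between platforms. The following sketch uses the standard <inttypes.h> macros to keep the specifier and the argument in agreement; the helper name format_fixed is an assumption for illustration.

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Hypothetical helper: prints fixed-width integers with the <inttypes.h>
 * macros, which expand to the length modifier and specifier that match the
 * underlying type on the current platform.
 */
int format_fixed(char *out, size_t out_size, int64_t offset, uint32_t id) {
  return snprintf(out, out_size, "offset=%" PRId64 " id=%" PRIu32, offset, id);
}
```

Writing %ld or %lld by hand for an int64_t is correct only on platforms where int64_t happens to be long or long long, respectively.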

Format strings are usually string literals specified at the call site, but they need not be. However, they should not contain tainted values. (See FIO30-C. Exclude user input from format strings for more information.)
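A minimal sketch of keeping untrusted text out of the format string is shown here; the helper name format_user_text is an assumption for this example.

```c
#include <stdio.h>

/*
 * Hypothetical helper: formats untrusted text into out.
 * The format string is a fixed literal; user_text is only ever an argument,
 * so any '%' characters it contains are copied verbatim.
 */
int format_user_text(char *out, size_t out_size, const char *user_text) {
  return snprintf(out, out_size, "%s", user_text);
}
```

With this arrangement, input such as 100%n done is copied unchanged instead of being interpreted as a conversion specification.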

Noncompliant Code Example

Mismatches between arguments and conversion specifications may result in undefined behavior. Compilers may diagnose type mismatches in formatted output function invocations. In this noncompliant code example, the error_type argument to printf() is incorrectly matched with the s specifier rather than with the d specifier. Likewise, the error_msg argument is incorrectly matched with the d specifier instead of the s specifier. These usages result in undefined behavior. One possible result of this invocation is that printf() will interpret the error_type argument as a pointer and try to read a string from the address that error_type contains, possibly resulting in an access violation.

#include <stdio.h>
 
void func(void) {
  const char *error_msg = "Resource not available to user.";
  int error_type = 3;
  /* ... */
  printf("Error (type %s): %d\n", error_type, error_msg);
  /* ... */
}

Compliant Solution

This compliant solution ensures that the arguments to the printf() function match their respective conversion specifications:

#include <stdio.h>
 
void func(void) {
  const char *error_msg = "Resource not available to user.";
  int error_type = 3;
  /* ... */
  printf("Error (type %d): %s\n", error_type, error_msg);

  /* ... */
}

Risk Assessment

Incorrectly specified format strings can result in memory corruption or abnormal program termination.

| Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
|---|---|---|---|---|---|
| FIO47-C | High | Unlikely | Medium | P6 | L2 |

Related Guidelines

| Taxonomy | Taxonomy item | Relationship |
|---|---|---|
| CERT C | FIO00-CPP. Take care when creating format strings | Prior to 2018-01-12: CERT: Unspecified Relationship |
| ISO/IEC TS 17961:2013 | Using invalid format strings [invfmtstr] | Prior to 2018-01-12: CERT: Unspecified Relationship |
| CWE 2.11 | CWE-686, Function Call with Incorrect Argument Type | 2017-06-29: CERT: Partial overlap |
| CWE 2.11 | CWE-685 | 2017-06-29: CERT: Partial overlap |

Bibliography

[ISO/IEC 9899:2011] Subclause 7.21.6.1, "The fprintf Function"
Excerpt from SEI CERT C Coding Standard: Rules for Developing Safe, Reliable, and Secure Systems (2016 Edition) and SEI CERT C Coding Standard [https://cmu-sei.github.io/secure-coding-standards/sei-cert-c-coding-standard/rules/input-output-fio/fio47-c], Copyright (C) 1995-2026 Carnegie Mellon University. See section 9.4. "3rd-Party Licenses" in the documentation for full details.

Possible Messages

| Key | Text | Severity | Disabled |
|---|---|---|---|
| arg_type_mismatch | {} expects argument of type ‘{}’, but argument {} has type ‘{}’ | None | False |
| invalid_conversion | Invalid or non-standard conversion specification | None | False |
| matching_arg_expected | {} expects a matching ‘{}’ argument | None | False |
| precision_for_conversion | Precision must not be used with %{} conversion specifier | None | False |
| too_many_args | Too many arguments for format. | None | False |
| unsupported_assignment_suppression | %n does not support assignment suppression | None | False |
| unsupported_field_width | %n does not support field width | None | False |
| unsupported_flags | %n does not support flags | None | False |
| unsupported_flags_modifiers | Cannot use any flags or modifiers with ‘%%’ | None | False |
| unsupported_hash | %{} does not support the ‘#’ flag | None | False |
| unsupported_i_flag | %{} does not support the ‘I’ flag | None | False |
| unsupported_length_modifier | %{} does not support the ‘{}’ length modifier | None | False |
| unsupported_tick | %{} does not support the “’” flag | None | False |
| unsupported_zero | %{} does not support the ‘0’ flag | None | False |

Options

allow_extra_args

allow_extra_args : bool = False

Whether to allow additional arguments that are not used by the format string.
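One situation in which unused arguments can arise deliberately is a format that is selected at run time. The following is a sketch only: the helper name format_status is hypothetical, and whether the rule reports this particular call depends on whether the analysis can resolve the value of fmt, which is not stated here.

```c
#include <stdio.h>

/*
 * Hypothetical helper: selects one of two formats at run time.
 * The short format consumes only code; detail is then an excess argument,
 * which the C Standard says is evaluated but otherwise ignored.
 */
int format_status(char *out, size_t out_size, int verbose, int code,
                  const char *detail) {
  const char *fmt = verbose ? "%d (%s)" : "%d";
  return snprintf(out, out_size, fmt, code, detail);
}
```

With verbose set to 0, the detail argument is passed but never consumed, which is the pattern this option is meant to permit.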
 

allow_gnu_extensions

allow_gnu_extensions : bool = True

Whether to allow the GNU extensions to format specifications.
 

allow_unknown_specs

allow_unknown_specs : bool = False

Whether to allow unknown format specifications. It may be necessary to set this option when using implementation-specific extensions. Arguments are not checked when the format string contains unknown format specifications.
 

functions

functions

Type: dict[bauhaus.analysis.config.QualifiedName, typing.Tuple[str, int, typing.Optional[int]]]

Default:

{
   '_printf_l': ('printf', 1, 3),
   'fprintf': ('printf', 1, 2),
   'fscanf': ('scanf', 1, 2),
   'printf': ('printf', 0, 1),
   'scanf': ('scanf', 0, 1),
   'snprintf': ('printf', 2, 3),
   'sprintf': ('printf', 1, 2),
   'sscanf': ('scanf', 1, 2),
   'vfprintf': ('printf', 1, None),
   'vfscanf': ('scanf', 1, None),
   'vprintf': ('printf', 0, None),
   'vscanf': ('scanf', 0, None),
   'vsnprintf': ('printf', 2, None),
   'vsprintf': ('printf', 1, None),
   'vsscanf': ('scanf', 1, None)
}
A dictionary that maps the name of each function to check to a triple (function_kind, fmt_param_index, arg_start_index), where function_kind is either printf or scanf, fmt_param_index is the index of the format-string parameter, and arg_start_index is the index of the first variadic argument. In the defaults, arg_start_index is None for the v* functions, which take a va_list instead of variadic arguments.
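A project-specific wrapper is the typical candidate for an additional entry. The sketch below defines a hypothetical wrapper, log_to_buffer, with the same parameter layout as snprintf. Judging by the defaults, where printf is registered as ('printf', 0, 1), the indices appear to be zero-based, so the corresponding triple would presumably be ('printf', 2, 3); this is an inference from the defaults, not a documented statement. The GCC/Clang format attribute shown alongside counts parameters from 1 instead.

```c
#include <stdarg.h>
#include <stdio.h>

#if defined(__GNUC__)
/* GCC/Clang count from 1: the format is parameter 3, arguments start at 4. */
#define LOG_PRINTF_CHECK __attribute__((format(printf, 3, 4)))
#else
#define LOG_PRINTF_CHECK
#endif

/* Hypothetical wrapper with the same parameter layout as snprintf(). */
int log_to_buffer(char *out, size_t out_size, const char *fmt, ...)
    LOG_PRINTF_CHECK;

int log_to_buffer(char *out, size_t out_size, const char *fmt, ...) {
  va_list args;
  int written;

  va_start(args, fmt);
  written = vsnprintf(out, out_size, fmt, args); /* forwards the va_list */
  va_end(args);
  return written;
}
```

Registering such a wrapper lets calls to it be checked in the same way as calls to the standard functions, and the compiler attribute gives a second, independent check at build time.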
 

use_static_semantic_analysis

use_static_semantic_analysis : bool = True

Whether the rule should use the results of StaticSemanticAnalysis to check buffer sizes. This can change the findings, because the sizes of buffer arguments are computed more accurately. Setting this option does not force StaticSemanticAnalysis to be enabled; if it is not enabled, the results are less accurate.