The C Preprocessor’s Arithmetic World

Last modified by Microchip on 2024/01/17 22:21

The C preprocessor allows you to define macros, each consisting of a macro name and the string of characters that the name represents. Preprocessor macros are fully described in the "C Programming Macros" page. The preprocessor replaces each macro used in the C source with its replacement string. The characters in this string then form part of a larger C expression, which is evaluated by the compiler in the usual way. What is important to note is that the preprocessor sees the replacement string purely as characters. It does not parse the string, it has no concept of types or values, and it does not simplify or adjust the string in any way.

In the following example, even though the replacement string for MAX appears to be a constant expression that could be folded, the preprocessor sees it as the string: open-bracket, one, zero, zero, zero, star, etc.

#define MAX (1000*1000)

if (MAX > 32767)
  position = 0;

During substitution, the characters of the macro string are copied verbatim, resulting in:

if ((1000*1000) > 32767)
   ...
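One way to observe that the replacement string is copied rather than evaluated is to turn it into a string literal with the preprocessor's `#` operator. The following is only an illustrative sketch: the helper macros `STR` and `XSTR` and the function `max_replacement_text` are hypothetical names introduced here, not part of the original example.

```c
#define MAX (1000*1000)

/* Hypothetical helpers: XSTR expands its argument first, then STR
   turns the resulting characters into a string literal. */
#define STR(x)  #x
#define XSTR(x) STR(x)

/* Returns the replacement string of MAX exactly as the preprocessor
   sees it. The multiplication has not been carried out. */
const char *max_replacement_text(void)
{
    return XSTR(MAX);   /* the literal "(1000*1000)", not "1000000" */
}
```

The returned text still contains the brackets and the star, which suggests that no folding took place during substitution.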

Later, long after the preprocessor has finished executing, the compiler parses this code, including the substituted text, assigns types where appropriate, and evaluates any expressions using the usual C rules.

However, there is one instance where the preprocessor does evaluate expressions: when they appear as the controlling expression of #if or #elif conditional directives.

The following example shows the macro MAX now being used in a #if directive to determine if the C variable position should be defined in the code.

#define MAX (1000*1000)

#if (MAX > 32767)
long int position;
#endif

The macro MAX is replaced with its replacement string in exactly the same way, leading to:

#if ((1000*1000) > 32767)
   ...

After substitution, the preprocessor has to evaluate the expression following the #if to see if it is true or false.

Knowing that the evaluation of controlling expressions associated with #if or #elif conditional directives is performed by the preprocessor is important since the results may not be the same as those produced by the compiler for the same expressions.

The preprocessor’s evaluation of the constant tokens in #if or #elif conditional directive controlling expressions is similar to that performed by the compiler, except that signed or unsigned integer types act as if they have the type intmax_t or uintmax_t, respectively, as defined by the compiler’s header <stdint.h>.
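As a rough illustration of this rule, the sketch below asks the preprocessor to add one to `INT_MAX`. It assumes a compiler whose intmax_t is wider than int, which is typical but not guaranteed. The macro `PP_WIDER_THAN_INT` and the function `pp_wider_than_int` are hypothetical names used only for this example.

```c
#include <limits.h>

/* Inside #if, INT_MAX + 1 is evaluated as if it had type intmax_t.
   Assuming intmax_t is wider than int, the sum does not overflow
   and compares greater than INT_MAX. */
#if (INT_MAX + 1) > INT_MAX
#define PP_WIDER_THAN_INT 1
#else
#define PP_WIDER_THAN_INT 0
#endif

/* Exposes the preprocessor's decision to ordinary C code. */
int pp_wider_than_int(void)
{
    return PP_WIDER_THAN_INT;
}
```

The same expression written in ordinary C code, `INT_MAX + 1`, would be evaluated in type int and would overflow.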

To illustrate the importance of this difference, consider the two conditional code sequences taken from the above examples:

if ((1000*1000) > 32767)

and

#if ((1000*1000) > 32767)

If both code sequences were compiled using the MPLAB® XC8 compiler, which has an int size of 16 bits, the multiplication in the if() statement would be performed as a 16-bit operation by the compiler. This multiplication will overflow, producing the result 16960 (that is, 1000000 modulo 65536), which is less than 32767. The controlling expression in this conditional is false.

In the #if directive, the constants will appear to have an intmax_t type, which has a 32-bit representation on the MPLAB XC8 compiler. This means that the preprocessor multiplication will be performed as a 32-bit operation and will not overflow. The result will be 1000000, which is larger than 32767. In this case, the conditional controlling expression is true, the opposite of the result obtained by the if() statement.
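The two results can be reproduced on a desktop compiler by forcing the width of the multiplication with fixed-width types. This is only a simulation sketch, not MPLAB XC8 output. The function names `mul_as_16bit` and `mul_as_32bit` are hypothetical, and the 16-bit case models the wraparound described above using unsigned arithmetic, because signed overflow itself is undefined behavior in C.

```c
#include <stdint.h>

/* Models a multiplication on a target with a 16-bit int: the product
   is reduced modulo 65536 and then viewed as a signed 16-bit value. */
int16_t mul_as_16bit(uint16_t a, uint16_t b)
{
    uint16_t wrapped = (uint16_t)((uint32_t)a * b);
    return (int16_t)wrapped;
}

/* Models the same multiplication carried out in a 32-bit type, which
   is how the text describes the preprocessor's arithmetic on XC8. */
int32_t mul_as_32bit(int32_t a, int32_t b)
{
    return a * b;
}
```

With these helpers, `mul_as_16bit(1000, 1000)` gives 16960, which is not greater than 32767, while `mul_as_32bit(1000, 1000)` gives 1000000, which is.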

The int and intmax_t types used by MPLAB XC16 have different sizes to those used by MPLAB XC8 and these are different again in MPLAB XC32, but the preprocessor type rules are the same for all compilers and can equally surprise unsuspecting programmers. If you need consistency between the preprocessor’s and compiler’s arithmetic, you might be able to cast the values used by the C expressions to the relevant intmax_t type.
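A minimal sketch of the casting approach is shown below. The macros `WIDTH`, `HEIGHT`, and `PP_RESULT` and the function `compiler_result` are hypothetical names chosen for illustration. The operands are kept as separate macros so that the cast can be applied before the multiplication takes place.

```c
#include <stdint.h>

#define WIDTH  1000   /* hypothetical operand macros */
#define HEIGHT 1000

/* Preprocessor view: the product is evaluated as if it were intmax_t. */
#if (WIDTH * HEIGHT) > 32767
#define PP_RESULT 1
#else
#define PP_RESULT 0
#endif

/* Compiler view: casting one operand to intmax_t makes the compiler
   perform the multiplication in intmax_t as well, so both views of
   the expression use the same width. */
int compiler_result(void)
{
    return ((intmax_t)WIDTH * HEIGHT) > 32767;
}
```

Note that the cast appears only in the C expression. A cast cannot be used inside the controlling expression of #if, because the preprocessor does not recognize type names. Casting the finished product instead, as in `(intmax_t)(1000*1000)`, would not help, since the multiplication would already have been performed in type int.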
