In the last tutorial, we got a brief overview of what happens behind the scenes of getting an executable from the source code. We learned that there are 5 stages involved in making this possible, out of which preprocessing is the very first stage a program has to go through. In this tutorial, we will learn preprocessing in depth and get to know what goes inside a preprocessed file. So, let’s dive in.
What is Preprocessing?
Preprocessing refers to the process in which the C code (written by us) is converted to the preprocessed code (also called the expanded source code). This process is carried out by a preprocessor which is nothing more than a software program written in some XYZ programming language. In a nutshell, the preprocessor takes the C code as the input, does some work on it, and generates expanded source code as the output.
At this moment, you might have these questions in mind: why does the preprocessor translate source code written in C to expanded source code? What is expanded source code and what goes inside it? Why is it called expanded source code? Let’s address these questions one by one.
The preprocessor got its name from the idea that it does some processing prior to compilation (pre stands for previous or prior; processor refers to something that does some processing).
Why does the preprocessor translate the source code to expanded source code?
Do you remember that in the tutorial on Writing Our First C Program, I mentioned that the line #include <stdio.h> is a command to the preprocessor? The preprocessor is responsible for including the contents of the header file stdio.h. Without the contents of this header file, our program may not be able to use the functions declared in it. Just like stdio.h, we can ask the preprocessor to include any other header file, such as those that come with the language. Not only this, it has other responsibilities too.
Preprocessor can
include the contents of the header file.
replace macros with their actual values.
include a specific code in our source file based on some condition.
remove comments
prevent multiple inclusions of a specific code.
Preprocessor can include the contents of the header file
As mentioned, we can ask the preprocessor to include the contents of a header file so that we can use the functions declared in that header file.
In reality, a header file contains more than just function declarations. We do not have to worry about those things yet. As we proceed, we will learn about the different elements of the header file.
As an example, consider the same hello world program:
The first line is meant for preprocessor to handle. As soon as preprocessor sees this line — #include <stdio.h>, it replaces this line with the actual contents of the header file which includes various function declarations that we may want to use in our program. The declaration of the printf() function is also available in stdio.h. As we are including the appropriate header file, we can use this function wherever we want in our program.
There are two important terms associated with functions: declaration and definition. Declaration refers to the structure of the function i.e. the name of the function, the type of value it returns, and the inputs it accepts. It does not include the instructions which make a function do its job. On the other hand, definition refers to the complete function, i.e. the declaration together with the body of code that makes it possible for the function to perform its task.
Note that I am not showing at this moment what content of stdio.h will be included by the preprocessor. There is a lot of code inside stdio.h; showing it all here would be impractical. However, later in this tutorial, you will learn how to generate the expanded source code file from the source file. You can inspect this file on your own to see what is included by the preprocessor if you are curious.
Preprocessor can replace macros with their actual values
A macro is a name given to a constant or a piece of code. The preprocessor replaces each macro with its definition so that the compiler works on the actual definitions instead of encountering macros directly.
Consider the following program:
#include <stdio.h>
#define PI 3.14159 // Macro PI is defined with value 3.14159
int main() {
printf("Value of PI: %.2f", PI); // Preprocessor will replace PI by 3.14159
return 0;
}
As mentioned through the comments as well, PI is the name given by us to the constant 3.14159. Now, in place of this value, we can use the name associated with it. Inside the main() function, I tried printing the value of PI. As PI is associated with 3.14159, this value will be displayed on the screen. Eventually, our program may look like the following after preprocessor replaces the macro:
#include <stdio.h>
int main() {
printf("Value of PI: %.2f", 3.14159);
return 0;
}
Preprocessor can include a specific code in our source file based on some condition
With the help of directives such as #if, #ifdef, #ifndef, #else, and #endif, the preprocessor can include (or exclude) a specific piece of code based on some condition.
As an example, consider the following program:
#include <stdio.h>
#define TEST
int main()
{
printf("Hello\n");
#ifdef TEST // checking if TEST is defined
printf("Test\n"); // included if TEST is defined
#endif // end of the if block
printf("World.");
return 0;
}
As the name TEST is defined using #define, the statement printf("Test\n"); will be included in the code, and the program will look like the following:
#include <stdio.h>
int main() {
printf("Hello\n");
printf("Test\n");
printf("World.");
return 0;
}
Output:
Hello
Test
World.
We will discuss conditional inclusion by preprocessor later in this course.
Preprocessor can remove comments
Comments are notes written by a programmer for future reference and to make the code more readable. They contribute nothing to the execution of the code, so it makes sense to remove them before compilation, and this is done by the preprocessor.
A comment is added in the code by using // as demonstrated in the following program:
// C Program to print Hi! on the screen
#include <stdio.h>
int main() {
// The following function prints "Hi!"
printf("Hi!");
return 0;
}
The comments added in the program (lines with //) will be removed by the preprocessor, and the code looks like the following:
#include <stdio.h>
int main() {
printf("Hi!");
return 0;
}
I am deliberately skipping the expansion of #include <stdio.h>. It has a lot of code, and including it all here does not make sense.
By the way, we have used comments quite a lot of times in the programs of this tutorial. Can you identify them in the programs we have written so far?
There is another way to add a comment in C. We will discuss that in one of our future tutorials.
Preprocessor can prevent multiple inclusions of the same code
Soon there will be a time when we will work on a project which may contain multiple files. We may choose to define our own header file to hold the code which will be shared by the files of our project. The problem is that the same header can end up being included more than once in a single source file (for example, indirectly through another header), and C does not allow certain things, such as structure definitions, to be defined twice. One way to prevent multiple inclusions is to use #ifndef.
Later in this course, we will learn how to define our own header files, how to share the code written in the header file with the files of the same project, and different ways to prevent multiple inclusions of the code.
The following program demonstrates how to use #ifndef in a header file (name it some_header_file.h):
#ifndef TEST // if macro named TEST is not defined
#define TEST // then proceed and define the macro TEST
printf("Welcome to pumpedupbrains"); // and include this code in a file where this header file is included
#endif // End of if block
If there is some file where we want to include the contents of this header file, then first, the preprocessor checks whether TEST is defined. If it is, the entire block is skipped and the printf() call is not included; otherwise, TEST gets defined and the call is included. In this way, we prevent the same code (in our example, the printf() call) from being included more than once in the same source file. Note that a bare statement like this printf() call is only valid when the header is included inside a function body; real header files usually hold declarations instead.
As we have learned the importance of preprocessor and the role it plays, let’s understand how to generate the expanded source code file so we can examine it.
Obtaining the Expanded Source Code
If you are following along from the beginning, then I am expecting that you have already created the file example.c with the following code:
#include <stdio.h>
#define N 10
int main() {
printf("Value of N is %d", N); // prints the value of N.
return 0;
}
If not, then follow the instructions of the previous tutorial — Introduction to Behind the Scenes. Not only for the code, it is also important to read the tutorial to get an overview of what to expect in the coming tutorials.
If you build and run this code in Code::Blocks, then you may not be able to see the expanded source file in your current folder. This is because preprocessing is just one part of the process which makes it possible to execute the code, and the expanded source code is not the result we are interested in, so Code::Blocks hides this file from us (along with the other intermediate files which we will learn about in the coming tutorials). But, if you are curious to examine the expanded source code file, there is a workaround.
First, open the command prompt (or terminal in Mac) and get into your folder where example.c file is situated (it will be Desktop/C Programs if you are following along).
C:/> cd "Desktop/C Programs"
Desktop/C Programs>
Then, type the following command:
gcc -E example.c -o example.i
Now, let’s understand the meaning of this command.
gcc stands for GNU Compiler Collection. This package allows us to perform all the steps that are required to run our programs. The command which is meant for gcc to handle must always start with gcc.
-E is the flag that tells gcc to stop after the preprocessing stage and skip all the remaining stages. So, due to this flag, the final output will be the expanded source code file, not the executable.
example.c is the name of our source file which we want to preprocess.
-o allows us to name our expanded source code file.
example.i is the name of the expanded source code file with .i extension.
Note the extension of the source code file is .c and that of the expanded source code file is .i.
After executing the above command, we will see the file named example.i in our C Programs folder. After inspecting the file, you will observe the following:
there is quite a lot of code added by the preprocessor which was not provided in the original source file (this is the reason why the output generated by the preprocessor is called the expanded source code file). Most probably, this code came from the stdio.h header file (The #include <stdio.h> line is removed by the preprocessor). So, preprocessor has included the contents of the header file.
If you scroll down a bit, you will observe the main() function we wrote. Inside the printf() function, macro N has been replaced by its actual value which is 10, and the #define N 10 is removed. So, preprocessor has replaced macro with its actual value.
The comment // prints the value of N just after the printf() function has been removed. So, preprocessor has successfully removed the comment as well.
In this way, the preprocessor has done its job, and the expanded source code file example.i looks like the following:
// header file content goes here. I am not including it here.
int main() {
printf("Value of N is %d", 10);
return 0;
}
That’s all 🙂
Summary
The preprocessor is a program which does some processing prior to the compilation stage.
It is responsible for handling the commands that begin with #.
It can
include the contents of the header file.
replace a macro with its actual value.
remove comments.
prevent multiple inclusions of the same code.
add content based on some condition.
To obtain the expanded source code file, type gcc -E example.c -o example.i
Review Questions
Q1. What’s the final result of the preprocessing done on the following source code?
#define X 5
int main() {
int n = X + 5;
// Adds 5 to X and stores the result in n as integer.
return 0;
} // I will add more code here in future.
Solution:
int main() {
int n = 5 + 5;
return 0;
}
Here, preprocessor has removed the comments and replaced the macro X with its value 5. Notice that the evaluation of the expression 5 + 5 is not done by the preprocessor; it only replaces the macro.
Q2. What’s the meaning of the flag -E?
Solution: This flag tells gcc to perform only preprocessing on the source file. The remaining stages are not carried out.
Q3. What is happening in the following code?
#ifndef GREET
#define GREET "hello!"
#endif
Solution:
The # commands are the instructions to the preprocessor. #ifndef GREET followed by #define GREET "hello!" says to the preprocessor: if the macro GREET is not defined, then define it with the value "hello!". The command #endif represents the end of the #ifndef block.