How is a C++ Program Compiled and Executed?
While working in the C++ programming language, it is important to know how to compile C++ code and how to execute it. We write programs in a high-level language that machines cannot execute directly; this is known as source code. The code that machines can execute is in binary form (0s and 1s) and is known as machine code; it is stored in object files and, after linking, in an executable file.
Translating source code (high-level language code) into machine-readable code consists of the following four stages, which we will cover in detail in this article:
- Pre-processing the source code
- Compiling the source code
- Assembling the compiled file
- Linking the object code file to create an executable file
First, let us see in practice how to compile and execute a C++ program on a terminal or command prompt.
Assuming that we have already installed the GCC compiler and have a hello.cpp file that we need to compile and execute, we follow the instructions below:
STEP 1: We need to open the terminal window or command prompt (if we are working on Windows).
STEP 2: We must now change the location to the directory where the hello.cpp file exists.
Example: If the hello.cpp file exists in the Documents folder, we can run a command such as the following on the command prompt: cd Documents
STEP 3: Now, to compile the hello.cpp source code file with GCC, we can use a command such as: g++ hello.cpp -o myProgram1
STEP 4: Assuming we have given the name "myProgram1" while compiling the hello.cpp file, we can now run the program as follows:
On Terminal: ./myProgram1
While using Windows command prompt: myProgram1.exe
A C++ compiler translates C++ source code into machine language code and stores it on the disk in an object file with the extension .o (here, hello.o). The linker then links this object file with the standard library files required by the program and creates an executable file, which is again saved on the disk. When we run the program, the executable file is loaded from the disk into memory, and the CPU then executes its instructions.
The Build Pipeline: Preprocess, Compile, and Link
As we have already discussed, each C++ source code file is first compiled into an object file, and the linker then combines the object files into an executable file that we can run. C++ source code files can pull in header files with the #include directive. Header files usually have the extension .h or .hpp, while C++ standard library headers (such as iostream) have no extension at all.
The first step while translating the source code into machine-readable code is to pre-process the source code file. The header files included within the source code file are not passed directly to the compiler; only the main source code file is passed for pre-processing and compilation purposes. Header files are then indirectly included from the source file itself, where the C++ pre-processor replaces the lines containing the #include directive with the actual content of the included header files.
When we want to compile multiple source code files, header files can be opened multiple times during the pre-processing phase, depending upon how many source code files include them. However, source code files are opened only once for pre-processing and compilation.
The other job of the pre-processor is conditional compilation: when it finds a conditional block (#if, #ifdef, #ifndef, #else, #endif) whose condition evaluates to false, it removes the enclosed code so that the compiler never sees it.
The pre-processor also performs macro replacement. Macros in C++ are defined with the #define directive, and whenever the pre-processor encounters a macro name, it replaces the name with the macro's definition.
After the pre-processing phase is over, we get a translation unit (sometimes huge, depending upon the content of the header files).
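To make macro replacement and conditional compilation concrete, here is a minimal sketch. The macro names BUFFER_SIZE, SQUARE, and USE_FAST_PATH and both function names are hypothetical, chosen only for illustration; a main function is left out so the snippet can be dropped into any program.

```cpp
#include <string>

// Object-like macro: the pre-processor replaces BUFFER_SIZE with 64
// before the compiler sees the code. (Hypothetical name.)
#define BUFFER_SIZE 64

// Function-like macro: SQUARE(x) is replaced textually, which is why
// the argument and the whole expression are wrapped in parentheses.
#define SQUARE(x) ((x) * (x))

// Switch used for conditional compilation. (Hypothetical name.)
#define USE_FAST_PATH 0

std::string selected_path() {
#if USE_FAST_PATH
    return "fast";  // removed by the pre-processor, because USE_FAST_PATH is 0
#else
    return "safe";  // this branch is the only one the compiler sees
#endif
}

int buffer_area() {
    return SQUARE(BUFFER_SIZE);  // expands to ((64) * (64))
}
```

Running the pre-processor alone over such a file should show that only the "safe" branch and the expanded arithmetic remain in the translation unit.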
Example 1. Assuming we have a source code file named hello.cpp containing the below C++ code, we are now interested in obtaining its translation unit (the pre-processed source code file).
To obtain the pre-processed file of the above C++ source code file, we need to open a command prompt window in the directory where the hello.cpp file exists. After that, we can run a command such as: g++ -E hello.cpp -o hello.ii
Here, hello.ii is the name given to the pre-processed file that will be obtained.
To check the number of lines present in the pre-processed file, we can run a command such as (on Linux or macOS): wc -l hello.ii
Output: As we can observe, the pre-processed file hello.ii contains 93,675 lines on this machine after including the header file content. The exact count depends on the compiler and standard library version.
Note that the pre-processed file becomes bigger as we keep including header files in the source code. After pre-processing, the compiler starts the compilation phase to produce an object file with the extension .o. The compiler therefore has to process a much larger input, the translation unit, than the short and simple source file suggests.
How do Source Files Import and Export Symbols?
What if we want to import and export our own functions instead of relying only on the functions declared in standard library headers?
This is possible in C++, where source files can import and export symbols (such as custom function names).
Example: In this example, we created a C++ source code file named sum.cpp containing two exported functions: Int_Sum for adding two integer values and Float_Sum for adding two float values.
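The original listing of sum.cpp is not reproduced here, so the following is only a sketch of what such a file could look like. The parameter types follow from the description (two integers, two floats); the parameter names and function bodies are assumptions.

```cpp
// sum.cpp (sketch; bodies and parameter names are assumed)

// Adds two integer values.
int Int_Sum(int a, int b) {
    return a + b;
}

// Adds two float values.
float Float_Sum(float a, float b) {
    return a + b;
}
```

Because neither function is declared static or placed in an anonymous namespace, both have external linkage, which is what makes them visible to other object files at link time.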
Now, we compile the above source code file to obtain an object file named sum.o using a command such as: g++ -c sum.cpp -o sum.o
After generating the object file sum.o, we can check which symbols are exported or imported using the nm command, which displays information about the symbols in the specified file (an object file or an executable file), for example: nm sum.o
Output:
Note that only useful information has been displayed in the output.
We observe from the output that no symbol has been imported, but two symbols are exported: _Z7Int_Sumii and _Z9Float_Sumff. Both belong to the text (code) section, denoted by T, which is where function definitions are placed.
Note that the original function names, i.e., Int_Sum and Float_Sum, have been mangled (translated by the compiler to encode the parameter types). If we want to see the demangled (original) function names as exported symbols, we can pass the -C option to the nm command, for example: nm -C sum.o
Output: The Int_Sum and Float_Sum function names are displayed as the exported symbols, together with their parameter types.
To import and call the above functions, we need to declare them first. The best way is to create a header file that declares them and then include this header file in every source file that calls them.
We have created a sum.h header file to declare function names: Int_Sum and Float_Sum.
Now let us create another source code file named output.cpp where we will use the "sum.h" header file to call function names: Int_Sum and Float_Sum.
As discussed above, C++ mangles (translates) function names, but that is not the case when we declare functions using extern "C". Let us verify this by compiling the source code file output.cpp into an object file and then using the nm command to display the symbols being exported and imported.
Output:
Note that only useful information has been displayed in the output.
As the function names are not mangled due to the use of extern "C", the symbol names carry no parameter-type information, so two such functions cannot share one name. We therefore had to use different function names for printing, printSumInt and printSumFloat, so that the exported symbols are distinct.
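The effect of extern "C" can be sketched in a single file. The original output.cpp is not shown, so the function bodies and the printed text below are assumptions; Int_Sum and Float_Sum are defined inline (instead of being declared through sum.h) only to keep the sketch self-contained.

```cpp
#include <iostream>

// Assumed to mirror sum.cpp: ordinary C++ functions whose symbol names
// are mangled to include their parameter types.
int Int_Sum(int a, int b) { return a + b; }
float Float_Sum(float a, float b) { return a + b; }

// extern "C" gives these functions C linkage, so their symbol names stay
// unmangled (printSumInt, printSumFloat). C linkage does not encode
// parameter types, so the two functions need two different names.
extern "C" void printSumInt(int a, int b) {
    std::cout << "Int sum: " << Int_Sum(a, b) << '\n';
}

extern "C" void printSumFloat(float a, float b) {
    std::cout << "Float sum: " << Float_Sum(a, b) << '\n';
}
```

Compiling a file like this to an object file and inspecting it with nm should list printSumInt and printSumFloat under their plain names, while the two sum functions appear in mangled form.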
So far, we have only compiled the source code files into object files and have yet to link them. If the linker is not given every object file that defines a required symbol, it stops with a "missing symbol" error.
We have created an output.hpp header file (.hpp is a common extension for C++ headers) to declare the printing functions: printSumInt and printSumFloat.
To get the results by calling the appropriate functions, we create one more source file, result.cpp, and link all the object files together.
To generate the object file result.o for the source code file result.cpp and to check for imported and exported symbols, we can run commands such as: g++ -c result.cpp -o result.o, followed by nm -C result.o
Output:
Note that only useful information in the output has been displayed here. As we can observe in the output, the main function has been exported while the printSumFloat and printSumInt functions have been imported.
The name of the main function has not been mangled despite not using extern "C", because main is a special function in C++ whose linkage is implementation-defined.
To link all the object files together and generate an executable file, we use g++ to drive the linker and run a command such as: g++ sum.o output.o result.o -o output_generated
Here, output_generated is the name given to the executable file after linking all the object files: sum.o, output.o, and result.o.
Final Output: (After executing output_generated file)
How Do Header Guards Work?
Have you ever wondered what happens if the same header file is included multiple times, directly or indirectly, in a single source code file? As we know, during pre-processing each #include directive is replaced by the actual content of the header file, so including the same header file multiple times results in duplicated declarations and definitions.
Example 1. In this example, we have created an unguarded.hpp header file with the student class. The getNum() and setNum() methods have been declared and defined to return and set the value of the private data member "num".
We have now created another header file, guarded.hpp, with the same content as unguarded.hpp, except that the entire content is wrapped in a conditional block. The first time the pre-processor meets this header, the #ifndef condition is true, so it includes the content and defines the __GUARDED_HPP macro. If the same source file asks to include this header file again, the __GUARDED_HPP macro has already been defined, so the pre-processor discards the code placed between the #ifndef and #endif directives. (Strictly speaking, names containing a double underscore are reserved for the implementation, so a name such as GUARDED_HPP is the safer choice.)
Note that, with the guard in place, the content of the header file is included only once for every source file, which avoids duplicated declarations.
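The guard mechanism can be imitated in a single file by pasting the guarded content twice, which is what the pre-processor effectively produces when a header is included twice. This is only a sketch: the class body is assumed from the description above, and the macro is named GUARDED_HPP here to stay clear of reserved identifiers.

```cpp
// First "inclusion": GUARDED_HPP is not defined yet, so the class is kept
// and the macro becomes defined.
#ifndef GUARDED_HPP
#define GUARDED_HPP
class student {
    int num = 0;  // private data member
public:
    int getNum() const { return num; }
    void setNum(int value) { num = value; }
};
#endif

// Second "inclusion" of the same text: GUARDED_HPP is now defined, so the
// pre-processor discards everything up to #endif and no redefinition occurs.
#ifndef GUARDED_HPP
#define GUARDED_HPP
class student {
    int num = 0;
public:
    int getNum() const { return num; }
    void setNum(int value) { num = value; }
};
#endif
```

Removing the #ifndef, #define, and #endif lines from both copies should reproduce the redefinition error that an unguarded header causes.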
To verify the results, we have created a source code file named source.cpp and have included the guarded.hpp header file two times.
For pre-processing the source.cpp file, we can run a command such as: g++ -E source.cpp
Note that the pre-processed file will contain only one student class definition; hence, we can compile the source.cpp file without any problem. For compilation, we can run a command such as: g++ -c source.cpp
We will verify the results of including the unguarded.hpp header file multiple times in the newly created source2.cpp file in the code example below.
For pre-processing the source2.cpp file, we can run a command such as: g++ -E source2.cpp
Note that the pre-processed file will contain two (duplicated) definitions of the student class. This will result in an error during the compilation phase. For compilation, we can run a command such as: g++ -c source2.cpp
Output: As we can observe in the output, source2.cpp could not be compiled because the student class is defined more than once.
Pass by Value and Constness of Parameters
Note that we have used some const parameters in the below C++ example code, which means we can't change their values within the function body. If we try to change the values of const parameters, it will result in a compilation error.
All the parameters passed into the function are passed by value, which means a copy of the actual variables is passed. If we try to change these passed by value variables, we only modify the copy within the function, not the original content.
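As a minimal sketch of both points (the function names and values are hypothetical, not taken from the original example):

```cpp
// 'a' is a const copy: assigning to it inside the function does not compile.
// 'b' is a non-const copy: it can be changed, but only the copy changes.
int sum(const int a, int b) {
    // a = 10;  // error if uncommented: 'a' is a read-only parameter
    b += 1;     // modifies the local copy only
    return a + b;
}

// Helper showing that the caller's variable is untouched after the call.
int caller_value_after_call() {
    int y = 5;
    sum(2, y);  // y is copied into 'b'
    return y;   // still 5
}
```

Here sum(2, 5) returns 8, while the variable passed as the second argument keeps its original value in the caller.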
To obtain the object file and check the imported & exported symbols, we can run the below commands:
Output: Note that only the useful information symbols have been displayed here in the output.
Pass by Reference
When passing variables by reference (&), the constness of the variables matters: a parameter passed as a const reference cannot be modified within the function body.
In the below C++ example code where we have used the first sum function, we have passed const variable 'a' and variable 'b' by reference. This shows that variable 'a' can't be modified, but we can modify variable 'b', whose value will reflect in the main function.
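A function with this shape could look like the following sketch. The body is an assumption; only the parameter pattern (const reference 'a', non-const reference 'b') comes from the description above.

```cpp
// 'a' is a reference to const: it can be read but not modified.
// 'b' is a non-const reference: changes are visible to the caller.
int sum(const int& a, int& b) {
    // a = 10;  // error if uncommented: 'a' refers to a const int
    b += 1;     // modifies the caller's variable
    return a + b;
}

// Helper showing that the caller's variable has changed after the call.
int caller_value_after_call() {
    int x = 2;
    int y = 5;
    sum(x, y);
    return y;   // 6, because 'b' was bound to y
}
```

Compared with the pass-by-value case, no copy is made, so the increment of 'b' is observable in the calling function.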
To obtain the object file and check the imported and exported symbols, we can run the below commands:
Output: As we can notice in the output, symbols are being exported with their constness.
Pass by Pointer
- To declare a pointer to a const element, either of two equivalent forms can be used: const int* p or int const* p
- To declare the pointer itself const, so that it cannot be changed to point to something else, the const keyword goes after the *.
- To make only the pointer const, but not the element it points to, the declaration takes the form int* const p, whereas const int* const p makes both the pointer and the element const.
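The combinations above can be sketched in one function. The variable and function names are hypothetical; the commented-out lines are the ones the compiler would reject.

```cpp
int pointer_constness_demo() {
    int x = 1;
    int y = 2;

    const int* p1 = &x;        // pointer to const int: cannot write through p1
    int const* p2 = &x;        // same meaning as p1, alternative spelling
    p1 = &y;                   // allowed: the pointer itself is not const
    // *p1 = 5;                // error: the pointed-to element is const

    int* const p3 = &x;        // const pointer to non-const int
    *p3 = 10;                  // allowed: the element can be modified
    // p3 = &y;                // error: the pointer itself is const

    const int* const p4 = &y;  // const pointer to const int: neither can change

    // p1 -> y (2), p2 -> x (10), p3 -> x (10), p4 -> y (2)
    return *p1 + *p2 + *p3 + *p4;
}
```

A handy reading rule is to read the declaration from right to left: int* const p3 is "p3 is a const pointer to int".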
In the below C++ example code, we have passed a pointer to the vector arr, where we can't change (clear/erase) the content of the vector due to its constness. The pointer to the vector itself is not const, and hence, we can point it to a new location without any error.
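Since the original listing is not reproduced, here is a minimal sketch of the idea. The function name and the second parameter are assumptions introduced only to have something to re-point to.

```cpp
#include <cstddef>
#include <vector>

// 'arr' points to a const vector: its contents cannot be changed through
// arr, but arr itself can be pointed at a different vector.
std::size_t size_after_repoint(const std::vector<int>* arr,
                               const std::vector<int>* other) {
    // arr->clear();  // error if uncommented: clear() is not a const member
    arr = other;      // allowed: the pointer itself is not const
    return arr->size();
}
```

Re-pointing arr only changes the function's local copy of the pointer, so the caller's vectors and pointers are unaffected.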
Output: Passing by pointer is similar to passing by reference. The main difference is that a reference must refer to an existing object, while a pointer can also be null (NULL or nullptr). As we can see in the output, the constness of the pointed-to element is also part of the exported symbol, and it denotes whether we can modify the element through the pointer or not.
Compiling with Different Flags
Some of the most commonly used g++ compiler flags are as follows:
- -std -> It specifies the C++ language standard (ISO version) to compile against.
For example: -std=c++17 (ISO C++17) and -std=gnu++17 (ISO C++17 with GNU extensions)
- Verbosity [W stands for warning]
1. -Wall -> It turns on a large set of commonly useful compiler warnings (-Waddress, -Wcomment, etc.), though not literally all of them.
2. -Werror -> It turns any warning into a compilation error.
3. -Wextra (formerly -W) -> It turns on additional warnings not enabled by -Wall, such as -Wtype-limits and -Wignored-qualifiers.
4. -Wpedantic -> It issues all the warnings demanded by the strict ISO C++ standard.
- -o -> It sets the name of the output file that the compiler produces.
For example: g++ file.cpp -o hello.bin
- Compilation flags -D
1. -DCOMPILE_VAR -> It defines the COMPILE_VAR macro and is equivalent to adding #define COMPILE_VAR 1 to the code.
2. -DDO_SOMETHING=1 -> It is equivalent to adding #define DO_SOMETHING 1 to the code (note that #define takes no = sign).
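A short sketch of how code can react to macros supplied with -D. The two function names and the file name flags.cpp are hypothetical; the macro names are the ones used in the list above.

```cpp
// Assumed build command (file name is hypothetical):
//   g++ -DCOMPILE_VAR -DDO_SOMETHING=1 flags.cpp -o flags
// Without the -D options, the fallback branches below are compiled instead.

bool compile_var_enabled() {
#ifdef COMPILE_VAR
    return true;   // built with -DCOMPILE_VAR
#else
    return false;  // COMPILE_VAR was not defined on the command line
#endif
}

int do_something_value() {
#ifdef DO_SOMETHING
    return DO_SOMETHING;  // value supplied by -DDO_SOMETHING=<value>
#else
    return 0;             // default when the macro is absent
#endif
}
```

This pattern lets one source file produce different builds (for example, with or without debug output) without editing the code.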
Conclusion
- Translating the source code (high-level language code) into machine-readable code consists of the following four processes:
- Pre-processing the source code
- Compiling the source code
- Assembling the compiled file
- Linking the object code file to create an executable file
- To obtain an object file with the extension .o from a source code file (say, code.cpp), we can run a command such as: g++ -c code.cpp -o code.o
- To compile and link a C++ source file (say, hello.cpp) into an executable, we can run a command such as: g++ hello.cpp -o hello
- For pre-processing a source code file (say, hello.cpp), we can run a command such as: g++ -E hello.cpp -o hello.ii
- To check the imported and exported symbols in an object file (say, hello.o), we can run a command such as: nm -C hello.o