Topic : Compiling C++ Programs On Unix
Author : LUPG


resulting program will be, and the slower the compiler will complete the compilation. One should note that because optimization alters the code in various ways, the chances that an improper optimization will actually break our code grow as we increase the optimization level, since some optimizations tend to be non-conservative, or are simply rather complex and contain bugs. For example, for a long time it was known that using an optimization level higher than 2 (or was it higher than 3?) with gcc resulted in bugs in the executable program. After being warned, if we still want to use a different optimization level (let's say 4), we can do it this way:

cc -O4 single_compile.c -o single_compile

And we're done with it. If you read your compiler's manual page, you'll soon notice that it supports an almost infinite number of command line options dealing with optimization. Using them properly requires a thorough understanding of compilation theory and source code optimization theory, or you might damage your resulting code. A good compilation theory course (preferably based on "the Dragon Book" by Aho, Sethi and Ullman) could do you good.





Getting Extra Compiler Warnings

Normally the compiler only generates error messages about erroneous code that does not comply with the C standard, and warnings about things that usually tend to cause errors at runtime. However, we can usually instruct the compiler to give us even more warnings, which is useful for improving the quality of our source code and for exposing bugs that will really bug us later. With gcc, this is done using the '-W' family of flags. For example, to turn on the large set of commonly useful warnings that gcc groups under '-Wall' (despite the name, this is not every warning gcc knows about), we'll use a command line like this:

cc -Wall single_source.c -o single_source

At first this will annoy us - we'll get all sorts of warnings that might seem irrelevant. However, it is better to eliminate the warnings than to eliminate the usage of this flag. Usually, this option will save us more time than it will cause us to waste, and if used consistently, we will get used to writing proper code without thinking too much about it. One should also note that code that works on one architecture with one compiler might break if we compile it with a different compiler, or on a different system. When developing on the first system, we'll never see these bugs, but when moving the code to a different platform, the bug will suddenly appear. Also, in many cases we will eventually want to move the code to a new system, even if we had no such intentions initially.

Note that sometimes '-Wall' will give you too many warnings, and then you could try a less verbose warning level. Read the compiler's manual to learn about the various '-W' options, and use those that would give you the greatest benefit. Initially they might sound too strange to make any sense, but if you are (or when you become) a more experienced programmer, you will learn which ones are of good use to you.





Compiling A Single-Source "C++" Program

Now that we have seen how to compile C programs, the transition to C++ programs is rather simple. All we need to do is use a C++ compiler in place of the C compiler we have used so far. So, if our program source is in a file named 'single_main.cc' ('cc' to denote C++ code; some programmers prefer a suffix of 'C' for C++ code), we will use a command such as the following:

g++ single_main.cc -o single_main

On some systems you'll use "CC" instead of "g++" (for example, with Sun's compiler for Solaris), or "aCC" (HP's compiler), and so on. You will notice that with C++ compilers there is less uniformity regarding command line options, partially because until recently the language was evolving and had no agreed standard. Still, at least with g++, you will use "-g" for debug information in the code, and "-O" for optimization.





Compiling A Multi-Source "C" Program

So you have learned how to compile a single-source program properly (hopefully by now you have played a little with the compiler and tried out a few examples of your own). Yet, sooner or later you'll see that having all the source in a single file is rather limiting, for several reasons:

1. As the file grows, compilation time tends to grow, and for each little change, the whole program has to be re-compiled.
2. It is very hard, if not impossible, for several people to work on the same project together in this manner.
3. Managing your code becomes harder. Backing out erroneous changes becomes nearly impossible.

The solution to this would be to split the source code into multiple files, each containing a set of closely-related functions (or, in C++, all the source code for a single class).

There are two possible ways to compile a multi-source C program. The first is to use a single command line to compile all the files. Suppose that we have a program whose source is found in files "main.c", "a.c" and "b.c" (found in directory "multi-source" of this tutorial). We could compile it this way:

cc main.c a.c b.c -o hello_world

This will cause the compiler to compile each of the given files separately, and then link them all together to one executable file named "hello_world". Two comments about this program:

1. If we define a function (or a variable) in one file, and try to access it from a second file, we need to declare it as an external symbol in that second file. This is done using the C "extern" keyword.
2. The order of the source files on the command line may be altered. The compiler (actually, the linker) will know how to take the relevant code from each file into the final program, even if the first source file tries to use a function defined in the second or third source file.

The problem with this way of compilation is that even if we only make a change in one of the source files, all of them will be re-compiled when we run the compiler again.

In order to overcome this limitation, we could divide the compilation process into two phases - compiling, and linking. Let's first see how this is done, and then explain:


cc -c main.c
cc -c a.c
cc -c b.c
cc main.o a.o b.o -o hello_world


The first 3 commands each take one source file and compile it into what is called an "object file", with the same name but with a ".o" suffix. It is the "-c" flag that tells the compiler only to create an object file, and not to generate a final executable file just yet. The object file contains the code for the source file in machine language, but with some unresolved symbols. For example, the "main.o" file refers to a symbol named "func_a", which is a function defined in file "a.c". Surely we cannot run the code like that. Thus, after creating the 3 object files, we use the 4th command to link the 3 object files into one program. The linker (which is invoked by the compiler now) takes all the symbols from the 3 object files, and links them together - it makes sure that when "func_a" is invoked from the code in object file "main.o", the function code in object file "a.o" gets executed. Furthermore, the linker also links the standard C library into the program, in this case, to resolve the "printf" symbol properly.

To see why this complexity actually helps us, we should note that normally the link phase is much faster than the compilation phase. This is especially true when doing optimizations, since that step is done before linking. Now, let's assume we change the source file "a.c", and we want to re-compile the program. We'll now need only two commands:


cc -c a.c
cc main.o a.o b.o -o hello_world

In our small example, it's hard to notice the speed-up, but in a project with a few tens of files, each containing a few hundred lines of source code, the time saving is significant; not to mention even larger projects.





Getting a Deeper Understanding - Compilation Steps

Now that we've learned that compilation is not just a simple process, let's look at the complete list of steps taken by the compiler in order to compile a C program.

Driver - what we invoked as "cc". This is actually the "engine" that drives the whole set of tools the compiler is made of. We invoke it, and it begins to invoke the other tools one by one, passing the output of each tool as the input to the next tool.
C Pre-Processor - normally called "cpp". It takes a C source file and handles all the pre-processor directives (#include files, #define macros, conditional source code inclusion with #ifdef, etc.). You can invoke it separately on your program, usually with a command like:

cc -E single_source.c

Try this
