C is a well-known programming language, popular with experienced and new programmers alike. Source code written in C uses standard English terms, so it's considered human-readable. However, computers only understand binary code. To convert code into machine language, you use a tool called a compiler.
A very common compiler is GCC, the GNU Compiler Collection (originally the GNU C Compiler). The compilation process involves several intermediate steps and companion tools.
Install GCC
To confirm whether GCC is already installed on your system, use the gcc command:
$ gcc --version
If necessary, install GCC using your package manager. On Fedora-based systems, use dnf:
$ sudo dnf install gcc libgcc
On Debian-based systems, use apt:
$ sudo apt install build-essential
After installation, you can check where GCC is installed:
$ whereis gcc
Simple C program using GCC
Here's a simple C program to demonstrate how to compile code using GCC. Open your favorite text editor and paste in this code:
// hellogcc.c
#include <stdio.h>

int main(void) {
    printf("Hello, GCC!\n");
    return 0;
}
Save the file as hellogcc.c and then compile it:
$ ls
hellogcc.c
$ gcc hellogcc.c
$ ls -1
a.out
hellogcc.c
As you can see, a.out is the default executable generated as a result of compilation. To see the output of your newly compiled application, just run it as you would any local binary:
$ ./a.out
Hello, GCC!
Name the output file
The filename a.out isn't very descriptive, so if you want to give a specific name to your executable file, you can use the -o option:
$ gcc -o hellogcc hellogcc.c
$ ls
a.out hellogcc hellogcc.c
$ ./hellogcc
Hello, GCC!
This option is useful when developing a large application that needs to compile multiple C source files.
Intermediate steps in GCC compilation
There are actually four steps to compiling, even though GCC performs them automatically in simple use-cases.
- Pre-Processing: The GNU C Preprocessor (cpp) parses the headers (#include statements), expands macros (#define statements), and generates an intermediate file such as hellogcc.i with expanded source code.
- Compilation: During this stage, the compiler converts pre-processed source code into assembly code for a specific CPU architecture. The resulting assembly file is named with a .s extension, such as hellogcc.s in this example.
- Assembly: The assembler (as) converts the assembly code into machine code in an object file, such as hellogcc.o.
- Linking: The linker (ld) links the object code with the library code to produce an executable file, such as hellogcc.
When running GCC, use the -v option to see each step in detail.
$ gcc -v -o hellogcc hellogcc.c
Manually compile code
It can be useful to experience each step of compilation because, under some circumstances, you don't need GCC to go through all the steps.
First, delete the files generated by GCC in the current folder, except the source file.
$ rm a.out hellogcc
$ ls
hellogcc.c
Pre-processor
First, start the pre-processor, redirecting its output to hellogcc.i:
$ cpp hellogcc.c > hellogcc.i
$ ls
hellogcc.c hellogcc.i
Take a look at the output file and notice how the pre-processor has included the headers and expanded the macros.
Compiler
Now you can compile the code into assembly. Use the -S option to tell GCC to stop after producing assembly code.
$ gcc -S hellogcc.i
$ ls
hellogcc.c hellogcc.i hellogcc.s
$ cat hellogcc.s
Take a look at the assembly code to see what's been generated.
Assembly
Use the assembly code you've just generated to create an object file:
$ as -o hellogcc.o hellogcc.s
$ ls
hellogcc.c hellogcc.i hellogcc.o hellogcc.s
Linking
To produce an executable file, you must link the object file to the libraries it depends on. This isn't quite as easy as the previous steps, but it's educational:
$ ld -o hellogcc hellogcc.o
ld: warning: cannot find entry symbol _start; defaulting to 0000000000401000
ld: hellogcc.o: in function `main`:
hellogcc.c:(.text+0xa): undefined reference to `puts'
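The linker reports an undefined reference to puts because it was never told to search the C library (libc), where puts is defined. The warning about the missing _start entry symbol has a similar cause: the startup objects that define it weren't linked either. To resolve this, you must find suitable linker options to link the required libraries. This is no small feat, and it's dependent on how your system is laid out.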
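When linking, you must link your code with the C runtime (CRT) startup objects, such as crt1.o, crti.o, and crtn.o, a set of routines that help binary executables launch. The linker also needs to know where to find important system libraries, including libc and libgcc. Because these libraries can depend on one another, they're commonly listed between the --start-group and --end-group options, which tell the linker to search the group repeatedly until no new undefined references can be resolved. GCC's own link line normally also includes its crtbegin.o and crtend.o objects, which the examples below omit.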
This example uses paths as they appear on a RHEL 8 install, so you may need to adapt the paths depending on your system.
$ ld -dynamic-linker \
/lib64/ld-linux-x86-64.so.2 \
-o hellogcc \
/usr/lib64/crt1.o /usr/lib64/crti.o \
--start-group \
-L/usr/lib/gcc/x86_64-redhat-linux/8 \
-L/usr/lib64 -L/lib64 hellogcc.o \
-lgcc \
--as-needed -lgcc_s \
--no-as-needed -lc -lgcc \
--end-group \
/usr/lib64/crtn.o
The same linker procedure on Slackware uses a different set of paths, but you can see the similarity in the process:
$ ld -static -o hellogcc \
-L/usr/lib64/gcc/x86_64-slackware-linux/11.2.0/ \
/usr/lib64/crt1.o /usr/lib64/crti.o \
hellogcc.o /usr/lib64/crtn.o \
--start-group -lc -lgcc -lgcc_eh \
--end-group
Now run the resulting executable:
$ ./hellogcc
Hello, GCC!
Some helpful utilities
Below are a few utilities that help examine the file type, symbol table, and the libraries linked with the executable.
Use the file utility to determine the type of file:
$ file hellogcc.c
hellogcc.c: C source, ASCII text
$ file hellogcc.o
hellogcc.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
$ file hellogcc
hellogcc: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=bb76b241d7d00871806e9fa5e814fee276d5bd1a, for GNU/Linux 3.2.0, not stripped
Then use the nm utility to list the symbol table of an object file:
$ nm hellogcc.o
0000000000000000 T main
U puts
Use the ldd utility to list the shared libraries an executable depends on:
$ ldd hellogcc
linux-vdso.so.1 (0x00007ffe3bdd7000)
libc.so.6 => /lib64/libc.so.6 (0x00007f223395e000)
/lib64/ld-linux-x86-64.so.2 (0x00007f2233b7e000)
Wrap up
In this article, you learned the various intermediate steps in GCC compilation and the utilities to examine the file type, symbol table, and libraries linked with an executable. The next time you use GCC, you'll understand the steps it takes to produce a binary file for you, and when something goes wrong, you know how to step through the process to resolve problems.