In the last tutorial, we built hello_world.c from the command line and looked at some of the steps and mechanisms behind make. Before growing your code with fancy functions and header files, it is also important to understand the process of compilation. Transforming your source code (hello_world.c) into an executable (hello_world_exc) takes four steps: Preprocessing, Compiling, Assembling and Linking. All four happen invisibly behind the scenes every time you compile. If you want to see them, try this

gcc -v -Wall -g -o hello_world_exc hello_world.c

Notice the new flag -v, which stands for "verbose": it makes gcc print the commands it runs for each stage of compilation.

If you tried the previous command, your screen should now be bombarded with information. That is because building even the simple hello_world.c is not that trivial for your machine. But don't fret, just read on!

1) Preprocessing (from .c to .i)

In summary, there are three tasks in this step: Comment Stripping, Text Substitution and File Inclusion. Firstly, remember that we put some non-programmatic text, or comments, after the //. Those are only useful to human readers, so the preprocessor removes them, hence Comment Stripping. Secondly, a line beginning with the special character # is a Preprocessor Directive. #include means File Inclusion which, in our case, requests the header file of the Standard Input Output Library (stdio.h), which in turn includes other header files. #define, on the other hand, means Text Substitution.

To see the preprocessed file, try

gcc -E hello_world.c -o hello_world.i

If you peeked inside this file, you would see something similar to this

# 1 "hello_world.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "<command-line>" 2
# 1 "hello_world.c"
# 1 "/usr/include/stdio.h" 1 3 4
# 27 "/usr/include/stdio.h" 3 4
# 1 "/usr/include/features.h" 1 3 4
# 367 "/usr/include/features.h" 3 4
# 1 "/usr/include/x86_64-linux-gnu/sys/cdefs.h" 1 3 4
# 410 "/usr/include/x86_64-linux-gnu/sys/cdefs.h" 3 4
# 1 "/usr/include/x86_64-linux-gnu/bits/wordsize.h" 1 3 4
# 411 "/usr/include/x86_64-linux-gnu/sys/cdefs.h" 2 3 4
# 368 "/usr/include/features.h" 2 3 4
# 391 "/usr/include/features.h" 3 4
# 1 "/usr/include/x86_64-linux-gnu/gnu/stubs.h" 1 3 4
# 10 "/usr/include/x86_64-linux-gnu/gnu/stubs.h" 3 4
# 1 "/usr/include/x86_64-linux-gnu/gnu/stubs-64.h" 1 3 4
# 11 "/usr/include/x86_64-linux-gnu/gnu/stubs.h" 2 3 4
# 392 "/usr/include/features.h" 2 3 4
# 28 "/usr/include/stdio.h" 2 3 4

.........

extern void funlockfile (FILE *__stream) __attribute__ ((__nothrow__ , __leaf__));
# 942 "/usr/include/stdio.h" 3 4

# 2 "hello_world.c" 2

# 2 "hello_world.c"
int main(){
 printf("Hello world! \n");
 return 0;
}

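Here, each line starting with # is a linemarker of the form # line-number "file-name" flags. It tells the compiler which line of which original file the text that follows came from. Flag 1 marks the start of a new file, 2 marks the return to a file after an included file ends, 3 marks text that comes from a system header, and 4 marks text to be treated as wrapped in an implicit extern "C" block. Read together, the linemarkers trace a chain of files including each other: hello_world.c (line 1) includes /usr/include/stdio.h, which at its line 27 includes /usr/include/features.h, which at its line 367 includes sys/cdefs.h, and so on.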

2) Compiling (from .i to .s)

The next step is to compile the preprocessed file into assembly code, which is specific to the target processor and system.

To see this file, try

gcc -S hello_world.i -o hello_world.s

If you take a peek into hello_world.s, you will see some assembly-language instructions like this

        .file   "hello_world.c"
        .section        .rodata
.LC0:
        .string "Hello world! "
        .text
        .globl  main
        .type   main, @function
main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movl    $.LC0, %edi
        call    puts
        movl    $0, %eax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
.LFE0:
        .size   main, .-main
        .ident  "GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609"
        .section        .note.GNU-stack,"",@progbits

3) Assembling (from .s to .o)

We are pretty close now. The next step is to convert the assembly language into machine code with the Assembler. If your code calls functions defined elsewhere (such as puts above), the Assembler leaves their addresses unresolved, to be filled in later by the Linker in the next step. To obtain the machine code, invoke this command

as hello_world.s -o hello_world.o

or directly from source hello_world.c, using gcc

gcc -c hello_world.c -o hello_world.o

4) Linking

The mechanism of linking multiple object files is quite simple. However, the actual command can be quite involved if you have to write it manually

ld -dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o ... /lib/a/b/d.o ... hello_world.o

Fortunately, gcc has a command to do this automatically for us

gcc hello_world.o

and this produces the default a.out, which runs exactly like our previously compiled hello_world_exc. However, if you still love that name for your executable, you can link with this command

gcc hello_world.o -o hello_world_exc

Summary

In summary, the process of compilation comprises 4 steps: Preprocessing, Compiling, Assembling and Linking. I would like to attach here a beautiful graph by Prof. Chua Hock-Chuan, Nanyang Technological University, Singapore.

Note that the output in the graph is an .exe because the graph was originally produced to illustrate the process on Windows. On Linux, the executable does not have this extension.

Reference

http://codingfreak.blogspot.com/2008/02/compilation-process-in-gcc.html

https://www3.ntu.edu.sg/home/ehchua/programming/cpp/gcc_make.html
