Building Programs from Source Code

From CLFS-HINTS



Basics of Building Programs/Libraries

To build a program, the source code is converted into machine code stored in an object file, and that object file is then linked with operating-system-specific startup code and libraries to produce an executable.


From Code to a Bin file

We start with the source code and the header files (if used).

Step 1. The compiler's preprocessor takes the source code and inserts the
contents of the header files into it. The result is normally held in memory
rather than written to disk.
Step 2. The compiler proper takes the full source code from Step 1 and
translates it into assembly language.
Step 3. The assembler translates the assembly code into machine instructions
for the target computer on which the program will run. The machine
instructions are stored in object files, one per source file.
Step 4. The link editor (linker) resolves references between the object files,
if there are several, and checks any shared (dynamic) libs to make sure they
supply the symbols the object files need. Code from static libs is copied into
the output at this step.
Step 5. The output of the linker is a file named a.out by default. This file is
the compiled, ready-to-run version of the source code. It can be renamed
afterwards, or a better name can be chosen at link time with the -o option.
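
As a rough illustration of the steps above, the short C file below notes in comments what each stage does with it. The file names stages.c and main.o, the macro PI_X100, and the function circle_area_x100 are all made up for this sketch. The -E, -S, and -c options are GCC's switches for stopping after the preprocessor, the compiler proper, and the assembler respectively.

```c
/* stages.c: a hypothetical file used only to illustrate the build steps. */

/* Step 1 (preprocessor, `gcc -E stages.c`): the #include line is replaced
 * by the full text of limits.h, and every PI_X100 is replaced by 314. */
#include <limits.h>
#define PI_X100 314

/* Step 2 (compiler proper, `gcc -S stages.c`): this function is translated
 * into assembly language and written to stages.s.
 * Step 3 (assembler, `gcc -c stages.c`): the assembly becomes machine
 * instructions stored in the object file stages.o. */
int circle_area_x100(int radius)
{
    /* Reject inputs whose result would not fit in an int; INT_MAX comes
     * from limits.h. */
    if (radius < 0 || (radius != 0 && radius > INT_MAX / PI_X100 / radius))
        return -1;
    return PI_X100 * radius * radius;
}

/* Steps 4 and 5 (linker): `gcc stages.o main.o` would combine this object
 * file with a hypothetical main.o that defines main(), and write a.out. */
```

The file deliberately has no main(), so on its own it can be preprocessed, compiled, and assembled, but not linked into a finished program.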

The CLFS System

On the CLFS System, we use:

C and C++ compilation -> GCC (GNU Compiler Collection)
Linker, Assembler -> Binutils
Main C library -> Glibc

Critical libraries

Glibc (the GNU C Library) is an implementation of the standard C library. Nearly every program running on your computer depends on it. The C++ standard library (libstdc++) is not part of Glibc; it ships with GCC and is itself built on top of Glibc.


Note

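This is the one package which should NEVER be touched or modified unless you know what you're doing. Modifying this CRITICAL CORE library can very easily render your system inoperable. The kernel itself does not link against Glibc, but virtually every dynamically linked program that runs on top of it does, so a broken Glibc can leave you with a kernel that boots and nothing that runs.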

Definitions

Linking

Linking is the step in which object code and OS-specific startup code are combined to create an executable. There are two types of executable file: static and dynamic. A static executable contains ALL the library code needed to run the program. A dynamic executable loads its libraries at run time.
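
A small sketch of what the linker contributes: the code below declares strtol by hand instead of including stdlib.h, so the compiler sees only a promise that the function exists somewhere. The linker is what resolves the name against the C library, either copying the code in (static) or recording a run-time dependency (dynamic). The wrapper name parse_decimal is hypothetical.

```c
/* Declared by hand: the definition lives in the C library, not in this file. */
long strtol(const char *text, char **end, int base);

/* Hypothetical wrapper. The object file built from this code contains only
 * an unresolved reference to strtol until the linker fills it in. */
long parse_decimal(const char *text)
{
    return strtol(text, (char **)0, 10);
}
```

With GCC, the default is a dynamic executable; passing -static at link time asks for a static one instead, provided the static version of the C library is installed.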

Linker

A program that combines compiled object files with libraries and adds the executable headers the operating system needs in order to load the result.

Example Program: ld

Assembler

The assembler takes source code written in the processor's assembly language and turns it into machine code, which is stored in an object file.

Example Program: as

Compiler

The compiler is the set of programs which translates source code written in a high-level language into object code. It does NOT create an executable binary; that is the job of the linker. The most commonly compiled languages are C and C++. Java is also compiled, though usually to bytecode for a virtual machine rather than to native object code. Older compiled languages include COBOL, BASIC and FORTRAN.

Example Programs: gcc (cc), g++ (c++)

Static Lib

Libraries which are archives of object files, built with the ar program (hence the .a suffix) and also called object ARchives. The code a program needs is copied out of the lib and into the program at linking. After the linking is done, the lib file is not needed to run the binary.

Example: /lib/libc.a

Dynamic (Shared) Lib

Libraries which contain compiled object code in a form that can be loaded by a program when it is needed, rather than being copied into the program. Unlike static libs, they are not ar archives. These libs are only checked at the linking stage, and must be present at run time.

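Example: /lib/libc.so.6 (with Glibc, the libc.so name used at link time is usually a small linker script that refers to this file rather than a symlink)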

Headers

Header files serve as the interface between your program and the libraries supplied by the C compilation system. Because the functions that perform standard I/O, for example, very often use the same definitions and declarations, the system supplies a common interface to those functions in the header file stdio.h.

You can create a header file with any editor, store it in a convenient directory, and include it in your program.


Header files are traditionally designated by the suffix .h, and are brought into a program at compile time. The preprocessor component of the compiler does this because it interprets the #include statement in your program as a directive. The two most commonly used directives are #include and #define.

The #include directive is used to call in and process the contents of the named file.

   #include <stdio.h>
   #include "headers/myheader.h"
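
A minimal sketch of what a hand-written header such as headers/myheader.h might contain. The contents are invented for illustration, and both "files" are shown in a single listing, with comments marking where the header would end and the source file begin. The MYHEADER_H guard keeps the declarations from being processed twice if the header is included more than once.

```c
/* ---- headers/myheader.h (hypothetical contents) ---- */
#ifndef MYHEADER_H
#define MYHEADER_H

/* Declaration only: tells the compiler the function's name and types. */
int add_numbers(int a, int b);

#endif /* MYHEADER_H */

/* ---- a source file that would normally begin with
 *      #include "headers/myheader.h"                ---- */

/* Definition: the code that the declaration above promises. */
int add_numbers(int a, int b)
{
    return a + b;
}
```

Any other source file that includes the header can call add_numbers without seeing its definition; the linker connects the call to the code later.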

The #define directive is used to define the replacement token string for an identifier:

   #define NULL 0
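
To make the token replacement concrete, here is a small sketch with invented names (BUFFER_SIZE, SQUARE, scaled_size). The preprocessor substitutes text before the compiler proper runs, which is why function-like macros are normally wrapped in parentheses.

```c
#define BUFFER_SIZE 64
/* Parentheses keep the expansion correct for arguments such as (n + 1). */
#define SQUARE(x) ((x) * (x))

int scaled_size(int n)
{
    /* After preprocessing, the next line reads:
     *     return 64 + ((n + 1) * (n + 1));                      */
    return BUFFER_SIZE + SQUARE(n + 1);
}
```

Without the inner parentheses, SQUARE(n + 1) would expand to n + 1 * n + 1, which is a different calculation.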

Commonly used headers:

    assert.h   assertion checking
     ctype.h   character handling
     errno.h   errno handling
     float.h   floating point limits
    limits.h   other data type limits
    locale.h   program's locale
      math.h   mathematics
    setjmp.h   nonlocal jumps
    signal.h   signal handling
    stdarg.h   variable arguments
    stddef.h   common definitions
     stdio.h   standard input/output
    stdlib.h   general utilities
    string.h   string handling
      time.h   date and time
    unistd.h   POSIX system calls (not part of standard C)
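
As a brief sketch of how a few of these headers are used together, the hypothetical function below relies on ctype.h for character classification, string.h for the length of the input, and stddef.h for the size_t type. The name count_upper is invented for this example.

```c
#include <ctype.h>   /* isupper */
#include <stddef.h>  /* size_t */
#include <string.h>  /* strlen */

/* Hypothetical helper: counts the uppercase letters in a string. */
size_t count_upper(const char *text)
{
    size_t count = 0;
    size_t length = strlen(text);

    for (size_t i = 0; i < length; i++) {
        /* Cast to unsigned char: isupper is undefined for negative values. */
        if (isupper((unsigned char)text[i]))
            count++;
    }
    return count;
}
```

Each header supplies only declarations; the code for isupper and strlen comes from the C library at link time.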
