C Compiler Options Lab

Before getting into the lab, I must congratulate the Fedora community on finally making the process of setting up the operating system painless. When I installed Fedora in the past, I had to spend some time getting my wireless card to work, while other Linux distros were good to go out of the box. After installing Fedora 20 on my laptop, I was pleasantly surprised to find my wireless network card ready to connect.

To start the C compiler lab, I needed to install the C libraries so I could use the standard headers via #include. However, yum kept returning an error when attempting to install them. I spoke with Professor Tyler, who suggested I turn off SELinux for the yum install, and that allowed the C libraries to install properly. Just don’t forget to turn SELinux back on.

We compiled a simple Hello, World! C program with the following compiler options:

  • -g :: enable debugging information
  • -O0 :: do not optimize
  • -fno-builtin :: do not use builtin function optimizations

Using objdump with the proper options, we can take a closer look at the compiled Hello, World! program. Below are a few of the options we used to examine the code and see what is happening closer to the machine level. You can click on the options to see the output generated:


Looking at the output for the first time, it looked very daunting and may as well have been scratches on the wall. To some degree it still looks that way; however, with what we learned in class and after spending some time with the output, I can see the beginnings of what is going on. In the output generated with the --source option, where it reads printf("Hello World!\n");, we see:

int main() {
400530: 55 push %rbp
400531: 48 89 e5 mov %rsp,%rbp
printf("Hello World!\n");
400534: bf e0 05 40 00 mov $0x4005e0,%edi
400539: b8 00 00 00 00 mov $0x0,%eax
40053e: e8 cd fe ff ff callq 400410

My interpretation (I am hoping I am right): push %rbp decrements the stack pointer and stores the value of the base pointer on the stack, and the following line copies the stack pointer into the base pointer. The line at 400534 moves the hex value 0x4005e0 (presumably the address of the string) into the edi register, and the next line copies the hex value 0x0 into the eax register. The line at 40053e calls the printf function (PLT = procedure linkage table?). Hopefully I am right, but if I’m wrong, feel free to let me know.

The lab continues by having us compile the Hello, World! program with new options or with existing options removed. The first is to add the -static option. Immediately we can see a difference. The binary compiled without the -static option is 9566B in size, while the binary generated with the -static option is 812710B.

Using objdump with the -d option generates a text file that has 139564 lines, as opposed to 183 lines when run against the binary that was compiled without the -static option. The -static option prevents linking against shared libraries, which means the library code the program needs must be copied into the binary itself. Obviously this is going to have an impact on the size of the binary file and on the objdump output.

The lab continues by having us remove the -fno-builtin option when compiling. Looking at the objdump output with the --source option, the PLT section now contains an entry for a function called puts instead of printf. puts is a C function that prints a string followed by a newline. Since we are calling printf to display a plain string that ends with a newline, the compiler considers it more optimal to use puts instead.

Next, we remove the -g option when compiling with GCC. This excludes debugging information from the binary, making the output smaller. We can see the removal of the debug sections by using objdump with the -s option. The debug information would normally be at the bottom, but it is missing here. This would explain the size difference in the binary.

Step 4 of the lab has us look at how values are loaded into the registers (that is, the order in which the registers are used). It seems that they are loaded in a specific order. Here are a few excerpts at various argument counts:

  • One Argument:
    mov 0x200abb(%rip),%eax # 601034 =a=
  • Three Arguments:
    mov 0x200acf(%rip),%ecx # 60103c =c=
    mov 0x200ac5(%rip),%edx # 601038 =b=
    mov 0x200abb(%rip),%eax # 601034 =a=
  • Ten Arguments:
    mov 0x200b1c(%rip),%r10d # 60105c =j=
    mov 0x200b0d(%rip),%r9d # 601054 =i=
    mov 0x200b02(%rip),%r8d # 601050 =h=
    mov 0x200af8(%rip),%edi # 60104c =g=
    mov 0x200aee(%rip),%esi # 601048 =f=
    mov 0x200ae4(%rip),%ebx # 601044 =e=
    mov 0x200ad9(%rip),%r11d # 601040 =d=
    mov 0x200acf(%rip),%ecx # 60103c =c=
    mov 0x200ac5(%rip),%edx # 601038 =b=
    mov 0x200abb(%rip),%eax # 601034 =a=

WordPress seems to have an issue displaying greater-than and less-than signs (even with code tags), so the = = stand in for less-than and greater-than signs. You can find a sample output here. It seems the order in which the registers are used is eax, edx, ecx, r11d, ebx, esi, edi, r8d, r9d, r10d. Perhaps the next incarnation of assembly should name them something like r1, r2, r3, …?

The subsequent instructions for this lab have us move the printf call out of main() into a separate function, and then change the optimization level from 0 to 3 (-O0 to -O3). The output of objdump for the former is thus. We can see a section of assembly specifically for the function that was created, and the call to it in the “main()” section. This gave me insight into this part of the assembly code:
400540: 55 push %rbp
400541: 48 89 e5 mov %rsp,%rbp

Each time a function is entered (both main and outhw), these two lines appear. They seem to push the current base pointer onto the stack and replace it with the value of the stack pointer for processing. Then the following lines:
400553: 5d pop %rbp
400554: c3 retq

restore the saved base pointer from the stack into rbp and return to the calling function for further processing.

Lastly, we changed the optimization level from 0 to 3. We can see a significant change when using objdump with the --source option. Compared to the -O0 build, the main part of the program appears earlier in the output. It is also only 7 lines, as opposed to the 15 lines in the original compilation. This is mostly due to the removal of the instructions that move things onto and off the stack. The GCC optimizer seems to have noticed that it does not need to set up a stack frame to run the program and compiled it without one.

After completing this lab, I have never appreciated the C compiler more.

-Richard K.
