We all know that writing proper Assembly is hard.
While constructing even a small Assembly file, we are constantly tormented
by weird compiler errors, segmentation faults and undefined behaviour.
During my first few weeks at a university that forces us to write Assembly
by hand, I have come across many fellow students who went ape about
some weird error they weren't able to solve.
It's not their fault though. Expecting students (some of whom have no previous programming experience) to write god-tier GAS Assembly in the first three weeks of the programme is unrealistic when all they get is a tedious and incomplete self-study guide. Resources about GAS Assembly on the internet are scarce, because hardly anyone writes Assembly by hand and the Intel syntax is more popular. Yet Assembly becomes ten times easier once you know what you're doing. It's time to write a proper guide.
Hello world: not that simple
Let's start with a proper simple hello world program. I will explain in detail what every line is doing.
 1  .data
 2
 3  hello_world_str:
 4      .string "Hello, World!"
 5
 6
 7  .text
 8
 9  .global main
10  main:
11
12      # prologue
13
14      pushq %rbp                  # save rbp
15      movq %rsp, %rbp             # create a new stackframe
16
17      # output "Hello, World!" and a newline
18
19      movq $hello_world_str, %rdi
20      call puts
21
22      # epilogue
23
24      movq %rbp, %rsp             # end the stackframe
25      popq %rbp                   # restore the previous stackframe
26      movq $0, %rax               # exit successfully
27      ret                         # returning from the main function exits the program
Let's save the file as app.s and compile it with:
    gcc app.s -o app -no-pie -g
This command calls the GNU Compiler Collection to assemble and link the Assembly file.
With the output flag -o we specify that we want the output executable to
be named app.
The -no-pie flag is used to disable the
Position Independent Executable option.
PIE is a Linux security feature which we have to disable to make our
lives easier when compiling Assembly.
The -g flag denotes that we want
to add debugging symbols to our executable, which will be very helpful
when we have to debug our program.
Let's go through the source code line by line.
1: We start by defining the data section.
In this section we put the string we want to print.
The data section can also be used to save any data our program might
need, or to store global variables.
3: This is a label which we can use to
address the string containing our hello world message.
4: The .string keyword denotes that we want
to store a string of characters. The assembler will append a null byte to
the string, which is used to identify the end of the string.
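To make the appended null byte visible, here is a minimal sketch that stores the same text twice: once with .string and once with .ascii, which adds nothing, so the terminating byte is written by hand. The labels with_string and with_ascii are made up for this illustration; the program is meant to be built with the same gcc command as the hello world file.

```asm
.data

with_string:
    .string "Hello"         # the assembler appends the null byte for us

with_ascii:
    .ascii "Hello\0"        # .ascii appends nothing, so we add the null byte ourselves

.text

.global main
main:
    pushq %rbp              # prologue
    movq %rsp, %rbp

    movq $with_string, %rdi
    call puts               # prints "Hello" and a newline

    movq $with_ascii, %rdi
    call puts               # prints "Hello" and a newline again

    movq %rbp, %rsp         # epilogue
    popq %rbp
    movq $0, %rax
    ret
```

Both calls print the same line. Leaving out the \0 in the .ascii variant would let puts keep reading past the text until it happens to find a zero byte.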
7: Next we define the text section.
This section will hold the actual code of our program.
9: We specify that we will declare a global main function.
Without this, the linker cannot find the main function and our code will fail to build.
10: Here we put the label for the main
function. This function is special, because the startup code that gcc links
into our executable uses the main function as entry point for our program.
12: Most functions we will write
require a prologue.
In the prologue we will create a new stack frame.
A stack frame is a place for a function to create local variables.
14: We have to ensure the stack frame of the
caller is left untouched, so we have to save the stack frame pointer (also known
as the base pointer, %rbp) on the stack.
15: This line of code will move the base
pointer to the top of the stack (defined by the stack pointer,
%rsp), where we place the new stack frame.
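As a rough sketch of what a stack frame is for, the program below reserves room for a local variable right after the prologue and reads it back through %rbp. It goes slightly beyond the hello world file: the label format, the 16 byte reservation and the use of printf (another C standard library function) are choices made for this illustration only.

```asm
.data

format:
    .string "local = %ld\n"

.text

.global main
main:
    pushq %rbp              # save the caller's base pointer
    movq %rsp, %rbp         # start our own stack frame
    subq $16, %rsp          # reserve 16 bytes for locals (keeps %rsp 16-byte aligned)

    movq $7, -8(%rbp)       # store 7 in the local variable at %rbp - 8

    movq $format, %rdi      # first argument: address of the format string
    movq -8(%rbp), %rsi     # second argument: the value of our local variable
    movq $0, %rax           # printf is variadic: no vector registers are used
    call printf             # prints "local = 7"

    movq %rbp, %rsp         # epilogue: release the locals
    popq %rbp
    movq $0, %rax
    ret
```

Because the local lives at a fixed offset from %rbp, it stays reachable no matter how much the stack pointer moves during the function.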
19: In order to print the hello world string,
we move its address into the %rdi
register, which holds the value of the first argument for a function.
The dollar sign in front of the hello world string label indicates that we
take the address of the label, instead of the first bytes stored at that address.
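The difference the dollar sign makes can be sketched with a small program that loads the same label both ways and prints the results. The labels number and format, and the call to printf, are assumptions made for this example and not part of the hello world file. It relies on the -no-pie flag from the build command above.

```asm
.data

number:
    .quad 42                # eight bytes holding the value 42

format:
    .string "address = %p, value = %ld\n"

.text

.global main
main:
    pushq %rbp              # prologue
    movq %rsp, %rbp

    movq $format, %rdi      # first argument: address of the format string
    movq $number, %rsi      # with $: the address of the label
    movq number, %rdx       # without $: the eight bytes stored at the label (42)
    movq $0, %rax           # printf is variadic: no vector registers are used
    call printf

    movq %rbp, %rsp         # epilogue
    popq %rbp
    movq $0, %rax
    ret
```

The printed address depends on where the linker placed the data section, while the value is always 42.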
20: Now we call the
puts function. It is a function from the C
standard library. As the name (short for "put string") implies, it will put
a string onto our screen.
puts expects an address to a string as its
first parameter, and it will print all characters one by one until it
finds a null byte, indicating the end of the string.
Finally it will end the line by printing the newline character.
22: We conclude the function with an
epilogue, which does the opposite of the prologue.
It will clean up the current stackframe, so we can safely pass control
back to the caller.
24: This brings back the stack pointer to
where it was right after the prologue saved the base pointer, releasing
anything the function placed on the stack since then.
25: Next we restore the base pointer to
where it was before the function was called.
26: We move the value 0 into the return register, %rax.
The return value of the main function denotes the exit code of the program.
Zero means success, and anything else implies that an error occurred.
27: Lastly, we return from the main function.
This will exit the program.
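To see the exit code from line 26 in practice, here is a minimal sketch of a program that does nothing except return a non-zero value. The value 3 is arbitrary and chosen only for this illustration.

```asm
.text

.global main
main:
    pushq %rbp              # prologue
    movq %rsp, %rbp

    movq %rbp, %rsp         # epilogue
    popq %rbp
    movq $3, %rax           # return 3 instead of 0
    ret                     # the return value becomes the exit code
```

After building it with the same gcc command, running ./app followed by echo $? in the shell should show 3.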
Hopefully you have now learnt what a basic Assembly file looks like. It's time to dive deeper into some Assembly fundamentals.