Hello, Assembly!

Published September 19th, 2021


Introduction

We all know that writing proper Assembly is hard. While constructing even a simple Assembly file, we are constantly tormented by weird assembler errors, segmentation faults, undefined behaviour and more. During my first few weeks at a university that forces us to write Assembly by hand, I have come across many fellow students who went ape about some weird error they weren't able to solve.

It's not their fault though. Expecting students (some of whom have no previous programming experience) to write god-tier GAS Assembly in the first three weeks of the programme simply isn't realistic with a tedious and incomplete self-study guide. Resources about GAS Assembly on the internet are scarce, because hardly anyone writes Assembly by hand and the Intel syntax is more popular than the AT&T syntax that GAS uses by default. In fact, Assembly becomes ten times easier once you know what you're doing. It's time to write a proper guide.

Hello world: not that simple

Let's start with a proper simple hello world program. I will explain in detail what every line is doing.

.data

hello_world_str:
	.string "Hello, World!"


.text

.global main
main:

	# prologue

	pushq %rbp      # save rbp
	movq %rsp, %rbp # create a new stackframe


	# output "Hello, World!" and a newline
	movq $hello_world_str, %rdi
	call puts

	# epilogue

	movq %rbp, %rsp # end the stackframe
	popq %rbp       # restore the previous stackframe
	movq $0, %rax   # exit successfully
	ret             # returning from the main function exits the program

Let's save the file as app.s and compile it with gcc app.s -o app -no-pie -g. This command calls the GNU Compiler Collection, which assembles app.s and links it against the C standard library. With the output flag -o we specify that the output executable should be called app. The -no-pie flag tells gcc not to build a Position Independent Executable. PIE is a security feature that lets the operating system load the program at a random address; disabling it makes our lives easier, because it allows us to use absolute addresses such as $hello_world_str. Finally, the -g flag adds debugging symbols to our executable, which will be very helpful when we have to debug our program.
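To make the role of -no-pie a bit more concrete, here is a rough sketch of how the same program could look if PIE were left enabled. This is not the version used in the rest of this guide. The file name app_pie.s is made up for illustration, and the sketch assumes an x86-64 Linux system with gcc and glibc. Instead of an absolute address, it computes the address of the string relative to the instruction pointer, and it calls puts through the procedure linkage table.

```asm
# app_pie.s (hypothetical file name): position-independent hello world
.data
hello_world_str:
	.string "Hello, World!"

.text
.global main
main:
	# prologue
	pushq %rbp
	movq %rsp, %rbp

	# load the address of the string relative to %rip
	# instead of using the absolute address $hello_world_str
	leaq hello_world_str(%rip), %rdi
	call puts@PLT   # call puts through the procedure linkage table

	# epilogue
	movq %rbp, %rsp
	popq %rbp
	movq $0, %rax   # exit successfully
	ret
```

Under those assumptions, this variant should build with gcc app_pie.s -o app_pie -g, without the -no-pie flag. The extra notation is exactly the kind of noise that -no-pie lets us avoid while learning.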

Let's go through the source code line by line.

1: We start by defining the data section. In this section we put the string we want to print. The data section can also be used to save any data our program might need, or to store global variables.

3: This is a label which we can use to address the string containing our hello world message.

4: The .string directive denotes that we want to store a string of characters. The assembler will append a null byte to the string, which is used to identify the end of the string.

7: Next we define the text section. This section will hold the actual code of our program.

9: We declare main as a global symbol. Without this, the linker cannot find the main function and our program wouldn't link.

10: Here we put the label for the main function. This function is special, because the startup code of the C runtime calls main once the program starts, which makes it the entry point for our code.

12: Most functions we will write require a prologue. In the prologue we will create a new stack frame. A stack frame is a place for a function to create local variables.

14: We have to ensure the stack frame of the caller is left untouched, so we have to save the stack frame pointer (also known as the base pointer, %rbp).

15: This line of code will move the base pointer to the top of the stack (defined by the stack pointer, %rsp), where we place the new stack frame.

19: In order to print the hello world string, we move its address into the %rdi register, which holds the value of the first argument for a function. The dollar sign in front of the hello world string label indicates that we take the address of the label, instead of the first bytes stored at the label.

20: Now we call the puts function. It is a function from the C standard library. As the name implies, it will put a string onto our screen (more precisely, onto standard output). puts expects the address of a string as its first parameter, and it will print all characters one by one until it finds a null byte, indicating the end of the string. Finally it will end the line by printing the newline character.

22: We conclude the function with an epilogue, which does the opposite of the prologue. It will clean up the current stackframe, so we can safely pass control back to the caller.

24: This brings the stack pointer back to where it was right after the prologue saved %rbp, discarding anything the function has placed on the stack since then.

25: Next we restore the base pointer to where it was before the function was called.

26: We move the value 0 into the return register (%rax). The return value of the main function denotes the exit code of the program. Zero means success, and anything else implies that an error occurred.

27: Lastly, we return from the main function. This will exit the program.
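The hello world program never actually stores anything in its stack frame, so here is a small sketch of what a local variable could look like. It is only an illustration: the label fmt, the value 42 and the use of printf (another C standard library function) are not part of the program above. It is meant to be built the same way, with -no-pie.

```asm
.data
fmt:
	.string "local value: %ld\n"

.text
.global main
main:
	# prologue
	pushq %rbp
	movq %rsp, %rbp
	subq $16, %rsp       # reserve 16 bytes for locals (keeps the stack 16-byte aligned)

	movq $42, -8(%rbp)   # store 42 in a local variable, 8 bytes below the base pointer

	movq $fmt, %rdi      # first argument: address of the format string
	movq -8(%rbp), %rsi  # second argument: the value of the local variable
	movq $0, %rax        # printf is variadic: no vector registers are used
	call printf

	# epilogue
	movq %rbp, %rsp      # this also releases the 16 bytes we reserved
	popq %rbp
	movq $0, %rax        # exit successfully
	ret
```

Running the resulting executable should print "local value: 42". Notice that the epilogue is identical to the one in hello world: moving %rbp back into %rsp throws away the locals, no matter how much space was reserved.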

Hopefully you have now learnt what a basic Assembly file looks like. It's time to dive deeper into some Assembly fundamentals.
