Monday, March 14, 2011

How to be Hardcore: Part 3

Good news everyone! Today we're going to actually start programming! I'll show you how to make a really easy "Hello World" program, in assembly! From this, you should get a good handle on the basic syntax of assembly, which will make the rest of assembly seem like a piece of cake.

Thanks GNU for as (assembler) and ld (linker)!




Syntax


GNU's 'as' uses AT&T syntax, as opposed to Intel syntax. The main differences between the two are the order of the operands (AT&T puts the source first and the destination second, Intel does the reverse) and the prefixes AT&T requires on registers and constants. The general format for instructions is this:

instruction   source, destination


Yup, that's it. Not too bad, right? Well, I'm not going to lie: it can get a lot more complicated than that. But don't worry! We'll work our way up to that.


When using AT&T syntax, you denote the type of values by applying a prefix. If you are referring to a register, you apply a "%" prefix. For example, the eax register would be referred to as %eax. On the other hand, literal values get a "$" prefix. So, if you wanted to write the value "90", you would write $90. 
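To make the prefixes and the operand order concrete, here is a minimal sketch of a complete 32-bit program that does nothing but shuffle a value between registers and quit. The file name syntax_demo.s is only an illustrative choice, and the last two instructions use the exit system call that the Hello World walkthrough below explains.

```asm
# syntax_demo.s (hypothetical file name): exercises AT&T operand syntax only.
        .text
        .global _start
_start:
        movl $90, %eax      # literal 90 (source) -> register eax (destination)
        movl %eax, %ebx     # register eax (source) -> register ebx (destination)
        # Intel syntax would write the same two moves as:
        #   mov eax, 90
        #   mov ebx, eax
        movl $1, %eax       # system call number 1 (exit); ebx holds the exit status
        int $0x80           # hand control to the kernel
```

If you assemble and link it the same way as the Hello World program below, running it and then typing echo $? in the shell should print 90, the value that ended up in %ebx.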


Hello World


Ok, now that you know the basic syntax, let's get started on a hello world program. I'm going to be writing this for a 32 bit x86 Linux system, since that's what I'm currently running. If you aren't running on this, check out my previous article for how to boot into Linux from a USB flash drive. It's really easy and isn't permanent!


So the first step to writing our program is to open up a blank text document. Use whatever your favorite editor is. I use vim, but it's just a preference thing and I don't plan on hosting an EMACS vs Vi flame war.

The first part of the program is:


.text
        .global _start


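By placing a "." before the word, we are telling 'as' that it is an assembler directive rather than an instruction. The .text directive switches to the text section, which is where code goes. The .global directive makes a symbol visible to the linker, in this case "_start", which the linker uses as the place where the program begins running. Next, we add the data section with our Hello World message: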


.data

hello:
        .ascii "Hello World!\n"
        length = 13
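Counting bytes by hand is easy to get wrong once the message changes. As a small sketch of an alternative (not what the rest of this article uses), 'as' can compute the length itself: the symbol "." stands for the current address, so subtracting the address of the label gives the number of bytes emitted since it.

```asm
        .data
hello:
        .ascii "Hello World!\n"
        length = . - hello      # current address minus start of string = 13 bytes
```

With this version, $length still works as a constant in the instructions that follow, and it stays correct if you edit the text of the message.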


The data section contains our message, an ASCII string. We set length equal to the size of the string in bytes (12 characters plus the newline), since we'll need it in a later step. Next up, the main _start routine:



.text           # switch back to the text section so the code below ends up in executable memory

_start:


movl $length, %edx
movl $hello, %ecx
movl $1, %ebx
movl $4, %eax
int $0x80



Might look scary, but let's break it down. _start: is a label, saying that whatever follows it is part of the _start routine. The first instruction (movl, a 32 bit move) copies the constant value of $length (in this case 13) into the %edx register. We then move $hello, the address of our string, into the %ecx register. Next, we move the constant $1 into the %ebx register, and finally $4 into the %eax register. Lastly, int $0x80 raises interrupt 0x80, which is how a 32 bit Linux program makes a system call.


The values we placed in the registers are there for a reason. When using 0x80 on Linux (DOS used 21h for a similar purpose), we are making a call to the kernel itself. The kernel first looks in the %eax register to see what type of call it needs. When it sees %eax holds a value of $4, it knows that it needs to perform a sys_write. After this, it looks to the other registers for more data. It looks at %ebx for the file descriptor (in this case, $1 means standard out), which is where to write. It looks at %ecx for the address of the message, and at %edx for the length of the message. The next part we need to add is how to exit the program:



movl $0, %ebx
movl $1, %eax
int $0x80

First, we are changing the value of %ebx to $0, and the value of %eax to $1. As a result, when we call the kernel with another int $0x80, we make a different system call. Since %eax changed to $1, it now executes system call #1, which is... sys_exit. Sys_exit makes the kernel read %ebx for the return value. In our case, we want it to return a value of 0 since the program completed successfully. However, feel free to play around with different return values.
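As one way of playing around, here is a hedged sketch of a variation that writes to standard error (file descriptor 2) instead of standard out and exits with status 3. The message text, the label names and the file name are made up for illustration. It uses the same two system calls as above.

```asm
# exit_status.s (hypothetical file name): report an error and exit with status 3.
        .data
msg:
        .ascii "Something failed\n"
        msg_len = . - msg     # let the assembler count the bytes

        .text
        .global _start
_start:
        movl $msg_len, %edx   # number of bytes to write
        movl $msg, %ecx       # address of the message
        movl $2, %ebx         # file descriptor 2 = standard error
        movl $4, %eax         # system call 4 = sys_write
        int $0x80

        movl $3, %ebx         # exit status 3 instead of 0
        movl $1, %eax         # system call 1 = sys_exit
        int $0x80
```

After running the resulting executable, echo $? in the shell shows the exit status, which here should be 3.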

That's it for the program itself! The resulting program in full should be this:

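.text
        .global _start

.data

hello:
        .ascii "Hello World!\n"
        length = 13

.text           # back to the text section so _start lands in executable memory

_start:

movl $length, %edx      # number of bytes to write
movl $hello, %ecx       # address of the message
movl $1, %ebx           # file descriptor 1 = standard out
movl $4, %eax           # system call 4 = sys_write
int $0x80

movl $0, %ebx           # return value 0 = success
movl $1, %eax           # system call 1 = sys_exit
int $0x80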

Now we need to turn this into a working program. Save the file as "helloworld.s" and exit your text editor. Pop open a shell and get into the directory with your program. First, we use 'as' to turn the assembly code into object code (if you are on a 64 bit system, add the --32 flag here and pass -m elf_i386 to ld in the next step):

$ as -o helloworld.o helloworld.s

Next, we finish it up with a linker:

$ ld -s -o helloworld helloworld.o

Now we have a nice "helloworld" executable in our directory. Execute it:

$ ./helloworld 
Hello World!
$

Congratulations! You just programmed and ran your first program written completely in assembly! Yes, you are now a hardcore programmer. But don't worry, this is just the beginning. I'll make a few more assembly tutorials in the future, building off this. In the meantime, try experimenting. Here is a list of Linux system calls and the register values for each one. 


I hope I made this tutorial as clear as possible. If you have any questions, feel free to leave a comment! I usually get back within 12 hours. Good luck, new hardcore programmers :)


15 comments:

  1. ctrl-c
    ctrl-v
    Hooray, I a pooter pogrammor!

    It's totally over my head, but thanks all the same.

  2. assembly is too hardcore for me, but i will be following along

  3. this is a great start, i'll follow and look forward to similar stuff. hooray for learnin stuffs

  4. damn I had no idea assembly was this easy lol. I expected it to be harder. Following because I love Linux, programming and being HARDCORE. =D

    An overwhelming dose of awesome can be found in my 4th electro set! Check it
    Electric Addict Set #4

  5. wow this is a lot of lines for such a small program. iam glad iam a writer/journalist rather than some IT guy. this seems like hard work, but ill def try to keep track on you!

  6. Due to the "Good news everyone!" the beginning, I read this entire thing in prof hubert farnsworth voice =_=

  7. sfxworks: Thats EXACTLY what I was going for! I was going to add a picture of Farnsworth for theatrical effect, but decided a picture of the GNU wildebeest was more on topic :)

  8. This is great! Thanks for the guide!

  9. I read that in Professor Hubert J. Farnsworth's voice.

  10. this is actually really helpful. awesome. thanks.

    following. check my blog out and see what ya think

