
Important Lessons to Remember from Assembly Language programming

Okay, so I have been writing assembly language programs for a little while now and, to be honest with everyone, assembly can be very frustrating, especially when the code you write seems to make logical sense.  Yet when you run the program, it either reports a segmentation fault, corrupts memory, or prints a bunch of random text to the screen.

Recently I have been trying to write a simple application in assembly.  The goal was to write a program with NASM that displays the number of arguments passed to it.

From my previous posts, we know that once main has set up its stack frame, ebp+4 holds the return address, that is, the address where execution resumes after main finishes.  We also know that ebp+8 is the first parameter passed to main, ebp+12 is the second parameter, and so on, adding 4 for each further parameter because every parameter occupies 4 bytes in 32-bit code.  These are the parameters that the C programming language declares for main, whose header has the following syntax:

int main(int numArg, char* actualArgContents[])
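
To make those offsets concrete, here is a minimal sketch that reads the second parameter at ebp+12 and prints the first string it points to (the program name). It assumes 32-bit Linux with the cdecl convention; the file name args.asm and the build line in the comment are my assumptions and are not from the original post.

```nasm
; Sketch: read main's second parameter off the stack (32-bit cdecl assumed)
; Assumed build: nasm -f elf32 args.asm && gcc -m32 -no-pie args.o -o args
SECTION .data
msg: db "Arg = %s", 10, 0

SECTION .text
extern printf
global main

main:
     push ebp
     mov ebp, esp                ; [ebp] = saved ebp, [ebp+4] = return address

     mov eax, DWORD [ebp+12]     ; eax = second parameter, the address of the pointer array
     mov eax, DWORD [eax]        ; eax = first entry of the array = address of the program name

     push eax                    ; argument for %s
     push msg                    ; address of the format string
     call printf
     add esp, 8                  ; remove the two 4-byte arguments

     xor eax, eax                ; return 0 from main
     mov esp, ebp
     pop ebp
     ret
```

Note the two loads: ebp+12 holds a pointer to an array of pointers, so it takes one extra dereference to reach an actual string.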

With this in mind, I set out to write a program in assembly that would simply display on the screen the number of arguments that were passed to our program.  In my first run at it, I produced the following lines of code:


;an equivalent program to this in assembly
SECTION .data

msg: db "Arg = %s ",10,0
msg2: db "Arg Count = %d", 10,0

SECTION .text
;allow access to printf
extern printf
;make our main available externally
global main


main: ;int main(int numArguments, char* arg[])
     push ebp
     mov ebp, esp
     sub esp, 4                  ;reserve 4 bytes for a local variable (unused here)

     mov eax, DWORD [ebp+8]      ;eax = numArguments (the value itself, not a pointer)
     mov ebx, DWORD [ebp+12]     ;ebx = arg, a pointer to the argument strings
                                 ;note: ebx is callee-saved under cdecl and is unused below
     mov ecx, 0

     ;attempt to display the number of arguments
     push eax                    ;push 4 bytes, contains the number of args
     push msg2                   ;memory addresses are 4 bytes in size
     call printf
     add esp, 8                  ;clean up the stack; 8 = 2 items of 4 bytes each

     mov esp, ebp
     pop ebp
     ret

Now, just looking at the above code, everything appears to be in order.  I moved the value stored on the stack into the accumulator register, eax (which of course resides in the CPU, for increased performance).  Then I pushed that value, followed by the address of the format string that will display the number, and finally I called printf from the C standard library.

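The above code should work.  Actually, it does work.  However, the problem I encountered was that after the call to printf, the value stored in the eax register was not the same as the value before the call.  This means that printf changes the contents of registers.  In fact, under the cdecl calling convention used by 32-bit C code, any called function is free to overwrite eax, ecx, and edx, and eax carries the function's return value (for printf, the number of characters printed).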
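
If you want to see the change for yourself, a minimal sketch like the following prints eax again right after the first call. It assumes 32-bit Linux with cdecl, and the label msg3 is a name I made up for illustration; it is not part of the original program.

```nasm
; Sketch: show that eax changes across a call to printf (32-bit cdecl assumed)
SECTION .data
msg2: db "Arg Count = %d", 10, 0
msg3: db "eax after printf = %d", 10, 0   ; hypothetical label, not in the original

SECTION .text
extern printf
global main

main:
     push ebp
     mov ebp, esp

     mov eax, DWORD [ebp+8]      ; eax = numArguments
     push eax
     push msg2
     call printf                 ; printf returns its character count in eax
     add esp, 8

     push eax                    ; whatever printf left in eax
     push msg3
     call printf
     add esp, 8

     xor eax, eax                ; return 0 from main
     mov esp, ebp
     pop ebp
     ret
```

The second line of output should show the length of the first line that was printed rather than the argument count, which is exactly the surprise described above.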

So the takeaway from this is that, if you expect a register to hold the same value after a function executes, then it behooves you to save that register's value on the stack before the call and restore it after the function call.  The above code becomes safe by changing it to:
;an equivalent program to this in assembly
SECTION .data

msg: db "Arg = %s ",10,0
msg2: db "Arg Count = %d", 10,0

SECTION .text
;allow access to printf
extern printf
;make our main available externally
global main


main: ;int main(int numArguments, char* arg[])
     push ebp
     mov ebp, esp
     sub esp, 4                  ;reserve 4 bytes for a local variable (unused here)

     mov eax, DWORD [ebp+8]      ;eax = numArguments (the value itself, not a pointer)
     mov ebx, DWORD [ebp+12]     ;ebx = arg, a pointer to the argument strings
     mov ecx, 0

     ;new
     pushad                      ;save all the general-purpose register values on the stack
                                 ;alternatively you could just do 'push eax'

     ;attempt to display the number of arguments
     push eax                    ;push 4 bytes, contains the number of args
     push msg2                   ;memory addresses are 4 bytes in size
     call printf
     add esp, 8                  ;clean up the stack; 8 = 2 items of 4 bytes each

     ;new
     popad                       ;restore all the register values from the stack
                                 ;alternatively you could just do 'pop eax'

     mov esp, ebp
     pop ebp
     ret
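
Another option, sketched here as an alternative rather than as the way I originally solved it, is to park the value in a local variable on the stack and reload it after the call. The sketch assumes 32-bit Linux with cdecl. It reserves 16 bytes instead of 4, which is my own choice so that the stack stays 16-byte aligned at each call on current Linux systems; only the slot at ebp-4 is actually used.

```nasm
; Sketch: keep the argument count in a local variable across printf (32-bit cdecl assumed)
SECTION .data
msg2: db "Arg Count = %d", 10, 0

SECTION .text
extern printf
global main

main:
     push ebp
     mov ebp, esp
     sub esp, 16                 ; space for locals; only [ebp-4] is used (size is an assumption)

     mov eax, DWORD [ebp+8]      ; eax = numArguments
     mov DWORD [ebp-4], eax      ; save a copy in the local variable

     push eax
     push msg2
     call printf                 ; may overwrite eax, ecx, and edx
     add esp, 8

     mov eax, DWORD [ebp-4]      ; reload the saved count
     push eax
     push msg2
     call printf                 ; prints the same count a second time
     add esp, 8

     xor eax, eax                ; return 0 from main
     mov esp, ebp
     pop ebp
     ret
```

Compared with pushad and popad, this saves only the one value that is needed, and the local slot is released automatically when the stack frame is destroyed.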

Comments

  1. Interesting post... Wish I saw it a few days ago when I was stuck on the same issue! At least now this clears it up for me! Thanks!
