Friday, April 27, 2012

Writing a Scripting Language

One of the projects I am working on is writing a scripting language in C++.  At first I imagined the task would be hard, but to my surprise it is quite easy.

In this post I will introduce the basics.  For more detail, I suggest you visit my YouTube channel and watch the video series.

token evaluation process
As you can see from the image, my script engine takes a string expression, converts it to tokens, determines their meaning, and then compiles the corresponding byte code.

Byte code is the instruction set for my script language.  It tells our virtual machine or virtual process what each instruction is supposed to do.






In more detail, the script engine does a recursive call with the tokens to successfully generate all the corresponding byte code.
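To make the idea concrete, here is a minimal sketch in C of the same pipeline: read an expression, compile it to byte code through recursive calls, and run the byte code on a tiny stack machine.  This is not the engine from my source code; the opcode names (OP_PUSH, OP_ADD, OP_MUL, OP_HALT), the Compiler struct, and eval_script are all made up for illustration, and the sketch assumes well-formed input with only non-negative integers, +, *, and parentheses.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical instruction set for a tiny stack machine. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define CODE_MAX 64

typedef struct {
    const char *src;     /* remaining expression text */
    int code[CODE_MAX];  /* emitted byte code */
    size_t len;          /* number of ints emitted so far */
} Compiler;

static void compile_expr(Compiler *c);

static void emit(Compiler *c, int value) {
    assert(c->len < CODE_MAX);
    c->code[c->len++] = value;
}

static void skip_spaces(Compiler *c) {
    while (isspace((unsigned char)*c->src)) c->src++;
}

/* primary := number | '(' expr ')' */
static void compile_primary(Compiler *c) {
    int n = 0;
    skip_spaces(c);
    if (*c->src == '(') {
        c->src++;                       /* consume '(' */
        compile_expr(c);                /* recursive call for the nested expression */
        skip_spaces(c);
        if (*c->src == ')') c->src++;   /* consume ')' */
        return;
    }
    while (isdigit((unsigned char)*c->src)) n = n * 10 + (*c->src++ - '0');
    emit(c, OP_PUSH);
    emit(c, n);
}

/* term := primary ('*' primary)* */
static void compile_term(Compiler *c) {
    compile_primary(c);
    for (skip_spaces(c); *c->src == '*'; skip_spaces(c)) {
        c->src++;
        compile_primary(c);
        emit(c, OP_MUL);
    }
}

/* expr := term ('+' term)* */
static void compile_expr(Compiler *c) {
    compile_term(c);
    for (skip_spaces(c); *c->src == '+'; skip_spaces(c)) {
        c->src++;
        compile_term(c);
        emit(c, OP_ADD);
    }
}

/* Executes the byte code on a small operand stack and returns the result. */
static int run(const int *code) {
    int stack[CODE_MAX];
    size_t sp = 0;
    size_t pc = 0;
    while (code[pc] != OP_HALT) {
        switch (code[pc++]) {
        case OP_PUSH: stack[sp++] = code[pc++]; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        }
    }
    return stack[sp - 1];
}

/* Compiles an expression to byte code, then runs it. */
int eval_script(const char *expression) {
    Compiler c = { expression, {0}, 0 };
    compile_expr(&c);
    emit(&c, OP_HALT);
    return run(c.code);
}
```

Calling eval_script("2 + 3 * 4") emits PUSH 2, PUSH 3, PUSH 4, MUL, ADD, HALT, so the multiplication runs first simply because compile_term sits one call level below compile_expr.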


Source Code

This link contains the source code and is version 1 of script.

Thursday, April 12, 2012

Addition & Subtraction in Assembly

There is not much difficulty when it comes to addition and subtraction in assembly programming.

Simply put, addition and subtraction break down to the following:


    add eax, ecx                    ; eax = eax + ecx, result in eax
    add eax, DWORD [ebp-4]             ; eax = eax + localVar1, result in eax
    add DWORD [ebp-4], DWORD [ebp-4]    ; illegal: add cannot take two memory operands (true of almost all x86 instructions)
    add DWORD [ebp-4], eax            ; [ebp-4] = [ebp-4] + eax

    sub eax, ecx                    ; eax = eax - ecx, result in eax
    sub eax, DWORD [ebp-4]             ; eax = eax - localVar1, result in eax
    sub DWORD [ebp-4], DWORD [ebp-4]    ; illegal: sub cannot take two memory operands (true of almost all x86 instructions)
    sub DWORD [ebp-4], eax            ; [ebp-4] = [ebp-4] - eax
A simple program that displays a message about an arithmetic operation, like "Math: 8 + 4 = ?", can be written with the following code block:
;an equivalent program to this in assembly
SECTION .data

operChar: db '+',0
msg: db 'Math: %d %c %d = %d',10,0

SECTION .text
;allow access to printf
extern printf
;make our main available externally
global main

main:    ;int main(int numArguments, char* arg[])
     push ebp
     mov ebp , esp
    sub esp, 4    ;reserve space for one 32-bit variable (4 bytes * 8 bits = 32 bits)

    ;set up the registers that will hold the values we want to operate on
    mov eax , 8
    mov edx , 4
    
    push eax    ;save value of eax; so msg can be displayed correctly
    add eax, edx ;translates to eax = eax + edx
    mov ecx, eax    ;mov result into ecx
    pop eax     ;restore value of eax 

    ;recall that the printf call looks like
    ;printf(msg,eax,operChar,edx,result), with arguments pushed right to left
    push ecx    ;temporary - we will get the value using assembly, for now just bear with me
    push edx
    push DWORD [operChar]
    push eax
    push msg
    call printf
    add esp, 20     ;this cleans up the stack; we pushed 5 items of 4 bytes each = 5*4


     mov esp, ebp
     pop ebp
     ret
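For comparison, here is roughly what the same program looks like in C.  Treat it as a sketch: format_math is a made-up helper name, and it writes into a buffer with snprintf instead of printing directly so the result is easy to inspect.  It only handles '+' and '-'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats "Math: a op b = result" into out, mirroring the printf call in the
   assembly listing. Returns the number of characters written (excluding the
   terminating NUL), as reported by snprintf. */
int format_math(char *out, size_t size, int a, char op, int b) {
    int result;
    assert(out != NULL && size > 0);
    result = (op == '-') ? a - b : a + b;   /* add eax, edx  or  sub eax, edx */
    return snprintf(out, size, "Math: %d %c %d = %d\n", a, op, b, result);
}
```

With a = 8, op = '+' and b = 4, the buffer ends up holding "Math: 8 + 4 = 12" followed by a newline, which is the line the assembly version prints.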

Wednesday, April 11, 2012

Please Support My Android Game

Hey everyone, I wrote a game for Android called Space Cosmos Defender.

Please check it out here or just search for "Space cosmos adventure" on Google Play.

Thank you all in advance.

Here are some screenshots of my game.
Main menu
In game action

More Action

Important Lessons to Remember from Assembly Language programming

Okay, so I have been writing assembly language programs for a little while now and, to be honest, assembly can be very frustrating, especially when the code you write seems to make logical sense.  Then you run the program, and it either reports a segmentation fault, corrupts memory, or prints a bunch of random text to the screen.

Recently I have been trying to write an application that does something simple using assembly.  The goal was to write a program with NASM and have it display the number of arguments passed to our program.

From my previous posts, we know that ebp+4 contains the return address used once main has finished executing.  We also know that ebp+8 is the first parameter passed to main, ebp+12 is the second parameter, and so on, with 4 added each time.  This is because, in the C programming language, main is declared with the following signature:

int main(int numArg, char* actualArgContents[])

With this in mind, I set out to write a program in assembly that would simply display the number of arguments passed to our program.  On my first attempt, I produced the following lines of code:


;an equivalent program to this in assembly
SECTION .data

msg: db "Arg = %s ",10,0
msg2: db "Arg Count = %d", 10,0

SECTION .text
;allow access to printf
extern printf
;make our main available externally
global main


main: ;int main(int numArguments, char* arg[])
     push ebp
     mov ebp , esp
     sub esp, 4


     mov eax, DWORD [ebp+8]     ;eax = numArguments (argc)
     mov ebx, DWORD [ebp+12]    ;ebx = arg, the address of the argument array (argv)
     mov ecx, 0

     ;attempt to display the number of arguments
     push eax           ;push 4 bytes, contains the number of args
     push msg2          ;memory addresses are 4 bytes in size
     call printf
     add esp, 8         ;cleans up the stack; 8 = 2 items of 4 bytes each


     mov esp, ebp
     pop ebp
     ret

Now, just looking at the above code, everything appears to be in order.  I moved the value stored on the stack into the accumulator register, eax (which of course resides in the CPU, for increased performance).  Then I pushed the address of the format string that will display the number, and finally, I called printf from the C standard library.

The above code should work.  Actually, it does work.  However, the problem I encountered was that after the call to printf, the value stored in the eax register was not the same as the value before the call.  This means that printf changes the contents of registers.  (Under the 32-bit C calling convention, a called function is free to overwrite eax, ecx, and edx.)

So the takeaway is this: if you expect a register to hold the same value after a function executes, then it behooves you to save that register's value on the stack and restore it after the function call.  The above code becomes safe by changing it to:
;an equivalent program to this in assembly
SECTION .data

msg: db "Arg = %s ",10,0
msg2: db "Arg Count = %d", 10,0

SECTION .text
;allow access to printf
extern printf
;make our main available externally
global main


main: ;int main(int numArguments, char* arg[])
     push ebp
     mov ebp , esp
     sub esp, 4


     mov eax, DWORD [ebp+8]     ;eax = numArguments (argc)
     mov ebx, DWORD [ebp+12]    ;ebx = arg, the address of the argument array (argv)
     mov ecx, 0

     ;new
     pushad             ;save all the general-purpose register values on the stack
                        ;alternatively you could just do 'push eax'

     ;attempt to display the number of arguments
     push eax           ;push 4 bytes, contains the number of args
     push msg2          ;memory addresses are 4 bytes in size
     call printf
     add esp, 8         ;cleans up the stack; 8 = 2 items of 4 bytes each

     ;new
     popad              ;restore all the register values from the stack
                        ;alternatively you could just do 'pop eax'

     mov esp, ebp
     pop ebp
     ret
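One reason eax comes back different is easy to see from the C side: under the 32-bit C calling convention a function hands its integer return value back in eax, and printf returns the number of characters it wrote.  The small sketch below makes that return value visible; print_arg_count is a made-up helper name, not part of the program above.

```c
#include <assert.h>
#include <stdio.h>

/* Prints the argument count and returns whatever printf returned: the number
   of characters written, or a negative value if an output error occurred. */
int print_arg_count(int argc) {
    int written;
    assert(argc >= 0);   /* an argument count is never negative */
    written = printf("Arg Count = %d\n", argc);
    return written;
}
```

For a single-digit count the line is 14 characters long including the newline, so that is the number left behind in eax after the call, not the argument count we put there.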

Friday, April 6, 2012

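I wrote, with the help of The Art of Java by Herbert Schildt and James Holmes, a custom parser that evaluates a numerical expression like "10+32/2".  The code follows.  (ParserException, which the parser throws on errors, is a custom exception class in the same package and is not shown here.)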


 
package com.soliduscode.eleanya;

import java.util.logging.Handler;

/**
 * 
 * 
 * @author ukaku
 *
 */
public class Parser {

 
 final int NONE =0;
 final int DELIMITER =1;
 final int VARIABLE = 2;
 final int NUMBER = 3;
 
 final int SYNTAX = 0;
 final int UNBALPARENS=1;
 final int NOEXP = 2;
 final int DIVBYZERO=3;
 
 final String EOE = "\0";
 
 /**the expression*/
 private String exp;
 /** expression index */
 private int expIndex;
 /**Current token*/
 private String token;
 /**The token type*/
 private int tokenType;
 
 /**
  * Return the next token in the expression
  */
 private void getToken(){
  //clear values initially
  tokenType = NONE;
  token = "";
  
  //check for end of expression
  if(expIndex == exp.length()){
   token = EOE;
   return;
  }
  
  //skip over white spaces
  while(expIndex < exp.length() && Character.isWhitespace(exp.charAt(expIndex))) ++expIndex;
  
  //trailing white spaces ends expression
  if(expIndex == exp.length()){
   token = EOE;
   return;
  
  }
  
  if(isDelimiter(exp.charAt(expIndex))){
   token += exp.charAt(expIndex);
   expIndex++;
   tokenType = DELIMITER;
  }else if(Character.isLetter(exp.charAt(expIndex))){
   
   while(expIndex < exp.length() && !isDelimiter(exp.charAt(expIndex))){
    token+=exp.charAt(expIndex);
    expIndex++;
    if(expIndex >= exp.length()) break;
   }
   tokenType = VARIABLE;
  }else if(Character.isDigit(exp.charAt(expIndex))){
   while(expIndex < exp.length() &&  !isDelimiter(exp.charAt(expIndex))){
    token += exp.charAt(expIndex);
    expIndex++;
    if(expIndex >= exp.length()) break;
   }
   tokenType = NUMBER;
  }else{
   token = EOE;
   return;
  }
 }
 
 /**
  * Returns the value of the expression
  * 
  * 
  * @param expstr  the expression to evaluate
  * @return    the value of the expression
  * @throws ParserException
  */
 public double evaluate(String expstr) throws ParserException {
  double result;
  exp = expstr;//+"  ";
  expIndex= 0;
  
  getToken();
  if(token.equals(EOE))
   handleError(NOEXP); //no expression present
   
    
  result = evaluateAdditionSubstraction();
  if(!token.equals(EOE))
   handleError(SYNTAX);
   
  return result;
  
 }
 private double evaluateAdditionSubstraction() throws ParserException{
  char op;
  double result;
  double partialResult;
  
  result = evaluateMultDivMod();
  while((op = token.charAt(0)) == '+' || op == '-'){
   getToken();
   partialResult = evaluateMultDivMod();
   switch(op){
   case '-':
    result = result - partialResult;
    break;
   case '+':
    result = result + partialResult;
    break;
   }
  }
  return result; 
 }
 
 private double evaluateMultDivMod() throws ParserException {
  char op;
  double result;
  double partialResult;
  
  result = evaluteExponent();
  
  while((op = token.charAt(0)) == '*' ||op == '/' || op == '%'){
   getToken();
   partialResult = evaluteExponent();
   switch(op){
   case '*':
    result = result * partialResult;
    break;
   case '/':
    if(partialResult == 0.0){
     handleError(DIVBYZERO);
    }
    result = result / partialResult;
    
    break;
   case '%':
    if(partialResult == 0.0)
     handleError(DIVBYZERO);
    result = result % partialResult;
    break;
   }
  }
  return result;
  
 }
 //process exponent
 private double evaluteExponent() throws ParserException{
  double result;
  double partialResult;
  double ex;
  int t;
  
  result = evaluateUnary();
  if(token.equals("^")){
   getToken();
   partialResult = evaluteExponent();
   ex = result;
   if(partialResult == 0.0){
    result = 1.0;
   }else{
    for(t=(int)partialResult-1; t>0; t--){
     result = result * ex;
    }
   }
  }
  return result;
 }
 //evaluate a unary + or -
 private double evaluateUnary() throws ParserException{
  double result;
  String op;
  op="";
  if((tokenType == DELIMITER) && (token.equals("+") || token.equals("-"))){
   op = token;
   getToken(); //advance to the operand that follows the unary operator
  }
  result = evaluateParenthesis();
  if(op.equals("-")) result = -result;
  
  return result;
 }
 
 //process parenthesized expression
 private double evaluateParenthesis() throws ParserException{
  double result;
  if(token.equals("(")){
   getToken();
   result = evaluateAdditionSubstraction();
   if(!token.equals(")"))
    handleError(UNBALPARENS);
   getToken();
  }else
   result = atom();
  
  return result;
  
 }
 //get the value of a number
 private double atom() throws ParserException {
  double result = 0.0;
  switch(tokenType){
  case NUMBER:
   try{
    result = Double.parseDouble(token);
   }catch(NumberFormatException exc){
    handleError(SYNTAX);
   }
   getToken();
   break;
  default:
   handleError(SYNTAX);
   break;
  }
  return result;
 }
 
  private void handleError(int error) throws ParserException {
   String[] e = { "Syntax Error",
     "Unbalanced parentheses",
     "No Expression Present",
     "Division by zero"
   };
   throw new ParserException(e[error]);
  }
 /**
  * Determine if character c is a delimiter symbol
  * @param c  the character to test
  * @return   true if c is a delimiter, false otherwise
  */
 private boolean isDelimiter(char c){
  if((" +-/*%^=()".indexOf(c) != -1)){
   return true;
  }
  return false;
 }


 public static void main(String arg[]){
  Parser parser = new Parser();
  
  try {
   System.out.println("Hello world " + parser.evaluate("10*10^2"));
  } catch (ParserException e) {
   e.printStackTrace();
  }
  
 }
}



Thursday, April 5, 2012

Creating Local Variables in Assembly

Let's go over how to create local variables in pure assembly source code.

Much like always, you will start with a *.asm file that looks like this:

source code

SECTION .data

SECTION .bss

SECTION .text
global main                    ;make main available to operating system(os)
main:
     ;create the stack frame
     push ebp
     mov ebp, esp


     ;destroy the stack frame
     mov esp, ebp
     pop ebp
     ret 

So, the above is the general layout of a NASM source file.  Our goal here is to create a local variable inside of the main function.  The only way to create a local variable is by using the stack.  Why?  Because we can only declare variables in storage locations, and the only available sections are text, bss, and data.  The text section is only for code, so it is out of the question.  The bss and data sections are appealing, but declaring our "local" variables there would defeat the purpose: they would be global, not local.
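The difference between the two kinds of storage is easy to see from C.  In this sketch (the names global_counter, bump_global, and bump_local are made up for illustration), one variable lives outside any function, where a compiler would typically place it in the data or bss section, and the other lives in the function's stack frame.

```c
/* Lives outside any function: one copy, shared by every call. */
static int global_counter = 0;

/* Each call sees the value the previous call left behind. */
int bump_global(void) {
    return ++global_counter;
}

/* local_counter is created in the stack frame on every call, so it always
   starts from 0 again and nothing carries over between calls. */
int bump_local(void) {
    int local_counter = 0;
    return ++local_counter;
}
```

Calling bump_global twice gives 1 and then 2, while bump_local gives 1 every time, because its variable disappears along with the stack frame.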

So really, the only way to declare local variables is by utilizing the stack.  That's no problem; in fact, the stack is our friend.  Refer to the image, and you will find that the stack (stack frame) is nothing more than a dynamic memory abstraction, designed to make our lives easier.  It has two components: esp (the stack pointer) and ebp (the base pointer).  esp is always changing, while ebp is "created" and destroyed by moving the current value of esp into it and back out of it.

From our code template above, the creation and destruction of main's stack frame works as follows.  We create the frame by saving the caller's ebp and then assigning ebp a new value, the current esp.  We destroy the frame by restoring esp to the value it had when the frame was created, and finally by popping the caller's base pointer back into ebp.

Okay, okay, you just want to create local variables inside of main. So let us do that:

SECTION .data

msg: db "the variable has value of %d",10,0 ;we use this to display the variable
SECTION .bss

SECTION .text

     extern printf                  ;tell nasm that we want to call printf in this asm
     global main                    ;make main available to operating system(os)
main:
     ;create the stack frame
     push ebp
     mov ebp, esp

     ;create local variables by reserving space on the stack
     sub esp, 0x10  ;reserve 16 bytes of space: room for 4 integers of 4 bytes (32 bits) each



     ;we use increments of 4 because each push is 4 bytes in length,
     ;because the stack is structured to hold 32-bit (4-byte) values,
     ;notably addresses, which are 4 bytes in length
     mov DWORD [ebp-4], 0xf      ;store 15 into the first variable
     mov DWORD [ebp-8], 0xff     ;255
     mov DWORD [ebp-12], 0xfff   ;4095
     mov DWORD [ebp-16], 0xffff  ;65535

     push DWORD [ebp-4]   ;push the value stored at ebp -4 onto stack
     push DWORD msg       ;push the address of msg onto the stack
     call printf          ;call the extern, c standard library

     push DWORD [ebp-8]   ;push the value 
     push DWORD msg       ;push the address of msg onto the stack
     call printf          ;call the extern, c standard library

     push DWORD [ebp-12]   ;push the value 
     push DWORD msg       ;push the address of msg onto the stack
     call printf          ;call the extern, c standard library

     push DWORD [ebp-16]   ;push the value 
     push DWORD msg       ;push the address of msg onto the stack
     call printf          ;call the extern, c standard library


     ;destroy the stack frame
     mov esp, ebp
     pop ebp
     ret 


Effectively, what we did was reserve space on the stack, then store values inside of that reserved space.  That is all there is to making local variables in assembly.
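For reference, this is roughly the C that the listing corresponds to, assuming the four slots hold 0xf, 0xff, 0xfff, and 0xffff.  It is only a sketch: show_locals and the variable names are made up, and a real compiler is free to lay the locals out differently from the [ebp-4] to [ebp-16] order noted in the comments.

```c
#include <stdio.h>

/* Declares four locals, prints each one with the same format string the
   assembly uses, and returns how many lines were printed successfully. */
int show_locals(void) {
    int first  = 0xf;      /* 15,    like [ebp-4]  */
    int second = 0xff;     /* 255,   like [ebp-8]  */
    int third  = 0xfff;    /* 4095,  like [ebp-12] */
    int fourth = 0xffff;   /* 65535, like [ebp-16] */
    const int values[4] = { first, second, third, fourth };
    int printed = 0;
    int i;

    for (i = 0; i < 4; i++) {
        if (printf("the variable has value of %d\n", values[i]) > 0) {
            printed++;
        }
    }
    return printed;
}
```

The compiler does the "sub esp" bookkeeping for you here; in assembly, you are the one deciding which offset from ebp belongs to which variable.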

Tuesday, April 3, 2012

Writing if statements in assembly language

Programs become more and more interesting when you have dynamic elements in them.  One such way of bringing your program to life is by adding logic.  In assembly, the task can seem daunting and awkward, but once you get a grip on the concept, it will become second nature.

So let's get started!



//We want to write an equivalent program to this in assembly
#include <stdio.h>
int main(){
    int x = 40;
    if( x > 10){
        printf("x is greater than 10\n");
    }else{
        printf("x is lesser than 10\n");
    }
    return 0;
}

To write this in assembly, consider the following:

;an equivalent program to this in assembly
SECTION .data
x: dd 40
msg1: db "x is greater than 10", 10, 0
msg2: db "x is lesser than 10", 10, 0 
SECTION .text




Here, all we did is create our variable x, and the respective messages that we will display depending on the result of our if statement.

Continuing:
;an equivalent program to this in assembly
SECTION .data

x: dd 40
msg1: db "x is greater than 10", 10, 0
msg2: db "x is lesser than 10", 10, 0 
SECTION .text

;allow access to printf
extern printf
;make our main available externally
global main
main:
 push ebp
 mov ebp , esp
 cmp DWORD [x] , 10
 jg .conditionIsTrue   ;translates to: jump if [x] > 10
.conditionNotTrue:         ;translates to: else
 push DWORD msg2
 call printf
 jmp .done             ;skip the "true" branch; without this jump both messages would print
.conditionIsTrue:
 push DWORD msg1
 call printf
.done:
 mov esp, ebp
 pop ebp
 ret
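If the jumps feel strange, it can help to write the same control flow in C using goto, which maps almost one-to-one onto cmp, jg, and jmp.  This is just an illustration of the jump structure, not how you would normally write C; describe and the label names are made up.

```c
/* Returns the message an if/else would pick, using only a conditional jump
   and an unconditional jump, the way the assembly does. */
const char *describe(int x) {
    const char *msg;

    if (x > 10) goto condition_is_true;   /* cmp x, 10  then  jg */

    msg = "x is lesser than 10\n";        /* the else branch */
    goto done;                            /* jmp: skip over the true branch */

condition_is_true:
    msg = "x is greater than 10\n";       /* the if branch */

done:
    return msg;
}
```

The goto done line is the important one: code under a label runs whenever execution reaches it, so the else branch has to jump past the true branch explicitly.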
 

NASM Assembly - Hello World

Whenever you start programming, the first program usually prints the phrase "Hello world" to the screen.  Well, let us keep that tradition and write an entire assembly program that prints that message to the screen.


;Our Assembly Program file
SECTION .data
SECTION .bss
SECTION .text

The preceding is the standard file layout of an assembly program using the Netwide Assembler, or NASM.

To write something to the screen, we first need to store the value of what we want to render to the screen by declaring variables.

;Our Assembly Program file
SECTION .data
ourHelloMsg: db "Hello world, we are in assembly", 10, 0 ;our simple message
SECTION .bss
SECTION .text



Next, we want to use some real-world, practical assembly coding to print this message to the screen.  We could simply use the Linux int 80h system call interface to tell the operating system to print this message (if you aren't sure what I mean by this, do not worry); however, we will use the printf function, which is part of the C standard library.  This method will teach us how to mix assembly and C.

So let us get started:



;Our Assembly Program file
SECTION .data
ourHelloMsg: db "Hello world, we are in assembly", 10, 0 ;our simple message
SECTION .bss
SECTION .text
extern printf  ;this tells the assembler that printf is defined elsewhere (in the C library)
global main    ;this tells the assembler to make "main" visible to the linker

main:
 ;create the stack frame
 push ebp
 mov ebp, esp

 ;push the address of the msg onto the stack
 ;-->NOTE: labels are aliases for memory addresses
 push ourHelloMsg ;so here, ourHelloMsg stands in place of something like 0x3048503
 call printf
 ;destroy the stack frame
 mov esp, ebp
 pop ebp
 ret





Now, to assemble and link this program, pop open a terminal and bash out these commands (on a 64-bit system, gcc also needs the -m32 flag to link a 32-bit object file):

nasm -f elf -o asm1.o asm1.asm
gcc -o asmProgram asm1.o
./asmProgram



NASM Programming

Many of you, if you are like me, might be interested in how assembly works.  You may be surprised to find that assembly is quite easy, especially after you write a couple of simple programs.  Don't get me wrong: you will be frustrated at first.  However, that frustration, if you channel it right, will lead to serious lifelong learning and will give you a deeper appreciation of the beauty of assembly.

For more tutorials on assembly and visualizations of this information, visit my YouTube channel.

Okay, so let's get started.

We will be using Netwide Assembler (NASM) to write our program.
The general format of a NASM file is this:

;This is a comment
SECTION .data
;declare initialized variables here


SECTION .bss
;reserve space for uninitialized variables here


SECTION .text
;where your program code/assembly code lives


;

Working with Data Section
In your .data section, you can declare variables like this:

nameOfVariable: db 32 ;this declares a variable named nameOfVariable with a byte value of 32
nameOfVariable2: dw 302 ;declares a variable of 2 bytes
someString: db "This is an example of a string",10,0 ;declares an array of bytes (a string); 10 = newline and 0 = terminator of the array (NULL)

So, db is define byte, dw is define word, and dd is define double word.  Here, a byte is 8 bits, a word is two bytes, and a double word is two words (four bytes).  Basically, you use these to define initialized variables.
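If you think in C, the fixed-width integer types line up with these sizes.  The sketch below is only an analogy: the C names are my own spellings of the labels above, some_dword and its value are made up to show a dd-sized variable, and bytes_in_string is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

const uint8_t  name_of_variable  = 32;     /* like: db 32    (1 byte)  */
const uint16_t name_of_variable2 = 302;    /* like: dw 302   (2 bytes) */
const uint32_t some_dword        = 70000;  /* like: dd 70000 (4 bytes) */

/* The C string literal adds the terminating 0 byte for us; the "\n" plays
   the role of the 10 in the NASM declaration. */
const char some_string[] = "This is an example of a string\n";

/* Total bytes the string occupies, including the newline and the 0. */
size_t bytes_in_string(void) {
    return sizeof some_string;
}
```

The string takes 32 bytes in total: 30 characters of text, the newline, and the terminating zero.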

Working with BSS Section
In your .bss section, you declare variables like this:



nameOfVariable: resb 256 ;this reserves space for 256 bytes
nameOfVariable2: resw 1 ;this reserves space for 1 word, otherwise known as 2 bytes
someString: resd 1 ;this reserves space for 1 double word



Working with Text Section
The .text section is your program code.  It is what makes a program a program.  Simple as that.

You will arrange your instructions in whatever manner to achieve your particular logic goals.

Usually, you will want to have or start with something like this:



global main     ;export the symbol main so that external programs can see this memory address
main:     ;here "main:" is a label; labels are aliases for memory addresses
       ;setup our stack frame
       push ebp
       mov ebp, esp
   

       ;clean up the stack frame
       mov esp, ebp
       pop ebp
       ret