Introduction
Assembler is a low-level programming language that has very few 'bells and whistles'. There is no concept of classes as such (although there are a few hybrid languages such as HLA that offer this type of feature), so object-oriented programming is not part of the assembler programmer's repertoire. For all its shortcomings, however, it is an elegant programming language that is pure and natural.
I am going to assume that you have some knowledge of C++, because I think that it is important to be able to compare the various assembler statements with equivalent high-level language statements.
Obtaining an assembler
Before you can start programming in assembler, you will of course need an assembler to assemble your source code with. There are many free assemblers out there, and I shall name some of the more popular ones for the x86-64 processors that are capable of running on both Windows and (with the exception of GoASM) Linux systems.
I am not going to mention assemblers such as TASM or MASM, as these are proprietary products from Borland and Microsoft. As such, code produced by said assemblers is under license and subject to restrictions, so we will leave them alone.
Of the three most popular assemblers listed above, I would strongly recommend using the Netwide Assembler as it provides the best features.
Other resources you will need
All of the Intel software developer manuals may be found here.
Getting started with a 'Hello World' example
Let's dive in at the deep end and provide you with the code for our 'Hello World' application.
; This is a console application that writes the infamous 'Hello World'
; text to the console and then exits
global _mainCRTStartup ; This is the main program entry point
extern _ExitProcess@4 ; Windows API call to exit the process
extern _GetStdHandle@4 ; Windows API call to get the standard output handle
extern _WriteFile@20 ; Windows API call to write to a handle
section .data ; Start of the data segment
hello_world db 'Hello World', 10, 0 ; The hello world message
bytes_written dd 0 ; Return 32-bit word from WriteFile
section .code ; Start of the code segment
; We need to get hold of the standard output handle so we can write
; our 'Hello World' text to it. This is provided by the GetStdHandle
; windows API call
_mainCRTStartup:
push -11 ; We want the standard output handle
call _GetStdHandle@4 ; Call the Windows API GetStdHandle to retrieve it
; The EAX register now contains the handle we need to write to. As
; we are just going to stack the parameters needed for the WriteFile
; API call, and the push instruction does not affect the registers,
; we can leave the returned handle in EAX for the moment
push 0 ; We do not want overlapped I/O
push dword bytes_written ; The address of the number of bytes written
push 13 ; The length of the text we are writing
push dword hello_world ; The address of the text we are writing
push eax ; The handle returned from GetStdHandle call
call _WriteFile@20 ; Write the text to the standard output handle
; The text has been written, so all we need to do now is exit the
; process and return control back to the console
push 0 ; Stack the exit code
call _ExitProcess@4 ; Exit the process
Right, now let's look at it in some detail, making comparisons with C++. The first line
global _mainCRTStartup ; This is the main program entry point
does not directly have a C++ equivalent statement, as most top-level symbols are automatically made global (i.e. other modules can reference the symbols declared in any given module). This statement tells the assembler that the label _mainCRTStartup in the code is to be made visible to the outside world.
The next three lines
extern _ExitProcess@4 ; Windows API call to exit the process
extern _GetStdHandle@4 ; Windows API call to get the standard output handle
extern _WriteFile@20 ; Windows API call to write to a handle
would be equivalent to
extern "C" {
VOID WINAPI ExitProcess(UINT uExitCode);
HANDLE WINAPI GetStdHandle(DWORD nStdHandle);
BOOL WINAPI WriteFile(HANDLE hFile, LPCVOID lpBuffer, DWORD nBytesToWrite, LPDWORD lpBytesWritten, LPOVERLAPPED lpOverlapped);
}
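For functions that use the stdcall calling convention (which is what WINAPI selects in 32-bit code), the C++ compiler actually adds an underscore to the front of the names you are prototyping, counts the number of bytes used as parameter space, and appends the @ sign and the number calculated.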
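The decoration rule just described can be sketched in portable C++. This is only an illustration, assuming the 32-bit stdcall convention and that every parameter occupies a whole number of 4-byte stack slots. The helper name stdcall_decorate is made up for this example and is not part of any compiler or of the Windows API.

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>

// Hypothetical helper (not a real compiler or Windows function): builds the
// 32-bit stdcall decorated name "_Name@N" from the sizes of the parameters.
// Assumption: each parameter is rounded up to a multiple of 4 bytes on the stack.
std::string stdcall_decorate(const std::string& name,
                             std::initializer_list<std::size_t> param_sizes)
{
    std::size_t total = 0;
    for (std::size_t size : param_sizes) {
        total += (size + 3) / 4 * 4;   // round up to a 4-byte stack slot
    }
    return "_" + name + "@" + std::to_string(total);
}
```

For example, stdcall_decorate("WriteFile", {4, 4, 4, 4, 4}) gives _WriteFile@20 and stdcall_decorate("ExitProcess", {4}) gives _ExitProcess@4, which match the extern lines in the listing.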
The next line
section .data ; Start of the data segment
tells the assembler that the lines following this statement are to be assembled into the data segment. Again, there is no C++ equivalent as it is taken care of by the compiler. (The closest in VC++ is the #pragma data_seg(".data") statement, which only the most experienced programmers would use).
The next two lines
hello_world db 'Hello World', 10, 0 ; The hello world message
bytes_written dd 0 ; Return 32-bit word from WriteFile
define the data that the program needs to run. These two statements would be written as
char hello_world[] = "Hello World\n";
DWORD bytes_written = 0;
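To see how many bytes these definitions occupy, here is a small portable C++ check. It is a sketch only; the names total_bytes and text_length are introduced for illustration and do not appear in the original program.

```cpp
#include <cstddef>
#include <cstring>

// Same text as the assembler definition 'Hello World', 10, 0:
// 11 characters, one line feed ('\n' is character 10 in ASCII)
// and a terminating zero byte.
const char hello_world[] = "Hello World\n";

// sizeof counts every byte in the array, including the terminating zero.
constexpr std::size_t total_bytes = sizeof(hello_world);

static_assert(total_bytes == 13, "11 characters + line feed + terminating zero");

// Length of the text itself (characters plus line feed), without the zero byte.
inline std::size_t text_length() { return std::strlen(hello_world); }
```

Note that the length of 13 used in the listing therefore includes the terminating zero byte; a length of 12 would cover only the text and the line feed.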
The next line
section .code ; Start of the code segment
tells the assembler that we are now switching to the code segment. Again, this has no C++ equivalent as it is taken care of by the compiler. (The closest in VC++ is the #pragma code_seg(".code") statement, which only the most experienced programmers would use).
The next line
_mainCRTStartup:
is the global label that we declared in the first line of the program. This is the default entry point for the program, and needs to be visible to the outside world otherwise the linker will not know where the start point is, and the executable will not be produced. Notice the colon ( : ) at the end of the label. This says it is a code label.
The next two lines push the value -11 (the identifier Windows uses for standard output, named STD_OUTPUT_HANDLE in the Windows headers) onto the stack and call the GetStdHandle Windows API function to obtain the handle for standard output.
push -11 ; We want the standard output handle
call _GetStdHandle@4 ; Call the Windows API GetStdHandle to retrieve it
The C++ equivalent would be
HANDLE hFile = GetStdHandle(-11);
Notice that the result from the C++ call is being placed into a variable hFile. In assembler, Windows API functions that return a value return the result of the call in the EAX register.
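As a side note on the value -11: the Windows headers define STD_OUTPUT_HANDLE as ((DWORD)-11), so what actually lands on the stack is the 32-bit bit pattern of -11. The following portable sketch shows that pattern, assuming a stack slot can be modelled as a 32-bit unsigned value; the name std_output_handle_id is invented for this example.

```cpp
#include <cstdint>

// Assumption for illustration: a stack slot is modelled as a 32-bit unsigned value.
// 'push -11' stores the two's complement bit pattern of -11 in that slot.
constexpr std::uint32_t std_output_handle_id = static_cast<std::uint32_t>(-11);

static_assert(std_output_handle_id == 0xFFFFFFF5u, "-11 as a 32-bit value");
```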
The next six lines push the five parameters for the WriteFile API call onto the stack and then make the call.
push 0 ; We do not want overlapped I/O
push dword bytes_written ; The address of the number of bytes written
push 13 ; The length of the text we are writing
push dword hello_world ; The address of the text we are writing
push eax ; The handle returned from GetStdHandle call
call _WriteFile@20 ; Write the text to the standard output handle
The C++ equivalent would be
WriteFile(hFile, hello_world, 13, &bytes_written, 0);
Notice that for Windows API calls, parameters are stacked in reverse order: last parameter first, next-to-last parameter next, and so on, so that the first parameter is always at the top of the stack when the call is made.
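The stacking order can be pictured with a toy model in portable C++. This is only an illustration: ToyStack and stack_write_file_arguments are made-up names, the strings stand in for the real values, and nothing here touches the real processor stack.

```cpp
#include <string>
#include <vector>

// Toy model of the stack: the back of the vector is the top of the stack.
struct ToyStack {
    std::vector<std::string> slots;
    void push(const std::string& value) { slots.push_back(value); }
    const std::string& top() const { return slots.back(); }
};

// Pushes the WriteFile arguments in the same order as the assembler listing:
// last parameter first, first parameter last.
ToyStack stack_write_file_arguments()
{
    ToyStack stack;
    stack.push("lpOverlapped = 0");
    stack.push("lpBytesWritten = &bytes_written");
    stack.push("nBytesToWrite = 13");
    stack.push("lpBuffer = hello_world");
    stack.push("hFile = eax");
    return stack;
}
```

After the five pushes, top() returns the hFile entry, which is the first parameter in the C++ prototype, just before the call instruction executes.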
Finally, the last two lines stack the exit code and call ExitProcess.
push 0 ; Stack the exit code
call _ExitProcess@4 ; Exit the process
The equivalent C++ code would be
return 0;
Compiling and linking
To assemble the source into an object file ready for linking, you would use the command line
nasm -f win32 HelloWorld.asm
You would then link the object file to create an executable. I am not going to detail that part of the process because it depends on which IDE you are using (if any), which linker you choose to use, and so on.
Conclusion
This tutorial provides a small insight into the world of assembler programming. We have used two instructions, push and call, to write our 'Hello World' application (all the other statements are assembler control statements). In future tutorials, we will look at some more basic assembler instructions and how they can be used to build applications.
The next tutorial is here.
This post has been edited by Martyn.Rae: 06 April 2010 - 08:56 AM




