After reading the book CS:APP, I think that understanding the relationship between register usage and function calls is very important for improving our programming skills, and that it can make our programs more efficient.
The idea of a "function call" is familiar from high-level languages, but it is still a little difficult to grasp how it works at the assembly level. This article draws on some assembly-language background, such as registers and data movement instructions.
Programs are translated by other programs into different forms:
The preprocessor modifies the original C program according to directives that begin with the # character: it inserts the files named by #include directives and then expands the macros. In the second step, the C compiler generates an output file in assembly language. Next, the assembler translates the assembly language into machine language, producing a binary file whose bytes encode the machine instructions. Finally, the linker generates an executable object file that is ready to be loaded into memory and executed by the system.
Preprocessing->Compilation->Assembly->Linking->Loading
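As a small illustration of the first stages, here is a minimal sketch. The file name square.c, the macro SQUARE and the function square_of_sum are hypothetical, and the commands in the comment assume a GCC-style toolchain, which may not be the compiler used for the listings in this article.

```c
/* square.c: hypothetical file used only to illustrate the pipeline.
 *
 * Assuming a GCC-style toolchain, each stage can be run by itself:
 *   gcc -E square.c -o square.i   (preprocessing: expands #include and macros)
 *   gcc -S square.i -o square.s   (compilation: C to assembly language)
 *   gcc -c square.s -o square.o   (assembly: assembly language to object code)
 * Linking square.o into an executable additionally needs a file with main().
 */
#include <stdio.h>   /* replaced by the header's contents during preprocessing */

#define SQUARE(x) ((x) * (x))

/* After preprocessing, the body reads: return ((a + b) * (a + b)); */
int square_of_sum(int a, int b)
{
    return SQUARE(a + b);
}
```

Looking at the intermediate file square.s is one way to get assembly output similar in spirit to the listings in this article.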
Equivalent assembly language: Object code is difficult for us to read, but the equivalent assembly language is very close to the object code, so it is worth the effort to learn to read it.
Data instructions & registers: Know the basic rules of common instructions such as MOV and LEA, and the differences in how they are used. For example, mov ebp,esp copies the value of register esp into ebp, whereas lea edi,[ebp-4Ch] computes the address ebp-4Ch and puts it into edi without reading memory.
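To make the MOV/LEA difference more concrete, here is a rough C analogy. The function names load_value and element_address are made up for this sketch, and the instructions shown in the comments are only what a compiler might plausibly emit, not guaranteed output.

```c
#include <stddef.h>

/* Reads the value stored at p. This is the kind of code that is typically
 * compiled to a MOV from memory, e.g. mov eax,dword ptr [ecx]. */
int load_value(const int *p)
{
    return *p;
}

/* Computes the address of base[index] without reading memory. A compiler
 * may use LEA for this, e.g. lea eax,[ecx+edx*4]. */
int *element_address(int *base, size_t index)
{
    return &base[index];
}
```

In short, load_value needs a memory access, while element_address is pure address arithmetic.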
Stack frame structure: IA32 programs use the program stack to support procedure calls. The stack is used to pass procedure arguments, to store return information, to save registers for later restoration, and for local storage. The portion of the stack allocated for a single procedure call is called a stack frame. Note that each procedure call gets its own stack frame, but the registers are shared by all procedures. The topmost stack frame is delimited by two pointers, with register %ebp serving as the frame pointer and register %esp serving as the stack pointer.
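The idea that every active call has its own frame can be observed from C. This is only a sketch: the names inner and frames_are_distinct are hypothetical, and the printed addresses depend on the compiler and platform.

```c
#include <stdio.h>

/* Each active call owns a separate stack frame, so the local variable of the
 * inner call lives at a different address than the local of the outer call. */
static int inner(const int *outer_local)
{
    int inner_local = 2;
    printf("outer local at %p, inner local at %p\n",
           (const void *)outer_local, (const void *)&inner_local);
    /* Both objects are alive at this point, so their addresses must differ. */
    return outer_local != &inner_local;
}

/* Returns 1 when the two locals occupy different storage. */
int frames_are_distinct(void)
{
    int outer_local = 1;
    return inner(&outer_local);
}
```

On a typical IA32 build the inner local would be expected at a lower address, since the stack grows toward lower addresses, but the code above does not rely on that.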
Near calls and far calls: There are two types of procedure calls, near calls and far calls. For the former the return address takes 4 bytes, and for the latter it takes 8 bytes.
Note: This article involves quite a lot of assembly-language knowledge.
OK, with this background in place, we can begin. The following code gives an example with two function calls:
-----------------------------------------------------------------------
int swap_add(int *xp, int *yp)
{
    int x = *xp;
    int y = *yp;
    return x + y;
}

int caller()
{
    int arg1 = 534;
    int arg2 = 1057;
    int sum = swap_add(&arg1, &arg2);
    return 0;
}

int main()
{
    caller();
    return 0;
}
-----------------------------------------------------------------------
caller() calls swap_add() and passes its arguments by address. Now let us look at argument passing, the function call itself, and the use of registers and the stack through the equivalent assembly language:
-----------------------------------------------------------------------
10: int caller()
11: {
00401070 push ebp // save the old %ebp on the stack; %esp always points to the top of the stack
00401071 mov ebp,esp // copy the stack pointer into %ebp; at this point %ebp = %esp
00401073 sub esp, 4Ch // %esp - 4Ch: reserve stack space for the local variables used in the following steps
00401076 push ebx // save the callee-saved registers ebx, esi and edi
00401077 push esi
00401078 push edi
00401079 lea edi,[ebp-4Ch]
0040107C mov ecx,13h
00401081 mov eax,0CCCCCCCCh
00401086 rep stos dword ptr [edi]
12: int arg1 = 534;
00401088 mov dword ptr [ebp-4],216h // store 216h (534) at address ebp-4, which holds arg1
13: int arg2 = 1057;
0040108F mov dword ptr [ebp-8],421h
14: int sum = swap_add( &arg1, &arg2);
00401096 lea eax,[ebp-8]
00401099 push eax
0040109A lea ecx,[ebp-4]
0040109D push ecx
0040109E call @ILT+5(_swap_add) (0040100a) // call swap_add()
004010A3 add esp,8
004010A6 mov dword ptr [ebp-0Ch],eax
15:
16: return 0;
004010A9 xor eax,eax
17:
18: }
004010AB pop edi
004010AC pop esi
004010AD pop ebx
004010AE add esp,4Ch
004010B1 cmp ebp,esp
004010B3 call __chkesp (00401110)
004010B8 mov esp,ebp
004010BA pop ebp
004010BB ret
---------------------------------------------------------------------------
1: #include <stdio.h>
2: int swap_add(int *xp, int *yp)
3: {
00401030 push ebp
00401031 mov ebp,esp
00401033 sub esp,48h
00401036 push ebx
00401037 push esi
00401038 push edi
00401039 lea edi,[ebp-48h]
0040103C mov ecx,12h
00401041 mov eax,0CCCCCCCCh
00401046 rep stos dword ptr [edi]
4: int x = *xp;
00401048 mov eax,dword ptr [ebp+8] // load xp (&arg1) from ebp+8, see chart 1
0040104B mov ecx,dword ptr [eax]
0040104D mov dword ptr [ebp-4],ecx
5: int y = *yp;
00401050 mov edx,dword ptr [ebp+0Ch]
00401053 mov eax,dword ptr [edx]
00401055 mov dword ptr [ebp-8],eax
6:
7: return x + y;
00401058 mov eax,dword ptr [ebp-4]
0040105B add eax,dword ptr [ebp-8] // the result is left in eax
8: }
0040105E pop edi
0040105F pop esi
00401060 pop ebx
00401061 mov esp,ebp
00401063 pop ebp
00401064 ret
--------------------------------------------------------------------------------------
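Both listings contain the sequence lea edi / mov ecx / mov eax,0CCCCCCCCh / rep stos, which fills the freshly reserved local area with a marker value. The sketch below models that step in C, assuming (as the listings suggest) that ecx counts dwords; the function names fill_dwords and dwords_to_bytes are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Models "rep stos dword ptr [edi]": stores value into count consecutive
 * dwords starting at dst. count plays the role of ecx, value of eax, and
 * dst of edi. */
void fill_dwords(uint32_t *dst, size_t count, uint32_t value)
{
    for (size_t i = 0; i < count; i++) {
        dst[i] = value;
    }
}

/* Number of bytes covered by count dwords (4 bytes each). */
size_t dwords_to_bytes(size_t count)
{
    return count * sizeof(uint32_t);
}
```

This matches the numbers in the listings: 13h dwords cover 4Ch bytes in caller(), and 12h dwords cover 48h bytes in swap_add(), so the whole area reserved by sub esp is filled.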
--------------------------------------------------------------------------------------
address          stack (grows toward lower addresses ↓)
┆                  ┆
├──────────────────┤
│ ebp              │ ← %ebp of caller()
├─┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄─┤
│ 216h (arg1)      │   [ebp-4]
├─┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄─┤
│ 421h (arg2)      │   [ebp-8]
├─┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄─┤
┆ 4Ch bytes of     ┆
┆ space            ┆   caller() structure of stack frame
│                  │
├─┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄─┤
│ ebx/esi/edi      │
├──────────────────┤
│ &arg2            │
├──────────────────┤
│ &arg1            │ ←────────┐
├──────────────────┤          │
│ return address   │          │ (ebp+8 access to &arg1) chart 1
├──────────────────┤          │
│ ebp              │ ─────────┘   ← %ebp of swap_add()
├─┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄─┤
│ x                │   [ebp-4], receives the value read through &arg1
├─┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄─┤
│ y                │   [ebp-8]
├─┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄─┤
┆ 48h bytes of     ┆
┆ space            ┆   swap_add() structure of stack frame
│                  │
├─┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄─┤
│ ebx/esi/edi      │
├──────────────────┤
┆        .         ┆
┆        .         ┆
--------------------------------------------------------------------------------------
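The stack layout in chart 1 can also be replayed in a tiny simulation. This is a simplified sketch, not the real machine: memory is a small array, esp, ebp and eax are ordinary variables, the 0CCCCCCCCh fill and the ebx/esi/edi saves are left out, and the names sim_swap_add and sim_caller are hypothetical. Unlike the real caller(), which returns 0, sim_caller returns the stored sum so that it can be checked.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 64

static uint32_t mem[MEM_WORDS];  /* simulated memory, one dword per slot   */
static uint32_t esp, ebp, eax;   /* simulated registers, as byte addresses */

static void push(uint32_t v)
{
    assert(esp >= 4);            /* guard against overflowing the toy stack */
    esp -= 4;
    mem[esp / 4] = v;
}

static uint32_t pop(void)
{
    assert(esp < MEM_WORDS * 4); /* guard against popping an empty stack */
    uint32_t v = mem[esp / 4];
    esp += 4;
    return v;
}

/* Callee: follows the swap_add() listing step by step. */
static void sim_swap_add(void)
{
    push(ebp);                          /* push ebp                 */
    ebp = esp;                          /* mov ebp,esp              */
    esp -= 8;                           /* room for x and y         */
    uint32_t xp = mem[(ebp + 8) / 4];   /* mov eax,[ebp+8]  (&arg1) */
    uint32_t yp = mem[(ebp + 12) / 4];  /* mov edx,[ebp+0Ch] (&arg2) */
    mem[(ebp - 4) / 4] = mem[xp / 4];   /* x = *xp                  */
    mem[(ebp - 8) / 4] = mem[yp / 4];   /* y = *yp                  */
    eax = mem[(ebp - 4) / 4] + mem[(ebp - 8) / 4]; /* result in eax */
    esp = ebp;                          /* mov esp,ebp              */
    ebp = pop();                        /* pop ebp                  */
    (void)pop();                        /* ret: drop return address */
}

/* Caller: follows the caller() listing and returns the stored sum. */
uint32_t sim_caller(void)
{
    esp = MEM_WORDS * 4;                /* empty stack at the top of memory */
    ebp = esp;
    push(ebp);                          /* push ebp                 */
    ebp = esp;                          /* mov ebp,esp              */
    esp -= 12;                          /* room for arg1, arg2, sum */
    mem[(ebp - 4) / 4] = 0x216;         /* arg1 = 534               */
    mem[(ebp - 8) / 4] = 0x421;         /* arg2 = 1057              */
    push(ebp - 8);                      /* push &arg2               */
    push(ebp - 4);                      /* push &arg1               */
    push(0);                            /* call: placeholder return address */
    sim_swap_add();
    esp += 8;                           /* add esp,8                */
    mem[(ebp - 12) / 4] = eax;          /* sum = eax                */
    uint32_t sum = mem[(ebp - 12) / 4];
    esp = ebp;                          /* mov esp,ebp              */
    ebp = pop();                        /* pop ebp                  */
    return sum;
}
```

Calling sim_caller() walks through the same three phases as the listings (set up the frame, run the body, restore the frame) and yields 534 + 1057 = 1591.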
Finally, we have finished the walkthrough. For each function or procedure call, the system allocates a stack frame. We can divide the whole code of a function into three parts: setting up the stack frame, executing the body, and restoring the stack frame.
In my opinion, learning some assembly-related knowledge is very useful for us. We can strengthen our professional foundation through practice and thinking.