November 16, 2013

================
Intel 80386 arch
================

Registers : 

 (1) data

     (a) eax : accumulator
     (b) ebx : base (can hold the address of a procedure or variable)
     (c) ecx : counter
     (d) edx : data

 (2) segment : provides base locations for program instructions, data, and
	       the stack

     (a) cs : code segment
     (b) ds : data segment
     (c) ss : stack segment
     (d) es : extra segment
     (e) fs and gs

 (3) index : contain offsets (from base locations as above, segment registers)
	     of data and instructions
 
     (a) ebp : base pointer
     (b) esp : stack pointer (contains offset to top of stack from SS)
	  -- holds address of last data element to be pushed onto the stack
     (c) esi : source index
     (d) edi : destination index

 (4) control

     (a) eip : instruction pointer (contains address of next instruction
	       to be executed)
     (b) eflags

Adding to the stack : done via PUSH. When a new value is pushed, the ESP
is decremented first (to point to a lower memory address, since the stack
grows down), then the value is written at the new ESP.

E.g. ESP points to mem loc 0x9ffffc04
Then want to push value 0x00a5
So, move that value (0x00a5) into eax
Decrement ESP by 4, so ESP now points to mem loc 0x9ffffc00
Write the value of eax at mem loc ESP (0x9ffffc00)

The pop operation removes a value from the stack, then increments the ESP
E.g. esp points to 0x9ffffe00
So we move the value at ESP (0x9ffffe00) into eax
Then we increment ESP by four to get 0x9ffffe04
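
The push and pop steps above can be sketched as a toy model in C. This is
only an illustration, not real 80386 code: the Stack386 struct, the helper
names, and the STACK_TOP bound are hypothetical, and the model covers just
a few 32-bit slots.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bound: first address above the modelled stack region. */
#define STACK_TOP   0x9ffffc10u
#define STACK_WORDS 16u

typedef struct {
    uint32_t esp;                  /* simulated stack pointer */
    uint32_t mem[STACK_WORDS];     /* mem[i] models address STACK_TOP - 4*(i+1) */
} Stack386;

/* Map a simulated, 4-byte-aligned address to its slot in mem[]. */
uint32_t *slot(Stack386 *s, uint32_t addr) {
    uint32_t idx = (STACK_TOP - addr) / 4u - 1u;
    assert(addr < STACK_TOP && addr % 4u == 0u && idx < STACK_WORDS);
    return &s->mem[idx];
}

/* PUSH: decrement ESP by 4, then store the value at the new ESP. */
void push32(Stack386 *s, uint32_t value) {
    s->esp -= 4u;
    *slot(s, s->esp) = value;
}

/* POP: load the value at ESP, then increment ESP by 4. */
uint32_t pop32(Stack386 *s) {
    uint32_t value = *slot(s, s->esp);
    s->esp += 4u;
    return value;
}
```

Starting from ESP = 0x9ffffc04, push32(&s, 0x00a5) leaves ESP at 0x9ffffc00
with 0x00a5 stored there; a following pop32(&s) returns 0x00a5 and puts ESP
back at 0x9ffffc04.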

The stack frame pointer (SFP, aka EBP) always points to a fixed location
within the current stack frame. Prior to function invocation, the caller
pushes the args to that function onto the stack; the CALL instruction then
pushes the return address (usually an address in the program's code
segment: the next instruction to execute after this function returns).
On entry, the called function pushes the caller's EBP (the saved frame
pointer) and then reserves stack space below it for its local variables.
Ergo, function parameters have positive offsets from the SFP (EBP),
whereas local vars have negative offsets from the SFP (EBP).
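
The offsets described above can be written out as simple address arithmetic.
A minimal sketch, assuming 4-byte arguments and locals, no padding or
alignment gaps, and the layout described in this section; the function names
are hypothetical.

```c
#include <stdint.h>

/* The saved caller EBP lives in the slot that EBP itself points at. */
uint32_t saved_ebp_addr(uint32_t ebp) { return ebp; }

/* The return address sits one 4-byte slot above the saved EBP. */
uint32_t return_addr_slot(uint32_t ebp) { return ebp + 4u; }

/* i-th 4-byte argument (0-based): positive offset from EBP. */
uint32_t arg_addr(uint32_t ebp, uint32_t i) { return ebp + 8u + 4u * i; }

/* j-th 4-byte local (0-based): negative offset from EBP. */
uint32_t local_addr(uint32_t ebp, uint32_t j) { return ebp - 4u * (j + 1u); }
```

With a hypothetical EBP of 0x9ffffc08, this model puts the return address at
0x9ffffc0c, the first argument at 0x9ffffc10, and the first local at
0x9ffffc04.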

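#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void bar( int barvar ) {

  int bar1 = barvar % 3;

  char buf[ 64 ];
  memset( buf, '\0', 64 );
  /* itoa() is non-standard; snprintf() formats the int and always
     NUL-terminates within the given size */
  snprintf( buf, sizeof( buf ), "%d", bar1 );
}

void foo( int blah ) {

   int foo1 = 3;

   bar( blah + foo1 );
}

int main( int argc, char **argv ) {

    if ( argc < 2 ) return 1;   /* argv[1] is required */

    int arg = atoi( argv[1] );

    foo( arg );
    return 0;
}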

[Don't ask whether this code is useful (or, much less, makes sense); 
 it doesn't].

The value stored at the address the SFP (EBP) points to is the calling
function's own EBP (its saved frame pointer). For example if main() calls
foo() and foo() calls bar(), then the stack might look something like this:

                 |-------------------------------------------------
		 |  argv[1]
		 |-------------------------------------------------
		 |  argv[0]
		 |-------------------------------------------------
		 |  argc
		 |-------------------------------------------------
		 | 0x80c1248  ( <start> + 24 )
		 |-------------------------------------------------
0x9ffffd18	 | 0x9ffffe18
		 |-------------------------------------------------
0x9ffffd14	 | int arg
		 |-------------------------------------------------	
0x9ffffd10	 | arg
		 |-------------------------------------------------
0x9ffffc0c	 | 0x80c256 ( <main> + 30 )
		 |-------------------------------------------------
0x9ffffc08	 | 0x9ffffd18
		 |-------------------------------------------------
0x9ffffc04	 | int foo1 = 3
		 |-------------------------------------------------
0x9ffffc00	 | blah + foo1
		 |-------------------------------------------------
0x9ffffbfc	 | 0x80c588 ( <foo> + 12 )
		 |-------------------------------------------------
0x9ffffbf8	 | 0x9ffffc08
		 |-------------------------------------------------
0x9ffffbf4	 | int barvar
		 |-------------------------------------------------

and so on...

When bar() is done executing, bar will : 

 - ESP = EBP

   --> this discards bar's locals; bar's EBP = 0x9ffffbf8
   --> so now ESP = 0x9ffffbf8

 - EBP = mem[ ESP ], then ESP = ESP + 4   (i.e. pop EBP)

   --> the value stored at 0x9ffffbf8 is 0x9ffffc08
   --> so the new EBP will be 0x9ffffc08 (foo's frame pointer again)
   --> and ESP = 0x9ffffbfc

 - EIP = mem[ ESP ], then ESP = ESP + 4   (i.e. RET)

   --> the return address is stored at 0x9ffffbfc
   --> so the new EIP is 0x80c588, back inside foo()
   --> and ESP = 0x9ffffc00

Note that when main() called foo(), main() pushed the argument to foo (int
blah) onto the stack, then called foo(). The CALL instruction pushes the
EIP onto the stack: in this case, that value was stored at the current
ESP, which is 0x9ffffc0c, and that value was 0x80c256 -- which is where in
main(), control should return when foo() completes execution. 

Then foo() for its part, (1) pushes the EBP (which is 0x9ffffd18) onto
the stack -- now the ESP points to 0x9ffffc08 (remember that the ESP
is decremented whenever we push onto the stack), then (2) copies the ESP into
the EBP, so the new EBP = 0x9ffffc08, (3) then subtracts from ESP the number of
bytes of local var space required by foo(). In this case it's just 4 bytes, to
store variable "int foo1". Then foo() does its thing...

++++++++++++++++++++++
On function invocation
++++++++++++++++++++++
--------------------------
So the calling procedure : 
--------------------------
 - pushes args to the called procedure
 - calls the called procedure (and, in so doing, pushes the return address --
   the location in the calling procedure where control should return)

---------------------------
Then the called procedure :
---------------------------
 - pushes the current EBP
 - sets the new EBP to be the value of the current ESP
 - decrements the ESP by some number of bytes to account for how much
   local storage is needed (so that, the ESP will again point to the
   "top" of the stack)

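The caller and callee steps above, together with the matching return, can be
sketched as a toy simulation in C. This is only an illustrative model, not
real 80386 code: the Cpu386 struct, the helper names, STACK_TOP, and the
entry address used for foo() in the usage note are hypothetical; the other
addresses come from the walkthrough of main() calling foo().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bound: first address above the modelled stack region. */
#define STACK_TOP   0x9ffffc14u
#define STACK_WORDS 16u

typedef struct {
    uint32_t eip, esp, ebp;        /* simulated registers */
    uint32_t mem[STACK_WORDS];     /* mem[i] models address STACK_TOP - 4*(i+1) */
} Cpu386;

/* Map a simulated, 4-byte-aligned address to its slot in mem[]. */
uint32_t *cell(Cpu386 *c, uint32_t addr) {
    uint32_t idx = (STACK_TOP - addr) / 4u - 1u;
    assert(addr < STACK_TOP && addr % 4u == 0u && idx < STACK_WORDS);
    return &c->mem[idx];
}

void push(Cpu386 *c, uint32_t v) { c->esp -= 4u; *cell(c, c->esp) = v; }

uint32_t pop(Cpu386 *c) {
    uint32_t v = *cell(c, c->esp);
    c->esp += 4u;
    return v;
}

/* Caller: push the argument, then CALL (push return address, jump). */
void call_with_arg(Cpu386 *c, uint32_t arg, uint32_t target, uint32_t ret_addr) {
    push(c, arg);
    push(c, ret_addr);
    c->eip = target;
}

/* Callee prologue: push ebp; mov ebp, esp; sub esp, local_bytes. */
void prologue(Cpu386 *c, uint32_t local_bytes) {
    push(c, c->ebp);
    c->ebp = c->esp;
    c->esp -= local_bytes;
}

/* Callee epilogue: mov esp, ebp; pop ebp; ret. */
void epilogue(Cpu386 *c) {
    c->esp = c->ebp;     /* drop the locals */
    c->ebp = pop(c);     /* restore the caller's frame pointer */
    c->eip = pop(c);     /* return to the caller */
}
```

Starting with ESP = 0x9ffffc14 and EBP = 0x9ffffd18, calling
call_with_arg(&c, 7, 0x80c57c, 0x80c256) and then prologue(&c, 4) leaves
the return address at 0x9ffffc0c, the saved EBP at 0x9ffffc08, EBP =
0x9ffffc08 and ESP = 0x9ffffc04. After epilogue(&c), EBP is 0x9ffffd18
again, EIP is 0x80c256, and ESP is 0x9ffffc10, still pointing at the pushed
argument, which the caller removes afterwards.
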
+=+=+=+=+=+=+=+=+=+=+=+
Now for linux syscalls
+=+=+=+=+=+=+=+=+=+=+=+

 - on syscall entry, eax contains the syscall #

 - ebx, ecx, edx, esi, edi, ebp contain the params to that syscall

 - so if there's just one argument, routine sys1 is used; sys1 does : 
   pop that arg off the stack into ebx,
   then direct execution to sys0 (common syscall handler)

 - if there are two args, routine sys2 is used; sys2 does :
   pop the first param off the stack into ebx,
   pop the second off the stack into ecx,
   then direct execution to sys0

 - ... and so on for syscalls which require 3, 4, 5, 6 args
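
From C, the same convention can be exercised through the C library's
syscall() function, which takes the syscall number followed by its
arguments and does the register setup itself (on i386: number in eax,
arguments in ebx, ecx, edx, ...). A minimal sketch, assuming a Linux system
whose libc provides syscall() and <sys/syscall.h>; the wrapper names
raw_getpid and raw_write are hypothetical.

```c
#define _GNU_SOURCE
#include <stddef.h>        /* size_t */
#include <sys/syscall.h>   /* SYS_getpid, SYS_write */
#include <unistd.h>        /* syscall() */

/* Zero-argument syscall: only the syscall number is passed. */
long raw_getpid(void) {
    return syscall(SYS_getpid);
}

/* Three-argument syscall: number, then fd, buffer, and length. */
long raw_write(int fd, const void *buf, size_t len) {
    return syscall(SYS_write, fd, buf, len);
}
```

raw_getpid() should return the same value as getpid(), and
raw_write(1, "ok\n", 3) writes three bytes to standard output and returns
the number of bytes written.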

+=+=+=+=+=+=+=+=+=+=+=+=+
Now for Windows API calls
+=+=+=+=+=+=+=+=+=+=+=+=+

 
