X86 Assembly Language
Assembly languages are low-level languages whose instructions map almost one-to-one to the machine instructions of a given processor family, such as Intel (x86) or ARM.
X86 assembly is the assembly language for x86 processors, a CISC (Complex Instruction Set Computing) architecture used by Intel processors among others.
- Rich instruction set compared to RISC processors
- Versatile and capable of handling a wide range of tasks
- Commonly used in desktop and server environments
- May consume more power than power-efficient alternatives such as ARM
X86 transitioned from 32-bit to 64-bit architecture with x86-64.
Basic Overview
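.data
Initialized global and static variables.
.bss
Uninitialized, statically allocated variables. They take no space in the executable file and are zero-filled when the program is loaded.
.text
Contains the program's instructions. Usually mapped read-only.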
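To see the three sections side by side, here is a minimal sketch. It assumes NASM syntax and 64-bit Linux, neither of which the notes specify, and the names `msg`, `msg_len`, and `scratch` are made up for the example.

```nasm
; Assumed build: nasm -f elf64 sections.asm -o sections.o && ld sections.o -o sections
section .data
    msg     db "hi", 10        ; initialized bytes: 'h', 'i', newline
    msg_len equ $ - msg        ; assemble-time constant, takes no space

section .bss
    scratch resb 16            ; 16 reserved bytes, zero-filled at load time (unused here)

section .text
    global _start
_start:
    mov eax, 1                 ; Linux x86-64 syscall number for write
    mov edi, 1                 ; file descriptor 1 = stdout
    lea rsi, [rel msg]         ; address of the bytes to write
    mov edx, msg_len           ; number of bytes to write
    syscall
    mov eax, 60                ; Linux x86-64 syscall number for exit
    xor edi, edi               ; exit status 0
    syscall
```

If the assumptions hold, running the program should print `hi` and exit with status 0.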
MOV Instructions
The `mov` instruction copies a value into a register or to a memory address.
mov destination, source ; copy source into destination
mov dword [xxx], 2 ; store 2 at address xxx
mov eax, dword [xxx] ; load the value at xxx (2) into eax
mov dword [xxx], eax ; store eax (2) back at xxx
imul eax, dword [xxx] ; eax = eax * [xxx] = 2 * 2 = 4
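The sequence above can be turned into a complete program. This is only a sketch, assuming NASM syntax and 64-bit Linux. `xxx` is the placeholder name from the notes, declared here as a 32-bit variable in `.bss`. The result is returned as the process exit status so it can be checked from the shell.

```nasm
; Assumed build: nasm -f elf64 mov.asm -o mov.o && ld mov.o -o mov
default rel                    ; use RIP-relative addressing for memory operands

section .bss
    xxx resd 1                 ; one 32-bit variable (dword)

section .text
    global _start
_start:
    mov dword [xxx], 2         ; memory <- immediate
    mov eax, dword [xxx]       ; register <- memory, eax = 2
    mov dword [xxx], eax       ; memory <- register, xxx still holds 2
    imul eax, dword [xxx]      ; eax = 2 * 2 = 4
    mov edi, eax               ; exit status = 4
    mov eax, 60                ; Linux x86-64 syscall number for exit
    syscall
```

After running it, `echo $?` should print 4.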
X86 Registers
General-Purpose Registers
Used for general-purpose computation and storage.
x86 | x86-64 | Description |
---|---|---|
EAX | RAX | Accumulator register for arithmetic/data operations |
EBX | RBX | Base register, often used as a pointer to data. |
ECX | RCX | Counter register, frequently used as a loop counter. |
EDX | RDX | Data register, used by some operations such as multiplication and division. |
Index Registers
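They are typically used as source and destination pointers when reading from or writing to arrays and strings.
x86 | x86-64 | Description |
---|---|---|
ESI | RSI | Source index. Pointer to the value being read. |
EDI | RDI | Destination index. Pointer to where a value is written. |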
Stack Registers
As the stack grows, it is logically divided into sections called Stack Frames, with each frame corresponding to an individual function.
x86 | x86-64 | Description |
---|---|---|
ESP | RSP | Stack pointer. Top (head) of the stack. |
EBP | RBP | Base pointer. Base (bottom) of the current stack frame. |
As a reminder, the head is the last value added (LIFO).
Local variable addresses such as `rbp-0x4` are expressed relative to the base of their function's stack frame.
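A small sketch of the LIFO behaviour, assuming NASM syntax and 64-bit Linux. In 64-bit mode each `push` lowers `rsp` by 8 bytes and each `pop` raises it by 8, since the stack grows toward lower addresses.

```nasm
; Assumed build: nasm -f elf64 stack.asm -o stack.o && ld stack.o -o stack
section .text
    global _start
_start:
    push 1                     ; rsp -= 8, [rsp] = 1
    push 2                     ; rsp -= 8, [rsp] = 2 (new head of the stack)
    pop  rdi                   ; rdi = 2, the last value pushed; rsp += 8
    pop  rsi                   ; rsi = 1; rsp += 8
    mov  eax, 60               ; Linux x86-64 syscall number for exit
    syscall                    ; exit status is rdi = 2
```

The exit status should be 2, because the last value pushed is the first one popped.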
Pointer Registers
x86 | x86-64 | Description |
---|---|---|
EIP | RIP | Instruction Pointer. Next instruction address. |
X86 Functions
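The `call` instruction is used to call a subroutine/function. It pushes the return address onto the stack and sets `EIP`/`RIP` to the address of the function. The `ret` instruction pops that return address and jumps back to it.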
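_start:
    call my_function ; push the return address, then jump to my_function
    ; ... exit here, otherwise execution falls through into my_function
my_function:
    ; ...
    ret ; pop the return address and jump back to it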
`RDI` typically holds `argc` and `RSI` typically holds `argv` in the context of the `main` function. More generally, under the System V AMD64 calling convention (used on Linux and macOS), the first integer or pointer arguments are passed in `RDI`, `RSI`, `RDX`, `RCX`, `R8`, and `R9`, in that order. The result is returned in `RAX`.
test rax, rax ; sets ZF if rax == 0 and SF if rax is negative (e.g. -1)
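The following sketch puts argument passing, the return value, and `test` together. It assumes NASM syntax, 64-bit Linux, and the System V AMD64 register order (`RDI`, then `RSI`, result in `RAX`). The function `add_two` and the label `.zero` are hypothetical names.

```nasm
; Assumed build: nasm -f elf64 call.asm -o call.o && ld call.o -o call
section .text
    global _start

; add_two(a, b): a arrives in rdi, b in rsi, the sum is returned in rax
add_two:
    mov rax, rdi
    add rax, rsi
    ret

_start:
    mov  edi, 3                ; first argument
    mov  esi, 4                ; second argument
    call add_two               ; rax = 7
    test rax, rax              ; ZF = 1 if rax == 0, SF = 1 if rax is negative
    jz   .zero                 ; not taken here, since rax = 7
    mov  edi, eax              ; exit status = result
    mov  eax, 60               ; Linux x86-64 syscall number for exit
    syscall
.zero:
    mov  edi, 1                ; exit status 1 if the result had been zero
    mov  eax, 60
    syscall
```

After running it, `echo $?` should print 7.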
Function Prologue And Epilogue
Each function that uses a stack frame performs two sequences of steps called the prologue and the epilogue. The prologue saves the caller's base pointer and sets up the new frame. The epilogue restores the previous stack and base pointers before returning.
push ebp ; prologue
mov ebp,esp
sub esp,value ; allocate space for variables
...
mov esp, ebp ; epilogue
pop ebp
ret
Alternatively, `leave` can replace the `mov esp, ebp` / `pop ebp` pair in the epilogue. The final `ret` is still required.
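The same pattern in 64-bit form, as a sketch assuming NASM syntax, 64-bit Linux, and the first argument arriving in `EDI`. The function name `square_plus_one` is hypothetical. The local variable lives at `rbp-0x4`, and `leave` performs the epilogue.

```nasm
; Assumed build: nasm -f elf64 frame.asm -o frame.o && ld frame.o -o frame
section .text
    global _start

square_plus_one:
    push rbp                   ; prologue: save the caller's base pointer
    mov  rbp, rsp              ; the new frame starts at the current stack top
    sub  rsp, 16               ; reserve 16 bytes for local variables
    mov  dword [rbp-0x4], edi  ; local variable = first argument
    mov  eax, dword [rbp-0x4]  ; eax = local variable
    imul eax, eax              ; eax = eax * eax
    add  eax, 1                ; eax = eax + 1
    leave                      ; epilogue: mov rsp, rbp then pop rbp
    ret

_start:
    mov  edi, 3                ; argument
    call square_plus_one       ; eax = 3 * 3 + 1 = 10
    mov  edi, eax              ; exit status = 10
    mov  eax, 60               ; Linux x86-64 syscall number for exit
    syscall
```

After running it, `echo $?` should print 10.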
To-do
Stuff that I found, but have not read/used yet.
- The `je` opcode is 74 (hex) and the `jne` opcode is 75 (hex), for the short-jump forms.