CSC 221 Computer Organization and Assembly Language
Lecture 32: Intel x86 Instruction Encoding

Lecture Outline
- Encoding Real x86 Instructions
- x86 Instruction Format Reference
- x86 Opcode Sizes
- x86 ADD Instruction Opcode
- Encoding x86 Instruction Operands, MOD-REG-R/M Byte
- REG Field of the MOD-REG-R/M Byte
- MOD R/M Byte and Addressing Modes
- SIB (Scaled Index Byte) Layout
- Scaled Indexed Addressing Mode
- Encoding ADD Instruction Example
- Encoding ADD CL, AL Instruction
- Encoding ADD ECX, EAX Instruction
- Encoding ADD EDX, DISPLACEMENT Instruction
- Encoding ADD EDI, [EBX] Instruction
- Encoding ADD EAX, [ ESI + disp8 ] Instruction
- Encoding ADD EBX, [ EBP + disp32 ] Instruction
- Encoding ADD EBP, [ disp32 + EAX*1 ] Instruction
- Encoding ADD ECX, [ EBX + EDI*4 ] Instruction
- Encoding ADD Immediate Instruction

Encoding Real x86 Instructions
It is time to take a look at the actual machine instruction format of the x86 CPU family. They don't call the x86 CPU a Complex Instruction Set Computer (CISC) for nothing! Although more complex instruction encodings exist, no one is going to dispute that the x86 has a complex instruction encoding. An instruction is built from the following fields, in this order:
- Prefix bytes: 0 to 4 special prefix values that affect the operation of the instruction.
- Opcode: a one- or two-byte instruction opcode (two bytes if the special 0Fh opcode expansion prefix is present).
- "mod-reg-r/m" byte: specifies the addressing mode and instruction operand size. This byte is only required if the instruction supports register or memory operands.
- Scaled index byte (SIB): optional, present only if the instruction uses a scaled indexed memory addressing mode.
- Displacement: a 0-, 1-, 2-, or 4-byte value that specifies a memory address displacement for the instruction.
- Immediate/constant data: a 0-, 1-, 2-, or 4-byte constant value, present if the instruction has an immediate operand.
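As a rough illustration of the field list above, the sketch below adds up the field sizes of a single instruction. The names `ALLOWED_SIZES`, `instruction_length`, and `max_layout_length` are hypothetical helpers invented for this example; the size ranges are the ones listed on the slide.

```python
# Allowed size in bytes for each field, in encoding order (from the slide).
ALLOWED_SIZES = {
    "prefixes": (0, 1, 2, 3, 4),
    "opcode": (1, 2),
    "mod_reg_rm": (0, 1),
    "sib": (0, 1),
    "displacement": (0, 1, 2, 4),
    "immediate": (0, 1, 2, 4),
}


def instruction_length(prefixes=0, opcode=1, mod_reg_rm=0, sib=0,
                       displacement=0, immediate=0):
    """Return the total length in bytes for the given field sizes."""
    sizes = {
        "prefixes": prefixes,
        "opcode": opcode,
        "mod_reg_rm": mod_reg_rm,
        "sib": sib,
        "displacement": displacement,
        "immediate": immediate,
    }
    for name, size in sizes.items():
        if size not in ALLOWED_SIZES[name]:
            raise ValueError(f"{name} cannot be {size} bytes long")
    return sum(sizes.values())


def max_layout_length():
    """Length obtained if every field took its largest listed size."""
    return sum(max(sizes) for sizes in ALLOWED_SIZES.values())


if __name__ == "__main__":
    # One opcode byte plus one mod-reg-r/m byte, e.g. a register-to-register ADD.
    print(instruction_length(opcode=1, mod_reg_rm=1))
    # Every field at its largest listed size.
    print(max_layout_length())
```

Taking every field at its largest size is only a property of the layout; it does not mean such an instruction is legal.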
Encoding Real x86 Instructions
Although the diagram seems to imply that instructions can be up to 16 bytes long, the x86 does not allow instructions longer than 15 bytes.
The prefix bytes are not the opcode expansion prefix discussed earlier; they are special bytes that modify the behavior of existing instructions.

x86 Instruction Format Reference
Another view of the x86 instruction format:

(a) Optional Instruction Prefix

  Field                  Number of bytes
  Instruction Prefix     0 or 1
  Address-Size Prefix    0 or 1
  Operand-Size Prefix    0 or 1
  Segment Override       0 or 1

(b) General Instruction Format

  Field                  Number of bytes
  OpCode                 1 or 2
  Mod-R/M                0 or 1
  SIB                    0 or 1
  Displacement           0, 1, 2 or 4
  Immediate              0, 1, 2 or 4

  Mod-R/M byte:  bits 7-6 Mod,   bits 5-3 Reg/OpCode,  bits 2-0 R/M
  SIB byte:      bits 7-6 Scale, bits 5-3 Index,       bits 2-0 Base

Instructions have some combination of the following fields (most instructions do not use every field):
- instruction prefix: sets certain options
- opcode: specifies the operation to perform
- Mod R/M: specifies addressing mode/operands
- SIB (scale index base): used for array indexing
- address displacement: used for addressing memory
- immediate value: holds the value of a constant operand

x86 Instruction Format Reference: Displacement
Here we are really talking about an address offset within a segment (usually given as a named variable or a label in code):
- it could be a relative address, like the 8-bit value used for jumping forward or backward from the current location in the code segment
- or it could be the location of a variable in the data segment
- or it could be a FAR reference to code or data in another segment

x86 Instruction Format Reference: Displacement Examples
- jmp next, where next is a label in the current code segment
- add eax, var1, where var1 is a 32-bit variable in the current data segment
- sub bx, var2[ecx], where var2 is a 16-bit variable in the current code segment and ecx is an index register

x86 Instruction Format Reference: Immediate Values
These are usually constants used directly in the operation, that is, available immediately. For example:
- ADD EAX, 7, where 7 is the immediate value; there is no variable name and no source register
- CMP AL, [EDX+10], where 10 is a constant that will be added to the contents of the EDX register to get the operand's location (in the encoding, this constant occupies the displacement field)
- SHL AX, 4, where the constant 4 is the number of bit positions to shift the AX register

x86 Instruction Format Reference: Instruction Prefix
Used to specify options for instruction execution, for example:
- when executing string operations (MOVS, SCAS, etc.), the prefix indicates that the operation should be repeated
  - for REP and REPE, the prefix is set to F3h
  - for REPNE, the prefix is F2h
- some values indicate the memory segment that should be used (instead of the default)
  - for ES the prefix is set to 26h, for FS it is 64h, etc.
- to change the default data size for an instruction (from 32-bit to 16-bit or vice versa), the prefix is set to 66h
- similarly, to change the default address size for an instruction, set it to 67h
- to lock shared memory so that only this instruction has access, set it to F0h

x86 Opcode Sizes
The x86 CPU supports two basic opcode sizes:
- the standard one-byte opcode
- a two-byte opcode consisting of a 0Fh opcode expansion prefix byte; the second byte then specifies the actual instruction.
This provides for up to 512 different instruction classes, although the x86 does not yet use them all.

x86 ADD Instruction Opcode
ADD opcode bit layout (bit 7 down to bit 0): 0 0 0 0 0 0 d s
Bit number zero, marked s, specifies the size of the operands the ADD instruction operates upon:
- If s = 0 then the operands are 8-bit registers and memory locations.
- If s = 1 then the operands are either 16 bits or 32 bits:
  - Under 32-bit operating systems the default is 32-bit operands if s = 1.
  - To specify a 16-bit operand (under Windows or Linux) you must insert a special operand-size prefix byte in front of the instruction.
Bit number one, marked d, specifies the direction of the data transfer:
- If d = 0 then the destination operand is a memory location, e.g. add [ebx], al
- If d = 1 then the destination operand is a register, e.g. add al, [ebx]
In short: d = 0 if adding from register to memory, d = 1 if adding from memory to register; s = 0 if adding eight-bit operands, s = 1 if adding 16/32-bit operands.

Encoding x86 Instruction Operands, MOD-REG-R/M Byte (1/4)
The MOD-REG-R/M byte specifies instruction operands and their addressing mode(*):
- The R/M field, combined with MOD, specifies either the second operand in a two-operand instruction, or the only operand in a single-operand instruction like NOT or NEG.
- The d bit in the opcode determines which operand is the source, and which is the destination:
  d=0: MOD R/M
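The d and s bits of the ADD opcode described above can be combined as in the following minimal sketch. It assumes a 32-bit default operand size (the case the slide describes for Windows or Linux) and covers only the opcode byte and the optional operand-size prefix, not the MOD-REG-R/M byte. The function name `add_opcode` and its parameters are hypothetical, chosen for this example.

```python
OPERAND_SIZE_PREFIX = 0x66  # switches the default 32-bit operand size to 16-bit


def add_opcode(dest_is_register, operand_bits):
    """Return the prefix (if any) and opcode byte for a register/memory ADD.

    dest_is_register: True for forms like add al, [ebx] (d = 1),
                      False for forms like add [ebx], al (d = 0).
    operand_bits:     8, 16, or 32.
    """
    if operand_bits not in (8, 16, 32):
        raise ValueError("operand size must be 8, 16, or 32 bits")
    d = 1 if dest_is_register else 0
    s = 0 if operand_bits == 8 else 1
    opcode = (d << 1) | s  # upper six bits of the ADD opcode are all zero
    prefix = bytes([OPERAND_SIZE_PREFIX]) if operand_bits == 16 else b""
    return prefix + bytes([opcode])


if __name__ == "__main__":
    print(add_opcode(False, 8).hex())   # opcode for add [ebx], al
    print(add_opcode(True, 8).hex())    # opcode for add al, [ebx]
    print(add_opcode(True, 32).hex())   # 32-bit operands, register destination
    print(add_opcode(False, 16).hex())  # 16-bit operands need the 66h prefix
```

A complete instruction would follow these bytes with the MOD-REG-R/M byte and any SIB, displacement, or immediate bytes.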