Opcode

What Is an Opcode?

Opcode, short for Operation Code, is the portion of a machine language instruction that specifies the operation the CPU should perform. It is essentially a numerical or binary representation of an operation, like addition, subtraction, move, load, jump, etc.

Think of the opcode as the verb in a sentence: it tells the computer what to do with the data.

For example:

In assembly: ADD R1, R2
In machine code: 0001 0001 0010

In this illustrative encoding, the first four bits (0001) are the opcode for ADD; the remaining two 4-bit fields (0001 and 0010) identify registers R1 and R2.

1. Structure of a Machine Instruction

Machine instructions often follow a structured format:

+---------+------------+--------------+
| Opcode  | Operand 1  | Operand 2... |
+---------+------------+--------------+

Opcode indicates the operation.
Operands indicate the data or locations involved.

Example: ADD Instruction

Field	Value
Opcode	`0001` (ADD)
Operand 1	`0100` (Register R4)
Operand 2	`0011` (Register R3)

The CPU interprets this as: R4 = R4 + R3
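The field layout above can be sketched as a few shifts and masks. This is a minimal illustration that assumes the 12-bit format used in the example (a 4-bit opcode followed by two 4-bit register numbers); the `decode` function and the `OPCODE_NAMES` table are hypothetical names, not part of any real ISA.

```python
# Assumed toy format: 4-bit opcode, then two 4-bit register fields (12 bits total).
OPCODE_NAMES = {0b0001: "ADD"}  # only the opcode given in the example above


def decode(word: int) -> tuple[int, int, int]:
    """Split a 12-bit instruction word into (opcode, operand1, operand2)."""
    opcode = (word >> 8) & 0xF    # top 4 bits
    operand1 = (word >> 4) & 0xF  # middle 4 bits
    operand2 = word & 0xF         # low 4 bits
    return opcode, operand1, operand2


word = 0b0001_0100_0011
opcode, rd, rs = decode(word)
print(f"{OPCODE_NAMES[opcode]} R{rd}, R{rs}")  # prints "ADD R4, R3"
```

Real instruction sets use wider and less regular layouts, but the decoding idea is the same: fixed bit positions are masked out and interpreted.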

2. Opcode vs Mnemonic

Term	Description
Opcode	Binary code representing an operation
Mnemonic	Human-readable representation used in assembly

Example:

Opcode: 0x89 (hexadecimal) → Mnemonic: MOV (on x86, the form that copies a register into a register or memory location)

Programmers write mnemonics, but CPUs execute opcodes.

3. Opcode in the Instruction Cycle

Opcodes are executed during the Instruction Cycle, which typically consists of:

Fetch – Load instruction from memory
Decode – Identify opcode and operands
Execute – Perform the operation

Role of Opcode in the Cycle:

During Decode, the CPU extracts the opcode to determine which circuit or logic unit to activate (e.g., ALU, memory access, branching).
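The fetch, decode, and execute steps can be sketched as a small interpreter loop. This is a toy model, not a description of real hardware: it reuses the 12-bit format from the earlier example, and only the ADD opcode value (0001) comes from the text. The SUB and HALT values, and the `run` function itself, are assumptions made for illustration.

```python
# Toy machine: 12-bit words, 4-bit opcode plus two 4-bit register numbers.
# ADD = 0b0001 is from the text; SUB and HALT are invented for this sketch.
ADD, SUB, HALT = 0b0001, 0b0010, 0b1111


def run(program: list[int], regs: list[int]) -> list[int]:
    """Run the program until HALT and return the register file."""
    pc = 0
    while True:
        word = program[pc]          # Fetch: read the instruction at the program counter
        pc += 1
        opcode = (word >> 8) & 0xF  # Decode: extract the opcode and operand fields
        a = (word >> 4) & 0xF
        b = word & 0xF
        if opcode == ADD:           # Execute: the opcode selects the operation
            regs[a] = regs[a] + regs[b]
        elif opcode == SUB:
            regs[a] = regs[a] - regs[b]
        elif opcode == HALT:
            return regs
        else:
            raise ValueError(f"unknown opcode {opcode:#06b}")


regs = [0] * 16
regs[3], regs[4] = 10, 32
result = run([0b0001_0100_0011, 0b1111_0000_0000], regs)
print(result[4])  # R4 = R4 + R3, prints 42
```

The `if`/`elif` chain plays the role that the decoder plays in hardware: the opcode value alone decides which operation runs.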

4. Common Opcodes and Operations

Mnemonic	Operation	Description
`MOV`	Move/Copy	Copy data between registers/memory
`ADD`	Addition	Add two values
`SUB`	Subtraction	Subtract values
`MUL`	Multiplication	Multiply values
`DIV`	Division	Divide values
`AND`	Bitwise AND	Logic operation
`OR`	Bitwise OR	Logic operation
`JMP`	Jump	Go to specific instruction
`CMP`	Compare	Set flags for conditional jumps
`NOP`	No operation	Placeholder, often used for timing
`INT`	Interrupt	Invoke OS or system service

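Each mnemonic maps to one or more opcodes in machine code; the exact values depend on the ISA and, on architectures such as x86, on the operand types.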

5. Instruction Set and Opcodes

Every CPU architecture defines its own Instruction Set Architecture (ISA), which includes:

The list of valid opcodes
Their binary formats
Supported operand types and lengths

Examples:

x86 ISA

Complex instruction set
Opcodes may be 1 to 3 bytes
E.g., B8 → MOV EAX, immediate

ARM ISA

RISC design
Fixed 32-bit instruction length in the classic ARM (A32) encoding; Thumb adds 16-bit forms
E.g., 0xE3A00001 → MOV R0, #1

RISC-V ISA

Open-standard RISC ISA
Simple, modular opcodes
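To make the fixed-width case concrete, the ARM example above (0xE3A00001) can be split into its fields. This sketch assumes the classic 32-bit ARM (A32) data-processing format with an immediate operand; the function name `decode_arm_data_processing` is hypothetical, and the code handles only this one instruction format.

```python
def decode_arm_data_processing(word: int) -> dict[str, int]:
    """Extract the fields of an A32 data-processing instruction (immediate form)."""
    return {
        "cond": (word >> 28) & 0xF,      # condition code (0b1110 = always)
        "imm_flag": (word >> 25) & 0x1,  # 1 = second operand is an immediate
        "opcode": (word >> 21) & 0xF,    # operation selector (0b1101 = MOV)
        "s": (word >> 20) & 0x1,         # 1 = update condition flags
        "rn": (word >> 16) & 0xF,        # first operand register (unused by MOV)
        "rd": (word >> 12) & 0xF,        # destination register
        "rotate": (word >> 8) & 0xF,     # rotation applied to imm8
        "imm8": word & 0xFF,             # 8-bit immediate value
    }


fields = decode_arm_data_processing(0xE3A00001)
print(fields["opcode"] == 0b1101, fields["rd"], fields["imm8"])  # prints "True 0 1"
```

Note that the 4-bit operation field is only part of what identifies the instruction: the condition and format bits around it also take part in decoding.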

6. Opcode Tables

Here’s a small sample from the x86 opcode table:

Opcode (Hex)	Mnemonic	Description
`90`	`NOP`	No operation
`B8+rd`	`MOV r32, imm32`	Move immediate to register
`01`	`ADD r/m32, r32`	Add register to register/memory
`E9`	`JMP rel32`	Jump to relative offset
`C3`	`RET`	Return from procedure
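A disassembler's first step is essentially a lookup in a table like this one. The sketch below covers only the five rows shown above and ignores prefixes, ModRM bytes, and operands; `mnemonic_for` and the table names are hypothetical. The `B8+rd` row means the register number is added to the base opcode, so it expands to the eight opcodes B8 through BF.

```python
# 32-bit register names in x86 encoding order (register number 0..7).
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]

# Only the one-byte opcodes from the sample table above.
ONE_BYTE_OPCODES = {
    0x90: "NOP",
    0x01: "ADD r/m32, r32",
    0xE9: "JMP rel32",
    0xC3: "RET",
}


def mnemonic_for(opcode: int) -> str:
    """Map a single opcode byte to the mnemonic listed in the sample table."""
    if 0xB8 <= opcode <= 0xBF:  # B8+rd: register number is folded into the opcode
        return f"MOV {REG32[opcode - 0xB8]}, imm32"
    return ONE_BYTE_OPCODES.get(opcode, "(not in this sample table)")


for byte in (0x90, 0xB8, 0xBB, 0xC3):
    print(f"{byte:02X} -> {mnemonic_for(byte)}")
```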

7. Example: Assembly to Opcode Translation

Assembly Code:

MOV EAX, 5
ADD EAX, EBX

Corresponding Opcodes (x86):

MOV EAX, 5 → B8 05 00 00 00
ADD EAX, EBX → 01 D8

Each instruction is translated by an assembler into these machine-level opcodes, which the CPU executes directly.
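The translation above can be reproduced by hand. This is a minimal sketch of what an assembler does for these two instruction forms only, assuming 32-bit operands and register-to-register addressing; the helper names `mov_r32_imm32` and `add_r32_r32` are hypothetical.

```python
import struct

# x86 32-bit register numbers as used in instruction encodings.
REG32 = {"EAX": 0, "ECX": 1, "EDX": 2, "EBX": 3, "ESP": 4, "EBP": 5, "ESI": 6, "EDI": 7}


def mov_r32_imm32(dst: str, imm: int) -> bytes:
    """Encode MOV r32, imm32: opcode B8+rd followed by a little-endian 32-bit immediate."""
    return bytes([0xB8 + REG32[dst]]) + struct.pack("<I", imm & 0xFFFFFFFF)


def add_r32_r32(dst: str, src: str) -> bytes:
    """Encode ADD r/m32, r32 (opcode 01) with both operands in registers."""
    # ModRM byte: mod=11 (register direct), reg=source, r/m=destination.
    modrm = 0xC0 | (REG32[src] << 3) | REG32[dst]
    return bytes([0x01, modrm])


print(mov_r32_imm32("EAX", 5).hex(" "))      # prints "b8 05 00 00 00"
print(add_r32_r32("EAX", "EBX").hex(" "))    # prints "01 d8"
```

The second byte of the ADD encoding (D8) is not an opcode but a ModRM byte that names the operands, which is why the full encoding is longer than the opcode itself.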

8. Encodings and Instruction Length

Different ISAs have different encoding schemes:

ISA	Instruction Length	Opcode Length
RISC (e.g., ARM, RISC-V)	Fixed (e.g., 32 bits)	Usually fixed
CISC (e.g., x86)	Variable (1–15 bytes)	1–3 bytes, plus optional prefixes

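Variable-length encodings allow denser code and leave room for more instructions, but they make decoding more complex because the CPU must determine where each instruction ends.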

9. Prefixes, Suffixes, and Modifiers

Some ISAs like x86 use prefixes to modify opcode behavior:

Segment override
Operand size override
Lock prefix
Repeat prefix

Example:

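F3 0F 1E FA → ENDBR64, an Intel CET (Control-flow Enforcement Technology) instruction that marks a valid indirect-branch target. Here the F3 byte is a mandatory part of the encoding rather than a repeat prefix.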

In modern CPUs, opcode length and complexity affect instruction decoding speed, influencing performance.
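The prefix categories listed above correspond to specific byte values that precede the opcode. The following sketch separates leading legacy prefix bytes from the rest of an instruction. It is a simplification: it assumes 32-bit mode, ignores REX/VEX/EVEX prefixes, and the `split_prefixes` function is a hypothetical helper.

```python
# Legacy x86 prefix bytes and the category each belongs to.
LEGACY_PREFIXES = {
    0xF0: "LOCK",
    0xF2: "REPNE/REPNZ",
    0xF3: "REP/REPE",
    0x2E: "CS segment override",
    0x36: "SS segment override",
    0x3E: "DS segment override",
    0x26: "ES segment override",
    0x64: "FS segment override",
    0x65: "GS segment override",
    0x66: "operand-size override",
    0x67: "address-size override",
}


def split_prefixes(code: bytes) -> tuple[list[str], bytes]:
    """Return (names of leading prefix bytes, remaining opcode and operand bytes)."""
    i = 0
    while i < len(code) and code[i] in LEGACY_PREFIXES:
        i += 1
    return [LEGACY_PREFIXES[b] for b in code[:i]], code[i:]


# 66 B8 05 00: operand-size override turns MOV EAX, imm32 into MOV AX, imm16.
print(split_prefixes(bytes.fromhex("66b80500")))
# F0 01 18: LOCK ADD [EAX], EBX.
print(split_prefixes(bytes.fromhex("f00118")))
```

A scanner this naive would also label the leading F3 of F3 0F 1E FA as a repeat prefix, which shows why real decoders must consider the following opcode bytes before interpreting a prefix.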

10. Role in CPU Design and Execution

Decoder: Hardware unit that interprets opcode bits
Control Unit: Maps opcodes to control signals for data paths
Microcode: In some architectures (like Intel x86), complex instructions are translated into sequences of simpler internal micro-operations

In short, the opcode is what determines which control signals the processor generates for an instruction.

11. Security and Exploits

Opcode-level manipulation is sometimes used in:

Buffer overflow exploits (injected shellcode)
Opcode obfuscation in malware
Reverse engineering (disassemblers reconstruct opcodes)

Understanding opcodes is critical in cybersecurity, debugging, and OS kernel development.

12. Opcodes in Virtual Machines and Emulators

Many virtual machines (such as the JVM or CPython) execute bytecode, an instruction format whose opcodes are defined by the virtual machine rather than by hardware.

Example (Python bytecode):

def add(x, y): return x + y

Compiles to:

LOAD_FAST 0
LOAD_FAST 1
BINARY_ADD
RETURN_VALUE

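Here, BINARY_ADD is a CPython bytecode opcode, executed by the interpreter's evaluation loop rather than by the CPU directly. The exact opcode names depend on the CPython version: this listing matches CPython 3.10 and earlier, while 3.11 and later emit BINARY_OP for the addition.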
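The bytecode for a function can be inspected with the standard library's `dis` module. The snippet below is a minimal sketch; no expected output is shown because the opcode names and arguments differ between CPython versions.

```python
import dis


def add(x, y):
    return x + y


# Print each bytecode instruction's opcode name and a readable form of its argument.
for instr in dis.get_instructions(add):
    print(instr.opname, instr.argrepr)
```

Calling `dis.dis(add)` instead prints a formatted listing that also includes bytecode offsets.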

Summary

The opcode is a fundamental building block of machine-level programming. Whether for CPU execution, reverse engineering, or compiler construction, understanding opcodes shows how software instructions are translated into hardware behavior.