Instruction Page Help

This page details many of the features of the instruction pages.

Overview Table

The overview table lists all the various forms that an instruction can take. Each row of the table consists of the following items, in order:

  • Opcode and Mnemonic: A single form of the instruction listing both the binary encoding and assembly form. Italics in the mnemonic part signify operands. See below for an explanation on interpreting VEX and EVEX opcodes.
    Some vector instructions (such as ADDPS (Add Packed Single-Precision Floating-Point Values)) require that no legacy prefixes (such as 0x66) be present. Including a prefix will change how the instruction is decoded. For example, prefixing the ADDPS instruction with 0x66 will change it into an ADDPD (Add Packed Double-Precision Floating-Point Values) instruction. These mandatory "no prefix" opcodes are notated with NP at the beginning of the opcode line.
    EVEX forms commonly feature other bits of information such as the mask register ({k1}), embedded rounding ({er}), and more.
  • Encoding: A reference to the encoding table. This value represents where in the instruction the operands are encoded.
  • ## bit Mode (multiple): Whether a given instruction form is valid, invalid, or not encodable in the specified processor mode. "Valid" forms are allowed while "invalid" forms will usually throw an exception if encountered. "Not encodable" forms are also invalid, but because they also encode a different (valid) instruction, they will be interpreted incorrectly.
    For example, in 64 bit mode, the byte range 40 through 4F was repurposed for the REX prefix. This makes encoding INC eax as 40 impossible. Should the processor encounter what the author thinks is INC eax, it will treat it as a REX prefix with the lower four bits set to 0. The correct encoding would be FF C0.
  • CPUID Feature Flag (optional): If listed, these CPUID "feature flags" must be set. The existence of these flags does not necessarily imply the ability to execute the instruction; some CPU features must be enabled before use. Failure to do so will result in a processor exception.
  • Description: A short description of what the instruction form does. For most instructions, the various "Description" cells will be almost carbon copies of each other with minor changes.
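The "not encodable" case for INC eax described above can be sketched in Rust. This is only an illustration: the Rex struct and decode_rex function are hypothetical names, and mode is assumed to be the bit width of the current processor mode (16, 32, or 64).

```rust
/// Fields of a REX prefix byte, laid out as 0100WRXB.
#[derive(Debug, PartialEq, Eq)]
pub struct Rex {
    pub w: bool, // 64 bit operand size
    pub r: bool, // extends ModRM.reg
    pub x: bool, // extends SIB.index
    pub b: bool, // extends ModRM.r/m, SIB.base, or the opcode register
}

/// Returns `Some(Rex)` if `byte` would be decoded as a REX prefix.
/// `mode` is the bit width of the processor mode (an assumption of this sketch).
pub fn decode_rex(byte: u8, mode: u8) -> Option<Rex> {
    // Only 40 through 4F, and only in 64 bit mode, form a REX prefix.
    // In the other modes those bytes are the one-byte INC and DEC forms.
    if mode != 64 || (byte & 0xF0) != 0x40 {
        return None;
    }
    Some(Rex {
        w: (byte & 0b1000) != 0,
        r: (byte & 0b0100) != 0,
        x: (byte & 0b0010) != 0,
        b: (byte & 0b0001) != 0,
    })
}
```

With this sketch, decode_rex(0x40, 64) yields a REX prefix with all four bits clear, while decode_rex(0x40, 32) yields None because the byte is still INC eax in that mode.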

Interpreting VEX and EVEX Opcodes

VEX and EVEX opcodes are written differently than normal instructions. This is because the prefixes are multiple (two to four) bytes long and encode quite a bit of information. Both prefixes take the form of (E)VEX.{length}.{prefixes}.{w} with each field representing a specific field in the VEX or EVEX prefix. The other fields in the prefix are unspecified here and are dependent on the operands. The various fields in the opcode prefix encoding are:

  • length: The number of bits this instruction operates on. This is encoded in the L and (for EVEX) L' bits. This can be one of: 128 (XMM), 256 (YMM), 512 (ZMM), or LIG. LIG stands for "length ignored": the length field is ignored. This is typically used for scalar instructions, as only a single piece of data is operated on, not the whole register.
    In some situations, despite an instruction being defined with LIG, Intel may recommend a specific value is used instead for future proofing. For example, the ADDSD (Add Scalar Double-Precision Floating-Point Value) instruction is defined to be LIG, but Intel recommends setting L (and L' for EVEX) to zero.
  • prefixes: The implied prefix bytes that are encoded in the prefix. Due to the nature of the VEX and EVEX prefixes, there can be up to two prefix fields specified in the opcode encoding: one for operand size and type prefixes (the pp field), and one for escape codes (the m bits). If unspecified, a prefix group's bits must be all zeros (indicating no prefix).
  • w: The single W bit in the VEX or EVEX prefix. This is commonly used as an extra opcode bit, but it sometimes keeps the meaning of its predecessor, REX.W: expanding the operand size to 64 bits. WIG stands for "W ignored": the W bit is ignored.
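To make the fields above concrete, here is a minimal Rust sketch that pulls length, prefixes, and w out of a VEX prefix (EVEX is left out for brevity). The Vex struct and decode_vex function are hypothetical names. The sketch assumes the bytes really are a VEX prefix; it ignores the fact that outside 64 bit mode C4 and C5 can also begin the LES and LDS instructions.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct Vex {
    pub length_bits: u16, // 128 or 256, taken from the L bit
    pub pp: u8,           // implied legacy prefix: 0 = none, 1 = 66, 2 = F3, 3 = F2
    pub map: u8,          // implied escape bytes: 1 = 0F, 2 = 0F 38, 3 = 0F 3A
    pub w: bool,          // the W bit (always clear in the two-byte form)
    pub vvvv: u8,         // register number, already un-inverted
}

/// Decodes a two-byte (C5) or three-byte (C4) VEX prefix at the start of `bytes`.
pub fn decode_vex(bytes: &[u8]) -> Option<Vex> {
    // `last` is the byte holding vvvv, L, and pp in both forms.
    let (map, w, last) = match bytes {
        // The two-byte form has no map or W bits; it always implies 0F.
        &[0xC5, b1, ..] => (1, false, b1),
        &[0xC4, b1, b2, ..] => (b1 & 0x1F, (b2 & 0x80) != 0, b2),
        _ => return None,
    };
    Some(Vex {
        length_bits: if (last & 0x04) != 0 { 256 } else { 128 },
        pp: last & 0x03,
        map,
        w,
        // vvvv is stored inverted (one's complement) in the prefix.
        vvvv: (!last >> 3) & 0x0F,
    })
}
```

For example, VADDPS ymm0, ymm1, ymm2 assembles to C5 F4 58 C2. Feeding those bytes to decode_vex gives a length of 256, no implied legacy prefix, the 0F escape, and vvvv equal to 1 (ymm1).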


Encoding

The "Encoding" section is a table listing the encoding of the operands for the various opcodes in the overview table at the start. Each row of the table consists of the following items, in order:

  • Encoding: The name of the encoding this row is for. For example, if the "Encoding" cell of a mnemonic in the overview table contains RM, the row containing RM in this cell would list how the operands are encoded.
  • Tuple Type (optional): The EVEX encoding's tuple form. This column is only present if an EVEX encoding for this instruction exists. If present, any encoding that does not use an EVEX prefix will contain "N/A".
  • Operand(s): The actual encoding of each operand. Instructions whose number of operands depends on the form (for example, vector instructions with a legacy encoding) will contain "N/A" for disallowed operands. For example, "legacy" vector encodings typically use the same operand as both the first source and the destination (MNEMONIC dest, src), while VEX and EVEX versions with a "non-destructive" form (MNEMONIC dest, src1, src2) do not. In these cases, the "legacy" form has only two operands while the VEX and EVEX forms have three, so the "Operand 3" cell of the legacy row will contain "N/A".
    See below for an explanation on interpreting this value.

Interpreting the Operand Value

The operand value cell takes the form source[rw], which represents a location, source, that is both read from and written to ([rw]). Read-only and write-only locations are signified by [r] and [w], respectively.

source only specifies where the register number is encoded. It does not specify which register file is used (general purpose, segment, vector, etc.); that is specified by the mnemonic's encoding.

source will be one of the following values:

  • address##: An immediate value of size ## that represents a "direct" address in the address space. If multiple values of ## are allowed, they will be separated with a slash.
  • AL/AX/EAX/RAX: The accumulator register.
  • DS:SI: Memory addressed by the DS:SI register pair. DS:ESI and DS:RSI may be used instead depending on the processor's mode.
  • ES:DI: Memory addressed by the ES:DI register pair. ES:EDI and ES:RDI may be used instead depending on the processor's mode.
  • EVEX.vvvv: The vvvv field of an EVEX prefix represents the register.
  • FLAGS: The FLAGS register.
  • imm##: An immediate value of size ##. If multiple values of ## are allowed, they will be separated with a slash.
  • imm8(7..4): The upper four bits of an 8 bit immediate represent the register. In 32 bit "protected" mode, the most significant bit (MSB; bit 7) is ignored and treated as if it were 0.
  • ModRM.reg: The reg field of a ModR/M byte represents the register. The three bits can be extended to four using one of the following prefixes: REX, VEX, or EVEX.
  • ModRM.r/m: If the mod field of a ModR/M byte signifies a register, the r/m field represents the register. The three bits can be extended to four using one of the following prefixes: REX, VEX, or EVEX. If, however, the mod field of a ModR/M byte signifies memory, the address is calculated and used instead.
  • offset##: An immediate value of size ## that represents an offset from the following instruction. If multiple values of ## are allowed, they will be separated with a slash.
    For example, an infinite loop (a: JMP a) would be encoded as EB FE where FE represents negative 2. This would jump backwards two bytes to the a label and begin again. In fact, a "nop" could be encoded as EB 00 which would be a simple jump to the following instruction (zero bytes ahead).
  • VEX.vvvv: The vvvv field of a VEX prefix represents the register.
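Two of the entries above, ModRM.reg / ModRM.r/m and offset##, can be sketched in Rust as follows. The names ModRm, split_modrm, and rel8_target are hypothetical. The extension bits are passed in as plain booleans (as if taken from REX.R and REX.B), and the extended r/m value is only a register number when the mod field is 3.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct ModRm {
    pub mode: u8, // the two bit mod field; 3 means r/m names a register
    pub reg: u8,  // three bits from the byte, extended to four
    pub rm: u8,   // three bits from the byte, extended to four
}

/// Splits a ModR/M byte and applies the fourth register bits.
pub fn split_modrm(byte: u8, ext_reg: bool, ext_rm: bool) -> ModRm {
    ModRm {
        mode: byte >> 6,
        reg: ((ext_reg as u8) << 3) | ((byte >> 3) & 0b111),
        rm: ((ext_rm as u8) << 3) | (byte & 0b111),
    }
}

/// Computes the target of a jump with an 8 bit offset. The offset is
/// sign extended and added to the address of the following instruction.
pub fn rel8_target(instruction_address: u64, instruction_length: u64, offset: u8) -> u64 {
    let next = instruction_address.wrapping_add(instruction_length);
    next.wrapping_add(offset as i8 as i64 as u64)
}
```

Applied to the infinite loop from the offset## entry, rel8_target(0x1000, 2, 0xFE) returns 0x1000 (the jump's own address), while an offset of 00 returns 0x1002, the following instruction.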


Description

The "Description" section, as the name implies, contains a simplified description of the instruction's operation. In some cases, graphics will be used for illustrative purposes.


Operation

The "Operation" section is pseudo-code that uses a Rust-like syntax. While attempts are made to mimic Rust's syntax, some things are "incorrect". For example, Rust's ranges follow other programming languages with a "start to end" order. This mimics how arrays are laid out in memory (index 0 is at a lower address than index n). A string of bits, however, follows positional notation with the most significant bit (MSB) at the left. Due to this, bit position slices use a "high to low" ("end to start") order.


The MODE global variable represents the current operating mode of the processor thread. It can be one of: 16, 32, or 64, each representing the "bit width" of the current mode. However, it is only compared against 64 for instructions that are illegal in long (64 bit) mode.


In some rare cases, the operation of an instruction depends on which processor version is being used. In those (known) instances, the PROCESSOR global variable represents the current processor. For example, the AAA (ASCII Adjust AL After Addition) instruction operates slightly differently on the 80186 and prior.


Registers are accessed as if they were global variables. Any aliasing, and the zero extension to RrX when setting ErX, is handled implicitly.
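The implicit aliasing can be pictured with a small Rust sketch. Gpr and its methods are hypothetical names used only for illustration; the pseudo-code itself never spells this out.

```rust
/// One 64 bit general purpose register (for example RAX) and its aliases.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct Gpr(pub u64);

impl Gpr {
    /// Writing the 32 bit alias (ErX) zero extends into the full register.
    pub fn set_32(&mut self, value: u32) {
        self.0 = value as u64;
    }

    /// Writing the 16 bit alias (rX) leaves the upper 48 bits untouched.
    pub fn set_16(&mut self, value: u16) {
        self.0 = (self.0 & !0xFFFF) | value as u64;
    }

    /// Writing the low 8 bit alias (rL) leaves the upper 56 bits untouched.
    pub fn set_8(&mut self, value: u8) {
        self.0 = (self.0 & !0xFF) | value as u64;
    }

    /// Reading the 32 bit alias returns the low half.
    pub fn get_32(&self) -> u32 {
        self.0 as u32
    }
}
```

Only the 32 bit write clears the upper half; the 16 bit and 8 bit writes merge into the existing value.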


Flags are accessed as if they were global variables. For example, OF would refer to the overflow flag (which is either a zero or a one). These single bit values, when used in if conditions, are implicitly coerced to a boolean. The only multibit flag, IOPL, is a two bit value and, as such, cannot be coerced.
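As a rough model of what those globals stand for, the following Rust sketch reads individual flags out of a FLAGS value. The function names are hypothetical; the bit positions are the architectural ones.

```rust
// Bit positions of some flags within the FLAGS register.
pub const CF: u32 = 0;
pub const PF: u32 = 2;
pub const AF: u32 = 4;
pub const ZF: u32 = 6;
pub const SF: u32 = 7;
pub const OF: u32 = 11;

/// Reads a single bit flag as a boolean, matching the implicit
/// coercion the pseudo-code performs in `if` conditions.
pub fn flag(flags: u64, bit: u32) -> bool {
    ((flags >> bit) & 1) == 1
}

/// Reads IOPL, which occupies bits 12 and 13. It is a two bit
/// number (0 through 3), so it stays an integer.
pub fn iopl(flags: u64) -> u8 {
    ((flags >> 12) & 0b11) as u8
}
```

Here flag(flags, OF) plays the role of the bare OF in the pseudo-code, and iopl(flags) has to be compared against a number rather than used directly as a condition.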

Instruction Bits

Instruction prefixes are exposed as pseudo global variables. For example, EVEX.b refers to the b (broadcast) bit in the EVEX prefix for the currently executing instruction.


Integer Types

Rust's integer types, by default, do not allow access to the individual bits through slices. The only way to do so (without external crates) is through bit shifts and masking operations. Despite that, individual bits are exposed through slices. For example, to get the lowest three bits of an integer, one would normally do something similar to data & 7, but the operations show that as data[2..=0].
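In real Rust, a "high to low" slice such as data[2..=0] would have to be written with shifts and masks. A minimal sketch, using a hypothetical helper named bits that is not part of the pseudo-code:

```rust
/// Returns bits `high` down to `low` (both inclusive) of `value`,
/// shifted so that bit `low` lands in bit 0 of the result.
/// This is what the pseudo-code writes as `value[high..=low]`.
pub fn bits(value: u64, high: u32, low: u32) -> u64 {
    assert!(low <= high && high < 64, "expected high >= low and high < 64");
    let width = high - low + 1;
    // A shift by 64 would overflow, so the full width case is handled separately.
    let mask = if width == 64 { u64::MAX } else { (1u64 << width) - 1 };
    (value >> low) & mask
}
```

With this helper, bits(data, 2, 0) gives the same result as data & 7.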


The most used type in the pseudo-code (besides integers) is the Simd<T> type. It represents an x86 vector register. Currently, Simd::max() is 512 to correspond with the ZMM registers, but this would change if an "AVX-768" or "AVX-1024" were ever created.

The T generic is a numeric type (integer or floating point) that represents what the ZMM register contains. For example, Simd<f64> (on a machine supporting AVX-512) represents a ZMM register containing eight "double precision" floating point (64 bit) numbers.

Operations on Simd<T> are at the "bit level". In other words, even though T represents the type of data, data[0] does not represent the first data value, but the first bit. For example, to access the second data value in a Simd<u32>, data[63..=32] would be used.
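The mapping from a data value to its bit slice can be sketched in Rust. Both function names are hypothetical, and the register is modeled (as an assumption of this sketch) as eight 64 bit words with word 0 holding bits 63 through 0.

```rust
/// Returns the (high, low) bit positions of element `lane` when each
/// element is `lane_bits` wide, matching the `data[high..=low]` notation.
pub fn lane_bit_range(lane: usize, lane_bits: usize) -> (usize, usize) {
    let low = lane * lane_bits;
    (low + lane_bits - 1, low)
}

/// Reads one 32 bit element from a 512 bit register.
pub fn read_u32_lane(register: &[u64; 8], lane: usize) -> u32 {
    assert!(lane < 16, "a 512 bit register holds sixteen 32 bit elements");
    let (_, low) = lane_bit_range(lane, 32);
    // A 32 bit element never straddles two 64 bit words, so one shift
    // and a truncating cast are enough.
    (register[low / 64] >> (low % 64)) as u32
}
```

For the second element of a Simd<u32>, lane_bit_range(1, 32) returns (63, 32), the same range used in the example above.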


In addition to the Simd<T> type for vector instructions, there also exists the KMask type. It represents an x86 mask register (k0 through k7). KMask is a 64 bit wide bit addressable type. Each bit corresponds to the same bit in the x86 mask register with k[n] referring to the "n-th" bit of the underlying mask register.
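A minimal Rust model of that indexing is shown below. The method names bit and set_bit are assumptions made for this sketch; the pseudo-code itself uses the k[n] notation.

```rust
/// A 64 bit mask register (k0 through k7).
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct KMask(pub u64);

impl KMask {
    /// Reads the n-th bit, written `k[n]` in the pseudo-code.
    pub fn bit(self, n: u32) -> bool {
        assert!(n < 64, "mask registers are 64 bits wide");
        ((self.0 >> n) & 1) == 1
    }

    /// Writes the n-th bit, written `k[n] = value` in the pseudo-code.
    pub fn set_bit(&mut self, n: u32, value: bool) {
        assert!(n < 64, "mask registers are 64 bits wide");
        self.0 = (self.0 & !(1u64 << n)) | ((value as u64) << n);
    }
}
```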


Intel MPX (Memory Protection Extensions) created four bounds registers that can be used to check if a memory address is within a specified range. These registers are represented with the Bound type. The Bound type contains two accessible values: lower (the 64 bit lower bound) and upper (the 64 bit upper bound).

In addition to the bounds registers, there also exist three configuration and status registers: BNDCFGS, BNDCFGU, and BNDSTATUS. Each of the three is 64 bits wide and is accessed as if it were a global variable.
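The Bound type and a range check against it might look like the following Rust sketch. The method names and the BoundRangeExceeded error type are hypothetical, and the sketch assumes both bounds are inclusive and stored as plain addresses.

```rust
/// Stand-in for the bound range exceeded exception (#BR).
#[derive(Debug, PartialEq, Eq)]
pub struct BoundRangeExceeded;

/// One of the four MPX bounds registers.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Bound {
    pub lower: u64,
    pub upper: u64,
}

impl Bound {
    /// Fails if `address` is below the lower bound.
    pub fn check_lower(&self, address: u64) -> Result<(), BoundRangeExceeded> {
        if address < self.lower { Err(BoundRangeExceeded) } else { Ok(()) }
    }

    /// Fails if `address` is above the upper bound.
    pub fn check_upper(&self, address: u64) -> Result<(), BoundRangeExceeded> {
        if address > self.upper { Err(BoundRangeExceeded) } else { Ok(()) }
    }

    /// Performs both checks, lower bound first.
    pub fn check(&self, address: u64) -> Result<(), BoundRangeExceeded> {
        self.check_lower(address)?;
        self.check_upper(address)
    }
}
```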


Examples

The "Examples" section (if present) contains one or more example assembly snippets that demonstrate the instruction. Any examples provided use NASM (Intel) syntax.

Flags Affected

The "Flags Affected" section (if present) contains a description of how the processor's arithmetic flags are affected by the instruction. If this section is not present, then no arithmetic flags are changed.


Intrinsics

The "Intrinsics" section(s) (if present) contain C or Rust function definitions that can be used in one's code to utilize the instruction without inline assembly.


Exceptions

The "Exceptions" sections contain a list of possible processor exceptions that can result from execution of the instruction. For regular (non-vector) instructions, each subsection will be for the various processor modes. Vector instructions, on the other hand, will typically only have two subsections: "SIMD Floating-Point" and "Other".