BLENDVPS: Variable Blend Packed Single-Precision Floating-Point Values

| Opcode and Mnemonic | Encoding | 16 bit Mode | 32 bit Mode | 64 bit Mode | CPUID Feature Flag | Description |
| --- | --- | --- | --- | --- | --- | --- |
| 66 0F 38 14 /r BLENDVPS xmm1, xmm2/m128, <XMM0> | LEGACY | Invalid | Valid | Valid | SSE4_1 | Selects packed single-precision floating-point values from xmm1 and xmm2/m128 from a mask specified in XMM0. Stores the result in xmm1. |
| VEX.128.66.0F3A.W0 4A /r ib VBLENDVPS xmm1, xmm2, xmm3/m128, xmm4 | VEX | Invalid | Valid | Valid | AVX | Selects packed single-precision floating-point values from xmm2 and xmm3/m128 from a mask specified in xmm4. Stores the result in xmm1. |
| VEX.256.66.0F3A.W0 4A /r ib VBLENDVPS ymm1, ymm2, ymm3/m256, ymm4 | VEX | Invalid | Valid | Valid | AVX | Selects packed single-precision floating-point values from ymm2 and ymm3/m256 from a mask specified in ymm4. Stores the result in ymm1. |

Encoding

| Encoding | Operand 1 | Operand 2 | Operand 3 | Operand 4 |
| --- | --- | --- | --- | --- |
| LEGACY | ModRM.reg[rw] | ModRM.r/m[r] | None (implicitly XMM0) | |
| VEX | ModRM.reg[w] | VEX.vvvv[r] | ModRM.r/m[r] | imm8[7:4] |

Description

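The (V)BLENDVPS instruction conditionally selects each single-precision floating-point element from either the first or the second source operand and stores it in the destination operand. For each element, if the most significant bit (the sign bit) of the corresponding mask element is set, the element is taken from the second source operand; otherwise it is taken from the first source operand. In the legacy SSE form, the destination operand is also the first source operand, and the mask is implicitly XMM0.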

This instruction is similar to BLENDPS (Blend Packed Single-Precision Floating-Point Values), but differs in that the selection mask is held in a register instead of being hardcoded in an immediate.
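The relationship between the two mask styles can be sketched in scalar C. This is a minimal illustration, not part of the instruction definition: the helper name `imm_to_blendv_mask` is hypothetical, and it assumes the BLENDPS convention that immediate bit `n` controls element `n`, while (V)BLENDVPS looks at bit 31 of mask element `n`.

```c
#include <stdint.h>

/* Hypothetical helper: expand a 4 bit BLENDPS-style immediate into the
   per-element register mask that BLENDVPS would consume. */
void imm_to_blendv_mask(uint8_t imm8, uint32_t mask[4])
{
    for (int n = 0; n < 4; n++) {
        /* BLENDPS tests immediate bit n; BLENDVPS tests bit 31 of element n. */
        mask[n] = ((imm8 >> n) & 1u) ? 0x80000000u : 0u;
    }
}
```

With a mask built this way, a variable blend selects the same elements that the immediate form would, which is why the register form is the natural choice when the selection is only known at run time.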

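This instruction, despite being named as if it operates on floating-point numbers, will work on 32 bit integers as well, because elements are copied as raw bit patterns and only the most significant bit of each mask element is examined.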
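A scalar model on raw 32 bit lanes makes this concrete. The sketch below is an assumption-laden illustration of the legacy 128 bit form (the function name `blendv_lanes32` is made up for this example); it treats every lane as an uninterpreted bit pattern and inspects only bit 31 of each mask lane.

```c
#include <stdint.h>

/* Scalar model of the legacy form on raw 32 bit lanes (hypothetical name).
   The data lanes are never interpreted, so they may hold floats or integers. */
void blendv_lanes32(uint32_t dest[4], const uint32_t src[4], const uint32_t mask[4])
{
    for (int n = 0; n < 4; n++) {
        /* Only the most significant bit of the mask lane matters. */
        if (mask[n] & 0x80000000u)
            dest[n] = src[n];
        /* Otherwise dest[n] keeps its previous value. */
    }
}
```

Note that a mask lane of `0x7FFFFFFF` selects nothing even though almost all of its bits are set, while `0x80000000` (the bit pattern of `-0.0f`) does select the source lane.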

The VEX encoded instruction forms have a different opcode than the legacy SSE form. All versions except the legacy SSE version zero the unused upper SIMD register bits.

In 32 bit mode, imm8[7] is treated as a 0, preventing access to more than 8 vector registers.
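As a rough sketch of how a decoder might recover the mask register from the trailing immediate byte, assuming the operand is encoded in imm8[7:4] as listed in the encoding table (the function and parameter names here are hypothetical):

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical decoder helper: index of the mask register encoded in imm8[7:4]. */
unsigned mask_register_index(uint8_t imm8, bool is_64bit_mode)
{
    unsigned index = (imm8 >> 4) & 0xFu;
    if (!is_64bit_mode)
        index &= 0x7u; /* imm8[7] is treated as 0 outside 64 bit mode */
    return index;
}
```

For example, an immediate of `0xF0` would name register 15 in 64 bit mode but register 7 in 32 bit mode. This sketch simply discards the low four bits of the immediate.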

The VEX form of this instruction requires VEX.W to be 0. Encoding VEX.W = 1 results in a #UD exception.

Operation

This pseudo-code uses C# syntax. A list of the types used is available here.
public void BLENDVPS(SimdF32 dest, SimdF32 src)
{
  // The mask for element `n` is the most significant bit of dword `n` of XMM0 (bit `32 * n + 31`).
  // If that bit is 0, `dest[n]` will be copied into itself (i.e. nothing will happen)
  if (XMM0.Bit[31])
    dest[0] = src[0];
  if (XMM0.Bit[63])
    dest[1] = src[1];
  if (XMM0.Bit[95])
    dest[2] = src[2];
  if (XMM0.Bit[127])
    dest[3] = src[3];
  // dest[4..Simd.MAX] (unmodified)
}

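void VBLENDVPS_Vex(SimdF32 dest, SimdF32 src1, SimdF32 src2, SimdF32 src3, int kl)
{
  for (int n = 0; n < kl; n++) {
    // The mask for element `n` is the most significant bit of dword `n` of `src3`
    if (src3.Bit[32 * n + 31])
      dest[n] = src2[n];
    else
      dest[n] = src1[n];
  }
  // The VEX forms zero the unused upper elements
  dest[kl..Simd.MAX] = 0;
}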
public void VBLENDVPS_Vex128(SimdF32 dest, SimdF32 src1, SimdF32 src2, SimdF32 src3)
{
  VBLENDVPS_Vex(dest, src1, src2, src3, 4);
}
public void VBLENDVPS_Vex256(SimdF32 dest, SimdF32 src1, SimdF32 src2, SimdF32 src3)
{
  VBLENDVPS_Vex(dest, src1, src2, src3, 8);
}
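
The VEX behaviour above, including the zeroing of the unused upper elements, can be approximated with a small scalar model in C. This is only a sketch under stated assumptions: the register is modelled as 8 dwords (`MAX_LANES` is an assumption of this example, standing in for `Simd.MAX`), the function name is hypothetical, and the mask is taken from bit 31 of each `src3` element.

```c
#include <stdint.h>

#define MAX_LANES 8 /* assumption: model a 256 bit register as 8 dwords */

/* Scalar model of the VEX forms; kl is 4 (128 bit) or 8 (256 bit). */
void vblendvps_model(uint32_t dest[MAX_LANES], const uint32_t src1[MAX_LANES],
                     const uint32_t src2[MAX_LANES], const uint32_t src3[MAX_LANES],
                     int kl)
{
    /* Build the result in a temporary so dest may alias any source. */
    uint32_t result[MAX_LANES] = {0};
    for (int n = 0; n < kl; n++)
        result[n] = (src3[n] & 0x80000000u) ? src2[n] : src1[n];
    /* Lanes kl..MAX_LANES-1 stay zero, mirroring the upper-bit zeroing. */
    for (int n = 0; n < MAX_LANES; n++)
        dest[n] = result[n];
}
```

Calling the model with `kl = 4` leaves lanes 4 to 7 of `dest` at zero regardless of the sources, whereas the legacy form would leave them unmodified.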

C Intrinsics

__m128 _mm_blendv_ps(__m128 a, __m128 b, __m128 mask);
__m256 _mm256_blendv_ps(__m256 a, __m256 b, __m256 mask);

Exceptions

SIMD Floating-Point

None

Other

See Exceptions Type 4.

#UD
If VEX.W is 1.