List of x86 cryptographic instructions

This article lists instructions that have been added to the x86 instruction set to assist efficient calculation of cryptographic primitives, such as AES encryption, SHA hash calculation and random number generation.

Intel AES instructions

The Intel AES extension adds six new instructions.

Instruction Encoding Description Added in
AESENC xmm1,xmm2/m128 66 0F 38 DC /r Perform one round of an AES encryption flow.
Performs the SubBytes, ShiftRows, MixColumns and AddRoundKey steps of an AES encryption round, in that order.
The first source argument provides a 128-bit data block to perform an encryption round on; the second source argument provides a round key for the AddRoundKey stage.

AESENCLAST xmm1,xmm2/m128 66 0F 38 DD /r Perform the last round of an AES encryption flow.
Performs the SubBytes, ShiftRows and AddRoundKey steps of an AES encryption round, in that order.
AESDEC xmm1,xmm2/m128 66 0F 38 DE /r Perform one round of an AES decryption flow.
Performs the InvShiftRows, InvSubBytes, InvMixColumns and AddRoundKey steps of an AES decryption round, in that order.
AESDECLAST xmm1,xmm2/m128 66 0F 38 DF /r Perform the last round of an AES decryption flow.
Performs the InvShiftRows, InvSubBytes and AddRoundKey steps of an AES decryption round, in that order.
AESKEYGENASSIST xmm1,xmm2/m128,imm8 66 0F 3A DF /r ib Assist in AES round key generation. The operation performed is:
temp[127: 0] := SubBytes( src[127:0] )  // AES SubBytes step
dest[ 31: 0] := temp[63:32]
dest[ 63:32] := rotate_right( temp[63:32], 8 ) XOR RCON
dest[ 95:64] := temp[127:96]
dest[127:96] := rotate_right( temp[127:96], 8 ) XOR RCON

where RCON is the instruction's imm8 argument zero-extended to 32 bits.

AESIMC xmm1,xmm2/m128 66 0F 38 DB /r Perform the InvMixColumns step of an AES decryption round on one 128-bit block.
Mainly used to help prepare an AES key for use with the AESDEC instruction.
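
As an illustration of the table above, the following is a minimal software model of one AES encryption round (as performed by AESENC) and of the key generation assist operation. It is a sketch rather than a description of any hardware implementation: the function names gf_mul, build_sbox, aesenc and aeskeygenassist are hypothetical, blocks are assumed to be passed in memory byte order, and the word rotation is modelled as an 8-bit right rotation of a little-endian dword, which is what the AES RotWord step amounts to in that representation.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox():
    """Derive the AES S-box: multiplicative inverse, then affine transform."""
    def rotl8(v, n):
        return ((v << n) | (v >> (8 - n))) & 0xFF

    sbox = []
    for x in range(256):
        inv = 1
        for _ in range(254):  # x^254 is the inverse of x; 0 maps to 0
            inv = gf_mul(inv, x)
        sbox.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2)
                    ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return sbox


SBOX = build_sbox()


def aesenc(state, round_key):
    """Model of one AES encryption round on a 16-byte block."""
    sub = [SBOX[b] for b in state]                              # SubBytes
    # ShiftRows: byte index = row + 4*col; row r rotates left by r columns
    shifted = [sub[(i + 4 * (i % 4)) % 16] for i in range(16)]
    mixed = []
    for c in range(4):                                          # MixColumns
        a0, a1, a2, a3 = shifted[4 * c:4 * c + 4]
        mixed += [gf_mul(a0, 2) ^ gf_mul(a1, 3) ^ a2 ^ a3,
                  a0 ^ gf_mul(a1, 2) ^ gf_mul(a2, 3) ^ a3,
                  a0 ^ a1 ^ gf_mul(a2, 2) ^ gf_mul(a3, 3),
                  gf_mul(a0, 3) ^ a1 ^ a2 ^ gf_mul(a3, 2)]
    return bytes(m ^ k for m, k in zip(mixed, round_key))      # AddRoundKey


def aeskeygenassist(src, imm8):
    """Model of the key generation assist step on a 128-bit integer."""
    sub = bytes(SBOX[b] for b in src.to_bytes(16, "little"))
    temp = int.from_bytes(sub, "little")

    def ror8(w):
        return ((w >> 8) | (w << 24)) & 0xFFFFFFFF

    x1 = (temp >> 32) & 0xFFFFFFFF
    x3 = (temp >> 96) & 0xFFFFFFFF
    rcon = imm8 & 0xFF  # imm8 zero-extended to 32 bits
    return (x1 | ((ror8(x1) ^ rcon) << 32)
            | (x3 << 64) | ((ror8(x3) ^ rcon) << 96))
```

In an AES-128 key expansion, the top dword of the aeskeygenassist result is XORed with the first word of the previous round key to obtain the first word of the next round key.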

CLMUL instructions

Instruction Opcode Description
PCLMULQDQ xmm1,xmm2/m128,imm8 66 0F 3A 44 /r ib Perform a carry-less multiplication of two 64-bit polynomials over the finite field GF(2^k).
PCLMULLQLQDQ xmm1,xmm2/m128 66 0F 3A 44 /r 00 Multiply the low halves of the two 128-bit operands.
PCLMULHQLQDQ xmm1,xmm2/m128 66 0F 3A 44 /r 01 Multiply the high half of the destination register by the low half of the source operand.
PCLMULLQHQDQ xmm1,xmm2/m128 66 0F 3A 44 /r 10 Multiply the low half of the destination register by the high half of the source operand.
PCLMULHQHQDQ xmm1,xmm2/m128 66 0F 3A 44 /r 11 Multiply the high halves of the two 128-bit operands.
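
The half-selection described in the table can be sketched in software as follows. This is a minimal model, assuming 128-bit operands are held as Python integers; the names clmul64 and pclmulqdq are hypothetical helpers, and the selector bits follow the immediate values listed above (bit 0 picks the half of the destination, bit 4 picks the half of the source).

```python
MASK64 = (1 << 64) - 1


def clmul64(a, b):
    """Carry-less product of two 64-bit values (polynomials over GF(2))."""
    result = 0
    for i in range(64):
        if (b >> i) & 1:
            result ^= a << i  # XOR instead of add: no carries propagate
    return result


def pclmulqdq(dest, src, imm8):
    """Model of the carry-less multiply with an 8-bit half selector."""
    # imm8 bit 0 selects the high (1) or low (0) half of the destination
    a = (dest >> 64) & MASK64 if imm8 & 0x01 else dest & MASK64
    # imm8 bit 4 selects the high (1) or low (0) half of the source
    b = (src >> 64) & MASK64 if imm8 & 0x10 else src & MASK64
    return clmul64(a, b)
```

For example, multiplying the polynomial x + 1 (value 3) by itself gives x^2 + 1 (value 5), because the two middle terms cancel under XOR.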

RDRAND and RDSEED

Instruction Encoding Description Added in
RDRAND r16, RDRAND r32 NFx 0F C7 /6 Return a random number that has been generated with a CSPRNG (Cryptographically Secure Pseudo-Random Number Generator) compliant with NIST SP 800-90A. Ivy Bridge, Silvermont, Excavator, Puma, ZhangJiang
RDRAND r64 NFx REX.W 0F C7 /6
RDSEED r16, RDSEED r32 NFx 0F C7 /7 Return a random number that has been generated with a HRNG/TRNG (Hardware/"True" Random Number Generator) compliant with NIST SP 800-90B and C. Broadwell, ZhangJiang, Zen 1, Gracemont
RDSEED r64 NFx REX.W 0F C7 /7

Intel SHA and SM3 instructions

These instructions provide support for cryptographic hash functions such as SHA-1, SHA-256, SHA-512 and SM3. Each of these hash functions works on fixed-size data blocks, where the processing of each data block mostly consists of two major phases:

  • First expand the data-block using a message schedule (that is specific to each hash function)
  • Then perform a series of rounds of a compression function to combine the expanded data into a hash state.

For each of the supported hash functions, separate instructions are provided to help compute the message schedule and to help perform the compression function rounds.
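
The two phases can be illustrated with SHA-256 in plain software. The following is a minimal sketch, not a model of the instructions themselves: the names message_schedule, compress and sha256 are hypothetical, the round constants are derived from the fractional parts of prime roots as specified by the SHA-256 standard, and the result is checked against Python's hashlib. On hardware with the SHA extensions, the message schedule instructions would assist the first function and the rounds instructions the second.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF


def first_primes(count):
    found, n = [], 2
    while len(found) < count:
        if all(n % p for p in found):
            found.append(n)
        n += 1
    return found


def icbrt(n):
    """Integer cube root (floor), by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 2)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


PRIMES = first_primes(64)
H0 = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]  # initial hash state
K = [icbrt(p << 96) & MASK for p in PRIMES]            # round constants


def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def message_schedule(block):
    """Phase 1: expand a 64-byte block into 64 message words."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 64):
        s0 = rotr(w[t - 15], 7) ^ rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
        s1 = rotr(w[t - 2], 17) ^ rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
        w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
    return w


def compress(state, w):
    """Phase 2: run 64 compression rounds and fold into the hash state."""
    a, b, c, d, e, f, g, h = state
    for t in range(64):
        s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
        ch = (e & f) ^ (~e & g & MASK)
        t1 = (h + s1 + ch + K[t] + w[t]) & MASK
        s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (s0 + maj) & MASK
        h, g, f, e, d, c, b, a = g, f, e, (d + t1) & MASK, c, b, a, (t1 + t2) & MASK
    return [(s + v) & MASK for s, v in zip(state, (a, b, c, d, e, f, g, h))]


def sha256(message):
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", 8 * len(message))
    state = H0
    for off in range(0, len(padded), 64):
        state = compress(state, message_schedule(padded[off:off + 64]))
    return struct.pack(">8I", *state)


assert sha256(b"abc") == hashlib.sha256(b"abc").digest()
```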

Hash function extension Instructions Encoding Description Added in
SHA1RNDS4 xmm1,xmm2/m128,imm8 NP 0F 3A CC /r ib Perform Four Rounds of SHA-1 Operation Goldmont, Zen 1, LuJiaZui, Rocket Lake
SHA1NEXTE xmm1,xmm2/m128 NP 0F 38 C8 /r Calculate SHA-1 State Variable E after Four Rounds
SHA1MSG1 xmm1,xmm2/m128 NP 0F 38 C9 /r Perform an Intermediate Calculation for the Next Four SHA-1 Message Dwords
SHA1MSG2 xmm1,xmm2/m128 NP 0F 38 CA /r Perform a Final Calculation for the Next Four SHA-1 Message Dwords
SHA256RNDS2 xmm1,xmm2/m128,<XMM0> NP 0F 38 CB /r Perform Two Rounds of SHA256 Operation
SHA256MSG1 xmm1,xmm2/m128 NP 0F 38 CC /r Perform an Intermediate Calculation for the Next Four SHA-256 Message Dwords
SHA256MSG2 xmm1,xmm2/m128 NP 0F 38 CD /r Perform a Final Calculation for the Next Four SHA-256 Message Dwords
VSHA512RNDS2 ymm1,ymm2,xmm3 Perform Two Rounds of SHA-512 Operation Lunar Lake, Arrow Lake
VSHA512MSG1 ymm1,xmm2 Perform an Intermediate Calculation for the Next Four SHA-512 Message Qwords
VSHA512MSG2 ymm1,ymm2 Perform a Final Calculation for the Next Four SHA-512 Message Qwords
VSM3RNDS2 xmm1,xmm2,xmm3/m128,imm8 Perform Two Rounds of SM3 Operation Lunar Lake, Arrow Lake
VSM3MSG1 xmm1,xmm2,xmm3/m128 Perform Initial Calculation for the Next Four SM3 Message Words
VSM3MSG2 xmm1,xmm2,xmm3/m128 Perform Final Calculation for the Next Four SM3 Message Words

Intel Key Locker instructions

These instructions, available in Tiger Lake and later Intel processors, are designed to enable encryption/decryption with an AES key without having access to any unencrypted copies of the key during the actual encryption/decryption process.

Key Locker subset Instruction Encoding Description
LOADIWKEY xmm1,xmm2 Load internal wrapping key ("IWKey") from xmm1, xmm2 and XMM0.

The two explicit operands (which must be register operands) specify a 256-bit encryption key. The implicit operand in XMM0 specifies a 128-bit integrity key. EAX contains flags controlling the operation of the instruction. After being loaded, the IWKey cannot be read directly by software, but is used for the key wrapping done by ENCODEKEY128/256 and checked by the Key Locker encrypt/decrypt instructions.

LOADIWKEY is privileged and can run in Ring 0 only.

ENCODEKEY128 r32,r32 F3 0F 38 FA /r Wrap a 128-bit AES key from XMM0 into a 384-bit key handle and output this handle to XMM0-2. The source operand specifies restrictions to build into the handle.

The destination operand is initialized with information about the source and attributes of the key (this matches the value that was provided in EAX for the most recent invocation of LOADIWKEY).

These instructions may also modify XMM4-6 (zeroed out in existing implementations, but this should not be relied on).

ENCODEKEY256 r32,r32 F3 0F 38 FB /r Wrap a 256-bit AES key from XMM1:XMM0 into a 512-bit key handle and output this handle to XMM0-3.
AESENC128KL xmm,m384 F3 0F 38 DC /r Encrypt xmm using 128-bit AES key indicated by handle at m384 and store result in xmm.
AESDEC128KL xmm,m384 F3 0F 38 DD /r Decrypt xmm using 128-bit AES key indicated by handle at m384 and store result in xmm.
AESENC256KL xmm,m512 F3 0F 38 DE /r Encrypt xmm using 256-bit AES key indicated by handle at m512 and store result in xmm.
AESDEC256KL xmm,m512 F3 0F 38 DF /r Decrypt xmm using 256-bit AES key indicated by handle at m512 and store result in xmm.
AESENCWIDE128KL m384 F3 0F 38 D8 /0 Encrypt XMM0-7 using 128-bit AES key indicated by handle at m384 and store each resultant block back to its corresponding register.
AESDECWIDE128KL m384 F3 0F 38 D8 /1 Decrypt XMM0-7 using 128-bit AES key indicated by handle at m384 and store each resultant block back to its corresponding register.
AESENCWIDE256KL m512 F3 0F 38 D8 /2 Encrypt XMM0-7 using 256-bit AES key indicated by handle at m512 and store each resultant block back to its corresponding register.
AESDECWIDE256KL m512 F3 0F 38 D8 /3 Decrypt XMM0-7 using 256-bit AES key indicated by handle at m512 and store each resultant block back to its corresponding register.

VIA/Zhaoxin PadLock instructions

The VIA/Zhaoxin PadLock instructions are designed to apply cryptographic primitives in bulk, similar to the 8086 repeated string instructions. As such, unless otherwise specified, they take, as applicable, pointers to source data in ES:rSI and destination data in ES:rDI, and a data size or count in rCX. Like the old string instructions, they are all designed to be interruptible.[1][2]

PadLock subset Instruction mnemonics Encoding Description Added in
XSTORE, XSTORE-RNG 0F A7 C0 Store random bytes to ES:[rDI], and increment ES:rDI accordingly. XSTORE will store currently-available bytes, which may be from 0 to 8 bytes. REP XSTORE and REP XRNG2 will write the number of random bytes specified by rCX, waiting for the random number generator when needed. EDX specifies a "quality factor". Nehemiah
REP XSTORE, REP XSTORE-RNG F3 0F A7 C0
REP XRNG2 F3 0F A7 F8 ZhangJiang
REP XCRYPT-ECB F3 0F A7 C8 Encrypt/Decrypt data, using the AES cipher in various block modes (ECB, CBC, CFB, OFB and CTR, respectively). rCX contains the number of 16-byte blocks to encrypt/decrypt, rBX contains a pointer to an encryption key, ES:rAX a pointer to an initialization vector for block modes that need it, and ES:rDX a pointer to a control word. Nehemiah
REP XCRYPT-CBC F3 0F A7 D0
REP XCRYPT-CFB F3 0F A7 E0
REP XCRYPT-OFB F3 0F A7 E8
REP XCRYPT-CTR F3 0F A7 D8
REP XSHA1 F3 0F A6 C8 Compute a cryptographic hash (using the SHA-1 and SHA-256 functions, respectively). ES:rSI points to data to compute a hash for, ES:rDI points to a message digest and rCX specifies the number of bytes. rAX should be set to 0 at the start of a calculation. Esther
REP XSHA256 F3 0F A6 D0
REP XSHA384 F3 0F A6 D8 Perform computation of a SHA-384/SHA-512 cryptographic hash. ES:rSI points to a series of 128-byte data chunks to perform hash computation for, ES:rDI points to a 64-byte digest to update, and ECX specifies the number of chunks to process. ZhangJiang
REP XSHA512 F3 0F A6 E0
REP MONTMUL F3 0F A6 C0 Perform Montgomery Multiplication. Takes an operand width in ECX (given as a number of bits; must be in the range 256..32768 and divisible by 128) and a pointer to a data structure in ES:ESI.

When starting a new Montgomery Multiplication, EAX and the result buffer in memory must be filled with all-0s before executing the REP MONTMUL instruction. (Nonzero values are used to help resume the calculation if the instruction was interrupted.)

Esther
REP MONTMUL2 F3 0F A6 F0 Perform modular multiplication/exponentiation. Takes pointers (all using the ES: segment) to four bignum integers in registers rAX, rBX, rDX and rDI, where A and B are input numbers, M is a modulus, and R will be overwritten with the result. The operation performed is:
  • REP MONTMUL2: R:=(A*B) mod M
  • REP XMODEXP: R:=(A^B) mod M

ECX provides the size of the bignums, in number of bits (256..32768, must be divisible by 128), and ES:rSI provides a pointer to a scratchpad area to use during the calculation.

ZhangJiang
REP XMODEXP F3 0F A6 F8
CCS_HASH, CCS_SM3 F3 0F A6 E8 Compute SM3 hash, similar to the REP XSHA* instructions. The rBX register is used to specify hash function (20h for SM3 being the only documented value). ZhangJiang
CCS_ENCRYPT, CCS_SM4 F3 0F A7 F0 Encrypt/Decrypt data, using the SM4 cipher in various block modes. rCX contains the number of 16-byte blocks to encrypt/decrypt, rBX contains a pointer to an encryption key, rDX a pointer to an initialization vector for block modes that need it, and rAX contains a control word.
SM2[3] F2 0F A6 C0 Perform SM2 (public key cryptographic algorithm) function. The function to perform is specified in bits 5:0 of EDX; depending on the function, rAX/rBX/rCX/rSI/rDI may provide additional input arguments. The instruction returns a status bit in EDX bit 6 (0=success, 1=failure); depending on the function, rAX, rCX and rDI may be modified as well. KX-6000G
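
The arithmetic behind the modular multiplication and exponentiation rows above can be sketched with Python's arbitrary-precision integers. This is a reference model of the mathematics only, not of the PadLock memory layout or register interface: the function names are hypothetical, and the Montgomery radix R = 2^nbits used in montmul is an assumption (it is the conventional choice, but is not stated in the table).

```python
def check_width(nbits):
    """Enforce the operand-width rule quoted in the table."""
    if not (256 <= nbits <= 32768 and nbits % 128 == 0):
        raise ValueError("width must be 256..32768 bits and divisible by 128")


def modmul(a, b, m, nbits):
    """R := (A*B) mod M."""
    check_width(nbits)
    return (a * b) % m


def modexp(a, b, m, nbits):
    """R := (A^B) mod M."""
    check_width(nbits)
    return pow(a, b, m)


def montmul(a, b, m, nbits):
    """Montgomery product a*b*R^-1 mod m, assuming R = 2**nbits.

    Requires an odd modulus m < R and inputs a, b < m.
    """
    check_width(nbits)
    r = 1 << nbits
    m_neg_inv = (-pow(m, -1, r)) % r          # -m^-1 mod R (Python 3.8+)
    t = a * b
    u = (t + ((t * m_neg_inv) % r) * m) >> nbits  # exact division by R
    return u - m if u >= m else u
```

To use Montgomery multiplication for an ordinary product, the inputs are first converted to Montgomery form (a*R mod m); multiplying the result by 1 with montmul converts it back.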

References

  1. VIA, PadLock Programming Guide, rev 1.66, 4 Aug 2005. Archived from the original on 26 May 2010.
  2. Cite error: Invalid <ref> tag; no text was provided for refs named zhaoxin_padlock
  3. Binutils mailing list, (PATCH v1) x86: Support ZHAOXIN GMI instructions, 14 Oct 2024, see "ZX_GMI_Reference.docx" attachment for Zhaoxin-provided documentation of the SM2 instruction. Archived on 9 Nov 2024; attachment archived on 9 Nov 2024.