

# **AMD64 Technology**

# AMD64 Architecture Programmer's Manual Volume 5: 64-Bit Media and x87 Floating-Point Instructions

| Publication No. | Revision | Date           |
|-----------------|----------|----------------|
| 26569           | 3.04     | September 2003 |



© 2002, 2003 Advanced Micro Devices, Inc. All rights reserved.

The contents of this document are provided in connection with Advanced Micro Devices, Inc. ("AMD") products. AMD makes no representations or warranties with respect to the accuracy or completeness of the contents of this publication and reserves the right to make changes to specifications and product descriptions at any time without notice. No license, whether express, implied, arising by estoppel or otherwise, to any intellectual property rights is granted by this publication. Except as set forth in AMD's Standard Terms and Conditions of Sale, AMD assumes no liability whatsoever, and disclaims any express or implied warranty, relating to its products including, but not limited to, the implied warranty of merchantability, fitness for a particular purpose, or infringement of any intellectual property right.

AMD's products are not designed, intended, authorized or warranted for use as components in systems intended for surgical implant into the body, or in other applications intended to support or sustain life, or in any other application in which the failure of AMD's product could create a situation where personal injury, death, or severe property or environmental damage may occur. AMD reserves the right to discontinue or make changes to its products at any time without notice.

#### **Trademarks**

AMD, the AMD arrow logo, and combinations thereof, and 3DNow! are trademarks, and AMD-K6 is a registered trademark of Advanced Micro Devices, Inc.

MMX is a trademark of Intel Corporation.

Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.

### **Contents**

- Figures
- Tables
- Revision History
- Preface
  - About This Book
  - Audience
  - Contact Information
  - Organization
  - Definitions
  - Related Documents
- 1 64-Bit Media Instruction Reference
  - CVTPD2PI
  - CVTPI2PD
  - CVTPI2PS
  - CVTPS2PI
  - CVTTPD2PI
  - CVTTPS2PI
  - EMMS
  - FEMMS
  - FNSAVE (FSAVE)
  - FRSTOR
  - FXRSTOR
  - FXSAVE
  - MASKMOVQ
  - MOVD
  - MOVDQ2Q
  - MOVNTQ
  - MOVQ
  - PACKSSDW
  - PACKSSWB
  - PACKUSWB
  - PADDB
  - PADDD
  - PADDQ
  - PADDSB
  - PADDSW
  - PADDUSB
  - PADDUSW
  - PADDW
  - PAND
  - PANDN
  - PAVGB
  - PAVGUSB
  - PAVGW
  - PCMPEQB
  - PCMPEQD
  - PCMPEQW
  - PCMPGTB
  - PCMPGTD
  - PCMPGTW
  - PEXTRW
  - PF2ID
  - PF2IW
  - PFACC
  - PFADD
  - PFCMPEQ
  - PFCMPGE
  - PFCMPGT
  - PFMAX
  - PFMIN
  - PFMUL
  - PFNACC
  - PFPNACC
  - PFRCP
  - PFRCPIT1
  - PFRCPIT2
  - PFRSQIT1
  - PFRSQRT
  - PFSUB
  - PFSUBR
  - PI2FD
  - PI2FW
  - PINSRW
  - PMADDWD
  - PMAXSW
  - PMAXUB
  - PMINSW
  - PMINUB
  - PMOVMSKB
  - PMULHRW
  - PMULHUW
  - PMULHW
  - PMULLW
  - PMULUDQ
  - POR
  - PSADBW
  - PSHUFW
  - PSLLD
  - PSLLQ
  - PSLLW
  - PSRAD
  - PSRAW
  - PSRLD
  - PSRLQ
  - PSRLW
  - PSUBB
  - PSUBD
  - PSUBQ
  - PSUBSB
  - PSUBSW
  - PSUBUSB
  - PSUBUSW
  - PSUBW
  - PSWAPD
  - PUNPCKHBW
  - PUNPCKHDQ
  - PUNPCKHWD
  - PUNPCKLBW
  - PUNPCKLDQ
  - PUNPCKLWD
  - PXOR
- 2 x87 Floating-Point Instruction Reference
  - F2XM1
  - FABS
  - FADDx
  - FBLD
  - FBSTP
  - FCHS
  - FCLEX (FNCLEX)
  - FCMOVcc
  - FCOMx
  - FCOMIx
  - FCOS
  - FDECSTP
  - FDIVx
  - FDIVRx
  - FFREE
  - FICOMx
  - FILD
  - FINIT (FNINIT)
  - FISTx
  - FLD
  - FLD1
  - FLDCW
  - FLDENV
  - FLDL2E
  - FLDL2T
  - FLDLG2
  - FLDLN2
  - FLDPI
  - FLDZ
  - FMULx
  - FNOP
  - FPATAN
  - FPREM
  - FPREM1
  - FPTAN
  - FRNDINT
  - FRSTOR
  - FSAVE (FNSAVE)
  - FSCALE
  - FSIN
  - FSINCOS
  - FSQRT
  - FST
  - FSTP
  - FSTCW (FNSTCW)
  - FSTENV (FNSTENV)
  - FSTSW (FNSTSW)
  - FSUBx
  - FSUBRx
  - FTST
  - FUCOMx
  - FUCOMIx
  - FWAIT (WAIT)
  - FXAM
  - FXCH
  - FXRSTOR
  - FXSAVE
  - FXTRACT
  - FYL2X

## **Figures**

## **Tables**

| Table       | Title                                              |
|-------------|----------------------------------------------------|
| Table 1-1.  | Immediate-Byte Operand Encoding for 64-Bit PEXTRW  |
| Table 1-2.  | Numeric Range for PF2ID Results                    |
| Table 1-3.  | Numeric Range for PF2IW Results                    |
| Table 1-4.  | Numeric Range for PFACC Results                    |
| Table 1-5.  | Numeric Range for the PFADD Results                |
| Table 1-6.  | Numeric Range for the PFCMPEQ Instruction          |
| Table 1-7.  | Numeric Range for the PFCMPGE Instruction          |
| Table 1-8.  | Numeric Range for the PFCMPGT Instruction          |
| Table 1-9.  | Numeric Range for the PFMAX Instruction            |
| Table 1-10. | Numeric Range for the PFMIN Instruction            |
| Table 1-11. | Numeric Range for the PFMUL Instruction            |
| Table 1-12. | Numeric Range of PFNACC Results                    |
| Table 1-13. | Numeric Range of PFPNACC Result (Low Result)       |
| Table 1-14. | Numeric Range of PFPNACC Result (High Result)      |
| Table 1-15. | Numeric Range for the PFRCP Result                 |
| Table 1-16. | Numeric Range for the PFRCP Result                 |
| Table 1-17. | Numeric Range for the PFSUB Results                |
| Table 1-18. | Numeric Range for the PFSUBR Results               |
| Table 1-19. | Immediate-Byte Operand Encoding for 64-Bit PINSRW  |
| Table 1-20. | Immediate-Byte Operand Encoding for PSHUFW         |
| Table 2-1.  | Storing Numbers as Integers                        |
| Table 2-2.  | Computing Arctangent of Numbers                    |


# **Revision History**

| Date           | Revision | Description                                                                                                                                        |
|----------------|----------|----------------------------------------------------------------------------------------------------------------------------------------------------|
| September 2003 | 3.04     | Clarified x87 condition codes for FPREM and FPREM1 instructions. Corrected tables of numeric ranges for results of PF2ID and PF2IW instructions.   |
| April 2003     | 3.03     | Corrected numerous typos and stylistic errors. Corrected description of FYL2XP1 instruction. Clarified the description of the FXRSTOR instruction. |


#### **Preface**

#### **About This Book**

This book is part of a multivolume work entitled the *AMD64 Architecture Programmer's Manual*. The following table lists each volume and its order number.

| Title                                                      | Order No. |
|------------------------------------------------------------|-----------|
| Volume 1, Application Programming                          | 24592     |
| Volume 2, System Programming                               | 24593     |
| Volume 3, General-Purpose and System Instructions          | 24594     |
| Volume 4, 128-Bit Media Instructions                       | 26568     |
| Volume 5, 64-Bit Media and x87 Floating-Point Instructions | 26569     |

#### **Audience**

This volume (Volume 5) is intended for all programmers writing application or system software for a processor that implements the AMD64 architecture.

#### **Contact Information**

To submit questions or comments concerning this document, contact our technical documentation staff at AMD64.Feedback@amd.com.

#### **Organization**

Volumes 3, 4, and 5 describe the AMD64 architecture's instruction set in detail. Together, they cover each instruction's mnemonic syntax, opcodes, functions, affected flags, and possible exceptions.

The AMD64 instruction set is divided into five subsets:

- General-purpose instructions
- System instructions
- 128-bit media instructions
- 64-bit media instructions
- x87 floating-point instructions

Several instructions belong to—and are described identically in—multiple instruction subsets.

This volume describes the 64-bit media and x87 floating-point instructions. The index at the end cross-references topics within this volume. For other topics relating to the AMD64 architecture, and for information on instructions in other subsets, see the tables of contents and indexes of the other volumes.

#### **Definitions**

Many of the following definitions assume an in-depth knowledge of the legacy x86 architecture. See "Related Documents" on page xxvii for descriptions of the legacy x86 architecture.

#### **Terms and Notation**

In addition to the notation described below, "Opcode-Syntax Notation" in Volume 3 describes notation relating specifically to opcodes.

#### 1011b

A binary value—in this example, a 4-bit value.

#### F0EAh

A hexadecimal value—in this example a 2-byte value.
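
The `b` and `h` suffixes used in this notation can be read mechanically. The following is a minimal sketch, not part of the architecture definition; the function name `parse_literal` is hypothetical and chosen only for illustration.

```python
def parse_literal(text: str) -> int:
    """Parse a literal written with a trailing 'b' (binary) or 'h' (hex) suffix."""
    suffix = text[-1].lower()
    digits = text[:-1]
    if suffix == "b":
        return int(digits, 2)
    if suffix == "h":
        return int(digits, 16)
    raise ValueError(f"unsupported suffix in {text!r}")


binary_value = parse_literal("1011b")  # four binary digits, so a 4-bit value
hex_value = parse_literal("F0EAh")     # four hex digits, so a 2-byte value
```

Each hexadecimal digit carries four bits, which is why the four-digit example occupies two bytes.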

#### [1,2)

A range that includes the left-most value (in this case, 1) but excludes the right-most value (in this case, 2).
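
As an illustration of the half-open range notation, the sketch below tests membership in `[low, high)`. The helper name `in_half_open` is an assumption made for this example.

```python
def in_half_open(value: float, low: float, high: float) -> bool:
    """Return True when low <= value < high, that is, value lies in [low, high)."""
    return low <= value < high


includes_left = in_half_open(1.0, 1.0, 2.0)   # the left-most value is included
excludes_right = in_half_open(2.0, 1.0, 2.0)  # the right-most value is excluded
```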

#### 7-4

A bit range, from bit 7 to 4, inclusive. The high-order bit is shown first.
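
A bit range such as 7-4 can be extracted with a shift and a mask. The following sketch assumes an unsigned integer input; the name `bit_field` is hypothetical.

```python
def bit_field(value: int, high: int, low: int) -> int:
    """Extract bits high..low (inclusive, high-order bit first) from value."""
    if high < low:
        raise ValueError("the high-order bit must not be below the low-order bit")
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


upper_nibble = bit_field(0xA5, 7, 4)  # bits 7-4 of A5h (1010_0101b)
```

Because the range is inclusive at both ends, bits 7-4 span four bits, not three.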

#### 128-bit media instructions

Instructions that use the 128-bit XMM registers. These are a combination of the SSE and SSE2 instruction sets.

#### 64-bit media instructions

Instructions that use the 64-bit MMX registers. These are primarily a combination of MMX™ and 3DNow!™ instruction sets, with some additional instructions from the SSE and SSE2 instruction sets.

#### 16-bit mode

Legacy mode or compatibility mode in which a 16-bit address size is active. See *legacy mode* and *compatibility mode*.

#### 32-bit mode

Legacy mode or compatibility mode in which a 32-bit address size is active. See *legacy mode* and *compatibility mode*.

#### 64-bit mode

A submode of *long mode*. In 64-bit mode, the default address size is 64 bits and new features, such as register extensions, are supported for system and application software.

#### #GP(0)

Notation indicating a general-protection exception (#GP) with an error code of 0.

#### absolute

Said of a displacement that references the base of a code segment rather than an instruction pointer. Contrast with *relative*.

#### biased exponent

The sum of a floating-point value's exponent and a constant bias for a particular floating-point data type. The bias makes the range of the biased exponent always positive, which allows reciprocation without overflow.
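
To make the relationship concrete, the sketch below reads the biased exponent of a value encoded in the IEEE 754 single-precision format, whose bias is 127. The choice of single precision is an assumption for illustration; other floating-point data types use other biases. The function names are hypothetical, and the subtraction is meaningful only for normalized values.

```python
import struct

SINGLE_BIAS = 127  # exponent bias of the IEEE 754 single-precision format


def biased_exponent_single(x: float) -> int:
    """Return the 8-bit biased-exponent field of x encoded as single precision."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return (bits >> 23) & 0xFF


def unbiased_exponent_single(x: float) -> int:
    """Subtract the bias to recover the true exponent of a normalized value."""
    return biased_exponent_single(x) - SINGLE_BIAS
```

For example, 1.0 has a true exponent of 0 and therefore a biased exponent equal to the bias itself.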

#### byte

Eight bits.

#### clear

To write a bit value of 0. Compare *set*.

#### compatibility mode

A submode of *long mode*. In compatibility mode, the default address size is 32 bits, and legacy 16-bit and 32-bit applications run without modification.

#### commit

To irreversibly write, in program order, an instruction's result to software-visible storage, such as a register (including flags), the data cache, an internal write buffer, or memory.

#### CPL

Current privilege level.

#### CR0-CR4

A register range, from register CR0 through CR4, inclusive, with the low-order register first.

#### CR0.PE = 1

Notation indicating that the PE bit of the CR0 register has a value of 1.
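
The register.field notation corresponds to a single-bit test. The sketch below assumes that PE occupies bit 0 of CR0, a detail taken from the architecture definition rather than from this glossary; the register contents shown are hypothetical.

```python
CR0_PE_BIT = 0  # assumed position of the protection-enable (PE) bit in CR0


def field_bit(register_value: int, bit_position: int) -> int:
    """Return the value (0 or 1) of one bit of a register image."""
    return (register_value >> bit_position) & 1


cr0_example = 0x8000_0011  # hypothetical register contents, for illustration only
pe_is_set = field_bit(cr0_example, CR0_PE_BIT) == 1  # the condition CR0.PE = 1
```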

#### direct

Referencing a memory location whose address is included in the instruction's syntax as an immediate operand. The address may be an absolute or relative address. Compare *indirect*.

#### dirty data

Data held in the processor's caches or internal buffers that is more recent than the copy held in main memory.

#### displacement

A signed value that is added to the base of a segment (absolute addressing) or an instruction pointer (relative addressing). Same as *offset*.
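
The two uses of a displacement differ only in what the signed value is added to. The sketch below is a simplified illustration with hypothetical function names and addresses; it ignores address-size wraparound and segment limits.

```python
def absolute_target(segment_base: int, displacement: int) -> int:
    """Absolute addressing: the displacement is added to the segment base."""
    return segment_base + displacement


def relative_target(instruction_pointer: int, displacement: int) -> int:
    """Relative addressing: the displacement is added to the instruction pointer."""
    return instruction_pointer + displacement


forward = absolute_target(0x1000, 0x20)    # 32 bytes past the segment base
backward = relative_target(0x1000, -0x10)  # 16 bytes before the pointer value
```

Because the displacement is signed, a relative reference can point either forward or backward.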

#### doubleword

Two words, or four bytes, or 32 bits.

#### double quadword

Eight words, or 16 bytes, or 128 bits. Also called *octword*.

#### DS:rSI

The contents of a memory location whose segment address is in the DS register and whose offset relative to that segment is in the rSI register.

#### EFER.LME = 0

Notation indicating that the LME bit of the EFER register has a value of 0.

#### effective address size

The address size for the current instruction after accounting for the default address size and any address-size override prefix.

#### effective operand size

The operand size for the current instruction after accounting for the default operand size and any operand-size override prefix.
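
The effective address size and effective operand size are derived in the same way. The following sketch models only the case in which the default size is 16 or 32 bits and an override prefix selects the other of the two; 64-bit mode adds further rules that are not modeled here. The function name is hypothetical.

```python
def effective_size(default_size: int, override_prefix: bool) -> int:
    """Return the effective size in bits for a 16-bit or 32-bit default."""
    if default_size not in (16, 32):
        raise ValueError("this sketch only models 16-bit and 32-bit defaults")
    if not override_prefix:
        return default_size
    # An override prefix selects the non-default size of the pair.
    return 16 if default_size == 32 else 32


with_prefix = effective_size(32, override_prefix=True)
without_prefix = effective_size(32, override_prefix=False)
```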

#### element

See vector.

#### exception

An abnormal condition that occurs as the result of executing an instruction. The processor's response to an exception depends on the type of the exception. For all exceptions except 128-bit media SIMD floating-point exceptions and x87 floating-point exceptions, control is transferred to the handler (or service routine) for that exception, as defined by the exception's vector. For floating-point exceptions defined by the IEEE 754 standard, there are both masked and unmasked responses. When unmasked, the exception handler is called, and when masked, a default response is provided instead of calling the handler.
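
The masked and unmasked responses can be sketched for one IEEE 754 exception, zero-divide, whose default (masked) response is a correctly signed infinity. This is an illustrative model only: `ZeroDivideFault` is a hypothetical stand-in for a call to the exception handler, and only finite, nonzero dividends are modeled.

```python
import math


class ZeroDivideFault(Exception):
    """Hypothetical stand-in for invoking the exception handler."""


def divide(dividend: float, divisor: float, zero_divide_masked: bool) -> float:
    """Divide, modeling the masked and unmasked zero-divide responses."""
    if divisor != 0.0:
        return dividend / divisor
    if dividend == 0.0 or not math.isfinite(dividend):
        raise ValueError("only finite, nonzero dividends are modeled here")
    if zero_divide_masked:
        # Masked: supply the default response instead of calling the handler.
        sign = math.copysign(1.0, dividend) * math.copysign(1.0, divisor)
        return math.copysign(math.inf, sign)
    # Unmasked: control passes to the handler.
    raise ZeroDivideFault("unmasked zero-divide exception")
```

With the exception masked, `divide(1.0, 0.0, True)` yields positive infinity; with it unmasked, the same call raises `ZeroDivideFault`.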

#### FF /0

Notation indicating that FF is the first byte of an opcode, and a subfield in the second byte has a value of 0.
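
The check implied by this notation can be sketched as follows. The sketch assumes that the subfield in question is bits 5-3 of the second byte, a detail taken from the opcode-syntax notation rather than from this glossary; the function name is hypothetical.

```python
def matches_slash_digit(code: bytes, opcode: int, digit: int) -> bool:
    """Return True when code starts with opcode and bits 5-3 of byte 2 equal digit."""
    if len(code) < 2:
        return False
    subfield = (code[1] >> 3) & 0b111
    return code[0] == opcode and subfield == digit


is_ff_slash_0 = matches_slash_digit(bytes([0xFF, 0xC0]), 0xFF, 0)
is_ff_slash_2 = matches_slash_digit(bytes([0xFF, 0xD0]), 0xFF, 2)
```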

#### flush

An often ambiguous term meaning (1) writeback, if modified, and invalidate, as in "flush the cache line," or (2) invalidate, as in "flush the pipeline," or (3) change a value, as in "flush to zero."

#### GDT

Global descriptor table.

#### IDT

Interrupt descriptor table.

#### IGN

Ignore. Field is ignored.

#### indirect

Referencing a memory location whose address is in a register or other memory location. The address may be an absolute or relative address. Compare *direct*.

#### IRB

The virtual-8086 mode interrupt-redirection bitmap.

#### IST

The long-mode interrupt-stack table.

#### IVT

The real-address mode interrupt-vector table.

#### LDT

Local descriptor table.

#### legacy x86

The legacy x86 architecture. See "Related Documents" on page xxvii for descriptions of the legacy x86 architecture.

#### legacy mode

An operating mode of the AMD64 architecture in which existing 16-bit and 32-bit applications and operating systems run without modification. A processor implementation of the AMD64 architecture can run in either *long mode* or *legacy mode*. Legacy mode has three submodes, *real mode*, *protected mode*, and *virtual-8086 mode*.

#### long mode

An operating mode unique to the AMD64 architecture. A processor implementation of the AMD64 architecture can run in either *long mode* or *legacy mode*. Long mode has two submodes, *64-bit mode* and *compatibility mode*.
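
The relationship between the operating modes and their submodes, as defined in this glossary, can be summarized in a small lookup structure. The names `SUBMODES` and `operating_mode_of` are hypothetical and serve only this illustration.

```python
SUBMODES = {
    "long mode": ("64-bit mode", "compatibility mode"),
    "legacy mode": ("real mode", "protected mode", "virtual-8086 mode"),
}


def operating_mode_of(submode: str) -> str:
    """Return the operating mode to which a submode belongs."""
    for mode, submodes in SUBMODES.items():
        if submode in submodes:
            return mode
    raise KeyError(submode)
```

For example, `operating_mode_of("compatibility mode")` returns `"long mode"`, which reflects that compatibility mode is a submode of long mode rather than of legacy mode.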

#### lsb

Least-significant bit.

#### LSB

Least-significant byte.

#### main memory

Physical memory, such as RAM and ROM (but not cache memory), that is installed in a particular computer system.

#### mask

(1) A control bit that prevents the occurrence of a floating-point exception from invoking an exception-handling routine. (2) A field of bits used for a control purpose.

#### MBZ

Must be zero. If software attempts to set an MBZ bit to 1, a general-protection exception (#GP) occurs.

#### memory

Unless otherwise specified, main memory.

#### ModRM

A byte following an instruction opcode that specifies address calculation based on mode (Mod), register (R), and memory (M) variables.
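As a rough illustration of the Mod, R, and M variables (and of the "FF /0" notation defined earlier in this list), the sketch below splits a ModRM byte into three fields. The field positions used here (bits 7-6, 5-3, and 2-0) are the conventional x86 layout and are an assumption not stated in this glossary; `split_modrm` is a hypothetical helper name.

```python
def split_modrm(byte):
    """Split a ModRM byte into (mod, reg, rm).

    Assumed layout (not given in this glossary):
      bits 7-6  mod  addressing mode
      bits 5-3  reg  register number, or the "/digit" opcode extension
      bits 2-0  rm   register or memory operand selector
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("a ModRM value must fit in one byte")
    mod = (byte >> 6) & 0b11
    reg = (byte >> 3) & 0b111
    rm = byte & 0b111
    return mod, reg, rm


# For an opcode written as "FF /0", the second byte must carry 0 in the reg subfield.
mod, reg, rm = split_modrm(0xC0)
print(mod, reg, rm)
```

In this reading, the "/0" in "FF /0" is simply the value of the middle subfield of the byte that follows FF.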

#### moffset

A 16-, 32-, or 64-bit offset that specifies a memory operand directly, without using a ModRM or SIB byte.

#### msb

Most-significant bit.

#### MSB

Most-significant byte.

#### multimedia instructions

A combination of 128-bit media instructions and 64-bit media instructions.

#### octword

Same as double quadword.

#### offset

Same as displacement.

#### overflow

The condition in which a floating-point number is larger in magnitude than the largest, finite, positive or negative number that can be represented in the data-type format being used.

#### packed

See vector.

#### PAE

Physical-address extensions.

#### physical memory

Actual memory, consisting of main memory and cache.

#### probe

A check for an address in a processor's caches or internal buffers. *External probes* originate outside the processor, and *internal probes* originate within the processor.

#### protected mode

A submode of legacy mode.

#### quadword

Four words, or eight bytes, or 64 bits.

#### RAZ

Read as zero (0), regardless of what is written.

#### real-address mode

See real mode.

#### real mode

A short name for *real-address mode*, a submode of *legacy mode*.

#### relative

Referencing with a displacement (also called offset) from an instruction pointer rather than the base of a code segment. Contrast with *absolute*.

#### REX

An instruction prefix that specifies a 64-bit operand size and provides access to additional registers.

#### RIP-relative addressing

Addressing relative to the 64-bit RIP instruction pointer.

#### set

To write a bit value of 1. Compare *clear*.

#### SIB

A byte following an instruction opcode that specifies address calculation based on scale (S), index (I), and base (B).

#### SIMD

Single instruction, multiple data. See vector.

#### SSE

Streaming SIMD extensions instruction set. See 128-bit media instructions and 64-bit media instructions.

#### SSE2

Extensions to the SSE instruction set. See 128-bit media instructions and 64-bit media instructions.

#### sticky bit

A bit that is set or cleared by hardware and that remains in that state until explicitly changed by software.

#### TOP

The x87 top-of-stack pointer.

#### TPR

Task-priority register (CR8).

#### TSS

Task-state segment.

#### underflow

The condition in which a floating-point number is smaller in magnitude than the smallest nonzero, positive or negative number that can be represented in the data-type format being used.

#### vector

(1) A set of integer or floating-point values, called *elements*, that are packed into a single operand. Most of the 128-bit and 64-bit media instructions use vectors as operands. Vectors are also called *packed* or *SIMD* (single-instruction multiple-data) operands.

(2) An index into an interrupt descriptor table (IDT), used to access exception handlers. Compare *exception*.

#### virtual-8086 mode

A submode of legacy mode.

#### word

Two bytes, or 16 bits.

#### *x*86

See *legacy* x86.

#### Registers

In the following list of registers, the names are used to refer either to a given register or to the contents of that register:

#### AH-DH

The high 8-bit AH, BH, CH, and DH registers. Compare *AL-DL*.

#### AL-DL

The low 8-bit AL, BL, CL, and DL registers. Compare *AH–DH*.

#### AL-r15B

The low 8-bit AL, BL, CL, DL, SIL, DIL, BPL, SPL, and R8B-R15B registers, available in 64-bit mode.

#### BP

Base pointer register.

#### CRn

Control register number n.

#### CS

Code segment register.

#### eAX-eSP

The 16-bit AX, BX, CX, DX, DI, SI, BP, and SP registers or the 32-bit EAX, EBX, ECX, EDX, EDI, ESI, EBP, and ESP registers. Compare *rAX-rSP*.

#### EBP

Extended base pointer register.

#### **EFER**

Extended features enable register.

#### eFLAGS

16-bit or 32-bit flags register. Compare *rFLAGS*.

#### EFLAGS

32-bit (extended) flags register.

#### eIP

16-bit or 32-bit instruction-pointer register. Compare *rIP*.

#### EIP

32-bit (extended) instruction-pointer register.

#### FLAGS

16-bit flags register.

#### GDTR

Global descriptor table register.

#### GPRs

General-purpose registers. For the 16-bit data size, these are AX, BX, CX, DX, DI, SI, BP, and SP. For the 32-bit data size, these are EAX, EBX, ECX, EDX, EDI, ESI, EBP, and ESP. For the 64-bit data size, these include RAX, RBX, RCX, RDX, RDI, RSI, RBP, RSP, and R8–R15.

#### IDTR

Interrupt descriptor table register.

#### IP

16-bit instruction-pointer register.

#### LDTR

Local descriptor table register.

#### MSR

Model-specific register.

#### r8-r15

The 8-bit R8B-R15B registers, or the 16-bit R8W-R15W registers, or the 32-bit R8D-R15D registers, or the 64-bit R8-R15 registers.

#### rAX-rSP

The 16-bit AX, BX, CX, DX, DI, SI, BP, and SP registers, or the 32-bit EAX, EBX, ECX, EDX, EDI, ESI, EBP, and ESP registers, or the 64-bit RAX, RBX, RCX, RDX, RDI, RSI, RBP, and RSP registers. Replace the placeholder *r* with nothing for 16-bit size, "E" for 32-bit size, or "R" for 64-bit size.

#### RAX

64-bit version of the EAX register.

#### RBP

64-bit version of the EBP register.

#### RBX

64-bit version of the EBX register.

#### RCX

64-bit version of the ECX register.

#### RDI

64-bit version of the EDI register.

#### RDX

64-bit version of the EDX register.

#### *rFLAGS*

16-bit, 32-bit, or 64-bit flags register. Compare *RFLAGS*.

#### **RFLAGS**

64-bit flags register. Compare *rFLAGS*.

#### rIP

16-bit, 32-bit, or 64-bit instruction-pointer register. Compare *RIP*.

#### RIP

64-bit instruction-pointer register.

#### RSI

64-bit version of the ESI register.

#### RSP

64-bit version of the ESP register.

#### SP

Stack pointer register.

#### SS

Stack segment register.

#### TPR

Task priority register, a new register introduced in the AMD64 architecture to speed interrupt management.

#### TR

Task register.

#### **Endian Order**

The x86 and AMD64 architectures address memory using little-endian byte-ordering. Multibyte values are stored with their least-significant byte at the lowest byte address, and they are illustrated with their least-significant byte at the right side. Strings are illustrated in reverse order, because the addresses of their bytes increase from right to left.
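The byte ordering described above can be checked with a short sketch. `memory_image` is a hypothetical helper that returns the bytes of a value in increasing address order, as a little-endian machine would store them; the sample value is arbitrary.

```python
def memory_image(value, size):
    """Return the bytes of an unsigned value as stored in memory,
    lowest byte address first, using little-endian ordering."""
    return value.to_bytes(size, "little")


stored = memory_image(0x11223344, 4)

# The least-significant byte (44h) sits at the lowest address.
for offset, byte in enumerate(stored):
    print(f"address +{offset}: {byte:02X}h")

# Reading the bytes back in the same order recovers the original value.
assert int.from_bytes(stored, "little") == 0x11223344
```

Written left to right in address order the bytes appear reversed (44 33 22 11), which is why the manual illustrates values with the least-significant byte on the right.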

#### **Related Documents**

- Peter Abel, *IBM PC Assembly Language and Programming*, Prentice-Hall, Englewood Cliffs, NJ, 1995.
- Rakesh Agarwal, *80x86 Architecture & Programming: Volume II*, Prentice-Hall, Englewood Cliffs, NJ, 1991.
- AMD, *AMD-K6*<sup>TM</sup> *MMX*<sup>TM</sup> *Enhanced Processor Multimedia Technology*, Sunnyvale, CA, 2000.
- AMD, *3DNow!*<sup>TM</sup> *Technology Manual*, Sunnyvale, CA, 2000.
- AMD, *AMD Extensions to the 3DNow!*<sup>TM</sup> *and MMX*<sup>TM</sup> *Instruction Sets*, Sunnyvale, CA, 2000.
- Don Anderson and Tom Shanley, *Pentium Processor System Architecture*, Addison-Wesley, New York, 1995.
- Nabajyoti Barkakati and Randall Hyde, *Microsoft Macro Assembler Bible*, Sams, Carmel, Indiana, 1992.
- Barry B. Brey, *8086/8088, 80286, 80386, and 80486 Assembly Language Programming*, Macmillan Publishing Co., New York, 1994.
- Barry B. Brey, *Programming the 80286, 80386, 80486, and Pentium Based Personal Computer*, Prentice-Hall, Englewood Cliffs, NJ, 1995.
- Ralf Brown and Jim Kyle, *PC Interrupts*, Addison-Wesley, New York, 1994.
- Penn Brumm and Don Brumm, *80386/80486 Assembly Language Programming*, Windcrest McGraw-Hill, 1993.
- Geoff Chappell, *DOS Internals*, Addison-Wesley, New York, 1994.

- Chips and Technologies, Inc., *Super386 DX Programmer's Reference Manual*, Chips and Technologies, Inc., San Jose, 1992.
- John Crawford and Patrick Gelsinger, *Programming the 80386*, Sybex, San Francisco, 1987.
- Cyrix Corporation, *5x86 Processor BIOS Writer's Guide*, Cyrix Corporation, Richardson, TX, 1995.
- Cyrix Corporation, *M1 Processor Data Book*, Cyrix Corporation, Richardson, TX, 1996.
- Cyrix Corporation, *MX Processor MMX Extension Opcode Table*, Cyrix Corporation, Richardson, TX, 1996.
- Cyrix Corporation, *MX Processor Data Book*, Cyrix Corporation, Richardson, TX, 1997.
- Ray Duncan, *Extending DOS: A Programmer's Guide to Protected-Mode DOS*, Addison Wesley, NY, 1991.
- William B. Giles, *Assembly Language Programming for the Intel 80xxx Family*, Macmillan, New York, 1991.
- Frank van Gilluwe, *The Undocumented PC*, Addison-Wesley, New York, 1994.
- John L. Hennessy and David A. Patterson, *Computer Architecture*, Morgan Kaufmann Publishers, San Mateo, CA, 1996.
- Thom Hogan, *The Programmer's PC Sourcebook*, Microsoft Press, Redmond, WA, 1991.
- Hal Katircioglu, *Inside the 486, Pentium, and Pentium Pro*, Peer-to-Peer Communications, Menlo Park, CA, 1997.
- IBM Corporation, *486SLC Microprocessor Data Sheet*, IBM Corporation, Essex Junction, VT, 1993.
- IBM Corporation, *486SLC2 Microprocessor Data Sheet*, IBM Corporation, Essex Junction, VT, 1993.
- IBM Corporation, *80486DX2 Processor Floating Point Instructions*, IBM Corporation, Essex Junction, VT, 1995.
- IBM Corporation, *80486DX2 Processor BIOS Writer's Guide*, IBM Corporation, Essex Junction, VT, 1995.
- IBM Corporation, *Blue Lightening 486DX2 Data Book*, IBM Corporation, Essex Junction, VT, 1994.
- Institute of Electrical and Electronics Engineers, *IEEE Standard for Binary Floating-Point Arithmetic*, ANSI/IEEE Std 754-1985.

- Institute of Electrical and Electronics Engineers, *IEEE Standard for Radix-Independent Floating-Point Arithmetic*, ANSI/IEEE Std 854-1987.
- Muhammad Ali Mazidi and Janice Gillispie Mazidi, *80X86 IBM PC and Compatible Computers*, Prentice-Hall, Englewood Cliffs, NJ, 1997.
- Hans-Peter Messmer, *The Indispensable Pentium Book*, Addison-Wesley, New York, 1995.
- Karen Miller, *An Assembly Language Introduction to Computer Architecture: Using the Intel Pentium*, Oxford University Press, New York, 1999.
- Stephen Morse, Eric Isaacson, and Douglas Albert, *The 80386/387 Architecture*, John Wiley & Sons, New York, 1987.
- NexGen Inc., *Nx586 Processor Data Book*, NexGen Inc., Milpitas, CA, 1993.
- NexGen Inc., *Nx686 Processor Data Book*, NexGen Inc., Milpitas, CA, 1994.
- Bipin Patwardhan, *Introduction to the Streaming SIMD Extensions in the Pentium III*, www.x86.org/articles/sse\_pt1/simd1.htm, June, 2000.
- Peter Norton, Peter Aitken, and Richard Wilton, *PC Programmer's Bible*, Microsoft Press, Redmond, WA, 1993.
- *PharLap 386\ASM Reference Manual*, Pharlap, Cambridge MA, 1993.
- *PharLap TNT DOS-Extender Reference Manual*, Pharlap, Cambridge MA, 1995.
- Sen-Cuo Ro and Sheau-Chuen Her, *i386/i486 Advanced Programming*, Van Nostrand Reinhold, New York, 1993.
- Jeffrey P. Royer, *Introduction to Protected Mode Programming*, course materials for an onsite class, 1992.
- Tom Shanley, *Protected Mode System Architecture*, Addison Wesley, NY, 1996.
- SGS-Thomson Corporation, *80486DX Processor SMM Programming Manual*, SGS-Thomson Corporation, 1995.
- Walter A. Triebel, *The 80386DX Microprocessor*, Prentice-Hall, Englewood Cliffs, NJ, 1992.
- John Wharton, *The Complete x86*, MicroDesign Resources, Sebastopol, California, 1994.
- Web sites and newsgroups:
  - www.amd.com
  - news.comp.arch
  - news.comp.lang.asm.x86
  - news.intel.microprocessors
  - news.microsoft

#### 1 64-Bit Media Instruction Reference

This chapter describes the function, mnemonic syntax, opcodes, affected flags, and possible exceptions generated by the 64-bit media instructions. These instructions operate on data located in the 64-bit MMX registers. Most of the instructions operate in parallel on sets of packed elements called *vectors*, although some operate on scalars. The instructions define both integer and floating-point operations, and include the legacy MMX<sup>TM</sup> instructions, the 3DNow!<sup>TM</sup> instructions, and the AMD extensions to the MMX and 3DNow! instruction sets.

Each instruction that performs a vector (packed) operation is illustrated with a diagram. Figure 1-1 on page 2 shows the conventions used in these diagrams. The particular diagram shows the PSLLW (packed shift left logical words) instruction.

Arrowheads going to a source operand indicate the writing of the result. In this case, the result is written to the first source operand, which is also the destination operand.



Arrowheads coming *from* a source operand indicate that the source operand provides a *control function*. In this case, the second source operand specifies the *number* of bits to shift, and the first source operand specifies the *data* to be shifted.

Ellipses indicate that the operation is repeated for each element of the source vectors. In this case, there are 4 elements in each source vector, so the operation is performed 4 times, in parallel.

Figure 1-1. Diagram Conventions for 64-Bit Media Instructions

*Gray* areas in diagrams indicate unmodified operand bits.
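The element-by-element behavior that the PSLLW diagram conveys can be approximated in software. The following is a minimal sketch, not the hardware definition: `psllw` is a hypothetical helper that treats a 64-bit integer as four 16-bit words and shifts each one independently. Returning zero for shift counts above 15 is an assumption taken from the usual definition of PSLLW and is not stated in this section.

```python
def psllw(data, count):
    """Emulate a packed shift-left-logical on four 16-bit words.

    data  : 64-bit unsigned integer holding the four words to shift
    count : number of bit positions to shift each word
    """
    result = 0
    for i in range(4):                      # one pass per vector element
        word = (data >> (16 * i)) & 0xFFFF  # extract element i
        if count > 15:
            shifted = 0                     # assumed: every bit is shifted out
        else:
            shifted = (word << count) & 0xFFFF  # bits never cross into a neighbor
        result |= shifted << (16 * i)
    return result


print(f"{psllw(0x0001_8000_00FF_1234, 4):016X}")
```

The mask applied to each shifted word is what distinguishes a packed shift from an ordinary 64-bit shift: bits shifted out of one element are discarded rather than carried into the next.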

Like the 128-bit media instructions, many of the 64-bit instructions independently and simultaneously perform a single operation on multiple elements of a vector and are thus classified as *single-instruction*, *multiple-data* (SIMD) instructions. A few 64-bit media instructions convert operands in MMX registers to operands in GPR, XMM, or x87 registers (or vice versa), or save or restore MMX state, or reset x87 state.

Hardware support for a specific 64-bit media instruction depends on the presence of at least one of the following CPUID functions:

- MMX Instructions, indicated by bit 23 of CPUID standard function 1 and extended function 8000\_0001h.
- AMD Extensions to MMX Instructions, indicated by bit 22 of CPUID extended function 8000\_0001h.
- SSE, indicated by bit 25 of CPUID standard function 1.
- SSE2, indicated by bit 26 of CPUID standard function 1.
- AMD 3DNow! Instructions, indicated by bit 31 of CPUID extended function 8000\_0001h.
- AMD Extensions to 3DNow! Instructions, indicated by bit 30 of CPUID extended function 8000\_0001h.
- FXSAVE and FXRSTOR, indicated by bit 24 of CPUID standard function 1 and extended function 8000\_0001h.

The 64-bit media instructions can be used in legacy mode or long mode. Their use in long mode is available if the following CPUID function is set:

- Long Mode, indicated by bit 29 of CPUID extended function 8000\_0001h.
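The feature bits listed above can be tested with simple masking once the CPUID results are available. The sketch below assumes the bits are reported in the EDX register for both functions (this chapter does not name the register) and takes the two EDX values as plain integers, since Python cannot execute CPUID directly. `decode_features` and the feature labels are hypothetical names chosen for illustration.

```python
# Bit positions as listed in this chapter. Reporting in EDX is an assumption.
STANDARD_FN1_BITS = {"MMX": 23, "FXSR": 24, "SSE": 25, "SSE2": 26}
EXTENDED_FN_BITS = {
    "MMXExt": 22,     # AMD extensions to MMX
    "MMX": 23,
    "FXSR": 24,       # FXSAVE and FXRSTOR
    "LongMode": 29,
    "3DNowExt": 30,   # AMD extensions to 3DNow!
    "3DNow": 31,
}


def decode_features(std_edx, ext_edx):
    """Map feature names to booleans from the two (assumed) EDX values."""
    features = {}
    for name, bit in STANDARD_FN1_BITS.items():
        features[name] = bool((std_edx >> bit) & 1)
    for name, bit in EXTENDED_FN_BITS.items():
        # A feature listed under both functions counts as present if either bit is set.
        features[name] = features.get(name, False) or bool((ext_edx >> bit) & 1)
    return features


# Made-up register values: MMX and SSE2 in function 1, long mode in 8000_0001h.
sample = decode_features((1 << 23) | (1 << 26), 1 << 29)
print(sample)
```

A program would normally check the bit for the specific instruction subset it needs before executing any instruction from that subset.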

Compilation of 64-bit media programs for execution in 64-bit mode offers four primary advantages: access to the eight extended, 64-bit general-purpose registers (for a register set consisting of GPR0–GPR15), access to the eight extended XMM registers (for a register set consisting of XMM0–XMM15), access to the 64-bit virtual address space, and access to the RIP-relative addressing mode.

#### For further information, see:

- "64-Bit Media Programming" in volume 1.
- "Summary of Registers and Data Types" in volume 3.
- "Notation" in volume 3.
- "Instruction Prefixes" in volume 3.

### **CVTPD2PI**

# **Convert Packed Double-Precision Floating-Point to Packed Doubleword Integers**

Converts two packed double-precision floating-point values in an XMM register or a 128-bit memory location to two packed 32-bit signed integer values and writes the converted values in an MMX register.

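| Mnemonic                   | Opcode      | Description                                                                                                                                                               |
|----------------------------|-------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| CVTPD2PI mmx, xmm2/mem128  | 66 0F 2D /r | Converts packed double-precision floating-point values in an XMM register or 128-bit memory location to packed doubleword integer values in the destination MMX register. |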



If the result of the conversion is an inexact value, the value is rounded as specified by the rounding control bits (RC) in the MXCSR register. If the floating-point value is a NaN or infinity, or if the result of the conversion falls outside the range of a signed doubleword ($-2^{31}$ to $+2^{31} - 1$), the instruction returns the 32-bit indefinite integer value (8000\_0000h) when the invalid-operation exception (IE) is masked.
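The masked-exception behavior described above can be modeled in a few lines. This is a sketch under stated assumptions, not a description of the hardware: it assumes round-to-nearest-even (Python's `round` happens to round halves to even), assumes both exceptions are masked, and assumes the low source element lands in bits 31-0 of the result. `convert_element` and `cvtpd2pi` are hypothetical helper names.

```python
import math

INDEFINITE = 0x8000_0000  # 32-bit indefinite integer value


def convert_element(x):
    """Convert one double to a 32-bit result; return (bits, ie, pe)."""
    if math.isnan(x) or math.isinf(x):
        return INDEFINITE, True, False       # invalid operation (IE)
    rounded = round(x)                       # assumed: round-to-nearest-even
    if not -2**31 <= rounded <= 2**31 - 1:
        return INDEFINITE, True, False       # result does not fit (IE)
    inexact = rounded != x                   # precision exception (PE)
    return rounded & 0xFFFF_FFFF, False, inexact


def cvtpd2pi(low, high):
    """Model the 64-bit result and the IE/PE flags for two source doubles."""
    lo_bits, ie0, pe0 = convert_element(low)
    hi_bits, ie1, pe1 = convert_element(high)
    return (hi_bits << 32) | lo_bits, ie0 or ie1, pe0 or pe1


value, ie, pe = cvtpd2pi(2.5, -1.0)
print(f"{value:016X} IE={ie} PE={pe}")
```

The two returned flags correspond to the IE and PE entries marked as modified in the MXCSR table for this instruction.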

#### **Related Instructions**

CVTDQ2PD, CVTPD2DQ, CVTPI2PD, CVTSD2SI, CVTSI2SD, CVTTPD2DQ, CVTTPD2PI, CVTTSD2SI

#### rFLAGS Affected

None

#### **MXCSR Flags Affected**

| FZ | R  | C  | PM | UM | OM | ZM | DM | IM | DAZ | PE | UE | OE | ZE | DE | IE |
|----|----|----|----|----|----|----|----|----|-----|----|----|----|----|----|----|
|    |    |    |    |    |    |    |    |    |     | M  |    |    |    |    | M  |
| 15 | 14 | 13 | 12 | 11 | 10 | 9  | 8  | 7  | 6   | 5  | 4  | 3  | 2  | 1  | 0  |

Note:

A flag that can be set to one or zero is M (modified). Unaffected flags are blank.
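The bit positions in the table above can be used to pick individual fields out of an MXCSR value. The sketch below is illustrative only: `decode_mxcsr` is a hypothetical helper, it treats RC as the two-bit field in bits 14-13, and the value 1F80h used in the example is assumed here to be the customary power-on default (all exception masks set).

```python
# Single-bit fields, keyed by the names and bit numbers shown in the table.
MXCSR_BITS = {
    "IE": 0, "DE": 1, "ZE": 2, "OE": 3, "UE": 4, "PE": 5,   # exception flags
    "DAZ": 6,
    "IM": 7, "DM": 8, "ZM": 9, "OM": 10, "UM": 11, "PM": 12,  # exception masks
    "FZ": 15,
}


def decode_mxcsr(value):
    """Return a dict of MXCSR fields extracted from a 32-bit value."""
    fields = {name: (value >> bit) & 1 for name, bit in MXCSR_BITS.items()}
    fields["RC"] = (value >> 13) & 0b11  # rounding control, bits 14-13
    return fields


state = decode_mxcsr(0x1F80)  # assumed default value
print("masks:", [state[m] for m in ("IM", "DM", "ZM", "OM", "UM", "PM")])
print("flags:", [state[f] for f in ("IE", "DE", "ZE", "OE", "UE", "PE")])
print("RC:", state["RC"])
```

For the conversion instruction above, only the PE and IE entries would be expected to change after execution.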

#### **Exceptions**

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                    |
|----------------------------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | X    | X               | X         | The SSE2 instructions are not supported, as indicated by bit 26 of CPUID standard function 1.                                         |
|                                              | X    | X               | X         | The emulate bit (EM) of CR0 was set to 1.                                                                                             |
|                                              | X    | X               | X         | The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 was cleared to 0.                                                     |
|                                              | X    | X               | X         | There was an unmasked SIMD floating-point exception while CR4.OSXMMEXCPT = 0. See SIMD Floating-Point Exceptions, below, for details. |
| Device not available, #NM                    | X    | X               | X         | The task-switch bit (TS) of CR0 was set to 1.                                                                                         |
| Stack, #SS                                   | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                                                               |
| General protection, #GP                      | X    | X               | X         | A memory address exceeded a data segment limit or was non-canonical.                                                                  |
|                                              |      |                 | X         | A null data segment was used to reference memory.                                                                                     |
|                                              | X    | X               | X         | The memory operand was not aligned on a 16-byte boundary.                                                                             |
| Page fault, #PF                              |      | X               | X         | A page fault resulted from the execution of the instruction.                                                                          |
| x87 floating-point<br>exception pending, #MF | X    | X               | X         | An exception was pending due to an x87 floating-point instruction.                                                                    |
| SIMD Floating-Point<br>Exception, #XF        | X    | X               | X         | There was an unmasked SIMD floating-point exception while CR4.OSXMMEXCPT = 1. See SIMD Floating-Point Exceptions, below, for details. |
| SIMD Floating-Point Exceptions               |      |                 |           |                                                                                                                                       |
| Invalid-operation exception (IE)             | X    | X               | X         | A source operand was an SNaN value, a QNaN value, or ±infinity.                                                                       |
|                                              | X    | X               | X         | A source operand was too large to fit in the destination format.                                                                      |
| Precision exception (PE)                     | X    | X               | X         | A result could not be represented exactly in the destination format.                                                                  |

### **CVTPI2PD**

# **Convert Packed Doubleword Integers to Packed Double-Precision Floating-Point**

Converts two packed 32-bit signed integer values in an MMX register or a 64-bit memory location to two double-precision floating-point values and writes the converted values in an XMM register.

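| Mnemonic                | Opcode      | Description                                                                                                                                                                      |
|-------------------------|-------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| CVTPI2PD xmm, mmx/mem64 | 66 0F 2A /r | Converts two packed doubleword integer values in an MMX register or 64-bit memory location to two packed double-precision floating-point values in the destination XMM register. |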



#### **Related Instructions**

CVTDQ2PD, CVTPD2DQ, CVTPD2PI, CVTSD2SI, CVTSI2SD, CVTTPD2DQ, CVTTPD2PI, CVTTSD2SI

#### rFLAGS Affected

None

#### **MXCSR Flags Affected**

None

#### **Exceptions**

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                            |
|----------------------------------------------|------|-----------------|-----------|-----------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | X    | X               | X         | The SSE2 instructions are not supported, as indicated by bit 26 of CPUID standard function 1. |
|                                              | X    | X               | X         | The emulate bit (EM) of CR0 was set to 1.                                                     |
|                                              | X    | X               | X         | The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 was cleared to 0.             |
| Device not available, #NM                    | X    | X               | X         | The task-switch bit (TS) of CR0 was set to 1.                                                 |
| Stack, #SS                                   | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                       |
| General protection, #GP                      | X    | X               | X         | A memory address exceeded a data segment limit or was non-canonical.                          |
|                                              |      |                 | X         | A null data segment was used to reference memory.                                             |
| Page fault, #PF                              |      | X               | X         | A page fault resulted from the execution of the instruction.                                  |
| x87 floating-point<br>exception pending, #MF | X    | X               | X         | An exception was pending due to an x87 floating-point instruction.                            |
| Alignment check, #AC                         |      | X               | X         | An unaligned memory reference was performed while alignment checking was enabled.             |

### **CVTPI2PS**

# **Convert Packed Doubleword Integers to Packed Single-Precision Floating-Point**

Converts two packed 32-bit signed integer values in an MMX register or a 64-bit memory location to two single-precision floating-point values and writes the converted values in the low-order 64 bits of an XMM register. The high-order 64 bits of the XMM register are not modified.

| Mnemonic                | Opcode   | Description                                                                                                                                                       |
|-------------------------|----------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| CVTPI2PS xmm, mmx/mem64 | 0F 2A /r | Converts packed doubleword integer values in an MMX register or 64-bit memory location to single-precision floating-point values in the destination XMM register. |
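Two points in the description above lend themselves to a small model: the high-order 64 bits of the destination are preserved, and a 32-bit integer does not always fit exactly in a single-precision value. The sketch below is a rough illustration, assuming round-to-nearest and assuming the low integer maps to bits 31-0 of the destination. `int32_to_single` and `cvtpi2ps` are hypothetical helper names, and registers are modeled as plain Python integers.

```python
import struct


def int32_to_single(n):
    """Return (single-precision bit pattern, inexact flag) for a signed integer."""
    packed = struct.pack("<f", float(n))        # rounds to the nearest single
    bits = struct.unpack("<I", packed)[0]
    round_trip = struct.unpack("<f", packed)[0]
    return bits, round_trip != n                # inexact models the PE flag


def to_signed32(v):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    return v - (1 << 32) if v & 0x8000_0000 else v


def cvtpi2ps(xmm, mmx):
    """Model the 128-bit destination after converting two packed integers."""
    lo_bits, pe0 = int32_to_single(to_signed32(mmx & 0xFFFF_FFFF))
    hi_bits, pe1 = int32_to_single(to_signed32((mmx >> 32) & 0xFFFF_FFFF))
    upper = xmm & ~((1 << 64) - 1)              # high-order 64 bits are kept
    return upper | (hi_bits << 32) | lo_bits, pe0 or pe1


result, pe = cvtpi2ps((0x1234 << 64) | 0xDEAD, (0xFFFF_FFFF << 32) | 1)
print(f"{result:032X} PE={pe}")
```

An integer such as 16777217 needs 25 significant bits, one more than single precision provides, so converting it is the kind of case that would raise the precision exception.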



#### **Related Instructions**

CVTDQ2PS, CVTPS2DQ, CVTPS2PI, CVTSI2SS, CVTSS2SI, CVTTPS2DQ, CVTTPS2PI, CVTTSS2SI

#### rFLAGS Affected

None

#### **MXCSR Flags Affected**

| FZ | R  | C  | PM | UM | OM | ZM | DM | IM | DAZ | PE | UE | OE | ZE | DE | IE |
|----|----|----|----|----|----|----|----|----|-----|----|----|----|----|----|----|
|    |    |    |    |    |    |    |    |    |     | M  |    |    |    |    |    |
| 15 | 14 | 13 | 12 | 11 | 10 | 9  | 8  | 7  | 6   | 5  | 4  | 3  | 2  | 1  | 0  |

Note:

A flag that can be set to one or zero is M (modified). Unaffected flags are blank.

#### **Exceptions**

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                    |
|----------------------------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | X    | X               | X         | The SSE instructions are not supported, as indicated by bit 25 of CPUID standard function 1.                                          |
|                                              | X    | X               | X         | The emulate bit (EM) of CR0 was set to 1.                                                                                             |
|                                              | X    | X               | X         | The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 was cleared to 0.                                                     |
|                                              | X    | X               | X         | There was an unmasked SIMD floating-point exception while CR4.OSXMMEXCPT = 0. See SIMD Floating-Point Exceptions, below, for details. |
| Device not available, #NM                    | X    | X               | X         | The task-switch bit (TS) of CR0 was set to 1.                                                                                         |
| Stack, #SS                                   | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                                                               |
| General protection, #GP                      | X    | X               | X         | A memory address exceeded a data segment limit or was non-canonical.                                                                  |
|                                              |      |                 | X         | A null data segment was used to reference memory.                                                                                     |
| Page fault, #PF                              |      | X               | X         | A page fault resulted from the execution of the instruction.                                                                          |
| x87 floating-point<br>exception pending, #MF | X    | X               | X         | An exception was pending due to an x87 floating-point instruction.                                                                    |
| Alignment check, #AC                         |      | X               | X         | An unaligned memory reference was performed while alignment checking was enabled.                                                     |
| SIMD Floating-Point<br>Exception, #XF        | X    | X               | X         | There was an unmasked SIMD floating-point exception while CR4.OSXMMEXCPT = 1. See SIMD Floating-Point Exceptions, below, for details. |
| SIMD Floating-Point Exceptions               |      |                 |           |                                                                                                                                       |
| Precision exception (PE)                     | X    | X               | X         | A result could not be represented exactly in the destination format.                                                                  |

### **CVTPS2PI**

# **Convert Packed Single-Precision Floating-Point to Packed Doubleword Integers**

Converts two packed single-precision floating-point values in the low-order 64 bits of an XMM register or a 64-bit memory location to two packed 32-bit signed integers and writes the converted values in an MMX register.

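| Mnemonic                | Opcode   | Description                                                                                                                                                        |
|-------------------------|----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| CVTPS2PI mmx, xmm/mem64 | 0F 2D /r | Converts packed single-precision floating-point values in an XMM register or 64-bit memory location to packed doubleword integers in the destination MMX register. |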



If the result of the conversion is an inexact value, the value is rounded as specified by the rounding control bits (RC) in the MXCSR register. If the floating-point value is a NaN or infinity, or if the result of the conversion falls outside the range of a signed doubleword ($-2^{31}$ to $+2^{31} - 1$), the instruction returns the 32-bit indefinite integer value (8000\_0000h) when the invalid-operation exception (IE) is masked.
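How the RC field changes an inexact result can be shown with a small table-driven sketch. The encoding used here (00 = nearest even, 01 = toward negative infinity, 10 = toward positive infinity, 11 = toward zero) is the conventional MXCSR encoding and is an assumption not spelled out in this section. `round_with_rc` is a hypothetical helper, and the NaN, infinity, and out-of-range cases are deliberately left out.

```python
import math


def round_with_rc(x, rc):
    """Round a finite value to an integer according to a 2-bit RC setting."""
    if rc == 0b00:
        return round(x)        # nearest, ties to even
    if rc == 0b01:
        return math.floor(x)   # toward negative infinity
    if rc == 0b10:
        return math.ceil(x)    # toward positive infinity
    if rc == 0b11:
        return math.trunc(x)   # toward zero
    raise ValueError("RC is a two-bit field")


# -2.5 and 2.5 are exactly representable in single precision.
for value in (-2.5, 2.5):
    print(value, [round_with_rc(value, rc) for rc in range(4)])
```

The same source value can therefore produce different integers depending on how software has programmed MXCSR before the conversion executes.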

#### **Related Instructions**

CVTDQ2PS, CVTPI2PS, CVTPS2DQ, CVTSI2SS, CVTSS2SI, CVTTPS2DQ, CVTTPS2PI, CVTTSS2SI

#### **rFLAGS Affected**

None

#### **MXCSR Flags Affected**

| FZ | R  | C  | PM | UM | OM | ZM | DM | IM | DAZ | PE | UE | OE | ZE | DE | IE |
|----|----|----|----|----|----|----|----|----|-----|----|----|----|----|----|----|
|    |    |    |    |    |    |    |    |    |     | M  |    |    |    |    | M  |
| 15 | 14 | 13 | 12 | 11 | 10 | 9  | 8  | 7  | 6   | 5  | 4  | 3  | 2  | 1  | 0  |

Note:

A flag that can be set to one or zero is M (modified). Unaffected flags are blank.

#### **Exceptions**

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                    |
|----------------------------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | X    | X               | X         | The SSE instructions are not supported, as indicated by bit 25 of CPUID standard function 1.                                          |
|                                              | X    | X               | X         | The emulate bit (EM) of CR0 was set to 1.                                                                                             |
|                                              | X    | X               | X         | The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 was cleared to 0.                                                     |
|                                              | X    | X               | X         | There was an unmasked SIMD floating-point exception while CR4.OSXMMEXCPT = 0. See SIMD Floating-Point Exceptions, below, for details. |
| Device not available, #NM                    | X    | X               | X         | The task-switch bit (TS) of CR0 was set to 1.                                                                                         |
| Stack, #SS                                   | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                                                               |
| General protection, #GP                      | X    | X               | X         | A memory address exceeded a data segment limit or was non-canonical.                                                                  |
|                                              |      |                 | X         | A null data segment was used to reference memory.                                                                                     |
| Page fault, #PF                              |      | X               | X         | A page fault resulted from the execution of the instruction.                                                                          |
| x87 floating-point<br>exception pending, #MF | X    | X               | X         | An exception was pending due to an x87 floating-point instruction.                                                                    |
| Alignment check, #AC                         |      | X               | X         | An unaligned memory reference was performed while alignment checking was enabled.                                                     |
| SIMD Floating-Point<br>Exception, #XF        | X    | X               | X         | There was an unmasked SIMD floating-point exception while CR4.OSXMMEXCPT = 1. See SIMD Floating-Point Exceptions, below, for details. |


| Exception                        | Real | Virtual<br>8086 | Protected | Cause of Exception                                                   |
|----------------------------------|------|-----------------|-----------|----------------------------------------------------------------------|
| SIMD Floating-Point Exceptions   |      |                 |           |                                                                      |
| Invalid-operation exception (IE) | X    | X               | X         | A source operand was an SNaN value, a QNaN value, or ±infinity.      |
|                                  | X    | X               | X         | A source operand was too large to fit in the destination format.     |
| Precision exception (PE)         | X    | X               | X         | A result could not be represented exactly in the destination format. |
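The tables above say that an unmasked SIMD floating-point exception is delivered as #XF when CR4.OSXMMEXCPT = 1 and as #UD when it is 0. The following sketch models that decision. The function name is hypothetical, and it relies on the MXCSR layout shown earlier, in which each mask bit (bits 12-7) sits seven positions above its flag bit (bits 5-0).

```python
from typing import Optional


def simd_fp_exception_vector(mxcsr: int, raised: int,
                             osxmmexcpt: bool) -> Optional[str]:
    """Return the exception reported for a set of raised SIMD FP conditions.

    mxcsr      -- current MXCSR value (only the mask bits 12-7 are used)
    raised     -- conditions detected by the instruction, in flag positions
                  (bit 0 = IE ... bit 5 = PE)
    osxmmexcpt -- state of CR4.OSXMMEXCPT
    """
    masks = (mxcsr >> 7) & 0x3F           # IM..PM aligned with IE..PE
    unmasked = raised & ~masks & 0x3F
    if not unmasked:
        return None                        # masked: flag is set, no exception
    return "#XF" if osxmmexcpt else "#UD"
```

With all exceptions masked (for instance an MXCSR of 1F80h), the function returns `None` for any raised condition, which corresponds to the masked response in which the instruction returns the indefinite integer value.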

### CVTTPD2PI

# Convert Packed Double-Precision Floating-Point to Packed Doubleword Integers, Truncated

Converts two packed double-precision floating-point values in an XMM register or a 128-bit memory location to two packed 32-bit signed integer values and writes the converted values in an MMX register.

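| Mnemonic                  | Opcode      | Description                                                                                                                                                                                        |
|---------------------------|-------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| CVTTPD2PI mmx, xmm/mem128 | 66 0F 2C /r | Converts packed double-precision floating-point values in an XMM register or 128-bit memory location to packed doubleword integer values in the destination MMX register. Inexact results are truncated. |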



If the result of the conversion is an inexact value, the value is truncated (rounded toward zero). If the floating-point value is a NaN or infinity, or if the result of the conversion falls outside the signed doubleword range  $(-2^{31} \text{ to } +2^{31} -1)$ , the instruction returns the 32-bit indefinite integer value (8000\_0000h) when the invalid-operation exception (IE) is masked.

#### **Related Instructions**

CVTDQ2PD, CVTPD2DQ, CVTPD2PI, CVTPI2PD, CVTSD2SI, CVTSI2SD, CVTTPD2DQ, CVTTSD2SI

#### rFLAGS Affected

None

# **MXCSR Flags Affected**

| FZ | R  | C  | PM | UM | OM | ZM | DM | IM | DAZ | PE | UE | OE | ZE | DE | IE |
|----|----|----|----|----|----|----|----|----|-----|----|----|----|----|----|----|
|    |    |    |    |    |    |    |    |    |     | М  |    |    |    |    | M  |
| 15 | 14 | 13 | 12 | 11 | 10 | 9  | 8  | 7  | 6   | 5  | 4  | 3  | 2  | 1  | 0  |

Note:

A flag that can be set to one or zero is M (modified). Unaffected flags are blank.

# **Exceptions**

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                    |
|-------------------------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | X    | Х               | Х         | The SSE2 instructions are not supported, as indicated by bit 26 of CPUID standard function 1.                                         |
|                                           | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                             |
|                                           | Х    | Х               | Х         | The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 was cleared to 0.                                                     |
|                                           | Х    | X               | Х         | There was an unmasked SIMD floating-point exception while CR4.OSXMMEXCPT = 0. See SIMD Floating-Point Exceptions, below, for details. |
| Device not available, #NM                 | Χ    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                         |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                               |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                                  |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                                                     |
|                                           | Х    | Х               | Х         | The memory operand was not aligned on a 16-byte boundary.                                                                             |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                          |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An exception is pending due to an x87 floating-point instruction.                                                                     |
| SIMD Floating-Point<br>Exception, #XF     | Х    | Х               | Х         | There was an unmasked SIMD floating-point exception while CR4.OSXMMEXCPT = 1. See SIMD Floating-Point Exceptions, below, for details. |

| Exception                        | Real | Virtual<br>8086 | Protected | Cause of Exception                                                   |
|----------------------------------|------|-----------------|-----------|----------------------------------------------------------------------|
| SIMD Floating-Point Exceptions   |      |                 |           |                                                                      |
| Invalid-operation exception (IE) | X    | X               | X         | A source operand was an SNaN value, a QNaN value, or ±infinity.      |
|                                  | X    | X               | X         | A source operand was too large to fit in the destination format.     |
| Precision exception (PE)         | X    | X               | X         | A result could not be represented exactly in the destination format. |
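The #UD rows above refer to feature bits returned by CPUID standard function 1: bit 25 for SSE and bit 26 for SSE2. A minimal sketch of testing those bits follows. It assumes the feature word is the value returned in EDX (this section does not name the register) and that software has already obtained it; the constant and function names are hypothetical.

```python
SSE_BIT = 25    # SSE support, per the CVTPS2PI and CVTTPS2PI exception tables
SSE2_BIT = 26   # SSE2 support, per the CVTTPD2PI exception table


def has_feature(cpuid_fn1_edx: int, bit: int) -> bool:
    """Report whether a feature bit is set in the CPUID function 1 feature word."""
    return bool((cpuid_fn1_edx >> bit) & 1)


def can_execute_cvttpd2pi(cpuid_fn1_edx: int) -> bool:
    """CVTTPD2PI raises #UD when SSE2 is not reported."""
    return has_feature(cpuid_fn1_edx, SSE2_BIT)
```

This covers only the CPUID condition. The other #UD causes in the table (CR0.EM set, CR4.OSFXSR cleared) are controlled by the operating system and are not visible in the CPUID result.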


### **CVTTPS2PI**

# Convert Packed Single-Precision Floating-Point to Packed Doubleword Integers, Truncated

Converts two packed single-precision floating-point values in the low-order 64 bits of an XMM register or a 64-bit memory location to two packed 32-bit signed integer values and writes the converted values in an MMX register.

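| Mnemonic                 | Opcode   | Description                                                                                                                                                                                 |
|--------------------------|----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| CVTTPS2PI mmx, xmm/mem64 | 0F 2C /r | Converts packed single-precision floating-point values in an XMM register or 64-bit memory location to doubleword integer values in the destination MMX register. Inexact results are truncated. |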



If the result of the conversion is an inexact value, the value is truncated (rounded toward zero). If the floating-point value is a NaN or infinity, or if the result of the conversion falls outside the signed doubleword range ( $-2^{31}$  to  $+2^{31}-1$ ), the instruction returns the 32-bit indefinite integer value ( $8000\_0000h$ ) when the invalid-operation exception (IE) is masked.
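As an illustration of the packed operation, the sketch below models CVTTPS2PI on a 64-bit source operand given as eight bytes. It is a software model under stated assumptions, not the instruction itself: the function names are hypothetical, operands are taken to be little-endian with element 0 in the low-order doubleword, and only the masked-exception response is modeled.

```python
import math
import struct

INDEFINITE = 0x80000000  # 32-bit indefinite integer value (8000_0000h)


def truncate_to_dword(value: float) -> int:
    """Truncate one element toward zero, returning an unsigned 32-bit pattern."""
    if math.isnan(value) or math.isinf(value):
        return INDEFINITE
    result = math.trunc(value)
    if not -2**31 <= result <= 2**31 - 1:
        return INDEFINITE
    return result & 0xFFFFFFFF


def cvttps2pi_model(src: bytes) -> bytes:
    """Convert two packed singles (8 bytes) to two packed doublewords (8 bytes)."""
    low, high = struct.unpack("<2f", src)
    return struct.pack("<2I", truncate_to_dword(low), truncate_to_dword(high))
```

For example, a source holding the singles 1.9 and -1.9 produces the signed doublewords 1 and -1, because truncation discards the fraction regardless of sign.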

#### **Related Instructions**

CVTDQ2PS, CVTPI2PS, CVTPS2DQ, CVTPS2PI, CVTSI2SS, CVTSS2SI, CVTTPS2DQ, CVTTSS2SI

#### rFLAGS Affected

None

# **MXCSR Flags Affected**

| FZ | R  | C  | PM | UM | OM | ZM | DM | IM | DAZ | PE | UE | OE | ZE | DE | IE |
|----|----|----|----|----|----|----|----|----|-----|----|----|----|----|----|----|
|    |    |    |    |    |    |    |    |    |     | М  |    |    |    |    | M  |
| 15 | 14 | 13 | 12 | 11 | 10 | 9  | 8  | 7  | 6   | 5  | 4  | 3  | 2  | 1  | 0  |

Note:

A flag that can be set to one or zero is M (modified). Unaffected flags are blank.

# **Exceptions**

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                    |
|-------------------------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Х    | Х               | Х         | The SSE instructions are not supported, as indicated by bit 25 of CPUID standard function 1.                                          |
|                                           | Х    | Χ               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                             |
|                                           | Х    | X               | Х         | The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 was cleared to 0.                                                     |
|                                           | X    | X               | х         | There was an unmasked SIMD floating-point exception while CR4.OSXMMEXCPT = 0. See SIMD Floating-Point Exceptions, below, for details. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                         |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                               |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                                  |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                                                     |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                          |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An exception was pending due to an x87 floating-point instruction.                                                                    |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                                     |
| SIMD Floating-Point<br>Exception, #XF     | Х    | Х               | Х         | There was an unmasked SIMD floating-point exception while CR4.OSXMMEXCPT = 1. See SIMD Floating-Point Exceptions, below, for details. |


| Exception                        | Real | Virtual<br>8086 | Protected | Cause of Exception                                                   |
|----------------------------------|------|-----------------|-----------|----------------------------------------------------------------------|
| SIMD Floating-Point Exceptions   |      |                 |           |                                                                      |
| Invalid-operation exception (IE) | X    | X               | X         | A source operand was an SNaN value, a QNaN value, or ±infinity.      |
|                                  | X    | X               | X         | A source operand was too large to fit in the destination format.     |
| Precision exception (PE)         | X    | X               | X         | A result could not be represented exactly in the destination format. |

### **EMMS**

### **Exit Multimedia State**

Clears the MMX state by setting the state of the x87 stack registers to *empty* (tag-bit encoding of all 1s for all MMX registers) indicating that the contents of the registers are available for a new procedure, such as an x87 floating-point procedure. This setting of the tag bits is referred to as "clearing the MMX state".

Because the MMX registers and tag word are shared with the x87 floating-point instructions, software should execute an EMMS instruction to clear the MMX state before executing code that includes x87 floating-point instructions.

The functions of the EMMS and FEMMS instructions are identical.

For details about the setting of x87 tag bits, see "Media and x87 Processor State" in volume 2.

| Mnemonic | Opcode | Description           |
|----------|--------|-----------------------|
| EMMS     | 0F 77  | Clears the MMX state. |
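The effect of EMMS on the tag word can be illustrated with a short model. It assumes the architectural x87 tag word format of two bits per register (00 = valid, 01 = zero, 10 = special, 11 = empty), which is described in volume 2 rather than in this section; the function names are hypothetical.

```python
TAG_VALID, TAG_ZERO, TAG_SPECIAL, TAG_EMPTY = 0b00, 0b01, 0b10, 0b11


def tag_of(tag_word: int, register: int) -> int:
    """Extract the 2-bit tag for physical register 0-7 from a 16-bit tag word."""
    return (tag_word >> (2 * register)) & 0b11


def emms_model(tag_word: int) -> int:
    """Model EMMS: every register is marked empty, whatever the previous tags."""
    return 0xFFFF


# Example: after MMX code has marked all registers valid (tag word 0000h),
# EMMS leaves every register tagged empty, ready for x87 code.
assert all(tag_of(emms_model(0x0000), r) == TAG_EMPTY for r in range(8))
```

The model takes the old tag word as a parameter only to show that the result does not depend on it.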

### **Related Instructions**

FEMMS (a 3DNow! instruction)

#### rFLAGS Affected

None

### **Exceptions**

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | X    | X               | X         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                              | X    | X               | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                    | X    | X               | X         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| x87 floating-point<br>exception pending, #MF | X    | X               | X         | An unmasked x87 floating-point exception was pending.                                                                         |


### **FEMMS**

### **Fast Exit Multimedia State**

Clears the MMX state by setting the state of the x87 stack registers to *empty* (tag-bit encoding of all 1s for all MMX registers) indicating that the contents of the registers are available for a new procedure, such as an x87 floating-point procedure. This setting of the tag bits is referred to as "clearing the MMX state".

Because the MMX registers and tag word are shared with the x87 floating-point instructions, software should execute an EMMS or FEMMS instruction to clear the MMX state before executing code that includes x87 floating-point instructions.

FEMMS is a 3DNow! instruction. The functions of the FEMMS and EMMS instructions are identical. The FEMMS instruction is supported for backward compatibility with certain AMD processors. Software that must be compatible with both AMD and non-AMD processors should use the EMMS instruction.

For details about the setting of x87 tag bits, see "Media and x87 Processor State" in volume 2.

| Mnemonic | Opcode | Description       |
|----------|--------|-------------------|
| FEMMS    | 0F 0E  | Clears MMX state. |

#### **Related Instructions**

**EMMS** 

#### rFLAGS Affected

None

### **Exceptions**

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                            |
|-------------------------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | X    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                     |
|                                           | X    | Х               | Х         | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                 |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                         |


### **FNSAVE (FSAVE)**

# Floating-Point Save No-Wait x87 and MMX State

Stores the complete x87 state to memory starting at the specified address and reinitializes the x87 state. The x87 state requires 94 or 108 bytes of memory, depending upon whether the processor is operating in real or protected mode and whether the operand-size attribute is 16-bit or 32-bit. Because the MMX registers are mapped onto the low 64 bits of the x87 floating-point registers, this operation also saves the MMX state. For details about the memory image saved by FNSAVE, see "Media and x87 Processor State" in volume 2.
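The two image sizes follow from simple arithmetic: the eight 80-bit data registers always take 80 bytes, and the x87 environment that precedes them takes 14 bytes in the 16-bit formats or 28 bytes in the 32-bit formats. The sketch below works this out. The environment sizes are an assumption drawn from the general x87 architecture (they match the *mem14/28env* operand of the environment instructions) and are not stated in this section; the names are hypothetical.

```python
X87_DATA_REGISTERS = 8
BYTES_PER_REGISTER = 10                  # one 80-bit extended-precision value

# Size of the x87 environment (control, status, tag, pointers) in bytes,
# keyed by the operand-size attribute in bits.
ENVIRONMENT_BYTES = {16: 14, 32: 28}


def fnsave_image_size(operand_size_bits: int) -> int:
    """Return the number of bytes written by FNSAVE for an operand size."""
    return (ENVIRONMENT_BYTES[operand_size_bits]
            + X87_DATA_REGISTERS * BYTES_PER_REGISTER)


print(fnsave_image_size(16), fnsave_image_size(32))   # prints "94 108"
```

A save buffer sized for the 32-bit case (108 bytes) is large enough for either format.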

The FNSAVE instruction does not wait for pending unmasked x87 floating-point exceptions to be processed. Processor interrupts should be disabled before using this instruction.

Assemblers usually provide an FSAVE macro that expands into the instruction sequence

    WAIT                  ; Opcode 9B
    FNSAVE destination    ; Opcode DD /6

The WAIT (9Bh) instruction checks for pending x87 exceptions and calls an exception handler, if necessary. The FNSAVE instruction then stores the x87 state to the specified destination.

| Mnemonic            | Opcode   | Description                                                                                                                        |
|---------------------|----------|------------------------------------------------------------------------------------------------------------------------------------|
| FNSAVE mem94/108env | DD /6    | Copy the x87 state to <i>mem94/108env</i> without checking for pending floating-point exceptions, then reinitialize the x87 state. |
| FSAVE mem94/108env  | 9B DD /6 | Copy the x87 state to <i>mem94/108env</i> after checking for pending floating-point exceptions, then reinitialize the x87 state.   |

#### **Related Instructions**

FRSTOR, FXSAVE, FXRSTOR

#### rFLAGS Affected

None

# **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0                 | 0     |             |
| C1                 | 0     |             |
| C2                 | 0     |             |
| C3                 | 0     |             |

# **Exceptions**

| Exception                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                      |
| General protection,<br>#GP   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                         |
|                              |      |                 | Х         | The destination operand was in a nonwritable segment.                                        |
|                              |      |                 | Х         | A null data segment was used to reference memory.                                            |
| Page fault, #PF              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                 |
| Alignment check, #AC         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.            |

### **FRSTOR**

# Floating-Point Restore x87 and MMX™ State

Restores the complete x87 state from memory starting at the specified address, as stored by a previous call to FNSAVE. The x87 state occupies 94 or 108 bytes of memory depending on whether the processor is operating in real or protected mode and whether the operand-size attribute is 16-bit or 32-bit. Because the MMX registers are mapped onto the low 64 bits of the x87 floating-point registers, this operation also restores the MMX state.

If FRSTOR results in set exception flags in the loaded x87 status word register, and these exceptions are unmasked in the x87 control word register, a floating-point exception occurs when the next floating-point instruction is executed (except for the no-wait floating-point instructions).

To avoid generating exceptions when loading a new environment, use the FCLEX or FNCLEX instruction to clear the exception flags in the x87 status word before storing that environment.
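The condition under which a restored image leaves an exception pending can be expressed as a bit test. The sketch assumes the standard x87 layouts, which this section does not spell out: exception flags in bits 5-0 of the status word, the matching mask bits in bits 5-0 of the control word, and FCLEX/FNCLEX clearing status-word bits 7-0 and bit 15. All function names are hypothetical.

```python
EXCEPTION_BITS = 0x3F     # IE, DE, ZE, OE, UE, PE in bits 0-5


def unmasked_x87_exceptions(status_word: int, control_word: int) -> int:
    """Return the exception flags that are set and not masked."""
    return status_word & ~control_word & EXCEPTION_BITS


def frstor_leaves_exception_pending(status_word: int, control_word: int) -> bool:
    """True if loading this image would fault on the next waiting x87 instruction."""
    return unmasked_x87_exceptions(status_word, control_word) != 0


def fnclex_model(status_word: int) -> int:
    """Model FNCLEX: clear exception flags, SF, ES (bits 7-0) and B (bit 15)."""
    return status_word & 0x7F00
```

Applying `fnclex_model` to the status word before the image is stored guarantees that `frstor_leaves_exception_pending` is false when the image is later reloaded, whatever the control word contains.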

For details about the memory image restored by FRSTOR, see "Media and x87 Processor State" in volume 2.

| Mnemonic            | Opcode | Description                           |
|---------------------|--------|---------------------------------------|
| FRSTOR mem94/108env | DD /4  | Load the x87 state from mem94/108env. |

#### **Related Instructions**

FSAVE, FNSAVE, FXSAVE, FXRSTOR

#### rFLAGS Affected

None


# **x87 Condition Code**

| x87 Condition Code | Value | Description         |
|--------------------|-------|---------------------|
| C0                 | M     | Loaded from memory. |
| C1                 | M     | Loaded from memory. |
| C2                 | M     | Loaded from memory. |
| C3                 | M     | Loaded from memory. |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

# Exceptions

| Exception                                       | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|-------------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                    | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS                                      | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                      |
| General protection,<br>#GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                         |
|                                                 |      |                 | Х         | A null data segment was used to reference memory.                                            |
| Page fault, #PF                                 |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                 |
| Alignment check, #AC                            |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.            |
| x87 floating-point<br>exception pending,<br>#MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                        |


### **FXRSTOR**

## Restore XMM, MMX™, and x87 State

Restores the XMM, MMX, and x87 state. The data loaded from memory is the state information previously saved using the FXSAVE instruction. Restoring data with FXRSTOR that had been previously saved with an FSAVE (rather than FXSAVE) instruction results in an incorrect restoration.

If FXRSTOR results in set exception flags in the loaded x87 status word register, and these exceptions are unmasked in the x87 control word register, a floating-point exception occurs when the next floating-point instruction is executed (except for the no-wait floating-point instructions).

If the restored MXCSR register contains a set bit in an exception status flag, and the corresponding exception mask bit is cleared (indicating an unmasked exception), loading the MXCSR register does not cause a SIMD floating-point exception (#XF).

FXRSTOR does not restore the x87 error pointers (last instruction pointer, last data pointer, and last opcode), except in the relatively rare cases in which the exception-summary (ES) bit in the x87 status word is set to 1, indicating that an unmasked x87 exception has occurred.

The architecture supports two memory formats for FXRSTOR, a 512-byte 32-bit legacy format and a 512-byte 64-bit format. Selection of the 32-bit or 64-bit format is accomplished by using the corresponding effective operand size in the FXRSTOR instruction. If software running in 64-bit mode executes an FXRSTOR with a 32-bit operand size (no REX-prefix operand-size override), the 32-bit legacy format is used. If software running in 64-bit mode executes an FXRSTOR with a 64-bit operand size (requires REX-prefix operand-size override), the 64-bit format is used. For details about the memory image restored by FXRSTOR, see "Saving Media and x87 Processor State" in volume 2.
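The format selection comes down to whether a REX prefix with the W bit set (48h) precedes the opcode. The following sketch builds the machine-code bytes for `FXRSTOR [rax]` in 64-bit mode in both forms. The helper name is hypothetical, and the choice of `[rax]` as the memory operand (ModRM mod = 00, r/m = 000) is only for illustration.

```python
REX_W = 0x48                        # REX prefix with only the W bit set
FXRSTOR_OPCODE = bytes([0x0F, 0xAE])
FXRSTOR_REG_FIELD = 1               # the "/1" opcode extension in ModRM.reg


def encode_fxrstor_rax(use_64bit_format: bool) -> bytes:
    """Encode FXRSTOR [rax], optionally with REX.W to select the 64-bit format."""
    modrm = (0b00 << 6) | (FXRSTOR_REG_FIELD << 3) | 0b000   # [rax], no disp
    prefix = bytes([REX_W]) if use_64bit_format else b""
    return prefix + FXRSTOR_OPCODE + bytes([modrm])


print(encode_fxrstor_rax(False).hex(" "))   # prints "0f ae 08"
print(encode_fxrstor_rax(True).hex(" "))    # prints "48 0f ae 08"
```

The same image format has to be used for the matching FXSAVE, so save and restore code should agree on whether the REX.W form is used.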

If the fast-FXSAVE/FXRSTOR (FFXSR) feature is enabled in EFER, FXRSTOR does not restore the XMM registers (XMM0-XMM15) when executed in 64-bit mode at CPL 0. MXCSR is restored whether fast-FXSAVE/FXRSTOR is enabled or not. Software can use CPUID to determine whether the fast-FXSAVE/FXRSTOR feature is available. (See "CPUID" in Volume 3.)

If the operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 is cleared to 0, the saved image of XMM0–XMM15 and MXCSR is not loaded into the processor. A general-protection exception occurs if there is an attempt to load a non-zero value to the bits in MXCSR that are defined as reserved (bits 31–16).
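The two preceding paragraphs can be summarized as a pair of predicates: which parts of the image FXRSTOR loads, and whether an MXCSR value would raise a general-protection exception. This is a simplified model using only the rules stated above; the function names and the string labels for the state components are hypothetical.

```python
MXCSR_RESERVED_MASK = 0xFFFF0000    # bits 31-16 are defined as reserved


def mxcsr_load_faults(mxcsr_image: int) -> bool:
    """True if loading this MXCSR value would cause #GP (reserved bits set)."""
    return (mxcsr_image & MXCSR_RESERVED_MASK) != 0


def fxrstor_restored_state(osfxsr: bool, ffxsr: bool,
                           in_64bit_mode: bool, cpl: int) -> set:
    """Return the state components that FXRSTOR loads from the image."""
    restored = {"x87/MMX"}
    if osfxsr:
        restored.add("MXCSR")       # restored whether or not FFXSR is enabled
        fast_path = ffxsr and in_64bit_mode and cpl == 0
        if not fast_path:
            restored.add("XMM")
    return restored
```

For instance, a kernel running at CPL 0 in 64-bit mode with FFXSR enabled gets `{"x87/MMX", "MXCSR"}`, while the same call at CPL 3 also includes `"XMM"`.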


| Mnemonic          | Opcode   | Description                                                                   |
|-------------------|----------|-------------------------------------------------------------------------------|
| FXRSTOR mem512env | 0F AE /1 | Restores XMM, MMX™, and x87 state from 512-byte memory location.              |

### **Related Instructions**

FWAIT, FXSAVE

### rFLAGS Affected

None

## **MXCSR Flags Affected**

| FZ | R  | C  | PM | UM | OM | ZM | DM | IM | DAZ | PE | UE | OE | ZE | DE | IE |
|----|----|----|----|----|----|----|----|----|-----|----|----|----|----|----|----|
| М  | М  | М  | М  | M  | М  | М  | М  | М  | М   | M  | М  | M  | M  | М  | M  |
| 15 | 14 | 13 | 12 | 11 | 10 | 9  | 8  | 7  | 6   | 5  | 4  | 3  | 2  | 1  | 0  |

#### Note:

A flag that can be set to one or zero is M (modified). Unaffected flags are blank. Shaded fields are reserved.

## **Exceptions**

| Exception                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                      |
|---------------------------|------|-----------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD       | X    | X               | X         | The FXSAVE/FXRSTOR instructions are not supported, as indicated by bit 24 of CPUID standard function 1 or extended function 8000_0001h. |
|                           | X    | X               | X         | The emulate bit (EM) of CR0 was set to 1.                                                                                               |
| Device not available, #NM | X    | X               | X         | The task-switch bit (TS) of CR0 was set to 1.                                                                                           |
| Stack, #SS                | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                                                                 |
| General protection, #GP   | X    | X               | X         | A memory address exceeded the data segment limit or was non-canonical.                                                                  |
|                           |      |                 | X         | A null data segment was used to reference memory.                                                                                       |
|                           | X    | X               | X         | The memory operand was not aligned on a 16-byte boundary.                                                                               |
|                           | X    | X               | X         | Ones were written to the reserved bits in MXCSR.                                                                                        |
| Page fault, #PF           |      | X               | X         | A page fault resulted from the execution of the instruction.                                                                            |

28 FXRSTOR

### **FXSAVE**

## Save XMM, MMX, and x87 State

Saves the XMM, MMX, and x87 state. A memory location that is not aligned on a 16-byte boundary causes a general-protection exception.

Unlike FSAVE and FNSAVE, FXSAVE does not alter the x87 tag bits. The contents of the saved MMX/x87 data registers are retained, thus indicating that the registers may be valid (or whatever other value the x87 tag bits indicated prior to the save). To invalidate the contents of the MMX/x87 data registers after FXSAVE, software must execute an FINIT instruction. Also, FXSAVE (like FNSAVE) does not check for pending unmasked x87 floating-point exceptions. An FWAIT instruction can be used for this purpose.

FXSAVE does not save the x87 pointer registers (last instruction pointer, last data pointer, and last opcode), except in the relatively rare cases in which the exception-summary (ES) bit in the x87 status word is set to 1, indicating that an unmasked x87 exception has occurred.

The architecture supports two memory formats for FXSAVE: a 512-byte 32-bit legacy format and a 512-byte 64-bit format. The effective operand size of the FXSAVE instruction selects the format. If software running in 64-bit mode executes an FXSAVE with a 32-bit operand size (no REX-prefix operand-size override), the 32-bit legacy format is used. If software running in 64-bit mode executes an FXSAVE with a 64-bit operand size (requires REX-prefix operand-size override), the 64-bit format is used. For details about the memory image saved by FXSAVE, see "Saving Media and x87 Processor State" in volume 2.
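The format-selection rule can be summarized as a small decision function. The following Python sketch is only a behavioural model, not processor code; the names `fxsave_format`, `in_64bit_mode`, and `rex_w` are hypothetical and chosen for illustration, with `rex_w` standing for the REX-prefix operand-size override.

```python
IMAGE_SIZE = 512  # both formats occupy a 512-byte memory image


def fxsave_format(in_64bit_mode: bool, rex_w: bool) -> str:
    """Return the name of the FXSAVE image format that would be written.

    in_64bit_mode: True when the processor is running in 64-bit mode.
    rex_w:         True when a REX prefix selects a 64-bit operand size.
    Both parameters are assumptions of this model.
    """
    # A 64-bit operand size is only available in 64-bit mode with the
    # REX-prefix operand-size override.
    if in_64bit_mode and rex_w:
        return "64-bit"
    # Every other combination uses the 32-bit legacy format.
    return "32-bit legacy"
```

For example, `fxsave_format(True, False)` models 64-bit-mode software that omits the REX prefix and therefore gets the legacy layout.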

If the fast-FXSAVE/FXRSTOR (FFXSR) feature is enabled in EFER, FXSAVE does not save the XMM registers (XMM0–XMM15) when executed in 64-bit mode at CPL 0. MXCSR is saved whether fast-FXSAVE/FXRSTOR is enabled or not. Software can use CPUID to determine whether the fast-FXSAVE/FXRSTOR feature is available. (See "CPUID" in volume 3.)

If the operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 is cleared to 0, FXSAVE does not save the image of XMM0–XMM15 or MXCSR. For details about the CR4.OSFXSR bit, see "FXSAVE/FXRSTOR Support (OSFXSR) Bit" in volume 2.
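The combined effect of CR4.OSFXSR and the fast-FXSAVE/FXRSTOR feature on what FXSAVE writes can be modelled as follows. This is a minimal sketch under one stated assumption: when OSFXSR is 0, neither MXCSR nor the XMM registers are saved, regardless of FFXSR. The function and parameter names are hypothetical.

```python
def fxsave_saved_state(osfxsr: bool, ffxsr: bool,
                       in_64bit_mode: bool, cpl: int) -> set:
    """Return the set of state components an FXSAVE would store.

    osfxsr:        CR4.OSFXSR
    ffxsr:         fast-FXSAVE/FXRSTOR enabled in EFER
    in_64bit_mode: True when executing in 64-bit mode
    cpl:           current privilege level (0..3)
    """
    saved = {"x87/MMX"}  # always part of the image
    if osfxsr:
        # MXCSR is saved whether or not fast-FXSAVE/FXRSTOR is enabled.
        saved.add("MXCSR")
        # The XMM registers are skipped only on the fast path:
        # FFXSR enabled, 64-bit mode, CPL 0.
        fast_path = ffxsr and in_64bit_mode and cpl == 0
        if not fast_path:
            saved.add("XMM")
    return saved
```

A kernel-mode save with FFXSR enabled, `fxsave_saved_state(True, True, True, 0)`, yields only the x87/MMX state and MXCSR, while the same call at CPL 3 also includes the XMM registers.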

| Mnemonic         | Opcode   | Description                                                |
|------------------|----------|------------------------------------------------------------|
| FXSAVE mem512env | 0F AE /0 | Saves XMM, MMX, and x87 state to 512-byte memory location. |

## **Related Instructions**

FINIT, FNSAVE, FRSTOR, FSAVE, FXRSTOR, LDMXCSR, STMXCSR

### **rFLAGS Affected**

None

# **MXCSR Flags Affected**

None

## **Exceptions**

| Exception                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                      |
|---------------------------|------|-----------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD       | Х    | Х               | Х         | The FXSAVE/FXRSTOR instructions are not supported, as indicated by bit 24 of CPUID standard function 1 or extended function 8000_0001h. |
|                           | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                               |
| Device not available, #NM | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                           |
| Stack, #SS                | Х    | Х               | Х         | A memory address exceeded the stack segment limit, or was non-canonical.                                                                |
| General protection, #GP   | Х    | Х               | Х         | A memory address exceeded the data segment limit or was non-canonical.                                                                  |
|                           |      |                 | Х         | A null data segment was used to reference memory.                                                                                       |
|                           |      |                 | Х         | The destination operand was in a non-writable segment.                                                                                  |
|                           | Х    | Х               | Х         | The memory operand was not aligned on a 16-byte boundary.                                                                               |
| Page fault, #PF           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                            |

### **MASKMOVQ**

# **Masked Move Quadword**

Stores bytes from the first source operand, as selected by the second source operand, to a memory location specified in the DS:rDI registers (except that DS is ignored in 64-bit mode). The first source operand is an MMX register, and the second source operand is another MMX register. The most-significant bit (msb) of each byte in the second source operand specifies the store (1 = store, 0 = no store) of the corresponding byte of the first source operand.

| Mnemonic            | Opcode          | Description                                                                                                                          |
|---------------------|-----------------|--------------------------------------------------------------------------------------------------------------------------------------|
| MASKMOVQ mmx1, mmx2 | 0F F7 /r        | Store bytes from an MMX register, selected by the most-significant bit of the corresponding byte in another MMX register, to DS:rDI. |



A mask value of all 0s results in the following behavior:

- No data is written to memory.
- Page faults and exceptions associated with memory addressing are not guaranteed to be generated in all implementations.
- Data breakpoints are not guaranteed to be generated in all implementations (although code breakpoints are guaranteed).
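The byte-selection rule can be illustrated with a short Python model. It is a sketch only: memory is represented by a `bytearray`, the two MMX operands by 64-bit integers, and the names `maskmovq`, `memory`, and `address` are hypothetical. Exceptions, breakpoints, and write-combining behaviour are not modelled.

```python
def maskmovq(memory: bytearray, address: int, src: int, mask: int) -> None:
    """Store the bytes of src selected by mask to memory at address.

    src, mask: 64-bit integers standing in for the two MMX operands.
    address:   index into memory standing in for the DS:rDI location.
    """
    for i in range(8):
        mask_byte = (mask >> (8 * i)) & 0xFF
        # Only the most-significant bit of each mask byte matters:
        # 1 = store the corresponding source byte, 0 = no store.
        if mask_byte & 0x80:
            memory[address + i] = (src >> (8 * i)) & 0xFF
```

With `src = 0x1122334455667788` and `mask = 0x80000000000000FF`, only bytes 0 and 7 are written; with a mask of all 0s the buffer is left untouched.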

MASKMOVQ implicitly uses weakly-ordered, write-combining buffering for the data, as described in "Buffering and Combining Memory Writes" in volume 2. If the stored data is shared by multiple processors, this instruction should be used together with a fence instruction in order to ensure data coherency (refer to "Cache and TLB Management" in volume 2).

#### **Related Instructions**

MASKMOVDQU

## **Exceptions**

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                                                                                                               |
|----------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                                                                                                                        |
|                                              | X    | Х               | Х         | The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 was cleared to 0.                                                                                                                                                |
|                                              | Х    | Х               | X         | The SSE instructions are not supported, as indicated by bit 25 in CPUID standard function 1; and the AMD extensions to the MMX™ instruction set are not supported, as indicated by bit 22 of CPUID extended function 8000_0001h. |
| Device not available,<br>#NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                                                                                                                    |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                                                                                                                          |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                                                                                                                             |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                                                                                                                                |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                                                                                                                     |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                                                                                                                            |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                                                                                                                                |

### MOVD

# **Move Doubleword or Quadword**

Moves a 32-bit or 64-bit value in one of the following ways:

- from a 32-bit or 64-bit general-purpose register or memory location to the low-order 32 or 64 bits of an XMM register, with zero-extension to 128 bits
- from the low-order 32 or 64 bits of an XMM register to a 32-bit or 64-bit general-purpose register or memory location
- from a 32-bit or 64-bit general-purpose register or memory location to the low-order 32 bits (with zero-extension to 64 bits) or the full 64 bits of an MMX register
- from the low-order 32 or the full 64 bits of an MMX register to a 32-bit or 64-bit general-purpose register or memory location
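The zero-extension and truncation behaviour in the list above can be sketched in Python, treating each register as an unsigned integer. The helper names `movd_load` and `movd_store` are hypothetical; because Python integers are unbounded, a zero-extended result is simply the masked value with all higher bits clear, whether the destination is a 64-bit MMX register or a 128-bit XMM register.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def _mask(operand_bits: int) -> int:
    if operand_bits == 32:
        return MASK32
    if operand_bits == 64:
        return MASK64
    raise ValueError("operand size must be 32 or 64 bits")


def movd_load(value: int, operand_bits: int) -> int:
    """GPR or memory -> MMX/XMM: keep the low operand_bits, zero the rest."""
    return value & _mask(operand_bits)


def movd_store(register: int, operand_bits: int) -> int:
    """MMX/XMM -> GPR or memory: take the low-order operand_bits."""
    return register & _mask(operand_bits)
```

For instance, `movd_store(0xAABBCCDD11223344, 32)` models a 32-bit store from an MMX register and yields `0x11223344`.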

| Mnemonic            | Opcode  | Description                                                                                     |
|---------------------|---------|-------------------------------------------------------------------------------------------------|
| MOVD mmx, reg/mem32 | 0F 6E /r | Move 32-bit value from a general-purpose register or 32-bit memory location to an MMX register. |
| MOVD mmx, reg/mem64 | 0F 6E /r | Move 64-bit value from a general-purpose register or 64-bit memory location to an MMX register. |
| MOVD reg/mem32, mmx | 0F 7E /r | Move 32-bit value from an MMX register to a 32-bit general-purpose register or memory location. |
| MOVD reg/mem64, mmx | 0F 7E /r | Move 64-bit value from an MMX register to a 64-bit general-purpose register or memory location. |

The following diagrams illustrate the operation of the MOVD instruction.

### **Related Instructions**

MOVDQA, MOVDQU, MOVDQ2Q, MOVQ, MOVQ2DQ

### rFLAGS Affected

None

# **MXCSR Flags Affected**

None

## **Exceptions (All Modes)**

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                            |
|----------------------------------------------|------|-----------------|-----------|-----------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | X    | X               | X         | The MMX™ instructions are not supported, as indicated by bit 23 of CPUID standard function 1. |
|                                              | X    | X               | X         | The SSE2 instructions are not supported, as indicated by bit 26 of CPUID standard function 1. |
|                                              | X    | X               | X         | The emulate bit (EM) of CR0 was set to 1.                                                     |
|                                              | X    | X               | X         | The instruction used XMM registers while CR4.OSFXSR=0.                                        |
| Device not available,<br>#NM                 | X    | X               | X         | The task-switch bit (TS) of CR0 was set to 1.                                                 |
| Stack, #SS                                   | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                       |
| General protection, #GP                      | X    | X               | X         | A memory address exceeded a data segment limit or was non-canonical.                          |
|                                              |      |                 | X         | The destination operand was in a non-writable segment.                                        |
|                                              |      |                 | X         | A null data segment was used to reference memory.                                             |
| Page fault, #PF                              |      | X               | X         | A page fault resulted from the execution of the instruction.                                  |
| x87 floating-point<br>exception pending, #MF | X    | X               | X         | An unmasked x87 floating-point exception was pending.                                         |
| Alignment check, #AC                         |      | X               | X         | An unaligned memory reference was performed while alignment checking was enabled.             |

# MOVDQ2Q

# **Move Quadword to Quadword**

Moves the low-order 64-bit value in an XMM register to a 64-bit MMX register.

| Mnemonic         | Opcode      | Description                                                                        |
|------------------|-------------|------------------------------------------------------------------------------------|
| MOVDQ2Q mmx, xmm | F2 0F D6 /r | Moves low-order 64-bit value from an XMM register to the destination MMX register. |



### **Related Instructions**

MOVD, MOVDQA, MOVDQU, MOVQ, MOVQ2DQ

### rFLAGS Affected

None

## **MXCSR Flags Affected**

None

### **Exceptions**

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                            |
|----------------------------------------------|------|-----------------|-----------|-----------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | X    | X               | X         | The emulate bit (EM) of CR0 was set to 1.                                                     |
|                                              | X    | X               | X         | The SSE2 instructions are not supported, as indicated by bit 26 in CPUID standard function 1. |
| Device not available,<br>#NM                 | X    | X               | X         | The task-switch bit (TS) of CR0 was set to 1.                                                 |
| General protection, #GP                      | X    | X               | X         | The destination operand was in a non-writable segment.                                        |
| x87 floating-point<br>exception pending, #MF | X    | X               | X         | An unmasked x87 floating-point exception was pending.                                         |

### **MOVNTQ**

# **Move Non-Temporal Quadword**

Stores a 64-bit MMX register value into a 64-bit memory location. This instruction indicates to the processor that the data is non-temporal, and is unlikely to be used again soon. The processor treats the store as a write-combining (WC) memory write, which minimizes cache pollution. The exact method by which cache pollution is minimized depends on the hardware implementation of the instruction. For further information, see "Memory Optimization" in volume 1.

| Mnemonic          | Opcode   | Description                                                                                   |
|-------------------|----------|-----------------------------------------------------------------------------------------------|
| MOVNTQ mem64, mmx | 0F E7 /r | Stores a 64-bit MMX register value into a 64-bit memory location, minimizing cache pollution. |



MOVNTQ is weakly-ordered with respect to other instructions that operate on memory. Software should use an SFENCE instruction to force strong memory ordering of MOVNTQ with respect to other stores.

MOVNTQ implicitly uses weakly-ordered, write-combining buffering for the data, as described in "Buffering and Combining Memory Writes" in volume 2. For data that is shared by multiple processors, this instruction should be used together with a fence instruction in order to ensure data coherency (refer to "Cache and TLB Management" in volume 2).

#### **Related Instructions**

MOVNTDQ, MOVNTI, MOVNTPD, MOVNTPS

# Exceptions

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                                                                                                               |
|----------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                                                                                                                        |
|                                              | Х    | X               | X         | The SSE instructions are not supported, as indicated by bit 25 in CPUID standard function 1; and the AMD extensions to the MMX™ instruction set are not supported, as indicated by bit 22 of CPUID extended function 8000_0001h. |
| Device not available,<br>#NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                                                                                                                    |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                                                                                                                          |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                                                                                                                             |
|                                              |      |                 | х         | The destination operand was in a non-writable segment.                                                                                                                                                                           |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                                                                                                                                |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                                                                                                                     |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                                                                                                                            |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                                                                                                                                |

# **MOVQ**

# **Move Quadword**

Moves a 64-bit value:

- from an MMX register or 64-bit memory location to another MMX register, or
- from an MMX register to another MMX register or 64-bit memory location.

| Mnemonic              | Opcode          | Description                                                                    |
|-----------------------|-----------------|--------------------------------------------------------------------------------|
| MOVQ mmx1, mmx2/mem64 | 0F 6F /r        | Moves 64-bit value from an MMX register or memory location to an MMX register. |
| MOVQ mmx1/mem64, mmx2 | 0F 7F /r        | Moves 64-bit value from an MMX register to an MMX register or memory location. |



#### **Related Instructions**

MOVD, MOVDQA, MOVDQU, MOVDQ2Q, MOVQ2DQ

### rFLAGS Affected

None

# Exceptions

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                              | Х    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available,<br>#NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                   | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                      | X    | X               | X         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
|                                              |      |                 | Х         | The destination operand was in a non-writable segment.                                                                        |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

# MOVQ2DQ

# **Move Quadword to Quadword**

Moves a 64-bit value from an MMX register to the low-order 64 bits of an XMM register, with zero-extension to 128 bits.

| Mnemonic         | Opcode      | Description                                                 |
|------------------|-------------|-------------------------------------------------------------|
| MOVQ2DQ xmm, mmx | F3 0F D6 /r | Moves 64-bit value from an MMX register to an XMM register. |
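The relationship between MOVQ2DQ and its counterpart MOVDQ2Q can be shown with a small Python model in which an XMM register is a 128-bit unsigned integer and an MMX register is a 64-bit unsigned integer. The function names mirror the mnemonics but are otherwise hypothetical helpers for illustration.

```python
MASK64 = (1 << 64) - 1


def movq2dq(mmx: int) -> int:
    """MMX -> XMM: low 64 bits receive the value, bits 127:64 become zero."""
    return mmx & MASK64


def movdq2q(xmm: int) -> int:
    """XMM -> MMX: only the low-order 64 bits are transferred."""
    return xmm & MASK64


# The previous contents of the destination XMM register do not survive:
# the result depends only on the MMX source.
old_xmm = (0xDEAD << 64) | 0x1234
new_xmm = movq2dq(0x0123456789ABCDEF)
```

After these statements `new_xmm >> 64` is 0, and `movdq2q(new_xmm)` returns the original MMX value.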



#### **Related Instructions**

MOVD, MOVDQA, MOVDQU, MOVDQ2Q, MOVQ

### **rFLAGS Affected**

None

### **MXCSR Flags Affected**

None

### **Exceptions**

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                            |
|----------------------------------------------|------|-----------------|-----------|-----------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | X    | X               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                     |
|                                              | Х    | X               | X         | The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 was cleared to 0.             |
|                                              | Х    | Х               | Х         | The SSE2 instructions are not supported, as indicated by bit 26 in CPUID standard function 1. |
| Device not available,<br>#NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                 |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                         |

### **PACKSSDW**

# Pack with Saturation Signed Doubleword to Word

Converts each 32-bit signed integer in the first and second source operands to a 16-bit signed integer and packs the converted values into words in the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

Converted values from the first source operand are packed into the low-order words of the destination, and the converted values from the second source operand are packed into the high-order words of the destination.

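| Mnemonic                  | Opcode   | Description                                                                                                                                        |
|---------------------------|----------|----------------------------------------------------------------------------------------------------------------------------------------------------|
| PACKSSDW mmx1, mmx2/mem64 | 0F 6B /r | Packs 32-bit signed integers in an MMX register and another MMX register or 64-bit memory location into 16-bit signed integers in an MMX register. |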



For each packed value in the destination, if the value is larger than the largest signed 16-bit integer, it is saturated to 7FFFh, and if the value is smaller than the smallest signed 16-bit integer, it is saturated to 8000h.
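The packing order and saturation rule can be sketched in Python. This is a behavioural model only; operands are represented as lists of signed integers with element 0 as the low-order element, and the names `saturate_signed` and `packssdw` are hypothetical.

```python
def saturate_signed(value: int, bits: int) -> int:
    """Clamp value to the range of a signed integer of the given width."""
    lowest = -(1 << (bits - 1))       # -32768 for 16 bits (8000h)
    highest = (1 << (bits - 1)) - 1   # 32767 for 16 bits (7FFFh)
    return max(lowest, min(highest, value))


def packssdw(src1: list, src2: list) -> list:
    """Pack two pairs of signed doublewords into four signed words.

    src1: the two doublewords of the first source/destination operand.
    src2: the two doublewords of the second source operand.
    The first source fills the low-order words of the result and the
    second source fills the high-order words.
    """
    return [saturate_signed(v, 16) for v in src1 + src2]
```

For example, `packssdw([100000, -100000], [1, -1])` returns `[32767, -32768, 1, -1]`: the two out-of-range values saturate to 7FFFh and 8000h, and the in-range values pass through unchanged.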

#### **Related Instructions**

PACKSSWB, PACKUSWB

### rFLAGS Affected

None

### **Exceptions**

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                              | Х    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                    | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

### **PACKSSWB**

### **Pack with Saturation Signed Word to Byte**

Converts each 16-bit signed integer in the first and second source operands to an 8-bit signed integer and packs the converted values into bytes in the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

Converted values from the first source operand are packed into the low-order bytes of the destination, and the converted values from the second source operand are packed into the high-order bytes of the destination.

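| Mnemonic                  | Opcode   | Description                                                                                                                                       |
|---------------------------|----------|---------------------------------------------------------------------------------------------------------------------------------------------------|
| PACKSSWB mmx1, mmx2/mem64 | 0F 63 /r | Packs 16-bit signed integers in an MMX register and another MMX register or 64-bit memory location into 8-bit signed integers in an MMX register. |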



For each packed value in the destination, if the value is larger than the largest signed 8-bit integer, it is saturated to 7Fh, and if the value is smaller than the smallest signed 8-bit integer, it is saturated to 80h.
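The packing order and saturation rule described above can be illustrated with a scalar C model. This is a minimal sketch, not an implementation from this manual: the function names are hypothetical, an MMX register is modeled as a `uint64_t` with element 0 in the low-order bits, and `int` is assumed to be at least 32 bits wide.

```c
#include <stdint.h>

/* Interpret the low 16 bits of x as a two's-complement signed word. */
static int word_as_signed(uint64_t x)
{
    int w = (int)(x & 0xFFFF);
    return (w & 0x8000) ? w - 0x10000 : w;
}

/* Clamp to the signed 8-bit range and return the resulting byte pattern. */
static uint8_t saturate_to_s8(int v)
{
    if (v > 127)  v = 127;    /* saturates to 7Fh */
    if (v < -128) v = -128;   /* saturates to 80h */
    return (uint8_t)v;        /* conversion to unsigned is modulo 256 */
}

/* Hypothetical scalar model of PACKSSWB.
 * dst is the first source/destination operand, src the second source. */
uint64_t packsswb_model(uint64_t dst, uint64_t src)
{
    uint64_t result = 0;
    for (int i = 0; i < 4; i++) {
        /* Words of the first source fill the low-order four bytes. */
        result |= (uint64_t)saturate_to_s8(word_as_signed(dst >> (16 * i))) << (8 * i);
        /* Words of the second source fill the high-order four bytes. */
        result |= (uint64_t)saturate_to_s8(word_as_signed(src >> (16 * i))) << (8 * (i + 4));
    }
    return result;
}
```

In this model, a source word of 0100h (256) packs to 7Fh and a source word of FF00h (-256) packs to 80h, while in-range words such as FFFFh (-1) keep their value and pack to FFh.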

#### **Related Instructions**

PACKSSDW, PACKUSWB

### rFLAGS Affected

None

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                         |
|-------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                  |
|                                           | Х    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                              |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                    |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                       |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                                          |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                               |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                      |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                          |

### **PACKUSWB**

### **Pack with Saturation Signed Word to Unsigned Byte**

Converts each 16-bit signed integer in the first and second source operands to an 8-bit unsigned integer and packs the converted values into bytes in the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

Converted values from the first source operand are packed into the low-order bytes of the destination, and the converted values from the second source operand are packed into the high-order bytes of the destination.

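| Mnemonic                  | Opcode   | Description                                                                                                                                         |
|---------------------------|----------|-----------------------------------------------------------------------------------------------------------------------------------------------------|
| PACKUSWB mmx1, mmx2/mem64 | 0F 67 /r | Packs 16-bit signed integers in an MMX register and another MMX register or 64-bit memory location into 8-bit unsigned integers in an MMX register. |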



For each packed value in the destination, if the value is larger than the largest unsigned 8-bit integer, it is saturated to FFh, and if the value is smaller than the smallest unsigned 8-bit integer, it is saturated to 00h.
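The signed-to-unsigned conversion can be sketched as a scalar C model. The names below are hypothetical, an MMX register is modeled as a `uint64_t` with element 0 in the low-order bits, and `int` is assumed to be at least 32 bits wide.

```c
#include <stdint.h>

/* Interpret the low 16 bits of x as a two's-complement signed word. */
static int word_as_signed(uint64_t x)
{
    int w = (int)(x & 0xFFFF);
    return (w & 0x8000) ? w - 0x10000 : w;
}

/* Clamp a signed value to the unsigned 8-bit range. */
static uint8_t saturate_to_u8(int v)
{
    if (v > 255) return 0xFF;   /* larger than the largest unsigned byte */
    if (v < 0)   return 0x00;   /* negative inputs saturate to zero */
    return (uint8_t)v;
}

/* Hypothetical scalar model of PACKUSWB.
 * dst is the first source/destination operand, src the second source. */
uint64_t packuswb_model(uint64_t dst, uint64_t src)
{
    uint64_t result = 0;
    for (int i = 0; i < 4; i++) {
        /* First-source words go to bytes 0-3, second-source words to bytes 4-7. */
        result |= (uint64_t)saturate_to_u8(word_as_signed(dst >> (16 * i))) << (8 * i);
        result |= (uint64_t)saturate_to_u8(word_as_signed(src >> (16 * i))) << (8 * (i + 4));
    }
    return result;
}
```

Because the inputs are interpreted as signed, a word such as FFFFh is treated as -1 and packs to 00h in this model, not to FFh.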

### **Related Instructions**

PACKSSDW, PACKSSWB

### rFLAGS Affected

None

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                              | X    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                    | Χ    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

### **PADDB**

### **Packed Add Bytes**

Adds each packed 8-bit integer value in the first source operand to the corresponding packed 8-bit integer in the second source operand and writes the integer result of each addition in the corresponding byte of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

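| Mnemonic               | Opcode   | Description                                                                                                                                                  |
|------------------------|----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PADDB mmx1, mmx2/mem64 | 0F FC /r | Adds packed byte integer values in an MMX register and another MMX register or 64-bit memory location and writes the result in the destination MMX register. |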



This instruction operates on both signed and unsigned integers. If the result overflows, the carry is ignored (neither the overflow nor carry bit in rFLAGS is set), and only the low-order 8 bits of each result are written in the destination.
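The wraparound behavior can be modeled lane by lane in C. This is a minimal sketch under stated assumptions: `paddb_model` is a hypothetical name, and an MMX register is modeled as a `uint64_t` with byte 0 in the low-order bits.

```c
#include <stdint.h>

/* Hypothetical scalar model of PADDB: eight independent 8-bit additions. */
uint64_t paddb_model(uint64_t dst, uint64_t src)
{
    uint64_t result = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t a = (uint8_t)(dst >> (8 * i));
        uint8_t b = (uint8_t)(src >> (8 * i));
        /* Keep only the low-order 8 bits; the carry out of bit 7 is dropped
         * and does not propagate into the neighboring byte. */
        uint8_t sum = (uint8_t)(a + b);
        result |= (uint64_t)sum << (8 * i);
    }
    return result;
}
```

For example, adding 01h to a byte holding FFh yields 00h in that byte and leaves the next byte unaffected.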

### **Related Instructions**

PADDD, PADDQ, PADDSB, PADDSW, PADDUSB, PADDUSW, PADDW

### rFLAGS Affected

None


| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                              | X    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                    | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

### **PADDD**

### **Packed Add Doublewords**

Adds each packed 32-bit integer value in the first source operand to the corresponding packed 32-bit integer in the second source operand and writes the integer result of each addition in the corresponding doubleword of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic               | Opcode  | Description                                                                                                                                                    |
|------------------------|---------|----------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PADDD mmx1, mmx2/mem64 | 0F FE /r | Adds packed 32-bit integer values in an MMX register and another MMX register or 64-bit memory location and writes the result in the destination MMX register. |



This instruction operates on both signed and unsigned integers. If the result overflows, the carry is ignored (neither the overflow nor carry bit in rFLAGS is set), and only the low-order 32 bits of each result are written in the destination.
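One way to see the lane independence is to compare a two-lane model with an ordinary 64-bit addition. The sketch below is illustrative only: `paddd_model` is a hypothetical name, and an MMX register is modeled as a `uint64_t` with doubleword 0 in the low-order bits.

```c
#include <stdint.h>

/* Hypothetical scalar model of PADDD: two independent 32-bit additions. */
uint64_t paddd_model(uint64_t dst, uint64_t src)
{
    /* Each sum is truncated to 32 bits, so the carry out of the low
     * doubleword never reaches the high doubleword. */
    uint32_t lo = (uint32_t)((uint32_t)dst + (uint32_t)src);
    uint32_t hi = (uint32_t)((uint32_t)(dst >> 32) + (uint32_t)(src >> 32));
    return ((uint64_t)hi << 32) | lo;
}
```

With a low doubleword of FFFFFFFFh plus 00000001h, the model produces 00000000h in the low lane without incrementing the high lane, whereas a plain 64-bit add of the same operands would carry into bit 32.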

#### **Related Instructions**

PADDB, PADDQ, PADDSB, PADDSW, PADDUSB, PADDUSW, PADDW

#### rFLAGS Affected

None


| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                              | X    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                    | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

### **PADDQ**

### **Packed Add Quadwords**

Adds each packed 64-bit integer value in the first source operand to the corresponding packed 64-bit integer in the second source operand and writes the integer result of each addition in the corresponding quadword of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

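| Mnemonic               | Opcode   | Description                                                                                                                                            |
|------------------------|----------|--------------------------------------------------------------------------------------------------------------------------------------------------------|
| PADDQ mmx1, mmx2/mem64 | 0F D4 /r | Adds 64-bit integer value in an MMX register and another MMX register or 64-bit memory location and writes the result in the destination MMX register. |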



This instruction operates on both signed and unsigned integers. If the result overflows, the carry is ignored (neither the overflow nor carry bit in rFLAGS is set), and only the low-order 64 bits of each result are written in the destination.
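Since a 64-bit MMX operand holds a single quadword, the arithmetic reduces to one modulo-2^64 addition. A minimal C sketch, assuming an MMX register is modeled as a `uint64_t` (the function name is hypothetical):

```c
#include <stdint.h>

/* Hypothetical scalar model of PADDQ. Unsigned arithmetic in C is defined
 * to wrap modulo 2^64, which matches "only the low-order 64 bits are
 * written" without any explicit masking. */
uint64_t paddq_model(uint64_t dst, uint64_t src)
{
    return dst + src;
}
```

The same bit pattern results whether the operands are read as signed or unsigned: 7FFF_FFFF_FFFF_FFFFh plus 1 gives 8000_0000_0000_0000h in either interpretation.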

#### **Related Instructions**

PADDB, PADDD, PADDSB, PADDSW, PADDUSB, PADDUSW, PADDW

### rFLAGS Affected

None


| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                              | X    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                    | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

### **PADDSB**

### **Packed Add Signed with Saturation Bytes**

Adds each packed 8-bit signed integer value in the first source operand to the corresponding packed 8-bit signed integer in the second source operand and writes the signed integer result of each addition in the corresponding byte of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

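| Mnemonic                | Opcode   | Description                                                                                                                                                         |
|-------------------------|----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PADDSB mmx1, mmx2/mem64 | 0F EC /r | Adds packed byte signed integer values in an MMX register and another MMX register or 64-bit memory location and writes the result in the destination MMX register. |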



For each packed value in the destination, if the value is larger than the largest representable signed 8-bit integer, it is saturated to 7Fh, and if the value is smaller than the smallest signed 8-bit integer, it is saturated to 80h.
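Signed saturation can be modeled by computing each sum in a wider type and clamping before truncation. The sketch below uses hypothetical names, models an MMX register as a `uint64_t` with byte 0 in the low-order bits, and assumes `int` is wider than 8 bits (true of any conforming C implementation).

```c
#include <stdint.h>

/* Interpret the low 8 bits of x as a two's-complement signed byte. */
static int byte_as_signed(uint64_t x)
{
    int b = (int)(x & 0xFF);
    return (b & 0x80) ? b - 0x100 : b;
}

/* Hypothetical scalar model of PADDSB: eight saturating signed additions. */
uint64_t paddsb_model(uint64_t dst, uint64_t src)
{
    uint64_t result = 0;
    for (int i = 0; i < 8; i++) {
        /* The exact sum lies in [-256, 254], so int holds it without loss. */
        int sum = byte_as_signed(dst >> (8 * i)) + byte_as_signed(src >> (8 * i));
        if (sum > 127)  sum = 127;    /* saturates to 7Fh */
        if (sum < -128) sum = -128;   /* saturates to 80h */
        result |= (uint64_t)(uint8_t)sum << (8 * i);
    }
    return result;
}
```

In this model 7Fh + 01h stays at 7Fh and 80h + FFh stays at 80h, while sums that fit (for example FFh + FFh, that is -1 + -1) produce the exact result FEh.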

### **Related Instructions**

PADDB, PADDD, PADDQ, PADDSW, PADDUSB, PADDUSW, PADDW

#### rFLAGS Affected

None


| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                        |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                                 |
|                                           | Х    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h.             |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                             |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                                   |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                                      |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                                                         |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                              |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                                     |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                                         |

### **PADDSW**

### **Packed Add Signed with Saturation Words**

Adds each packed 16-bit signed integer value in the first source operand to the corresponding packed 16-bit signed integer in the second source operand and writes the signed integer result of each addition in the corresponding word of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

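| Mnemonic                | Opcode   | Description                                                                                                                                                           |
|-------------------------|----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PADDSW mmx1, mmx2/mem64 | 0F ED /r | Adds packed 16-bit signed integer values in an MMX register and another MMX register or 64-bit memory location and writes the result in the destination MMX register. |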



For each packed value in the destination, if the value is larger than the largest representable signed 16-bit integer, it is saturated to 7FFFh, and if the value is smaller than the smallest signed 16-bit integer, it is saturated to 8000h.
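The word-sized variant can be sketched the same way: compute each sum in a wider type, then clamp to the signed 16-bit range. The names are hypothetical, an MMX register is modeled as a `uint64_t` with word 0 in the low-order bits, and `int` is assumed to be at least 32 bits wide.

```c
#include <stdint.h>

/* Interpret the low 16 bits of x as a two's-complement signed word. */
static int word_as_signed(uint64_t x)
{
    int w = (int)(x & 0xFFFF);
    return (w & 0x8000) ? w - 0x10000 : w;
}

/* Hypothetical scalar model of PADDSW: four saturating signed additions. */
uint64_t paddsw_model(uint64_t dst, uint64_t src)
{
    uint64_t result = 0;
    for (int i = 0; i < 4; i++) {
        int sum = word_as_signed(dst >> (16 * i)) + word_as_signed(src >> (16 * i));
        if (sum > 32767)  sum = 32767;    /* saturates to 7FFFh */
        if (sum < -32768) sum = -32768;   /* saturates to 8000h */
        result |= (uint64_t)(uint16_t)sum << (16 * i);
    }
    return result;
}
```

Here 7FFFh + 0001h remains 7FFFh instead of wrapping to 8000h, which is the difference from the non-saturating PADDW.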

### **Related Instructions**

PADDB, PADDD, PADDQ, PADDSB, PADDUSB, PADDUSW, PADDW

### **rFLAGS Affected**

None


| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                           | Х    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

### **PADDUSB**

### **Packed Add Unsigned with Saturation Bytes**

Adds each packed 8-bit unsigned integer value in the first source operand to the corresponding packed 8-bit unsigned integer in the second source operand and writes the unsigned integer result of each addition in the corresponding byte of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

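| Mnemonic                 | Opcode   | Description                                                                                                                                                           |
|--------------------------|----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PADDUSB mmx1, mmx2/mem64 | 0F DC /r | Adds packed byte unsigned integer values in an MMX register and another MMX register or 64-bit memory location and writes the result in the destination MMX register. |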



For each packed value in the destination, if the value is larger than the largest unsigned 8-bit integer, it is saturated to FFh, and if the value is smaller than the smallest unsigned 8-bit integer, it is saturated to 00h.
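A scalar C model of the unsigned saturating add is sketched below. The function name is hypothetical and an MMX register is modeled as a `uint64_t` with byte 0 in the low-order bits. Because the sum of two unsigned values cannot be negative, only the upper clamp (FFh) can take effect in this model.

```c
#include <stdint.h>

/* Hypothetical scalar model of PADDUSB: eight saturating unsigned additions. */
uint64_t paddusb_model(uint64_t dst, uint64_t src)
{
    uint64_t result = 0;
    for (int i = 0; i < 8; i++) {
        unsigned a = (unsigned)((dst >> (8 * i)) & 0xFF);
        unsigned b = (unsigned)((src >> (8 * i)) & 0xFF);
        unsigned sum = a + b;          /* exact sum, at most 1FEh */
        if (sum > 0xFF) sum = 0xFF;    /* saturates to FFh */
        result |= (uint64_t)sum << (8 * i);
    }
    return result;
}
```

For example, 80h + 80h gives FFh in this model, where the non-saturating PADDB would give 00h.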

#### **Related Instructions**

PADDB, PADDD, PADDQ, PADDSB, PADDSW, PADDUSW, PADDW

### **rFLAGS Affected**

None


| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                           | Х    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

### **PADDUSW**

### **Packed Add Unsigned with Saturation Words**

Adds each packed 16-bit unsigned integer value in the first source operand to the corresponding packed 16-bit unsigned integer in the second source operand and writes the unsigned integer result of each addition in the corresponding word of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic                 | Opcode  | Description                                                                                                                                                         |
|--------------------------|---------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PADDUSW mmx1, mmx2/mem64 | 0F DD /r | Adds packed 16-bit unsigned integer values in an MMX register and another MMX register or 64-bit memory location and writes the result in the destination MMX register. |



For each packed value in the destination, if the value is larger than the largest unsigned 16-bit integer, it is saturated to FFFFh, and if the value is smaller than the smallest unsigned 16-bit integer, it is saturated to 0000h.
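The word-sized unsigned saturation can be sketched in C as follows. The function name is hypothetical, an MMX register is modeled as a `uint64_t` with word 0 in the low-order bits, and `unsigned long` is used for the intermediate sum because it is guaranteed to be at least 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical scalar model of PADDUSW: four saturating unsigned additions. */
uint64_t paddusw_model(uint64_t dst, uint64_t src)
{
    uint64_t result = 0;
    for (int i = 0; i < 4; i++) {
        unsigned long a = (unsigned long)((dst >> (16 * i)) & 0xFFFF);
        unsigned long b = (unsigned long)((src >> (16 * i)) & 0xFFFF);
        unsigned long sum = a + b;           /* exact sum, at most 1FFFEh */
        if (sum > 0xFFFF) sum = 0xFFFF;      /* saturates to FFFFh */
        result |= (uint64_t)sum << (16 * i);
    }
    return result;
}
```

In this model FFFFh + 0001h and 8000h + 8000h both produce FFFFh, while 0001h + 0002h produces the exact sum 0003h.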

#### **Related Instructions**

PADDB, PADDD, PADDQ, PADDSB, PADDSW, PADDUSB, PADDW

#### rFLAGS Affected

None


| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                              | X    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                    | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

### **PADDW**

### **Packed Add Words**

Adds each packed 16-bit integer value in the first source operand to the corresponding packed 16-bit integer in the second source operand and writes the integer result of each addition in the corresponding word of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

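| Mnemonic               | Opcode   | Description                                                                                                                                                    |
|------------------------|----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PADDW mmx1, mmx2/mem64 | 0F FD /r | Adds packed 16-bit integer values in an MMX register and another MMX register or 64-bit memory location and writes the result in the destination MMX register. |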



This instruction operates on both signed and unsigned integers. If the result overflows, the carry is ignored (neither the overflow nor carry bit in rFLAGS is set), and only the low-order 16 bits of each result are written in the destination.
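The statement that one instruction serves both signed and unsigned operands follows from two's-complement arithmetic: the low-order 16 bits of a sum are the same under either interpretation. A minimal C sketch, with a hypothetical function name and an MMX register modeled as a `uint64_t` with word 0 in the low-order bits:

```c
#include <stdint.h>

/* Hypothetical scalar model of PADDW: four independent 16-bit additions. */
uint64_t paddw_model(uint64_t dst, uint64_t src)
{
    uint64_t result = 0;
    for (int i = 0; i < 4; i++) {
        uint16_t a = (uint16_t)(dst >> (16 * i));
        uint16_t b = (uint16_t)(src >> (16 * i));
        /* Truncating to 16 bits discards the carry out of bit 15. */
        uint16_t sum = (uint16_t)(a + b);
        result |= (uint64_t)sum << (16 * i);
    }
    return result;
}
```

For instance, FFFFh + 0002h yields 0001h, which is correct both as 65535 + 2 modulo 2^16 and as -1 + 2. By contrast, 7FFFh + 0001h yields 8000h, which is the exact unsigned sum but a wrapped (overflowed) signed result.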

### **Related Instructions**

PADDB, PADDD, PADDQ, PADDSB, PADDSW, PADDUSB, PADDUSW

#### rFLAGS Affected

None


| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD | X | X | X | The emulate bit (EM) of CR0 was set to 1. |
| | X | X | X | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM | X | X | X | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |

### **PAND**

### **Packed Logical Bitwise AND**

Performs a bitwise logical AND of the values in the first and second source operands and writes the result in the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PAND mmx1, mmx2/mem64 | 0F DB /r | Performs bitwise logical AND of values in an MMX register and in another MMX register or 64-bit memory location and writes the result in the destination MMX register. |



#### **Related Instructions**

PANDN, POR, PXOR

#### rFLAGS Affected

None


| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD | X | X | X | The emulate bit (EM) of CR0 was set to 1. |
| | X | X | X | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM | X | X | X | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |

### **PANDN**

### **Packed Logical Bitwise AND NOT**

Performs a bitwise logical AND of the value in the second source operand and the one's complement of the value in the first source operand and writes the result in the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.
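
Because only the first source (destination) operand is complemented, the operand order matters. The following is a minimal scalar sketch of the operation described above; the function name `pandn_model` is hypothetical, and the 64-bit operands are assumed to be held in `uint64_t` values.

```c
#include <stdint.h>

/* Hypothetical scalar model of PANDN: the destination (first source)
   is complemented, then ANDed with the second source. */
static uint64_t pandn_model(uint64_t dst, uint64_t src)
{
    return ~dst & src;
}
```

With this model, `pandn_model(mask, value)` clears in `value` every bit that is set in `mask`, while swapping the arguments gives a different result.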

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PANDN mmx1, mmx2/mem64 | 0F DF /r | Performs bitwise logical AND NOT of values in an MMX register and in another MMX register or 64-bit memory location and writes the result in the destination MMX register. |



#### **Related Instructions**

PAND, POR, PXOR

#### rFLAGS Affected

None


| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD | X | X | X | The emulate bit (EM) of CR0 was set to 1. |
| | X | X | X | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM | X | X | X | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |

### **PAVGB**

### **Packed Average Unsigned Bytes**

Computes the rounded average of each packed unsigned 8-bit integer value in the first source operand and the corresponding packed 8-bit unsigned integer in the second source operand and writes each average in the corresponding byte of the destination (first source). The average is computed by adding each pair of operands, adding 1 to the 9-bit temporary sum, and then right-shifting the temporary sum by one bit position. The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.
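
The add, increment, and shift sequence can be modeled in scalar code. This is a minimal sketch under the assumption that byte 0 occupies bits 7-0 of a `uint64_t`; the function name `pavgb_model` is hypothetical and the loop only mirrors the arithmetic, not the hardware.

```c
#include <stdint.h>

/* Hypothetical scalar model of the 64-bit PAVGB: eight unsigned byte
   lanes, each averaged as (a + b + 1) >> 1. */
static uint64_t pavgb_model(uint64_t dst, uint64_t src)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 8; lane++) {
        unsigned a = (uint8_t)(dst >> (8 * lane));
        unsigned b = (uint8_t)(src >> (8 * lane));
        unsigned tmp = a + b + 1;            /* at most 511, so 9 bits suffice */
        result |= (uint64_t)(tmp >> 1) << (8 * lane);
    }
    return result;
}
```

Because 1 is added before the shift, an exact half rounds up: averaging 01h and 02h gives 02h, and averaging FFh and FEh gives FFh without overflow.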

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PAVGB mmx1, mmx2/mem64 | 0F E0 /r | Averages packed 8-bit unsigned integer values in an MMX register and another MMX register or 64-bit memory location and writes the result in the destination MMX register. |



#### **Related Instructions**

**PAVGW** 

#### rFLAGS Affected

None


| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD | X | X | X | The emulate bit (EM) of CR0 was set to 1. |
| | X | X | X | The SSE instructions are not supported, as indicated by bit 25 in CPUID standard function 1; and the AMD extensions to the MMX™ instruction set are not supported, as indicated by bit 22 of CPUID extended function 8000_0001h. |
| Device not available, #NM | X | X | X | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |

### **PAVGUSB**

### **Packed Average Unsigned Bytes**

Computes the rounded-up average of each packed unsigned 8-bit integer value in the first source operand and the corresponding packed 8-bit unsigned integer in the second source operand and writes each average in the corresponding byte of the destination (first source). The average is computed by adding each pair of operands, adding 1 to the 9-bit temporary sum, and then right-shifting the temporary sum by one bit position. The first source/destination operand is an MMX register. The second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PAVGUSB mmx1, mmx2/mem64 | 0F 0F /r BF | Averages packed 8-bit unsigned integer values in an MMX register and another MMX register or 64-bit memory location and writes the result in the destination MMX register. |



The PAVGUSB instruction performs a function identical to the 64-bit version of the PAVGB instruction, although the two instructions have different opcodes. PAVGUSB is a 3DNow! instruction. It is useful for pixel averaging in MPEG-2 motion compensation and video scaling operations.
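
The per-pixel arithmetic behind the pixel averaging mentioned above can be sketched in scalar form. This is an illustrative assumption about how such a row average might be written, with one byte per pixel; the helper name `average_rows` and its parameters are hypothetical, and a real implementation would process eight pixels per PAVGUSB instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rounded-up average of two pixel rows,
   one unsigned byte per pixel, written to out[0..n-1]. */
static void average_rows(uint8_t *out, const uint8_t *a,
                         const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);
}
```

Each group of eight consecutive pixels in this loop corresponds to the work of a single 64-bit PAVGUSB.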

#### **Related Instructions**

None

#### rFLAGS Affected

None

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD | X | X | X | The emulate bit (EM) of CR0 was set to 1. |
| | X | X | X | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM | X | X | X | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |

### **PAVGW**

### **Packed Average Unsigned Words**

Computes the rounded average of each packed unsigned 16-bit integer value in the first source operand and the corresponding packed 16-bit unsigned integer in the second source operand and writes each average in the corresponding word of the destination (first source). The average is computed by adding each pair of operands, adding 1 to the 17-bit temporary sum, and then right-shifting the temporary sum by one bit position. The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.
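
The need for a 17-bit temporary becomes visible in a scalar model: two 16-bit values plus 1 can reach 1FFFFh, which does not fit in 16 bits. The sketch below assumes word 0 occupies bits 15-0 of a `uint64_t`; the function name `pavgw_model` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar model of the 64-bit PAVGW: four unsigned word
   lanes, each averaged as (a + b + 1) >> 1 in a wider temporary. */
static uint64_t pavgw_model(uint64_t dst, uint64_t src)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t a = (uint16_t)(dst >> (16 * lane));
        uint32_t b = (uint16_t)(src >> (16 * lane));
        uint32_t tmp = a + b + 1;            /* at most 1FFFFh, so 17 bits suffice */
        result |= (uint64_t)(tmp >> 1) << (16 * lane);
    }
    return result;
}
```

Computing the sum in a 16-bit variable instead would lose the top bit, so averaging FFFFh with FFFFh would incorrectly give 7FFFh rather than FFFFh.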

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PAVGW mmx1, mmx2/mem64 | 0F E3 /r | Averages packed 16-bit unsigned integer values in an MMX register and another MMX register or 64-bit memory location and writes the result in the destination MMX register. |



#### **Related Instructions**

**PAVGB** 

#### rFLAGS Affected

None


| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD | X | X | X | The emulate bit (EM) of CR0 was set to 1. |
| | X | X | X | The SSE instructions are not supported, as indicated by bit 25 in CPUID standard function 1; and the AMD extensions to the MMX™ instruction set are not supported, as indicated by bit 22 of CPUID extended function 8000_0001h. |
| Device not available, #NM | X | X | X | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |


### **PCMPEQB**

### **Packed Compare Equal Bytes**

Compares corresponding packed bytes in the first and second source operands and writes the result of each compare in the corresponding byte of the destination (first source). For each pair of bytes, if the values are equal, the result is all 1s. If the values are not equal, the result is all 0s. The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.
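
The all-1s or all-0s result per byte can be modeled in scalar code. This is a minimal sketch assuming byte 0 occupies bits 7-0 of a `uint64_t`; the function name `pcmpeqb_model` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar model of PCMPEQB: each byte lane of the result
   is FFh where the operands match and 00h where they differ. */
static uint64_t pcmpeqb_model(uint64_t dst, uint64_t src)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 8; lane++) {
        uint8_t a = (uint8_t)(dst >> (8 * lane));
        uint8_t b = (uint8_t)(src >> (8 * lane));
        if (a == b)
            result |= (uint64_t)0xFF << (8 * lane);
    }
    return result;
}
```

Comparing a value with itself in this model produces all 1s in every lane, which is a common way to materialize a constant of all 1s without a memory load.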

| Mnemonic                 | Opcode          | Description                                                                         |
|--------------------------|-----------------|-------------------------------------------------------------------------------------|
| PCMPEQB mmx1, mmx2/mem64 | 0F 74 /r        | Compares packed bytes in an MMX register and an MMX register or 64-bit memory location. |



#### **Related Instructions**

PCMPEQD, PCMPEQW, PCMPGTB, PCMPGTD, PCMPGTW

#### rFLAGS Affected

None

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD | X | X | X | The emulate bit (EM) of CR0 was set to 1. |
| | X | X | X | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM | X | X | X | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |

### **PCMPEQD**

### **Packed Compare Equal Doublewords**

Compares corresponding packed 32-bit values in the first and second source operands and writes the result of each compare in the corresponding 32 bits of the destination (first source). For each pair of doublewords, if the values are equal, the result is all 1s. If the values are not equal, the result is all 0s. The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.
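
At doubleword granularity, a single differing bit clears the entire 32-bit lane of the result. The following scalar sketch illustrates this, assuming doubleword 0 occupies bits 31-0 of a `uint64_t`; the function name `pcmpeqd_model` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar model of PCMPEQD: two 32-bit lanes, each set to
   FFFFFFFFh on equality and 00000000h otherwise. */
static uint64_t pcmpeqd_model(uint64_t dst, uint64_t src)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 2; lane++) {
        uint32_t a = (uint32_t)(dst >> (32 * lane));
        uint32_t b = (uint32_t)(src >> (32 * lane));
        if (a == b)
            result |= (uint64_t)0xFFFFFFFFu << (32 * lane);
    }
    return result;
}
```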

| Mnemonic                 | Opcode          | Description                                                                                   |
|--------------------------|-----------------|-----------------------------------------------------------------------------------------------|
| PCMPEQD mmx1, mmx2/mem64 | 0F 76 /r        | Compares packed doublewords in an MMX register and an MMX register or 64-bit memory location. |



#### **Related Instructions**

PCMPEQB, PCMPEQW, PCMPGTB, PCMPGTD, PCMPGTW

#### rFLAGS Affected

None

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD | X | X | X | The emulate bit (EM) of CR0 was set to 1. |
| | X | X | X | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM | X | X | X | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |

### **PCMPEQW**

### **Packed Compare Equal Words**

Compares corresponding packed 16-bit values in the first and second source operands and writes the result of each compare in the corresponding 16 bits of the destination (first source). For each pair of words, if the values are equal, the result is all 1s. If the values are not equal, the result is all 0s. The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.
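
Because each result lane is either all 1s or all 0s, the result can serve as a bit mask for a branch-free select. The sketch below models the compare and then shows one such use; the select pattern built from AND, AND NOT, and OR is a common idiom and an assumption of this example, not something this instruction description prescribes. The names `pcmpeqw_model` and `replace_words` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar model of PCMPEQW: four 16-bit lanes, each set to
   FFFFh on equality and 0000h otherwise. */
static uint64_t pcmpeqw_model(uint64_t dst, uint64_t src)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint16_t a = (uint16_t)(dst >> (16 * lane));
        uint16_t b = (uint16_t)(src >> (16 * lane));
        if (a == b)
            result |= (uint64_t)0xFFFF << (16 * lane);
    }
    return result;
}

/* Hypothetical use of the mask: replace every word equal to key with repl. */
static uint64_t replace_words(uint64_t data, uint16_t key, uint16_t repl)
{
    uint64_t keys  = 0x0001000100010001ULL * key;   /* broadcast to 4 lanes */
    uint64_t repls = 0x0001000100010001ULL * repl;
    uint64_t mask  = pcmpeqw_model(data, keys);
    return (mask & repls) | (~mask & data);         /* AND, AND NOT, OR */
}
```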

| Mnemonic                 | Opcode          | Description                                                                                     |
|--------------------------|-----------------|-------------------------------------------------------------------------------------------------|
| PCMPEQW mmx1, mmx2/mem64 | 0F 75 /r        | Compares packed 16-bit values in an MMX register and an MMX register or 64-bit memory location. |



#### **Related Instructions**

PCMPEQB, PCMPEQD, PCMPGTB, PCMPGTD, PCMPGTW

#### rFLAGS Affected

None

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD | X | X | X | The emulate bit (EM) of CR0 was set to 1. |
| | X | X | X | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM | X | X | X | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |

### **PCMPGTB**

### **Packed Compare Greater Than Signed Bytes**

Compares corresponding packed signed bytes in the first and second source operands and writes the result of each compare in the corresponding byte of the destination (first source). For each pair of bytes, if the value in the first source operand is greater than the value in the second source operand, the result is all 1s. If the value in the first source operand is less than or equal to the value in the second source operand, the result is all 0s. The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.
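
Since the comparison is signed, a byte such as 80h (-128) is not greater than 7Fh (+127), even though it is larger when read as unsigned. The scalar sketch below illustrates this, assuming byte 0 occupies bits 7-0 of a `uint64_t`; the function name `pcmpgtb_model` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar model of PCMPGTB: eight signed byte lanes, each
   set to FFh where dst > src and 00h otherwise. */
static uint64_t pcmpgtb_model(uint64_t dst, uint64_t src)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 8; lane++) {
        unsigned ua = (uint8_t)(dst >> (8 * lane));
        unsigned ub = (uint8_t)(src >> (8 * lane));
        /* Portable sign extension of an 8-bit two's-complement value. */
        int a = (int)(ua ^ 0x80u) - 0x80;
        int b = (int)(ub ^ 0x80u) - 0x80;
        if (a > b)
            result |= (uint64_t)0xFF << (8 * lane);
    }
    return result;
}
```

Equal lanes produce 00h, because the condition is strictly greater than.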

| Mnemonic                 | Opcode  | Description                                                                                    |
|--------------------------|---------|------------------------------------------------------------------------------------------------|
| PCMPGTB mmx1, mmx2/mem64 | 0F 64/r | Compares packed signed bytes in an MMX register and an MMX register or 64-bit memory location. |
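
The per-byte behavior described above can be sketched as a scalar C model. This is an illustrative sketch, not AMD reference code: the function name `pcmpgtb_model` and the use of a `uint64_t` as an MMX register image are assumptions made for the example.

```c
#include <stdint.h>

/* Scalar model of PCMPGTB (hypothetical helper, not an AMD-defined API).
 * dst and src are 64-bit images of the two source operands; the return
 * value is what the destination register would hold afterwards.
 * The uint8_t -> int8_t conversion assumes the usual two's-complement
 * wraparound, which every mainstream compiler provides. */
uint64_t pcmpgtb_model(uint64_t dst, uint64_t src)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 8; lane++) {
        int8_t a = (int8_t)(uint8_t)(dst >> (8 * lane));
        int8_t b = (int8_t)(uint8_t)(src >> (8 * lane));
        if (a > b)                              /* signed comparison */
            result |= (uint64_t)0xFF << (8 * lane);
    }
    return result;
}
```

Because the comparison is signed, a byte such as 80h (-128) compares as less than 00h, so that lane produces all 0s.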



#### **Related Instructions**

PCMPEQB, PCMPEQD, PCMPEQW, PCMPGTD, PCMPGTW

#### rFLAGS Affected

None


| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                              | X    | х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                    | Χ    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

## **PCMPGTD**

# Packed Compare Greater Than Signed Doublewords

Compares corresponding packed signed 32-bit values in the first and second source operands and writes the result of each compare in the corresponding 32 bits of the destination (first source). For each pair of doublewords, if the value in the first source operand is greater than the value in the second source operand, the result is all 1s. If the value in the first source operand is less than or equal to the value in the second source operand, the result is all 0s. The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

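| Mnemonic                 | Opcode  | Description                                                                                            |
|--------------------------|---------|--------------------------------------------------------------------------------------------------------|
| PCMPGTD mmx1, mmx2/mem64 | 0F 66/r | Compares packed signed 32-bit values in an MMX register and an MMX register or 64-bit memory location. |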
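
Because each result lane is either all 1s or all 0s, the output of PCMPGTD is commonly used as a bit mask for branch-free selection. The sketch below models the instruction in scalar C and then builds a per-lane signed maximum from the mask, mirroring a PCMPGTD / PAND / PANDN / POR sequence. The names `pcmpgtd_model` and `packed_max_s32` are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* One 32-bit lane: all 1s if a > b as signed values, otherwise all 0s.
 * The uint32_t -> int32_t conversion assumes two's-complement wraparound. */
static uint32_t cmpgt_lane(uint32_t a, uint32_t b)
{
    return ((int32_t)a > (int32_t)b) ? 0xFFFFFFFFu : 0u;
}

/* Scalar model of PCMPGTD on 64-bit MMX register images. */
uint64_t pcmpgtd_model(uint64_t dst, uint64_t src)
{
    uint32_t lo = cmpgt_lane((uint32_t)dst, (uint32_t)src);
    uint32_t hi = cmpgt_lane((uint32_t)(dst >> 32), (uint32_t)(src >> 32));
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical use of the mask: keep a where a > b, otherwise keep b. */
uint64_t packed_max_s32(uint64_t a, uint64_t b)
{
    uint64_t mask = pcmpgtd_model(a, b);
    return (a & mask) | (b & ~mask);
}
```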



#### **Related Instructions**

PCMPEQB, PCMPEQD, PCMPEQW, PCMPGTB, PCMPGTW

#### rFLAGS Affected

None


| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                              | X    | х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                    | Χ    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

## **PCMPGTW**

# **Packed Compare Greater Than Signed Words**

Compares corresponding packed signed 16-bit values in the first and second source operands and writes the result of each compare in the corresponding 16 bits of the destination (first source). For each pair of words, if the value in the first source operand is greater than the value in the second source operand, the result is all 1s. If the value in the first source operand is less than or equal to the value in the second source operand, the result is all 0s. The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

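| Mnemonic                 | Opcode  | Description                                                                                            |
|--------------------------|---------|--------------------------------------------------------------------------------------------------------|
| PCMPGTW mmx1, mmx2/mem64 | 0F 65/r | Compares packed signed 16-bit values in an MMX register and an MMX register or 64-bit memory location. |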



#### **Related Instructions**

PCMPEQB, PCMPEQD, PCMPEQW, PCMPGTB, PCMPGTD

#### rFLAGS Affected

None


| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                              | X    | х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                    | Χ    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

## **PEXTRW**

## **Extract Packed Word**

Extracts a 16-bit value from an MMX register, as selected by the immediate byte operand (as shown in Table 1-1) and writes it to the low-order word of a 32-bit general-purpose register, with zero-extension to 32 bits.

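| Mnemonic                | Opcode      | Description                                                                                                 |
|-------------------------|-------------|-------------------------------------------------------------------------------------------------------------|
| PEXTRW reg32, mmx, imm8 | 0F C5/r ib  | Extracts a 16-bit value from an MMX register and writes it to low-order 16 bits of a general-purpose register. |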



**Table 1-1. Immediate-Byte Operand Encoding for 64-Bit PEXTRW** 

| Immediate-Byte<br>Bit Field | Value of Bit Field | Source Bits Extracted |
|-----------------------------|--------------------|-----------------------|
| 1–0                         | 0                  | 15–0                  |
|                             | 1                  | 31–16                 |
|                             | 2                  | 47–32                 |
|                             | 3                  | 63–48                 |
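
The selection in Table 1-1 amounts to a shift and a mask. The following C sketch illustrates this; the name `pextrw_model` is hypothetical, and the sketch assumes that immediate bits 7–2 are ignored, which the table implies but does not state.

```c
#include <stdint.h>

/* Scalar model of the 64-bit form of PEXTRW (hypothetical helper).
 * mmx is a 64-bit image of the source register. The returned value is
 * the 32-bit destination: the selected word in bits 15-0, zeros above. */
uint32_t pextrw_model(uint64_t mmx, uint8_t imm8)
{
    unsigned sel = imm8 & 0x3u;   /* assumption: only bits 1-0 select */
    return (uint32_t)((mmx >> (16 * sel)) & 0xFFFFu);
}
```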

#### **Related Instructions**

**PINSRW** 


# rFLAGS Affected

None

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                                                                                                               |
|-------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                                                                                                                        |
|                                           | X    | Х               | Х         | The SSE instructions are not supported, as indicated by bit 25 in CPUID standard function 1; and the AMD extensions to the MMX™ instruction set are not supported, as indicated by bit 22 of CPUID extended function 8000_0001h. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                                                                                                                    |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                                                                                                                            |

# PF2ID

# Packed Floating-Point to Integer Doubleword Conversion

Converts two packed single-precision floating-point values in an MMX register or a 64-bit memory location to two packed 32-bit signed integer values and writes the converted values in another MMX register. If the result of the conversion is an inexact value, the value is truncated (rounded toward zero). The numeric range for source and destination operands is shown in Table 1-2.

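| Mnemonic               | Opcode      | Description                                                                                                                                                 |
|------------------------|-------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PF2ID mmx1, mmx2/mem64 | 0F 0F /r 1D | Converts packed single-precision floating-point values in an MMX register or memory location to a doubleword integer value in the destination MMX register. |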



Table 1-2. Numeric Range for PF2ID Results

| Source 2                                  | Source 1 and Destination |
|-------------------------------------------|--------------------------|
| 0                                         | 0                        |
| Normal, abs(Source 2) < 1                 | 0                        |
| Normal, −2 <sup>31</sup> < Source 2 <= −1 | Round to zero (Source 2) |
| Normal, 1 <= Source 2 < 2 <sup>31</sup>   | Round to zero (Source 2) |
| Normal, Source 2 >= 2 <sup>31</sup>       | 7FFF_FFFFh               |
| Normal, Source 2 <= −2 <sup>31</sup>      | 8000_0000h               |
| Unsupported                               | Undefined                |
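
The rows of Table 1-2 can be expressed as a short C sketch operating on raw single-precision bit patterns. The names `pf2id_lane` and `pf2id_model` are hypothetical. The table leaves the result for unsupported operands undefined, so the value the sketch returns in that case is an arbitrary choice made only to keep the C code well defined.

```c
#include <stdint.h>
#include <string.h>

/* One lane of PF2ID following Table 1-2 (illustrative sketch only). */
static uint32_t pf2id_lane(uint32_t bits)
{
    /* "Unsupported" (exponent all 1s): undefined in the table.
     * Assumption for this sketch: saturate according to the sign. */
    if (((bits >> 23) & 0xFFu) == 0xFFu)
        return (bits & 0x80000000u) ? 0x80000000u : 0x7FFFFFFFu;

    float f;
    memcpy(&f, &bits, sizeof f);                 /* reinterpret the bits */
    if (f >= 2147483648.0f)  return 0x7FFFFFFFu; /* Source 2 >= 2^31  */
    if (f <= -2147483648.0f) return 0x80000000u; /* Source 2 <= -2^31 */
    return (uint32_t)(int32_t)f;  /* C conversion truncates toward zero */
}

/* Both lanes of a 64-bit source operand image. */
uint64_t pf2id_model(uint64_t src)
{
    uint32_t lo = pf2id_lane((uint32_t)src);
    uint32_t hi = pf2id_lane((uint32_t)(src >> 32));
    return ((uint64_t)hi << 32) | lo;
}
```

For example, the pattern C030_0000h (-2.75) converts to FFFF_FFFEh (-2) rather than -3, because the conversion truncates.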


# **Related Instructions**

PF2IW, PI2FD, PI2FW

## rFLAGS Affected

None

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                            |
|----------------------------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                     |
|                                              | Х    | Х               | Х         | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM                    | Χ    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                 |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                          |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                         |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                             |

# PF2IW

# Packed Floating-Point to Integer Word Conversion

Converts two packed single-precision floating-point values in an MMX register or a 64-bit memory location to two packed 16-bit signed integer values, sign-extended to 32 bits, and writes the converted values in another MMX register. If the result of the conversion is an inexact value, the value is truncated (rounded toward zero). The numeric range for source and destination operands is shown in Table 1-3 on page 93. Arguments outside the range representable by signed 16-bit integers are saturated to the largest and smallest 16-bit integer, depending on their sign.

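| Mnemonic               | Opcode      | Description                                                                                                                                          |
|------------------------|-------------|------------------------------------------------------------------------------------------------------------------------------------------------------|
| PF2IW mmx1, mmx2/mem64 | 0F 0F /r 1C | Converts packed single-precision floating-point values in an MMX register or memory location to word integer values in the destination MMX register. |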




Table 1-3. Numeric Range for PF2IW Results

| Source 2                                  | Source 1 and Destination |
|-------------------------------------------|--------------------------|
| 0                                         | 0                        |
| Normal, abs(Source 2) < 1                 | 0                        |
| Normal, -2 <sup>15</sup> < Source 2 <= -1 | Round to zero (Source 2) |
| Normal, 1 <= Source 2 < 2 <sup>15</sup>   | Round to zero (Source 2) |
| Normal, Source 2 >= 2 <sup>15</sup>       | 0000_7FFFh               |
| Normal, Source 2 <= −2 <sup>15</sup>      | FFFF_8000h               |
| Unsupported                               | Undefined                |
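
The saturation and sign extension in Table 1-3 can be sketched in C as follows. The names `pf2iw_lane` and `pf2iw_model` are hypothetical, and the value returned for unsupported operands is an arbitrary choice, since the table leaves that case undefined.

```c
#include <stdint.h>
#include <string.h>

/* One lane of PF2IW following Table 1-3 (illustrative sketch only).
 * The result is a 16-bit signed integer sign-extended to 32 bits. */
static uint32_t pf2iw_lane(uint32_t bits)
{
    /* "Unsupported" (exponent all 1s): undefined in the table.
     * Assumption for this sketch: saturate according to the sign. */
    if (((bits >> 23) & 0xFFu) == 0xFFu)
        return (bits & 0x80000000u) ? 0xFFFF8000u : 0x00007FFFu;

    float f;
    memcpy(&f, &bits, sizeof f);
    if (f >= 32768.0f)  return 0x00007FFFu;   /* Source 2 >= 2^15  */
    if (f <= -32768.0f) return 0xFFFF8000u;   /* Source 2 <= -2^15 */
    /* Truncate toward zero; the int32_t value is already sign-extended. */
    return (uint32_t)(int32_t)f;
}

/* Both lanes of a 64-bit source operand image. */
uint64_t pf2iw_model(uint64_t src)
{
    uint32_t lo = pf2iw_lane((uint32_t)src);
    uint32_t hi = pf2iw_lane((uint32_t)(src >> 32));
    return ((uint64_t)hi << 32) | lo;
}
```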

## **Related Instructions**

PF2ID, PI2FD, PI2FW

## rFLAGS Affected

None

| Exception                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                             |
|---------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD       | X    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                      |
|                           | Х    | Х               | Х         | The AMD extensions to 3DNow!™ are not supported, as indicated by bit 30 in CPUID extended function 8000_0001h. |
| Device not available, #NM | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                  |
| Stack, #SS                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                        |
| General protection, #GP   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                           |
|                           |      |                 | X         | A null data segment was used to reference memory.                                                              |
| Page fault, #PF           |      | X               | X         | A page fault resulted from the execution of the instruction.                                                   |
| x87 floating-point exception pending, #MF | X | X   | X         | An unmasked x87 floating-point exception was pending.                                                          |
| Alignment check, #AC      |      | X               | X         | An unaligned memory reference was performed while alignment checking was enabled.                              |


## **PFACC**

# **Packed Floating-Point Accumulate**

Adds the two single-precision floating-point values in the first source operand and adds the two single-precision values in the second source operand and writes the two results to the low-order and high-order doubleword, respectively, of the destination (first source). The first source/destination operand is an MMX register. The second source operand is another MMX register or 64-bit memory location.

| Mnemonic               | Opcode      | Description                                                                                                                                                                             |
|------------------------|-------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PFACC mmx1, mmx2/mem64 | 0F 0F /r AE | Accumulates packed single-precision floating-point values in an MMX register or 64-bit memory location and another MMX register and writes each result in the destination MMX register. |



The numeric range for operands is shown in Table 1-4 on page 96.


**Table 1-4.** Numeric Range for PFACC Results

| Source Operand           |                          | High Operand<sup>2</sup> |                           |              |
|--------------------------|--------------------------|--------------------------|---------------------------|--------------|
|                          |                          | 0                        | Normal                    | Unsupported  |
| Low Operand<sup>1</sup>  | 0                        | +/- 0<sup>3</sup>        | High Operand              | High Operand |
|                          | Normal                   | Low Operand              | Normal, +/- 0<sup>4</sup> | Undefined    |
|                          | Unsupported<sup>5</sup>  | Low Operand              | Undefined                 | Undefined    |

- 1. Least-significant floating-point value in first or second source operand.
- 2. Most-significant floating-point value in first or second source operand.
- 3. The sign of the result is the logical AND of the signs of the low and high operands.
- 4. If the absolute value of the infinitely precise result is less than 2<sup>-126</sup> (but not zero), the result is a zero with the sign of the operand (low or high) that is larger in magnitude. If the infinitely precise result is exactly zero, the result is zero with the sign of the low operand. If the absolute value of the infinitely precise result is greater than or equal to 2<sup>128</sup>, the result is the largest normal number with the sign of the low operand.
- 5. "Unsupported" means that the exponent is all ones (1s).
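
The data movement performed by PFACC (a horizontal add within each source operand) can be sketched in C as shown below. The names `bits_to_float`, `float_to_bits`, and `pfacc_model` are hypothetical. The sketch uses ordinary C `float` addition, so it models only the case where both operands and the result are normal; the zero-sign, underflow, overflow, and unsupported-operand rules in notes 3 through 5 are deliberately not modeled.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float and back (no value conversion). */
static float bits_to_float(uint32_t b) { float f; memcpy(&f, &b, sizeof f); return f; }
static uint32_t float_to_bits(float f) { uint32_t b; memcpy(&b, &f, sizeof b); return b; }

/* Scalar sketch of PFACC on 64-bit MMX register images.
 *   result[31:0]  = dst[31:0] + dst[63:32]   (sum of first source)
 *   result[63:32] = src[31:0] + src[63:32]   (sum of second source)
 * Assumption: normal operands and results only; see the notes to
 * Table 1-4 for the cases this sketch does not cover. */
uint64_t pfacc_model(uint64_t dst, uint64_t src)
{
    float lo = bits_to_float((uint32_t)dst) + bits_to_float((uint32_t)(dst >> 32));
    float hi = bits_to_float((uint32_t)src) + bits_to_float((uint32_t)(src >> 32));
    return ((uint64_t)float_to_bits(hi) << 32) | float_to_bits(lo);
}
```

With a first source holding (2.0, 1.0) and a second source holding (4.0, 3.0), high lane listed first, the destination receives (7.0, 3.0).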

#### **Related Instructions**

PFADD, PFNACC, PFPNACC

## **rFLAGS Affected**

None

## **Exceptions**

| Exception                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                         |
|---------------------------|------|-----------------|-----------|------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD       | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                  |
|                           | X    | X               | X         | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM | Χ    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                              |
| Stack, #SS                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                    |
| General protection, #GP   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                       |
|                           |      |                 | X         | A null data segment was used to reference memory.                                                          |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                      |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                             |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled. |

## **PFADD**

# **Packed Floating-Point Add**

Adds each packed single-precision floating-point value in the first source operand to the corresponding packed single-precision floating-point value in the second operand and writes the result of each addition in the corresponding doubleword of the destination (first source). The first source/destination operand is an MMX register. The second source operand is another MMX register or 64-bit memory location. The numeric range for operands is shown in Table 1-5 on page 99.

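| Mnemonic               | Opcode      | Description                                                                                                                                                                         |
|------------------------|-------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PFADD mmx1, mmx2/mem64 | 0F 0F /r 9E | Adds two packed single-precision floating-point values in an MMX register or 64-bit memory location and another MMX register and writes each result in the destination MMX register. |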




**Table 1-5.** Numeric Range for the PFADD Results

| Source Operand              |                          | Most-Significant Doubleword |                           |             |
|-----------------------------|--------------------------|-----------------------------|---------------------------|-------------|
|                             |                          | 0                           | Normal                    | Unsupported |
| Source 1 and<br>Destination | 0                        | +/- 0<sup>1</sup>           | Source 2                  | Source 2    |
|                             | Normal                   | Source 1                    | Normal, +/- 0<sup>2</sup> | Undefined   |
|                             | Unsupported<sup>3</sup>  | Source 1                    | Undefined                 | Undefined   |

- 1. The sign of the result is the logical AND of the signs of the source operands.
- 2. If the absolute value of the infinitely precise result is less than 2<sup>-126</sup> (but not zero), the result is a zero with the sign of the source operand that is larger in magnitude. If the infinitely precise result is exactly zero, the result is zero with the sign of source 1. If the absolute value of the infinitely precise result is greater than or equal to 2<sup>128</sup>, the result is the largest normal number with the sign of source 1.
- 3. "Unsupported" means that the exponent is all ones (1s).
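
The rules in Table 1-5 and its notes differ from IEEE 754 addition: small results become zero instead of denormals, and large results clamp to the largest normal number instead of infinity. The C sketch below is one possible reading of those rules, not a verified model of any processor. The names `pfadd_lane` and `pfadd_model` are hypothetical; the value returned for the "Undefined" table cells is an arbitrary choice; rounding of in-range results follows the C implementation; and denormal inputs, which the table does not list, simply fall through the general path.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

#define SIGN_BIT   0x80000000u
#define MAX_NORMAL 0x7F7FFFFFu          /* bit pattern of FLT_MAX */

static int is_zero(uint32_t b)        { return (b & ~SIGN_BIT) == 0; }
static int is_unsupported(uint32_t b) { return ((b >> 23) & 0xFFu) == 0xFFu; }

/* One lane of PFADD, sketched from Table 1-5 and notes 1-3. */
static uint32_t pfadd_lane(uint32_t s1, uint32_t s2)
{
    if (is_zero(s1) && is_zero(s2))
        return s1 & s2 & SIGN_BIT;       /* note 1: AND of the signs */
    if (is_zero(s2)) return s1;          /* table cell: Source 1 */
    if (is_zero(s1)) return s2;          /* table cell: Source 2 */
    if (is_unsupported(s1) || is_unsupported(s2))
        return s1;                       /* Undefined: arbitrary here */

    float a, b;
    memcpy(&a, &s1, sizeof a);
    memcpy(&b, &s2, sizeof b);
    double sum = (double)a + (double)b;  /* wide intermediate result */
    double mag = sum < 0 ? -sum : sum;

    if (sum == 0.0)
        return s1 & SIGN_BIT;            /* exact zero: sign of source 1 */
    if (mag < (double)FLT_MIN)           /* below 2^-126: signed zero;  */
        return sum < 0 ? SIGN_BIT : 0u;  /* sum has the larger operand's sign */
    if (mag > (double)FLT_MAX)           /* covers the >= 2^128 case */
        return (s1 & SIGN_BIT) | MAX_NORMAL;

    float r = (float)sum;
    uint32_t bits;
    memcpy(&bits, &r, sizeof bits);
    return bits;
}

/* Both lanes of the 64-bit operand images. */
uint64_t pfadd_model(uint64_t dst, uint64_t src)
{
    uint32_t lo = pfadd_lane((uint32_t)dst, (uint32_t)src);
    uint32_t hi = pfadd_lane((uint32_t)(dst >> 32), (uint32_t)(src >> 32));
    return ((uint64_t)hi << 32) | lo;
}
```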

#### **Related Instructions**

PFACC, PFNACC, PFPNACC

#### rFLAGS Affected

None

| Exception                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                            |
|---------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD       | X    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                     |
|                           | X    | Х               | X         | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                 |
| Stack, #SS                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                       |
| General protection, #GP   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                          |
|                           |      |                 | X         | A null data segment was used to reference memory.                                                             |
| Page fault, #PF                              |      | Χ               | Х         | A page fault resulted from the execution of the instruction.                      |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                             |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled. |


# **PFCMPEQ**

# **Packed Floating-Point Compare Equal**

Compares each of the two packed single-precision floating-point values in the first source operand with the corresponding packed single-precision floating-point value in the second source operand and writes the result of each comparison in the corresponding doubleword of the destination (first source). For each pair of floating-point values, if the values are equal, the result is all 1s. If the values are not equal, the result is all 0s. The first source/destination operand is an MMX register. The second source operand is another MMX register or 64-bit memory location. The numeric range for operands is shown in Table 1-6 on page 102.

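| Mnemonic                 | Opcode      | Description                                                                                                                           |
|--------------------------|-------------|---------------------------------------------------------------------------------------------------------------------------------------|
| PFCMPEQ mmx1, mmx2/mem64 | 0F 0F /r B0 | Compares two pairs of packed single-precision floating-point values in an MMX register and an MMX register or 64-bit memory location. |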



Table 1-6. Numeric Range for the PFCMPEQ Instruction

| Operand                     | Value                    | Source 2                |                                          |             |
|-----------------------------|--------------------------|-------------------------|------------------------------------------|-------------|
|                             |                          | 0                       | Normal                                   | Unsupported |
| Source 1 and<br>Destination | 0                        | FFFF_FFFFh<sup>1</sup>  | 0000_0000h                               | 0000_0000h  |
|                             | Normal                   | 0000_0000h              | 0000_0000h or<br>FFFF_FFFFh<sup>2</sup>  | 0000_0000h  |
|                             | Unsupported<sup>3</sup>  | 0000_0000h              | 0000_0000h                               | Undefined   |

- 1. Positive zero is equal to negative zero.
- 2. The result is FFFF\_FFFFh if source 1 and source 2 have identical signs, exponents, and mantissas. Otherwise, the result is 0000\_0000h.
- 3. "Unsupported" means that the exponent is all ones (1s).
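
Table 1-6 and its notes can be sketched in C using only integer operations on the raw bit patterns. The names `pfcmpeq_lane` and `pfcmpeq_model` are hypothetical. For the one "Undefined" cell (both operands unsupported) the sketch arbitrarily returns all 0s, and denormal operands, which the table does not list, are compared bit for bit.

```c
#include <stdint.h>

static int is_zero(uint32_t b)        { return (b & 0x7FFFFFFFu) == 0; }
static int is_unsupported(uint32_t b) { return ((b >> 23) & 0xFFu) == 0xFFu; }

/* One lane of PFCMPEQ, sketched from Table 1-6. */
static uint32_t pfcmpeq_lane(uint32_t s1, uint32_t s2)
{
    if (is_zero(s1) || is_zero(s2))      /* note 1: +0 equals -0 */
        return (is_zero(s1) && is_zero(s2)) ? 0xFFFFFFFFu : 0u;
    if (is_unsupported(s1) || is_unsupported(s2))
        return 0u;   /* both unsupported is Undefined: arbitrary here */
    /* note 2: equal only if sign, exponent, and mantissa all match */
    return (s1 == s2) ? 0xFFFFFFFFu : 0u;
}

/* Both lanes of the 64-bit operand images. */
uint64_t pfcmpeq_model(uint64_t dst, uint64_t src)
{
    uint32_t lo = pfcmpeq_lane((uint32_t)dst, (uint32_t)src);
    uint32_t hi = pfcmpeq_lane((uint32_t)(dst >> 32), (uint32_t)(src >> 32));
    return ((uint64_t)hi << 32) | lo;
}
```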

## **Related Instructions**

PFCMPGE, PFCMPGT

#### rFLAGS Affected

None

## **Exceptions**

| Exception | Real | Virtual<br>8086 | Protected | Cause of Exception |
|---|---|---|---|---|
| Invalid opcode, #UD | X | X | X | The emulate bit (EM) of CR0 was set to 1. |
| | X | X | X | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM | X | X | X | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |

# **PFCMPGE**

# Packed Floating-Point Compare Greater or Equal

Compares each of the two packed single-precision floating-point values in the first source operand with the corresponding packed single-precision floating-point value in the second source operand and writes the result of each comparison in the corresponding doubleword of the destination (first source). For each pair of floating-point values, if the value in the first source operand is greater than or equal to the value in the second source operand, the result is all 1s. If the value in the first source operand is less than the value in the second source operand, the result is all 0s. The first source/destination operand is an MMX register. The second source operand is another MMX register or 64-bit memory location. The numeric range for operands is shown in Table 1-7 on page 105.

| Mnemonic | Opcode | Description |
|---|---|---|
| PFCMPGE mmx1, mmx2/mem64 | 0F 0F /r 90 | Compares two pairs of packed single-precision floating-point values in an MMX register and an MMX register or 64-bit memory location. |



Table 1-7. Numeric Range for the PFCMPGE Instruction

| Source 1 and Destination | Source 2: 0 | Source 2: Normal | Source 2: Unsupported |
|---|---|---|---|
| 0 | FFFF_FFFFh<sup>1</sup> | 0000_0000h, FFFF_FFFFh<sup>2</sup> | Undefined |
| Normal | 0000_0000h, FFFF_FFFFh<sup>3</sup> | 0000_0000h, FFFF_FFFFh<sup>4</sup> | Undefined |
| Unsupported<sup>5</sup> | Undefined | Undefined | Undefined |

- 1. Positive zero is equal to negative zero.
- 2. The result is FFFF\_FFFFh, if source 2 is negative. Otherwise, the result is 0000\_0000h.
- 3. The result is FFFF\_FFFFh, if source 1 is positive. Otherwise, the result is 0000\_0000h.
- 4. The result is FFFF\_FFFFh, if source 1 is positive and source 2 is negative, or if they are both negative and source 1 is smaller than or equal in magnitude to source 2, or if source 1 and source 2 are both positive and source 1 is greater than or equal in magnitude to source 2. The result is 0000\_0000h in all other cases.
- 5. "Unsupported" means that the exponent is all ones (1s).
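
The footnotes above state the comparison in sign-and-magnitude terms, which maps directly onto the bit patterns. The following Python sketch applies that rule to one doubleword and is only an illustration: `float_bits` and `pfcmpge_lane` are hypothetical names, and inputs are assumed to be zero or normal values.

```python
import struct

def float_bits(x):
    """Return the 32-bit single-precision pattern of a Python float."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def pfcmpge_lane(a, b):
    """Compare two 32-bit patterns (source 1, source 2) as PFCMPGE would."""
    a_neg, b_neg = a >> 31, b >> 31
    a_mag, b_mag = a & 0x7FFFFFFF, b & 0x7FFFFFFF
    if a_mag == 0 and b_mag == 0:
        ge = True                # footnote 1: +0 equals -0
    elif a_neg != b_neg:
        ge = not a_neg           # a positive value is >= a negative one
    elif a_neg:
        ge = a_mag <= b_mag      # both negative: smaller magnitude is greater
    else:
        ge = a_mag >= b_mag      # both positive: larger magnitude is greater
    return 0xFFFFFFFF if ge else 0x00000000

print(hex(pfcmpge_lane(float_bits(-1.0), float_bits(-2.5))))  # 0xffffffff
```

Comparing magnitudes as integers works because, for a fixed sign, the single-precision encoding orders values by exponent first and mantissa second.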

#### **Related Instructions**

PFCMPEQ, PFCMPGT

#### rFLAGS Affected

None

## **Exceptions**

| Exception | Real | Virtual<br>8086 | Protected | Cause of Exception |
|---|---|---|---|---|
| Invalid opcode, #UD | X | X | X | The emulate bit (EM) of CR0 was set to 1. |
| | X | X | X | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM | X | X | X | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |


## **PFCMPGT**

# **Packed Floating-Point Compare Greater Than**

Compares each of the two packed single-precision floating-point values in the first source operand with the corresponding packed single-precision floating-point value in the second source operand and writes the result of each comparison in the corresponding doubleword of the destination (first source). For each pair of floating-point values, if the value in the first source operand is greater than the value in the second source operand, the result is all 1s. If the value in the first source operand is less than or equal to the value in the second source operand, the result is all 0s. The first source/destination operand is an MMX register. The second source operand is another MMX register or 64-bit memory location. The numeric range for operands is shown in Table 1-8 on page 108.

| Mnemonic | Opcode | Description |
|---|---|---|
| PFCMPGT mmx1, mmx2/mem64 | 0F 0F /r A0 | Compares two pairs of packed single-precision floating-point values in an MMX register and an MMX register or 64-bit memory location. |



Table 1-8. Numeric Range for the PFCMPGT Instruction

| Source 1 and Destination | Source 2: 0 | Source 2: Normal | Source 2: Unsupported |
|---|---|---|---|
| 0 | 0000_0000h | 0000_0000h, FFFF_FFFFh<sup>1</sup> | Undefined |
| Normal | 0000_0000h, FFFF_FFFFh<sup>2</sup> | 0000_0000h, FFFF_FFFFh<sup>3</sup> | Undefined |
| Unsupported<sup>4</sup> | Undefined | Undefined | Undefined |

- 1. The result is FFFF\_FFFFh, if source 2 is negative. Otherwise, the result is 0000\_0000h.
- 2. The result is FFFF\_FFFFh, if source 1 is positive. Otherwise, the result is 0000\_0000h.
- 3. The result is FFFF\_FFFFh, if source 1 is positive and source 2 is negative, or if they are both negative and source 1 is smaller in magnitude than source 2, or if source 1 and source 2 are positive and source 1 is greater in magnitude than source 2. The result is 0000\_0000h in all other cases.
- 4. "Unsupported" means that the exponent is all ones (1s).
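
As a rough illustration of the result format, the following Python sketch models each operand as a `(low, high)` pair of Python floats and returns one mask per doubleword. The function name `pfcmpgt` and the pair representation are assumptions made for this example, and only zero and normal inputs are considered.

```python
ALL_ONES = 0xFFFFFFFF

def pfcmpgt(src1, src2):
    """Return (low_mask, high_mask) for source 1 > source 2, lane by lane."""
    # Python treats 0.0 and -0.0 as equal, so a pair of zeros yields 0000_0000h.
    return tuple(ALL_ONES if a > b else 0 for a, b in zip(src1, src2))

low_mask, high_mask = pfcmpgt((1.0, -0.0), (0.5, 0.0))
print(hex(low_mask), hex(high_mask))  # 0xffffffff 0x0
```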

#### **Related Instructions**

PFCMPEQ, PFCMPGE

#### rFLAGS Affected

None

## **Exceptions**

| Exception | Real | Virtual<br>8086 | Protected | Cause of Exception |
|---|---|---|---|---|
| Invalid opcode, #UD | X | X | X | The emulate bit (EM) of CR0 was set to 1. |
| | X | X | X | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM | X | X | X | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |

# **PFMAX**

# Packed Single-Precision Floating-Point Maximum

Compares each of the two packed single-precision floating-point values in the first source operand with the corresponding packed single-precision floating-point value in the second source operand and writes the maximum of the two values for each comparison in the corresponding doubleword of the destination (first source). The first source/destination operand is an MMX register. The second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|---|---|---|
| PFMAX mmx1, mmx2/mem64 | 0F 0F /r A4 | Compares two pairs of packed single-precision values in an MMX register and another MMX register or 64-bit memory location and writes the maximum value of each comparison in the destination MMX register. |



Any operation with a zero and a negative number returns positive zero. An operation consisting of two zeros returns positive zero. If either source operand is an undefined value, the result is undefined. The numeric range for source and destination operands is shown in Table 1-9 on page 111.

Table 1-9. Numeric Range for the PFMAX Instruction

| Source 1 and Destination | Source 2: 0 | Source 2: Normal | Source 2: Unsupported |
|---|---|---|---|
| 0 | +0 | Source 2, +0<sup>1</sup> | Undefined |
| Normal | Source 1, +0<sup>2</sup> | Source 1/Source 2<sup>3</sup> | Undefined |
| Unsupported<sup>4</sup> | Undefined | Undefined | Undefined |

- 1. The result is source 2, if source 2 is positive. Otherwise, the result is positive zero.
- 2. The result is source 1, if source 1 is positive. Otherwise, the result is positive zero.
- 3. The result is source 1, if source 1 is positive and source 2 is negative. The result is source 1, if both are positive and source 1 is greater in magnitude than source 2. The result is source 1, if both are negative and source 1 is lesser in magnitude than source 2. The result is source 2 in all other cases.
- 4. "Unsupported" means that the exponent is all ones (1s).
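
The zero-handling rules are easy to overlook, so here is a minimal Python sketch of them. It assumes operands are `(low, high)` pairs of zero or normal Python floats, and the name `pfmax` is only a label for this model.

```python
def pfmax(src1, src2):
    """Lane-by-lane maximum with the positive-zero rule applied."""
    result = []
    for a, b in zip(src1, src2):
        m = a if a > b else b                # ties resolve to source 2
        result.append(0.0 if m == 0 else m)  # any zero result is +0
    return tuple(result)

print(pfmax((-0.0, 2.0), (-3.0, 1.0)))  # (0.0, 2.0)
```

In the low doubleword, a negative zero compared with a negative number returns positive zero, not the negative zero that was supplied.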

#### **Related Instructions**

**PFMIN** 

#### rFLAGS Affected

None

## **Exceptions**

| Exception | Real | Virtual<br>8086 | Protected | Cause of Exception |
|---|---|---|---|---|
| Invalid opcode, #UD | X | X | X | The emulate bit (EM) of CR0 was set to 1. |
| | X | X | X | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM | X | X | X | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |


## **PFMIN**

# **Packed Single-Precision Floating-Point Minimum**

Compares each of the two packed single-precision floating-point values in the first source operand with the corresponding packed single-precision floating-point value in the second source operand and writes the minimum of the two values for each comparison in the corresponding doubleword of the destination (first source). The first source/destination operand is an MMX register. The second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|---|---|---|
| PFMIN mmx1, mmx2/mem64 | 0F 0F /r 94 | Compares two pairs of packed single-precision values in an MMX register and another MMX register or 64-bit memory location and writes the minimum value of each comparison in the destination MMX register. |



Any operation with a zero and a positive number returns positive zero. An operation consisting of two zeros returns positive zero. If either source operand is an undefined value, the result is undefined. The numeric range for source and destination operands is shown in Table 1-10 on page 114.

Table 1-10. Numeric Range for the PFMIN Instruction

| Source 1 and Destination | Source 2: 0 | Source 2: Normal | Source 2: Unsupported |
|---|---|---|---|
| 0 | +0 | Source 2, +0<sup>1</sup> | Undefined |
| Normal | Source 1, +0<sup>2</sup> | Source 1/Source 2<sup>3</sup> | Undefined |
| Unsupported<sup>4</sup> | Undefined | Undefined | Undefined |

- 1. The result is source 2, if source 2 is negative. Otherwise, the result is positive zero.
- 2. The result is source 1, if source 1 is negative. Otherwise, the result is positive zero.
- 3. The result is source 1, if source 1 is negative and source 2 is positive. The result is source 1, if both are negative and source 1 is greater in magnitude than source 2. The result is source 1, if both are positive and source 1 is lesser in magnitude than source 2. The result is source 2 in all other cases.
- 4. "Unsupported" means that the exponent is all ones (1s).
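
The footnotes can be transcribed almost literally into code, which makes it easy to check that they agree with an ordinary minimum. The sketch below handles one doubleword. `pfmin_lane` is a hypothetical name, and the inputs are assumed to be zero or normal values held in Python floats.

```python
def pfmin_lane(s1, s2):
    """Apply the PFMIN footnote rules to one pair of values."""
    if s1 == 0 and s2 == 0:
        return 0.0                       # two zeros give positive zero
    if s1 == 0:
        return s2 if s2 < 0 else 0.0     # footnote 1
    if s2 == 0:
        return s1 if s1 < 0 else 0.0     # footnote 2
    s1_neg, s2_neg = s1 < 0, s2 < 0      # footnote 3 covers the rest
    if s1_neg and not s2_neg:
        return s1
    if s1_neg and s2_neg and abs(s1) > abs(s2):
        return s1
    if not s1_neg and not s2_neg and abs(s1) < abs(s2):
        return s1
    return s2

print(pfmin_lane(-0.0, 5.0), pfmin_lane(-2.5, -1.0))  # 0.0 -2.5
```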

#### **Related Instructions**

**PFMAX** 

#### rFLAGS Affected

None

## **Exceptions**

| Exception | Real | Virtual<br>8086 | Protected | Cause of Exception |
|---|---|---|---|---|
| Invalid opcode, #UD | X | X | X | The emulate bit (EM) of CR0 was set to 1. |
| | X | X | X | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM | X | X | X | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |

### **PFMUL**

# **Packed Floating-Point Multiply**

Multiplies each of the two packed single-precision floating-point values in the first source operand by the corresponding packed single-precision floating-point value in the second source operand and writes the result of each multiplication in the corresponding doubleword of the destination (first source). The numeric range for source and destination operands is shown in Table 1-11 on page 117. The first source/destination operand is an MMX register. The second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|---|---|---|
| PFMUL mmx1, mmx2/mem64 | 0F 0F /r B4 | Multiplies packed single-precision floating-point values in an MMX register and another MMX register or 64-bit memory location and writes the result in the destination MMX register. |



Table 1-11. Numeric Range for the PFMUL Instruction

| Source 1 and Destination | Source 2: 0 | Source 2: Normal | Source 2: Unsupported |
|---|---|---|---|
| 0 | +/- 0<sup>1</sup> | +/- 0<sup>1</sup> | +/- 0<sup>1</sup> |
| Normal | +/- 0<sup>1</sup> | Normal, +/- 0<sup>2</sup> | Undefined |
| Unsupported<sup>3</sup> | +/- 0<sup>1</sup> | Undefined | Undefined |

#### Note:

- 1. The sign of the result is the exclusive-OR of the signs of the source operands.
- 2. If the absolute value of the result is less than 2<sup>-126</sup>, the result is zero with the sign being the exclusive-OR of the signs of the source operands. If the absolute value of the product is greater than or equal to 2<sup>128</sup>, the result is the largest normal number with the sign being the exclusive-OR of the signs of the source operands.
- 3. "Unsupported" means that the exponent is all ones (1s).
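
The underflow and overflow behavior in the notes can be sketched for a single doubleword as follows. This is a simplified Python model under stated assumptions: the names `to_single` and `pfmul_lane` are hypothetical, inputs are zero or normal single-precision values held in Python floats, and in-range products are rounded to the nearest single-precision value, which is a modeling choice and not something this section specifies.

```python
import math
import struct

MIN_NORMAL = 2.0 ** -126
MAX_NORMAL = (2.0 - 2.0 ** -23) * 2.0 ** 127   # largest normal single

def to_single(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def pfmul_lane(a, b):
    """Multiply two values with the zero and largest-normal rules applied."""
    # Note 1: the result sign is the XOR of the operand signs.
    sign = math.copysign(1.0, a) * math.copysign(1.0, b)
    product = abs(a) * abs(b)   # exact in double precision for single inputs
    if product < MIN_NORMAL:
        return math.copysign(0.0, sign)          # note 2: too small, signed zero
    if product >= MAX_NORMAL:
        # Note 2 clamps at 2**128; this sketch (an assumption) also clamps the
        # narrow range just below it so rounding can never overflow.
        return math.copysign(MAX_NORMAL, sign)
    return math.copysign(to_single(product), sign)

print(pfmul_lane(-2.0, 3.0), pfmul_lane(2.0 ** -100, -(2.0 ** -100)))  # -6.0 -0.0
```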

#### **Related Instructions**

None

#### rFLAGS Affected

None

## **Exceptions**

| Exception | Real | Virtual<br>8086 | Protected | Cause of Exception |
|---|---|---|---|---|
| Invalid opcode, #UD | X | X | X | The emulate bit (EM) of CR0 was set to 1. |
| | X | X | X | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM | X | X | X | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |


## **PFNACC**

# **Packed Floating-Point Negative Accumulate**

Subtracts the first source operand's high-order single-precision floating-point value from its low-order single-precision floating-point value, subtracts the second source operand's high-order single-precision floating-point value from its low-order single-precision floating-point value, and writes each result to the low-order or high-order doubleword, respectively, of the destination (first source). The first source/destination operand is an MMX register. The second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|---|---|---|
| PFNACC mmx1, mmx2/mem64 | 0F 0F /r 8A | Subtracts the packed single-precision floating-point values in an MMX register or 64-bit memory location and another MMX register and writes each value in the destination MMX register. |



The numeric range for operands is shown in Table 1-12 on page 120.

Table 1-12. Numeric Range of PFNACC Results

| Low Operand<sup>1</sup> | High Operand<sup>2</sup>: 0 | High Operand<sup>2</sup>: Normal | High Operand<sup>2</sup>: Unsupported |
|---|---|---|---|
| 0 | +/- 0<sup>3</sup> | - High Operand | - High Operand |
| Normal | Low Operand | Normal, +/- 0<sup>4</sup> | Undefined |
| Unsupported<sup>5</sup> | Low Operand | Undefined | Undefined |

#### Note:

- 1. Least-significant floating-point value in first or second source operand.
- 2. Most-significant floating-point value in first or second source operand.
- 3. The sign is the logical AND of the sign of the low operand and the inverse of the sign of the high operand.
- 4. If the absolute value of the infinitely precise result is less than 2<sup>-126</sup> (but not zero), the result is a zero. If the low operand is larger in magnitude than the high operand, the sign of this zero is the same as the sign of the low operand, else it is the inverse of the sign of the high operand. If the infinitely precise result is exactly zero, the result is zero with the sign of the low operand. If the absolute value of the infinitely precise result is greater than or equal to 2<sup>128</sup>, the result is the largest normal number with the sign of the low operand.
- 5. "Unsupported" means that the exponent is all ones (1s).
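
The data movement of PFNACC (each result is the low value minus the high value within one source operand) can be sketched as below. This Python model is an illustration only: `to_single` and `pfnacc` are hypothetical names, operands are `(low, high)` pairs of Python floats, round-to-nearest is an assumption, and the underflow, overflow, and zero-sign rules from the notes are not modeled.

```python
import struct

def to_single(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def pfnacc(src1, src2):
    """Return the (low, high) result pair for PFNACC."""
    low = to_single(src1[0] - src1[1])    # from the first source operand
    high = to_single(src2[0] - src2[1])   # from the second source operand
    return (low, high)

print(pfnacc((5.0, 2.0), (1.0, 4.0)))  # (3.0, -3.0)
```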

#### **Related Instructions**

PFSUB, PFACC, PFPNACC

#### rFLAGS Affected

None

#### **Exceptions**

| Exception | Real | Virtual<br>8086 | Protected | Cause of Exception |
|---|---|---|---|---|
| Invalid opcode, #UD | X | X | X | The emulate bit (EM) of CR0 was set to 1. |
| | X | X | X | The AMD extensions to 3DNow!™ are not supported, as indicated by bit 30 in CPUID extended function 8000_0001h. |
| Device not available, #NM | X | X | X | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |

# **PFPNACC**

# Packed Floating-Point Positive-Negative Accumulate

Subtracts the first source operand's high-order single-precision floating-point value from its low-order single-precision floating-point value, adds the two single-precision values in the second source operand, and writes each result to the low-order or high-order doubleword, respectively, of the destination (first source). The first source/destination operand is an MMX register. The second source operand is another MMX register or 64-bit memory location.
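The following is a minimal Python sketch of the arithmetic described above, not a model of the hardware. The helper names `f32` and `pfpnacc` are hypothetical, each operand is represented as a `(low, high)` tuple, and the special cases in the numeric-range tables (zero sign rules, underflow to zero, saturation) are deliberately not modeled.

```python
import struct

def f32(x):
    """Round a Python float to single precision (assumes the value fits in binary32)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def pfpnacc(src1, src2):
    """Sketch of PFPNACC on (low, high) pairs of single-precision values.

    Low result  = src1.low - src1.high   (the "negative" accumulate)
    High result = src2.low + src2.high   (the "positive" accumulate)
    """
    lo1, hi1 = src1
    lo2, hi2 = src2
    low_result = f32(f32(lo1) - f32(hi1))
    high_result = f32(f32(lo2) + f32(hi2))
    return (low_result, high_result)

result = pfpnacc((5.0, 1.5), (2.0, 0.25))
print(result)
```

The low doubleword of the result comes only from the first source operand and the high doubleword only from the second, which is what distinguishes this instruction from PFACC and PFNACC.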

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PFPNACC mmx1, mmx2/mem64 | 0F 0F /r 8E | Subtracts the packed single-precision floating-point values in an MMX register, adds the packed single-precision floating-point values in another MMX register or 64-bit memory location, and writes each value in the destination MMX register. |



The numeric range for operands is shown in Table 1-13 (for the low result) and Table 1-14 (for the high result).


Table 1-13. Numeric Range of PFPNACC Result (Low Result)

| Source Operand           |                          | High Operand <sup>2</sup> |                            |                |
|--------------------------|--------------------------|---------------------------|----------------------------|----------------|
|                          |                          | 0                         | Normal                     | Unsupported    |
|                          | 0                        | +/- 0 <sup>3</sup>        | - High Operand             | - High Operand |
| Low Operand <sup>1</sup> | Normal                   | Low Operand               | Normal, +/- 0 <sup>4</sup> | Undefined      |
|                          | Unsupported <sup>5</sup> | Low Operand               | Undefined                  | Undefined      |

#### Note:

- 1. Least-significant floating-point value in first or second source operand.
- 2. Most-significant floating-point value in first or second source operand.
- 3. The sign is the logical AND of the sign of the low operand and the inverse of the sign of the high operand.
- 4. If the absolute value of the infinitely precise result is less than 2<sup>-126</sup> (but not zero), the result is a zero. If the low operand is larger in magnitude than the high operand, the sign of this zero is the same as the sign of the low operand, else it is the inverse of the sign of the high operand. If the infinitely precise result is exactly zero, the result is zero with the sign of the low operand. If the absolute value of the infinitely precise result is greater than or equal to 2<sup>128</sup>, the result is the largest normal number with the sign of the low operand.
- 5. "Unsupported" means that the exponent is all ones (1s).

Table 1-14. Numeric Range of PFPNACC Result (High Result)

| Source Operand           |                          | High Operand <sup>2</sup> |                            |              |
|--------------------------|--------------------------|---------------------------|----------------------------|--------------|
|                          |                          | 0                         | Normal                     | Unsupported  |
|                          | 0                        | +/- 0 <sup>3</sup>        | High Operand               | High Operand |
| Low Operand <sup>1</sup> | Normal                   | Low Operand               | Normal, +/- 0 <sup>4</sup> | Undefined    |
|                          | Unsupported <sup>5</sup> | Low Operand               | Undefined                  | Undefined    |

#### Note:

- 1. Least-significant floating-point value in first or second source operand.
- 2. Most-significant floating-point value in first or second source operand.
- 3. The sign is the logical AND of the signs of the low and high operands.
- 4. If the absolute value of the infinitely precise result is less than 2<sup>-126</sup> (but not zero), the result is zero with the sign of the operand (low or high) that is larger in magnitude. If the infinitely precise result is exactly zero, the result is zero with the sign of the low operand. If the absolute value of the infinitely precise result is greater than or equal to 2<sup>128</sup>, the result is the largest normal number with the sign of the low operand.
- 5. "Unsupported" means that the exponent is all ones (1s).

#### **Related Instructions**

PFADD, PFSUB, PFACC, PFNACC

# rFLAGS Affected

None

# Exceptions

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                             |
|----------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                      |
|                                              | X    | Х               | Х         | The AMD extensions to 3DNow!™ are not supported, as indicated by bit 30 in CPUID extended function 8000_0001h. |
| Device not available, #NM                    | Χ    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                  |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                        |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                           |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                              |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                   |
| x87 floating-point<br>exception pending, #MF | X    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                          |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                              |


## **PFRCP**

# **Floating-Point Reciprocal Approximation**

Computes the approximate reciprocal of the single-precision floating-point value in the low-order 32 bits of an MMX register or 64-bit memory location and writes the result in both doublewords of another MMX register. The result is accurate to 14 bits.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PFRCP mmx1, mmx2/mem64 | 0F 0F /r 96 | Computes approximate reciprocal of single-precision floating-point value in an MMX register or 64-bit memory location and writes the result in both doublewords of the destination MMX register. |



The PFRCP result can be forwarded to the Newton-Raphson iteration step 1 (PFRCPIT1) and Newton-Raphson iteration step 2 (PFRCPIT2) instructions to increase the accuracy of the reciprocal. The first stage of this refinement in accuracy (PFRCPIT1) requires that the input and output of the previously executed PFRCP instruction be used as input to the PFRCPIT1 instruction.

The estimate contains the correct round-to-nearest value for approximately 99% of all arguments. The remaining arguments differ from the correct round-to-nearest value for the reciprocal by 1 unit-in-the-last-place (ulp). For details, see the data sheet or other software-optimization documentation relating to particular hardware implementations.


PFRCP(x) returns 0 for  $x \ge 2^{126}$. The numeric range for operands is shown in Table 1-15.

Table 1-15. Numeric Range for the PFRCP Result

| Operand  |                          | Source 1 and Destination        |
|----------|--------------------------|---------------------------------|
|          | 0                        | +/- Maximum Normal <sup>1</sup> |
| Source 2 | Normal                   | Normal, +/- 0 <sup>2</sup>      |
|          | Unsupported <sup>3</sup> | Undefined                       |

#### Note:

- 1. The result has the same sign as the source operand.
- 2. If the absolute value of the result is less than  $2^{-126}$, the result is zero with the sign being the sign of the source operand. Otherwise, the result is a normal with the sign being the same sign as the source operand.
- 3. "Unsupported" means that the exponent is all ones (1s).
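The range rules in Table 1-15 can be illustrated with a short Python sketch. It uses an exact reciprocal in place of the hardware estimate, so it shows only the special-case handling (zero input, result below the smallest normal); the function name `pfrcp_range` and the two constants are hypothetical names chosen for this example, and unsupported inputs are not modeled.

```python
import math

MAX_NORMAL = (2.0 - 2.0 ** -23) * 2.0 ** 127   # largest normal single-precision value
MIN_NORMAL = 2.0 ** -126                        # smallest normal single-precision value

def pfrcp_range(x):
    """Exact reciprocal with the Table 1-15 special cases applied.

    Assumes x is zero or a normal single-precision value.
    """
    if x == 0.0:
        # Note 1: maximum normal with the sign of the source operand.
        return math.copysign(MAX_NORMAL, x)
    r = 1.0 / x
    if abs(r) < MIN_NORMAL:
        # Note 2: result too small for a normal, so return a signed zero.
        return math.copysign(0.0, x)
    return r

print(pfrcp_range(4.0), pfrcp_range(-0.0), pfrcp_range(2.0 ** 127))
```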

## **Examples**

The general Newton-Raphson recurrence for the reciprocal 1/b is:

$$Z_{i+1} \leftarrow Z_{i} \cdot (2 - b \cdot Z_{i})$$

The following code sequence shows the computation of a/b:

```
X_0 = PFRCP(b)

X_1 = PFRCPIT1(b, X_0)

X_2 = PFRCPIT2(X_1, X_0)

q = PFMUL(a, X_2)
```

The 24-bit final reciprocal value is  $X_2$ . The quotient is formed in the last step by multiplying the reciprocal by the dividend a.
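As an illustration of the sequence above, the following Python sketch applies the general recurrence to a deliberately coarse starting estimate. It is not bit-accurate: `truncate_bits`, `coarse_reciprocal`, `newton_step`, and `divide` are hypothetical helpers, the coarse estimate is simulated by truncating an exact reciprocal to 14 bits, the two iteration instructions are folded into a single arithmetic step, and `b` is assumed to be positive.

```python
import math

def truncate_bits(x, bits):
    """Keep about `bits` significant bits of a positive value x."""
    m, e = math.frexp(x)                 # x = m * 2**e with 0.5 <= m < 1
    scale = 2.0 ** bits
    return math.ldexp(math.floor(m * scale) / scale, e)

def coarse_reciprocal(b, bits=14):
    """Stand-in for PFRCP: a reciprocal estimate with limited accuracy."""
    return truncate_bits(1.0 / b, bits)

def newton_step(z, b):
    """One step of Z <- Z * (2 - b * Z)."""
    return z * (2.0 - b * z)

def divide(a, b):
    x0 = coarse_reciprocal(b)    # role of PFRCP
    x2 = newton_step(x0, b)      # combined role of PFRCPIT1 and PFRCPIT2
    return a * x2                # role of PFMUL

print(coarse_reciprocal(3.0), divide(1.0, 3.0))
```

In this sketch the refined value is several orders of magnitude closer to 1/3 than the starting estimate, which is the effect the PFRCPIT1 and PFRCPIT2 pair is described as providing.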

#### **Related Instructions**

PFRCPIT1, PFRCPIT2

#### rFLAGS Affected

None


| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                            |
|-------------------------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                     |
|                                           | X    | Х               | Х         | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM                 | Χ    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                 |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                       |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                          |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                             |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                  |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                         |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                             |

## PFRCPIT1

# **Packed Floating-Point Reciprocal Iteration 1**

Performs the first step in the Newton-Raphson iteration to refine the reciprocal approximation produced by the PFRCP instruction. The first source/destination operand is an MMX register containing the results of two previous PFRCP instructions, and the second source operand is another MMX register or 64-bit memory location containing the source operands from the same PFRCP instructions.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PFRCPIT1 mmx1, mmx2/mem64 | 0F 0F /r A6 | Refine approximate reciprocal of result from previous PFRCP instruction. |



This instruction is only defined for those combinations of operands such that the first source operand (mmx1) is the approximate reciprocal of the second source operand (mmx2/mem64), and thus the range of the product, mmx1 \* mmx2/mem64, is (0.5, 2). The initial approximation of an operand is accurate to about 12 bits, and the length of the operand itself is 24 bits, so the product of these two operands is greater than 24 bits. PFRCPIT1 applies the one's complement of the product and rounds the result to 32 bits. It then compresses the result to fit into 24 bits by removing the 8 redundant most-significant bits after the hidden integer bit.

The estimate contains the correct round-to-nearest value for approximately 99% of all arguments. The remaining arguments differ from the correct round-to-nearest value for the reciprocal by 1 unit-in-the-last-place (ulp). For details, see the data sheet or other software-optimization documentation relating to particular hardware implementations.

### **Operation**

```
mmx1[31:0] = Compress (2 - mmx1[31:0] * (mmx2/mem64[31:0]) - 2^31);

mmx1[63:32] = Compress (2 - mmx1[63:32] * (mmx2/mem64[63:32]) - 2^31);
```

#### where:

"Compress" means discard the 8 redundant most-significant bits after the hidden integer bit.

## **Examples**

The general Newton-Raphson recurrence for the reciprocal 1/b is:

$$Z_{i+1} \leftarrow Z_{i} \cdot (2 - b \cdot Z_{i})$$

The following code sequence computes a 24-bit approximation to a/b with one Newton-Raphson iteration:

```
X_0 = PFRCP(b)

X_1 = PFRCPIT1(b, X_0)

X_2 = PFRCPIT2(X_1, X_0)

Q = PFMUL(a, X_2)
```

a/b is formed in the last step by multiplying the reciprocal approximation by a.
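The recurrence also suggests why a single iteration is sufficient here. As a brief sketch in exact arithmetic (ignoring the rounding and compression that the instructions perform), write the current estimate with a relative error $\varepsilon_{i}$, a symbol introduced only for this illustration:

$$Z_{i} = \frac{1 - \varepsilon_{i}}{b} \quad\Rightarrow\quad Z_{i+1} = Z_{i} \cdot (2 - b \cdot Z_{i}) = \frac{(1 - \varepsilon_{i})(1 + \varepsilon_{i})}{b} = \frac{1 - \varepsilon_{i}^{2}}{b}$$

The relative error is squared at each step, so an initial estimate accurate to about 12 bits ($|\varepsilon_{0}| \approx 2^{-12}$) would be expected to yield roughly 24 bits ($\varepsilon_{0}^{2} \approx 2^{-24}$) after one iteration.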

#### **Related Instructions**

PFRCP, PFRCPIT2

#### **rFLAGS Affected**

None

| Exception                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                            |
|---------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD       | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                     |
|                           | X    | Х               | Х         | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                 |
| Stack, #SS                | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.              |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                 |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                      |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                             |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled. |


## PFRCPIT2

# Packed Floating-Point Reciprocal or Reciprocal Square Root Iteration 2

Performs the second and final step in the Newton-Raphson iteration to refine the reciprocal approximation produced by the PFRCP instruction or the reciprocal square-root approximation produced by the PFRSQRT instruction. PFRCPIT2 takes two paired elements in each source operand. These paired elements are the results of a PFRCP and PFRCPIT1 instruction sequence or of a PFRSQRT and PFRSQIT1 instruction sequence. The first source/destination operand is an MMX register that contains the PFRCPIT1 or PFRSQIT1 results and the second source operand is another MMX register or 64-bit memory location that contains the PFRCP or PFRSQRT results.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PFRCPIT2 mmx1, mmx2/mem64 | 0F 0F /r B6 | Refines approximate reciprocal result from previous PFRCP and PFRCPIT1 instructions or from previous PFRSQRT and PFRSQIT1 instructions. |



The PFRCPIT2 instruction expands the compressed PFRCPIT1 or PFRSQIT1 results from 24 to 32 bits and multiplies them by their respective source operands. An optimal correction factor is added to the product, which is then rounded to 24 bits.


The estimate contains the correct round-to-nearest value for approximately 99% of all arguments. The remaining arguments differ from the correct round-to-nearest value for the reciprocal by 1 unit-in-the-last-place (ulp). For details, see the data sheet or other software-optimization documentation relating to particular hardware implementations.

### **Operation**

```
mmx1[31:0] = Expand(mmx1[31:0]) * mmx2/mem64[31:0];

mmx1[63:32] = Expand(mmx1[63:32]) * mmx2/mem64[63:32];
```

where:

"Expand" means convert a 24-bit significand to a 32-bit significand according to the following rule:

```
temp[31:0] = {1'b1, 8{mmx1[22]}, mmx1[22:0]};
```
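The concatenation rule above can be read as plain bit manipulation. The sketch below is one interpretation in Python, assuming the rule is applied to the low 23 bits of a 32-bit element; the function name `expand` and its integer-in, integer-out interface are hypothetical and chosen only for illustration.

```python
def expand(word):
    """Build the 32-bit value {1'b1, 8{word[22]}, word[22:0]}.

    Bit 31 is the explicit leading 1, bits 30:23 are eight copies of
    bit 22 of the input, and bits 22:0 are copied unchanged.
    """
    frac = word & 0x7FFFFF             # word[22:0]
    bit22 = (word >> 22) & 1           # word[22]
    fill = 0xFF if bit22 else 0x00     # 8{word[22]}
    return (1 << 31) | (fill << 23) | frac

print(hex(expand(0x400000)), hex(expand(0x000001)))
```

Replicating bit 22 restores the eight redundant bits that the Compress step of PFRCPIT1 or PFRSQIT1 is described as discarding.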

## **Examples**

The general Newton-Raphson recurrence for the reciprocal 1/b is:

$$Z_{i+1} \leftarrow Z_{i} \cdot (2 - b \cdot Z_{i})$$

The following code sequence computes a 24-bit approximation to a/b with one Newton-Raphson iteration:

```
X_0 = PFRCP(b)

X_1 = PFRCPIT1(b, X_0)

X_2 = PFRCPIT2(X_1, X_0)

Q = PFMUL(a, X_2)
```

a/b is formed in the last step by multiplying the reciprocal approximation by a.

#### **Related Instructions**

PFRCP, PFRCPIT1, PFRSQRT, PFRSQIT1

#### **rFLAGS Affected**

None


| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                            |
|-------------------------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                     |
|                                           | X    | Х               | Х         | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM                 | Χ    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                 |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                       |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                          |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                             |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                  |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                         |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                             |

# PFRSQIT1

# Packed Floating-Point Reciprocal Square Root Iteration 1

Performs the first step in the Newton-Raphson iteration to refine the reciprocal square-root approximation produced by the PFRSQRT instruction. The first source/destination operand is an MMX register containing the result from a previous PFRSQRT instruction, and the second source operand is another MMX register or 64-bit memory location containing the source operand from the same PFRSQRT instruction.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PFRSQIT1 mmx1, mmx2/mem64 | 0F 0F /r A7 | Refines reciprocal square root approximation of previous PFRSQRT instruction. |



This instruction is only defined for those combinations of operands such that the first source operand (mmx1) is the approximate reciprocal of the second source operand (mmx2/mem64), and thus the range of the product, mmx1 \* mmx2/mem64, is (0.5, 2). The length of both operands is 24 bits, so the product of these two operands is greater than 24 bits. The product is normalized and then rounded to 32 bits. The one's complement of the result is applied, a 1 is added as the most-significant bit, and the result re-normalized. The result is then compressed to fit into 24 bits by removing 8 redundant most-significant bits after the hidden integer bit, and the exponent is reduced by 1 to account for the division by 2.

### **Operation**

```
mmx1[31:0] = Compress ((3 - mmx1[31:0] * (mmx2/mem64[31:0]) - 2^31)/2);

mmx1[63:32] = Compress ((3 - mmx1[63:32] * (mmx2/mem64[63:32]) - 2^31)/2);
```

#### where:

"Compress" means discard the 8 redundant most-significant bits after the hidden integer bit.

## **Examples**

The following code sequence shows how the PFRSQRT and PFMUL instructions can be used to compute a = 1/sqrt(b):

```
X_0 = PFRSQRT(b)

X_1 = PFMUL(X_0, X_0)

X_2 = PFRSQIT1(b, X_1)

A = PFRCPIT2(X_2, X_0)
```
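The sequence above can be mirrored in ordinary floating-point arithmetic. The Python sketch below is an approximation under stated assumptions, not a bit-accurate model: `truncate_bits` and `rsqrt` are hypothetical helpers, the PFRSQRT estimate is simulated by truncating an exact value to 15 bits, the compression and expansion steps are omitted, and `b` is assumed to be positive.

```python
import math

def truncate_bits(x, bits):
    """Keep about `bits` significant bits of a positive value x."""
    m, e = math.frexp(x)                 # x = m * 2**e with 0.5 <= m < 1
    scale = 2.0 ** bits
    return math.ldexp(math.floor(m * scale) / scale, e)

def rsqrt(b):
    x0 = truncate_bits(1.0 / math.sqrt(b), 15)   # role of PFRSQRT
    x1 = x0 * x0                                 # role of PFMUL
    x2 = (3.0 - b * x1) / 2.0                    # role of PFRSQIT1
    return x2 * x0                               # role of PFRCPIT2

print(rsqrt(2.0), 1.0 / math.sqrt(2.0))
```

The last line multiplies the correction factor by the original estimate, which matches the description of PFRCPIT2 multiplying the PFRSQIT1 result by the PFRSQRT result.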

#### **Related Instructions**

PFRCPIT2, PFRSQRT

#### rFLAGS Affected

None

| Exception                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                            |
|---------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD       | X    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                     |
|                           | X    | Х               | Х         | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                 |
| Stack, #SS                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                       |
| General protection, #GP   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                          |
|                           |      |                 | X         | A null data segment was used to reference memory.                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                      |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                             |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled. |

# **PFRSQRT**

# **Packed Floating-Point Reciprocal Square Root Approximation**

Computes the approximate reciprocal square root of the single-precision floating-point value in the low-order 32 bits of an MMX register or 64-bit memory location and writes the result in each doubleword of another MMX register. The source operand is single-precision with a 24-bit significand, and the result is accurate to 15 bits. Negative operands are treated as positive operands for purposes of reciprocal square-root computation, with the sign of the result the same as the sign of the source operand.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PFRSQRT mmx1, mmx2/mem64 | 0F 0F /r 97 | Computes approximate reciprocal square root of a packed single-precision floating-point value. |



This instruction can be used together with the PFRSQIT1 and PFRCPIT2 instructions to increase accuracy. The first stage of this refinement in accuracy (PFRSQIT1) requires that the input and output of the previously executed PFRSQRT instruction be used as input to the PFRSQIT1 instruction.

The estimate contains the correct round-to-nearest value for approximately 99% of all arguments. The remaining arguments differ from the correct round-to-nearest value for the reciprocal by 1 unit-in-the-last-place (ulp). For details, see the data sheet or other software-optimization documentation relating to particular hardware implementations.

The numeric range for operands is shown in Table 1-16.

**Table 1-16.** Numeric Range for the PFRSQRT Result

| Operand  |                          | Source 1 and Destination        |
|----------|--------------------------|---------------------------------|
|          | 0                        | +/- Maximum Normal <sup>1</sup> |
| Source 2 | Normal                   | Normal <sup>1</sup>             |
|          | Unsupported <sup>2</sup> | Undefined <sup>1</sup>          |

#### Note:

- 1. The result has the same sign as the source operand.
- 2. "Unsupported" means that the exponent is all ones (1s).

## **Examples**

The following code sequence shows how the PFRSQRT and PFMUL instructions can be used to compute a = 1/sqrt (b):

```
X_0 = PFRSQRT(b)

X_1 = PFMUL(X_0, X_0)

X_2 = PFRSQIT1(b, X_1)

A = PFRCPIT2(X_2, X_0)
```
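Read as arithmetic, and assuming the behavior given in the Operation sections for PFRSQIT1 and PFRCPIT2, the sequence above appears to correspond to one Newton-Raphson step for $1/\sqrt{b}$. The sketch below ignores intermediate rounding and compression; $\varepsilon$ is a symbol introduced only for this illustration:

$$X_1 = X_0^{2}, \qquad X_2 = \frac{3 - b \cdot X_1}{2}, \qquad a = X_2 \cdot X_0 = X_0 \cdot \frac{3 - b \cdot X_0^{2}}{2}$$

$$X_0 = \frac{1 - \varepsilon}{\sqrt{b}} \quad\Rightarrow\quad a = \frac{1 - \tfrac{3}{2}\varepsilon^{2} + \tfrac{1}{2}\varepsilon^{3}}{\sqrt{b}}$$

The leading error term is quadratic in $\varepsilon$, which is consistent with one refinement step taking a 15-bit estimate to a full 24-bit significand.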

#### **Related Instructions**

PFRCPIT2, PFRSQIT1

#### rFLAGS Affected

None

| Exception                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                            |
|---------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD       | X    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                     |
|                           | X    | Х               | Х         | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM | Χ    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                 |
| Stack, #SS                | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.              |
|                                              |      |                 | X         | A null data segment was used to reference memory.                                 |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                      |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                             |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled. |

### **PFSUB**

# **Packed Floating-Point Subtract**

Subtracts each packed single-precision floating-point value in the second source operand from the corresponding packed single-precision floating-point value in the first source operand and writes the result of each subtraction in the corresponding doubleword of the destination (first source). The first source/destination operand is an MMX register. The second source operand is another MMX register or 64-bit memory location. The numeric range for operands is shown in Table 1-17.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PFSUB mmx1, mmx2/mem64 | 0F 0F /r 9A | Subtracts packed single-precision floating-point values in an MMX register or 64-bit memory location from packed single-precision floating-point values in another MMX register and writes the result in the destination MMX register. |




Table 1-17. Numeric Range for the PFSUB Results

| Source Operand              |                          | Source 2           |                            |             |
|-----------------------------|--------------------------|--------------------|----------------------------|-------------|
|                             |                          | 0                  | Normal                     | Unsupported |
|                             | 0                        | +/- 0 <sup>1</sup> | - Source 2                 | - Source 2  |
| Source 1 and<br>Destination | Normal                   | Source 1           | Normal, +/- 0 <sup>2</sup> | Undefined   |
|                             | Unsupported <sup>3</sup> | Source 1           | Undefined                  | Undefined   |

#### Note:

- 1. The sign of the result is the logical AND of the sign of source 1 and the inverse of the sign of source 2.
- 2. If the absolute value of the infinitely precise result is less than 2<sup>-126</sup> (but not zero), the result is a zero. If the source operand that is larger in magnitude is source 1, the sign of this zero is the same as the sign of source 1, else it is the inverse of the sign of source 2. If the infinitely precise result is exactly zero, the result is zero with the sign of source 1. If the absolute value of the infinitely precise result is greater than or equal to 2<sup>128</sup>, the result is the largest normal number with the sign of source 1.
- 3. "Unsupported" means that the exponent is all ones (1s).
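The range rules in Table 1-17 and its notes can be read as a small decision procedure. The following C sketch models one 32-bit lane under stated assumptions: `pfsub_lane` is a hypothetical helper name, inputs are limited to zeros and normal numbers (the table leaves unsupported inputs undefined), and the rounding of in-range results is whatever the host C implementation does, which this section does not specify. It is an illustration, not a bit-exact model of the hardware.

```c
#include <float.h>
#include <math.h>

/* One 32-bit lane of PFSUB for zero or normal inputs.
   pfsub_lane is a hypothetical name used only for this sketch. */
static float pfsub_lane(float src1, float src2)
{
    /* 0 - 0: sign is sign(src1) AND NOT sign(src2) (note 1). */
    if (src1 == 0.0f && src2 == 0.0f)
        return (signbit(src1) && !signbit(src2)) ? -0.0f : 0.0f;
    if (src1 == 0.0f)
        return -src2;                       /* 0 - normal: - Source 2 */
    if (src2 == 0.0f)
        return src1;                        /* normal - 0: Source 1 */

    /* Exactly zero: zero with the sign of source 1 (note 2). */
    if (src1 == src2)
        return signbit(src1) ? -0.0f : 0.0f;

    /* A difference of two normal floats that is smaller than 2^-126
       is exact in double, so this threshold test is reliable. */
    double d = (double)src1 - (double)src2;
    if (d > -(double)FLT_MIN && d < (double)FLT_MIN)
        return (d < 0.0) ? -0.0f : 0.0f;    /* flush to a signed zero */

    /* Overflow: largest normal number with the sign of source 1.
       Assumption: a result the host rounds to infinity is clamped too. */
    float r = src1 - src2;
    if (r > FLT_MAX || r < -FLT_MAX)
        return signbit(src1) ? -FLT_MAX : FLT_MAX;
    return r;
}
```

For example, `pfsub_lane(FLT_MAX, -FLT_MAX)` returns `FLT_MAX` rather than infinity, and `pfsub_lane(1.5f * FLT_MIN, FLT_MIN)` returns positive zero rather than a denormal.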

#### **Related Instructions**

**PFSUBR** 

#### rFLAGS Affected

None

#### **Exceptions**

| Exception                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                            |
|---------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD       | X    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                     |
|                           | X    | Х               | Х         | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                 |
| Stack, #SS                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                       |
| General protection, #GP   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                          |
|                           |      |                 | Х         | A null data segment was used to reference memory.                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                      |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                             |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled. |


## **PFSUBR**

# **Packed Floating-Point Subtract Reverse**

Subtracts each packed single-precision floating-point value in the first source operand from the corresponding packed single-precision floating-point value in the second source operand and writes the result of each subtraction in the corresponding dword of the destination (first source). The first source/destination operand is an MMX register. The second source operand is another MMX register or 64-bit memory location. The numeric range for operands is shown in Table 1-18 on page 144.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PFSUBR mmx1, mmx2/mem64 | 0F 0F /r AA | Subtracts packed single-precision floating-point values in an MMX register from packed single-precision floating-point values in another MMX register or 64-bit memory location and writes the result in the destination MMX register. |
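The operand order is the only difference from PFSUB, and it is easy to get backwards. A minimal C sketch of the lane-wise operation follows; the `mmx_ps` type and the `pfsubr` function are hypothetical names for illustration, and the special-case range handling for this instruction is deliberately left out.

```c
/* Two packed single-precision lanes, standing in for an MMX register.
   mmx_ps and pfsubr are hypothetical names used only for this sketch. */
typedef struct { float lane[2]; } mmx_ps;

/* dst is the first source/destination, src is the second source.
   Each result lane is src - dst (the reverse of PFSUB's dst - src). */
static mmx_ps pfsubr(mmx_ps dst, mmx_ps src)
{
    mmx_ps result;
    for (int i = 0; i < 2; i++)
        result.lane[i] = src.lane[i] - dst.lane[i];
    return result;
}
```

With `dst = {1.0, 10.0}` and `src = {4.0, 2.0}`, the sketch yields `{3.0, -8.0}`.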




Table 1-18. Numeric Range for the PFSUBR Results

| Source 1 and Destination | Source 2: 0       | Source 2: Normal           | Source 2: Unsupported |
|--------------------------|-------------------|----------------------------|-----------------------|
| 0                        | +/- 0<sup>1</sup> | Source 2                   | Source 2              |
| Normal                   | - Source 1        | Normal, +/- 0<sup>2</sup>  | Undefined             |
| Unsupported<sup>3</sup>  | - Source 1        | Undefined                  | Undefined             |

#### Note:

- 1. The sign is the logical AND of the sign of source 2 and the inverse of the sign of source 1.
- 2. If the absolute value of the infinitely precise result is less than 2<sup>-126</sup> (but not zero), the result is a zero. If the source operand that is larger in magnitude is source 2, the sign of this zero is the same as the sign of source 2, else it is the inverse of the sign of source 1. If the infinitely precise result is exactly zero, the result is zero with the sign of source 2. If the absolute value of the infinitely precise result is greater than or equal to 2<sup>128</sup>, the result is the largest normal number with the sign of source 2.
- 3. "Unsupported" means that the exponent is all ones (1s).

#### **Related Instructions**

**PFSUB** 

#### rFLAGS Affected

None

#### **Exceptions**

| Exception                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                            |
|---------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD       | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                     |
|                           | Х    | Х               | Х         | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                 |
| Stack, #SS                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                       |
| General protection, #GP   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                          |
|                           |      |                 | Х         | A null data segment was used to reference memory.                                                             |
| Page fault, #PF                           |      | Χ               | Х         | A page fault resulted from the execution of the instruction.                      |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                             |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled. |

# PI2FD

# Packed Integer to Floating-Point Doubleword Conversion

Converts two packed 32-bit signed integer values in an MMX register or a 64-bit memory location to two packed single-precision floating-point values and writes the converted values in another MMX register. If the result of the conversion is an inexact value, the value is truncated (rounded toward zero).

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PI2FD mmx1, mmx2/mem64 | 0F 0F /r 0D | Converts packed doubleword integers in an MMX register or 64-bit memory location to single-precision floating-point values in the destination MMX register. Inexact results are truncated. |
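Truncation matters only for integers that need more than 24 significant bits, because a single-precision value cannot hold them exactly. The following C sketch shows one way to emulate the per-element conversion without changing the host rounding mode; `i32_to_f32_trunc` is a hypothetical helper name, and the approach (clearing the bits that do not fit before converting) is an illustration rather than a description of the hardware.

```c
#include <stdint.h>

/* Convert one signed 32-bit integer to float, rounding toward zero.
   i32_to_f32_trunc is a hypothetical name used only for this sketch. */
static float i32_to_f32_trunc(int32_t x)
{
    /* Work on the magnitude so truncation is toward zero for both signs. */
    uint32_t mag = (x < 0) ? 0u - (uint32_t)x : (uint32_t)x;

    /* Find the position of the most significant set bit. */
    int top = 31;
    while (top > 0 && ((mag >> top) & 1u) == 0)
        top--;

    /* Keep only the top 24 bits; the rest would not fit in the significand. */
    if (top > 23)
        mag &= ~((1u << (top - 23)) - 1u);

    float f = (float)mag;   /* exact: at most 24 significant bits remain */
    return (x < 0) ? -f : f;
}
```

For 16777219 (2<sup>24</sup> + 3) the sketch returns 16777218.0, whereas a round-to-nearest conversion would give 16777220.0.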



#### **Related Instructions**

PF2ID, PF2IW, PI2FW

#### **rFLAGS** Affected

None

#### **Exceptions**

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                            |
|----------------------------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Χ    | X               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                     |
|                                              | X    | Х               | Х         | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM                    | Χ    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                 |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                          |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                         |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                             |

## PI2FW

# **Packed Integer to Floating-Point Word Conversion**

Converts two packed 16-bit signed integer values in an MMX register or a 64-bit memory location to two packed single-precision floating-point values and writes the converted values in another MMX register.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PI2FW mmx1, mmx2/mem64 | 0F 0F /r 0C | Converts packed 16-bit integers in an MMX register or 64-bit memory location to packed single-precision floating-point values in the destination MMX register. |
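A 16-bit integer always fits exactly in a single-precision value, so no rounding rule is needed here. The C sketch below illustrates the conversion; `pi2fw` is a hypothetical function name, and the choice of source words (the low-order word of each doubleword, bits 15-0 and 47-32) is an assumption, since the text above does not state which two of the four words are converted.

```c
#include <stdint.h>

/* Convert two signed 16-bit words of a 64-bit source to two floats.
   pi2fw is a hypothetical name used only for this sketch.
   Assumption: the converted words are bits 15-0 and bits 47-32. */
static void pi2fw(uint64_t src, float out[2])
{
    out[0] = (float)(int16_t)(uint16_t)(src);        /* bits 15-0  */
    out[1] = (float)(int16_t)(uint16_t)(src >> 32);  /* bits 47-32 */
}
```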



#### **Related Instructions**

PF2ID, PF2IW, PI2FD

### **Exceptions**

| Exception                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                             |
|---------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD       | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                      |
|                           | X    | Х               | Х         | The AMD extensions to 3DNow!™ are not supported, as indicated by bit 30 in CPUID extended function 8000_0001h. |
| Device not available, #NM | Χ    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                  |
| Stack, #SS                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                        |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.              |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                 |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                      |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                             |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled. |

## **PINSRW**

## **Packed Insert Word**

Inserts a 16-bit value from the low-order word of a 32-bit general-purpose register or a 16-bit memory location into an MMX register. The location in the destination register is selected by the immediate byte operand, as shown in Table 1-19. The other words in the destination register operand are not modified.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PINSRW mmx, reg32/mem16, imm8 | 0F C4 /r ib | Inserts a 16-bit value from a general-purpose register or memory location into an MMX register. |



Table 1-19. Immediate-Byte Operand Encoding for 64-Bit PINSRW

| Immediate-Byte<br>Bit Field | Value of Bit Field | Destination Bits Filled |
|-----------------------------|--------------------|-------------------------|
| 1–0                         | 0                  | 15–0                    |
|                             | 1                  | 31–16                   |
|                             | 2                  | 47–32                   |
|                             | 3                  | 63–48                   |
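The selection in Table 1-19 amounts to a shift by 16 times the value of immediate bits 1–0. A minimal C sketch of the insert, treating the MMX register as a 64-bit integer, is shown here; `pinsrw` is a hypothetical function name, and the sketch ignores immediate bits above bit 1 on the assumption that only the bit field listed in the table takes part in the selection.

```c
#include <stdint.h>

/* Insert the low-order word of src into one of the four words of dst.
   pinsrw is a hypothetical name used only for this sketch. */
static uint64_t pinsrw(uint64_t dst, uint32_t src, uint8_t imm8)
{
    unsigned shift = (imm8 & 3u) * 16u;          /* 0, 16, 32, or 48 */
    uint64_t mask = (uint64_t)0xFFFFu << shift;  /* word being replaced */

    /* Clear the selected word, then place the new 16-bit value there.
       The other three words are left unchanged. */
    return (dst & ~mask) | ((uint64_t)(src & 0xFFFFu) << shift);
}
```

For example, inserting from `0xAAAABBBB` with an immediate of 2 into `0x1111222233334444` replaces bits 47–32 and gives `0x1111BBBB33334444`.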

#### **Related Instructions**

**PEXTRW** 


## rFLAGS Affected

None

#### **Exceptions**

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                                                                                                               |  |
|-------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|
| Invalid opcode, #UD                       | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                                                                                                                        |  |
|                                           | Х    | Х               | X         | The SSE instructions are not supported, as indicated by bit 25 in CPUID standard function 1; and the AMD extensions to the MMX™ instruction set are not supported, as indicated by bit 22 of CPUID extended function 8000_0001h. |  |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                                                                                                                    |  |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                                                                                                                          |  |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                                                                                                                             |  |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                                                                                                                                                |  |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                                                                                                                     |  |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                                                                                                                            |  |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                                                                                                                                |  |
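The second #UD condition above is a conjunction: the instruction is undefined only when neither feature is reported, so either feature bit alone is enough to make it available. The C sketch below expresses that test on CPUID results supplied by the caller. The function name `pinsrw_mmx_supported` and its parameters are hypothetical, and reading both bits from the EDX register of each CPUID function is an assumption, because the table names only the bit numbers. The CR0.EM condition is a separate check and is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature test for the 64-bit PINSRW form, given CPUID output values.
   std_fn1_edx: EDX from CPUID standard function 1 (assumed register).
   ext_fn1_edx: EDX from CPUID extended function 8000_0001h (assumed).
   pinsrw_mmx_supported is a hypothetical name used only for this sketch. */
static bool pinsrw_mmx_supported(uint32_t std_fn1_edx, uint32_t ext_fn1_edx)
{
    bool sse     = ((std_fn1_edx >> 25) & 1u) != 0;  /* SSE, bit 25 */
    bool mmx_ext = ((ext_fn1_edx >> 22) & 1u) != 0;  /* AMD MMX extensions, bit 22 */

    /* #UD is raised only if both are absent. */
    return sse || mmx_ext;
}
```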

#### **PMADDWD**

## **Packed Multiply Words and Add Doublewords**

Multiplies each packed 16-bit signed value in the first source operand by the corresponding packed 16-bit signed value in the second source operand, adds the adjacent intermediate 32-bit results of each multiplication (for example, the multiplication results for the adjacent bit fields 63–48 and 47–32, and 31–16 and 15–0), and writes the 32-bit result of each addition in the corresponding doubleword of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PMADDWD mmx1, mmx2/mem64 | 0F F5 /r | Multiplies four packed 16-bit signed values in an MMX register and another MMX register or 64-bit memory location, adds intermediate results, and writes the result in the destination MMX register. |



If all four of the 16-bit source operands used to produce a 32-bit multiply-add result have the value 8000h, the 32-bit result is 8000\_0000h, which is not the correct 32-bit signed result.
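The 8000h case follows from the arithmetic: (-32768) × (-32768) is 2<sup>30</sup>, and the sum of two such products is 2<sup>31</sup>, which does not fit in a signed 32-bit result and wraps to 8000_0000h. A minimal C sketch of the multiply-add on 64-bit integers makes this visible; `pmaddwd` is a hypothetical function name, and the sketch assumes the usual two's-complement behavior when a 16-bit pattern is converted to `int16_t`.

```c
#include <stdint.h>

/* Multiply corresponding signed words and add adjacent products.
   pmaddwd is a hypothetical name used only for this sketch. */
static uint64_t pmaddwd(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (int i = 0; i < 2; i++) {
        /* Low and high word of doubleword i in each operand. */
        int16_t a_lo = (int16_t)(uint16_t)(a >> (32 * i));
        int16_t a_hi = (int16_t)(uint16_t)(a >> (32 * i + 16));
        int16_t b_lo = (int16_t)(uint16_t)(b >> (32 * i));
        int16_t b_hi = (int16_t)(uint16_t)(b >> (32 * i + 16));

        /* Each product fits in 32 bits; the sum is formed with unsigned
           arithmetic so that it wraps instead of overflowing. */
        uint32_t sum = (uint32_t)((int32_t)a_lo * b_lo)
                     + (uint32_t)((int32_t)a_hi * b_hi);
        result |= (uint64_t)sum << (32 * i);
    }
    return result;
}
```

With words 1, 2, 3, 4 in one operand and 5, 6, 7, 8 in the other, the low doubleword is 1×5 + 2×6 = 17 and the high doubleword is 3×7 + 4×8 = 53.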

#### **Related Instructions**

PMULHUW, PMULHW, PMULLW, PMULUDQ


## rFLAGS Affected

None

#### **Exceptions**

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Х    | Х       | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                              | Х    | Х       | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                    | Х    | Х       | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                   | Х    | Х       | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                      | Х    | Х       | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                              |      |         | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                              |      | Х       | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х       | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                         |      | Х       | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

#### **PMAXSW**

# **Packed Maximum Signed Words**

Compares each of the packed 16-bit signed integer values in the first source operand with the corresponding packed 16-bit signed integer value in the second source operand and writes the maximum of the two values for each comparison in the corresponding word of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PMAXSW mmx1, mmx2/mem64 | 0F EE /r | Compares packed signed 16-bit integer values in an MMX register and another MMX register or 64-bit memory location and writes the maximum value of each compare in the destination MMX register. |
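Because the comparison is signed, a word such as FFFFh counts as -1 and loses to 0000h. The following C sketch of the four-lane maximum illustrates this; `pmaxsw` is a hypothetical function name, and the sketch assumes the usual two's-complement behavior when a 16-bit pattern is converted to `int16_t`.

```c
#include <stdint.h>

/* Lane-wise maximum of four signed 16-bit words.
   pmaxsw is a hypothetical name used only for this sketch. */
static uint64_t pmaxsw(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (int i = 0; i < 4; i++) {
        /* Reinterpret each 16-bit field as a signed value. */
        int16_t x = (int16_t)(uint16_t)(a >> (16 * i));
        int16_t y = (int16_t)(uint16_t)(b >> (16 * i));
        int16_t m = (x > y) ? x : y;
        result |= (uint64_t)(uint16_t)m << (16 * i);
    }
    return result;
}
```

For the words 7FFFh, 8000h, 0001h, FFFFh compared against 0000h, 0000h, 0002h, 0000h, the result is 7FFFh, 0000h, 0002h, 0000h.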



#### **Related Instructions**

PMAXUB, PMINSW, PMINUB

#### rFLAGS Affected

None

#### **Exceptions**

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                                                                                          |  |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|
| Invalid opcode, #UD                          | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                                                                                                   |  |
|                                              | Х    | Х               | X         | The SSE instructions are not supported, as indicated by bit 25 in CPUID standard function 1; and the AMD extensions to MMX are not supported, as indicated by bit 22 of CPUID extended function 8000_0001h. |  |
| Device not available, #NM                    | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                                                                                               |  |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                                                                                                     |  |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                                                                                                        |  |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                                                                                                           |  |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                                                                                                |  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                                                                                                       |  |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                                                                                                           |  |

## **PMAXUB**

# **Packed Maximum Unsigned Bytes**

Compares each of the packed 8-bit unsigned integer values in the first source operand with the corresponding packed 8-bit unsigned integer value in the second source operand and writes the maximum of the two values for each comparison in the corresponding byte of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PMAXUB mmx1, mmx2/mem64 | 0F DE /r | Compares packed unsigned 8-bit integer values in an MMX register and another MMX register or 64-bit memory location and writes the maximum value of each compare in the destination MMX register. |
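Here the comparison is unsigned, so a byte such as FFh is the largest possible value (255) rather than -1, and 80h is greater than 7Fh. A minimal C sketch over eight byte lanes follows; `pmaxub` is a hypothetical function name used only for illustration.

```c
#include <stdint.h>

/* Lane-wise maximum of eight unsigned bytes.
   pmaxub is a hypothetical name used only for this sketch. */
static uint64_t pmaxub(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t x = (uint8_t)(a >> (8 * i));
        uint8_t y = (uint8_t)(b >> (8 * i));
        result |= (uint64_t)((x > y) ? x : y) << (8 * i);
    }
    return result;
}
```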



#### **Related Instructions**

PMAXSW, PMINSW, PMINUB

#### rFLAGS Affected

None

#### **Exceptions**

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                                                                                          |  |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|
| Invalid opcode, #UD                          | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                                                                                                   |  |
|                                              | Х    | Х               | X         | The SSE instructions are not supported, as indicated by bit 25 in CPUID standard function 1; and the AMD extensions to MMX are not supported, as indicated by bit 22 of CPUID extended function 8000_0001h. |  |
| Device not available, #NM                    | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                                                                                               |  |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                                                                                                     |  |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                                                                                                        |  |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                                                                                                           |  |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                                                                                                |  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                                                                                                       |  |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                                                                                                           |  |

#### **PMINSW**

# **Packed Minimum Signed Words**

Compares each of the packed 16-bit signed integer values in the first source operand with the corresponding packed 16-bit signed integer value in the second source operand and writes the minimum of the two values for each comparison in the corresponding word of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PMINSW mmx1, mmx2/mem64 | 0F EA /r | Compares packed signed 16-bit integer values in an MMX register and another MMX register or 64-bit memory location and writes the minimum value of each compare in the destination MMX register. |



#### **Related Instructions**

PMAXSW, PMAXUB, PMINUB

#### rFLAGS Affected

None

#### **Exceptions**

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                                                                                          |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | X    | X               | X         | The emulate bit (EM) of CR0 was set to 1.                                                                                                                                                                   |
|                                              | X    | X               | X         | The SSE instructions are not supported, as indicated by bit 25 in CPUID standard function 1; and the AMD extensions to MMX are not supported, as indicated by bit 22 of CPUID extended function 8000_0001h. |
| Device not available, #NM                    | X    | X               | X         | The task-switch bit (TS) of CR0 was set to 1.                                                                                                                                                               |
| Stack, #SS                                   | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                                                                                                                                     |
| General protection, #GP                      | X    | X               | X         | A memory address exceeded a data segment limit or was non-canonical.                                                                                                                                        |
|                                              |      |                 | X         | A null data segment was used to reference memory.                                                                                                                                                           |
| Page fault, #PF                              |      | X               | X         | A page fault resulted from the execution of the instruction.                                                                                                                                                |
| x87 floating-point<br>exception pending, #MF | X    | X               | X         | An unmasked x87 floating-point exception was pending.                                                                                                                                                       |
| Alignment check, #AC                         |      | X               | X         | An unaligned memory reference was performed while alignment checking was enabled.                                                                                                                           |

#### **PMINUB**

## **Packed Minimum Unsigned Bytes**

Compares each of the packed 8-bit unsigned integer values in the first source operand with the corresponding packed 8-bit unsigned integer value in the second source operand and writes the minimum of the two values for each comparison in the corresponding byte of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

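| Mnemonic                | Opcode   | Description                                                                                                                                                                                            |
|-------------------------|----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PMINUB mmx1, mmx2/mem64 | 0F DA /r | Compares packed unsigned 8-bit integer values in an MMX register and another MMX register or 64-bit memory location and writes the minimum value of each comparison in the destination MMX register. |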
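
The byte-wise comparison described above can be modeled in scalar code. The following is a minimal sketch, not the hardware implementation: the function name `pminub` is hypothetical, and each 64-bit operand is assumed to be passed as a plain Python integer with byte 0 in the low-order bits.

```python
def pminub(dst: int, src: int) -> int:
    """Scalar model of PMINUB (hypothetical helper, not an AMD API).

    dst and src are 64-bit packed values; the return value is what the
    destination MMX register would hold after the instruction.
    """
    result = 0
    for i in range(8):
        a = (dst >> (8 * i)) & 0xFF  # unsigned byte i of the first source
        b = (src >> (8 * i)) & 0xFF  # unsigned byte i of the second source
        result |= min(a, b) << (8 * i)
    return result
```

Because the bytes are treated as unsigned, 80h compares greater than 7Fh; a signed comparison would order them the other way.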



#### **Related Instructions**

PMAXSW, PMAXUB, PMINSW

#### rFLAGS Affected

None


| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                                                                                          |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | X    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                                                                                                   |
|                                              | Х    | Х               | Х         | The SSE instructions are not supported, as indicated by bit 25 in CPUID standard function 1; and the AMD extensions to MMX are not supported, as indicated by bit 22 of CPUID extended function 8000_0001h. |
| Device not available, #NM                    | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                                                                                               |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                                                                                                     |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                                                                                                        |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                                                                                                           |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                                                                                                |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                                                                                                       |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                                                                                                           |
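
The second #UD row joins two feature bits with "and": the exception for lack of support is raised only when both SSE and the AMD extensions to MMX are absent. A minimal sketch of that check follows. The function name and its two parameters are hypothetical, and the sketch assumes the caller has already read the EDX feature flags returned by CPUID standard function 1 and extended function 8000_0001h.

```python
SSE_BIT = 25          # CPUID standard function 1, per the table above
AMD_MMX_EXT_BIT = 22  # CPUID extended function 8000_0001h, per the table above


def pminub_supported(std_edx: int, ext_edx: int) -> bool:
    """Return True if either feature bit reports support (hypothetical helper).

    std_edx: assumed EDX feature flags from CPUID function 1.
    ext_edx: assumed EDX feature flags from CPUID function 8000_0001h.
    """
    has_sse = bool((std_edx >> SSE_BIT) & 1)
    has_amd_mmx_ext = bool((ext_edx >> AMD_MMX_EXT_BIT) & 1)
    # #UD for lack of support requires BOTH bits to be clear.
    return has_sse or has_amd_mmx_ext
```

This covers only the feature-bit cause; the instruction still raises #UD when CR0.EM is set to 1.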

#### **PMOVMSKB**

## **Packed Move Mask Byte**

Moves the most-significant bit of each byte in the source operand to the destination, with zero-extension to 32 bits. The destination operand is a 32-bit general-purpose register and the source operand is an MMX register.

If the source operand is an XMM register, the result is written to the low-order word of the general-purpose register. If the source operand is an MMX register, the result is written to the low-order byte of the general-purpose register.

| Mnemonic            | Opcode   | Description                                                                                                            |
|---------------------|----------|------------------------------------------------------------------------------------------------------------------------|
| PMOVMSKB reg32, mmx | 0F D7 /r | Moves most-significant bit of each byte in an MMX register to the low-order byte of a 32-bit general-purpose register. |
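
The mask extraction for the MMX form can be sketched as follows. This is an illustrative scalar model under the assumption that the source register is passed as a 64-bit integer; the name `pmovmskb_mmx` is hypothetical.

```python
def pmovmskb_mmx(src: int) -> int:
    """Scalar model of PMOVMSKB with an MMX source (hypothetical helper).

    Bit i of the result is the most-significant bit of byte i of src.
    The upper 24 bits of the 32-bit result are zero.
    """
    mask = 0
    for i in range(8):
        msb = (src >> (8 * i + 7)) & 1  # sign bit of byte i
        mask |= msb << i
    return mask
```

A typical use is to follow a packed compare with PMOVMSKB and then test the resulting byte mask with ordinary integer instructions.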



#### **Related Instructions**

MOVMSKPD, MOVMSKPS

#### rFLAGS Affected

None

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                                                                                          |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                                                                                                   |
|                                           | X    | X               | X         | The SSE instructions are not supported, as indicated by bit 25 in CPUID standard function 1; and the AMD extensions to MMX are not supported, as indicated by bit 22 of CPUID extended function 8000_0001h. |
| Device not available,<br>#NM              | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                                                                                               |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                                                                                                       |

#### **PMULHRW**

## **Packed Multiply High Rounded Word**

Multiplies each of the four packed 16-bit signed integer values in the first source operand by the corresponding packed 16-bit integer value in the second source operand, adds 8000h to the lower 16 bits of the intermediate 32-bit result of each multiplication, and writes the high-order 16 bits of each result in the corresponding word of the destination (first source). The addition of 8000h results in the rounding of the result, providing a numerically more accurate result than the PMULHW instruction, which truncates the result. The first source/destination operand is an MMX register. The second source operand is another MMX register or 64-bit memory location.

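| Mnemonic                 | Opcode      | Description                                                                                                                                                            |
|--------------------------|-------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PMULHRW mmx1, mmx2/mem64 | 0F 0F /r B7 | Multiply 16-bit signed integer values in an MMX register and another MMX register or 64-bit memory location and write rounded result in the destination MMX register. |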
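
The effect of the 8000h rounding term is easiest to see in a scalar model. The sketch below is an assumption-laden illustration, not the hardware algorithm: the name `pmulhrw` is hypothetical, operands are 64-bit integers, and the carry out of the low-order 16 bits is assumed to propagate into the high-order word (otherwise the addition would not round).

```python
def pmulhrw(dst: int, src: int) -> int:
    """Scalar model of PMULHRW (hypothetical helper)."""

    def signed16(x: int) -> int:
        # Reinterpret a 16-bit pattern as a two's-complement value.
        return x - 0x10000 if x & 0x8000 else x

    result = 0
    for i in range(4):
        a = signed16((dst >> (16 * i)) & 0xFFFF)
        b = signed16((src >> (16 * i)) & 0xFFFF)
        rounded = a * b + 0x8000              # round to nearest
        high = (rounded >> 16) & 0xFFFF       # keep the high-order word
        result |= high << (16 * i)
    return result
```

For example, 4000h times 0003h gives the product 0000_C000h. Truncation would return 0000h, while the rounded form returns 0001h.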



#### **Related Instructions**

None

#### rFLAGS Affected

None


| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                            |
|-------------------------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                     |
|                                           | X    | Х               | Х         | The AMD 3DNow!™ instructions are not supported, as indicated by bit 31 in CPUID extended function 8000_0001h. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                 |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                       |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                          |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                             |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                  |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                         |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                             |

#### **PMULHUW**

## **Packed Multiply High Unsigned Word**

Multiplies each packed unsigned 16-bit value in the first source operand by the corresponding packed unsigned word in the second source operand and writes the high-order 16 bits of each intermediate 32-bit result in the corresponding word of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

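| Mnemonic                 | Opcode   | Description                                                                                                                                                                                                             |
|--------------------------|----------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PMULHUW mmx1, mmx2/mem64 | 0F E4 /r | Multiplies packed 16-bit values in an MMX register by the packed 16-bit values in another MMX register or 64-bit memory location and writes the high-order 16 bits of each result in the destination MMX register. |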
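
A scalar sketch of the unsigned high-word multiply follows. The function name `pmulhuw` is hypothetical and the operands are assumed to be 64-bit integers holding four packed words.

```python
def pmulhuw(dst: int, src: int) -> int:
    """Scalar model of PMULHUW (hypothetical helper)."""
    result = 0
    for i in range(4):
        a = (dst >> (16 * i)) & 0xFFFF   # unsigned word i, first source
        b = (src >> (16 * i)) & 0xFFFF   # unsigned word i, second source
        high = (a * b) >> 16             # high-order 16 bits of the 32-bit product
        result |= high << (16 * i)
    return result
```

The largest possible product, FFFFh times FFFFh, is FFFE_0001h, so the high-order word never exceeds FFFEh.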



#### **Related Instructions**

PMADDWD, PMULHW, PMULLW, PMULUDQ

#### rFLAGS Affected

None


| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                                                                                          |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                                                                                                   |
|                                              | Х    | Х               | Х         | The SSE instructions are not supported, as indicated by bit 25 in CPUID standard function 1; and the AMD extensions to MMX are not supported, as indicated by bit 22 of CPUID extended function 8000_0001h. |
| Device not available, #NM                    | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                                                                                               |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                                                                                                     |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                                                                                                        |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                                                                                                           |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                                                                                                |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                                                                                                       |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                                                                                                           |

#### **PMULHW**

## **Packed Multiply High Signed Word**

Multiplies each packed 16-bit signed integer value in the first source operand by the corresponding packed 16-bit signed integer in the second source operand and writes the high-order 16 bits of the intermediate 32-bit result of each multiplication in the corresponding word of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

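| Mnemonic                | Opcode   | Description                                                                                                                                                                                                |
|-------------------------|----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PMULHW mmx1, mmx2/mem64 | 0F E5 /r | Multiplies packed 16-bit signed integer values in an MMX register and another MMX register or 64-bit memory location and writes the high-order 16 bits of each result in the destination MMX register. |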
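
The signed variant differs from the unsigned one only in how each word is interpreted before the multiply. A minimal sketch, with the hypothetical name `pmulhw` and 64-bit integer operands assumed:

```python
def pmulhw(dst: int, src: int) -> int:
    """Scalar model of PMULHW (hypothetical helper)."""

    def signed16(x: int) -> int:
        # Reinterpret a 16-bit pattern as a two's-complement value.
        return x - 0x10000 if x & 0x8000 else x

    result = 0
    for i in range(4):
        a = signed16((dst >> (16 * i)) & 0xFFFF)
        b = signed16((src >> (16 * i)) & 0xFFFF)
        # Python's >> on negative integers is an arithmetic shift, so the
        # masked value is the two's-complement high-order word.
        high = ((a * b) >> 16) & 0xFFFF
        result |= high << (16 * i)
    return result
```

Note that a small negative product such as -1 truncates to FFFFh rather than 0000h, because the high-order word of FFFF_FFFFh is all ones.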



#### **Related Instructions**

PMADDWD, PMULHUW, PMULLW, PMULUDQ

#### rFLAGS Affected

None


| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                              | X    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                    | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

#### **PMULLW**

## **Packed Multiply Low Signed Word**

Multiplies each packed 16-bit signed integer value in the first source operand by the corresponding packed 16-bit signed integer in the second source operand and writes the low-order 16 bits of the intermediate 32-bit result of each multiplication in the corresponding word of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

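| Mnemonic                | Opcode   | Description                                                                                                                                                                                               |
|-------------------------|----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PMULLW mmx1, mmx2/mem64 | 0F D5 /r | Multiplies packed 16-bit signed integer values in an MMX register and another MMX register or 64-bit memory location and writes the low-order 16 bits of each result in the destination MMX register. |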
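
A scalar sketch of the low-word multiply is shown below; the name `pmullw` is hypothetical and the operands are assumed to be 64-bit integers. The sketch multiplies the raw 16-bit patterns, which is safe here because the low-order 16 bits of a product are the same whether the inputs are read as signed or unsigned.

```python
def pmullw(dst: int, src: int) -> int:
    """Scalar model of PMULLW (hypothetical helper)."""
    result = 0
    for i in range(4):
        a = (dst >> (16 * i)) & 0xFFFF
        b = (src >> (16 * i)) & 0xFFFF
        low = (a * b) & 0xFFFF           # low-order 16 bits of the product
        result |= low << (16 * i)
    return result
```

Pairing this result with the high-order word from PMULHW yields the full signed 32-bit product of each word pair.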



#### **Related Instructions**

PMADDWD, PMULHUW, PMULHW, PMULUDQ

#### rFLAGS Affected

None


| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                        |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                                 |
|                                           | X    | X               | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h.             |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                             |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                                   |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                                      |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                                                         |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                              |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                                     |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                                         |

#### **PMULUDQ**

## **Packed Multiply Unsigned Doubleword and Store Quadword**

Multiplies two 32-bit unsigned integer values in the low-order doubleword of the first and second source operands and writes the 64-bit result in the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic                 | Opcode   | Description                                                                                                                                                                            |
|--------------------------|----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PMULUDQ mmx1, mmx2/mem64 | 0F F4 /r | Multiplies low-order 32-bit unsigned integer value in an MMX register and another MMX register or 64-bit memory location and writes the 64-bit result in the destination MMX register. |
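
Only the low-order doubleword of each operand participates in the multiply. The following sketch illustrates this; the name `pmuludq` is hypothetical and the operands are assumed to be 64-bit integers.

```python
def pmuludq(dst: int, src: int) -> int:
    """Scalar model of the MMX form of PMULUDQ (hypothetical helper)."""
    a = dst & 0xFFFFFFFF   # low-order doubleword of the first source
    b = src & 0xFFFFFFFF   # low-order doubleword of the second source
    # The product of two 32-bit unsigned values always fits in 64 bits.
    return a * b
```

The high-order doublewords of both sources are ignored, so whatever they contain does not affect the result.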



#### **Related Instructions**

PMADDWD, PMULHUW, PMULHW, PMULLW

#### rFLAGS Affected

None

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                            |
|-------------------------------------------|------|-----------------|-----------|-----------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                     |
|                                           | X    | X               | X         | The SSE2 instructions are not supported, as indicated by bit 26 in CPUID standard function 1. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                 |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                       |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                          |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                             |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                  |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                         |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.             |

#### **POR**

## **Packed Logical Bitwise OR**

Performs a bitwise logical OR of the values in the first and second source operands and writes the result in the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

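| Mnemonic             | Opcode   | Description                                                                                                                                                              |
|----------------------|----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| POR mmx1, mmx2/mem64 | 0F EB /r | Performs bitwise logical OR of values in an MMX register and in another MMX register or 64-bit memory location and writes the result in the destination MMX register. |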



#### **Related Instructions**

PAND, PANDN, PXOR

#### rFLAGS Affected

None


| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                              | Х    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available,<br>#NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

#### **PSADBW**

## **Packed Sum of Absolute Differences of Bytes Into a Word**

Computes the absolute differences of eight corresponding packed 8-bit unsigned integers in the first and second source operands and writes the unsigned 16-bit integer result of the sum of the eight differences in a word in the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location. The result is stored in the low-order word of the destination operand, and the remaining bytes in the destination are cleared to all 0s.

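| Mnemonic                | Opcode   | Description                                                                                                                                                                                                                                 |
|-------------------------|----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PSADBW mmx1, mmx2/mem64 | 0F F6 /r | Computes the sum of the absolute differences of packed 8-bit unsigned integer values in an MMX register and another MMX register or 64-bit memory location and writes the 16-bit unsigned integer result in the destination MMX register. |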
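
The sum-of-absolute-differences computation can be sketched in scalar form as follows. The name `psadbw` is hypothetical and the operands are assumed to be 64-bit integers holding eight packed unsigned bytes.

```python
def psadbw(dst: int, src: int) -> int:
    """Scalar model of the MMX form of PSADBW (hypothetical helper)."""
    total = 0
    for i in range(8):
        a = (dst >> (8 * i)) & 0xFF
        b = (src >> (8 * i)) & 0xFF
        total += abs(a - b)              # absolute difference of byte i
    # The sum occupies the low-order word; the remaining bits are zero.
    return total & 0xFFFF
```

The largest possible sum is 8 times 255, or 07F8h, so the 16-bit result cannot overflow.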



#### rFLAGS Affected

None


| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                                                                                          |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | X    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                                                                                                   |
|                                              | Х    | Х               | Х         | The SSE instructions are not supported, as indicated by bit 25 in CPUID standard function 1; and the AMD extensions to MMX are not supported, as indicated by bit 22 of CPUID extended function 8000_0001h. |
| Device not available, #NM                    | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                                                                                               |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                                                                                                     |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                                                                                                        |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                                                                                                           |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                                                                                                |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                                                                                                       |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                                                                                                           |

#### **PSHUFW**

## **Packed Shuffle Words**

Moves any one of the four packed words in an MMX register or 64-bit memory location to a specified word location in another MMX register. The source word for each destination word is selected by a two-bit field in the immediate-byte operand: bits 0 and 1 select the contents of the low-order word, bits 2 and 3 select the second word, bits 4 and 5 select the third word, and bits 6 and 7 select the high-order word. Refer to Table 1-20 on page 179. A word in the source operand may be copied to more than one word in the destination.

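| Mnemonic                      | Opcode      | Description                                                                                                             |
|-------------------------------|-------------|-------------------------------------------------------------------------------------------------------------------------|
| PSHUFW mmx1, mmx2/mem64, imm8 | 0F 70 /r ib | Shuffles packed 16-bit values in an MMX register or 64-bit memory location and puts the result in another MMX register. |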



Table 1-20. Immediate-Byte Operand Encoding for PSHUFW

| Destination Bits Filled | Immediate-Byte<br>Bit Field | Value of Bit Field | Source Bits Moved |
|-------------------------|-----------------------------|--------------------|-------------------|
| 15–0                    | 1–0                         | 0                  | 15–0              |
|                         |                             | 1                  | 31–16             |
|                         |                             | 2                  | 47–32             |
|                         |                             | 3                  | 63–48             |
| 31–16                   | 3–2                         | 0                  | 15–0              |
|                         |                             | 1                  | 31–16             |
|                         |                             | 2                  | 47–32             |
|                         |                             | 3                  | 63–48             |
| 47–32                   | 5–4                         | 0                  | 15–0              |
|                         |                             | 1                  | 31–16             |
|                         |                             | 2                  | 47–32             |
|                         |                             | 3                  | 63–48             |
| 63–48                   | 7–6                         | 0                  | 15–0              |
|                         |                             | 1                  | 31–16             |
|                         |                             | 2                  | 47–32             |
|                         |                             | 3                  | 63–48             |
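
As an illustration of the encoding in Table 1-20, the following Python sketch models the word selection in software. The function name `pshufw` and the use of a Python integer to stand for a 64-bit MMX operand are assumptions made for this example; they are not part of the architecture.

```python
def pshufw(src: int, imm8: int) -> int:
    """Model PSHUFW: build a 64-bit result from the four 16-bit words of src."""
    src &= 0xFFFFFFFFFFFFFFFF
    result = 0
    for dest_word in range(4):
        # Each destination word has its own two-bit selector in imm8:
        # bits 1-0 for word 0, bits 3-2 for word 1, and so on.
        selector = (imm8 >> (2 * dest_word)) & 0x3
        word = (src >> (16 * selector)) & 0xFFFF
        result |= word << (16 * dest_word)
    return result


value = 0x4444333322221111
reversed_words = pshufw(value, 0x1B)   # selectors 3, 2, 1, 0 reverse the word order
broadcast_low = pshufw(value, 0x00)    # every destination word copies source word 0
```

With these inputs, `reversed_words` is `0x1111222233334444` and `broadcast_low` is `0x1111111111111111`, which shows that one source word may be copied to several destination words.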

## **Related Instructions**

PSHUFD, PSHUFHW, PSHUFLW

## **rFLAGS Affected**

None

## **Exceptions**

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                                                                                          |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Х    | X               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                                                                                                   |
|                                           | Х    | Х               | Х         | The SSE instructions are not supported, as indicated by bit 25 in CPUID standard function 1; and the AMD extensions to MMX are not supported, as indicated by bit 22 of CPUID extended function 8000_0001h. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                                                                                               |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                                                                                                     |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                                                                                                        |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                                                                                                                           |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                                                                                                |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                                                                                                       |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                                                                                                           |
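
The second #UD condition in the table combines two feature bits: the instruction is unavailable only when neither feature is reported. A minimal sketch of that check follows. It assumes that both bits are read from the EDX value returned by the respective CPUID function (the table does not name the register) and that the caller has already obtained those values; the function and parameter names are hypothetical.

```python
SSE_BIT = 1 << 25      # bit 25, CPUID standard function 1
MMX_EXT_BIT = 1 << 22  # bit 22, CPUID extended function 8000_0001h


def pshufw_available(std_fn1_edx: int, ext_fn8000_0001_edx: int) -> bool:
    """Return True if either feature that provides PSHUFW is reported.

    Both arguments are assumed to be EDX values already returned by CPUID.
    The CR0.EM condition from the same table is not modeled here.
    """
    has_sse = bool(std_fn1_edx & SSE_BIT)
    has_mmx_ext = bool(ext_fn8000_0001_edx & MMX_EXT_BIT)
    return has_sse or has_mmx_ext
```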

#### **PSLLD**

# **Packed Shift Left Logical Doublewords**

Left-shifts each of the packed 32-bit values in the first source operand by the number of bits specified in the second source operand and writes each shifted value in the corresponding doubleword of the destination (first source). The first source/destination and second source operands are:

- an MMX register and another MMX register or 64-bit memory location, or
- an MMX register and an immediate byte value.

The low-order bits that are emptied by the shift operation are cleared to 0. If the shift value is greater than 31, the destination is cleared to all 0s.
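
The behavior described above can be sketched in Python as follows. The function name `pslld` and the representation of the 64-bit MMX operand as a Python integer are assumptions for illustration; the shift count is assumed to be a non-negative integer.

```python
def pslld(dst: int, count: int) -> int:
    """Model PSLLD: shift each of the two packed doublewords left by count."""
    if count > 31:
        # Counts above 31 clear the whole destination.
        return 0
    result = 0
    for i in range(2):
        dword = (dst >> (32 * i)) & 0xFFFFFFFF
        # Bits shifted out of a doubleword are lost; they do not spill
        # into the neighboring doubleword.
        result |= ((dword << count) & 0xFFFFFFFF) << (32 * i)
    return result


shifted = pslld(0x8000000100000003, 1)
```

Here `shifted` is `0x0000000200000006`: the top bit of the upper doubleword is discarded rather than carried.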

| Mnemonic               | Opcode      | Description                                                                                                             |
|------------------------|-------------|-------------------------------------------------------------------------------------------------------------------------|
| PSLLD mmx1, mmx2/mem64 | 0F F2 /r    | Left-shifts packed doublewords in an MMX register by the amount specified in an MMX register or 64-bit memory location. |
| PSLLD mmx, imm8        | 0F 72 /6 ib | Left-shifts packed doublewords in an MMX register by the amount specified in an immediate byte value.                   |
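
To show how the immediate form in the table maps to bytes, the sketch below assembles `0F 72 /6 ib` for a register operand. It assumes the usual ModRM layout, with the mod field set to 11b for a register operand, the `/6` opcode extension in the reg field, and the MMX register number in the r/m field. The helper name `encode_mmx_shift_imm` is hypothetical.

```python
def encode_mmx_shift_imm(opcode: int, extension: int, mmx_reg: int, imm8: int) -> bytes:
    """Assemble 0F <opcode> /<extension> ib for an MMX register operand."""
    if not 0 <= mmx_reg <= 7 or not 0 <= extension <= 7:
        raise ValueError("register and opcode extension must fit in 3 bits")
    modrm = 0xC0 | (extension << 3) | mmx_reg  # mod=11b, reg=extension, r/m=register
    return bytes([0x0F, opcode, modrm, imm8 & 0xFF])


# PSLLD mm1, 4 uses opcode 72h with the /6 extension.
encoding = encode_mmx_shift_imm(0x72, 6, 1, 4)
```

Under these assumptions `encoding.hex(" ")` is `0f 72 f1 04`.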



## **Related Instructions**

PSLLDQ, PSLLQ, PSLLW, PSRAD, PSRAW, PSRLD, PSRLDQ, PSRLQ, PSRLW

## **rFLAGS Affected**

None

## **Exceptions**

| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                              | X    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                    | Χ    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

## **PSLLQ**

# **Packed Shift Left Logical Quadwords**

Left-shifts each 64-bit value in the first source operand by the number of bits specified in the second source operand and writes each shifted value in the corresponding quadword of the destination (first source). The first source/destination and second source operands are:

- an MMX register and another MMX register or 64-bit memory location, or
- an MMX register and an immediate byte value.

The low-order bits that are emptied by the shift operation are cleared to 0. If the shift value is greater than 63, the destination is cleared to all 0s.
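
A minimal Python model of this behavior follows. The name `psllq` is an assumption for illustration, and the sketch assumes that the whole count operand is compared against 63, so a large count clears the destination instead of wrapping around.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def psllq(dst: int, count: int) -> int:
    """Model PSLLQ: shift the 64-bit quadword left by count."""
    if count > 63:
        # Assumption: the full count value is tested, not just its low bits.
        return 0
    return (dst << count) & MASK64


shifted = psllq(0x0123456789ABCDEF, 8)
cleared = psllq(0x0123456789ABCDEF, 0x100000001)
```

Here `shifted` is `0x23456789ABCDEF00` and `cleared` is `0`.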

| Mnemonic               | Opcode      | Description                                                                                                       |
|------------------------|-------------|-------------------------------------------------------------------------------------------------------------------|
| PSLLQ mmx1, mmx2/mem64 | 0F F3 /r    | Left-shifts the quadword in an MMX register by the amount specified in an MMX register or 64-bit memory location. |
| PSLLQ mmx, imm8        | 0F 73 /6 ib | Left-shifts the quadword in an MMX register by the amount specified in an immediate byte value.                   |



## **Related Instructions**

PSLLD, PSLLDQ, PSLLW, PSRAD, PSRAW, PSRLD, PSRLDQ, PSRLQ, PSRLW

## **rFLAGS Affected**

None

## **Exceptions**

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                           | X    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

#### **PSLLW**

# **Packed Shift Left Logical Words**

Left-shifts each of the packed 16-bit values in the first source operand by the number of bits specified in the second source operand and writes each shifted value in the corresponding word of the destination (first source). The first source/destination and second source operands are:

- an MMX register and another MMX register or 64-bit memory location, or
- an MMX register and an immediate byte value.

The low-order bits that are emptied by the shift operation are cleared to 0. If the shift value is greater than 15, the destination is cleared to all 0s.
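
The following Python sketch models the word shift and illustrates that, within each 16-bit word, a left shift by n behaves like multiplication by 2^n modulo 2^16. The function name `psllw` and the integer representation of the MMX operand are assumptions for this example.

```python
def psllw(dst: int, count: int) -> int:
    """Model PSLLW: shift each of the four packed words left by count."""
    if count > 15:
        return 0  # counts above 15 clear the whole destination
    result = 0
    for i in range(4):
        word = (dst >> (16 * i)) & 0xFFFF
        # Mask to 16 bits so overflow from one word never reaches the next.
        result |= ((word << count) & 0xFFFF) << (16 * i)
    return result


# Words, low to high: 1, 2, 3, 0x8000. Doubling 0x8000 overflows to 0.
doubled = psllw(0x8000000300020001, 1)
```

Here `doubled` is `0x0000000600040002`.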

| Mnemonic               | Opcode      | Description                                                                                                       |
|------------------------|-------------|-------------------------------------------------------------------------------------------------------------------|
| PSLLW mmx1, mmx2/mem64 | 0F F1 /r    | Left-shifts packed words in an MMX register by the amount specified in an MMX register or 64-bit memory location. |
| PSLLW mmx, imm8        | 0F 71 /6 ib | Left-shifts packed words in an MMX register by the amount specified in an immediate byte value.                   |



## **Related Instructions**

PSLLD, PSLLDQ, PSLLQ, PSRAD, PSRAW, PSRLD, PSRLDQ, PSRLQ, PSRLW

## **rFLAGS Affected**

None

## **Exceptions**

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                           | Х    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | Χ    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

### **PSRAD**

# **Packed Shift Right Arithmetic Doublewords**

Right-shifts each of the packed 32-bit values in the first source operand by the number of bits specified in the second source operand and writes each shifted value in the corresponding doubleword of the destination (first source). The first source/destination and second source operands are:

- an MMX register and another MMX register or 64-bit memory location, or
- an MMX register and an immediate byte value.

The high-order bits that are emptied by the shift operation are filled with the sign bit of the doubleword's initial value. If the shift value is greater than 31, each doubleword in the destination is filled with the sign bit of the doubleword's initial value.
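
The sign-fill behavior can be sketched in Python as shown below. The function name `psrad` is an assumption for illustration. The sketch relies on the fact that Python's `>>` on a negative integer is an arithmetic shift, and it treats any count above 31 like a count of 31, which yields a doubleword made entirely of sign bits.

```python
def psrad(dst: int, count: int) -> int:
    """Model PSRAD: arithmetic right shift of each packed doubleword."""
    count = min(count, 31)  # larger counts leave only copies of the sign bit
    result = 0
    for i in range(2):
        dword = (dst >> (32 * i)) & 0xFFFFFFFF
        if dword & 0x80000000:
            dword -= 1 << 32  # reinterpret as a signed 32-bit value
        result |= ((dword >> count) & 0xFFFFFFFF) << (32 * i)
    return result


# Upper doubleword is -16, lower doubleword is +16.
quartered = psrad(0xFFFFFFF000000010, 2)
saturated = psrad(0xFFFFFFF000000010, 40)
```

Here `quartered` is `0xFFFFFFFC00000004` (that is, -4 and +4) and `saturated` is `0xFFFFFFFF00000000`.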

| Mnemonic               | Opcode      | Description                                                                                                              |
|------------------------|-------------|--------------------------------------------------------------------------------------------------------------------------|
| PSRAD mmx1, mmx2/mem64 | 0F E2 /r    | Right-shifts packed doublewords in an MMX register by the amount specified in an MMX register or 64-bit memory location. |
| PSRAD mmx, imm8        | 0F 72 /4 ib | Right-shifts packed doublewords in an MMX register by the amount specified in an immediate byte value.                   |



## **Related Instructions**

PSLLD, PSLLDQ, PSLLQ, PSLLW, PSRAW, PSRLD, PSRLDQ, PSRLQ, PSRLW

## **rFLAGS Affected**

None

## **Exceptions**

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | X    | X               | X         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                           | X    | X               | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | X    | X               | X         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                   | X    | X               | X         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                           |      |                 | X         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                           |      | X               | X         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point exception pending, #MF | X    | X               | X         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                      |      | X               | X         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

### **PSRAW**

# **Packed Shift Right Arithmetic Words**

Right-shifts each of the packed 16-bit values in the first source operand by the number of bits specified in the second source operand and writes each shifted value in the corresponding word of the destination (first source). The first source/destination and second source operands are:

- an MMX register and another MMX register or 64-bit memory location, or
- an MMX register and an immediate byte value.

The high-order bits that are emptied by the shift operation are filled with the sign bit of the word's initial value. If the shift value is greater than 15, each word in the destination is filled with the sign bit of the word's initial value.
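
As a sketch of the word-wise arithmetic shift, the example below uses Python's `struct` module to view the 64-bit operand as four signed 16-bit words. The function name `psraw` is an assumption for illustration, as is the little-endian word order used to number the words from low to high.

```python
import struct


def psraw(dst: int, count: int) -> int:
    """Model PSRAW: arithmetic right shift of each of the four packed words."""
    count = min(count, 15)  # larger counts leave only copies of each sign bit
    raw = (dst & 0xFFFFFFFFFFFFFFFF).to_bytes(8, "little")
    words = struct.unpack("<4h", raw)          # four signed 16-bit words
    shifted = [w >> count for w in words]      # >> is arithmetic on Python ints
    return int.from_bytes(struct.pack("<4h", *shifted), "little")


# Words, low to high: -8, 8, -1, 0x7FFF.
quartered = psraw(0x7FFFFFFF0008FFF8, 2)
sign_only = psraw(0x7FFFFFFF0008FFF8, 16)
```

Here `quartered` is `0x1FFFFFFF0002FFFE` and `sign_only` is `0x0000FFFF0000FFFF`: each negative word becomes all 1s and each non-negative word becomes 0.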

| Mnemonic               | Opcode      | Description                                                                                                        |
|------------------------|-------------|--------------------------------------------------------------------------------------------------------------------|
| PSRAW mmx1, mmx2/mem64 | 0F E1 /r    | Right-shifts packed words in an MMX register by the amount specified in an MMX register or 64-bit memory location. |
| PSRAW mmx, imm8        | 0F 71 /4 ib | Right-shifts packed words in an MMX register by the amount specified in an immediate byte value.                   |



## **Related Instructions**

PSLLD, PSLLDQ, PSLLQ, PSLLW, PSRAD, PSRLD, PSRLDQ, PSRLQ, PSRLW

## **rFLAGS Affected**

None

## **Exceptions**

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | X    | X               | X         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                           | X    | X               | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | X    | X               | X         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                   | X    | X               | X         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                           |      |                 | X         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                           |      | X               | X         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point exception pending, #MF | X    | X               | X         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                      |      | X               | X         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

## **PSRLD**

# **Packed Shift Right Logical Doublewords**

Right-shifts each of the packed 32-bit values in the first source operand by the number of bits specified in the second source operand and writes each shifted value in the corresponding doubleword of the destination (first source). The first source/destination and second source operands are:

- an MMX register and another MMX register or 64-bit memory location, or
- an MMX register and an immediate byte value.

The high-order bits that are emptied by the shift operation are cleared to 0. If the shift value is greater than 31, the destination is cleared to 0.
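
A short Python sketch of the logical (zero-filling) shift follows. The function name `psrld` and the integer representation of the MMX operand are assumptions for illustration.

```python
def psrld(dst: int, count: int) -> int:
    """Model PSRLD: logical right shift of each packed doubleword."""
    if count > 31:
        return 0  # counts above 31 clear the whole destination
    low = (dst & 0xFFFFFFFF) >> count
    # Isolate the upper doubleword before shifting so that none of its
    # bits move down into the lower doubleword.
    high = ((dst >> 32) & 0xFFFFFFFF) >> count
    return (high << 32) | low


shifted = psrld(0x80000000000000F0, 4)
```

Here `shifted` is `0x080000000000000F`. The vacated high-order bits are zero even though the upper doubleword had its top bit set, which is the difference from the arithmetic shift performed by PSRAD.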

| Mnemonic               | Opcode      | Description                                                                                                              |
|------------------------|-------------|--------------------------------------------------------------------------------------------------------------------------|
| PSRLD mmx1, mmx2/mem64 | 0F D2 /r    | Right-shifts packed doublewords in an MMX register by the amount specified in an MMX register or 64-bit memory location. |
| PSRLD mmx, imm8        | 0F 72 /2 ib | Right-shifts packed doublewords in an MMX register by the amount specified in an immediate byte value.                   |



## **Related Instructions**

PSLLD, PSLLDQ, PSLLQ, PSLLW, PSRAD, PSRAW, PSRLDQ, PSRLQ, PSRLW

## **rFLAGS Affected**

None

## **Exceptions**

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | X    | X               | X         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                           | X    | X               | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | X    | X               | X         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                   | X    | X               | X         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                           |      |                 | X         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                           |      | X               | X         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point exception pending, #MF | X    | X               | X         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                      |      | X               | X         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

## **PSRLQ**

# **Packed Shift Right Logical Quadwords**

Right-shifts each 64-bit value in the first source operand by the number of bits specified in the second source operand and writes each shifted value in the corresponding quadword of the destination (first source). The first source/destination and second source operands are:

- an MMX register and another MMX register or 64-bit memory location, or
- an MMX register and an immediate byte value.

The high-order bits that are emptied by the shift operation are cleared to 0. If the shift value is greater than 63, the destination is cleared to 0.
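
One common use of a logical quadword shift is to isolate a bit field. The Python sketch below models the shift and pairs it with a left shift to clear the high-order bits of a value. The names `psrlq`, `psllq`, and `clear_high_bits` are assumptions for illustration, not architectural names.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def psrlq(dst: int, count: int) -> int:
    """Model PSRLQ: logical right shift of the 64-bit quadword."""
    return 0 if count > 63 else (dst & MASK64) >> count


def psllq(dst: int, count: int) -> int:
    """Model PSLLQ: left shift of the 64-bit quadword."""
    return 0 if count > 63 else (dst << count) & MASK64


def clear_high_bits(value: int, n: int) -> int:
    """Zero the top n bits by shifting left, then logically right, by n."""
    return psrlq(psllq(value, n), n)


upper_half = psrlq(0x0123456789ABCDEF, 32)
low_48 = clear_high_bits(0xFFFFFFFFFFFFFFFF, 16)
```

Here `upper_half` is `0x0000000001234567` and `low_48` is `0x0000FFFFFFFFFFFF`.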

| Mnemonic               | Opcode      | Description                                                                                                        |
|------------------------|-------------|--------------------------------------------------------------------------------------------------------------------|
| PSRLQ mmx1, mmx2/mem64 | 0F D3 /r    | Right-shifts the quadword in an MMX register by the amount specified in an MMX register or 64-bit memory location. |
| PSRLQ mmx, imm8        | 0F 73 /2 ib | Right-shifts the quadword in an MMX register by the amount specified in an immediate byte value.                   |



## **Related Instructions**

PSLLD, PSLLDQ, PSLLQ, PSLLW, PSRAD, PSRAW, PSRLD, PSRLDQ, PSRLW

## **rFLAGS Affected**

None

## **Exceptions**

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | X    | X               | X         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                           | X    | X               | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | X    | X               | X         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                   | X    | X               | X         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                           |      |                 | X         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                           |      | X               | X         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point exception pending, #MF | X    | X               | X         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                      |      | X               | X         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

## **PSRLW**

# **Packed Shift Right Logical Words**

Right-shifts each of the packed 16-bit values in the first source operand by the number of bits specified in the second operand and writes each shifted value in the corresponding word of the destination (first source). The first source/destination and second source operands are:

- an MMX register and another MMX register or 64-bit memory location, or
- an MMX register and an immediate byte value.

The high-order bits that are emptied by the shift operation are cleared to 0. If the shift value is greater than 15, the destination is cleared to 0.

| Mnemonic               | Opcode        | Description |
|------------------------|---------------|-------------|
| PSRLW mmx1, mmx2/mem64 | 0F D1 /r      | Right-shifts packed words in an MMX register by the amount specified in an MMX register or 64-bit memory location. |
| PSRLW mmx, imm8        | 0F 71 /2 *ib* | Right-shifts packed words in an MMX register by the amount specified in an immediate byte value. |
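To show that each word is shifted independently, with no bits crossing from one word into its neighbor, the operation can be sketched as follows. The name `psrlw` and the packing of four words into one Python `int` are assumptions for this example only.

```python
def psrlw(dest: int, count: int) -> int:
    """Model PSRLW on a 64-bit value holding four 16-bit words (hypothetical helper)."""
    # A shift value greater than 15 clears the destination.
    if count > 15:
        return 0
    result = 0
    for shift in range(0, 64, 16):
        word = (dest >> shift) & 0xFFFF
        # Each word is shifted on its own; vacated high bits become 0.
        result |= (word >> count) << shift
    return result
```

With a count of 1, the low bit of each word is discarded rather than carried into the word below it: `psrlw(0x8000_4000_0002_0001, 1)` gives `0x4000_2000_0001_0000`.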






#### **Related Instructions**

PSLLD, PSLLDQ, PSLLQ, PSLLW, PSRAD, PSRAW, PSRLD, PSRLDQ, PSRLQ

#### **rFLAGS Affected**

None

| Exception                                 | Real | Virtual 8086 | Protected | Cause of Exception |
|-------------------------------------------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD                       | X    | X            | X         | The emulate bit (EM) of CR0 was set to 1. |
|                                           | X    | X            | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | X    | X            | X         | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS                                | X    | X            | X         | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP                   | X    | X            | X         | A memory address exceeded a data segment limit or was non-canonical. |
|                                           |      |              | X         | A null data segment was used to reference memory. |
| Page fault, #PF                           |      | X            | X         | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X    | X            | X         | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC                      |      | X            | X         | An unaligned memory reference was performed while alignment checking was enabled. |

### **PSUBB**

# **Packed Subtract Bytes**

Subtracts each packed 8-bit integer value in the second source operand from the corresponding packed 8-bit integer in the first source operand and writes the integer result of each subtraction in the corresponding byte of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic               | Opcode   | Description |
|------------------------|----------|-------------|
| PSUBB mmx1, mmx2/mem64 | 0F F8 /r | Subtracts packed byte integer values in an MMX register or 64-bit memory location from packed byte integer values in another MMX register and writes the result in the destination MMX register. |



This instruction operates on both signed and unsigned integers. If the result overflows, the carry is ignored (neither the overflow nor carry bit in rFLAGS is set), and only the low-order 8 bits of each result are written in the destination.
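The wraparound behavior can be illustrated with a small model in which each byte is subtracted on its own and truncated to 8 bits. This is a sketch under stated assumptions: `psubb` is a hypothetical helper, and a Python `int` stands in for the 64-bit operand.

```python
def psubb(src1: int, src2: int) -> int:
    """Model PSUBB on 64-bit values holding eight packed bytes (hypothetical helper)."""
    result = 0
    for shift in range(0, 64, 8):
        a = (src1 >> shift) & 0xFF
        b = (src2 >> shift) & 0xFF
        # Keep only the low-order 8 bits; the borrow is discarded,
        # so it never propagates into the neighboring byte.
        result |= ((a - b) & 0xFF) << shift
    return result
```

Compare `psubb(0x0100, 0x0001)`, which gives `0x01FF`, with the ordinary scalar subtraction `0x0100 - 0x0001`, which gives `0x00FF`: in the packed form the low byte wraps from 00h to FFh without borrowing from the byte above it.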

#### **Related Instructions**

PSUBD, PSUBQ, PSUBSB, PSUBSW, PSUBUSB, PSUBUSW, PSUBW

#### **rFLAGS Affected**

None


| Exception                                 | Real | Virtual 8086 | Protected | Cause of Exception |
|-------------------------------------------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD                       | X    | X            | X         | The emulate bit (EM) of CR0 was set to 1. |
|                                           | X    | X            | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | X    | X            | X         | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS                                | X    | X            | X         | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP                   | X    | X            | X         | A memory address exceeded a data segment limit or was non-canonical. |
|                                           |      |              | X         | A null data segment was used to reference memory. |
| Page fault, #PF                           |      | X            | X         | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X    | X            | X         | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC                      |      | X            | X         | An unaligned memory reference was performed while alignment checking was enabled. |

### **PSUBD**

### **Packed Subtract Doublewords**

Subtracts each packed 32-bit integer value in the second source operand from the corresponding packed 32-bit integer in the first source operand and writes the integer result of each subtraction in the corresponding doubleword of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic               | Opcode   | Description |
|------------------------|----------|-------------|
| PSUBD mmx1, mmx2/mem64 | 0F FA /r | Subtracts packed 32-bit integer values in an MMX register or 64-bit memory location from packed 32-bit integer values in another MMX register and writes the result in the destination MMX register. |



This instruction operates on both signed and unsigned integers. If the result overflows, the carry is ignored (neither the overflow nor carry bit in rFLAGS is set), and only the low-order 32 bits of each result are written in the destination.
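As an illustration of why the same instruction serves both signed and unsigned data, the sketch below subtracts the two doublewords modulo 2^32. The helper name `psubd` and the integer representation of the operands are assumptions for this example.

```python
MASK32 = 0xFFFF_FFFF


def psubd(src1: int, src2: int) -> int:
    """Model PSUBD on 64-bit values holding two packed doublewords (hypothetical helper)."""
    lo = ((src1 & MASK32) - (src2 & MASK32)) & MASK32
    hi = (((src1 >> 32) & MASK32) - ((src2 >> 32) & MASK32)) & MASK32
    # Only the low-order 32 bits of each difference are kept.
    return (hi << 32) | lo
```

Computing 0 - 1 in the low doubleword produces FFFF_FFFFh, which reads as -1 when interpreted as signed and as 4,294,967,295 when interpreted as unsigned. The bit pattern is identical either way, so no separate signed and unsigned forms are needed.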

#### **Related Instructions**

PSUBB, PSUBQ, PSUBSB, PSUBSW, PSUBUSB, PSUBUSW, PSUBW

#### **rFLAGS Affected**

None


| Exception                                 | Real | Virtual 8086 | Protected | Cause of Exception |
|-------------------------------------------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD                       | X    | X            | X         | The emulate bit (EM) of CR0 was set to 1. |
|                                           | X    | X            | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | X    | X            | X         | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS                                | X    | X            | X         | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP                   | X    | X            | X         | A memory address exceeded a data segment limit or was non-canonical. |
|                                           |      |              | X         | A null data segment was used to reference memory. |
| Page fault, #PF                           |      | X            | X         | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X    | X            | X         | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC                      |      | X            | X         | An unaligned memory reference was performed while alignment checking was enabled. |

## **PSUBQ**

## **Packed Subtract Quadword**

Subtracts each packed 64-bit integer value in the second source operand from the corresponding packed 64-bit integer in the first source operand and writes the integer result of each subtraction in the corresponding quadword of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic               | Opcode   | Description |
|------------------------|----------|-------------|
| PSUBQ mmx1, mmx2/mem64 | 0F FB /r | Subtracts packed 64-bit integer values in an MMX register or 64-bit memory location from packed 64-bit integer values in another MMX register and writes the result in the destination MMX register. |



This instruction operates on both signed and unsigned integers. If the result overflows, the carry is ignored (neither the overflow nor carry bit in rFLAGS is set), and only the low-order 64 bits of each result are written in the destination.
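Because a 64-bit MMX operand holds exactly one quadword, the operation reduces to a single subtraction modulo 2^64. A minimal sketch, assuming a hypothetical helper `psubq` and Python integers as operands:

```python
MASK64 = (1 << 64) - 1


def psubq(src1: int, src2: int) -> int:
    """Model PSUBQ on one 64-bit quadword (hypothetical helper)."""
    # The borrow out of bit 63 is discarded; rFLAGS is not involved.
    return ((src1 & MASK64) - (src2 & MASK64)) & MASK64
```

For instance, `psubq(0, 1)` wraps around to `0xFFFF_FFFF_FFFF_FFFF`.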

#### **Related Instructions**

PSUBB, PSUBD, PSUBSB, PSUBSW, PSUBUSB, PSUBUSW, PSUBW

#### **rFLAGS Affected**

None


| Exception                                 | Real | Virtual 8086 | Protected | Cause of Exception |
|-------------------------------------------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD                       | X    | X            | X         | The emulate bit (EM) of CR0 was set to 1. |
|                                           | X    | X            | X         | The SSE2 instructions are not supported, as indicated by bit 26 in CPUID standard function 1. |
| Device not available, #NM                 | X    | X            | X         | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS                                | X    | X            | X         | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP                   | X    | X            | X         | A memory address exceeded a data segment limit or was non-canonical. |
|                                           |      |              | X         | A null data segment was used to reference memory. |
| Page fault, #PF                           |      | X            | X         | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X    | X            | X         | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC                      |      | X            | X         | An unaligned memory reference was performed while alignment checking was enabled. |

### **PSUBSB**

# **Packed Subtract Signed With Saturation Bytes**

Subtracts each packed 8-bit signed integer value in the second source operand from the corresponding packed 8-bit signed integer in the first source operand and writes the signed integer result of each subtraction in the corresponding byte of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic                | Opcode   | Description |
|-------------------------|----------|-------------|
| PSUBSB mmx1, mmx2/mem64 | 0F E8 /r | Subtracts packed byte signed integer values in an MMX register or 64-bit memory location from packed byte integer values in another MMX register and writes the result in the destination MMX register. |



For each packed value in the destination, if the value is larger than the largest signed 8-bit integer, it is saturated to 7Fh, and if the value is smaller than the smallest signed 8-bit integer, it is saturated to 80h.
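The saturation rule above can be sketched by computing each difference at full precision and then clamping it to the signed 8-bit range. The names `psubsb` and `signed8` are hypothetical helpers introduced for this illustration.

```python
def psubsb(src1: int, src2: int) -> int:
    """Model PSUBSB on 64-bit values holding eight signed bytes (hypothetical helper)."""

    def signed8(byte: int) -> int:
        # Reinterpret an 8-bit pattern as a two's-complement value.
        return byte - 0x100 if byte & 0x80 else byte

    result = 0
    for shift in range(0, 64, 8):
        a = signed8((src1 >> shift) & 0xFF)
        b = signed8((src2 >> shift) & 0xFF)
        # Clamp to the signed 8-bit range: 80h (-128) through 7Fh (+127).
        diff = max(-128, min(127, a - b))
        result |= (diff & 0xFF) << shift
    return result
```

In the low byte, 7Fh minus FFh is 127 - (-1) = 128, which exceeds the largest signed byte and saturates to 7Fh. Likewise 80h minus 01h is -129, which saturates to 80h.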

#### **Related Instructions**

PSUBB, PSUBD, PSUBQ, PSUBSW, PSUBUSB, PSUBUSW, PSUBW

#### **rFLAGS Affected**

None


| Exception                                 | Real | Virtual 8086 | Protected | Cause of Exception |
|-------------------------------------------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD                       | X    | X            | X         | The emulate bit (EM) of CR0 was set to 1. |
|                                           | X    | X            | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | X    | X            | X         | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS                                | X    | X            | X         | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP                   | X    | X            | X         | A memory address exceeded a data segment limit or was non-canonical. |
|                                           |      |              | X         | A null data segment was used to reference memory. |
| Page fault, #PF                           |      | X            | X         | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X    | X            | X         | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC                      |      | X            | X         | An unaligned memory reference was performed while alignment checking was enabled. |

### **PSUBSW**

# **Packed Subtract Signed With Saturation Words**

Subtracts each packed 16-bit signed integer value in the second source operand from the corresponding packed 16-bit signed integer in the first source operand and writes the signed integer result of each subtraction in the corresponding word of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic                | Opcode   | Description |
|-------------------------|----------|-------------|
| PSUBSW mmx1, mmx2/mem64 | 0F E9 /r | Subtracts packed 16-bit signed integer values in an MMX register or 64-bit memory location from packed 16-bit integer values in another MMX register and writes the result in the destination MMX register. |



For each packed value in the destination, if the value is larger than the largest signed 16-bit integer, it is saturated to 7FFFh, and if the value is smaller than the smallest signed 16-bit integer, it is saturated to 8000h.
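One way to picture the word-wise saturation is to unpack the operand into four signed 16-bit values, subtract, clamp, and repack. The sketch below uses Python's standard `struct` module for the unpacking; the function name `psubsw` and the little-endian lane ordering chosen here are assumptions for illustration.

```python
import struct


def psubsw(src1: int, src2: int) -> int:
    """Model PSUBSW on 64-bit values holding four signed words (hypothetical helper)."""
    # "<Q" packs a 64-bit unsigned integer; "<4h" views it as four signed words.
    a = struct.unpack("<4h", struct.pack("<Q", src1))
    b = struct.unpack("<4h", struct.pack("<Q", src2))
    # Clamp each difference to 8000h (-32768) through 7FFFh (+32767).
    diffs = [max(-0x8000, min(0x7FFF, x - y)) for x, y in zip(a, b)]
    return struct.unpack("<Q", struct.pack("<4h", *diffs))[0]
```

Here 7FFFh minus FFFFh (32767 - (-1)) saturates to 7FFFh, and 8000h minus 0001h saturates to 8000h.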

#### **Related Instructions**

PSUBB, PSUBD, PSUBQ, PSUBSB, PSUBUSB, PSUBUSW, PSUBW

#### **rFLAGS Affected**

None


| Exception                                 | Real | Virtual 8086 | Protected | Cause of Exception |
|-------------------------------------------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD                       | X    | X            | X         | The emulate bit (EM) of CR0 was set to 1. |
|                                           | X    | X            | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | X    | X            | X         | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS                                | X    | X            | X         | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP                   | X    | X            | X         | A memory address exceeded a data segment limit or was non-canonical. |
|                                           |      |              | X         | A null data segment was used to reference memory. |
| Page fault, #PF                           |      | X            | X         | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X    | X            | X         | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC                      |      | X            | X         | An unaligned memory reference was performed while alignment checking was enabled. |

### **PSUBUSB**

# **Packed Subtract Unsigned and Saturate Bytes**

Subtracts each packed 8-bit unsigned integer value in the second source operand from the corresponding packed 8-bit unsigned integer in the first source operand and writes the unsigned integer result of each subtraction in the corresponding byte of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic                 | Opcode   | Description |
|--------------------------|----------|-------------|
| PSUBUSB mmx1, mmx2/mem64 | 0F D8 /r | Subtracts packed byte unsigned integer values in an MMX register or 64-bit memory location from packed byte integer values in another MMX register and writes the result in the destination MMX register. |



For each packed value in the destination, if the value is larger than the largest unsigned 8-bit integer, it is saturated to FFh, and if the value is smaller than the smallest unsigned 8-bit integer, it is saturated to 00h.
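The unsigned saturation rule can be modeled as below. The helper name `psubusb` is hypothetical. Note that the difference of two unsigned bytes can never exceed FFh, so in this model only the clamp to 00h ever takes effect; the upper clamp is kept solely to mirror the wording above.

```python
def psubusb(src1: int, src2: int) -> int:
    """Model PSUBUSB on 64-bit values holding eight unsigned bytes (hypothetical helper)."""
    result = 0
    for shift in range(0, 64, 8):
        a = (src1 >> shift) & 0xFF
        b = (src2 >> shift) & 0xFF
        # Clamp to the unsigned 8-bit range 00h through FFh.
        diff = max(0x00, min(0xFF, a - b))
        result |= diff << shift
    return result
```

A commonly cited use of this behavior is a per-byte absolute difference: since one of the two saturating differences is always zero in each byte, `psubusb(a, b) | psubusb(b, a)` gives |a - b| byte by byte. With `a = 0x1030` and `b = 0x2010`, the expression evaluates to `0x1020`.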

#### **Related Instructions**

PSUBB, PSUBD, PSUBQ, PSUBSB, PSUBSW, PSUBUSW, PSUBW

#### **rFLAGS Affected**

None


| Exception                                 | Real | Virtual 8086 | Protected | Cause of Exception |
|-------------------------------------------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD                       | X    | X            | X         | The emulate bit (EM) of CR0 was set to 1. |
|                                           | X    | X            | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | X    | X            | X         | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS                                | X    | X            | X         | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP                   | X    | X            | X         | A memory address exceeded a data segment limit or was non-canonical. |
|                                           |      |              | X         | A null data segment was used to reference memory. |
| Page fault, #PF                           |      | X            | X         | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X    | X            | X         | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC                      |      | X            | X         | An unaligned memory reference was performed while alignment checking was enabled. |

## **PSUBUSW**

# **Packed Subtract Unsigned and Saturate Words**

Subtracts each packed 16-bit unsigned integer value in the second source operand from the corresponding packed 16-bit unsigned integer in the first source operand and writes the unsigned integer result of each subtraction in the corresponding word of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic                 | Opcode   | Description |
|--------------------------|----------|-------------|
| PSUBUSW mmx1, mmx2/mem64 | 0F D9 /r | Subtracts packed 16-bit unsigned integer values in an MMX register or 64-bit memory location from packed 16-bit integer values in another MMX register and writes the result in the destination MMX register. |



For each packed value in the destination, if the value is larger than the largest unsigned 16-bit integer, it is saturated to FFFFh, and if the value is smaller than the smallest unsigned 16-bit integer, it is saturated to 0000h.
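For the word-sized form, a compact sketch is to view each operand as four unsigned 16-bit values and floor each difference at zero. The function name `psubusw` is an assumption, as is the use of Python's `struct` module and little-endian byte order to split the operand into words.

```python
import struct


def psubusw(src1: int, src2: int) -> int:
    """Model PSUBUSW on 64-bit values holding four unsigned words (hypothetical helper)."""
    a = struct.unpack("<4H", src1.to_bytes(8, "little"))
    b = struct.unpack("<4H", src2.to_bytes(8, "little"))
    # A negative difference saturates to 0000h; the result cannot exceed FFFFh.
    diffs = [max(0, x - y) for x, y in zip(a, b)]
    return int.from_bytes(struct.pack("<4H", *diffs), "little")
```

Subtracting 0002h from 0001h in a word gives 0000h rather than wrapping to FFFFh, which is the difference from the non-saturating PSUBW.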

#### **Related Instructions**

PSUBB, PSUBD, PSUBQ, PSUBSB, PSUBSW, PSUBUSB, PSUBW

#### **rFLAGS Affected**

None


| Exception                                 | Real | Virtual 8086 | Protected | Cause of Exception |
|-------------------------------------------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD                       | X    | X            | X         | The emulate bit (EM) of CR0 was set to 1. |
|                                           | X    | X            | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | X    | X            | X         | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS                                | X    | X            | X         | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP                   | X    | X            | X         | A memory address exceeded a data segment limit or was non-canonical. |
|                                           |      |              | X         | A null data segment was used to reference memory. |
| Page fault, #PF                           |      | X            | X         | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X    | X            | X         | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC                      |      | X            | X         | An unaligned memory reference was performed while alignment checking was enabled. |

### **PSUBW**

### **Packed Subtract Words**

Subtracts each packed 16-bit integer value in the second source operand from the corresponding packed 16-bit integer in the first source operand and writes the integer result of each subtraction in the corresponding word of the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic               | Opcode   | Description |
|------------------------|----------|-------------|
| PSUBW mmx1, mmx2/mem64 | 0F F9 /r | Subtracts packed 16-bit integer values in an MMX register or 64-bit memory location from packed 16-bit integer values in another MMX register and writes the result in the destination MMX register. |



This instruction operates on both signed and unsigned integers. If the result overflows, the carry is ignored (neither the overflow nor carry bit in rFLAGS is set), and only the low-order 16 bits of each result are written in the destination.
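A minimal model of the word-wise wraparound follows. The helper name `psubw` and the integer representation of the 64-bit operands are assumptions made for this sketch.

```python
def psubw(src1: int, src2: int) -> int:
    """Model PSUBW on 64-bit values holding four packed words (hypothetical helper)."""
    result = 0
    for shift in (0, 16, 32, 48):
        a = (src1 >> shift) & 0xFFFF
        b = (src2 >> shift) & 0xFFFF
        # Truncate to 16 bits; the borrow does not reach the next word.
        result |= ((a - b) & 0xFFFF) << shift
    return result
```

For example, `psubw(0x0001_0000, 0x0000_0001)` gives `0x0001_FFFF`: the low word wraps from 0000h to FFFFh while the word above it is left at 0001h.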

#### **Related Instructions**

PSUBB, PSUBD, PSUBQ, PSUBSB, PSUBSW, PSUBUSB, PSUBUSW

#### **rFLAGS Affected**

None


| Exception                                 | Real | Virtual 8086 | Protected | Cause of Exception |
|-------------------------------------------|------|--------------|-----------|--------------------|
| Invalid opcode, #UD                       | X    | X            | X         | The emulate bit (EM) of CR0 was set to 1. |
|                                           | X    | X            | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | X    | X            | X         | The task-switch bit (TS) of CR0 was set to 1. |
| Stack, #SS                                | X    | X            | X         | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP                   | X    | X            | X         | A memory address exceeded a data segment limit or was non-canonical. |
|                                           |      |              | X         | A null data segment was used to reference memory. |
| Page fault, #PF                           |      | X            | X         | A page fault resulted from the execution of the instruction. |
| x87 floating-point exception pending, #MF | X    | X            | X         | An unmasked x87 floating-point exception was pending. |
| Alignment check, #AC                      |      | X            | X         | An unaligned memory reference was performed while alignment checking was enabled. |

## **PSWAPD**

# **Packed Swap Doubleword**

Swaps (reverses) the two packed 32-bit values in the source operand and writes each swapped value in the corresponding doubleword of the destination. The source operand is an MMX register or 64-bit memory location. The destination is another MMX register.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PSWAPD mmx1, mmx2/mem64 | 0F 0F /r BB | Swaps packed 32-bit values in an MMX register or 64-bit memory location and writes each value in the destination MMX register. |
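The swap can be modeled in a few lines of Python. This is an illustrative sketch, assuming a 64-bit integer stands in for the MMX register or memory operand, with doubleword 0 in the low-order bits; the name `pswapd` is chosen for the example only.

```python
MASK32 = 0xFFFF_FFFF


def pswapd(src: int) -> int:
    """Model PSWAPD: exchange the two packed 32-bit values of the source."""
    low = src & MASK32
    high = (src >> 32) & MASK32
    # The low doubleword moves to the high position and vice versa.
    return (low << 32) | high


swapped = pswapd(0x1111_1111_2222_2222)
# swapped == 0x2222_2222_1111_1111
```

Applying the model twice returns the original value, as expected for a reversal of two elements.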



### **Related Instructions**

None

## **rFLAGS Affected**

None


| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                             |
|-------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                      |
|                                           | X    | Х               | Х         | The AMD Extensions to 3DNow!™ are not supported, as indicated by bit 30 in CPUID extended function 8000_0001h. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                  |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                        |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                           |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                              |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                   |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                          |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                              |

### PUNPCKHBW

# **Unpack and Interleave High Bytes**

Unpacks the high-order bytes from the first and second source operands and packs them into interleaved-byte words in the destination (first source). The low-order bytes of the source operands are ignored. The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PUNPCKHBW mmx1, mmx2/mem64 | 0F 68 /r | Unpacks the four high-order bytes in an MMX register and another MMX register or 64-bit memory location and packs them into interleaved bytes in the destination MMX register. |



If the second source operand is all 0s, the destination contains the bytes from the first source operand zero-extended to 16 bits. This operation is useful for expanding unsigned 8-bit values to unsigned 16-bit operands for subsequent processing that requires higher precision.
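The interleaving and the zero-extension case can be sketched in Python as follows. The sketch assumes each operand is held in a 64-bit integer with byte 0 in the low-order bits; `punpckhbw` is an illustrative name, not a library function.

```python
def punpckhbw(dst: int, src: int) -> int:
    """Model PUNPCKHBW: interleave the four high-order bytes of dst and src."""
    result = 0
    for i in range(4):
        a = (dst >> (8 * (4 + i))) & 0xFF  # byte 4+i of the first source
        b = (src >> (8 * (4 + i))) & 0xFF  # byte 4+i of the second source
        result |= a << (16 * i)            # low byte of result word i
        result |= b << (16 * i + 8)        # high byte of result word i
    return result


# Two distinct operands: bytes 04h..07h interleaved with bytes 14h..17h.
mixed = punpckhbw(0x0706_0504_0302_0100, 0x1716_1514_1312_1110)
# mixed == 0x1707_1606_1505_1404

# Second source all 0s: the high bytes are zero-extended to 16-bit words.
widened = punpckhbw(0xFFEE_DDCC_0000_0000, 0)
# widened == 0x00FF_00EE_00DD_00CC
```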

#### **Related Instructions**

PUNPCKHDQ, PUNPCKHQDQ, PUNPCKHWD, PUNPCKLBW, PUNPCKLDQ, PUNPCKLQDQ, PUNPCKLWD

#### rFLAGS Affected

None

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                        |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                                 |
|                                           | X    | X               | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                             |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                                   |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                                      |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                                                         |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                              |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                                     |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                                         |

# PUNPCKHDQ Unpack and Interleave High Doublewords

Unpacks the high-order doublewords from the first and second source operands and packs them into interleaved-doubleword quadwords in the destination (first source). The low-order doublewords of the source operands are ignored. The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PUNPCKHDQ mmx1, mmx2/mem64 | 0F 6A /r | Unpacks the high-order doubleword in an MMX register and another MMX register or 64-bit memory location and packs them into interleaved doublewords in the destination MMX register. |



If the second source operand is all 0s, the destination contains the doubleword(s) from the first source operand zero-extended to 64 bits. This operation is useful for expanding unsigned 32-bit values to unsigned 64-bit operands for subsequent processing that requires higher precision.
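A minimal Python model of this operation is shown here for illustration. It assumes each operand is a 64-bit integer with doubleword 0 in the low-order bits; the name `punpckhdq` is hypothetical.

```python
MASK32 = 0xFFFF_FFFF


def punpckhdq(dst: int, src: int) -> int:
    """Model PUNPCKHDQ: combine the high doublewords of dst and src."""
    a = (dst >> 32) & MASK32  # high doubleword of the first source
    b = (src >> 32) & MASK32  # high doubleword of the second source
    # The first source supplies the low half, the second source the high half.
    return (b << 32) | a


mixed = punpckhdq(0xAAAA_AAAA_1111_1111, 0xBBBB_BBBB_2222_2222)
# mixed == 0xBBBB_BBBB_AAAA_AAAA

# Second source all 0s: the high doubleword is zero-extended to 64 bits.
widened = punpckhdq(0xDEAD_BEEF_0000_0000, 0)
# widened == 0x0000_0000_DEAD_BEEF
```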

#### **Related Instructions**

PUNPCKHBW, PUNPCKHQDQ, PUNPCKHWD, PUNPCKLBW, PUNPCKLDQ, PUNPCKLQDQ, PUNPCKLWD

#### rFLAGS Affected

None

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                           | Х    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

# PUNPCKHWD Unpack and Interleave High Words

Unpacks the high-order words from the first and second source operands and packs them into interleaved-word doublewords in the destination (first source). The low-order words of the source operands are ignored. The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PUNPCKHWD mmx1, mmx2/mem64 | 0F 69 /r | Unpacks two high-order words in an MMX register and another MMX register or 64-bit memory location and packs them into interleaved words in the destination MMX register. |



If the second source operand is all 0s, the destination contains the words from the first source operand zero-extended to 32 bits. This operation is useful for expanding unsigned 16-bit values to unsigned 32-bit operands for subsequent processing that requires higher precision.
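The word interleave can be modeled in Python as below. This is a sketch under the assumption that operands are 64-bit integers with word 0 in the low-order bits; `punpckhwd` is a name invented for the example.

```python
def punpckhwd(dst: int, src: int) -> int:
    """Model PUNPCKHWD: interleave the two high-order words of dst and src."""
    result = 0
    for i in range(2):
        a = (dst >> (16 * (2 + i))) & 0xFFFF  # word 2+i of the first source
        b = (src >> (16 * (2 + i))) & 0xFFFF  # word 2+i of the second source
        result |= a << (32 * i)               # low word of result doubleword i
        result |= b << (32 * i + 16)          # high word of result doubleword i
    return result


mixed = punpckhwd(0x0003_0002_0001_0000, 0x0013_0012_0011_0010)
# mixed == 0x0013_0003_0012_0002

# Second source all 0s: words 8000h and FFFFh become 32-bit unsigned values.
widened = punpckhwd(0xFFFF_8000_0000_0000, 0)
# widened == 0x0000_FFFF_0000_8000
```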

#### **Related Instructions**

PUNPCKHBW, PUNPCKHDQ, PUNPCKHQDQ, PUNPCKLBW, PUNPCKLDQ, PUNPCKLQDQ, PUNPCKLWD

#### rFLAGS Affected

None

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                        |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                                 |
|                                           | X    | X               | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                             |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                                   |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                                      |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                                                         |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                              |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                                     |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                                         |

### **PUNPCKLBW**

# **Unpack and Interleave Low Bytes**

Unpacks the low-order bytes from the first and second source operands and packs them into interleaved-byte words in the destination (first source). The high-order bytes of the source operands are ignored. The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PUNPCKLBW mmx1, mmx2/mem64 | 0F 60 /r | Unpacks the four low-order bytes in an MMX register and another MMX register or 64-bit memory location and packs them into interleaved bytes in the destination MMX register. |



If the second source operand is all 0s, the destination contains the bytes from the first source operand zero-extended to 16 bits. This operation is useful for expanding unsigned 8-bit values to unsigned 16-bit operands for subsequent processing that requires higher precision.
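As an illustration, the following Python sketch models the low-byte interleave. It assumes operands are 64-bit integers with byte 0 in the low-order bits, and the function name `punpcklbw` is chosen for this example only.

```python
def punpcklbw(dst: int, src: int) -> int:
    """Model PUNPCKLBW: interleave the four low-order bytes of dst and src."""
    result = 0
    for i in range(4):
        a = (dst >> (8 * i)) & 0xFF  # byte i of the first source
        b = (src >> (8 * i)) & 0xFF  # byte i of the second source
        result |= a << (16 * i)      # low byte of result word i
        result |= b << (16 * i + 8)  # high byte of result word i
    return result


mixed = punpcklbw(0x0706_0504_0302_0100, 0x1716_1514_1312_1110)
# mixed == 0x1303_1202_1101_1000

# Second source all 0s: four unsigned bytes widen to four unsigned words.
widened = punpcklbw(0x0000_0000_FFEE_DDCC, 0)
# widened == 0x00FF_00EE_00DD_00CC
```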

#### **Related Instructions**

PUNPCKHBW, PUNPCKHDQ, PUNPCKHQDQ, PUNPCKHWD, PUNPCKLDQ, PUNPCKLQDQ, PUNPCKLWD

#### rFLAGS Affected

None

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Х    | Х               | X         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                           | X    | X               | X         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                           |      |                 | X         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

### **PUNPCKLDQ**

### **Unpack and Interleave Low Doublewords**

Unpacks the low-order doublewords from the first and second source operands and packs them into interleaved-doubleword quadwords in the destination (first source). The high-order doublewords of the source operands are ignored. The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PUNPCKLDQ mmx1, mmx2/mem64 | 0F 62 /r | Unpacks the low-order doubleword in an MMX register and another MMX register or 64-bit memory location and packs them into interleaved doublewords in the destination MMX register. |



If the second source operand is all 0s, the destination contains the doubleword(s) from the first source operand zero-extended to 64 bits. This operation is useful for expanding unsigned 32-bit values to unsigned 64-bit operands for subsequent processing that requires higher precision.
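The operation can be modeled with a short Python sketch. It is illustrative only and assumes 64-bit integers stand in for the operands, with doubleword 0 in the low-order bits; `punpckldq` is a hypothetical name.

```python
MASK32 = 0xFFFF_FFFF


def punpckldq(dst: int, src: int) -> int:
    """Model PUNPCKLDQ: combine the low doublewords of dst and src."""
    a = dst & MASK32  # low doubleword of the first source
    b = src & MASK32  # low doubleword of the second source
    # The first source supplies the low half, the second source the high half.
    return (b << 32) | a


mixed = punpckldq(0xAAAA_AAAA_1111_1111, 0xBBBB_BBBB_2222_2222)
# mixed == 0x2222_2222_1111_1111

# Second source all 0s: the low doubleword is zero-extended to 64 bits.
widened = punpckldq(0x1234_5678_DEAD_BEEF, 0)
# widened == 0x0000_0000_DEAD_BEEF
```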

#### **Related Instructions**

PUNPCKHBW, PUNPCKHDQ, PUNPCKHQDQ, PUNPCKHWD, PUNPCKLBW, PUNPCKLQDQ, PUNPCKLWD

#### rFLAGS Affected

None

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                           | Х    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

#### **PUNPCKLWD**

### **Unpack and Interleave Low Words**

Unpacks the low-order words from the first and second source operands and packs them into interleaved-word doublewords in the destination (first source). The high-order words of the source operands are ignored. The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PUNPCKLWD mmx1, mmx2/mem64 | 0F 61 /r | Unpacks the two low-order words in an MMX register and another MMX register or 64-bit memory location and packs them into interleaved words in the destination MMX register. |



If the second source operand is all 0s, the destination contains the words from the first source operand zero-extended to 32 bits. This operation is useful for expanding unsigned 16-bit values to unsigned 32-bit operands for subsequent processing that requires higher precision.
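A small Python model makes the word ordering concrete. The sketch assumes operands are 64-bit integers with word 0 in the low-order bits; the name `punpcklwd` is introduced for this example and is not an existing API.

```python
def punpcklwd(dst: int, src: int) -> int:
    """Model PUNPCKLWD: interleave the two low-order words of dst and src."""
    result = 0
    for i in range(2):
        a = (dst >> (16 * i)) & 0xFFFF  # word i of the first source
        b = (src >> (16 * i)) & 0xFFFF  # word i of the second source
        result |= a << (32 * i)         # low word of result doubleword i
        result |= b << (32 * i + 16)    # high word of result doubleword i
    return result


mixed = punpcklwd(0x0003_0002_0001_0000, 0x0013_0012_0011_0010)
# mixed == 0x0011_0001_0010_0000

# Second source all 0s: words 8000h and FFFFh become 32-bit unsigned values.
widened = punpcklwd(0x0000_0000_FFFF_8000, 0)
# widened == 0x0000_FFFF_0000_8000
```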

#### **Related Instructions**

PUNPCKHBW, PUNPCKHDQ, PUNPCKHQDQ, PUNPCKHWD, PUNPCKLBW, PUNPCKLDQ, PUNPCKLQDQ

#### rFLAGS Affected

None

| Exception                                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|-------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                       | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                           | Х    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available, #NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                           |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                      |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |

#### **PXOR**

# **Packed Logical Bitwise Exclusive OR**

Performs a bitwise exclusive OR of the values in the first and second source operands and writes the result in the destination (first source). The first source/destination operand is an MMX register and the second source operand is another MMX register or 64-bit memory location.

| Mnemonic | Opcode | Description |
|----------|--------|-------------|
| PXOR mmx1, mmx2/mem64 | 0F EF /r | Performs bitwise logical XOR of values in an MMX register and in another MMX register or 64-bit memory location and writes the result in the destination MMX register. |
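For illustration, the operation can be modeled in Python on 64-bit integers. The function name `pxor` is an assumption of this sketch, and the masking step only guards against inputs wider than 64 bits.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def pxor(dst: int, src: int) -> int:
    """Model PXOR: bitwise exclusive OR of two 64-bit operands."""
    return (dst ^ src) & MASK64


toggled = pxor(0xFF00_FF00_FF00_FF00, 0x0F0F_0F0F_0F0F_0F0F)
# toggled == 0xF00F_F00F_F00F_F00F

# XOR of a value with itself is always zero.
cleared = pxor(0x0123_4567_89AB_CDEF, 0x0123_4567_89AB_CDEF)
# cleared == 0
```

The second case follows directly from the definition of exclusive OR: using the same register for both operands leaves the destination cleared to all 0s.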



#### **Related Instructions**

PAND, PANDN, POR

#### rFLAGS Affected

None


| Exception                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                            |
|----------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                          | Χ    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                     |
|                                              | Х    | Х               | Х         | The MMX™ instructions are not supported, as indicated by bit 23 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available,<br>#NM                 | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                 |
| Stack, #SS                                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                                                       |
| General protection, #GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                                                          |
|                                              |      |                 | Х         | A null data segment was used to reference memory.                                                                             |
| Page fault, #PF                              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                  |
| x87 floating-point<br>exception pending, #MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                                                         |
| Alignment check, #AC                         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.                                             |


# 2 x87 Floating-Point Instruction Reference

This chapter describes the function, mnemonic syntax, opcodes, condition codes, affected flags, and possible exceptions generated by the x87 floating-point instructions. The x87 floating-point instructions are used in legacy floating-point applications. Most of these instructions load, store, or operate on data located in the x87 ST(0)–ST(7) stack registers (the FPR0–FPR7 physical registers). The remaining instructions within this category are used to manage the x87 floating-point environment.

A given hardware implementation of the AMD64 architecture supports the x87 floating-point instructions if the following CPUID feature bits are set to 1:

- On-Chip Floating-Point Unit, indicated by bit 0 of CPUID standard function 1 and extended function 8000\_0001h.
- CMOVcc (conditional moves), indicated by bit 15 of CPUID standard function 1 and extended function 8000\_0001h. A 1 in this bit indicates support for x87 floating-point conditional moves (FCMOVcc) whenever the On-Chip Floating-Point Unit bit (bit 0) is also 1.

The x87 instructions can be used in legacy mode or long mode. Their use in long mode is available if the following CPUID function bit is set to 1:

- Long Mode, indicated by bit 29 of CPUID extended function 8000\_0001h.
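The feature checks above amount to testing individual bits of the values returned by CPUID. The following Python sketch decodes those bits from two integers supplied by the caller. The function, parameter, and key names are hypothetical, and the sketch assumes the caller has already obtained the feature words for standard function 1 and extended function 8000_0001h by some other means, since Python cannot execute CPUID directly. Following the wording of the text, bits 0 and 15 are required in both words.

```python
FPU_BIT = 0         # On-Chip Floating-Point Unit
CMOV_BIT = 15       # CMOVcc (conditional moves)
LONG_MODE_BIT = 29  # Long Mode (extended function only)


def bit_is_set(value: int, bit: int) -> bool:
    """Return True if the given bit of value is 1."""
    return (value >> bit) & 1 == 1


def x87_support(std_features: int, ext_features: int) -> dict:
    """Decode x87-related feature bits from two caller-supplied feature words."""
    fpu = bit_is_set(std_features, FPU_BIT) and bit_is_set(ext_features, FPU_BIT)
    cmov = bit_is_set(std_features, CMOV_BIT) and bit_is_set(ext_features, CMOV_BIT)
    long_mode = bit_is_set(ext_features, LONG_MODE_BIT)
    return {
        "x87": fpu,
        # FCMOVcc requires both the CMOVcc bit and the FPU bit.
        "fcmov": fpu and cmov,
        # x87 use in long mode additionally requires the Long Mode bit.
        "x87_long_mode": fpu and long_mode,
    }


# Example feature words constructed by hand for illustration.
std_word = (1 << FPU_BIT) | (1 << CMOV_BIT)
ext_word = (1 << FPU_BIT) | (1 << CMOV_BIT) | (1 << LONG_MODE_BIT)
support = x87_support(std_word, ext_word)
```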

Compilation of x87 floating-point programs for execution in 64-bit mode offers two primary advantages: access to the 64-bit virtual address space and access to the RIP-relative addressing mode.

For further information about the x87 floating-point instructions and register resources, see:

- "x87 Floating-Point Programming" in volume 1.
- "Summary of Registers and Data Types" in volume 3.
- "Notation" in volume 3.
- "Instruction Prefixes" in volume 3.

### F2XM1

# Floating-Point Compute 2<sup>x</sup>-1

Raises 2 to the power specified by the value in ST(0), subtracts 1, and stores the result in ST(0). The source value must be in the range -1.0 to +1.0. The result is undefined for source values outside this range.

This instruction, when used in conjunction with the FYL2X instruction, can be applied to calculate $z = x^y$ by taking advantage of the log property $x^y = 2^{y \cdot \log_2 x}$.
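The identity above can be checked numerically. The Python sketch below models F2XM1 and FYL2X as ordinary functions; `f2xm1`, `fyl2x`, and `power` are illustrative names, and the sketch does not model x87 precision, rounding, or exception behavior. Because F2XM1 is defined only for sources in the range -1.0 to +1.0, the sketch splits y * log2(x) into an integer part and a fraction and scales the result by a power of two. That split is an assumption of this example (similar in spirit to using FRNDINT and FSCALE) and is not taken from the text above.

```python
import math


def f2xm1(x: float) -> float:
    """Model F2XM1: return 2**x - 1 for x in the range -1.0 to +1.0."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("result is undefined outside the range -1.0 to +1.0")
    return 2.0 ** x - 1.0


def fyl2x(x: float, y: float) -> float:
    """Model FYL2X: return y * log2(x) for x > 0."""
    return y * math.log2(x)


def power(x: float, y: float) -> float:
    """Compute x**y for x > 0 using the identity x**y = 2**(y * log2(x))."""
    t = fyl2x(x, y)                 # t = y * log2(x)
    n = round(t)                    # integer part of the exponent
    f = t - n                       # fraction, |f| <= 0.5, valid for f2xm1
    two_to_f = f2xm1(f) + 1.0       # 2**f
    return math.ldexp(two_to_f, n)  # 2**f * 2**n


cube_root_of_27 = power(27.0, 1.0 / 3.0)  # approximately 3.0
```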

| Mnemonic | Opcode | Description                            |
|----------|--------|----------------------------------------|
| F2XM1    | D9 F0  | Replace ST(0) with $(2^{ST(0)} - 1)$ . |

#### **Related Instructions**

FYL2X, FYL2XP1

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                       |
|--------------------|-------|-------------------------------------------------------------------|
| C0                 | U     |                                                                   |
| C1                 | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                    | 0     | Result was rounded down, if a precision exception was detected.   |
|                    | 1     | Result was rounded up, if a precision exception was detected.     |
| C2                 | U     |                                                                   |
| C3                 | U     |                                                                   |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

#### **Exceptions**

| Exception | Real | Virtual-8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Device not available, #NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| **x87 Floating-Point Exception Generated, #MF** | | | | |
| Invalid-operation exception (IE) | X | X | X | A source operand was an SNaN value or an unsupported format. |
| Invalid-operation exception (IE) with stack fault (SF) | X | X | X | An x87 stack underflow occurred. |
| Denormalized-operand exception (DE) | X | X | X | A source operand was a denormal value. |
| Underflow exception (UE) | X | X | X | A rounded result was too small to fit into the format of the destination operand. |
| Precision exception (PE) | X | X | X | A result could not be represented exactly in the destination format. |

### **FABS**

# **Floating-Point Absolute Value**

Converts the value in ST(0) to its absolute value by clearing the sign bit. The resulting value depends upon the type of number used as the source value:

| Source Value (ST(0)) | Result (ST(0)) |
|----------------------|----------------|
| -∞                   | +∞             |
| -FiniteReal          | +FiniteReal    |
| -0                   | +0             |
| +0                   | +0             |
| +FiniteReal          | +FiniteReal    |
| +∞                   | +∞             |
| NaN                  | NaN            |

This operation applies even if the value in ST(0) is negative zero or negative infinity.

| Mnemonic | Opcode | Description                            |
|----------|--------|----------------------------------------|
| FABS     | D9 E1  | Replace ST(0) with its absolute value. |
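
The sign-bit behavior in the table above can be sketched in Python by masking the top bit of the encoded value. This is a minimal model under stated assumptions: it uses the 64-bit IEEE 754 double encoding instead of the 80-bit x87 format, and the name `fabs_model` is hypothetical.

```python
import math
import struct

SIGN_BIT = 1 << 63  # sign bit of a 64-bit IEEE 754 double

def fabs_model(value):
    """Return value with its sign bit cleared, as FABS does for ST(0)."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    (result,) = struct.unpack("<d", struct.pack("<Q", bits & ~SIGN_BIT))
    return result
```

Because only the sign bit changes, negative zero becomes positive zero and negative infinity becomes positive infinity, with no special cases in the code.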

#### **Related Instructions**

FPREM, FRNDINT, FXTRACT, FCHS

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0                 | U     |             |
| C1                 | 0     |             |
| C2                 | U     |             |
| C3                 | U     |             |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. | | |

#### **Exceptions**

| Exception | Real | Virtual-8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Device not available, #NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| **x87 Floating-Point Exception Generated, #MF** | | | | |
| Invalid-operation exception (IE) with stack fault (SF) | X | X | X | An x87 stack underflow occurred. |

### **FADD FADDP FIADD**

# **Floating-Point Add**

Adds two values and stores the result in a floating-point register. If two operands are specified, the values are in ST(0) and another floating-point register and the instruction stores the result in the first register specified. If one operand is specified, the instruction adds the 32-bit or 64-bit value in the specified memory location to the value in ST(0).

The FADDP instruction adds the value in ST(0) to the value in another floating-point register and pops the register stack. If two operands are specified, the first operand is the other register. If no operand is specified, then the other register is ST(1).

The FIADD instruction reads a 16-bit or 32-bit signed integer value from the specified memory location, converts it to double-extended-real format, and adds it to the value in ST(0).

| Mnemonic          | Opcode  | Description                                                        |
|-------------------|---------|--------------------------------------------------------------------|
| FADD ST(0),ST(i)  | D8 C0+i | Replace ST(0) with ST(0) + ST(i).                                  |
| FADD ST(i),ST(0)  | DC C0+i | Replace ST(i) with ST(0) + ST(i).                                  |
| FADD mem32real    | D8 /0   | Replace ST(0) with ST(0) + mem32real.                              |
| FADD mem64real    | DC /0   | Replace ST(0) with ST(0) + mem64real.                              |
| FADDP             | DE C1   | Replace ST(1) with ST(0) + ST(1), and pop the x87 register stack.  |
| FADDP ST(i),ST(0) | DE C0+i | Replace ST(i) with ST(0) + ST(i), and pop the x87 register stack.  |
| FIADD mem16int    | DE /0   | Replace ST(0) with ST(0) + mem16int.                               |
| FIADD mem32int    | DA /0   | Replace ST(0) with ST(0) + mem32int.                               |
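
The operand forms above differ mainly in which register receives the sum and whether the stack is popped. The following Python sketch is a toy model of that behavior only. The class `X87Stack` and its method names are hypothetical; stack underflow, exceptions, and rounding to the extended format are not modeled.

```python
class X87Stack:
    """Toy x87 register stack; index 0 is ST(0), the top of stack."""

    def __init__(self, values):
        self.st = list(values)

    def fadd(self, dst, src):
        """FADD ST(dst),ST(src): one of the two operands must be ST(0)."""
        if 0 not in (dst, src):
            raise ValueError("one operand must be ST(0)")
        self.st[dst] = self.st[dst] + self.st[src]

    def faddp(self, i=1):
        """FADDP ST(i),ST(0): add ST(0) into ST(i), then pop the stack."""
        self.st[i] = self.st[i] + self.st[0]
        self.st.pop(0)   # the old ST(i) is now ST(i-1)

    def fiadd(self, integer):
        """FIADD: convert a signed integer operand and add it to ST(0)."""
        self.st[0] = self.st[0] + float(integer)
```

For example, starting from ST(0) = 1.5, ST(1) = 2.0, ST(2) = 4.0, the call `fadd(0, 2)` leaves 5.5 in ST(0), and a following `faddp()` stores the sum of the top two registers in the new ST(0).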

#### **Related Instructions**

None

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                       |
|--------------------|-------|-------------------------------------------------------------------|
| C0                 | U     |                                                                   |
| C1                 | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                    | 0     | Result was rounded down, if a precision exception was detected.   |
|                    | 1     | Result was rounded up, if a precision exception was detected.     |
| C2                 | U     |                                                                   |
| C3                 | U     |                                                                   |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. | | |

#### **Exceptions**

| Exception | Real | Virtual-8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Device not available, #NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| **x87 Floating-Point Exception Generated, #MF** | | | | |
| Invalid-operation exception (IE) | X | X | X | A source operand was an SNaN value or an unsupported format. |
| | X | X | X | +infinity was added to -infinity. |
| Invalid-operation exception (IE) with stack fault (SF) | X | X | X | An x87 stack underflow occurred. |
| Denormalized-operand exception (DE) | X | X | X | A source operand was a denormal value. |
| Overflow exception (OE) | X | X | X | A rounded result was too large to fit into the format of the destination operand. |
| Underflow exception (UE) | X | X | X | A rounded result was too small to fit into the format of the destination operand. |
| Precision exception (PE) | X | X | X | A result could not be represented exactly in the destination format. |

### **FBLD**

# **Floating-Point Load Binary-Coded Decimal**

Converts a 10-byte packed BCD value in memory into double-extended-precision format, and pushes the result onto the x87 stack. In the process, it preserves the sign of the source value.

The packed BCD digits should be in the range 0 to 9. Attempting to load invalid digits (Ah through Fh) produces undefined results.

| Mnemonic      | Opcode | Description                                                                                   |
|---------------|--------|-----------------------------------------------------------------------------------------------|
| FBLD mem80dec | DF /4  | Convert a packed BCD value to floating-point and push the result onto the x87 register stack. |
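
The conversion can be sketched in Python as follows. This is a minimal model that assumes the standard x87 packed BCD layout: bytes 0 through 8 hold 18 digits, two per byte with the least-significant byte first, and bit 7 of byte 9 holds the sign. The name `fbld_model` is hypothetical. The model raises an error for invalid digits, whereas the hardware result is undefined, and a Python float cannot represent every 18-digit integer exactly as the double-extended format can.

```python
def fbld_model(mem80dec):
    """Decode a 10-byte packed BCD operand into a signed float."""
    if len(mem80dec) != 10:
        raise ValueError("packed BCD operand must be 10 bytes")
    magnitude = 0
    # Walk from the most-significant digit pair (byte 8) down to byte 0.
    for byte in reversed(mem80dec[:9]):
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:
            raise ValueError("invalid BCD digit (Ah-Fh): result is undefined")
        magnitude = magnitude * 100 + high * 10 + low
    negative = bool(mem80dec[9] & 0x80)   # bit 7 of byte 9 is the sign
    return -float(magnitude) if negative else float(magnitude)
```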

#### **Related Instructions**

FBSTP

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                      |
|--------------------|-------|------------------------------------------------------------------|
| C0                 | U     |                                                                  |
| C1                 | 1     | x87 stack overflow, if an x87 register stack fault was detected. |
|                    | 0     | If no other flags are set.                                       |
| C2                 | U     |                                                                  |
| C3                 | U     |                                                                  |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. | | |

#### **Exceptions**

| Exception | Real | Virtual-8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Device not available, #NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| **x87 Floating-Point Exception Generated, #MF** | | | | |
| Invalid-operation exception (IE) with stack fault (SF) | X | X | X | An x87 stack overflow occurred. |

### **FBSTP**

# **Floating-Point Store Binary-Coded Decimal and Pop**

Converts the value in ST(0) to an 18-digit packed BCD integer, stores the result in the specified memory location, and pops the register stack. It rounds a non-integral value to an integer according to the rounding mode specified by the RC field of the x87 control word.

The operand specifies the memory address of the first byte of the resulting 10-byte value.

| Mnemonic       | Opcode | Description                                                                                                  |
|----------------|--------|--------------------------------------------------------------------------------------------------------------|
| FBSTP mem80dec | DF /6  | Convert the floating-point value in ST(0) to BCD, store the result in mem80dec, and pop the x87 register stack. |
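
The store direction can be sketched in Python as below. This is a minimal model under several assumptions: the rounding mode is round to nearest (RC = 00b), which Python's `round` matches by rounding halves to even; the memory layout is the standard packed BCD format with the sign in bit 7 of byte 9; and invalid operands raise an error instead of producing the masked-exception response. The name `fbstp_model` is hypothetical, and the stack pop is not modeled.

```python
import math

def fbstp_model(value):
    """Encode value as 10-byte packed BCD, rounding to nearest even."""
    if math.isnan(value) or math.isinf(value):
        raise ValueError("invalid operation: NaN or infinity")
    rounded = round(value)          # round-half-to-even, like RC = 00b
    if abs(rounded) >= 10**18:
        raise ValueError("invalid operation: value needs more than 18 digits")
    magnitude = abs(rounded)
    out = bytearray(10)
    for i in range(9):              # two digits per byte, low byte first
        magnitude, pair = divmod(magnitude, 100)
        out[i] = ((pair // 10) << 4) | (pair % 10)
    if math.copysign(1.0, value) < 0:
        out[9] = 0x80               # sign lives in bit 7 of byte 9
    return bytes(out)
```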

#### **Related Instructions**

FBLD

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                       |
|--------------------|-------|-------------------------------------------------------------------|
| C0                 | U     |                                                                   |
| C1                 | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                    | 0     | Result was rounded down, if a precision exception was detected.   |
|                    | 1     | Result was rounded up, if a precision exception was detected.     |
| C2                 | U     |                                                                   |
| C3                 | U     |                                                                   |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. | | |

#### **Exceptions**

| Exception | Real | Virtual-8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Device not available, #NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | The destination operand was in a nonwritable segment. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| **x87 Floating-Point Exception Generated, #MF** | | | | |
| Invalid-operation exception (IE) | X | X | X | A source operand was an SNaN value, a QNaN value, ±infinity, or an unsupported format. |
| | X | X | X | A source operand was too large to fit in the destination format. |
| Invalid-operation exception (IE) with stack fault (SF) | X | X | X | An x87 stack underflow occurred. |
| Precision exception (PE) | X | X | X | A result could not be represented exactly in the destination format. |

### **FCHS**

# **Floating-Point Change Sign**

Complements the sign bit of ST(0), changing the value from negative to positive or vice versa. This operation applies to positive and negative floating-point values, as well as -0 and +0, NaNs, and $+\infty$ and $-\infty$.

| Mnemonic | Opcode | Description                    |
|----------|--------|--------------------------------|
| FCHS     | D9 E0  | Reverse the sign bit of ST(0). |
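
Complementing the sign bit can be sketched in Python with an exclusive-OR on the encoded value. This is a minimal model that assumes the 64-bit IEEE 754 double encoding in place of the 80-bit x87 format; the name `fchs_model` is hypothetical.

```python
import struct

SIGN_BIT = 1 << 63  # sign bit of a 64-bit IEEE 754 double

def fchs_model(value):
    """Return value with its sign bit inverted, as FCHS does for ST(0)."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    (result,) = struct.unpack("<d", struct.pack("<Q", bits ^ SIGN_BIT))
    return result
```

Applying the function twice returns the original value, since the operation touches no bit other than the sign.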

#### **Related Instructions**

FABS, FPREM, FRNDINT, FXTRACT

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0                 | U     |             |
| C1                 | 0     |             |
| C2                 | U     |             |
| C3                 | U     |             |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. | | |

#### **Exceptions**

| Exception | Real | Virtual-8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Device not available, #NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| **x87 Floating-Point Exception Generated, #MF** | | | | |
| Invalid-operation exception (IE) with stack fault (SF) | X | X | X | An x87 stack underflow occurred. |

### **FCLEX (FNCLEX)**

# **Floating-Point Clear Flags**

Clears the following flags in the x87 status word:

- Floating-point exception flags (PE, UE, OE, ZE, DE, and IE)
- Stack fault flag (SF)
- Exception summary status flag (ES)
- Busy flag (B)

It leaves the four condition-code bits undefined. It does not check for possible floating-point exceptions before clearing the flags.

Assemblers usually provide an FCLEX macro that expands into the instruction sequence:

    WAIT      ; Opcode 9B
    FNCLEX    ; Opcode DB E2

The WAIT (9Bh) instruction checks for pending x87 exceptions and calls an exception handler, if necessary. The FNCLEX instruction then clears all the relevant x87 exception flags.

| Mnemonic | Opcode   | Description                                                                                                            |
|----------|----------|------------------------------------------------------------------------------------------------------------------------|
| FCLEX    | 9B DB E2 | Perform a WAIT (9B) to check for pending floating-point exceptions, and then clear the floating-point exception flags. |
| FNCLEX   | DB E2    | Clear the floating-point flags without checking for pending unmasked floating-point exceptions.                        |
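
The effect on the status word can be sketched in Python as a single mask operation. The bit positions below follow the standard x87 status-word layout (exception flags in bits 0 through 5, SF in bit 6, ES in bit 7, B in bit 15). The name `fnclex_model` is hypothetical, and the model leaves the condition-code bits unchanged even though the instruction leaves them undefined.

```python
# x87 status-word bit positions
IE, DE, ZE, OE, UE, PE = (1 << n for n in range(6))   # exception flags
SF = 1 << 6     # stack fault
ES = 1 << 7     # exception summary status
B = 1 << 15     # busy

CLEARED_BY_FNCLEX = IE | DE | ZE | OE | UE | PE | SF | ES | B

def fnclex_model(status_word):
    """Return the 16-bit status word with the FNCLEX-cleared flags at 0."""
    return status_word & ~CLEARED_BY_FNCLEX & 0xFFFF
```

The top-of-stack pointer (bits 11 through 13) is outside the mask, so it survives the operation.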

#### **Related Instructions**

WAIT

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0                 | U     |             |
| C1                 | U     |             |
| C2                 | U     |             |
| C3                 | U     |             |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. | | |

#### **Exceptions**

| Exception | Real | Virtual-8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Device not available, #NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |

### **FCMOV***cc*

# **Floating-Point Conditional Move**

Tests the flags in the rFLAGS register and, depending upon the values encountered, moves the value in another stack register to ST(0).

This set of instructions includes the mnemonics FCMOVB, FCMOVBE, FCMOVE, FCMOVNB, FCMOVNBE, FCMOVNE, FCMOVNU, and FCMOVU.

Use the CPUID instruction to determine whether these instructions are supported on a particular x86-64 implementation. They are supported if both the CMOV and FPU feature bits are set to 1.

| Mnemonic             | Opcode  | Description                                                                      |
|----------------------|---------|----------------------------------------------------------------------------------|
| FCMOVB ST(0),ST(i)   | DA C0+i | Move the contents of ST(i) into ST(0) if below (CF = 1).                         |
| FCMOVBE ST(0),ST(i)  | DA D0+i | Move the contents of ST(i) into ST(0) if below or equal (CF = 1 or ZF = 1).      |
| FCMOVE ST(0),ST(i)   | DA C8+i | Move the contents of ST(i) into ST(0) if equal (ZF = 1).                         |
| FCMOVNB ST(0),ST(i)  | DB C0+i | Move the contents of ST(i) into ST(0) if not below (CF = 0).                     |
| FCMOVNBE ST(0),ST(i) | DB D0+i | Move the contents of ST(i) into ST(0) if not below or equal (CF = 0 and ZF = 0). |
| FCMOVNE ST(0),ST(i)  | DB C8+i | Move the contents of ST(i) into ST(0) if not equal (ZF = 0).                     |
| FCMOVNU ST(0),ST(i)  | DB D8+i | Move the contents of ST(i) into ST(0) if not unordered (PF = 0).                 |
| FCMOVU ST(0),ST(i)   | DA D8+i | Move the contents of ST(i) into ST(0) if unordered (PF = 1).                     |
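
The flag conditions in the table above can be written as a small lookup, sketched here in Python. The names `FCMOV_CONDITIONS`, `fcmov_model`, and `fcmov_supported` are hypothetical. The support check assumes the feature bits are read from EDX of CPUID function 1, with CMOV in bit 15 and FPU in bit 0; exceptions and stack underflow are not modeled.

```python
# Each entry maps a mnemonic to a predicate over (CF, ZF, PF).
FCMOV_CONDITIONS = {
    "FCMOVB":   lambda cf, zf, pf: cf == 1,
    "FCMOVBE":  lambda cf, zf, pf: cf == 1 or zf == 1,
    "FCMOVE":   lambda cf, zf, pf: zf == 1,
    "FCMOVNB":  lambda cf, zf, pf: cf == 0,
    "FCMOVNBE": lambda cf, zf, pf: cf == 0 and zf == 0,
    "FCMOVNE":  lambda cf, zf, pf: zf == 0,
    "FCMOVNU":  lambda cf, zf, pf: pf == 0,
    "FCMOVU":   lambda cf, zf, pf: pf == 1,
}

def fcmov_model(mnemonic, st, i, cf, zf, pf):
    """Return the register list after FCMOVcc ST(0),ST(i)."""
    st = list(st)                       # leave the caller's list untouched
    if FCMOV_CONDITIONS[mnemonic](cf, zf, pf):
        st[0] = st[i]
    return st

def fcmov_supported(cpuid_fn1_edx):
    """True if the CMOV (bit 15) and FPU (bit 0) feature bits are both set."""
    return bool(cpuid_fn1_edx & (1 << 15)) and bool(cpuid_fn1_edx & 1)
```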

#### **Related Instructions**

None

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                       |
|--------------------|-------|-------------------------------------------------------------------|
| C0                 | U     |                                                                   |
| C1                 | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
| C2                 | U     |                                                                   |
| C3                 | U     |                                                                   |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. | | |

# Exceptions

| Exception                                                    | Real | Virtual<br>8086 | Protected    | Cause of Exception                                                                                                                        |
|--------------------------------------------------------------|------|-----------------|--------------|-------------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD                                          | Х    | Х               | Х            | The Conditional Move instructions are not supported, as indicated by bit 15 in CPUID standard function 1 or extended function 8000_0001h. |
| Device not available,<br>#NM                                 | Х    | Х               | Х            | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1.                                              |
| x87 floating-point<br>exception pending,<br>#MF              | Х    | Х               | Х            | An unmasked x87 floating-point exception was pending.                                                                                     |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |              |                                                                                                                                           |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | Х    | Х               | Х            | An x87 stack underflow occurred.                                                                                                          |


# FCOM FCOMP FCOMPP

### **Floating-Point Compare**

Compares the specified value to the value in ST(0) and sets the C0, C2, and C3 condition code flags in the x87 status word as shown in the x87 Condition Code table below. The specified value can be in a floating-point register or a memory location.

The no-operand version compares the value in ST(1) with the value in ST(0).

The comparison operation ignores the sign of zero (-0.0 = +0.0).

After performing the comparison operation, the FCOMP instruction pops the x87 register stack and the FCOMPP instruction pops the x87 register stack twice.

If either or both of the compared values is a NaN or is in an unsupported format, the FCOMx instruction sets the invalid-operation exception (IE) bit in the x87 status word to 1. Then, if the exception is masked (IM bit set to 1 in the x87 control word), the instruction sets the condition flags to "unordered." If the exception is unmasked (IM bit cleared to 0), the instruction does not set the condition code flags.

The FUCOMx instructions perform the same operations as the FCOMx instructions, but do not set the IE bit for QNaNs.

| Mnemonic       | Opcode          | Description                                                                                                                                           |
|----------------|-----------------|-------------------------------------------------------------------------------------------------------------------------------------------------------|
| FCOM           | D8 D1           | Compare the contents of ST(0) to the contents of ST(1) and set condition flags to reflect the results of the comparison.                              |
| FCOM ST(i)     | D8 D0+<i>i</i>  | Compare the contents of ST(0) to the contents of ST(i) and set condition flags to reflect the results of the comparison.                              |
| FCOM mem32real | D8 /2           | Compare the contents of ST(0) to the contents of <i>mem32real</i> and set condition flags to reflect the results of the comparison.                   |
| FCOM mem64real | DC /2           | Compare the contents of ST(0) to the contents of <i>mem64real</i> and set condition flags to reflect the results of the comparison.                   |
| FCOMP          | D8 D9           | Compare the contents of ST(0) to the contents of ST(1), set condition flags to reflect the results of the comparison, and pop the x87 register stack. |
| FCOMP ST(i)    | D8 D8+<i>i</i>  | Compare the contents of ST(0) to the contents of ST(i), set condition flags to reflect the results of the comparison, and pop the x87 register stack. |
| FCOMP mem32real | D8 /3 | Compare the contents of ST(0) to the contents of <i>mem32real</i>, set condition flags to reflect the results of the comparison, and pop the x87 register stack. |
| FCOMP mem64real | DC /3 | Compare the contents of ST(0) to the contents of <i>mem64real</i> , set condition flags to reflect the results of the comparison, and pop the x87 register stack. |
| FCOMPP          | DE D9 | Compare the contents of ST(0) to the contents of ST(1), set condition flags to reflect the results of the comparison, and pop the x87 register stack twice.       |

### **Related Instructions**

FCOMI, FCOMIP, FICOM, FICOMP, FTST, FUCOMI, FUCOMIP, FXAM

### rFLAGS Affected

None

### **x87 Condition Code**

| C3         | C2        | C1 | C0 | Compare Result          |
|------------|-----------|----|----|-------------------------|
| 0          | 0         | 0  | 0  | ST(0) > source          |
| 0          | 0         | 0  | 1  | ST(0) < source          |
| 1          | 0         | 0  | 0  | ST(0) = source          |
| 1          | 1         | 0  | 1  | Operands were unordered |
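
The table above can be turned into a small decoder for a status word that software has already stored, for example with FSTSW. The following is a minimal sketch, not part of the instruction definition. It assumes the standard x87 status-word layout (C0 in bit 8, C2 in bit 10, C3 in bit 14); the names `fcom_result` and `fcom_decode` are hypothetical.

```c
#include <stdint.h>

/* Condition-code bit positions in the x87 status word (assumed layout). */
#define X87_SW_C0 (1u << 8)
#define X87_SW_C2 (1u << 10)
#define X87_SW_C3 (1u << 14)

typedef enum {
    FCOM_GREATER,    /* C3=0 C2=0 C0=0: ST(0) > source */
    FCOM_LESS,       /* C3=0 C2=0 C0=1: ST(0) < source */
    FCOM_EQUAL,      /* C3=1 C2=0 C0=0: ST(0) = source */
    FCOM_UNORDERED,  /* C3=1 C2=1 C0=1: operands were unordered */
    FCOM_UNDEFINED   /* any combination not listed in the table */
} fcom_result;

/* Classify the result of an FCOMx instruction from a saved status word.
   C1 and the TOP field are masked off and play no part in the decision. */
fcom_result fcom_decode(uint16_t status_word)
{
    switch (status_word & (X87_SW_C3 | X87_SW_C2 | X87_SW_C0)) {
    case 0:
        return FCOM_GREATER;
    case X87_SW_C0:
        return FCOM_LESS;
    case X87_SW_C3:
        return FCOM_EQUAL;
    case X87_SW_C3 | X87_SW_C2 | X87_SW_C0:
        return FCOM_UNORDERED;
    default:
        return FCOM_UNDEFINED;
    }
}
```

A caller would typically test for `FCOM_UNORDERED` first, because a NaN operand makes the ordering results meaningless.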

### **Exceptions**

| Exception                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                      |
| General protection,<br>#GP   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                         |
|                              |      |                 | Х         | A null data segment was used to reference memory.                                            |
| Page fault, #PF              |      | Χ               | Х         | A page fault resulted from the execution of the instruction.                                 |
| Alignment check, #AC         |      | X               | X         | An unaligned memory reference was performed while alignment checking was enabled.            |
| x87 floating-point<br>exception pending,<br>#MF              | Х    | Х               | X            | An unmasked x87 floating-point exception was pending.                       |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |              |                                                                             |
| Invalid-operation exception (IE)                             | Х    | Х               | Х            | A source operand was an SNaN value, a QNaN value, or an unsupported format. |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | Х    | Х               | Х            | An x87 stack underflow occurred.                                            |
| Denormalized-operand exception (DE)                          | Х    | Х               | Х            | A source operand was a denormal value.                                      |

# FCOMI FCOMIP

# **Floating-Point Compare and Set Flags**

Compares the value in ST(0) with the value in another floating-point register and sets the zero flag (ZF), parity flag (PF), and carry flag (CF) in the rFLAGS register based on the result as shown in the table in the x87 Condition Code section.

The comparison operation ignores the sign of zero (-0.0 = +0.0).

After performing the comparison operation, FCOMIP pops the x87 register stack.

If either or both of the compared values is a NaN or is in an unsupported format, the FCOMIx instruction sets the invalid-operation exception (IE) bit in the x87 status word to 1. Then, if the exception is masked (IM bit set to 1 in the x87 control word), the instruction sets the flags to "unordered." If the exception is unmasked (IM bit cleared to 0), the instruction does not set the flags.

The FUCOMIx instructions perform the same operations as the FCOMIx instructions, but do not set the IE bit for QNaNs.

| Mnemonic           | Opcode          | Description                                                                                                                                          |
|--------------------|-----------------|------------------------------------------------------------------------------------------------------------------------------------------------------|
| FCOMI ST(0),ST(i)  | DB F0+ <i>i</i> | Compare the contents of ST(0) with the contents of ST(i) and set status flags to reflect the results of the comparison.                              |
| FCOMIP ST(0),ST(i) | DF F0+ <i>i</i> | Compare the contents of ST(0) with the contents of ST(i), set status flags to reflect the results of the comparison, and pop the x87 register stack. |

#### **Related Instructions**

FCOM, FCOMPP, FICOM, FICOMP, FTST, FUCOMI, FUCOMIP, FXAM


### rFLAGS Affected

| ZF | PF | CF | Compare Result          |
|----|----|----|-------------------------|
| 0  | 0  | 0  | ST(0) > source          |
| 0  | 0  | 1  | ST(0) < source          |
| 1  | 0  | 0  | ST(0) = source          |
| 1  | 1  | 1  | Operands were unordered |
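
The flag settings in the table above can be modeled in a few lines of C. This is a sketch under stated assumptions: it works on `double` rather than the 80-bit register format, it models only the case where the invalid-operation exception is masked, and the names `fcomi_flags` and `fcomi_model` are hypothetical.

```c
#include <math.h>
#include <stdbool.h>

typedef struct {
    bool zf;  /* zero flag   */
    bool pf;  /* parity flag */
    bool cf;  /* carry flag  */
} fcomi_flags;

/* Model of the rFLAGS result of FCOMI with IE masked. */
fcomi_flags fcomi_model(double st0, double src)
{
    fcomi_flags f = { false, false, false };  /* ST(0) > source */

    if (isnan(st0) || isnan(src)) {
        f.zf = f.pf = f.cf = true;            /* unordered */
    } else if (st0 < src) {
        f.cf = true;                          /* ST(0) < source */
    } else if (st0 == src) {
        f.zf = true;                          /* equal; -0.0 == +0.0 */
    }
    return f;
}
```

Because unordered sets all three flags, a branch on CF or ZF alone cannot distinguish a NaN operand from an ordered result; checking PF first avoids that ambiguity.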

### **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0                 |       |             |
| C1                 | 0     |             |
| C2                 |       |             |
| C3                 |       |             |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

### **Exceptions**

| Exception                                                    | Real | Virtual<br>8086 | Protected    | Cause of Exception                                                                           |
|--------------------------------------------------------------|------|-----------------|--------------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | Х    | Х               | Х            | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point<br>exception pending,<br>#MF              | Х    | Х               | Х            | An unmasked x87 floating-point exception was pending.                                        |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |              |                                                                                              |
| Invalid-operation exception (IE)                             | Х    | X               | Х            | A source operand was an SNaN value, a QNaN value, or an unsupported format.                  |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | Х    | Х               | Х            | An x87 stack underflow occurred.                                                             |
| Denormalized-operand exception (DE)                          | X    | X               | X            | A source operand was a denormal value.                                                       |

### **FCOS**

# **Floating-Point Cosine**

Computes the cosine of the radian value in ST(0) and stores the result in ST(0).

If the radian value lies outside the valid range of  $-2^{63}$  to  $+2^{63}$  radians, the instruction sets the C2 flag in the x87 status word to 1 to indicate the value is out of range and does not change the value in ST(0). It does not set any of the exception flags. The program should check the C2 flag and, if necessary, can reduce an invalid source value to the proper range by using the FPREM instruction with the value  $2\pi$  in ST(1) and the out-of-range radian value in ST(0).

| Mnemonic | Opcode | Description                             |
|----------|--------|-----------------------------------------|
| FCOS     | D9 FF  | Replace ST(0) with the cosine of ST(0). |

#### **Related Instructions**

FPTAN, FPATAN, FSIN, FSINCOS

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                       |
|--------------------|-------|-------------------------------------------------------------------|
| C0                 | U     |                                                                   |
| C1                 | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                    | 0     | Result was rounded down, if a precision exception was detected.   |
|                    | 1     | Result was rounded up, if a precision exception was detected.     |
| C2                 | 0     | Source operand was in range.                                      |
|                    | 1     | Source operand was out of range.                                  |
| C3                 | U     |                                                                   |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

#### **Exceptions**

| Exception                                                    | Real | Virtual<br>8086 | Protected    | Cause of Exception                                                                          |
|--------------------------------------------------------------|------|-----------------|--------------|---------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | X    | X               | X            | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point<br>exception pending,<br>#MF              | Х    | Х               | Х            | An unmasked x87 floating-point exception was pending.                                       |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |              |                                                                                             |
| Invalid-operation exception (IE)                             | Х    | Х               | Х            | A source operand was an SNaN value or an unsupported format.                                |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | Х    | Х               | Х            | An x87 stack underflow occurred.                                                            |
| Denormalized-operand exception (DE)                          | X    | X               | X            | A source operand was a denormal value.                                                      |
| Underflow exception (UE)                                     | Х    | Х               | Х            | A rounded result was too small to fit into the format of the destination operand.           |
| Precision exception (PE)                                     | Х    | Х               | Х            | A result could not be represented exactly in the destination format.                        |

### **FDECSTP**

# **Floating-Point Decrement Stack-Top Pointer**

Decrements the top-of-stack pointer (TOP) field of the x87 status word. If the TOP field contains 0, it is set to 7. In other words, this instruction rotates the stack by one position.

| Mnemonic | Opcode | Description                                     |
|----------|--------|-------------------------------------------------|
| FDECSTP  | D9 F6  | Decrement the TOP field in the x87 status word. |

| Data Register | Before FDECSTP: Value | Before FDECSTP: Stack Pointer | After FDECSTP: Stack Pointer | After FDECSTP: Value |
|---------------|-----------------------|-------------------------------|------------------------------|----------------------|
| 7             | num1                  | ST(7)                         | ST(0)                        | num1                 |
| 6             | num2   | ST(6)         | ST(7)         | num2  |
| 5             | num3   | ST(5)         | ST(6)         | num3  |
| 4             | num4   | ST(4)         | ST(5)         | num4  |
| 3             | num5   | ST(3)         | ST(4)         | num5  |
| 2             | num6   | ST(2)         | ST(3)         | num6  |
| 1             | num7   | ST(1)         | ST(2)         | num7  |
| 0             | num8   | ST(0)         | ST(1)         | num8  |
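
The wraparound shown in the table follows from TOP being a 3-bit field. The sketch below models the decrement and the mapping from a stack-relative name ST(i) to a physical data register; the function names `fdecstp_top` and `st_to_phys` are hypothetical.

```c
/* New TOP after FDECSTP: decrement modulo 8, so 0 wraps to 7. */
unsigned fdecstp_top(unsigned top)
{
    return (top - 1u) & 7u;
}

/* Physical data register (0..7) that ST(i) names for a given TOP. */
unsigned st_to_phys(unsigned top, unsigned i)
{
    return (top + i) & 7u;
}
```

With TOP = 0 before the instruction, `fdecstp_top(0)` yields 7, so ST(0) then names data register 7 and ST(1) names data register 0, matching the "After" columns above. The register contents themselves do not move.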

#### **Related Instructions**

**FINCSTP** 

#### rFLAGS Affected

None


### **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0                 | U     |             |
| C1                 | 0     |             |
| C2                 | U     |             |
| C3                 | U     |             |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

# Exceptions

| Exception                                       | Real  | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|-------------------------------------------------|-------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                    | Х     | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point<br>exception pending,<br>#MF | Х     | Х               | Х         | An unmasked x87 floating-point exception was pending.                                        |


### FDIV FDIVP FIDIV

### **Floating-Point Divide**

Divides the value in a floating-point register by the value in another register or a memory location and stores the result in the register containing the dividend. For the FDIV and FDIVP instructions, the divisor value in memory can be stored in single-precision or double-precision floating-point format.

If only one operand is specified, the instruction divides the value in ST(0) by the value in the specified memory location.

If no operands are specified, the FDIVP instruction divides the value in ST(1) by the value in ST(0), stores the result in ST(1), and pops the x87 register stack.

The FIDIV instruction converts a divisor in word integer or short integer format to double-extended-precision floating-point format before performing the division. It treats an integer 0 as +0.

If the zero-divide exception is not masked (ZM bit cleared to 0 in the x87 control word) and the operation causes a zero-divide exception (sets the ZE bit in the x87 status word to 1), the operation stores no result. If the zero-divide exception is masked (ZM bit set to 1), a zero-divide exception causes  $\pm \infty$  to be stored.

The sign of the operands, even if one of the operands is 0, determines the sign of the result.
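
The masked zero-divide response and the sign rule can be illustrated with a short C model. It is a sketch under stated assumptions: it uses `double` rather than the 80-bit format, it models only the masked (ZM = 1) case, and it reports ZE only for a finite non-zero dividend. The names `fdiv_model_result` and `fdiv_masked_model` are hypothetical.

```c
#include <math.h>
#include <stdbool.h>

typedef struct {
    double result;  /* value stored in the destination register */
    bool   ze;      /* true when the zero-divide flag would be set */
} fdiv_model_result;

/* Model of dividend / divisor with the zero-divide exception masked. */
fdiv_model_result fdiv_masked_model(double dividend, double divisor)
{
    fdiv_model_result r = { 0.0, false };
    /* The result is negative when exactly one operand is negative,
       and this holds even when an operand is a signed zero. */
    bool negative = (signbit(dividend) != 0) != (signbit(divisor) != 0);

    if (divisor == 0.0 && dividend != 0.0 && isfinite(dividend)) {
        r.ze = true;
        r.result = negative ? -INFINITY : INFINITY;
    } else {
        r.result = dividend / divisor;
    }
    return r;
}
```

For example, dividing 1.0 by -0.0 in this model sets `ze` and stores negative infinity, while dividing 0.0 by -4.0 stores -0.0 without setting `ze`.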

| Mnemonic          | Opcode          | Description                                                     |
|-------------------|-----------------|-----------------------------------------------------------------|
| FDIV ST(0),ST(i)  | D8 F0+ <i>i</i> | Replace ST(0) with ST(0)/ST(i).                                 |
| FDIV ST(i),ST(0)  | DC F8+ <i>i</i> | Replace ST(i) with ST(i)/ST(0).                                 |
| FDIV mem32real    | D8 /6           | Replace ST(0) with ST(0)/mem32real.                             |
| FDIV mem64real    | DC /6           | Replace ST(0) with ST(0)/mem64real.                             |
| FDIVP             | DE F9           | Replace ST(1) with ST(1)/ST(0), and pop the x87 register stack. |
| FDIVP ST(i),ST(0) | DE F8+ <i>i</i> | Replace ST(i) with ST(i)/ST(0), and pop the x87 register stack. |
| FIDIV mem16int    | DE /6           | Replace ST(0) with ST(0)/mem16int.                              |
| FIDIV mem32int    | DA /6           | Replace ST(0) with ST(0)/mem32int.                              |


### **Related Instructions**

FDIVR, FDIVRP, FIDIVR

### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                       |
|--------------------|-------|-------------------------------------------------------------------|
| C0                 | U     |                                                                   |
| C1                 | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                    | 0     | Result was rounded down, if a precision exception was detected.   |
|                    | 1     | Result was rounded up, if a precision exception was detected.     |
| C2                 | U     |                                                                   |
| C3                 | U     |                                                                   |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

# **Exceptions**

| Exception                                       | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|-------------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                    | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS                                      | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                      |
| General protection,<br>#GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                         |
|                                                 |      |                 | Х         | A null data segment was used to reference memory.                                            |
| Page fault, #PF                                 |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                 |
| Alignment check, #AC                            |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.            |
| x87 floating-point<br>exception pending,<br>#MF | X    | X               | X         | An unmasked x87 floating-point exception was pending.                                        |
| x87 Floating-Point Exception Generated, #MF     |      |                 |           |                                                                                              |
| Invalid-operation exception (IE)                | X    | X               | X         | A source operand was an SNaN value or an unsupported format.                                 |
|                                                 | X    | X               | X         | ±infinity was divided by ±infinity.                                                          |
|                                                 | X    | X               | X         | ±zero was divided by ±zero.                                                                  |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X | X     | X         | An x87 stack underflow occurred.                                                             |
| Denormalized-operand exception (DE)             | X    | X               | X         | A source operand was a denormal value.                                                       |
| Zero-divide exception (ZE)                      | X    | X               | X         | A non-zero value was divided by ±0.                                                          |
| Overflow exception (OE)                         | X    | X               | X         | A rounded result was too large to fit into the format of the destination operand.            |
| Underflow exception (UE)                        | X    | X               | X         | A rounded result was too small to fit into the format of the destination operand.            |
| Precision exception (PE)                        | X    | X               | X         | A result could not be represented exactly in the destination format.                         |


# FDIVR FDIVRP FIDIVR

### **Floating-Point Divide Reverse**

Divides a value in a floating-point register or a memory location by the value in a floating-point register and stores the result in the register containing the divisor. For the FDIVR and FDIVRP instructions, a dividend value in memory can be stored in single-precision or double-precision floating-point format.

If one operand is specified, the instruction divides the value at the specified memory location by the value in ST(0). If two operands are specified, it divides the value in ST(0) by the value in another x87 stack register or vice versa.

The FIDIVR instruction converts a dividend in word integer or short integer format to double-extended-precision format before performing the division.

The FDIVRP instruction pops the x87 register stack after performing the division operation. If no operand is specified, the FDIVRP instruction divides the value in ST(0) by the value in ST(1).

If the zero-divide exception is not masked (ZM bit cleared to 0 in the x87 control word) and the operation causes a zero-divide exception (sets the ZE bit in the x87 status word to 1), the operation stores no result. If the zero-divide exception is masked (ZM bit set to 1), a zero-divide exception causes  $\pm \infty$  to be stored.

The sign of the operands, even if one of the operands is 0, determines the sign of the result.

| Mnemonic            | Opcode           | Description                                                          |
|---------------------|------------------|----------------------------------------------------------------------|
| FDIVR ST(0),ST(i)   | D8 F8+<i>i</i>   | Replace ST(0) with ST(i)/ST(0).                                      |
| FDIVR ST(i),ST(0)   | DC F0+<i>i</i>   | Replace ST(i) with ST(0)/ST(i).                                      |
| FDIVR mem32real     | D8 /7            | Replace ST(0) with mem32real/ST(0).                                  |
| FDIVR mem64real     | DC /7            | Replace ST(0) with mem64real/ST(0).                                  |
| FDIVRP              | DE F1            | Replace ST(1) with ST(0)/ST(1), and pop the x87 register stack.      |
| FDIVRP ST(i),ST(0)  | DE F0+<i>i</i>   | Replace ST(i) with ST(0)/ST(i), and pop the x87 register stack.      |
| FIDIVR mem16int     | DE /7            | Replace ST(0) with mem16int/ST(0).                                   |
| FIDIVR mem32int     | DA /7            | Replace ST(0) with mem32int/ST(0).                                   |
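
The operand order is the only difference between the reverse forms and the FDIV forms, and it is easy to get wrong. The sketch below contrasts the no-operand FDIVP and FDIVRP forms and the memory form of FDIVR on a simplified logical stack. It assumes `double` values and ignores exceptions and the TOP field; the type `x87_model` and all function names are hypothetical.

```c
#include <stddef.h>

/* Logical view of the x87 stack: st[0] is ST(0), st[1] is ST(1), ... */
typedef struct {
    double st[8];
    size_t depth;  /* number of valid entries */
} x87_model;

/* Pop the stack: the old ST(1) becomes ST(0), and so on. */
static x87_model stack_pop(x87_model m)
{
    for (size_t i = 1; i < m.depth; i++)
        m.st[i - 1] = m.st[i];
    if (m.depth > 0)
        m.depth--;
    return m;
}

/* FDIVP (no operands): ST(1) = ST(1) / ST(0), then pop. */
x87_model fdivp_model(x87_model m)
{
    m.st[1] = m.st[1] / m.st[0];
    return stack_pop(m);
}

/* FDIVRP (no operands): ST(1) = ST(0) / ST(1), then pop. */
x87_model fdivrp_model(x87_model m)
{
    m.st[1] = m.st[0] / m.st[1];
    return stack_pop(m);
}

/* FDIVR mem: ST(0) = mem / ST(0). */
x87_model fdivr_mem_model(x87_model m, double mem)
{
    m.st[0] = mem / m.st[0];
    return m;
}
```

With ST(0) = 2.0 and ST(1) = 8.0, the FDIVP model leaves 4.0 in the new ST(0), while the FDIVRP model leaves 0.25.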


### **Related Instructions**

FDIV, FDIVP, FIDIV

### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                       |
|--------------------|-------|-------------------------------------------------------------------|
| C0                 | U     |                                                                   |
| C1                 | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                    | 0     | Result was rounded down, if a precision exception was detected.   |
|                    | 1     | Result was rounded up, if a precision exception was detected.     |
| C2                 | U     |                                                                   |
| C3                 | U     |                                                                   |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

# **Exceptions**

| Exception                                       | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|-------------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                    | X    | X               | X         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS                                      | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                      |
| General protection,<br>#GP                      | X    | X               | X         | A memory address exceeded a data segment limit or was non-canonical.                         |
|                                                 |      |                 | X         | A null data segment was used to reference memory.                                            |
| Page fault, #PF                                 |      | X               | X         | A page fault resulted from the execution of the instruction.                                 |
| Alignment check, #AC                            |      | X               | X         | An unaligned memory reference was performed while alignment checking was enabled.            |
| x87 floating-point<br>exception pending,<br>#MF | X    | X               | X         | An unmasked x87 floating-point exception was pending.                                        |


| Exception                                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                |
|--------------------------------------------------------------|------|-----------------|-----------|-----------------------------------------------------------------------------------|
| x87 Floating-Point Exception Generated, #MF                  |      |                 |           |                                                                                   |
| Invalid-operation exception (IE)                             | X    | X               | X         | A source operand was an SNaN value or an unsupported format.                      |
|                                                              | X    | X               | X         | ±infinity was divided by ±infinity.                                               |
|                                                              | X    | X               | X         | ±zero was divided by ±zero.                                                       |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X    | X               | X         | An x87 stack underflow occurred.                                                  |
| Denormalized-operand<br>exception (DE)                       | X    | X               | X         | A source operand was a denormal value.                                            |
| Zero-divide exception (ZE)                                   | X    | X               | X         | A non-zero value was divided by ±zero.                                            |
| Overflow exception (OE)                                      | X    | X               | X         | A rounded result was too large to fit into the format of the destination operand. |
| Underflow exception (UE)                                     | X    | X               | X         | A rounded result was too small to fit into the format of the destination operand. |
| Precision exception (PE)                                     | X    | X               | X         | A result could not be represented exactly in the destination format.              |

### **FFREE**

# **Floating-Point Free Register**

Frees the specified x87 stack register by marking its tag register entry as empty. The instruction does not affect the contents of the freed register or the top-of-stack pointer (TOP).

| Mnemonic    | Opcode          | Description                                            |
|-------------|-----------------|--------------------------------------------------------|
| FFREE ST(i) | DD C0+ <i>i</i> | Set the tag for x87 stack register $i$ to empty (11b). |
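
The tag update can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not taken from this section: the tag word is modeled as 16 bits with a 2-bit tag per physical register (physical register 0 in bits 1:0), and ST(i) is mapped to physical register (TOP + i) mod 8. The name `ffree_model` is hypothetical.

```python
def ffree_model(tag_word, top, i):
    """Return the tag word after FFREE ST(i); data registers and TOP are untouched."""
    # Assumed mapping: ST(i) refers to physical register (TOP + i) mod 8.
    physical = (top + i) & 7
    # Assumed layout: 2-bit tag per physical register, register 0 in bits 1:0.
    # Setting both bits writes the "empty" encoding (11b).
    return (tag_word | (0b11 << (2 * physical))) & 0xFFFF
```

With `top = 5` and `i = 4`, the model marks physical register 1 as empty, because stack indexing wraps around modulo 8.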

#### **Related Instructions**

FLD, FST, FSTP

### **rFLAGS Affected**

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0                 | U     |             |
| C1                 | U     |             |
| C2                 | U     |             |
| C3                 | U     |             |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

### **Exceptions**

| Exception                                       | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                          |
|-------------------------------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                    | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) is set to 1. |
| x87 floating-point<br>exception pending,<br>#MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                       |


# FICOM FICOMP

# **Floating-Point Integer Compare**

Converts a 16-bit or 32-bit signed integer value to double-extended-precision format, compares it to the value in ST(0), and sets the C0, C2, and C3 condition code flags in the x87 status word to reflect the results.

The comparison operation ignores the sign of zero (-0.0 = +0.0).

After performing the comparison operation, the FICOMP instruction pops the x87 register stack.

If ST(0) is a NaN or is in an unsupported format, the instruction sets the condition flags to "unordered."

| Mnemonic        | Opcode | Description                                                                                                                                                                                                             |
|-----------------|--------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| FICOM mem16int  | DE /2  | Convert the contents of <i>mem16int</i> to double-extended-precision format, compare the result to the contents of ST(0), and set condition flags to reflect the results of the comparison.                             |
| FICOM mem32int  | DA /2  | Convert the contents of <i>mem32int</i> to double-extended-precision format, compare the result to the contents of ST(0), and set condition flags to reflect the results of the comparison.                             |
| FICOMP mem16int | DE /3  | Convert the contents of <i>mem16int</i> to double-extended-precision format, compare the result to the contents of ST(0), set condition flags to reflect the results of the comparison, and pop the x87 register stack. |
| FICOMP mem32int | DA /3  | Convert the contents of <i>mem32int</i> to double-extended-precision format, compare the result to the contents of ST(0), set condition flags to reflect the results of the comparison, and pop the x87 register stack. |

#### **Related Instructions**

FCOM, FCOMPP, FCOMI, FCOMIP, FTST, FUCOMI, FUCOMIP, FXAM

#### rFLAGS Affected

None

### **x87 Condition Code**

| C3         | C2         | C1         | C0 | Compare Result          |
|------------|------------|------------|----|-------------------------|
| 0          | 0          | 0          | 0  | ST(0) > source          |
| 0          | 0          | 0          | 1  | ST(0) < source          |
| 1          | 0          | 0          | 0  | ST(0) = source          |
| 1          | 1          | 0          | 1  | Operands were unordered |
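
The mapping from comparison outcome to condition codes in the table above can be sketched in Python. This is an illustrative model only: the name `ficom_model` is hypothetical, the integer operand is passed as a Python `int` rather than read from memory, and Python floats are double precision (which still represents every 16-bit and 32-bit integer exactly). The invalid-operation exception and the stack pop performed by FICOMP are not modeled.

```python
import math

def ficom_model(st0, src_int):
    """Return (C3, C2, C0) for comparing ST(0) with an integer source."""
    if math.isnan(st0):
        # Unordered: C3, C2 and C0 are all set.
        return (1, 1, 1)
    src = float(src_int)  # exact for 16-bit and 32-bit integers
    if st0 > src:
        return (0, 0, 0)
    if st0 < src:
        return (0, 0, 1)
    # Equal; -0.0 == 0.0 holds for floats, matching "ignores the sign of zero".
    return (1, 0, 0)
```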

# **Exceptions**

| Exception                                                    | Real | Virtual<br>8086 | Protected    | Cause of Exception                                                                           |
|--------------------------------------------------------------|------|-----------------|--------------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | Х    | Х               | Х            | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS                                                   | Х    | Х               | Х            | A memory address exceeded the stack segment limit or was non-canonical.                      |
| General protection,<br>#GP                                   | Х    | Х               | Х            | A memory address exceeded a data segment limit or was non-canonical.                         |
|                                                              |      |                 | Х            | A null data segment was used to reference memory.                                            |
| Page fault, #PF                                              |      | Х               | Х            | A page fault resulted from the execution of the instruction.                                 |
| Alignment check, #AC                                         |      | Х               | Х            | An unaligned memory reference was performed while alignment checking was enabled.            |
| x87 floating-point<br>exception pending,<br>#MF              | Х    | Х               | Х            | An unmasked x87 floating-point exception was pending.                                        |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |              |                                                                                              |
| Invalid-operation exception (IE)                             | X    | X               | X            | A source operand was an SNaN value, a QNaN value, or an unsupported format.                  |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X    | X               | X            | An x87 stack underflow occurred.                                                             |
| Denormalized-operand<br>exception (DE)                       | X    | X               | X            | A source operand was a denormal value.                                                       |


### **FILD**

# **Floating-Point Load Integer**

Converts a signed integer in memory to double-extended-precision format and pushes the value onto the x87 register stack. The value can be a 16-bit, 32-bit, or 64-bit integer value. Signed values from memory can always be represented exactly in x87 registers without rounding.

| Mnemonic       | Opcode | Description                                                       |
|----------------|--------|-------------------------------------------------------------------|
| FILD mem16int  | DF /0  | Push the contents of <i>mem16int</i> onto the x87 register stack. |
| FILD mem32int  | DB /0  | Push the contents of <i>mem32int</i> onto the x87 register stack. |
| FILD mem64int  | DF /5  | Push the contents of <i>mem64int</i> onto the x87 register stack. |
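
The claim that every supported integer loads without rounding can be checked with a short Python sketch. It assumes the double-extended-precision format carries a 64-bit significand (a property of the format, not stated in this section) and contrasts it with the 53-bit significand of double precision. The helper name `fits_significand` is hypothetical.

```python
def fits_significand(value, significand_bits=64):
    """Return True if the integer is exactly representable with the given significand width."""
    magnitude = abs(value)
    if magnitude == 0:
        return True
    # Trailing zero bits are absorbed by the exponent, so strip them first.
    while magnitude % 2 == 0:
        magnitude //= 2
    return magnitude.bit_length() <= significand_bits

# Extremes of a signed 64-bit integer.
INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)
```

Under these assumptions both `INT64_MAX` and `INT64_MIN` fit a 64-bit significand, whereas `INT64_MAX` does not fit a 53-bit one, which is why the same load into a double-precision format could require rounding.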

#### **Related Instructions**

FLD, FST, FSTP, FIST, FISTP, FBLD, FBSTP

### **rFLAGS Affected**

None

### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                      |
|--------------------|-------|------------------------------------------------------------------|
| C0                 | U     |                                                                  |
| C1                 | 0     | No stack overflow                                                |
|                    | 1     | x87 stack overflow, if an x87 register stack fault was detected. |
| C2                 | U     |                                                                  |
| C3                 | U     |                                                                  |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

### **Exceptions**

| Exception                                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                          |
|--------------------------------------------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | X    | X               | X         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) is set to 1. |
| Stack, #SS                                                   | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                     |
| General protection,<br>#GP                                   | X    | X               | X         | A memory address exceeded a data segment limit or was non-canonical.                        |
|                                                              |      |                 | X         | A null data segment was used to reference memory.                                           |
| Page fault, #PF                                              |      | X               | X         | A page fault resulted from the execution of the instruction.                                |
| Alignment check, #AC                                         |      | X               | X         | An unaligned memory reference was performed while alignment checking was enabled.           |
| x87 floating-point<br>exception pending,<br>#MF              | X    | X               | X         | An unmasked x87 floating-point exception was pending.                                       |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |           |                                                                                             |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X    | X               | X         | An x87 stack overflow occurred.                                                             |


### **FINCSTP**

# **Floating-Point Increment Stack-Top Pointer**

Increments the top-of-stack pointer (TOP) field of the x87 status word. If the TOP field contains 7, it is cleared to 0. In other words, this instruction rotates the stack by one position.

| Mnemonic | Opcode | Description                                     |
|----------|--------|-------------------------------------------------|
| FINCSTP  | D9 F7  | Increment the TOP field in the x87 status word. |

| Data Register | Value | Stack Pointer Before FINCSTP | Stack Pointer After FINCSTP |
|---------------|-------|------------------------------|-----------------------------|
| 7             | num1  | ST(7)                        | ST(6)                       |
| 6             | num2  | ST(6)                        | ST(5)                       |
| 5             | num3  | ST(5)                        | ST(4)                       |
| 4             | num4  | ST(4)                        | ST(3)                       |
| 3             | num5  | ST(3)                        | ST(2)                       |
| 2             | num6  | ST(2)                        | ST(1)                       |
| 1             | num7  | ST(1)                        | ST(0)                       |
| 0             | num8  | ST(0)                        | ST(7)                       |
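
The wrap-around increment can be sketched in Python as a status-word update. This is a minimal sketch under assumptions not stated in this section: TOP is taken to occupy bits 13:11 of the x87 status word and C1 to be bit 9. The name `fincstp_model` is hypothetical, and the data registers and tag word are deliberately left out because the instruction does not change them.

```python
TOP_SHIFT = 11                 # assumed position of the TOP field (bits 13:11)
TOP_MASK = 0b111 << TOP_SHIFT
C1_BIT = 1 << 9                # assumed position of condition code C1

def fincstp_model(status_word):
    """Return the status word after FINCSTP: TOP = (TOP + 1) mod 8, C1 cleared."""
    top = (status_word & TOP_MASK) >> TOP_SHIFT
    new_top = (top + 1) & 7    # 7 wraps around to 0
    updated = (status_word & ~TOP_MASK) | (new_top << TOP_SHIFT)
    return updated & ~C1_BIT & 0xFFFF
```

Because only TOP changes, the values in the data registers stay where they are and each one is simply reached through a different ST(i) name, as the table above shows.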

#### **Related Instructions**

**FDECSTP** 

#### rFLAGS Affected

None

# **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0                 | U     |             |
| C1                 | 0     |             |
| C2                 | U     |             |
| C3                 | U     |             |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

# Exceptions

| Exception                                       | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|-------------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                    | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point<br>exception pending,<br>#MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                        |


# FINIT (FNINIT)

# **Floating-Point Initialize**

Sets the x87 control word register, status word register, tag word register, instruction pointer, and data pointer to their default states as follows:

- Sets the x87 control word to 037Fh—round to nearest (RC = 00b); double-extended-precision (PC = 11b); all exceptions masked (PM, UM, OM, ZM, DM, and IM all set to 1).
- Clears all bits in the x87 status word (TOP is set to 0, which maps ST(0) onto FPR0).
- Marks all x87 stack registers as empty (11b) in the x87 tag register.
- Clears the instruction pointer and the data pointer.
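
The default control word value can be decoded with a short Python sketch to confirm that 037Fh corresponds to the listed settings. The bit positions used here are assumptions drawn from the general x87 control word layout rather than from this section: the exception masks IM, DM, ZM, OM, UM, and PM in bits 0 through 5, PC in bits 9:8, and RC in bits 11:10. The name `decode_control_word` is hypothetical.

```python
MASK_NAMES = ("IM", "DM", "ZM", "OM", "UM", "PM")  # assumed bits 0..5

def decode_control_word(cw):
    """Split an x87 control word into its RC, PC and exception-mask fields."""
    return {
        "RC": (cw >> 10) & 0b11,   # rounding control, assumed bits 11:10
        "PC": (cw >> 8) & 0b11,    # precision control, assumed bits 9:8
        "masks": {name: (cw >> bit) & 1 for bit, name in enumerate(MASK_NAMES)},
    }

defaults = decode_control_word(0x037F)
```

Under this layout, `defaults` reports RC = 0 (round to nearest), PC = 3 (double-extended precision), and every exception mask set to 1.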

These instructions do not actually zero out the x87 stack registers.

Assemblers usually provide an FINIT macro that expands into the following instruction sequence:

    WAIT      ; Opcode 9B
    FNINIT    ; Opcode DB E3

The WAIT (9Bh) instruction checks for pending x87 exceptions and calls an exception handler, if necessary. The FNINIT instruction then resets the x87 environment to its default state.

| Mnemonic | Opcode   | Description                                                                                          |
|----------|----------|------------------------------------------------------------------------------------------------------|
| FINIT    | 9B DB E3 | Perform a WAIT (9B) to check for pending floating-point exceptions and then initialize the x87 unit. |
| FNINIT   | DB E3    | Initialize the x87 unit without checking for unmasked floating-point exceptions.                     |

#### **Related Instructions**

FWAIT, WAIT

#### rFLAGS Affected

None

# **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0                 | 0     |             |
| C1                 | 0     |             |
| C2                 | 0     |             |
| C3                 | 0     |             |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

# **Exceptions**

| Exception                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |

# FIST FISTP

# **Floating-Point Integer Store**

Converts the value in ST(0) to a signed integer, rounds it if necessary, and copies it to the specified memory location. The rounding control (RC) field of the x87 control word determines the type of rounding used.

The FIST instruction supports 16-bit and 32-bit values. The FISTP instruction supports 16-bit, 32-bit, and 64-bit values.

The FISTP instruction pops the stack after storing the rounded value in memory.

If the value is too large for the destination location, is a NaN, or is in an unsupported format, the instruction sets the invalid-operation exception (IE) bit in the x87 status word to 1. Then, if the exception is masked (IM bit set to 1 in the x87 control word), the instruction stores the integer indefinite value. If the exception is unmasked (IM bit cleared to 0), the instruction does not store the value.

| Mnemonic        | Opcode | Description                                                                                                     |
|-----------------|--------|-----------------------------------------------------------------------------------------------------------------|
| FIST mem16int   | DF /2  | Convert the contents of ST(0) to integer and store the result in <i>mem16int</i>.                               |
| FIST mem32int   | DB /2  | Convert the contents of ST(0) to integer and store the result in <i>mem32int</i>.                               |
| FISTP mem16int  | DF /3  | Convert the contents of ST(0) to integer, store the result in <i>mem16int</i>, and pop the x87 register stack.  |
| FISTP mem32int  | DB /3  | Convert the contents of ST(0) to integer, store the result in <i>mem32int</i>, and pop the x87 register stack.  |
| FISTP mem64int  | DF /7  | Convert the contents of ST(0) to integer, store the result in <i>mem64int</i>, and pop the x87 register stack.  |

Table 2-1 on page 274 shows the results of storing various types of numbers as integers.


Table 2-1. Storing Numbers as Integers

| ST(0)                  | DEST                                                                                        |
|------------------------|---------------------------------------------------------------------------------------------|
| −∞                     | Invalid-operation (IE) exception                                                            |
| −Finite-real < −1      | −Integer (invalid-operation (IE) exception if the integer is too large for the destination) |
| −1 < −Finite-real < −0 | 0 or −1, depending on the rounding mode                                                     |
| −0                     | 0                                                                                           |
| +0                     | 0                                                                                           |
| +0 < +Finite-real < +1 | 0 or +1, depending on the rounding mode                                                     |
| +Finite-real > +1      | +Integer (invalid-operation (IE) exception if the integer is too large for the destination) |
| +∞                     | Invalid-operation (IE) exception                                                            |
| NaN                    | Invalid-operation (IE) exception                                                            |
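
The store behavior summarized in Table 2-1 can be sketched as a Python model. Several details are assumptions not stated in this section: the RC encodings (00b nearest-even, 01b toward negative infinity, 10b toward positive infinity, 11b truncate) and the integer indefinite value being the most negative integer of the destination size. The name `fist_model` and its parameters are hypothetical, and the precision exception and the stack pop performed by FISTP are not modeled.

```python
import math

def fist_model(st0, bits=32, rc=0, im_masked=True):
    """Return (stored value or None, IE flag) for storing ST(0) as a signed integer."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    indefinite = lowest  # assumed encoding of the integer indefinite value

    def invalid():
        # Masked: store the indefinite value. Unmasked: store nothing.
        return (indefinite if im_masked else None), True

    if math.isnan(st0) or math.isinf(st0):
        return invalid()
    if rc == 0:
        rounded = round(st0)       # Python rounds halves to even
    elif rc == 1:
        rounded = math.floor(st0)  # toward negative infinity
    elif rc == 2:
        rounded = math.ceil(st0)   # toward positive infinity
    else:
        rounded = math.trunc(st0)  # toward zero
    if not lowest <= rounded <= highest:
        return invalid()
    return rounded, False
```

For instance, `fist_model(-0.5, rc=1)` stores −1 while `fist_model(-0.5)` stores 0, which is the "0 or −1, depending on the rounding mode" row of the table.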

### **Related Instructions**

FLD, FST, FSTP, FILD, FBLD, FBSTP

### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                       |
|--------------------|-------|-------------------------------------------------------------------|
| C0                 | U     |                                                                   |
| C1                 | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                    | 0     | Result was rounded down, if a precision exception was detected.   |
|                    | 1     | Result was rounded up, if a precision exception was detected.     |
| C2                 | U     |                                                                   |
| C3                 | U     |                                                                   |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.


### **Exceptions**

| Exception                                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|--------------------------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | X    | X               | X         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS                                                   | X    | X               | X         | A memory address exceeded the stack segment limit or was non-canonical.                      |
| General protection,<br>#GP                                   | X    | X               | X         | A memory address exceeded a data segment limit or was non-canonical.                         |
|                                                              |      |                 | X         | The destination operand was in a nonwritable segment.                                        |
|                                                              |      |                 | X         | A null data segment was used to reference memory.                                            |
| Page fault, #PF                                              |      | X               | X         | A page fault resulted from the execution of the instruction.                                 |
| Alignment check, #AC                                         |      | X               | X         | An unaligned memory reference was performed while alignment checking was enabled.            |
| x87 floating-point<br>exception pending,<br>#MF              | X    | X               | X         | An unmasked x87 floating-point exception was pending.                                        |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |           |                                                                                              |
| Invalid-operation exception (IE)                             | X    | X               | X         | The source operand was too large for the destination format.                                 |
|                                                              | X    | X               | X         | A source operand was an SNaN value, a QNaN value, or an unsupported format.                  |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X    | X               | X         | An x87 stack underflow occurred.                                                             |
| Precision exception (PE)                                     | X    | X               | X         | A result could not be represented exactly in the destination format.                         |

### **FLD**

# **Floating-Point Load**

Pushes a value in memory or in a floating-point register onto the register stack. If in memory, the value can be a single-precision, double-precision, or double-extended-precision floating-point value. The operation converts a single-precision or double-precision value to double-extended-precision format before pushing it onto the stack.

| Mnemonic      | Opcode          | Description                                                 |
|---------------|-----------------|-------------------------------------------------------------|
| FLD ST(i)     | D9 C0+ <i>i</i> | Push the contents of ST(i) onto the x87 register stack.     |
| FLD mem32real | D9 /0           | Push the contents of mem32real onto the x87 register stack. |
| FLD mem64real | DD /0           | Push the contents of mem64real onto the x87 register stack. |
| FLD mem80real | DB /5           | Push the contents of mem80real onto the x87 register stack. |
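
The widening that FLD performs on a memory operand can be sketched in software. The following is a minimal illustration, not the processor's implementation: the function name `double_to_extended` is hypothetical, and it assumes the usual double-extended layout of a 64-bit significand with an explicit integer bit followed by a 16-bit sign-and-exponent field (exponent bias 16383). It does not model the SNaN or denormal exception reporting listed in the exception table for this instruction.

```python
import struct

def double_to_extended(value: float) -> bytes:
    """Return the 10-byte little-endian double-extended image of a double."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF          # 11-bit biased exponent
    fraction = bits & ((1 << 52) - 1)        # 52 stored fraction bits

    if exponent == 0x7FF:                    # infinity or NaN
        ext_exponent = 0x7FFF
        significand = (1 << 63) | (fraction << 11)
    elif exponent == 0 and fraction == 0:    # signed zero
        ext_exponent, significand = 0, 0
    elif exponent == 0:                      # denormal double: normalize it
        shift = 64 - fraction.bit_length()
        significand = fraction << shift      # integer bit lands in bit 63
        ext_exponent = 15372 - shift         # 16383 - 1074 + 63 - shift
    else:                                    # normal number
        ext_exponent = exponent - 1023 + 16383
        significand = (1 << 63) | (fraction << 11)

    # 8 bytes of significand, then 2 bytes of sign and exponent
    return struct.pack("<QH", significand, (sign << 15) | ext_exponent)
```

Because the extended format has a wider exponent range and a longer significand, every double value, including denormals, is represented exactly after the conversion.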

#### **Related Instructions**

FFREE, FST, FSTP, FILD, FIST, FISTP, FBLD, FBSTP

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value | Description                                                       |
|-----------------------------------------------------------------------------------------------------|-------|-------------------------------------------------------------------|
| C0                                                                                                  | U     |                                                                   |
| C1                                                                                                  | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                                                                                                     | 1     | x87 stack overflow, if an x87 register stack fault was detected.  |
|                                                                                                     | 0     | No x87 stack fault.                                               |
| C2                                                                                                  | U     |                                                                   |
| C3                                                                                                  | U     |                                                                   |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |       |                                                                   |

#### **Exceptions**

| Exception                                       | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                  |
|-------------------------------------------------|------|-----------------|-----------|-------------------------------------------------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                    | X    | X               | X         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1.                                        |
| Stack, #SS                                      | Х    | Х       | Х            | A memory address exceeded the stack segment limit or was non-canonical.                                                             |
| General protection,<br>#GP                      | Х    | Х       | Х            | A memory address exceeded a data segment limit or was non-canonical.                                                                |
|                                                 |      |         | Х            | A null data segment was used to reference memory.                                                                                   |
| Page fault, #PF                                 |      | Х       | Х            | A page fault resulted from the execution of the instruction.                                                                        |
| Alignment check, #AC                            |      | Х       | Х            | An unaligned memory reference was performed while alignment checking was enabled.                                                   |
| x87 floating-point<br>exception pending,<br>#MF | Х    | Х       | Х            | An unmasked x87 floating-point exception was pending.                                                                               |
| x87 Floating-Point Exception Generated, #MF     |      |                 |           |                                                                                                                                     |
| Invalid-operation exception (IE)                | X    | X               | X         | A source operand was an SNaN value.                                                                                                 |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X    | X               | X         | An x87 stack underflow occurred.                                                                                       |
|                                                 | X    | X               | X         | An x87 stack overflow occurred.                                                                                                     |
| Denormalized-operand<br>exception (DE)          | X    | X               | X         | A source operand was a denormal value. This exception does not occur if the source operand was in double-extended-precision format. |

### FLD1

# Floating-Point Load +1.0

Pushes the floating-point value +1.0 onto the register stack.

| Mnemonic | Opcode | Description                            |
|----------|--------|----------------------------------------|
| FLD1     | D9 E8  | Push +1.0 onto the x87 register stack. |
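
The push and the stack-overflow case reported through C1, IE, and SF can be modelled with a small sketch. The class `X87StackModel` and its attribute names are hypothetical, chosen for illustration only. The model assumes an eight-entry register stack in which a push decrements the top-of-stack pointer, and it assumes the masked response, in which a NaN stands in for the value written on overflow.

```python
EMPTY = object()   # marker for a register tagged empty

class X87StackModel:
    """Toy model of the x87 register stack, for illustration only."""

    def __init__(self):
        self.regs = [EMPTY] * 8
        self.top = 0                      # top-of-stack pointer, 0..7
        self.ie = self.sf = self.c1 = 0   # status-word flags of interest

    def push(self, value):
        new_top = (self.top - 1) & 7      # a push decrements TOP modulo 8
        if self.regs[new_top] is not EMPTY:
            # Destination register is occupied: x87 stack overflow.
            self.ie, self.sf, self.c1 = 1, 1, 1
            value = float("nan")          # assumed masked response
        else:
            self.c1 = 0                   # no stack fault
        self.top = new_top
        self.regs[new_top] = value

    def st(self, i):
        """Return the contents of ST(i), relative to the current top."""
        return self.regs[(self.top + i) & 7]

    def fld1(self):
        self.push(1.0)
```

In this model, eight consecutive `fld1()` calls fill the stack without a fault, and the ninth sets IE, SF, and C1 together.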

### **Related Instructions**

FLD, FLDZ, FLDPI, FLDL2T, FLDL2E, FLDLG2, FLDLN2

### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value | Description                                                      |
|-----------------------------------------------------------------------------------------------------|-------|------------------------------------------------------------------|
| C0                                                                                                  | U     |                                                                  |
| C1                                                                                                  | 0     | No x87 stack fault occurred.                                     |
|                                                                                                     | 1     | x87 stack overflow, if an x87 register stack fault was detected. |
| C2                                                                                                  | U     |                                                                  |
| C3                                                                                                  | U     |                                                                  |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |       |                                                                  |

### **Exceptions**

| Exception                                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                          |
|--------------------------------------------------------------|------|-----------------|-----------|---------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) is set to 1. |
| x87 floating-point<br>exception pending,<br>#MF              | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                       |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |           |                                                                                             |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | Х    | X               | X         | An x87 stack overflow occurred.                                                             |


### **FLDCW**

# **Floating-Point Load x87 Control Word**

Loads a 16-bit value from the specified memory location into the x87 control word. If the new x87 control word unmasks any pending floating-point exceptions, they are handled upon execution of the next x87 floating-point or 64-bit media instruction.

To avoid generating exceptions when loading a new control word, use the FCLEX or FNCLEX instruction to clear any pending exceptions.

| Mnemonic      | Opcode | Description                                                    |
|---------------|--------|----------------------------------------------------------------|
| FLDCW mem2env | D9 /5  | Load the contents of <i>mem2env</i> into the x87 control word. |
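
Whether a new control word unmasks a pending exception can be checked by comparing it with the status word. The sketch below is illustrative only: the helper `unmasked_pending` is hypothetical, and it assumes that the six exception flags (IE, DE, ZE, OE, UE, PE) occupy bits 0 through 5 of the status word and that the matching mask bits occupy the same positions in the control word.

```python
EXCEPTION_FLAGS = ("IE", "DE", "ZE", "OE", "UE", "PE")  # status-word bits 0-5

def unmasked_pending(status_word: int, control_word: int) -> list:
    """List the exception flags that are set in the status word and
    not masked by the control word (a mask bit of 1 means masked)."""
    active = status_word & ~control_word & 0x3F
    return [name for bit, name in enumerate(EXCEPTION_FLAGS)
            if (active >> bit) & 1]
```

For example, with ZE pending (status word 0004h), a control word of 037Fh keeps every exception masked, whereas 037Bh clears the ZM bit, so `unmasked_pending(0x0004, 0x037B)` reports `["ZE"]`.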

#### **Related Instructions**

FSTCW, FNSTCW, FSTSW, FNSTSW, FSTENV, FNSTENV, FLDENV, FCLEX, FNCLEX

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value | Description |
|-----------------------------------------------------------------------------------------------------|-------|-------------|
| C0                                                                                                  | U     |             |
| C1                                                                                                  | U     |             |
| C2                                                                                                  | U     |             |
| C3                                                                                                  | U     |             |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |       |             |

#### **Exceptions**

| Exception                                       | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|-------------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                    | X    | X               | X         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS                                      | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                      |
| General protection,<br>#GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                         |
|                                                 |      |                 | Х         | A null data segment was used to reference memory.                                            |
| Page fault, #PF                                 |      | X               | X         | A page fault resulted from the execution of the instruction.                                 |
| Alignment check, #AC                            |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.            |
| x87 floating-point<br>exception pending,<br>#MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                        |


#### **FLDENV**

# **Floating-Point Load x87 Environment**

Restores the x87 environment from memory starting at the specified address. The x87 environment consists of the x87 control, status, and tag word registers, the last non-control x87 instruction pointer, the last x87 data pointer, and the opcode of the last completed non-control x87 instruction.

The x87 environment requires a 14-byte or 28-byte area in memory, depending on whether the processor is operating in protected or real mode and whether the operand-size attribute is 16-bit or 32-bit. See "Media and x87 Processor State" in volume 2 for details on how the x87 environment is laid out in memory.

The environment to be loaded is typically stored by a previous FNSTENV or FSTENV instruction. The FLDENV instruction should be executed in the same operating mode as the instruction that stored the x87 environment.

If FLDENV results in set exception flags in the loaded x87 status word register, and these exceptions are unmasked in the x87 control word register, a floating-point exception occurs when the next floating-point instruction is executed (except for the no-wait floating-point instructions).

To avoid generating exceptions when loading a new environment, use the FCLEX or FNCLEX instruction to clear the exception flags in the x87 status word before storing that environment.

| Mnemonic           | Opcode | Description                                                         |
|--------------------|--------|---------------------------------------------------------------------|
| FLDENV mem14/28env | D9 /4  | Load the complete contents of the x87 environment from mem14/28env. |
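
Before loading a saved environment, software can inspect the image to see whether it would leave an unmasked exception pending. The sketch below is an illustration under stated assumptions, not the layout definition: the helper `read_env_words` is hypothetical, and it assumes that the control, status, and tag words are the first three fields of the image, stored as consecutive 16-bit words in the 14-byte format and in the low halves of consecutive 32-bit slots in the 28-byte format. Volume 2 remains the authoritative reference for the layout.

```python
import struct

def read_env_words(image: bytes) -> dict:
    """Extract the control, status, and tag words from a saved
    x87 environment image (assumed field order, see the text above)."""
    if len(image) == 14:
        control, status, tag = struct.unpack_from("<3H", image, 0)
    elif len(image) == 28:
        control, status, tag = (
            word & 0xFFFF for word in struct.unpack_from("<3I", image, 0)
        )
    else:
        raise ValueError("expected a 14-byte or 28-byte environment image")
    # Exception flags set in the status word and not masked in the control word
    pending = status & ~control & 0x3F
    return {"control": control, "status": status, "tag": tag,
            "unmasked_pending": bool(pending)}
```

An image for which `unmasked_pending` is true is one that would cause the next waiting floating-point instruction after FLDENV to take a floating-point exception.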

#### **Related Instructions**

FSTENV, FNSTENV, FCLEX, FNCLEX

#### **rFLAGS Affected**

None

#### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value | Description         |
|-----------------------------------------------------------------------------------------------------|-------|---------------------|
| C0                                                                                                  | M     | Loaded from memory. |
| C1                                                                                                  | M     | Loaded from memory. |
| C2                                                                                                  | M     | Loaded from memory. |
| C3                                                                                                  | M     | Loaded from memory. |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |       |                     |

#### **Exceptions**

| Exception                                       | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|-------------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                    | X    | X               | X         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS                                      | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                      |
| General protection,<br>#GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                         |
|                                                 |      |                 | Х         | A null data segment was used to reference memory.                                            |
| Page fault, #PF                                 |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                 |
| Alignment check, #AC                            |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.            |
| x87 floating-point<br>exception pending,<br>#MF | X    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                        |


### FLDL2E

# Floating-Point Load Log<sub>2</sub> e

Pushes $\log_2 e$ onto the x87 register stack. The value in ST(0) is the result, in double-extended-precision format, of rounding an internal 66-bit constant according to the setting of the RC field in the x87 control word register.

| Mnemonic | Opcode | Description                                          |
|----------|--------|------------------------------------------------------|
| FLDL2E   | D9 EA  | Push log<sub>2</sub> e onto the x87 register stack.  |
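
The rounding of a 66-bit constant to the 64-bit significand of the double-extended format can be sketched as follows. This is an illustration, not a description of the hardware: the function `round_66_to_64` is hypothetical, the inputs used with it are synthetic because this manual does not list the internal constant, and the RC encodings assumed are 00 (nearest), 01 (down), 10 (up), and 11 (truncate). The sketch handles positive values only and ignores a carry out of bit 63.

```python
RC_NEAREST, RC_DOWN, RC_UP, RC_TRUNCATE = 0, 1, 2, 3   # assumed RC encodings

def round_66_to_64(significand66: int, rc: int) -> int:
    """Round a positive 66-bit significand to 64 bits under mode rc."""
    kept = significand66 >> 2          # the 64 bits that fit the format
    dropped = significand66 & 0b11     # the two extra low-order bits
    if rc in (RC_DOWN, RC_TRUNCATE):   # identical for a positive value
        return kept
    if rc == RC_UP:                    # any discarded bit rounds upward
        return kept + (dropped != 0)
    # Round to nearest, ties to even
    if dropped > 0b10 or (dropped == 0b10 and kept & 1):
        return kept + 1
    return kept
```

Since the constant is positive, round-down and truncate give the same result, so at most two distinct significands can appear in ST(0) across the four rounding modes.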

#### **Related Instructions**

FLD, FLD1, FLDZ, FLDPI, FLDL2T, FLDLG2, FLDLN2

#### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value | Description                                                      |
|-----------------------------------------------------------------------------------------------------|-------|------------------------------------------------------------------|
| C0                                                                                                  | U     |                                                                  |
| C1                                                                                                  | 0     | No x87 stack fault occurred.                                     |
|                                                                                                     | 1     | x87 stack overflow, if an x87 register stack fault was detected. |
| C2                                                                                                  | U     |                                                                  |
| C3                                                                                                  | U     |                                                                  |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |       |                                                                  |

### **Exceptions**

| Exception                                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|--------------------------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point<br>exception pending,<br>#MF              | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                        |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |           |                                                                                              |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | Х    | Х               | Х         | An x87 stack overflow occurred.                                                              |


### FLDL2T

# Floating-Point Load Log<sub>2</sub> 10

Pushes  $\log_2 10$  onto the x87 register stack. The value in ST(0) is the result, in double-extended-precision format, of rounding an internal 66-bit constant according to the setting of the RC field in the x87 control word register.

| Mnemonic | Opcode | Description                                           |
|----------|--------|-------------------------------------------------------|
| FLDL2T   | D9 E9  | Push log <sub>2</sub> 10 onto the x87 register stack. |
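
One common use of this constant, which is an assumption of this example rather than something stated in this section, is to evaluate a power of ten as a power of two through the identity $10^x = 2^{x \cdot \log_2 10}$. The sketch below works at double precision instead of double-extended precision, and the function name `pow10_via_pow2` is hypothetical. It splits the scaled exponent into an integer part and a fractional part, in the way x87 code typically separates a scaling step from a fractional exponential step.

```python
import math

LOG2_10 = math.log2(10.0)   # the constant FLDL2T pushes, at double precision here

def pow10_via_pow2(x: float) -> float:
    """Compute 10**x as 2**(x * log2(10))."""
    scaled = x * LOG2_10
    whole = math.floor(scaled)            # integer part: a power-of-two scale
    frac = scaled - whole                 # fractional part in [0, 1)
    return math.ldexp(2.0 ** frac, whole) # (2**frac) * 2**whole
```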

#### **Related Instructions**

FLD, FLD1, FLDZ, FLDPI, FLDL2E, FLDLG2, FLDLN2

#### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value | Description                                                      |
|-----------------------------------------------------------------------------------------------------|-------|------------------------------------------------------------------|
| C0                                                                                                  | U     |                                                                  |
| C1                                                                                                  | 0     | No x87 stack fault occurred.                                     |
|                                                                                                     | 1     | x87 stack overflow, if an x87 register stack fault was detected. |
| C2                                                                                                  | U     |                                                                  |
| C3                                                                                                  | U     |                                                                  |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |       |                                                                  |

### **Exceptions**

| Exception                                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|--------------------------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point<br>exception pending,<br>#MF              | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                        |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |           |                                                                                              |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | Х    | Х               | Х         | An x87 stack overflow occurred.                                                              |


### FLDLG2

# Floating-Point Load Log<sub>10</sub> 2

Pushes  $\log_{10} 2$  onto the x87 register stack. The value in ST(0) is the result, in double-extended-precision format, of rounding an internal 66-bit constant according to the setting of the RC field in the x87 control word register.

| Mnemonic | Opcode | Description                                           |
|----------|--------|-------------------------------------------------------|
| FLDLG2   | D9 EC  | Push log <sub>10</sub> 2 onto the x87 register stack. |
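
A typical reason to load this constant, assumed here for illustration, is to convert a base-2 logarithm into a base-10 logarithm through the identity $\log_{10} x = \log_{10} 2 \cdot \log_2 x$. The sketch below uses double precision and a hypothetical function name, `log10_via_log2`. It mirrors a multiply-by-constant applied to a base-2 logarithm.

```python
import math

LOG10_2 = math.log10(2.0)   # the constant FLDLG2 pushes, at double precision here

def log10_via_log2(x: float) -> float:
    """Compute log10(x) as log10(2) * log2(x), for x > 0."""
    return LOG10_2 * math.log2(x)
```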

#### **Related Instructions**

FLD, FLD1, FLDZ, FLDPI, FLDL2T, FLDL2E, FLDLN2

#### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value | Description                                                      |
|-----------------------------------------------------------------------------------------------------|-------|------------------------------------------------------------------|
| C0                                                                                                  | U     |                                                                  |
| C1                                                                                                  | 0     | No x87 stack fault occurred.                                     |
|                                                                                                     | 1     | x87 stack overflow, if an x87 register stack fault was detected. |
| C2                                                                                                  | U     |                                                                  |
| C3                                                                                                  | U     |                                                                  |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |       |                                                                  |

### **Exceptions**

| Exception                                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|--------------------------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point<br>exception pending,<br>#MF              | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                        |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |           |                                                                                              |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | Х    | Х               | Х         | An x87 stack overflow occurred.                                                              |


### FLDLN2

# Floating-Point Load Ln 2

Pushes  $\log_{e}2$  onto the x87 register stack. The value in ST(0) is the result, in double-extended-precision format, of rounding an internal 66-bit constant according to the setting of the RC field in the x87 control word register.

| Mnemonic | Opcode | Description                                          |
|----------|--------|------------------------------------------------------|
| FLDLN2   | D9 ED  | Push log <sub>e</sub> 2 onto the x87 register stack. |

#### **Related Instructions**

FLD, FLD1, FLDZ, FLDPI, FLDL2T, FLDL2E, FLDLG2

#### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value | Description                                                      |
|-----------------------------------------------------------------------------------------------------|-------|------------------------------------------------------------------|
| C0                                                                                                  | U     |                                                                  |
| C1                                                                                                  | 0     | No x87 stack fault occurred.                                     |
|                                                                                                     | 1     | x87 stack overflow, if an x87 register stack fault was detected. |
| C2                                                                                                  | U     |                                                                  |
| C3                                                                                                  | U     |                                                                  |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |       |                                                                  |

### **Exceptions**

| Exception                                                    | Real | Virtual<br>8086 | Protected    | Cause of Exception                                                                           |
|--------------------------------------------------------------|------|-----------------|--------------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | Х    | Х               | Х            | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point<br>exception pending,<br>#MF              | Х    | Х               | Х            | An unmasked x87 floating-point exception was pending.                                        |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |              |                                                                                              |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | Х    | Х               | Х            | An x87 stack overflow occurred.                                                              |


### **FLDPI**

# **Floating-Point Load Pi**

Pushes  $\pi$  onto the x87 register stack. The value in ST(0) is the result, in double-extended-precision format, of rounding an internal 66-bit constant according to the setting of the RC field in the x87 control word register.

| Mnemonic | Opcode | Description                             |
|----------|--------|-----------------------------------------|
| FLDPI    | D9 EB  | Push $\pi$ onto the x87 register stack. |
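
The rounding described above can be illustrated with a short Python sketch. It is a model under stated assumptions, not a description of the hardware: it assumes the internal 66-bit constant is π truncated to 66 significant bits (2 integer bits followed by 64 fraction bits), and it names the rounding modes with strings instead of the 2-bit RC encoding. The names `PI_66` and `round_to_64_bits` are hypothetical.

```python
from fractions import Fraction
import math

# Assumption: the internal constant is pi truncated to 66 significant bits.
# pi = 11.0010...b, so 2 integer bits plus 64 fraction bits give 66 bits.
PI_DIGITS = "3.14159265358979323846264338327950288"
PI_66 = math.floor(Fraction(PI_DIGITS) * 2**64)


def round_to_64_bits(value66: int, rc: str) -> int:
    """Round a positive 66-bit significand to 64 bits.

    rc is one of "nearest", "down", "up", "zero" (illustrative names
    for the four RC settings).
    """
    kept, dropped = value66 >> 2, value66 & 0b11
    if rc in ("down", "zero"):
        # For a positive value, rounding down and truncating agree.
        return kept
    if rc == "up":
        return kept + (1 if dropped else 0)
    if rc == "nearest":
        # Round to nearest, ties to even (a tie is dropped == 0b10).
        if dropped > 2 or (dropped == 2 and kept & 1):
            return kept + 1
        return kept
    raise ValueError(f"unknown rounding mode: {rc}")


for mode in ("nearest", "down", "up", "zero"):
    print(mode, hex(round_to_64_bits(PI_66, mode)))
```

Under these assumptions, the modes that round upward differ from the truncating modes by one unit in the last place of the 64-bit significand, which is why the RC setting matters when FLDPI is used.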

### **Related Instructions**

FLD, FLD1, FLDZ, FLDL2T, FLDL2E, FLDLG2, FLDLN2

#### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value | Description                                                      |
|-----------------------------------------------------------------------------------------------------|-------|------------------------------------------------------------------|
| C0                                                                                                  | U     |                                                                  |
| C1                                                                                                  | 0     | No x87 stack fault occurred.                                     |
|                                                                                                     | 1     | x87 stack overflow, if an x87 register stack fault was detected. |
| C2                                                                                                  | U     |                                                                  |
| C3                                                                                                  | U     |                                                                  |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |       |                                                                  |

### **Exceptions**

| Exception                                                    | Real | Virtual<br>8086 | Protected    | Cause of Exception                                                                           |
|--------------------------------------------------------------|------|-----------------|--------------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | X    | X               | X            | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point<br>exception pending,<br>#MF              | X    | X               | X            | An unmasked x87 floating-point exception was pending.                                        |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |              |                                                                                              |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | Х    | Х               | Х            | An x87 stack overflow occurred.                                                              |


### **FLDZ**

# Floating-Point Load +0.0

Pushes +0.0 onto the x87 register stack.

| Mnemonic | Opcode | Description                            |
|----------|--------|----------------------------------------|
| FLDZ     | D9 EE  | Push zero onto the x87 register stack. |

### **Related Instructions**

FLD, FLD1, FLDPI, FLDL2T, FLDL2E, FLDLG2, FLDLN2

### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value                                                            | Description                  |
|-----------------------------------------------------------------------------------------------------|------------------------------------------------------------------|------------------------------|
| C0                                                                                                  | U                                                                |                              |
| C1                                                                                                  | 0                                                                | No x87 stack fault occurred. |
|                                                                                                     | 1                                                                | x87 stack overflow, if an x87 register stack fault was detected. |
| C2                                                                                                  | U                                                                |                              |
| C3                                                                                                  | U                                                                |                              |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |                                                                  |                              |

### **Exceptions**

| Exception                                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|--------------------------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point<br>exception pending,<br>#MF              | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                        |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |           |                                                                                              |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | Х    | Х               | X         | An x87 stack overflow occurred.                                                              |


# FMUL FMULP FIMUL

### **Floating-Point Multiply**

Multiplies the value in a floating-point register by the value in a memory location or another stack register and stores the result in the first register. The instruction converts a single-precision or double-precision value in memory to double-extended-precision format before multiplying.

If one operand is specified, the instruction multiplies the value in the ST(0) register by the value in the specified memory location and stores the result in the ST(0) register.

If two operands are specified, the instruction multiplies the value in the ST(0) register by the value in another specified floating-point register and stores the result in the register specified in the first operand.

The FMULP instruction pops the x87 stack after storing the product. The no-operand version of the FMULP instruction multiplies the value in the ST(1) register by the value in the ST(0) register and stores the product in the ST(1) register.

The FIMUL instruction converts a short-integer or word-integer value in memory to double-extended-precision format, multiplies it by the value in ST(0), and stores the product in ST(0).

| Mnemonic          | Opcode          | Description                                                            |
|-------------------|-----------------|------------------------------------------------------------------------|
| FMUL ST(0),ST(i)  | D8 C8+i         | Replace ST(0) with ST(0) * ST(i).                                      |
| FMUL ST(i),ST(0)  | DC C8+i         | Replace ST(i) with ST(0) * ST(i).                                      |
| FMUL mem32real    | D8 /1           | Replace ST(0) with mem32real * ST(0).                                  |
| FMUL mem64real    | DC /1           | Replace ST(0) with mem64real * ST(0).                                  |
| FMULP             | DE C9           | Replace ST(1) with ST(0) * ST(1), and pop the x87 register stack.      |
| FMULP ST(i),ST(0) | DE C8+i         | Replace ST(i) with ST(0) * ST(i), and pop the x87 register stack.      |
| FIMUL mem16int    | DE /1           | Replace ST(0) with mem16int * ST(0).                                   |
| FIMUL mem32int    | DA /1           | Replace ST(0) with mem32int * ST(0).                                   |
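
The operand forms above differ only in where the product is written and whether the stack is popped. The following Python sketch models that bookkeeping with a toy register stack. It is an illustration under simplifying assumptions: the class `X87Stack` and its method names are hypothetical, values are Python floats rather than double-extended-precision numbers, and no exceptions or condition codes are modeled.

```python
class X87Stack:
    """Toy model of the x87 register stack; list index 0 is ST(0)."""

    def __init__(self, *values: float):
        self.st = list(values)

    def fmul_reg(self, dst: int, src: int) -> None:
        """FMUL ST(dst),ST(src): one of the two operands must be ST(0)."""
        if 0 not in (dst, src):
            raise ValueError("one operand must be ST(0)")
        self.st[dst] = self.st[dst] * self.st[src]

    def fmul_mem(self, value: float) -> None:
        """FMUL mem32real / mem64real: ST(0) = mem * ST(0)."""
        self.st[0] = value * self.st[0]

    def fimul(self, value: int) -> None:
        """FIMUL mem16int / mem32int: convert the integer, then multiply."""
        self.st[0] = float(value) * self.st[0]

    def fmulp(self, i: int = 1) -> None:
        """FMULP ST(i),ST(0): store the product in ST(i), then pop."""
        self.st[i] = self.st[0] * self.st[i]
        self.st.pop(0)  # the old ST(1) becomes the new ST(0)


stack = X87Stack(2.0, 3.0, 5.0)   # ST(0)=2, ST(1)=3, ST(2)=5
stack.fmul_reg(0, 2)              # FMUL ST(0),ST(2)
stack.fmul_reg(1, 0)              # FMUL ST(1),ST(0)
stack.fmulp()                     # FMULP (no-operand form)
stack.fimul(4)                    # FIMUL with an integer operand of 4
print(stack.st)
```

Note that after FMULP the product is found in ST(0), because the pop renumbers the register that held the result.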


### **Related Instructions**

None.

### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value | Description                                                       |
|-----------------------------------------------------------------------------------------------------|-------|-------------------------------------------------------------------|
| C0                                                                                                  | U     |                                                                   |
| C1                                                                                                  | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                                                                                                     | 0     | Result was rounded down, if a precision exception was detected.   |
|                                                                                                     | 1     | Result was rounded up, if a precision exception was detected.     |
| C2                                                                                                  | U     |                                                                   |
| C3                                                                                                  | U     |                                                                   |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |       |                                                                   |

### **Exceptions**

| Exception                                       | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|-------------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                    | X    | X               | X         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS                                      | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                      |
| General protection,<br>#GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                         |
|                                                 |      |                 | Х         | A null data segment was used to reference memory.                                            |
| Page fault, #PF                                 |      | X               | Х         | A page fault resulted from the execution of the instruction.                                 |
| Alignment check, #AC                            |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.            |
| x87 floating-point<br>exception pending,<br>#MF | X    | X               | X         | An unmasked x87 floating-point exception was pending.                                        |
| x87 Floating-Point Exception Generated, #MF |      |                 |           |                                                                                              |
| Invalid-operation exception (IE)                | X    | X               | X         | A source operand was an SNaN value or an unsupported format.                                 |
|                                                 | X    | X               | X         | ±infinity was multiplied by ±zero.                                                           |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X | X          | X         | An x87 stack underflow occurred.                                                             |
| Denormalized-operand<br>exception (DE)          | X    | X               | X         | A source operand was a denormal value.                                                       |
| Overflow exception (OE)                         | X    | X               | X         | A rounded result was too large to fit into the format of the destination operand.            |
| Underflow exception (UE)                        | X    | X               | X         | A rounded result was too small to fit into the format of the destination operand.            |
| Precision exception (PE)                        | X    | X               | X         | A result could not be represented exactly in the destination format.                         |


### **FNOP**

# **Floating-Point No Operation**

Performs no operation. This instruction affects only the rIP register. It does not otherwise affect the processor context.

| Mnemonic | Opcode | Description           |
|----------|--------|-----------------------|
| FNOP     | D9 D0  | Perform no operation. |

#### **Related Instructions**

FWAIT, NOP

#### **rFLAGS Affected**

None

#### **x87 Condition Code**

None

### **Exceptions**

| Exception                                       | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|-------------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                    | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point<br>exception pending,<br>#MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                        |


### **FPATAN**

# **Floating-Point Partial Arctangent**

Computes the arctangent of the ordinate (Y) in ST(1) divided by the abscissa (X) in ST(0), which is the angle in radians between the X axis and the radius vector from the origin to the point (X,Y). It then stores the result in ST(1) and pops the x87 register stack. The resulting value has the same sign as the ordinate value and a magnitude less than or equal to  $\pi$ .

There is no restriction on the range of values that FPATAN can accept. Table 2-2 shows the results obtained when computing the arctangent of various classes of numbers, assuming that underflow does not occur:

**Table 2-2.** Computing Arctangent of Numbers

| Y (ST(1)) ↓ / X (ST(0)) → | -∞    | -Finite      | -0   | +0   | +Finite      | +∞   | NaN |
|---------------------------|-------|--------------|------|------|--------------|------|-----|
| -∞                        | -3π/4 | -π/2         | -π/2 | -π/2 | -π/2         | -π/4 | NaN |
| -Finite                   | -π    | -π to -π/2   | -π/2 | -π/2 | -π/2 to -0   | -0   | NaN |
| -0                        | -π    | -π           | -π   | -0   | -0           | -0   | NaN |
| +0                        | +π    | +π           | +π   | +0   | +0           | +0   | NaN |
| +Finite                   | +π    | +π to +π/2   | +π/2 | +π/2 | +π/2 to +0   | +0   | NaN |
| +∞                        | +3π/4 | +π/2         | +π/2 | +π/2 | +π/2         | +π/4 | NaN |
| NaN                       | NaN   | NaN          | NaN  | NaN  | NaN          | NaN  | NaN |

| Mnemonic | Opcode | Description                                                                             |
|----------|--------|-----------------------------------------------------------------------------------------|
| FPATAN   | D9 F3  | Compute arctan(ST(1)/ST(0)), store the result in ST(1), and pop the x87 register stack. |
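
The special-value behavior in Table 2-2 matches the usual two-argument arctangent. As a hedged illustration, the Python sketch below checks a few cells of the table against `math.atan2`, which takes its arguments in the order (Y, X). The function name `fpatan_model` is hypothetical, and the model works in double precision rather than double-extended precision, so it only illustrates the quadrant and signed-zero rules, not the exact rounded results.

```python
import math

INF, NAN = math.inf, math.nan


def fpatan_model(y: float, x: float) -> float:
    """Model of the FPATAN result: y plays the role of ST(1), x of ST(0)."""
    return math.atan2(y, x)


# A few cells of Table 2-2, written as (Y, X, expected angle).
cases = [
    (-INF, -INF, -3 * math.pi / 4),
    (-0.0, -1.0, -math.pi),
    (+0.0, -0.0, +math.pi),
    (+1.0, +0.0, +math.pi / 2),
    (+INF, +INF, +math.pi / 4),
]
for y, x, expected in cases:
    assert math.isclose(fpatan_model(y, x), expected)

# The sign of a zero ordinate carries through to a zero result.
assert math.copysign(1.0, fpatan_model(-0.0, +1.0)) == -1.0
assert math.copysign(1.0, fpatan_model(+0.0, +INF)) == +1.0

# A NaN in either operand produces a NaN.
assert math.isnan(fpatan_model(NAN, 1.0))
assert math.isnan(fpatan_model(1.0, NAN))
```

Because the result takes its sign from the ordinate, the angle for a point on the negative X axis is +π or -π depending on whether Y is +0 or -0.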

#### **Related Instructions**

FCOS, FPTAN, FSIN, FSINCOS

#### **rFLAGS Affected**

None


#### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value | Description                                                       |
|-----------------------------------------------------------------------------------------------------|-------|-------------------------------------------------------------------|
| C0                                                                                                  | U     |                                                                   |
| C1                                                                                                  | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                                                                                                     | 0     | Result was rounded down, if a precision exception was detected.   |
|                                                                                                     | 1     | Result was rounded up, if a precision exception was detected.     |
| C2                                                                                                  | U     |                                                                   |
| C3                                                                                                  | U     |                                                                   |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |       |                                                                   |

#### **Exceptions**

| Exception                                                    | Real | Virtual<br>8086 | Protected   | Cause of Exception                                                                          |
|--------------------------------------------------------------|------|-----------------|-------------|---------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | X    | X               | X           | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) is set to 1. |
| x87 floating-point<br>exception pending,<br>#MF              | X    | X               | X           | An unmasked x87 floating-point exception was pending.                                       |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |             |                                                                                             |
| Invalid-operation exception (IE)                             | X    | X               | X           | A source operand was an SNaN value or an unsupported format.                                |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X    | X               | X           | An x87 stack underflow occurred.                                                            |
| Denormalized-operand<br>exception (DE)                       | X    | X               | X           | A source operand was a denormal value.                                                      |
| Underflow exception (UE)                                     | Х    | Х               | Х           | A rounded result was too small to fit into the format of the destination operand.           |
| Precision exception (PE)                                     | Х    | Х               | Х           | A result could not be represented exactly in the destination format.                        |

### **FPREM**

## **Floating-Point Partial Remainder**

Computes the exact remainder obtained by dividing the value in ST(0) by that in ST(1), and stores the result in ST(0). It computes the remainder by an iterative subtract-and-shift long division algorithm in which one quotient bit is calculated in each iteration.

If the exponent difference between ST(0) and ST(1) is less than 64, the instruction computes all integer bits of the quotient, guaranteeing that the remainder is less in magnitude than the divisor in ST(1). If the exponent difference is equal to or greater than 64, it computes only the subset of integer quotient bits numbering between 32 and 63, returns a partial remainder, and sets the C2 condition code bit to 1.

FPREM is supported for software that was written for early x87 coprocessors. Unlike the FPREM1 instruction, FPREM does not compute the partial remainder as specified in IEEE Standard 754.

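| Mnemonic | Opcode | Description                                                                            |
|----------|--------|----------------------------------------------------------------------------------------|
| FPREM    | D9 F8  | Compute the remainder of the division of ST(0) by ST(1) and store the result in ST(0). |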

#### Action

```
ExpDiff = Exponent(ST(0)) - Exponent(ST(1))
IF (ExpDiff < 0)
{
    SW.C2 = 0
    {SW.C0, SW.C3, SW.C1} = 0
}
ELSIF (ExpDiff < 64)
{
    Quotient = Floor(ST(0)/ST(1))
    ST(0) = ST(0) - (ST(1) * Quotient)
    SW.C2 = 0
    {SW.C0, SW.C3, SW.C1} = Quotient mod 8
}
ELSE
{
    N = 32 + (ExpDiff mod 32)
    Quotient = Floor((ST(0)/ST(1))/2^(ExpDiff-N))
    ST(0) = ST(0) - (ST(1) * Quotient * 2^(ExpDiff-N))
    SW.C2 = 1
    {SW.C0, SW.C3, SW.C1} = 0
}
```
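
The action above can be traced with a small Python model. This is a sketch under stated assumptions, not the processor's implementation: it handles only positive finite operands, uses Python floats (double precision) in place of double-extended-precision values, treats a zero dividend as already reduced, and does exact arithmetic through `fractions.Fraction`. The names `exponent`, `fprem_step`, and `fprem` are hypothetical. The loop in `fprem` reflects the common software pattern of re-executing the instruction while C2 is 1.

```python
from fractions import Fraction
import math


def exponent(x: float) -> int:
    """Unbiased binary exponent e with 2**e <= |x| < 2**(e + 1)."""
    return math.frexp(x)[1] - 1


def fprem_step(st0: float, st1: float):
    """One FPREM execution. Returns (new_st0, c2, (c0, c3, c1))."""
    exp_diff = exponent(st0) - exponent(st1)
    if st0 == 0.0 or exp_diff < 0:
        # Dividend is already smaller than the divisor: nothing to do.
        return st0, 0, (0, 0, 0)
    a, b = Fraction(st0), Fraction(st1)
    if exp_diff < 64:
        quotient = math.floor(a / b)
        remainder = a - b * quotient
        low = quotient % 8  # {C0, C3, C1} = quotient bits 2, 1, 0
        return float(remainder), 0, ((low >> 2) & 1, (low >> 1) & 1, low & 1)
    # Partial reduction: retire only N quotient bits, with 32 <= N <= 63.
    n = 32 + exp_diff % 32
    scale = Fraction(2) ** (exp_diff - n)
    quotient = math.floor(a / b / scale)
    remainder = a - b * quotient * scale
    return float(remainder), 1, (0, 0, 0)


def fprem(st0: float, st1: float):
    """Repeat the step until C2 reports a complete remainder."""
    c2, cc = 1, (0, 0, 0)
    while c2:
        st0, c2, cc = fprem_step(st0, st1)
    return st0, cc


print(fprem_step(10.0, 3.0))       # complete in one step
print(fprem_step(2.0 ** 70, 3.0))  # exponent difference of 69: partial result
print(fprem(2.0 ** 70, 3.0))       # looping on C2 finishes the reduction
```

Each partial step lowers the exponent difference by at least 32, so the loop terminates after a bounded number of iterations.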


### **Related Instructions**

FPREM1, FABS, FRNDINT, FXTRACT, FCHS

### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value | Description                                                                                          |
|-----------------------------------------------------------------------------------------------------|-------|------------------------------------------------------------------------------------------------------|
| C0                                                                                                  | M     | Set equal to the value of bit 2 of the quotient.                                                     |
| C1                                                                                                  | 0     | x87 stack underflow, if an x87 register stack fault was detected.                                    |
|                                                                                                     | M     | Set equal to the value of bit 0 of the quotient, if there was no fault.                              |
| C2                                                                                                  | 0     | FPREM generated the partial remainder.                                                               |
|                                                                                                     | 1     | The source operands differed by more than a factor of 2<sup>64</sup>, so the result is incomplete. |
| C3                                                                                                  | M     | Set equal to the value of bit 1 of the quotient.                                                     |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |       |                                                                                                      |

### **Exceptions**

| Exception                                                    | Real | Virtual<br>8086 | Protected   | Cause of Exception                                                                          |
|--------------------------------------------------------------|------|-----------------|-------------|---------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | X    | X               | X           | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) is set to 1. |
| x87 floating-point<br>exception pending,<br>#MF              | X    | X               | X           | An unmasked x87 floating-point exception was pending.                                       |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |             |                                                                                             |
| Invalid-operation exception (IE)                             | X    | X               | X           | A source operand was an SNaN value or an unsupported format.                                |
|                                                              | X    | X               | X           | ST(0) was ±infinity.                                                                        |
|                                                              | X    | X               | X           | ST(0) and ST(1) were both ±zero.                                                            |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X    | X               | X           | An x87 stack underflow occurred.                                                            |
| Zero-divide exception (ZE)                                   | X    | X               | X           | ST(1) was ±zero and ST(0) was not ±zero or ±infinity.                                       |
| Denormalized-operand<br>exception (DE)                       | X    | X               | X           | A source operand was a denormal value.                                                      |
| Underflow exception (UE)                                     | X    | X               | X           | A rounded result was too small to fit into the format of the destination operand.           |


### **FPREM1**

## **Floating-Point Partial Remainder**

Computes the IEEE Standard 754 remainder obtained by dividing the value in ST(0) by that in ST(1), and stores the result in ST(0). Unlike FPREM, it rounds the integer quotient to the nearest even integer and returns the remainder corresponding to the back multiply of the rounded quotient.

If the exponent difference between ST(0) and ST(1) is less than 64, the instruction computes all integer as well as additional fractional bits of the quotient to do the rounding. The remainder returned is a complete remainder and is less than or equal to one half of the magnitude of the divisor. If the exponent difference is equal to or greater than 64, it computes only the subset of integer quotient bits numbering between 32 and 63, returns the partial remainder, and sets the C2 condition code bit to 1.

Rounding control has no effect. FPREM1 results are exact.

| Mnemonic | Opcode | Description                                                                                              |
|----------|--------|----------------------------------------------------------------------------------------------------------|
| FPREM1   | D9 F5  | Compute the IEEE standard 754 remainder of the division of ST(0) by ST(1) and store the result in ST(0). |

#### Action

```
ExpDiff = Exponent(ST(0)) - Exponent(ST(1))
IF (ExpDiff < 0)
{
    SW.C2 = 0
    {SW.C0, SW.C3, SW.C1} = 0
}
ELSIF (ExpDiff < 64)
{
    Quotient = Integer obtained by rounding (ST(0)/ST(1))
               to nearest even integer
    ST(0) = ST(0) - (ST(1) * Quotient)
    SW.C2 = 0
    {SW.C0, SW.C3, SW.C1} = Quotient mod 8
}
ELSE
{
    N = 32 + (ExpDiff mod 32)
    Quotient = Floor((ST(0)/ST(1))/2^(ExpDiff-N))
    ST(0) = ST(0) - (ST(1) * Quotient * 2^(ExpDiff-N))
    SW.C2 = 1
    {SW.C0, SW.C3, SW.C1} = 0
}
```
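
The difference from FPREM lies in how the quotient is rounded in the complete-remainder branch. The Python sketch below models only that branch (exponent difference less than 64) for positive finite operands. It is an illustration under assumptions: it uses Python floats instead of double-extended-precision values, and the name `fprem1_complete` is hypothetical. It relies on the documented behavior of `round()` on a `fractions.Fraction`, which rounds halves to the even integer.

```python
from fractions import Fraction
import math


def fprem1_complete(st0: float, st1: float):
    """Complete-remainder branch of the FPREM1 action.

    Returns (remainder, (c0, c3, c1)) for positive finite operands.
    """
    a, b = Fraction(st0), Fraction(st1)
    quotient = round(a / b)        # nearest integer, ties to even
    remainder = a - b * quotient   # exact, and may be negative
    low = quotient % 8             # {C0, C3, C1} = quotient bits 2, 1, 0
    return float(remainder), ((low >> 2) & 1, (low >> 1) & 1, low & 1)


# 5/2 = 2.5 is a tie and rounds to 2; 7/2 = 3.5 is a tie and rounds to 4.
for st0, st1 in [(5.0, 2.0), (7.0, 2.0), (10.0, 3.0), (11.0, 3.0)]:
    remainder, cc = fprem1_complete(st0, st1)
    # The magnitude never exceeds half of the divisor.
    assert abs(remainder) <= st1 / 2
    # math.remainder computes the IEEE 754 remainder in double precision.
    assert remainder == math.remainder(st0, st1)
    print(st0, st1, remainder, cc)
```

Because the quotient is rounded to nearest rather than toward zero, the remainder can be negative even when both operands are positive, which does not happen in the corresponding FPREM branch.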


### **Related Instructions**

FPREM, FABS, FRNDINT, FXTRACT, FCHS

### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value | Description                                                                                          |
|-----------------------------------------------------------------------------------------------------|-------|------------------------------------------------------------------------------------------------------|
| C0                                                                                                  | M     | Set equal to the value of bit 2 of the quotient.                                                     |
| C1                                                                                                  | 0     | x87 stack underflow, if an x87 register stack fault was detected.                                    |
|                                                                                                     | M     | Set equal to the value of bit 0 of the quotient, if there was no fault.                              |
| C2                                                                                                  | 0     | FPREM1 generated the partial remainder.                                                              |
|                                                                                                     | 1     | The source operands differed by more than a factor of 2<sup>64</sup>, so the result is incomplete. |
| C3                                                                                                  | M     | Set equal to the value of bit 1 of the quotient.                                                     |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |       |                                                                                                      |

#### **Exceptions**

| Exception | Real | Virtual<br>8086 | Protected | Cause of Exception |
|-----------|------|-----------------|-----------|--------------------|
| Device not available,<br>#NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) is set to 1. |
| x87 floating-point<br>exception pending,<br>#MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| x87 Floating-Point Exception Generated, #MF | | | | |
| Invalid-operation exception (IE) | X | X | X | A source operand was an SNaN value or an unsupported format. |
| | X | X | X | ST(0) was ±infinity. |
| | X | X | X | ST(0) and ST(1) were both ±zero. |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X | X | X | An x87 stack underflow occurred. |
| Zero-divide exception (ZE) | X | X | X | ST(1) was ±0 and ST(0) was not ±zero or ±infinity. |
| Denormalized-operand exception (DE) | X | X | X | A source operand was a denormal value. |
| Underflow exception (UE) | X | X | X | A rounded result was too small to fit into the format of the destination operand. |

### **FPTAN**

# **Floating-Point Partial Tangent**

Computes the tangent of the radian value in ST(0), stores the result in ST(0), and pushes a value of 1.0 onto the x87 register stack.

The source value must be between $-2^{63}$ and $+2^{63}$ radians. To convert a source value outside this range to an equivalent acceptable value, use the FPREM instruction with a divisor of $2\pi$ to reduce it. If the source value lies outside the specified range, the instruction sets the C2 bit of the x87 status word to 1 and does not change the value in ST(0).

| Mnemonic | Opcode | Description                                                                         |
|----------|--------|-------------------------------------------------------------------------------------|
| FPTAN    | D9 F2  | Replace ST(0) with the tangent of ST(0), then push 1.0 onto the x87 register stack. |

#### **Related Instructions**

FCOS, FPATAN, FSIN, FSINCOS

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0 | U | |
| C1 | 0 | x87 stack underflow, if an x87 register stack fault was detected. |
| C1 | 1 | x87 stack overflow, if an x87 register stack fault was detected. |
| C1 | 0 | Result was rounded down, if a precision exception was detected. |
| C1 | 1 | Result was rounded up, if a precision exception was detected. |
| C2 | 0 | Source operand was in range. |
| C2 | 1 | Source operand was out of range. |
| C3 | U | |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. | | |

#### **Exceptions**

| Exception | Real | Virtual<br>8086 | Protected | Cause of Exception |
|-----------|------|-----------------|-----------|--------------------|
| Device not available,<br>#NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) is set to 1. |
| x87 floating-point<br>exception pending,<br>#MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| x87 Floating-Point Exception Generated, #MF | | | | |
| Invalid-operation exception (IE) | X | X | X | A source operand was an SNaN value or an unsupported format. |
| | X | X | X | A source operand was ±infinity. |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X | X | X | An x87 stack underflow occurred. |
| | X | X | X | An x87 stack overflow occurred. |
| Denormalized-operand exception (DE) | X | X | X | A source operand was a denormal value. |
| Underflow exception (UE) | X | X | X | A rounded result was too small to fit into the format of the destination operand. |
| Precision exception (PE) | X | X | X | A result could not be represented exactly in the destination format. |

### **FRNDINT**

# **Floating-Point Round to Integer**

Rounds the value in ST(0) to an integer, depending on the setting of the rounding control (RC) field of the x87 control word, and stores the result in ST(0).

If the initial value in ST(0) is $\infty$, the instruction does not change ST(0). If the value in ST(0) is not an integer, the instruction sets the precision exception (PE) bit of the x87 status word to 1.

| Mnemonic | Opcode | Description                                |
|----------|--------|--------------------------------------------|
| FRNDINT  | D9 FC  | Round the contents of ST(0) to an integer. |

#### **Related Instructions**

FABS, FPREM, FXTRACT, FCHS

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0 | U | |
| C1 | 0 | x87 stack underflow, if an x87 register stack fault was detected. |
| C1 | 0 | Result was rounded down, if a precision exception was detected. |
| C1 | 1 | Result was rounded up, if a precision exception was detected. |
| C2 | U | |
| C3 | U | |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. | | |

#### **Exceptions**

| Exception | Real | Virtual<br>8086 | Protected | Cause of Exception |
|-----------|------|-----------------|-----------|--------------------|
| Device not available,<br>#NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) is set to 1. |
| x87 floating-point<br>exception pending,<br>#MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| x87 Floating-Point Exception Generated, #MF | | | | |
| Invalid-operation exception (IE) | X | X | X | A source operand was an SNaN value or an unsupported format. |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X | X | X | An x87 stack underflow occurred. |
| Denormalized-operand exception (DE) | X | X | X | A source operand was a denormal value. |
| Precision exception (PE) | X | X | X | The source operand was not an integral value. |


### **FRSTOR**

# Floating-Point Restore x87 and MMX™ State

Restores the complete x87 state from memory starting at the specified address, as stored by a previous call to FNSAVE. The x87 state occupies 94 or 108 bytes of memory depending on whether the processor is operating in real or protected mode and whether the operand-size attribute is 16-bit or 32-bit. Because the MMX registers are mapped onto the low 64 bits of the x87 floating-point registers, this operation also restores the MMX state.

If FRSTOR results in set exception flags in the loaded x87 status word register, and these exceptions are unmasked in the x87 control word register, a floating-point exception occurs when the next floating-point instruction is executed (except for the no-wait floating-point instructions).

To avoid generating exceptions when loading a new environment, use the FCLEX or FNCLEX instruction to clear the exception flags in the x87 status word before storing that environment.

For details about the memory image restored by FRSTOR, see "Media and x87 Processor State" in volume 2.

| Mnemonic            | Opcode | Description                           |
|---------------------|--------|---------------------------------------|
| FRSTOR mem94/108env | DD /4  | Load the x87 state from mem94/108env. |

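The 94-byte and 108-byte figures can be reproduced with simple arithmetic: the image consists of an environment block followed by the eight 80-bit (10-byte) data registers. The sketch below assumes an environment of seven 16-bit fields (14 bytes) for a 16-bit operand size and seven 32-bit slots (28 bytes) for a 32-bit operand size. Those field counts come from general x87 knowledge rather than from this section, and the function name `x87_image_size` is hypothetical. The authoritative layout is the one described in volume 2.

```c
#include <assert.h>
#include <stddef.h>

enum { X87_DATA_REGS = 8, X87_REG_BYTES = 10 };

/* Size in bytes of the memory image used by FNSAVE and FRSTOR.
   operand_size_bits must be 16 or 32. */
static size_t x87_image_size(int operand_size_bits)
{
    size_t env_bytes;

    assert(operand_size_bits == 16 || operand_size_bits == 32);
    env_bytes = (operand_size_bits == 16) ? 14   /* seven 16-bit fields */
                                          : 28;  /* seven 32-bit slots */
    return env_bytes + (size_t)X87_DATA_REGS * X87_REG_BYTES;
}
```

A buffer passed to FRSTOR therefore needs at least `x87_image_size(32)` bytes when the 32-bit format is in use.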
#### **Related Instructions**

FSAVE, FNSAVE, FXSAVE, FXRSTOR

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0 | M | Loaded from memory. |
| C1 | M | Loaded from memory. |
| C2 | M | Loaded from memory. |
| C3 | M | Loaded from memory. |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. | | |

#### **Exceptions**

| Exception | Real | Virtual<br>8086 | Protected | Cause of Exception |
|-----------|------|-----------------|-----------|--------------------|
| Device not available,<br>#NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection,<br>#GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |
| x87 floating-point<br>exception pending,<br>#MF | X | X | X | An unmasked x87 floating-point exception was pending. |

### **FSAVE (FNSAVE)**

# Floating-Point Save x87 and MMX™ State

Stores the complete x87 state to memory starting at the specified address and reinitializes the x87 state. The x87 state requires 94 or 108 bytes of memory, depending upon whether the processor is operating in real or protected mode and whether the operand-size attribute is 16-bit or 32-bit. Because the MMX registers are mapped onto the low 64 bits of the x87 floating-point registers, this operation also saves the MMX state. For details about the memory image saved by FNSAVE, see "Media and x87 Processor State" in volume 2.

The FNSAVE instruction does not wait for pending unmasked x87 floating-point exceptions to be processed.

Assemblers usually provide an FSAVE macro that expands into the instruction sequence

    WAIT                  ; Opcode 9B
    FNSAVE destination    ; Opcode DD /6

The WAIT (9Bh) instruction checks for pending x87 exceptions and calls an exception handler, if necessary. The FNSAVE instruction then stores the x87 state to the specified destination.

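The relationship between the two encodings can be made concrete by assembling the bytes by hand: FSAVE is the WAIT byte 9Bh followed by the FNSAVE encoding, and the "/6" notation means that the reg field of the ModRM byte holds 6. The sketch below is a hypothetical helper (`encode_save` is not an assembler API) that handles only the simplest addressing form, mod = 00b with a plain base register. It rejects r/m values 4 and 5 because in 32-bit and 64-bit addressing those require a SIB byte or a displacement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the encoding of FSAVE (with_wait != 0) or FNSAVE (with_wait == 0)
   for a register-indirect operand into out, which must hold 3 bytes.
   Returns the number of bytes written, or 0 if rm is not supported here. */
static size_t encode_save(uint8_t *out, unsigned rm, int with_wait)
{
    size_t n = 0;

    assert(out != NULL);
    if (rm > 7 || rm == 4 || rm == 5)
        return 0;                          /* needs SIB or displacement */
    if (with_wait)
        out[n++] = 0x9B;                   /* WAIT: check pending exceptions */
    out[n++] = 0xDD;                       /* FNSAVE opcode byte */
    out[n++] = (uint8_t)((6u << 3) | rm);  /* ModRM: mod=00b, reg=6, r/m=rm */
    return n;
}
```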
| Mnemonic            | Opcode   | Description                                                                                                                        |
|---------------------|----------|------------------------------------------------------------------------------------------------------------------------------------|
| FSAVE mem94/108env  | 9B DD /6 | Copy the x87 state to <i>mem94/108env</i> after checking for pending floating-point exceptions, then reinitialize the x87 state.   |
| FNSAVE mem94/108env | DD /6    | Copy the x87 state to <i>mem94/108env</i> without checking for pending floating-point exceptions, then reinitialize the x87 state. |

#### **Related Instructions**

FRSTOR, FXSAVE, FXRSTOR

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0                 | 0     |             |
| C1                 | 0     |             |
| C2                 | 0     |             |
| C3                 | 0     |             |

#### **Exceptions**

| Exception | Real | Virtual<br>8086 | Protected | Cause of Exception |
|-----------|------|-----------------|-----------|--------------------|
| Device not available,<br>#NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection,<br>#GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | The destination operand was in a nonwritable segment. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |

### **FSCALE**

## **Floating-Point Scale**

Multiplies the floating-point value in ST(0) by 2 to the power of the integer portion of the floating-point value in ST(1).

This instruction provides an efficient method of multiplying (or dividing) by integral powers of 2 because, typically, it simply adds the integer value to the exponent of the value in ST(0), leaving the significand unaffected. However, if the value in ST(0) is a denormal value, the mantissa is also modified and the result may end up being a normalized number. Likewise, if overflow or underflow results from a scale operation, the mantissa of the resulting value will be different from that of the source.

The FSCALE instruction performs the reverse operation to that of the FXTRACT instruction.

| Mnemonic | Opcode | Description                                           |
|----------|--------|-------------------------------------------------------|
| FSCALE   | D9 FD  | Replace ST(0) with ST(0) * $2^{\text{rndint}(ST(1))}$ |

#### **Related Instructions**

FSQRT, FPREM, FPREM1, FRNDINT, FXTRACT, FABS, FCHS

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0 | U | Undefined. |
| C1 | 0 | x87 stack underflow, if an x87 register stack fault was detected. |
| C1 | 0 | Result was rounded down, if a precision exception was detected. |
| C1 | 1 | Result was rounded up, if a precision exception was detected. |
| C2 | U | Undefined. |
| C3 | U | Undefined. |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. | | |

#### **Exceptions**

| Exception | Real | Virtual<br>8086 | Protected | Cause of Exception |
|-----------|------|-----------------|-----------|--------------------|
| Device not available,<br>#NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) is set to 1. |
| x87 floating-point<br>exception pending,<br>#MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| x87 Floating-Point Exception Generated, #MF | | | | |
| Invalid-operation exception (IE) | X | X | X | A source operand was an SNaN value or an unsupported format. |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X | X | X | An x87 stack underflow occurred. |
| Denormalized-operand exception (DE) | X | X | X | A source operand was a denormal value. |
| Overflow exception (OE) | X | X | X | A rounded result was too large to fit into the format of the destination operand. |
| Underflow exception (UE) | X | X | X | A rounded result was too small to fit into the format of the destination operand. |
| Precision exception (PE) | X | X | X | A result could not be represented exactly in the destination format. |

### **FSIN**

# **Floating-Point Sine**

Computes the sine of the radian value in ST(0) and stores the result in ST(0).

The source value must be in the range $-2^{63}$ to $+2^{63}$ radians. If the value lies outside this range, the instruction sets the C2 bit in the x87 status word to 1 and does not change the value in ST(0). To convert a source value outside the range $-2^{63}$ to $+2^{63}$ to an equivalent acceptable value, use the FPREM instruction with a divisor of $2\pi$ to reduce it.

| Mnemonic | Opcode | Description                           |
|----------|--------|---------------------------------------|
| FSIN     | D9 FE  | Replace ST(0) with the sine of ST(0). |

#### **Related Instructions**

FCOS, FPATAN, FPTAN, FSINCOS

#### **rFLAGS Affected**

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0 | U | |
| C1 | 0 | x87 stack underflow, if an x87 register stack fault was detected. |
| C1 | 0 | Result was rounded down, if a precision exception was detected. |
| C1 | 1 | Result was rounded up, if a precision exception was detected. |
| C2 | 0 | Source operand was in range. |
| C2 | 1 | Source operand was out of range. |
| C3 | U | |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. | | |

#### **Exceptions**

| Exception | Real | Virtual<br>8086 | Protected | Cause of Exception |
|-----------|------|-----------------|-----------|--------------------|
| Device not available,<br>#NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point<br>exception pending,<br>#MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| x87 Floating-Point Exception Generated, #MF | | | | |
| Invalid-operation exception (IE) | X | X | X | A source operand was an SNaN value or an unsupported format. |
| | X | X | X | A source operand was ±infinity. |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X | X | X | An x87 stack underflow occurred. |
| Denormalized-operand exception (DE) | X | X | X | A source operand was a denormal value. |
| Precision exception (PE) | X | X | X | A result could not be represented exactly in the destination format. |

### **FSINCOS**

# **Floating-Point Sine and Cosine**

Computes the sine and cosine of the value in ST(0), stores the sine in ST(0), and pushes the cosine onto the x87 register stack. The source value must be in the range  $-2^{63}$  to  $+2^{63}$  radians.

If the source operand is outside this range, the instruction sets the C2 bit in the x87 status word to 1 and does not change the value in ST(0). To convert a source value outside the range $-2^{63}$ to $+2^{63}$ to an equivalent acceptable value, use the FPREM instruction with a divisor of $2\pi$ to reduce it.

| Mnemonic | Opcode | Description                                                                                      |
|----------|--------|--------------------------------------------------------------------------------------------------|
| FSINCOS  | D9 FB  | Replace ST(0) with the sine of ST(0), then push the cosine of ST(0) onto the x87 register stack. |

#### **Related Instructions**

FCOS, FPATAN, FPTAN, FSIN

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0 | U | |
| C1 | 0 | x87 stack underflow, if an x87 register stack fault was detected. |
| C1 | 1 | x87 stack overflow, if an x87 register stack fault was detected. |
| C1 | 0 | Result was rounded down, if a precision exception was detected. |
| C1 | 1 | Result was rounded up, if a precision exception was detected. |
| C2 | 0 | Source operand was in range. |
| C2 | 1 | Source operand was out of range. |
| C3 | U | |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. | | |

# Exceptions

| Exception                                                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|--------------------------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | X    | X               | X         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point<br>exception pending,<br>#MF              | X    | X               | X         | An unmasked x87 floating-point exception was pending.                                        |
| x87 Floating-Point Exception Generated, #MF                  |      |                 |           |                                                                                              |
| Invalid-operation exception (IE)                             | X    | X               | X         | A source operand was an SNaN value or an unsupported format.                                 |
|                                                              | X    | X               | X         | A source operand was ±infinity.                                                              |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X    | X               | X         | An x87 stack underflow occurred.                                                             |
|                                                              | X    | X               | X         | An x87 stack overflow occurred.                                                              |
| Denormalized-operand<br>exception (DE)                       | X    | X               | X         | A source operand was a denormal value.                                                       |
| Underflow exception (UE)                                     | X    | X               | X         | A rounded result was too small to fit into the format of the destination operand.            |
| Precision exception (PE)                                     | X    | X               | X         | A result could not be represented exactly in the destination format.                         |

# **FSQRT**

# **Floating-Point Square Root**

Computes the square root of the value in ST(0) and stores the result in ST(0). Taking the square root of +infinity returns +infinity.

| Mnemonic | Opcode | Description                                       |
|----------|--------|---------------------------------------------------|
| FSQRT    | D9 FA  | Replace ST(0) with the square root of ST(0).      |
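
The special cases for FSQRT can be summarized in a short software model. The following Python sketch is illustrative only: `fsqrt_model` is a hypothetical helper, it works in double precision rather than double-extended precision, and returning a NaN for the invalid-operation case assumes the exception is masked. SNaN and unsupported formats are not modeled.

```python
import math


def fsqrt_model(st0):
    """Return (result, ie); ie is True when the invalid-operation case applies."""
    if math.isnan(st0):
        return st0, False      # NaN handling (SNaN versus QNaN) is not modeled
    if st0 < 0.0:              # -0.0 < 0.0 is False, so -zero is accepted
        return math.nan, True  # assumption: masked response is a NaN
    return math.sqrt(st0), False
```

With this model, `fsqrt_model(math.inf)` returns +infinity with no exception, `fsqrt_model(-0.0)` returns -0.0, and `fsqrt_model(-4.0)` flags the invalid-operation case.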

#### **Related Instructions**

FSCALE, FPREM, FPREM1, FRNDINT, FXTRACT, FABS, FCHS

#### **rFLAGS Affected**

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                       |
|--------------------|-------|-------------------------------------------------------------------|
| C0                 | U     |                                                                   |
| C1                 | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                    | 0     | Result was rounded down, if a precision exception was detected.   |
|                    | 1     | Result was rounded up, if a precision exception was detected.     |
| C2                 | U     |                                                                   |
| C3                 | U     |                                                                   |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. | | |

### **Exceptions**

| Exception                                       | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|-------------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                    | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point<br>exception pending,<br>#MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                        |

| Exception                                                    | Real | Virtual<br>8086 | Protected    | Cause of Exception                                                   |
|--------------------------------------------------------------|------|-----------------|--------------|----------------------------------------------------------------------|
| x87 Floating-Point Exception Generated, #MF                  |      |                 |              |                                                                      |
| Invalid-operation exception (IE)                             | X    | X               | X            | A source operand was an SNaN value or an unsupported format.         |
|                                                              | X    | X               | X            | A source operand was a negative value (not including -zero).         |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X    | X               | X            | An x87 stack underflow occurred.                                     |
| Denormalized-operand<br>exception (DE)                       | X    | X               | X            | A source operand was a denormal value.                               |
| Precision exception (PE)                                     | X    | X               | X            | A result could not be represented exactly in the destination format. |

# FST FSTP

# **Floating-Point Store Stack Top**

Copies the value in ST(0) to the specified floating-point register or memory location.

The FSTP instruction pops the x87 stack after copying the value. The instruction FSTP ST(0) is the same as popping the stack with no data transfer.

If the specified destination is a single-precision or double-precision memory location, the instruction converts the value to the appropriate precision format. It does this by truncating the significand of the source value to the width of the memory location and rounding as specified by the rounding mode determined by the RC field of the x87 control word. It also converts the exponent to the width and bias of the destination format.

If the value is too large for the destination format, the instruction sets the overflow exception (OE) bit of the x87 status word. Then, if the overflow exception is unmasked (OM bit cleared to 0 in the x87 control word), the instruction does not perform the store.

If the value is a denormal value, the instruction sets the underflow exception (UE) bit in the x87 status word.

If the value is  $\pm 0$ ,  $\pm \infty$ , or a NaN, the instruction truncates the least significant bits of the significand and exponent to fit the destination location.
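
The conversion and overflow behavior for a single-precision destination can be sketched in software. The Python model below is an illustration under stated assumptions, not the hardware algorithm: `fst_mem32_model` is a hypothetical helper, the source is a Python double rather than a double-extended value, only round-to-nearest is modeled, and returning infinity for a masked overflow is an assumption. Denormal results and the UE flag are not modeled.

```python
import math
import struct


def fst_mem32_model(st0, overflow_masked=True):
    """Model a store of st0 to a mem32real destination.

    Returns (image, flags): image is the 4-byte little-endian encoding,
    or None when the store is suppressed.
    """
    flags = {"OE": False, "PE": False}
    try:
        image = struct.pack("<f", st0)  # converts with round-to-nearest
    except OverflowError:
        flags["OE"] = True
        if not overflow_masked:
            return None, flags  # unmasked overflow: no store is performed
        # Assumption: the masked response under round-to-nearest is infinity.
        return struct.pack("<f", math.copysign(math.inf, st0)), flags
    stored = struct.unpack("<f", image)[0]
    # PE: the stored value differs from the source (NaNs excluded).
    flags["PE"] = (not math.isnan(st0)) and stored != st0
    return image, flags
```

Storing 0.5 is exact and raises no flags, storing 0.1 sets PE because 0.1 has no exact single-precision encoding, and storing 1e39 with `overflow_masked=False` sets OE and returns no image.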

| Mnemonic       | Opcode  | Description                                                                    |
|----------------|---------|--------------------------------------------------------------------------------|
| FST ST(i)      | DD D0+i | Copy the contents of ST(0) to ST(i).                                           |
| FST mem32real  | D9 /2   | Copy the contents of ST(0) to <i>mem32real</i>.                                |
| FST mem64real  | DD /2   | Copy the contents of ST(0) to <i>mem64real</i>.                                |
| FSTP ST(i)     | DD D8+i | Copy the contents of ST(0) to ST(i) and pop the x87 register stack.            |
| FSTP mem32real | D9 /3   | Copy the contents of ST(0) to <i>mem32real</i> and pop the x87 register stack. |
| FSTP mem64real | DD /3   | Copy the contents of ST(0) to <i>mem64real</i> and pop the x87 register stack. |
| FSTP mem80real | DB /7   | Copy the contents of ST(0) to <i>mem80real</i> and pop the x87 register stack. |

### **Related Instructions**

FFREE, FLD, FILD, FIST, FISTP, FBLD, FBSTP

### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                       |
|--------------------|-------|-------------------------------------------------------------------|
| C0                 | U     |                                                                   |
| C1                 | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                    | 1     | x87 stack overflow, if an x87 register stack fault was detected.  |
|                    | 0     | Result was rounded down, if a precision exception was detected.   |
|                    | 1     | Result was rounded up, if a precision exception was detected.     |
| C2                 | U     |                                                                   |
| C3                 | U     |                                                                   |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. | | |

### **Exceptions**

| Exception                                       | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|-------------------------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                    | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS                                      | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                      |
| General protection,<br>#GP                      | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                         |
|                                                 |      |                 | Х         | The destination operand was in a nonwritable segment.                                        |
|                                                 |      |                 | Х         | A null data segment was used to reference memory.                                            |
| Page fault, #PF                                 |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                 |
| Alignment check, #AC                            |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.            |
| x87 floating-point<br>exception pending,<br>#MF | Х    | Х               | Х         | An unmasked x87 floating-point exception was pending.                                        |

| Exception                             | Real | Virtual<br>8086 | Protected   | Cause of Exception                                                                |
|---------------------------------------|------|-----------------|-------------|-----------------------------------------------------------------------------------|
| x87 Floating-Point Exception Generated, #MF                  |      |                 |             |                                                                                   |
| Invalid-operation exception (IE)                             | X    | X               | X           | A source operand was an SNaN value or an unsupported format.                      |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X    | X               | X           | An x87 stack underflow occurred.                                                  |
|                                                              | X    | X               | X           | An x87 stack overflow occurred.                                                   |
| Overflow exception (OE)               | Х    | Х               | Х           | A rounded result was too large to fit into the format of the destination operand. |
| Underflow exception (UE)              | Х    | Х               | Х           | A rounded result was too small to fit into the format of the destination operand. |
| Precision exception (PE)              | Х    | Х               | Х           | A result could not be represented exactly in the destination format.              |

# FSTCW (FNSTCW)

# **Floating-Point Store Control Word**

Stores the x87 control word in the specified 2-byte memory location. The FNSTCW instruction does not check for possible floating-point exceptions before copying the image of the x87 control word.

Assemblers usually provide an FSTCW macro that expands into the instruction sequence:

WAIT ; Opcode 9B

FNSTCW destination ; Opcode D9 /7

The WAIT (9Bh) instruction checks for pending x87 exceptions and calls an exception handler, if necessary. The FNSTCW instruction then stores the state of the x87 control register to the desired destination.

| Mnemonic       | Opcode   | Description                                                                                                            |
|----------------|----------|------------------------------------------------------------------------------------------------------------------------|
| FSTCW mem2env  | 9B D9 /7 | Perform a WAIT (9B) to check for pending floating-point exceptions, then copy the x87 control word to <i>mem2env</i> . |
| FNSTCW mem2env | D9 /7    | Copy the x87 control word to <i>mem2env</i> without checking for floating-point exceptions.                            |
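
Once the control word has been stored to memory, software typically inspects individual fields such as the exception masks and the rounding control (RC). The Python sketch below decodes a stored 16-bit value. It is illustrative only: `decode_control_word` is a hypothetical helper, and the bit positions used (masks in bits 5:0, PC in bits 9:8, RC in bits 11:10) come from the general x87 control-word layout rather than from this instruction description.

```python
MASK_NAMES = ("IM", "DM", "ZM", "OM", "UM", "PM")  # bits 0 through 5


def decode_control_word(cw):
    """Split a 16-bit x87 control word image into its commonly used fields."""
    masks = {name: bool((cw >> bit) & 1) for bit, name in enumerate(MASK_NAMES)}
    return {
        "masks": masks,
        "PC": (cw >> 8) & 0b11,   # precision control
        "RC": (cw >> 10) & 0b11,  # rounding control
    }
```

For the value 037Fh, the model reports all six exceptions masked, PC = 3, and RC = 0.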

#### **Related Instructions**

FSTSW, FNSTSW, FSTENV, FNSTENV

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value | Description |  |  |
|-----------------------------------------------------------------------------------------------------|-------|-------------|--|--|
| C0                                                                                                  | U     |             |  |  |
| C1                                                                                                  | U     |             |  |  |
| C2                                                                                                  | U     |             |  |  |
| C3                                                                                                  | U     |             |  |  |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |       |             |  |  |

#### **Exceptions**

| Exception                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                      |
| General protection,<br>#GP   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                         |
|                              |      |                 | Х         | The destination operand was in a nonwritable segment.                                        |
|                              |      |                 | Х         | A null data segment was used to reference memory.                                            |
| Page fault, #PF              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                 |
| Alignment check, #AC         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.            |

# FSTENV (FNSTENV)

# **Floating-Point Store x87 Environment**

Stores the current x87 environment to memory starting at the specified address, and then masks all floating-point exceptions. The x87 environment consists of the x87 control, status, and tag word registers, the last non-control x87 instruction pointer, the last x87 data pointer, and the opcode of the last completed non-control x87 instruction.

The x87 environment requires a 14-byte or 28-byte area in memory, depending on whether the processor is operating in protected or real mode and whether the operand-size attribute is 16-bit or 32-bit. See "Media and x87 Processor State" in volume 2 for details on how this instruction stores the x87 environment in memory.

The FNSTENV instruction does not check for possible floating-point exceptions before storing the environment.

Assemblers usually provide an FSTENV macro that expands into the instruction sequence:

WAIT ; Opcode 9B

FNSTENV destination ; Opcode D9 /6

The WAIT (9Bh) instruction checks for pending x87 exceptions and calls an exception handler if necessary. The FNSTENV instruction then stores the state of the x87 environment to the specified destination.

Exception handlers often use these instructions because they provide access to the x87 instruction and data pointers. An exception handler typically saves the environment on the stack. The instructions mask all floating-point exceptions after saving the environment to prevent those exceptions from interrupting the exception handler.

| Mnemonic            | Opcode   | Description                                                                                                                                                     |
|---------------------|----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|
| FSTENV mem14/28env  | 9B D9 /6 | Perform a WAIT (9B) to check for pending floating-point exceptions, then copy the x87 environment to <i>mem14/28env</i> and mask the floating-point exceptions. |
| FNSTENV mem14/28env | D9 /6    | Copy the x87 environment to <i>mem14/28env</i> without checking for pending floating-point exceptions, and mask the exceptions.                                 |
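
An exception handler that saves the environment usually needs to pick the image apart afterward. The Python sketch below parses a 28-byte image and models the masking step. It is a hedged illustration: `ENV32`, `parse_env32`, and `control_word_after_store` are hypothetical names, and the field order shown is the commonly documented 32-bit protected-mode layout, which is an assumption here because this section defers the layout details to Volume 2.

```python
import struct

# Assumed 32-bit protected-mode layout: seven little-endian doublewords.
ENV32 = struct.Struct("<HHHHHHIHHIHH")
FIELDS = ("fcw", "_r0", "fsw", "_r1", "ftw", "_r2",
          "fip", "fcs", "opcode", "fdp", "fds", "_r3")


def parse_env32(image):
    """Split a 28-byte environment image into named fields."""
    values = dict(zip(FIELDS, ENV32.unpack(image)))
    values["opcode"] &= 0x07FF  # assumption: 11 opcode bits are stored
    return {k: v for k, v in values.items() if not k.startswith("_")}


def control_word_after_store(fcw):
    """Model the masking of all six exceptions after the image is saved."""
    return fcw | 0x003F
```

`ENV32.size` evaluates to 28, matching the larger of the two area sizes given above. The 14-byte format is not modeled.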

### **Related Instructions**

FLDENV, FSTSW, FNSTSW, FSTCW, FNSTCW

### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value | Description |  |  |
|-----------------------------------------------------------------------------------------------------|-------|-------------|--|--|
| C0                                                                                                  | U     |             |  |  |
| C1                                                                                                  | U     |             |  |  |
| C2                                                                                                  | U     |             |  |  |
| C3                                                                                                  | U     |             |  |  |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |       |             |  |  |

### **Exceptions**

| Exception                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                      |
| General protection,<br>#GP   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                         |
|                              |      |                 | Х         | The destination operand was in a nonwritable segment.                                        |
|                              |      |                 | Х         | A null data segment was used to reference memory.                                            |
| Page fault, #PF              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                 |
| Alignment check, #AC         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.            |

# FSTSW (FNSTSW)

# **Floating-Point Store x87 Status Word**

Stores the current state of the x87 status word register in either the AX register or a specified two-byte memory location. The image of the status word placed in the AX register always reflects the result after the execution of the previous x87 instruction.

The AX form of the instruction is useful for performing conditional branching operations based on the values of x87 condition flags.

The FNSTSW instruction does not check for possible floating-point exceptions before storing the x87 status word.

Assemblers usually provide an FSTSW macro that expands into the instruction sequence:

WAIT ; Opcode 9B

FNSTSW destination ; Opcode DD /7 or DF E0

The WAIT (9Bh) instruction checks for pending x87 exceptions and calls an exception handler if necessary. The FNSTSW instruction then stores the state of the x87 status register to the desired destination.

| Mnemonic       | Opcode   | Description                                                                                                             |
|----------------|----------|-------------------------------------------------------------------------------------------------------------------------|
| FSTSW AX       | 9B DF E0 | Perform a WAIT (9B) to check for pending floating-point exceptions, then copy the x87 status word to the AX register.   |
| FSTSW mem2env  | 9B DD /7 | Perform a WAIT (9B) to check for pending floating-point exceptions, then copy the x87 status word to <i>mem2env</i>.   |
| FNSTSW AX      | DF E0    | Copy the x87 status word to the AX register without checking for pending floating-point exceptions.                     |
| FNSTSW mem2env | DD /7    | Copy the x87 status word to <i>mem2env</i> without checking for pending floating-point exceptions.                      |
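
Conditional branching on the x87 condition flags amounts to testing individual bits of the stored status word. The Python sketch below shows that extraction for a 16-bit image such as the one FNSTSW places in AX. It is illustrative only: `decode_status_word` is a hypothetical helper, and the bit positions (C0 in bit 8, C1 in bit 9, C2 in bit 10, TOP in bits 13:11, C3 in bit 14) come from the general x87 status-word layout rather than from this instruction description.

```python
def decode_status_word(sw):
    """Extract the condition code bits and stack-top pointer from a status word."""
    return {
        "C0": (sw >> 8) & 1,
        "C1": (sw >> 9) & 1,
        "C2": (sw >> 10) & 1,
        "C3": (sw >> 14) & 1,
        "TOP": (sw >> 11) & 0b111,
    }
```

For the value 4500h, the model reports C0 = 1, C2 = 1, and C3 = 1 with C1 = 0 and TOP = 0.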

#### **Related Instructions**

FSTCW, FNSTCW, FSTENV, FNSTENV

#### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code                                                                                  | Value | Description |  |  |
|-----------------------------------------------------------------------------------------------------|-------|-------------|--|--|
| C0                                                                                                  | U     |             |  |  |
| C1                                                                                                  | U     |             |  |  |
| C2                                                                                                  | U     |             |  |  |
| C3                                                                                                  | U     |             |  |  |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. |       |             |  |  |

### **Exceptions**

| Exception                    | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                           |
|------------------------------|------|-----------------|-----------|----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM | Х    | Х               | Х         | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS                   | Х    | Х               | Х         | A memory address exceeded the stack segment limit or was non-canonical.                      |
| General protection,<br>#GP   | Х    | Х               | Х         | A memory address exceeded a data segment limit or was non-canonical.                         |
|                              |      |                 | Х         | The destination operand was in a nonwritable segment.                                        |
|                              |      |                 | Х         | A null data segment was used to reference memory.                                            |
| Page fault, #PF              |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                 |
| Alignment check, #AC         |      | Х               | Х         | An unaligned memory reference was performed while alignment checking was enabled.            |

### FSUB FSUBP FISUB

### **Floating-Point Subtract**

Subtracts the value in a floating-point register or memory location from the value in another register and stores the result in that register.

If no operands are specified, the instruction subtracts the value in ST(0) from that in ST(1) and stores the result in ST(1).

If one operand is specified, it subtracts a floating-point or integer value in memory from the contents of ST(0) and stores the result in ST(0).

If two operands are specified, it subtracts the value in ST(0) from the value in another floating-point register or vice versa.

The FSUBP instruction pops the x87 register stack after performing the subtraction.

The no-operand version of the instruction always pops the register stack. In some assemblers, the mnemonic for this instruction is FSUB rather than FSUBP.

The FISUB instruction converts a signed integer value to double-extended-precision format before performing the subtraction.

| Mnemonic          | Opcode          | Description                                                        |
|-------------------|-----------------|--------------------------------------------------------------------|
| FSUB ST(0),ST(i)  | D8 E0+<i>i</i>  | Replace ST(0) with ST(0) – ST(i).                                  |
| FSUB ST(i),ST(0)  | DC E8+<i>i</i>  | Replace ST(i) with ST(i) – ST(0).                                  |
| FSUB mem32real    | D8 /4           | Replace ST(0) with ST(0) – <i>mem32real</i>.                       |
| FSUB mem64real    | DC /4           | Replace ST(0) with ST(0) – <i>mem64real</i>.                       |
| FSUBP             | DE E9           | Replace ST(1) with ST(1) – ST(0) and pop the x87 register stack.   |
| FSUBP ST(i),ST(0) | DE E8+<i>i</i>  | Replace ST(i) with ST(i) – ST(0), and pop the x87 register stack.  |
| FISUB mem16int    | DE /4           | Replace ST(0) with ST(0) – <i>mem16int</i>.                        |
| FISUB mem32int    | DA /4           | Replace ST(0) with ST(0) – <i>mem32int</i>.                        |
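
The operand forms in the table can be traced with a toy model of the register stack. The Python sketch below is an illustration, not an emulator: `X87StackModel` is a hypothetical class, values are Python doubles rather than double-extended values, and stack faults, rounding, and exceptions are not modeled.

```python
class X87StackModel:
    """Toy register stack in which index 0 corresponds to ST(0)."""

    def __init__(self, values):
        self.regs = list(values)

    def fsub(self, dest, src):
        """FSUB ST(dest),ST(src): one of the two operands must be ST(0)."""
        assert 0 in (dest, src), "one operand must be ST(0)"
        self.regs[dest] -= self.regs[src]

    def fsubp(self, i=1):
        """FSUBP ST(i),ST(0): subtract, then pop; i defaults to 1."""
        self.regs[i] -= self.regs[0]
        self.regs.pop(0)

    def fisub(self, integer):
        """FISUB: convert the integer operand, then subtract it from ST(0)."""
        self.regs[0] -= float(integer)
```

Starting from a stack of 2.0, 5.0, 1.0 (top first), `fsub(0, 2)` leaves 1.0 in ST(0), `fsub(1, 0)` leaves 4.0 in ST(1), and `fsubp()` then leaves 3.0 in the new ST(0).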

#### **Related Instructions**

FSUBRP, FISUBR, FSUBR

### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                       |
|--------------------|-------|-------------------------------------------------------------------|
| C0                 | U     |                                                                   |
| C1                 | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                    | 0     | Result was rounded down, if a precision exception was detected.   |
|                    | 1     | Result was rounded up, if a precision exception was detected.     |
| C2                 | U     |                                                                   |
| C3                 | U     |                                                                   |
| A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U. | | |

### Exceptions

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Device not available, #NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| x87 Floating-Point Exception Generated, #MF | | | | |
| Invalid-operation exception (IE) | X | X | X | A source operand was an SNaN value or an unsupported format. |
| | X | X | X | +infinity was subtracted from +infinity. |
| | X | X | X | -infinity was subtracted from -infinity. |
| Invalid-operation exception (IE) with stack fault (SF) | X | X | X | An x87 stack underflow occurred. |
| Denormalized-operand exception (DE) | X | X | X | A source operand was a denormal value. |
| Overflow exception (OE) | X | X | X | A rounded result was too large to fit into the format of the destination operand. |
| Underflow exception (UE) | X | X | X | A rounded result was too small to fit into the format of the destination operand. |
| Precision exception (PE) | X | X | X | A result could not be represented exactly in the destination format. |

### FSUBR FSUBRP FISUBR

### **Floating-Point Subtract Reverse**

Subtracts the value in a floating-point register from the value in another register or a memory location, and stores the result in the first specified register. Values in memory can be in single-precision or double-precision floating-point, word integer, or short integer format.

If one operand is specified, the instruction subtracts the value in ST(0) from the value in memory and stores the result in ST(0).

If two operands are specified, it subtracts the value in ST(0) from the value in another floating-point register or vice versa.

The FSUBRP instruction pops the x87 register stack after performing the subtraction.

The no-operand version of the instruction always pops the register stack. In some assemblers, the mnemonic for this instruction is FSUBR rather than FSUBRP.

The FISUBR instruction converts a signed integer operand to double-extended-precision format before performing the subtraction.

The FSUBR instructions perform the reverse operations of the FSUB instructions.
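The operand order is easy to get backwards, so the following is a minimal Python sketch of the forms described above. It is only a behavioural model under stated assumptions: the function name `x87_sub` and its keyword arguments are hypothetical, the register stack is modelled as a list whose element 0 stands for ST(0), and Python floats (double precision) stand in for the double-extended-precision registers.

```python
def x87_sub(st, *, i=1, mem=None, to_sti=False, pop=False, reverse=False):
    """Model FSUB/FSUBR-style subtraction on a list where st[0] is ST(0).

    mem     -- memory operand value (one-operand forms), or None
    to_sti  -- True for the "ST(i),ST(0)" forms, whose destination is ST(i)
    pop     -- True for the FSUBP/FSUBRP forms
    reverse -- False models FSUB, True models FSUBR
    """
    st = list(st)                      # work on a copy
    if mem is not None:                # FSUB mem / FSUBR mem (and FISUB/FISUBR)
        a, b, dest = st[0], float(mem), 0
    elif to_sti:                       # destination is ST(i)
        a, b, dest = st[i], st[0], i
    else:                              # destination is ST(0)
        a, b, dest = st[0], st[i], 0
    # FSUB computes dest - source; FSUBR computes source - dest.
    st[dest] = (b - a) if reverse else (a - b)
    if pop:
        st.pop(0)                      # ST(1) becomes the new ST(0)
    return st
```

For example, with ST(0) = 2.0 and ST(1) = 5.0, the FSUB ST(0),ST(1) form leaves -3.0 in ST(0), while the FSUBR ST(0),ST(1) form leaves 3.0.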

| Mnemonic           | Opcode          | Description                                                   |
|--------------------|-----------------|---------------------------------------------------------------|
| FSUBR ST(0),ST(i)  | D8 E8+ <i>i</i> | Replace ST(0) with ST(i) - ST(0).                             |
| FSUBR ST(i),ST(0)  | DC E0+ <i>i</i> | Replace ST(i) with ST(0) - ST(i).                             |
| FSUBR mem32real    | D8 /5           | Replace ST(0) with mem32real - ST(0).                         |
| FSUBR mem64real    | DC /5           | Replace ST(0) with mem64real - ST(0).                         |
| FSUBRP             | DE E1           | Replace ST(1) with ST(0) - ST(1) and pop the x87 register stack. |
| FSUBRP ST(i),ST(0) | DE E0+ <i>i</i> | Replace ST(i) with ST(0) - ST(i) and pop the x87 register stack. |
| FISUBR mem16int    | DE /5           | Replace ST(0) with mem16int - ST(0).                          |
| FISUBR mem32int    | DA /5           | Replace ST(0) with mem32int - ST(0).                          |


### **Related Instructions**

FSUB, FSUBP, FISUB

### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                       |
|--------------------|-------|-------------------------------------------------------------------|
| C0                 | U     |                                                                   |
| C1                 | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                    | 0     | Result was rounded down, if a precision exception was detected.   |
|                    | 1     | Result was rounded up, if a precision exception was detected.     |
| C2                 | U     |                                                                   |
| C3                 | U     |                                                                   |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

### Exceptions

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Device not available, #NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) is set to 1. |
| Stack, #SS | X | X | X | A memory address exceeded the stack segment limit or was non-canonical. |
| General protection, #GP | X | X | X | A memory address exceeded a data segment limit or was non-canonical. |
| | | | X | A null data segment was used to reference memory. |
| Page fault, #PF | | X | X | A page fault resulted from the execution of the instruction. |
| Alignment check, #AC | | X | X | An unaligned memory reference was performed while alignment checking was enabled. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| x87 Floating-Point Exception Generated, #MF | | | | |
| Invalid-operation exception (IE) | X | X | X | A source operand was an SNaN value or an unsupported format. |
| | X | X | X | +infinity was subtracted from +infinity. |
| | X | X | X | -infinity was subtracted from -infinity. |
| Invalid-operation exception (IE) with stack fault (SF) | X | X | X | An x87 stack underflow occurred. |
| Denormalized-operand exception (DE) | X | X | X | A source operand was a denormal value. |
| Overflow exception (OE) | X | X | X | A rounded result was too large to fit into the format of the destination operand. |
| Underflow exception (UE) | X | X | X | A rounded result was too small to fit into the format of the destination operand. |
| Precision exception (PE) | X | X | X | A result could not be represented exactly in the destination format. |


### **FTST**

### **Floating-Point Test with Zero**

Compares the value in ST(0) with 0.0, and sets the condition code flags in the x87 status word as shown in the x87 Condition Code table below. The instruction ignores the sign distinction between -0.0 and +0.0.

| Mnemonic | Opcode | Description           |
|----------|--------|-----------------------|
| FTST     | D9 E4  | Compare ST(0) to 0.0. |

#### **Related Instructions**

FCOM, FCOMP, FCOMPP, FCOMI, FCOMIP, FICOM, FICOMP, FUCOM, FUCOMI, FUCOMIP, FUCOMP, FUCOMPP, FXAM

### **rFLAGS Affected**

None

#### **x87 Condition Code**

| C3 | C2 | C1 | C0 | Compare Result      |
|----|----|----|----|---------------------|
| 0  | 0  | 0  | 0  | ST(0) > 0.0         |
| 0  | 0  | 0  | 1  | ST(0) < 0.0         |
| 1  | 0  | 0  | 0  | ST(0) = 0.0         |
| 1  | 1  | 0  | 1  | ST(0) was unordered |
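As an illustration of the table above, here is a small Python sketch that returns the (C3, C2, C1, C0) values FTST would report for a given value. The function name `ftst` is hypothetical, and a Python float (double precision) stands in for the contents of ST(0); unsupported formats and denormal handling are not modelled.

```python
import math

def ftst(st0):
    """Return (C3, C2, C1, C0) as listed in the FTST condition code table."""
    if math.isnan(st0):
        # Unordered result. Per the exception table, any NaN operand
        # (SNaN or QNaN) also signals the invalid-operation exception.
        return (1, 1, 0, 1)
    if st0 > 0.0:
        return (0, 0, 0, 0)
    if st0 < 0.0:
        return (0, 0, 0, 1)
    # Both +0.0 and -0.0 land here: the sign of zero is ignored.
    return (1, 0, 0, 0)
```

Note that `ftst(-0.0)` and `ftst(0.0)` give the same result, which mirrors the statement that the instruction ignores the sign distinction between -0.0 and +0.0.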

### Exceptions

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Device not available, #NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| x87 Floating-Point Exception Generated, #MF | | | | |
| Invalid-operation exception (IE) | X | X | X | A source operand was an SNaN value, a QNaN value, or an unsupported format. |
| Invalid-operation exception (IE) with stack fault (SF) | X | X | X | An x87 stack underflow occurred. |
| Denormalized-operand exception (DE) | X | X | X | A source operand was a denormal value. |


### FUCOM FUCOMP FUCOMPP

### **Floating-Point Unordered Compare**

Compares the value in ST(0) to the value in another x87 register, and sets the condition codes in the x87 status word as shown in the x87 Condition Code table below.

If no source operand is specified, the instruction compares the value in ST(0) to that in ST(1).

After making the comparison, the FUCOMP instruction pops the x87 register stack once and the FUCOMPP instruction pops it twice.

The instruction carries out the same comparison operation as the FCOM instructions, but it sets the invalid-operation exception (IE) bit in the x87 status word to 1 only when either or both operands are an SNaN or are in an unsupported format. If either or both operands are a QNaN, it sets the condition code flags to unordered but does not set the IE bit. The FCOM instructions, by contrast, raise an IE exception when either or both of the operands are a NaN value of any kind or are in an unsupported format.

| Mnemonic     | Opcode          | Description                                                                                                                      |
|--------------|-----------------|----------------------------------------------------------------------------------------------------------------------------------|
| FUCOM        | DD E1           | Compare ST(0) to ST(1) and set condition code flags to reflect the results of the comparison.                                    |
| FUCOM ST(i)  | DD E0+ <i>i</i> | Compare ST(0) to ST(i) and set condition code flags to reflect the results of the comparison.                                    |
| FUCOMP       | DD E9           | Compare ST(0) to ST(1), set condition code flags to reflect the results of the comparison, and pop the x87 register stack.       |
| FUCOMP ST(i) | DD E8+ <i>i</i> | Compare ST(0) to ST(i), set condition code flags to reflect the results of the comparison, and pop the x87 register stack.       |
| FUCOMPP      | DA E9           | Compare ST(0) to ST(1), set condition code flags to reflect the results of the comparison, and pop the x87 register stack twice. |

#### **Related Instructions**

FCOM, FCOMPP, FCOMI, FCOMIP, FICOM, FICOMP, FTST, FUCOMI, FUCOMIP, FXAM

### rFLAGS Affected

None

### **x87 Condition Code**

| C3 | C2 | C1 | C0 | Compare Result          |
|----|----|----|----|-------------------------|
| 0  | 0  | 0  | 0  | ST(0) > source          |
| 0  | 0  | 0  | 1  | ST(0) < source          |
| 1  | 0  | 0  | 0  | ST(0) = source          |
| 1  | 1  | 0  | 1  | Operands were unordered |
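The difference between the unordered compares and the FCOM instructions comes down to how QNaN operands are treated. The following Python sketch illustrates this under loud assumptions: operands are given as 64-bit IEEE 754 double-precision bit patterns rather than 80-bit register contents, unsupported formats are not modelled, and the names `nan_kind`, `to_float`, and `compare` are hypothetical.

```python
import struct

EXP_MASK = 0x7FF << 52        # binary64 exponent field
FRAC_MASK = (1 << 52) - 1     # binary64 fraction field
QUIET_BIT = 1 << 51           # top fraction bit: 1 = QNaN, 0 = SNaN

def nan_kind(bits):
    """Return 'qnan', 'snan', or None for a binary64 bit pattern."""
    if (bits & EXP_MASK) != EXP_MASK or (bits & FRAC_MASK) == 0:
        return None                        # finite value or infinity
    return "qnan" if bits & QUIET_BIT else "snan"

def to_float(bits):
    """Reinterpret a 64-bit pattern as a Python float."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def compare(st0_bits, src_bits, unordered_variant=True):
    """Return ((C3, C2, C1, C0), ie_flag).

    unordered_variant=True models FUCOM-style behaviour (IE only for SNaN);
    False models FCOM-style behaviour (IE for any NaN).
    """
    kinds = {nan_kind(st0_bits), nan_kind(src_bits)}
    if kinds != {None}:                    # at least one operand is a NaN
        ie = ("snan" in kinds) or not unordered_variant
        return (1, 1, 0, 1), ie
    a, b = to_float(st0_bits), to_float(src_bits)
    if a > b:
        return (0, 0, 0, 0), False
    if a < b:
        return (0, 0, 0, 1), False
    return (1, 0, 0, 0), False
```

With a QNaN operand such as `0x7FF8000000000000`, both variants report the unordered condition codes, but only the FCOM-style variant reports IE.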

### **Exceptions**

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Device not available, #NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| x87 Floating-Point Exception Generated, #MF | | | | |
| Invalid-operation exception (IE) | X | X | X | A source operand was an SNaN value or an unsupported format. |
| Invalid-operation exception (IE) with stack fault (SF) | X | X | X | An x87 stack underflow occurred. |
| Denormalized-operand exception (DE) | X | X | X | A source operand was a denormal value. |


### FUCOMI FUCOMIP

### **Floating-Point Unordered Compare and Set Eflags**

Compares the contents of ST(0) with the contents of another floating-point register, and sets the zero flag (ZF), parity flag (PF), and carry flag (CF) as shown in the rFLAGS Affected table below.

Unlike FCOMI and FCOMIP, the FUCOMI and FUCOMIP instructions do not set the invalid-operation exception (IE) bit in the x87 status word for QNaNs.

After completing the comparison, FUCOMIP pops the x87 register stack.

| Mnemonic            | Opcode          | Description                                                                                                 |
|---------------------|-----------------|-------------------------------------------------------------------------------------------------------------|
| FUCOMI ST(0),ST(i)  | DB E8+ <i>i</i> | Compare ST(0) to ST(i) and set eflags to reflect the result of the comparison.                              |
| FUCOMIP ST(0),ST(i) | DF E8+ <i>i</i> | Compare ST(0) to ST(i), set eflags to reflect the result of the comparison, and pop the x87 register stack. |

#### **Related Instructions**

FCOM, FCOMPP, FCOMI, FCOMIP, FICOM, FICOMP, FTST, FUCOM, FUCOMP, FUCOMPP, FXAM


### rFLAGS Affected

| ZF | PF | CF | Compare Result          |
|----|----|----|-------------------------|
| 0  | 0  | 0  | ST(0) > source          |
| 0  | 0  | 1  | ST(0) < source          |
| 1  | 0  | 0  | ST(0) = source          |
| 1  | 1  | 1  | Operands were unordered |
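Because the result lands in ZF, PF, and CF, it can be tested directly with the unsigned conditional jumps. The Python sketch below models the table above and the standard x86 conditions for a few Jcc mnemonics; the function names `fucomi_flags` and `taken` are hypothetical, and Python floats stand in for the register contents.

```python
import math

def fucomi_flags(st0, src):
    """Return (ZF, PF, CF) as listed in the rFLAGS Affected table."""
    if math.isnan(st0) or math.isnan(src):
        return (1, 1, 1)               # unordered
    if st0 > src:
        return (0, 0, 0)
    if st0 < src:
        return (0, 0, 1)
    return (1, 0, 0)

def taken(jcc, flags):
    """Report whether a conditional jump would be taken for these flags."""
    zf, pf, cf = flags
    conditions = {
        "ja": cf == 0 and zf == 0,     # above
        "jae": cf == 0,                # above or equal
        "jb": cf == 1,                 # below
        "jbe": cf == 1 or zf == 1,     # below or equal
        "je": zf == 1,                 # equal
        "jp": pf == 1,                 # parity: unordered after FUCOMI
    }
    return conditions[jcc]
```

One consequence worth noting: an unordered result sets all three flags, so JB, JBE, and JE are all taken for a NaN operand. Code that must distinguish the unordered case would test PF (for example with JP) before the other conditions.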

### **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0                 |       |             |
| C1                 | 0     |             |
| C2                 |       |             |
| C3                 |       |             |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

### **Exceptions**

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Device not available, #NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) is set to 1. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| x87 Floating-Point Exception Generated, #MF | | | | |
| Invalid-operation exception (IE) | X | X | X | A source operand was an SNaN value or an unsupported format. |
| Invalid-operation exception (IE) with stack fault (SF) | X | X | X | An x87 stack underflow occurred. |
| Denormalized-operand exception (DE) | X | X | X | A source operand was a denormal value. |


### FWAIT (WAIT)

### **Wait for Unmasked x87 Floating-Point Exceptions**

Forces the processor to test for pending unmasked floating-point exceptions before proceeding.

If there is a pending floating-point exception and CR0.NE = 1, a numeric exception (#MF) is generated. If there is a pending floating-point exception and CR0.NE = 0, FWAIT asserts the FERR output signal, then waits for an external interrupt.

This instruction is useful for ensuring that unmasked floating-point exceptions are handled before altering the results of a floating-point instruction.

FWAIT and WAIT are synonyms for the same opcode.
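The decision FWAIT makes can be summarised as a small Python sketch. It is a simplification, not a description of the hardware: the function name `fwait` and its string return values are hypothetical, the #NM condition is taken from the exception table for this instruction, and checking #NM before the pending-exception test is an assumption about ordering.

```python
def fwait(pending_unmasked, cr0_ne, cr0_mp=0, cr0_ts=0):
    """Return the outcome of FWAIT for the given processor state.

    pending_unmasked -- True if an unmasked x87 exception is pending
    cr0_ne, cr0_mp, cr0_ts -- the NE, MP, and TS bits of CR0 (0 or 1)
    """
    if cr0_mp and cr0_ts:
        return "#NM"          # device-not-available (assumed to be checked first)
    if not pending_unmasked:
        return "continue"     # nothing pending: execution proceeds
    if cr0_ne:
        return "#MF"          # numeric exception reported internally
    return "FERR"             # FERR asserted, then wait for an external interrupt
```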

| Mnemonic | Opcode | Description                                      |
|----------|--------|--------------------------------------------------|
| FWAIT    | 9B     | Check for any pending floating-point exceptions. |

#### **Related Instructions**

None

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0                 | U     |             |
| C1                 | U     |             |
| C2                 | U     |             |
| C3                 | U     |             |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

### Exceptions

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Device not available, #NM | X | X | X | The monitor coprocessor bit (MP) and the task switch bit (TS) of the control register (CR0) were both set to 1. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |

### **FXAM**

### **Floating-Point Examine**

Examines the value in ST(0) and sets the C0, C2, and C3 condition code flags in the x87 status word as shown in the x87 Condition Code table below to indicate whether the value is a NaN, infinity, zero, empty, denormal, normal finite, or unsupported value. The instruction also sets the C1 flag to indicate the sign of the value in ST(0) (0 = positive, 1 = negative).

| Mnemonic | Opcode | Description                                    |
|----------|--------|------------------------------------------------|
| FXAM     | D9 E5  | Characterize the number in the ST(0) register. |

### **Related Instructions**

FCOM, FCOMP, FCOMPP, FCOMI, FCOMIP, FICOM, FICOMP, FTST, FUCOM, FUCOMI, FUCOMP, FUCOMPP

### **rFLAGS Affected**

None

#### **x87 Condition Code**

| C3 | C2 | C1 | C0 | Meaning             |
|----|----|----|----|---------------------|
| 0  | 0  | 0  | 0  | +unsupported format |
| 0  | 0  | 0  | 1  | +NaN                |
| 0  | 0  | 1  | 0  | -unsupported format |
| 0  | 0  | 1  | 1  | -NaN                |
| 0  | 1  | 0  | 0  | +normal             |
| 0  | 1  | 0  | 1  | +infinity           |
| 0  | 1  | 1  | 0  | -normal             |
| 0  | 1  | 1  | 1  | -infinity           |
| 1  | 0  | 0  | 0  | +0                  |
| 1  | 0  | 0  | 1  | empty               |
| 1  | 0  | 1  | 0  | -0                  |
| 1  | 0  | 1  | 1  | empty               |
| 1  | 1  | 0  | 0  | +denormal           |
| 1  | 1  | 1  | 0  | -denormal           |
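Software typically reads the result of FXAM by storing the x87 status word and decoding the condition code bits. The Python sketch below shows one way to decode such a value against the table above. The names `FXAM_CLASSES` and `decode_fxam` are hypothetical, and the bit positions used (C0 = bit 8, C1 = bit 9, C2 = bit 10, C3 = bit 14) come from the x87 status word layout rather than from this section.

```python
# (C3, C2, C0) -> class, following the FXAM condition code table.
FXAM_CLASSES = {
    (0, 0, 0): "unsupported format",
    (0, 0, 1): "NaN",
    (0, 1, 0): "normal",
    (0, 1, 1): "infinity",
    (1, 0, 0): "zero",
    (1, 0, 1): "empty",
    (1, 1, 0): "denormal",
}

def decode_fxam(status_word):
    """Return (sign, class) for an x87 status word written after FXAM.

    class is None for the (C3, C2, C0) = (1, 1, 1) pattern, which the
    table does not list. For "empty" the table attaches no sign, so the
    returned sign only reflects the raw C1 bit in that case.
    """
    c0 = (status_word >> 8) & 1
    c1 = (status_word >> 9) & 1
    c2 = (status_word >> 10) & 1
    c3 = (status_word >> 14) & 1
    sign = "-" if c1 else "+"
    return sign, FXAM_CLASSES.get((c3, c2, c0))
```

For instance, a status word of `0x4200` (C3 = 1, C1 = 1) decodes as negative zero.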

### Exceptions

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Device not available, #NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) is set to 1. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |


### **FXCH**

### **Floating-Point Exchange**

Exchanges the value in ST(0) with the value in any other x87 register. If no operand is specified, the instruction exchanges the values in ST(0) and ST(1).

Use this instruction to move a value from an x87 register to ST(0) for subsequent processing by a floating-point instruction that can only operate on ST(0).
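As a rough illustration, the exchange can be modelled on a Python list whose element 0 stands for ST(0). The function name `fxch` is hypothetical, and empty registers (which would cause a stack fault on real hardware) are not modelled.

```python
def fxch(st, i=1):
    """Return a copy of the stack with ST(0) and ST(i) exchanged."""
    st = list(st)                  # leave the caller's list untouched
    st[0], st[i] = st[i], st[0]
    return st

# Bring the value in ST(2) to the top so that an instruction that only
# operates on ST(0) can use it.
stack = [1.0, 2.0, 9.0]
stack = fxch(stack, 2)             # ST(0) now holds the old ST(2)
```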

| Mnemonic   | Opcode          | Description                                    |
|------------|-----------------|------------------------------------------------|
| FXCH       | D9 C9           | Exchange the contents of ST(0) and ST(1).      |
| FXCH ST(i) | D9 C8+ <i>i</i> | Exchange the contents of ST(0) and ST(i).      |

#### **Related Instructions**

FLD, FST, FSTP

#### rFLAGS Affected

None

### **x87 Condition Code**

| x87 Condition Code | Value | Description |
|--------------------|-------|-------------|
| C0                 | U     |             |
| C1                 | 0     |             |
| C2                 | U     |             |
| C3                 | U     |             |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.


### Exceptions

| Exception | Real | Virtual 8086 | Protected | Cause of Exception |
|-----------|------|--------------|-----------|--------------------|
| Device not available, #NM | X | X | X | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1. |
| x87 floating-point exception pending, #MF | X | X | X | An unmasked x87 floating-point exception was pending. |
| x87 Floating-Point Exception Generated, #MF | | | | |
| Invalid-operation exception (IE) with stack fault (SF) | X | X | X | An x87 stack underflow occurred. |


### **FXRSTOR**

### Restore XMM, MMX™, and x87 State

Restores the XMM, MMX, and x87 state. The data loaded from memory is the state information previously saved using the FXSAVE instruction. Restoring data with FXRSTOR that had been previously saved with an FSAVE (rather than FXSAVE) instruction results in an incorrect restoration.

If FXRSTOR results in set exception flags in the loaded x87 status word register, and these exceptions are unmasked in the x87 control word register, a floating-point exception occurs when the next floating-point instruction is executed (except for the no-wait floating-point instructions).

If the restored MXCSR register contains a set bit in an exception status flag, and the corresponding exception mask bit is cleared (indicating an unmasked exception), loading the MXCSR register from memory does not cause a SIMD floating-point exception (#XF).

FXRSTOR does not restore the x87 error pointers (last instruction pointer, last data pointer, and last opcode), except in the relatively rare cases in which the exception-summary (ES) bit in the x87 status word is set to 1, indicating that an unmasked x87 exception has occurred.

The architecture supports two memory formats for FXRSTOR, a 512-byte 32-bit legacy format and a 512-byte 64-bit format. Selection of the 32-bit or 64-bit format is accomplished by using the corresponding effective operand size in the FXRSTOR instruction. If software running in 64-bit mode executes an FXRSTOR with a 32-bit operand size (no REX-prefix operand-size override), the 32-bit legacy format is used. If software running in 64-bit mode executes an FXRSTOR with a 64-bit operand size (requires REX-prefix operand-size override), the 64-bit format is used. For details about the memory image restored by FXRSTOR, see "Saving Media and x87 Processor State" in volume 2.
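The format-selection rule above can be summarized as a small decision function. This is only an illustrative sketch in Python, not processor pseudocode; the function name `fxsave_image_format` and its parameters are hypothetical, and the sketch assumes that a REX-prefix operand-size override is meaningful only in 64-bit mode.

```python
def fxsave_image_format(in_64bit_mode, rex_w):
    """Pick the 512-byte memory image format used by FXSAVE/FXRSTOR.

    in_64bit_mode: True when software is running in 64-bit mode.
    rex_w: True when a REX-prefix operand-size override selects a
           64-bit effective operand size (assumed to be ignored
           outside 64-bit mode in this sketch).
    """
    if in_64bit_mode and rex_w:
        return "64-bit"
    # 32-bit operand size in 64-bit mode, and all other modes.
    return "32-bit legacy"
```

For example, `fxsave_image_format(in_64bit_mode=True, rex_w=False)` yields the legacy format, matching the case of 64-bit software that executes the instruction without the REX prefix.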

If the fast-FXSAVE/FXRSTOR (FFXSR) feature is enabled in EFER, FXRSTOR does not restore the XMM registers (XMM0-XMM15) when executed in 64-bit mode at CPL 0. MXCSR is restored whether fast-FXSAVE/FXRSTOR is enabled or not. Software can use CPUID to determine whether the fast-FXSAVE/FXRSTOR feature is available. (See "CPUID" in Volume 3.)

If the operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 is cleared to 0, the saved image of XMM0–XMM15 and MXCSR is not loaded into the processor. A general-protection exception occurs if there is an attempt to load a non-zero value to the bits in MXCSR that are defined as reserved (bits 31–16).
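The interaction of EFER.FFXSR, CR4.OSFXSR, the operating mode, and CPL described in the two preceding paragraphs can be modeled as follows. This is a hedged Python sketch of the rules as stated in the text, not a definitive description of any implementation; the function names, parameter names, and component labels are hypothetical. The reserved-bit check assumes the reserved MXCSR field is exactly bits 31–16, as stated above.

```python
MXCSR_RESERVED_MASK = 0xFFFF0000  # bits 31-16, reserved per the text above


def fxrstor_restored_components(ffxsr, in_64bit_mode, cpl, osfxsr):
    """Return the set of state components this model says FXRSTOR loads."""
    restored = {"x87/MMX"}  # x87 and MMX state is always restored
    if osfxsr:
        # MXCSR is restored whether or not fast-FXSAVE/FXRSTOR is enabled.
        restored.add("MXCSR")
        # XMM registers are skipped only with FFXSR enabled, in 64-bit
        # mode, at CPL 0.
        fast_path = ffxsr and in_64bit_mode and cpl == 0
        if not fast_path:
            restored.add("XMM")
    return restored


def mxcsr_image_faults(mxcsr_image):
    """True if the 32-bit MXCSR image has a reserved bit set (#GP case)."""
    return (mxcsr_image & MXCSR_RESERVED_MASK) != 0
```

Under these assumptions, a kernel-mode restore in 64-bit mode with FFXSR enabled reports only `"x87/MMX"` and `"MXCSR"`, while the same call at CPL 3 also reports `"XMM"`.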


| Mnemonic          | Opcode   | Description                                                                   |
|-------------------|----------|-------------------------------------------------------------------------------|
| FXRSTOR mem512env | 0F AE /1 | Restores XMM, MMX™, and x87 state from 512-byte memory location. |

### **Related Instructions**

FWAIT, FXSAVE

### rFLAGS Affected

None

### **MXCSR Flags Affected**

| FZ | RC | RC | PM | UM | OM | ZM | DM | IM | DAZ | PE | UE | OE | ZE | DE | IE |
|----|----|----|----|----|----|----|----|----|-----|----|----|----|----|----|----|
| M  | M  | M  | M  | M  | M  | M  | M  | M  | M   | M  | M  | M  | M  | M  | M  |
| 15 | 14 | 13 | 12 | 11 | 10 | 9  | 8  | 7  | 6   | 5  | 4  | 3  | 2  | 1  | 0  |

#### Note:

A flag that can be set to one or zero is M (modified). Unaffected flags are blank. Shaded fields are reserved.

### **Exceptions**

| Exception                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                      |
|---------------------------|------|-----------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD       | X    | X               | X         | The FXSAVE/FXRSTOR instructions are not supported, as indicated by bit 24 of CPUID standard function 1 or extended function 8000_0001h. |
|                           | X    | X               | X         | The emulate bit (EM) of CR0 was set to 1.                                                                                               |
| Device not available, #NM | X    | X               | X         | The task-switch bit (TS) of CR0 was set to 1.                                                                                           |
| Stack, #SS                | X    | X               | X         | A memory address exceeded the stack segment limit, or was non-canonical.                                                                |
| General protection, #GP   | X    | X               | X         | A memory address exceeded the data segment limit or was non-canonical.                                                                  |
|                           |      |                 | X         | A null data segment was used to reference memory.                                                                                       |
|                           | X    | X               | X         | The memory operand was not aligned on a 16-byte boundary.                                                                               |
|                           | X    | X               | X         | Ones were written to the reserved bits in MXCSR.                                                                                        |
| Page fault, #PF           |      | X               | X         | A page fault resulted from the execution of the instruction.                                                                            |

### **FXSAVE**

### Save XMM, MMX™, and x87 State

Saves the XMM, MMX, and x87 state. A memory location that is not aligned on a 16-byte boundary causes a general-protection exception.
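The alignment requirement can be checked, or satisfied by rounding an address up, with simple integer arithmetic. The following Python sketch models only the address check; the names `FXSAVE_AREA_SIZE`, `is_fxsave_aligned`, and `align_up_16` are hypothetical helpers introduced for illustration.

```python
FXSAVE_AREA_SIZE = 512  # bytes occupied by the saved image
FXSAVE_ALIGNMENT = 16   # required alignment of the memory operand


def is_fxsave_aligned(address):
    """True if the address is on a 16-byte boundary (no #GP from alignment)."""
    return address % FXSAVE_ALIGNMENT == 0


def align_up_16(address):
    """Round an address up to the next 16-byte boundary."""
    return (address + FXSAVE_ALIGNMENT - 1) & ~(FXSAVE_ALIGNMENT - 1)
```

A caller that holds an arbitrarily aligned buffer could reserve `FXSAVE_AREA_SIZE + 15` bytes and use `align_up_16` on the buffer's base address to obtain a usable save-area address.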

Unlike FSAVE and FNSAVE, FXSAVE does not alter the x87 tag bits. The contents of the saved MMX/x87 data registers are retained, thus indicating that the registers may be valid (or whatever other value the x87 tag bits indicated prior to the save). To invalidate the contents of the MMX/x87 data registers after FXSAVE, software must execute an FINIT instruction. Also, FXSAVE (like FNSAVE) does not check for pending unmasked x87 floating-point exceptions. An FWAIT instruction can be used for this purpose.

FXSAVE does not save the x87 pointer registers (last instruction pointer, last data pointer, and last opcode), except in the relatively rare cases in which the exception-summary (ES) bit in the x87 status word is set to 1, indicating that an unmasked x87 exception has occurred.

The architecture supports two memory formats for FXSAVE, a 512-byte 32-bit legacy format and a 512-byte 64-bit format. Selection of the 32-bit or 64-bit format is accomplished by using the corresponding effective operand size in the FXSAVE instruction. If software running in 64-bit mode executes an FXSAVE with a 32-bit operand size (no REX-prefix operand-size override), the 32-bit legacy format is used. If software running in 64-bit mode executes an FXSAVE with a 64-bit operand size (requires REX-prefix operand-size override), the 64-bit format is used. For details about the memory image saved by FXSAVE, see "Saving Media and x87 Processor State" in Volume 2.

If the fast-FXSAVE/FXRSTOR (FFXSR) feature is enabled in EFER, FXSAVE does not save the XMM registers (XMM0-XMM15) when executed in 64-bit mode at CPL 0. MXCSR is saved whether fast-FXSAVE/FXRSTOR is enabled or not. Software can use CPUID to determine whether the fast-FXSAVE/FXRSTOR feature is available. (See "CPUID" in Volume 3.)

If the operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 is cleared to 0, FXSAVE does not save the image of XMM0–XMM15 or MXCSR. For details about the CR4.OSFXSR bit, see "FXSAVE/FXRSTOR Support (OSFXSR) Bit" in volume 2.

| Mnemonic         | Opcode   | Description                                                                |
|------------------|----------|----------------------------------------------------------------------------|
| FXSAVE mem512env | 0F AE /0 | Saves XMM, MMX™, and x87 state to 512-byte memory location. |


### **Related Instructions**

FINIT, FNSAVE, FRSTOR, FSAVE, FXRSTOR, LDMXCSR, STMXCSR

### rFLAGS Affected

None

### **MXCSR Flags Affected**

None

### **Exceptions**

| Exception                 | Real | Virtual<br>8086 | Protected | Cause of Exception                                                                                                                      |
|---------------------------|------|-----------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------|
| Invalid opcode, #UD       | Х    | Х               | Х         | The FXSAVE/FXRSTOR instructions are not supported, as indicated by bit 24 of CPUID standard function 1 or extended function 8000_0001h. |
|                           | Х    | Х               | Х         | The emulate bit (EM) of CR0 was set to 1.                                                                                               |
| Device not available, #NM | Х    | Х               | Х         | The task-switch bit (TS) of CR0 was set to 1.                                                                                           |
| Stack, #SS                | Х    | Х               | Х         | A memory address exceeded the stack segment limit, or was non-canonical.                                                                |
| General protection, #GP   | Х    | Х               | Х         | A memory address exceeded the data segment limit or was non-canonical.                                                                  |
|                           |      |                 | Х         | A null data segment was used to reference memory.                                                                                       |
|                           |      |                 | Х         | The destination operand was in a non-writable segment.                                                                                  |
|                           | Х    | Х               | Х         | The memory operand was not aligned on a 16-byte boundary.                                                                               |
| Page fault, #PF           |      | Х               | Х         | A page fault resulted from the execution of the instruction.                                                                            |

### **FXTRACT**

### **Floating-Point Extract Exponent and Significand**

Extracts the exponent and significand portions of the floating-point value in ST(0), stores the exponent in ST(0), and then pushes the significand onto the x87 register stack. After this operation, the new ST(0) contains a real number with the sign and value of the original significand and an exponent of 3FFFh (biased value for true exponent of zero), and ST(1) contains a real number that is the value of the original value's true (unbiased) exponent.

The FXTRACT instruction is useful for converting a double-extended-precision number to its decimal representation.

If the zero-divide-exception mask (ZM) bit of the x87 control word is set to 1 and the source value is ±0, then the instruction stores ±zero in ST(0) and an exponent value of $-\infty$ in register ST(1).
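The split into significand and true exponent can be illustrated with a short Python model. This is a sketch under stated assumptions, not the processor's algorithm: it uses Python's double-precision `float` rather than the double-extended-precision format, it assumes the zero-divide exception is masked, and it does not model NaN or infinity inputs. The function name `fxtract` is used only for readability.

```python
import math


def fxtract(st0):
    """Model FXTRACT: return (new ST(0), new ST(1)) = (significand, exponent)."""
    if math.isnan(st0) or math.isinf(st0):
        raise ValueError("NaN and infinity inputs are outside this sketch")
    if st0 == 0.0:
        # Masked zero-divide case: the zero keeps its sign in ST(0) and
        # the exponent in ST(1) is minus infinity.
        return st0, -math.inf
    # frexp gives st0 == mantissa * 2**exp2 with 0.5 <= |mantissa| < 1.
    mantissa, exp2 = math.frexp(st0)
    # Rescale so the significand lies in [1, 2), i.e. a true exponent of zero.
    return mantissa * 2.0, float(exp2 - 1)
```

For instance, `fxtract(-6.0)` returns `(-1.5, 2.0)`, because -6 = -1.5 × 2².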

| Mnemonic | Opcode | Description                                                                                                                       |
|----------|--------|-----------------------------------------------------------------------------------------------------------------------------------|
| FXTRACT  | D9 F4  | Extract the exponent and significand of ST(0), store the exponent in ST(0), and push the significand onto the x87 register stack. |

### **Related Instructions**

FABS, FPREM, FRNDINT, FCHS

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                       |
|--------------------|-------|-------------------------------------------------------------------|
| C0                 | U     |                                                                   |
| C1                 | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                    | 1     | x87 stack overflow, if an x87 register stack fault was detected.  |
| C2                 | U     |                                                                   |
| C3                 | U     |                                                                   |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

### **Exceptions**

| Exception                                       | Real | Virtual<br>8086 | Protected    | Cause of Exception                                                                          |
|-------------------------------------------------|------|-----------------|--------------|---------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                    | Х    | Х               | Х            | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) is set to 1. |
| x87 floating-point<br>exception pending,<br>#MF | Х    | Х               | Х            | An unmasked x87 floating-point exception was pending.                                       |
| x87 Floating-Point Exception Generated, #MF |  |  |  |  |
| Invalid-operation exception (IE)                | X    | X               | X            | A source operand was an SNaN value or an unsupported format.                                |
| Invalid-operation exception (IE) with stack fault (SF) | X | X          | X            | An x87 stack underflow occurred.                                                            |
|                                                 | X    | X               | X            | An x87 stack overflow occurred.                                                             |
| Denormalized-operand exception (DE)             | X    | X               | X            | A source operand was a denormal value.                                                      |
| Zero-divide exception (ZE)                      | Х    | Х               | Х            | The source operand was ±zero.                                                               |

### FYL2X

### Floating-Point $y * Log_2(x)$

Computes  $(ST(1) * log_2(ST(0)))$ , stores the result in ST(1), and pops the x87 register stack. The value in ST(0) must be greater than zero.

If the zero-divide-exception mask (ZM) bit in the x87 control word is set to 1 and ST(0) contains ±zero, the instruction returns  $\infty$  with the opposite sign of the value in register ST(1).
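The computation and the masked zero-divide response described above can be sketched in Python as follows. This is an illustrative model, not the processor's algorithm: it uses double precision instead of double-extended precision, assumes the zero-divide exception is masked, and signals the invalid cases it recognizes by raising `ValueError` (a choice made for this sketch). The function name `fyl2x` and its argument order are hypothetical.

```python
import math


def fyl2x(st1, st0):
    """Model FYL2X: return the value left in the new ST(0) after the pop."""
    if st0 < 0.0:
        # Negative inputs (but not -0.0) are invalid operands.
        raise ValueError("ST(0) must not be negative")
    if st0 == 0.0:
        if st1 == 0.0:
            raise ValueError("ST(0) = zero with ST(1) = zero is invalid")
        # Masked zero-divide: infinity with the opposite sign of ST(1).
        return math.copysign(math.inf, -st1)
    return st1 * math.log2(st0)
```

With these assumptions, `fyl2x(3.0, 8.0)` gives `9.0`, and `fyl2x(2.0, 0.0)` gives negative infinity.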

| Mnemonic | Opcode | Description                                                                    |
|----------|--------|--------------------------------------------------------------------------------|
| FYL2X    | D9 F1  | Replace $ST(1)$ with $ST(1) * log_2(ST(0))$ , then pop the x87 register stack. |

#### **Related Instructions**

FYL2XP1, F2XM1

#### rFLAGS Affected

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                       |
|--------------------|-------|-------------------------------------------------------------------|
| C0                 | U     |                                                                   |
| C1                 | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                    | 0     | Result was rounded down, if a precision exception was detected.   |
|                    | 1     | Result was rounded up, if a precision exception was detected.     |
| C2                 | U     |                                                                   |
| C3                 | U     |                                                                   |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

### **Exceptions**

| Exception                                                    | Real | Virtual<br>8086 | Protected    | Cause of Exception                                                                            |
|--------------------------------------------------------------|------|-----------------|--------------|-----------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | Х    | Х               | Х            | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1.  |
| x87 floating-point<br>exception pending,<br>#MF              | Х    | Х               | Х            | An unmasked x87 floating-point exception was pending.                                         |
| x87 Floating-Point Exception Generated, #MF |  |  |  |  |
| Invalid-operation exception (IE)                             | X    | X               | X            | A source operand was an SNaN value or an unsupported format.                                  |
|                                                              | X    | X               | X            | The source operand in ST(0) was a negative finite value (not -zero).                          |
|                                                              | X    | Х               | X            | The source operand in ST(0) was +1 and the source operand in ST(1) was ±infinity.             |
|                                                              | Х    | Х               | Х            | The source operand in ST(0) was -infinity.                                                    |
|                                                              | X    | Х               | Х            | The source operand in ST(0) was ±zero or ±infinity and the source operand in ST(1) was ±zero. |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | Х    | Х               | Х            | An x87 stack underflow occurred.                                                              |
| Denormalized-operand exception (DE)                          | X    | X               | X            | A source operand was a denormal value.                                                        |
| Zero-divide exception (ZE)                                   | Х    | Х               | Х            | The source operand in ST(0) was ±zero and the source operand in ST(1) was a finite value.     |
| Overflow exception (OE)                                      | Х    | Х               | Х            | A rounded result was too large to fit into the format of the destination operand.             |
| Underflow exception (UE)                                     | Х    | Х               | Х            | A rounded result was too small to fit into the format of the destination operand.             |
| Precision exception (PE)                                     | Х    | Х               | Х            | A result could not be represented exactly in the destination format.                          |

### FYL2XP1

### Floating-Point $y * Log_2(x+1)$

Computes  $(ST(1) * log_2(ST(0) + 1.0))$ , stores the result in ST(1), and pops the x87 register stack. The value in ST(0) must be in the range sqrt(1/2)-1 to sqrt(2)-1.
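The operation and its input range can be modeled with a few lines of Python. This is a hedged sketch rather than the processor's algorithm: it works in double precision, treats the stated range as inclusive (an assumption), and raises `ValueError` for out-of-range inputs as a choice made for this sketch. The names `fyl2xp1` and `fyl2xp1_in_range` are hypothetical. `math.log1p` is used because it remains accurate when its argument is close to zero.

```python
import math

# Bounds of the permitted ST(0) range given in the text above.
FYL2XP1_LOWER = math.sqrt(0.5) - 1.0
FYL2XP1_UPPER = math.sqrt(2.0) - 1.0


def fyl2xp1_in_range(st0):
    """True if ST(0) lies within the permitted range (assumed inclusive)."""
    return FYL2XP1_LOWER <= st0 <= FYL2XP1_UPPER


def fyl2xp1(st1, st0):
    """Model FYL2XP1: return ST(1) * log2(ST(0) + 1.0)."""
    if not fyl2xp1_in_range(st0):
        raise ValueError("ST(0) is outside the permitted range")
    # log2(1 + x) computed as ln(1 + x) / ln(2).
    return st1 * math.log1p(st0) / math.log(2.0)
```

For example, `fyl2xp1(1.0, 0.25)` approximates log₂(1.25), while `fyl2xp1_in_range(0.5)` is false because 0.5 exceeds √2 − 1.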

| Mnemonic | Opcode | Description                                                        |
|----------|--------|--------------------------------------------------------------------|
| FYL2XP1  | D9 F9  | Replace ST(1) with $ST(1) * log_2(ST(0) + 1.0)$, then pop the x87 register stack. |

#### **Related Instructions**

FYL2X, F2XM1

### **rFLAGS Affected**

None

#### **x87 Condition Code**

| x87 Condition Code | Value | Description                                                       |
|--------------------|-------|-------------------------------------------------------------------|
| C0                 | U     |                                                                   |
| C1                 | 0     | x87 stack underflow, if an x87 register stack fault was detected. |
|                    | 0     | Result was rounded down, if a precision exception was detected.   |
|                    | 1     | Result was rounded up, if a precision exception was detected.     |
| C2                 | U     |                                                                   |
| C3                 | U     |                                                                   |

A flag set to 1 or cleared to 0 is M (modified). Unaffected flags are blank. Undefined flags are U.

### **Exceptions**

| Exception                                                    | Real | Virtual<br>8086 | Protected    | Cause of Exception                                                                               |
|--------------------------------------------------------------|------|-----------------|--------------|--------------------------------------------------------------------------------------------------|
| Device not available,<br>#NM                                 | X    | X               | X            | The emulate bit (EM) or the task switch bit (TS) of the control register (CR0) was set to 1.     |
| x87 floating-point<br>exception pending,<br>#MF              | X    | X               | X            | An unmasked x87 floating-point exception was pending.                                            |
| x87 Floating-Point Exception Generated, #MF |  |  |  |  |
| Invalid-operation exception (IE)                             | X    | X               | X            | A source operand was an SNaN or unsupported format.                                              |
|                                                              | X    | X               | X            | The source operand in ST(0) was ±0 and the source operand in ST(1) was ±infinity.                |
| Invalid-operation<br>exception (IE) with<br>stack fault (SF) | X    | X               | X            | An x87 stack underflow occurred.                                                                 |
| Denormalized-operand exception (DE)                          | X    | X               | X            | A source operand was a denormal value.                                                           |
| Overflow exception (OE)                                      | Х    | Х               | Х            | A rounded result was too large to fit into the format of the destination operand.                |
| Underflow exception (UE)                                     | Х    | Х               | Х            | A rounded result was too small to fit into the format of the destination operand.                |
| Precision exception (PE)                                     | Х    | Х               | Х            | A result could not be represented exactly in the destination format.                             |

