It seems more a matter of wording than anything in the CPU " An assembler would ...

creshal · on Oct 23, 2015

> But 0x90 also means nop

As… the article even explained further down, sometimes. That's where the whole discussion comes from.

• In 32-bit mode, it doesn't matter. Some (dumb) CPUs executed it as a literal xchg (e)ax,(e)ax; pipelined CPUs optimized it away.

• In 64-bit mode, it matters: Is "xchg eax, eax" a valid way to clear the upper 32 bits of rax? Or will it always be optimized away as a legacy NOP?

AMD chose the latter. They could instead have introduced a new opcode for the NOP (there are already multiple nop instructions, like nopl/nopw, so it wouldn't have been too far off). As this only affects 64-bit mode, backwards compatibility didn't really matter, so either would have been possible.
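To make the 64-bit question concrete, here is a minimal sketch of the two possible readings of 0x90. It assumes a toy register model in which any 32-bit register write zero-extends into the full 64-bit register (as AMD64 does); the helper names `write32`, `xchg32_literal` and `nop` are made up for illustration and don't come from any real tool.

```python
MASK32 = 0xFFFFFFFF

def write32(regs, name, value):
    """Toy model of a 32-bit register write in 64-bit mode:
    the result is zero-extended, so the upper 32 bits become 0."""
    regs[name] = value & MASK32

def xchg32_literal(regs, a, b):
    """Hypothetical 'literal' xchg of two 32-bit registers:
    read both low halves, then write each back through write32."""
    low_a = regs[a] & MASK32
    low_b = regs[b] & MASK32
    write32(regs, a, low_b)
    write32(regs, b, low_a)

def nop(regs):
    """0x90 read as NOP: no register is written at all."""
    pass

start = {"rax": 0x1122334455667788}

after_nop = dict(start)
nop(after_nop)

after_xchg = dict(start)
xchg32_literal(after_xchg, "rax", "rax")

print(hex(after_nop["rax"]))   # upper half untouched
print(hex(after_xchg["rax"]))  # upper half cleared
```

Under this model the two readings are observably different, which is why the choice had to be made explicitly for 64-bit mode.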

Asbostos · on Oct 23, 2015

Opcode 0x90 only means xchg eax,eax on paper. If no documentation had ever called it that, it would be a non-issue. It would always have been nop and would still be nop. Somebody could just as well have called it xchg ebx,ebx and it would have been just the same in 32-bit mode.

creshal · on Oct 23, 2015

> If no documentation ever called it that then it would be a non-issue.

XCHG EAX,target is defined as opcode (0x90 + offset of target register), with EAX having offset 0.

So, it was xchg eax,eax originally, and documented as such, before it was turned into NOP because it happened to be safe to do so.

It's still documented as "alias for the XCHG (E)AX, (E)AX instruction" in Intel's instruction set reference, and pre-486 embedded x86s still treat it as xchg.
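The encoding rule above can be sketched in a few lines. The register order is the standard x86 numbering for the 32-bit general-purpose registers; the function name `xchg_eax_opcode` is just an illustrative helper, not part of any assembler.

```python
# Standard x86 register numbers 0..7 for the 32-bit general-purpose registers.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def xchg_eax_opcode(target: str) -> int:
    """Return the one-byte opcode for 'xchg eax, <target>':
    0x90 plus the register number of the target."""
    return 0x90 + REG32.index(target)

for reg in REG32:
    print(f"xchg eax, {reg}: 0x{xchg_eax_opcode(reg):02x}")
```

With eax at offset 0, the first entry in the family lands on 0x90, the same byte that is documented as NOP.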

Asbostos · on Oct 23, 2015

Sure. That "If" kind of makes it moot. That's all I was trying to point out - that it's a documentation thing rather than something in the design of the chip and how it works.

Edit: What do you mean "still treat it as xchg"? On those chips, isn't there no distinction between xchg and what we might retrospectively call "nop"? Perhaps this is something I'm missing.

creshal · on Oct 23, 2015

> Edit: What do you mean "still treat it as xchg"? On those chips, isn't there no distinction between xchg and what we might retrospectively call "nop"? Perhaps this is something I'm missing.

XCHG EAX, EAX in its dumbest interpretation loads EAX into a temporary register, replaces it with the contents of EAX and restores the temporary data to… EAX. So, it is an operation that does nothing, but it does nothing in an elaborate way. You can skip it instead of executing it, but only if your other code doesn't depend on 0x90 taking exactly three clock cycles.

The "treat 0x90 as NOP and skip it instead of wasting three cycles" optimization was only done with the 486 and up, and wasn't retroactively applied to the 386 embedded versions. Doing so would have messed up their timings and would have needed a small design change, neither of which interested that customer base.

(The 386 and derivatives were still produced for embedded use for a long, long time, past 2001, and thus after the introduction of AMD64. When its instruction set was drafted, new 386-based embedded devices were still being designed.)
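As a rough illustration of the timing point, here is a toy cycle counter for a run of 0x90 bytes used as a delay. The three-cycle cost for the literal xchg is the figure quoted above; the one-cycle cost for the special-cased NOP is an assumption made only for this sketch, and `run_cycles` is a hypothetical helper.

```python
XCHG_CYCLES = 3  # cost of executing 0x90 as a literal xchg, as quoted above
NOP_CYCLES = 1   # ASSUMED cost once 0x90 is special-cased as NOP

def run_cycles(code: bytes, treat_0x90_as_nop: bool) -> int:
    """Count cycles for a byte string consisting only of 0x90 opcodes."""
    cycles = 0
    for byte in code:
        if byte != 0x90:
            raise ValueError(f"this toy model only knows 0x90, got 0x{byte:02x}")
        cycles += NOP_CYCLES if treat_0x90_as_nop else XCHG_CYCLES
    return cycles

delay_loop = bytes([0x90] * 10)  # ten 0x90 bytes used as a software delay

literal_cycles = run_cycles(delay_loop, treat_0x90_as_nop=False)
optimized_cycles = run_cycles(delay_loop, treat_0x90_as_nop=True)

print(literal_cycles, optimized_cycles)
```

Code that was tuned against the first number would run its delay too short on a core that behaves like the second, which is the kind of breakage the embedded parts avoided by not changing.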

0x0 · on Oct 23, 2015

It makes more sense if you look at the neighbors on the opcode map. It seems like the opcodes in the range 0x90-0x97 are all variants of xchg; the first just happens to have the same destination and source:

http://sparksandflames.com/files/x86InstructionChart.html