6502 “Illegal” Opcodes Demystified

A closer look at the “illegal” opcodes and undocumented instructions of the MOS 6502 MPU.

Title illustration: MOS 6502 MPU

The instruction table of the MOS 6502 MPU, designed by MOS Technology and introduced in 1975 (the CMOS version, the 65C02, was developed by Western Design Center), has some obvious gaps, with just 56 instructions documented in various address modes. These occupy 151 of the 256 possible opcodes, which leaves 105 undocumented slots, and the 6502 community has been eager to fill these gaps ever since.

Still, some mystery remains and some questions are unanswered. Were at least some of them intentional (especially since some of them are handy for block transfer, something the Z80 has dedicated instructions for), or are they all accidental? How do they behave, and why? Here, we’ll try to come up with some answers to these questions.

First, let's have a look at the instruction table, as it is commonly presented, with the blank gaps filled in. (Here, for the “illegal” opcodes, we use the mnemonics used by the DASM and ACME assemblers, with the exception of “USBC” for instruction code $EB, where these use plain “SBC”.)

MOS 6502 instruction table
Instruction set of the MOS 6502 MPU, “illegals” on grey background. — Open in a new tab.

And here are all the 21 (more or less) “illegal” opcodes (alternative names given in parentheses) as they are commonly described:

ALR (ASR)

AND oper + LSR

A AND oper, 0 -> [76543210] -> C

addressing    assembler     opc  bytes  cycles
immediate     ALR #oper     4B   2      2

ANC

AND oper + set C as ASL

A AND oper, bit(7) -> C

addressing    assembler     opc  bytes  cycles
immediate     ANC #oper     0B   2      2

ANC (ANC2)

AND oper + set C as ROL

effectively the same as instr. 0B

A AND oper, bit(7) -> C

addressing    assembler     opc  bytes  cycles
immediate     ANC #oper     2B   2      2

ANE (XAA)

* AND X + AND oper

Highly unstable, do not use.

A base value in A is determined based on the contents of A and a constant, which may typically be $00, $FF, $EE, etc. The value of this constant depends on temperature, the chip series, and maybe other factors as well.
In order to eliminate these uncertainties from the equation, use either 0 as the operand or a value of $FF in the accumulator.

(A OR CONST) AND X AND oper -> A

addressing    assembler     opc  bytes  cycles
immediate     ANE #oper     8B   2      2 ††

ARR

AND oper + ROR

This operation involves the adder:
V-flag is set according to (A AND oper) + oper
The carry is not set, but bit 7 (sign) is exchanged with the carry

A AND oper, C -> [76543210] -> C

addressing    assembler     opc  bytes  cycles
immediate     ARR #oper     6B   2      2

DCP (DCM)

DEC oper + CMP oper

M - 1 -> M, A - M

addressing    assembler     opc  bytes  cycles
zeropage      DCP oper      C7   2      5
zeropage,X    DCP oper,X    D7   2      6
absolute      DCP oper      CF   3      6
absolute,X    DCP oper,X    DF   3      7
absolute,Y    DCP oper,Y    DB   3      7
(indirect,X)  DCP (oper,X)  C3   2      8
(indirect),Y  DCP (oper),Y  D3   2      8

ISC (ISB, INS)

INC oper + SBC oper

M + 1 -> M, A - M - (1 - C) -> A

addressing    assembler     opc  bytes  cycles
zeropage      ISC oper      E7   2      5
zeropage,X    ISC oper,X    F7   2      6
absolute      ISC oper      EF   3      6
absolute,X    ISC oper,X    FF   3      7
absolute,Y    ISC oper,Y    FB   3      7
(indirect,X)  ISC (oper,X)  E3   2      8
(indirect),Y  ISC (oper),Y  F3   2      8

LAS (LAR)

LDA/TSX oper

M AND SP -> A, X, SP

addressing    assembler     opc  bytes  cycles
absolute,Y    LAS oper,Y    BB   3      4*

LAX

LDA oper + LDX oper

M -> A -> X

addressing    assembler     opc  bytes  cycles
zeropage      LAX oper      A7   2      3
zeropage,Y    LAX oper,Y    B7   2      4
absolute      LAX oper      AF   3      4
absolute,Y    LAX oper,Y    BF   3      4*
(indirect,X)  LAX (oper,X)  A3   2      6
(indirect),Y  LAX (oper),Y  B3   2      5*

LXA (LAX immediate)

Store * AND oper in A and X

Highly unstable, involves a 'magic' constant, see ANE

(A OR CONST) AND oper -> A -> X

addressing    assembler     opc  bytes  cycles
immediate     LXA #oper     AB   2      2 ††

RLA

ROL oper + AND oper

M = C <- [76543210] <- C, A AND M -> A

addressing    assembler     opc  bytes  cycles
zeropage      RLA oper      27   2      5
zeropage,X    RLA oper,X    37   2      6
absolute      RLA oper      2F   3      6
absolute,X    RLA oper,X    3F   3      7
absolute,Y    RLA oper,Y    3B   3      7
(indirect,X)  RLA (oper,X)  23   2      8
(indirect),Y  RLA (oper),Y  33   2      8

RRA

ROR oper + ADC oper

M = C -> [76543210] -> C, A + M + C -> A, C

addressing    assembler     opc  bytes  cycles
zeropage      RRA oper      67   2      5
zeropage,X    RRA oper,X    77   2      6
absolute      RRA oper      6F   3      6
absolute,X    RRA oper,X    7F   3      7
absolute,Y    RRA oper,Y    7B   3      7
(indirect,X)  RRA (oper,X)  63   2      8
(indirect),Y  RRA (oper),Y  73   2      8

SAX (AXS, AAX)

A and X are put on the bus at the same time (resulting effectively in an AND operation) and stored in M

A AND X -> M

addressing    assembler     opc  bytes  cycles
zeropage      SAX oper      87   2      3
zeropage,Y    SAX oper,Y    97   2      4
absolute      SAX oper      8F   3      4
(indirect,X)  SAX (oper,X)  83   2      6

SBX (AXS, SAX)

CMP and DEX at once, sets flags like CMP

(A AND X) - oper -> X

addressing    assembler     opc  bytes  cycles
immediate     SBX #oper     CB   2      2

SHA (AHX, AXA)

Stores A AND X AND (high-byte of addr. + 1) at addr.

unstable: sometimes 'AND (H+1)' is dropped, page boundary crossings may not work (with the high-byte of the value used as the high-byte of the address)

A AND X AND (H+1) -> M

addressing    assembler     opc  bytes  cycles
absolute,Y    SHA oper,Y    9F   3      5 †
(indirect),Y  SHA (oper),Y  93   2      6 †

SHX (A11, SXA, XAS)

Stores X AND (high-byte of addr. + 1) at addr.

unstable: sometimes 'AND (H+1)' is dropped, page boundary crossings may not work (with the high-byte of the value used as the high-byte of the address)

X AND (H+1) -> M

addressing    assembler     opc  bytes  cycles
absolute,Y    SHX oper,Y    9E   3      5 †

SHY (A11, SYA, SAY)

Stores Y AND (high-byte of addr. + 1) at addr.

unstable: sometimes 'AND (H+1)' is dropped, page boundary crossings may not work (with the high-byte of the value used as the high-byte of the address)

Y AND (H+1) -> M

addressing    assembler     opc  bytes  cycles
absolute,X    SHY oper,X    9C   3      5 †

SLO (ASO)

ASL oper + ORA oper

M = C <- [76543210] <- 0, A OR M -> A

addressing    assembler     opc  bytes  cycles
zeropage      SLO oper      07   2      5
zeropage,X    SLO oper,X    17   2      6
absolute      SLO oper      0F   3      6
absolute,X    SLO oper,X    1F   3      7
absolute,Y    SLO oper,Y    1B   3      7
(indirect,X)  SLO (oper,X)  03   2      8
(indirect),Y  SLO (oper),Y  13   2      8

SRE (LSE)

LSR oper + EOR oper

M = 0 -> [76543210] -> C, A EOR M -> A

addressing    assembler     opc  bytes  cycles
zeropage      SRE oper      47   2      5
zeropage,X    SRE oper,X    57   2      6
absolute      SRE oper      4F   3      6
absolute,X    SRE oper,X    5F   3      7
absolute,Y    SRE oper,Y    5B   3      7
(indirect,X)  SRE (oper,X)  43   2      8
(indirect),Y  SRE (oper),Y  53   2      8

TAS (XAS, SHS)

Puts A AND X in SP and stores A AND X AND (high-byte of addr. + 1) at addr.

unstable: sometimes 'AND (H+1)' is dropped, page boundary crossings may not work (with the high-byte of the value used as the high-byte of the address)

A AND X -> SP, A AND X AND (H+1) -> M

addressing    assembler     opc  bytes  cycles
absolute,Y    TAS oper,Y    9B   3      5 †

USBC (SBC)

SBC oper + NOP

effectively same as normal SBC immediate, instr. E9.

A - M - (1 - C) -> A

addressing    assembler     opc  bytes  cycles
immediate     USBC #oper    EB   2      2

NOPs (including DOP, TOP)

Instructions resulting in 'no operation' in various address modes. Operands are ignored.

opc  addressing  bytes  cycles
1A   implied     1      2
3A   implied     1      2
5A   implied     1      2
7A   implied     1      2
DA   implied     1      2
FA   implied     1      2
80   immediate   2      2
82   immediate   2      2
89   immediate   2      2
C2   immediate   2      2
E2   immediate   2      2
04   zeropage    2      3
44   zeropage    2      3
64   zeropage    2      3
14   zeropage,X  2      4
34   zeropage,X  2      4
54   zeropage,X  2      4
74   zeropage,X  2      4
D4   zeropage,X  2      4
F4   zeropage,X  2      4
0C   absolute    3      4
1C   absolute,X  3      4*
3C   absolute,X  3      4*
5C   absolute,X  3      4*
7C   absolute,X  3      4*
DC   absolute,X  3      4*
FC   absolute,X  3      4*

JAM (KIL, HLT)

These instructions freeze the CPU.

The processor will be trapped infinitely in T1 phase with $FF on the data bus. — Reset required.

Instruction codes: 02, 12, 22, 32, 42, 52, 62, 72, 92, B2, D2, F2

Legend to markers used in the instruction details:

*    add 1 to cycles if page boundary is crossed
†    unstable
††   highly unstable

Disclaimer:
Information is provided as-is, without any guarantee of completeness or correctness.
None of these “illegal” instructions are guaranteed to work. Some are highly unstable, and some may even start two asynchronous threads competing in a race condition, with the winner determined by such minuscule factors as temperature or minor differences in the production series. At other times, the outcome depends on the exact values involved and the chip series.
Use with care and at your own risk.

Well, this is all fine and good, but… we really do not learn much about what they are and why they are.
Let’s risk another look at the instruction layout, as it ought to be viewed.

Another Look at the Instruction Layout

The 6502 instruction table is laid out according to a pattern a-b-c, where a and b are octal numbers, followed by a group of two binary digits c, as in the bit-vector “aaabbbcc”.

        aaa     bbb     cc
bit     765     432     10
        (0…7)   (0…7)   (0…3)

Example:
All ROR instructions share a = 3 and c = 2 (3b2) with the address mode in b.
At the same time, all instructions addressing the zero-page share b = 1 (a1c).

abc = 312  =>  ( 3 << 5 | 1 << 2 | 2 )  =  %011.001.10  =  $66  “ROR zpg”.
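The a-b-c decomposition above can be sketched in a few lines of Python. This is only an illustration of the bit arithmetic; the function names encode and decode are made up for this example and are not part of any assembler or emulator.

```python
def encode(a, b, c):
    """Pack the fields a (0..7), b (0..7) and c (0..3) into the byte aaabbbcc."""
    return (a << 5) | (b << 2) | c


def decode(opcode):
    """Split an opcode byte into its (a, b, c) fields."""
    return (opcode >> 5) & 0b111, (opcode >> 2) & 0b111, opcode & 0b11


# The example from the text: abc = 312 is "ROR zpg".
print(f"${encode(3, 1, 2):02X}")
print(decode(0x66))
```

Since the three fields cover all eight bits without overlap, decode is the exact inverse of encode for every opcode from $00 to $FF.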

If we arrange the instruction table by components c, a and b, we find them all neatly lined up per address mode in the vertical columns (with the notable exception of instructions related to the X register, which show up with their respective Y counterpart for address modes involving an index by X). Notably, all the “illegals” adhere strictly to this scheme.

Moreover, all the instructions internal to the CPU and its flow of control are listed in the top quarter at c=0, while the bottom quarter at c=3, where we find the majority of “illegal” opcodes, is completely unpopulated by official opcodes. Further, for the sections where c=1 or c=2, we see opcodes of a kind sharing the same row (with the notable outliers of the two stack transfer instructions “TXS” and “TSX”).

MOS 6502 instruction layout
Instruction layout of the MOS 6502 MPU, “illegals” on grey background. — Open in a new tab.

While this is certainly informative, it still doesn’t give away a systemic aspect of the unimplemented instructions, nor does this view tell us what they really are.

So let’s give this another try, this time arranging the instruction layout by components a, c and b:

MOS 6502 instruction table, structured view
Structured view of the 6502 instruction layout, “illegals” on grey background. — Open in a new tab.

Well, this is better, much better.

NOPs

First, we learn what the additional NOPs really are. By comparing opcodes by row and address modes by column, we can clearly see what these ought to be.

E.g.,

$80 (a=4, c=0, b=0) is clearly “STY immediate”, attempting to store the contents of the Y register in the literal operand.

Generally speaking, these additional NOPs are instructions with non-functional or nonsensical address modes, which do execute, but without any external effects.

JAMs

However, instructions of this group which involve indirect addressing fail entirely with the CPU infinitely trapped in T1 phase, resulting in a “JAM” (or KIL), rendering the CPU unresponsive and requiring a reset.
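As a quick check, the JAM opcodes from the list given earlier can be run through the same a-b-c decomposition. The sketch below (with a hypothetical helper named decode) only restates the opcode list from this article and shows where these codes sit on the layout: all of them are at c=2, and all of them are at b=0 or b=4, the two columns that hold the (indirect,X) and (indirect),Y address modes at c=1.

```python
# JAM opcodes as listed in the article.
JAM_OPCODES = [0x02, 0x12, 0x22, 0x32, 0x42, 0x52,
               0x62, 0x72, 0x92, 0xB2, 0xD2, 0xF2]


def decode(opcode):
    """Split an opcode byte aaabbbcc into its (a, b, c) fields."""
    return (opcode >> 5) & 0b111, (opcode >> 2) & 0b111, opcode & 0b11


jam_fields = [decode(op) for op in JAM_OPCODES]
for op, (a, b, c) in zip(JAM_OPCODES, jam_fields):
    print(f"${op:02X}: a={a}, c={c}, b={b}")

# Summary of where the JAMs sit on the layout.
all_at_c2 = all(c == 2 for _, _, c in jam_fields)
b_columns = sorted({b for _, b, _ in jam_fields})
print(all_at_c2, b_columns)
```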

Instructions at ‘C = 3’

This is the really interesting part, the meat of the “illegal opcodes”.

Generally, we may observe that any of the instructions at c=3 are really inheriting their behavior from those at c=1 and c=2 in the same slot, found in the rows immediately above, same column, using the address mode of the instruction at c=1. Mind that in binary 3 is the composite of 1 and 2 with bits 0 and 1 set.

In other words, any instruction xxxxxx11 will execute the instructions at xxxxxx01 and xxxxxx10 at once, using the address mode of the instruction at xxxxxx01. (However, the general rule regarding X and Y register specific indexed address modes still applies.)

E.g.,

“SAX abs” ($8F, a=4,c=3,b=3) is the composite of
“STA abs” ($8D, a=4,c=1,b=3) and
“STX abs” ($8E, a=4,c=2,b=3).

E.g.,

“LAX X,ind” ($A3, a=5,c=3,b=0) is the composite of
“LDA X,ind” ($A1, a=5,c=1,b=0) and
“LDX imm” ($A2, a=5,c=2,b=0).
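The rule "xxxxxx11 combines xxxxxx01 and xxxxxx10" is plain bit masking, which the following sketch illustrates. The function name components is an assumption made for this example; it only computes the two opcode numbers and says nothing about what the hardware does with them.

```python
def components(opcode):
    """Return the c=1 and c=2 opcodes sharing the slot of a c=3 opcode."""
    if opcode & 0b11 != 0b11:
        raise ValueError("not an opcode at c=3")
    base = opcode & 0b11111100      # keep a and b, clear c
    return base | 0b01, base | 0b10


# The two examples from the text.
for op in (0x8F, 0xA3):
    c1, c2 = components(op)
    print(f"${op:02X} = ${c1:02X} + ${c2:02X}")
```

Applied to $8F this yields $8D and $8E ("STA abs" and "STX abs"), and for $A3 it yields $A1 and $A2, matching the two examples.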

The “Magic” Constant

Let’s have a closer look at the two highly unstable instructions “ANE” (XAA) and “LXA” (LAX immediate) involving a “magic constant” — typically $00, $FF, $EE, etc. —, which are both combinations of an accumulator operation and an inter-register transfer between the accumulator and the X register:

$8B (a=4,c=3,b=2): ANE imm = STA imm (NOP) + TXA
                   (A OR CONST) AND X AND oper -> A

$AB (a=5,c=3,b=2): LXA imm = LDA imm + TAX
                   (A OR CONST) AND oper -> A -> X

In the case of “ANE”, the contents of the accumulator is put on the internal data lines at the same time as the contents of the X-register, while there's also the operand read for the immediate operation, with the result transferred to the accumulator.

In the case of “LXA”, the immediate operand and the contents of the accumulator are competing for the input lines, while the result will be transferred to both the accumulator and the X register.

The outcome of these competing, noisy conditions depends on the production series of the chip, and maybe even on environmental conditions. This results in an OR-ing of the accumulator with the “magic constant” combined with an AND-ing of the competing inputs. The final transfer to the target register(s) then seems to work as may be expected.
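To see why an operand of 0 or an accumulator value of $FF removes the uncertainty, the commonly given formulas can be modelled directly. The following is a sketch of those formulas only, not of the hardware; the function names and the chosen test values are assumptions for illustration, and the three constants are simply the example values mentioned in the text.

```python
def ane(a, x, oper, const):
    """ANE #oper as commonly described: (A OR CONST) AND X AND oper -> A."""
    return (a | const) & x & oper & 0xFF


def lxa(a, oper, const):
    """LXA #oper as commonly described: (A OR CONST) AND oper -> A -> X."""
    return (a | const) & oper & 0xFF


CONSTANTS = (0x00, 0xEE, 0xFF)  # example values named in the text

# With arbitrary inputs, the result varies with the constant.
print({f"${c:02X}": f"${ane(0x0F, 0xFF, 0xFF, c):02X}" for c in CONSTANTS})

# With A = $FF, (A OR CONST) is always $FF, so the constant drops out.
print({ane(0xFF, 0x3C, 0xF0, c) for c in CONSTANTS})

# With an operand of 0, the result is always 0.
print({ane(0x5A, 0x3C, 0x00, c) for c in CONSTANTS})
```

In both stabilized cases the set of results collapses to a single value, whatever the constant happens to be.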

(We may note that all the instructions involved in these two opcodes complete in 2 cycles, the shortest sequence available on the 6502, meaning, everything is virtually happening “at once”.)

This AND-ing of competing output values suggests that the 6502 is working internally in active low logic, where all data lines are first set to high and then cleared for any zero bits. This also suggests that the “magic constant” stands merely for a partial transfer of the contents of the accumulator.

(Mind that this is not a qualified statement about the internals of the 6502 hardware, but merely an observation on its external effects.)

Much of this also applies to “TAS” (XAS, SHS), $9B, but here the extra cycles for indexed addressing seem to contribute to the conflict being resolved without this “magic constant”. However, “TAS” is still unstable.

The ‘H+1’ Group

There are four instructions, which add the peculiar term ‘high-byte of provided address + 1’ to the equation. These are:

SHA (AHX, AXA)       A AND X AND (H+1) -> M
                     $9F  SHA abs,Y  (5)

SHX (A11, SXA, XAS)  X AND (H+1) -> M
                     $9E  SHX abs,Y  (5)

SHY (A11, SYA, SAY)  Y AND (H+1) -> M
                     $9C  SHY abs,X  (5)

TAS (XAS, SHS)       A AND X -> SP, A AND X AND (H+1) -> M
                     $9B  TAS abs,Y  (5)

We may already see where this comes from: as the calculation of the effective address involves the ALU, a partial result for the high-byte adds to the conflicting output values. However, depending on minor timing discrepancies, this term may also be dropped (meaning, it becomes overridden).
We may also discern why the effective high-address may be replaced by the output value altogether in case a page boundary is crossed, since this provides just the extra amount of timing required to allow the output value to stabilize and to override the address high-byte. Again, these instructions are unstable.
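For the well-behaved case, the stored values of the ‘H+1’ group can be written down as simple expressions. The sketch below models only the formulas from the table above; it deliberately ignores the unstable cases (the dropped term and page boundary crossings). All function names are hypothetical, addr stands for the address provided as the operand, and the 8-bit wrap-around of H+1 for a high-byte of $FF is an assumption.

```python
def high_plus_1(addr):
    """High-byte of the provided address plus 1 (assumed to wrap at 8 bits)."""
    return ((addr >> 8) + 1) & 0xFF


def sha(a, x, addr):
    """SHA: A AND X AND (H+1) -> M"""
    return a & x & high_plus_1(addr)


def shx(x, addr):
    """SHX: X AND (H+1) -> M"""
    return x & high_plus_1(addr)


def shy(y, addr):
    """SHY: Y AND (H+1) -> M"""
    return y & high_plus_1(addr)


def tas(a, x, addr):
    """TAS: returns (new SP, value stored in M)."""
    sp = a & x
    return sp, sp & high_plus_1(addr)


print(f"${shx(0xFF, 0x12F0):02X}")  # X = $FF leaves just H+1
print(tas(0xF0, 0x3C, 0x2000))
```

With the register set to $FF, the stored value is just H+1, which is a handy way to observe the extra term in isolation.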

The Outliers

We may note that “SHY” and “SHX” are not part of the c=3 group, but rather the unimplemented instructions “STY abs,X” (c=0) and “STX abs,Y” (c=2) respectively. Both are apparently falling back to the implementation of “STA abs,X” with the extra quirk of the ‘H+1’ term.

“SHA abs,Y”, finally, is the composite instruction adhering to the c=3 rule that we have already established, executing “STA abs,X” and “SHX abs,Y” at once. (Notably, this flips the address mode to “abs,Y”, where “abs,X” may be expected. Which suggests that this adjustment for indexed instructions concerning any X register transfers is implemented as an additional stage.)

“SHA ind,Y” ($93), however, is the composite of “STA ind,Y” ($91) and “SHX ind,Y” ($92), which JAMs on its own.

It’s a rather interesting decision by the MOS Technology design team around Chuck Peddle to not implement the instructions “STX abs,Y” and “STY abs,X”, while the instruction decoding would have easily provided for this.
Was this just to keep the instruction set simple by arbitrarily limiting what could be done with the X and Y registers? Or was there a more serious conflict, like with the mechanism identifying flag operations, which may be responsible for the slots at (c=0/b=5) and (c=0/b=7) typically found empty, thus making the implementation of “STY abs,X” rather expensive, for which “STX abs,Y” was dropped, as well? — We may presume it might be about the access of the X and Y registers using the same internal data lanes, but this is contradicted by the very existence of “SHX” and “SHY”, which successfully access both registers, while at subsequent stages.
It could be simply a carry-over from the Motorola 6800 processor — from which the 6502 originated as a simplified, cost-reduced version —, which had just a single index register and thus lacked such an option anyway.

Mysterious NOPs

As mentioned earlier, we are able to figure out what most of the NOPs and JAM instructions are, just from their disposition on the layout. But there is a group of 12 NOP instructions (all at c=0 and a≤3 and odd values of b), which seem to be truly empty slots. Namely these are the instructions at:

$04 (a=0, c=0, b=1)
$0C (a=0, c=0, b=3)
$14 (a=0, c=0, b=5)
$1C (a=0, c=0, b=7)
$34 (a=1, c=0, b=5)
$3C (a=1, c=0, b=7)
$44 (a=2, c=0, b=1)
$54 (a=2, c=0, b=5)
$5C (a=2, c=0, b=7)
$64 (a=3, c=0, b=1)
$74 (a=3, c=0, b=5)
$7C (a=3, c=0, b=7)
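The list above can be reproduced mechanically: take every slot at c=0 with a between 0 and 3 and an odd b, then remove the four slots that hold documented instructions. A minimal sketch follows; the names DOCUMENTED, slots and empty_slots are made up for this illustration.

```python
# Documented opcodes that fall into the region c=0, a<=3, odd b.
DOCUMENTED = {
    0x24: "BIT zpg",
    0x2C: "BIT abs",
    0x4C: "JMP abs",
    0x6C: "JMP (ind)",
}

# All slots at c=0 with a in 0..3 and b in 1, 3, 5, 7.
slots = [(a << 5) | (b << 2) for a in range(4) for b in (1, 3, 5, 7)]

# Whatever is not documented is one of the "mysterious" NOPs.
empty_slots = [op for op in slots if op not in DOCUMENTED]

print(len(empty_slots))
print(" ".join(f"${op:02X}" for op in empty_slots))
```

Of the 16 slots in this region, 4 are taken by “BIT” and “JMP”, which leaves the 12 opcodes listed.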

From their very position on the instruction layout, we may infer that these should be instructions internal to the CPU. Typically, instructions at (a=0/c=0) have a counterpart at (a=1/c=0) in the respective b position, as is also true for (a=2/c=0) and (a=3/c=0). E.g., PHP & PLP, BPL & BMI, CLC & SEC, and so on.
Here, however, the counterparts are missing, as well. (Only $04 and $0C have a counterpart in “BIT”, but we may have a hard time figuring out, what the counterpart of “BIT” may actually be.) For all we know, these instructions are simply unimplemented, and it’s a small wonder that the timing sequence for these instructions does resolve without a JAM. But these instructions are still interesting, as they direct our attention towards how the internal instructions which are implemented are systematically arranged on the decoding matrix.

The same pattern, BTW, may be observed for most instructions, so that we may think of even and consecutive odd values of a and same values for c and b as “opposing” or “complementary” slots, where we find in one slot the store instruction for a given register and in the other one the load instruction, both in the address mode defined by b, or a shift in one direction and the opposing shift in the other direction.

(U)SBC

Here, we also find an answer to the nagging question why the instruction for subtraction, “SBC”, isn’t found among the other ALU instructions, e.g., near its reverse operation “ADC”, but amid the compare instructions. Now, “CMP” is simply the same as “SBC”, but without the final transfer of the result to the accumulator. So it makes sense to have “SBC abs” as the counterpart or complementary instruction to “CMP abs”, etc., once with the final transfer, and once without. However, unlike addition, compare instructions address various registers and not just the accumulator, thus claiming a considerable section of the instruction table. Therefore we find the arithmetic instruction “SBC”, $E9, “displaced” among the register instructions, rather than the other way round.
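The kinship of “CMP” and “SBC” can be illustrated with a small model of binary-mode subtraction. This is a simplified sketch under stated assumptions: decimal mode and the overflow flag are left out, the carry input is fixed to 1 (no borrow) for the comparison, and the function names are chosen for this example only.

```python
def sbc(a, m, carry):
    """Binary-mode SBC: A - M - (1 - C). Returns (result, carry_out)."""
    total = a + (m ^ 0xFF) + carry   # subtraction as addition of the complement
    return total & 0xFF, total >> 8


def cmp_(a, m):
    """CMP: the same subtraction with C = 1, but the result is discarded.

    Returns the flags (C, Z, N) only; the accumulator is left untouched.
    """
    result, carry = sbc(a, m, 1)
    return carry, int(result == 0), result >> 7


print(sbc(0x50, 0x10, 1))   # result goes to A
print(cmp_(0x50, 0x10))     # flags only
print(cmp_(0x10, 0x20))     # A < M clears the carry
```

The only difference between the two functions is what happens to the result byte, which is the "final transfer" the text refers to.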

Which provides SBC’s second incarnation, in combination with the official “NOP” instruction at $EA (which is the nonsensical instruction “INC impl”), “USBC”, $EB.

Mind that the instructions at a = 6 and a = 7, occupying the bottom part of our third table, typically involve ALU operations in combination with various registers and/or memory locations.

Conclusions

What we have observed here is really a text-book example of undefined behavior for undefined input patterns. For any instruction with the two least significant bits set at once (c=3) the two instructions in the respective slot with c=1 and c=2 are started in parallel, asynchronous threads with competing output values AND-ed. Minor implementation details and environmental factors may contribute to the outcome of some of these instructions and how the timing eventually stabilizes.

Notably, there are no NOPs or jamming instructions at c=3, meaning, it doesn't matter if either of the two threads JAMs, as long as the timing for one of them resolves successfully (thus advancing the internal phase).

At c=0, c=1 and c=2 we find either undocumented instructions with ineffective address modes, or undocumented instructions that fail entirely over unresolved timing issues, resulting in a “JAM”. There are just two exceptions to this rule, namely “SHY” and “SHX”, which, while unstable, may be somewhat usable.

So is any of this intentional? Hardly. It’s just undefined behavior. Orderly chaos as provided by the decoding matrix. However, we may learn some from this about the internals of 6502 and its various close cousins. — Which is at least some.

Mind that there is much more competent commentary on the 6502, which is based on analysis of the actual hardware, especially at visual6502.org. But, maybe, you found this “hermeneutic” approach, trying to reveal the systematic aspects of what may be observed externally, interesting, as well.

PS: All the tables in this post are SVG images. You may download and use them (mind the “open in a new tab” links), but please give reference to https://www.masswerk.at/6502/6502_instruction_set.html, where you can find the original tables.