Adventures in 6502 Land, Part 1: Concepts

I've been buried in 6502 assembly code while working on Zeldomizer, a Legend of Zelda NES ROM manipulation library. As I go along and discover more about the inner workings of the game, I aim to describe the clever, processor-specific techniques it uses.

Rap Sheet

The MOS 6502 processor was used in quite a few game and home computer systems from the late 70s all the way through the late 80s:

  • The Commodore 64 uses a variant called the MOS 6510.
  • The Nintendo Entertainment System uses a specialized version of the 6502 manufactured by Ricoh, called the 2A03 in NTSC regions (such as the United States) and the 2A07 in PAL regions (such as Britain).
  • The Atari 2600 uses a cost-reduced version called the 6507 which removes a bunch of address pins and external signals. Its successors, the 5200 and 7800, also use variants of this processor.
  • The PC Engine / TurboGrafx-16 used a modified version developed by Hudson called the HuC6280.

Even the Super Nintendo, a game console released in the early 90s, sported a processor in the 6502 family: the beefy 16-bit version called the 65816. This made the move easier for people who had been developing games for the previous Nintendo console, and made the system more attractive to those in the home computer crack/demo scenes who wanted to join in on the fun.

If you'd like to learn more about the history of the 6502 processor and MOS Technology, check out this article at Commodore.ca.

Right, let's get down to what makes this processor tick.

Registers

The 6502 processor has a few registers that can be accessed directly by a number of opcodes: the accumulator register a, and two index registers x and y.

The 6502 does not provide direct access to some registers. The program counter, or pc, is one of these. You can think of it as a pointer to the memory location where the next instruction will be read from.

In the 6502, the stack is one page, or block of memory that is 256 bytes large. This isn't a whole lot of memory compared to what is needed for running today's applications, but it was quite enough when the processor saw wide adoption in new electronics. The 6502 processor uses the second page of memory to keep track of the stack: the $0100 region.

You can push the contents of the a register on top of the stack with the pha instruction, which will also move the stack pointer backwards after saving the data. You can pop the contents from the top of the stack right back into the a register with the pla instruction, which will first move the stack pointer and then retrieve the data. That's how data gets on and off the stack, but what about addresses?

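When the code jumps to a routine using the jsr instruction, the intention is usually to return to where it left off once the routine is done. To make that possible, jsr pushes the return address onto the stack as two bytes. (Strictly speaking, it pushes the address of the last byte of the jsr instruction itself, and rts adds one after pulling it back off.)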
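To make the two-byte push concrete, here's a rough model in Python (not 6502 code) of what jsr and rts do to the stack. The function names, the mem list, and the starting stack pointer value of $ff are all made up for illustration; s stands for the stack pointer, and the stack is assumed to live at $0100 as described above.

```python
# Toy model of jsr/rts. All names here are illustrative, not from a real library.
STACK_BASE = 0x0100

def jsr(mem, s, pc, target):
    """pc is the address of the jsr opcode. Returns (new_s, new_pc)."""
    ret = (pc + 2) & 0xFFFF            # address of the last byte of the 3-byte jsr
    mem[STACK_BASE + s] = ret >> 8     # the high byte is pushed first
    s = (s - 1) & 0xFF
    mem[STACK_BASE + s] = ret & 0xFF   # then the low byte
    s = (s - 1) & 0xFF
    return s, target

def rts(mem, s):
    """Pull the saved address and resume at the byte after it."""
    s = (s + 1) & 0xFF
    lo = mem[STACK_BASE + s]
    s = (s + 1) & 0xFF
    hi = mem[STACK_BASE + s]
    return s, (((hi << 8) | lo) + 1) & 0xFFFF

mem = [0] * 0x10000
s, pc = jsr(mem, 0xFF, 0x8000, 0x9000)   # s == 0xFD, pc == 0x9000
s, pc = rts(mem, s)                      # s == 0xFF, pc == 0x8003
```

After the jsr, $01ff holds $80 and $01fe holds $02; rts turns that back into $8003, the instruction right after the jsr.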

The 6502 contains an 8-bit stack pointer. This is how the processor knows where in memory it needs to push or pop data from. It's a relative value; it will always reference memory at $0100 + s.

In order to get access to the stack pointer, we need to use the tsx instruction (Transfer S to X). All it does is take the value that is in the s register and copy it to the x register. It's the only way we can programmatically access the stack pointer. We can also go the other direction with the txs instruction (Transfer X to S).
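Here's a small Python sketch (again, a model rather than real 6502 code) of how pha, pla, tsx and txs interact with the stack pointer. The class and its method names are invented for this example, and starting s at $ff is only an assumption; on real hardware the program sets it up itself with txs.

```python
# Toy model of the 6502 stack. Names are illustrative only.
STACK_BASE = 0x0100

class Stack6502:
    def __init__(self):
        self.mem = [0] * 0x10000
        self.s = 0xFF  # assumed starting value; software normally sets this with txs

    def pha(self, a):
        self.mem[STACK_BASE + self.s] = a & 0xFF  # store the byte first...
        self.s = (self.s - 1) & 0xFF              # ...then move the pointer down

    def pla(self):
        self.s = (self.s + 1) & 0xFF              # move the pointer up first...
        return self.mem[STACK_BASE + self.s]      # ...then read the byte

    def tsx(self):
        return self.s        # copy s into x

    def txs(self, x):
        self.s = x & 0xFF    # copy x into s

cpu = Stack6502()
cpu.pha(0x42)
x = cpu.tsx()      # 0xFE: s moved down by one
value = cpu.pla()  # 0x42, and s is back at 0xFF
```

Because s is only 8 bits wide, it wraps around inside the page: pushing with s at $00 writes to $0100 and leaves s at $ff.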

Flags

The 6502 processor has a number of different flags, or bits which are turned on or off based on the outcome of particular instructions. For example, the adc instruction adds a number to the a register, and if you want to know if the result was larger than the register can hold, you can just check the c flag, which stands for Carry.

Here are the flags in the 6502 processor:

  • n - Negative
    • If the result of an operation has bit 7 set, this flag is set. The Negative name comes from signed integer math: negative numbers always have the highest bit set.
  • v - Overflow
    • Set if the result of an operation causes signed overflow. This flag doesn't get much use, but if you're really curious, check the link.
  • b - Break
    • If an interrupt was caused by a brk instruction, this flag will be set. It can't be tested with a branch instruction; an interrupt handler needs to read the saved flags directly from the stack to retrieve the value. Not really useful unless you're writing a debugger.
  • d - Decimal Mode
    • This flag affects the adc and sbc instructions and causes arithmetic to process numbers as if they were Binary Coded Decimal. The NES has this functionality removed.
  • i - Interrupt Disable
    • If set, this flag disables the maskable interrupt.
  • z - Zero Result
    • If set, the outcome of the previous operation is zero.
  • c - Carry
    • Used in a number of different arithmetic operations. Often, this will be treated as a 9th bit.

Some of these can be manipulated directly. For example, the Carry flag can be set with sec and cleared with clc, but there's no dedicated instruction to set or clear the Zero flag; it only changes as a side effect of other instructions.
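To see how the c, z, n and v flags fall out of an addition, here's a Python model of adc in binary (non-decimal) mode. The function name and the dictionary of flags are just conveniences for this sketch, not anything the processor or an assembler defines.

```python
def adc(a, operand, carry_in):
    """Model of adc in binary mode. Returns (result, flags)."""
    total = a + operand + carry_in
    result = total & 0xFF
    flags = {
        "c": int(total > 0xFF),      # the "9th bit" of the sum
        "z": int(result == 0),
        "n": (result >> 7) & 1,      # bit 7 of the result
        # signed overflow: both inputs have the same sign, the result has the other
        "v": int(((a ^ result) & (operand ^ result) & 0x80) != 0),
    }
    return result, flags

result, flags = adc(0xF0, 0x20, 0)  # $110 doesn't fit in 8 bits
# result == 0x10, c == 1, z == 0, n == 0, v == 0

result, flags = adc(0x50, 0x50, 0)  # 80 + 80 = 160, too big for a signed byte
# result == 0xA0, c == 0, z == 0, n == 1, v == 1
```

The second call shows why v and c are separate flags: the unsigned sum fits in a byte, but read as signed numbers, two positives have produced a negative.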

Addressing

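In order to specify which address to perform an operation on, there are a number of addressing modes which can be used. In the examples below, $ marks a hexadecimal number; the assembler treats an operand as a memory address unless it is prefixed with #, which marks an immediate value.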

There are many kinds of addressing, but they typically belong to three groups of addressing modes:

  • Absolute addressing always refers to a specific memory address.
    • lda $1234
    • sta $3456
  • Indexed addressing will take a memory address and add the contents of either the x or y register to get the final destination.
    • lda $1234,x
    • sta $3456,y
  • Indirect addressing will take an address, read a new address from it, then process the value from the resulting address. Interrupt vectors work this way.
    • jmp ($1357)
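The three groups above boil down to three ways of computing a final address. Here's a simplified Python sketch of that calculation; the function names are invented for the example, and it ignores hardware quirks (for instance, the page-wrap behavior of jmp ($xxff) on the original chip).

```python
def read16(mem, addr):
    """Addresses are stored low byte first."""
    return mem[addr] | (mem[(addr + 1) & 0xFFFF] << 8)

def absolute(addr):
    return addr                       # lda $1234 reads from $1234

def indexed(addr, index):
    return (addr + index) & 0xFFFF    # lda $1234,x reads from $1234 + x

def indirect(mem, addr):
    return read16(mem, addr)          # jmp ($1357) jumps to the address stored at $1357

mem = [0] * 0x10000
mem[0x1357], mem[0x1358] = 0x00, 0x80  # the bytes $00 $80 encode the address $8000
indexed(0x1234, 0x05)                  # $1239
indirect(mem, 0x1357)                  # $8000
```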

The processor also has additional addressing modes that are either variations or combinations of the above:

  • Zero page addressing is a short form which is faster and references only the addresses in the very first page.
    • lda $12
    • sta $34
    • inc $56,x (Indexed addressing using the x register works on the zero page too)
  • Indexed indirect and Indirect indexed are combinations of addressing modes we've covered. These are specific to the x and y registers as shown below - these index registers are not interchangeable with these two addressing modes.
    • lda ($12),y - get the 16-bit value at $12 and add y to it for the final address
    • sta ($34,x) - get the 16-bit value at $34 + x for the final address
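The difference between the last two modes is just the order of "add the index" and "fetch the pointer". A Python sketch, with made-up function names and sample memory contents chosen for illustration:

```python
def read16_zp(mem, zp_addr):
    """Read a 16-bit pointer from the zero page; the second byte stays within the page."""
    lo = mem[zp_addr & 0xFF]
    hi = mem[(zp_addr + 1) & 0xFF]
    return lo | (hi << 8)

def indirect_indexed(mem, zp_addr, y):
    """lda ($12),y : fetch the pointer first, then add y to it."""
    return (read16_zp(mem, zp_addr) + y) & 0xFFFF

def indexed_indirect(mem, zp_addr, x):
    """sta ($34,x) : add x to the zero page address first, then fetch the pointer."""
    return read16_zp(mem, (zp_addr + x) & 0xFF)

mem = [0] * 0x10000
mem[0x12], mem[0x13] = 0x00, 0x20   # pointer to $2000 stored at $12
mem[0x38], mem[0x39] = 0x34, 0x12   # pointer to $1234 stored at $38
indirect_indexed(mem, 0x12, 0x05)   # $2005
indexed_indirect(mem, 0x34, 0x04)   # $1234
```

In practice, the y form is handy for walking through a block of data via a pointer, while the x form picks one pointer out of a table of pointers in the zero page.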

Not all addressing modes are valid for every operation. If you're curious how the instructions and addressing modes can be combined, a group named Oxyron created an opcode matrix which contains every possible combination, plus information about undocumented instructions.

Moving Data

The 6502 is designed such that you can't move memory directly to other memory. You have to load it into the processor first (typically to the a register) and then store it somewhere else.

You can load data directly into the a register. The x and y registers can also be loaded directly, but they support fewer addressing modes than a; for example, neither ldx nor ldy can use the indirect modes.

  • Load: lda, ldx, ldy
  • Store: sta, stx, sty
  • Stack: pha, pla, php, plp, tsx, txs
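Since every byte has to pass through a register, copying a block of memory ends up as a load/store loop. Here's a Python model of that pattern, with the rough 6502 equivalent of each step in the comments. The function name and the addresses are only examples.

```python
def copy_block(mem, src, dst, count):
    """Copy count bytes (1 to 256) from src to dst, one byte at a time through a."""
    x = 0                                # ldx #0
    while True:
        a = mem[(src + x) & 0xFFFF]      # lda src,x
        mem[(dst + x) & 0xFFFF] = a      # sta dst,x
        x = (x + 1) & 0xFF               # inx (x is only 8 bits wide)
        if x == (count & 0xFF):          # cpx #count, then bne back to the top
            break

mem = [0] * 0x10000
mem[0x1234:0x1238] = [0xDE, 0xAD, 0xBE, 0xEF]
copy_block(mem, 0x1234, 0x3456, 4)
# mem[0x3456:0x345A] now holds the same four bytes
```

The 8-bit index is why a single loop like this tops out at 256 bytes; larger copies need an outer loop or a pointer in the zero page.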

Arithmetic

The 6502 processor has an arithmetic logic unit capable of performing a number of different operations. It does not have multiplication or division built in. Operations like these that are missing must be implemented using a combination of simpler operations. (For example, one might implement multiplication as a series of repeated additions.)

  • Addition: adc
  • Subtraction: sbc
  • Increment: inc, inx, iny
  • Decrement: dec, dex, dey
  • Binary And: and
  • Binary Or: ora
  • Binary Exclusive Or: eor
  • Comparison: cmp, cpx, cpy
  • Bit Shifting: asl, lsr, rol, ror
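As a sketch of building a missing operation out of the ones listed above, here's multiplication modeled in Python. Rather than plain repeated addition, this uses the shift-and-add approach, a common alternative that always takes eight steps for an 8-bit multiplier. The function name is made up, and the comments only hint at which 6502 instructions would play each role.

```python
def multiply_8x8(multiplicand, multiplier):
    """Shift-and-add multiply of two 8-bit values into a 16-bit result."""
    result = 0
    shifted = multiplicand & 0xFF
    multiplier &= 0xFF
    for _ in range(8):
        carry = multiplier & 1               # lsr moves bit 0 into the carry flag
        multiplier >>= 1
        if carry:                            # bcc would skip the add when the bit is clear
            result = (result + shifted) & 0xFFFF   # a 16-bit add takes two adc steps
        shifted = (shifted << 1) & 0xFFFF    # asl on the low byte, rol on the high byte
    return result

multiply_8x8(13, 11)    # 143
multiply_8x8(255, 255)  # 65025
```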

Control Flow

The 6502 is able to make comparisons with the arithmetic unit as described above. Some of the processor flags are set or cleared accordingly, and the processor can branch based on the value of these flags.
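A comparison is effectively a subtraction whose result is thrown away, keeping only the flags. Here's a Python model of what cmp, cpx and cpy leave behind; the function name and the flag dictionary are conveniences for this sketch, and both inputs are assumed to be in the range 0 to 255.

```python
def compare(register, operand):
    """Model of cmp/cpx/cpy. Returns the flags the comparison would set."""
    diff = (register - operand) & 0xFF
    return {
        "c": int(register >= operand),  # set when no borrow was needed (unsigned compare)
        "z": int(register == operand),
        "n": (diff >> 7) & 1,           # bit 7 of the discarded result
    }

compare(0x10, 0x10)  # c == 1, z == 1, n == 0: equal
compare(0x05, 0x10)  # c == 0, z == 0, n == 1: register is smaller
compare(0x80, 0x01)  # c == 1, z == 0, n == 0: register is larger
```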

The 6502 has two interrupts: one is maskable and the other is not. The maskable interrupt can be disabled with the sei instruction and re-enabled with the cli instruction. When an interrupt is signaled (most often by hardware such as a video chip or timer), the processor will finish the instruction it's currently working on, save the program counter and flags on the stack, and then fetch the appropriate interrupt vector. It then begins execution at the address it loaded. Three vectors exist at the top of memory:

  • Non-maskable interrupt vector $fffa
  • Reset vector $fffc
  • Maskable interrupt vector $fffe

Processor flags are preserved when an interrupt happens, but the interrupt handler needs to leave the stack how it found it and use rti when it's done, or a (possibly spectacular) crash might happen.
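Here's a simplified Python model of entering and leaving an interrupt handler. The function names are invented, p stands for the flags packed into one byte, and the sketch deliberately leaves out details such as how the b flag is handled in the pushed copy.

```python
# Toy model of interrupt entry and rti. Names are illustrative only.
STACK_BASE = 0x0100
NMI_VECTOR, RESET_VECTOR, IRQ_VECTOR = 0xFFFA, 0xFFFC, 0xFFFE
I_FLAG = 0x04  # the interrupt disable bit within the flags byte

def read_vector(mem, vector):
    """Vectors are stored low byte first."""
    return mem[vector] | (mem[vector + 1] << 8)

def enter_interrupt(mem, s, pc, p, vector):
    """Push pc (high, then low) and the flags, set i, and jump through the vector."""
    for byte in (pc >> 8, pc & 0xFF, p):
        mem[STACK_BASE + s] = byte
        s = (s - 1) & 0xFF
    return s, read_vector(mem, vector), p | I_FLAG

def rti(mem, s):
    """Pull the flags, then pc low and high. Returns (s, pc, p)."""
    pulled = []
    for _ in range(3):
        s = (s + 1) & 0xFF
        pulled.append(mem[STACK_BASE + s])
    p, lo, hi = pulled
    return s, (hi << 8) | lo, p

mem = [0] * 0x10000
mem[0xFFFA], mem[0xFFFB] = 0x00, 0xC0    # NMI handler lives at $c000
s, pc, p = enter_interrupt(mem, 0xFF, 0x8123, 0x01, NMI_VECTOR)
# s == 0xFC, pc == 0xC000, p == 0x05
s, pc, p = rti(mem, s)
# s == 0xFF, pc == 0x8123, p == 0x01
```

This is also why the handler has to leave the stack as it found it: rti blindly pulls three bytes, so one stray push or pull sends the processor back to the wrong address with the wrong flags.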

The processor can jump to an address without saving its current position with jmp. If the current position should be saved to return to later, jsr can be used, and the routine can return where it left off with the rts instruction.

Jumping based on the value of a processor flag is also possible with the use of branching instructions. For example, if the code should jump somewhere else when the v flag is set, then the bvs (Branch if V is Set) instruction is the right one for the job.

  • Control Flow: jmp
    • Subroutines: jsr, rts, rti
    • Branch: bcs, bcc, beq, bne, bvs, bvc, bpl, bmi
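Each branch instruction tests one flag for one value, so the whole family fits in a small table. A Python sketch (the table and function names are made up for this example):

```python
# mnemonic -> (flag it tests, value that takes the branch)
BRANCHES = {
    "bcs": ("c", 1), "bcc": ("c", 0),
    "beq": ("z", 1), "bne": ("z", 0),
    "bvs": ("v", 1), "bvc": ("v", 0),
    "bmi": ("n", 1), "bpl": ("n", 0),
}

def branch_taken(mnemonic, flags):
    """Return True if the branch would be taken with the given flags."""
    flag, wanted = BRANCHES[mnemonic]
    return flags[flag] == wanted

flags = {"c": 1, "z": 0, "v": 0, "n": 1}
branch_taken("bcs", flags)  # True: carry is set
branch_taken("beq", flags)  # False: the last result was not zero
branch_taken("bmi", flags)  # True: bit 7 of the last result was set
```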

Onward

There's a lot more to cover, and in the following parts, we will be touching on game-specific applications of the general concepts explained in this part.

You probably won't need to know that much more about the 6502 in order to appreciate what will be covered later, but if you want to know more about the processor, there's a wonderful site dedicated to it and it's fairly detailed!

Thanks for reading! See you next time~