Sizecoding on the Apple II

by Vince "Deater" Weaver

Apple II Forever!


Lovebyte 2021 Presentation

I gave a presentation on this topic at the Lovebyte 2021 Demoparty on 14 March 2021.



Link to my slides: lovebyte2021_sizecoding.pdf

Apple II Background


General 6502 Tricks

You can use most of the same tricks for small 6502 code that you'd use on other systems.

Memory Map

$0000 - $00FF   Zero Page
$0100 - $01FF   Stack
$0200 - $03FF   Mostly Free, Input Buffer, Interrupt Vectors
$0400 - $07FF   Lo-res/Text Page1
$0800 - $0BFF   Lo-res/Text Page2 (BASIC programs load here)
$0C00 - $1FFF   Free
$2000 - $3FFF   Hi-res Page1
$4000 - $5FFF   Hi-res Page2
$6000 - $95FF   Free
$9600 - $BFFF   DOS3.3 and Buffers
$C000 - $CFFF   Soft Switches, Expansion Card I/O and ROM
$D000 - $F7FF   BASIC ROM (can be bankswitched on later models)
$F800 - $FFFF   Machine Language Monitor ROM (also can be bankswitched)

Soft Switches

You configure the hardware via soft switches, which are memory-mapped addresses you load from or store to. The value you load or store doesn't matter; what matters is that you read or write the address. On older systems it didn't matter whether you read or wrote. On the IIe and newer the behavior sometimes differs depending on whether it's a read or a write.

Typically you use LDA or STA to access these. If you don't want to destroy the contents of a register you can use BIT instead, though that changes the flags.

For example, to switch to graphics mode you can run
	BIT	$C050
However, using firmware routines can still take less space. Setting up HGR graphics mode by hand can take 3 or 4 soft switches at 3 bytes each, while
	JSR	HGR
takes only 3 bytes.

Some common useful soft switches (this is far from a full list):

Graphics

The Apple II has various graphics modes and they are all horrible to use. This is because Woz was really clever at saving gates. He also got DRAM refresh for free: the video circuitry's reads of display memory refresh the RAM, but for this to work the memory addresses had to be somewhat scattered around.

The graphics are almost-but-not-quite NTSC output, relying on NTSC artifacting for colors. Modern displays don't always like this very much. Back in the day a considerable number of people had monochrome (usually green) monitors.

The graphics are very simple bitmapped displays. No sprites, no hardware scrolling, no palette effects, nothing fancy. You do get hardware page flipping between two pages.

In both lo-res and hi-res you can optionally switch in the bottom 4 lines of the text display to get mixed graphics/text. Use $C053 to turn mixed mode on and $C052 to turn it off (full-screen graphics). Note that in mixed mode the "color killer" circuit is disabled, so the text at the bottom will have purple/green fringes.

To switch pages between PAGE1 and PAGE2, access the soft switch at $C054/$C055.

The modes available are

Text

Both text and Lo-res share the same memory, 1k starting at $400. Page2 starts at $800.

To write text to the display you write the ASCII value to the address with the high bit set (so to write an A, $41, you'd actually write $C1). Until the Apple IIe you only got uppercase. To get inverse (black text on white background), be sure the top two bits are clear. For flash, have the top bit clear and bit 6 set.

You might think you have a nice bitmap of 40x24, but no, it's interleaved. The 24 rows start at the following memory addresses:
$400,$480,$500,$580,$600,$680,$700,$780
$428,$4a8,$528,$5a8,$628,$6a8,$728,$7a8
$450,$4d0,$550,$5d0,$650,$6d0,$750,$7d0

You might notice an 8-byte gap after each row in the last group (40+40+40 columns is 120; it takes 8 more bytes to make 128). These are known as the "screen holes" and are reserved for expansion cards. If you try to optimize and overwrite these values, you can cause unintended issues with peripherals.
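The interleave in the table above follows a simple pattern: $80 per row within a group of eight, plus $28 for each third of the screen. Here is a minimal host-side Python sketch of that arithmetic, handy for checking a lookup table. The function name and the `page_base` parameter are made up for illustration.

```python
def text_row_address(row, page_base=0x400):
    """Start address of text/lo-res row 0..23.

    page_base is $400 for Page1 or $800 for Page2.
    """
    if not 0 <= row < 24:
        raise ValueError("row must be 0..23")
    # Rows step by $80 inside a group of eight rows;
    # each third of the screen adds another $28 (40 bytes).
    return page_base + (row % 8) * 0x80 + (row // 8) * 0x28
```

For example, `hex(text_row_address(8))` gives `'0x428'`, matching the first entry of the second line of the table.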

Lo-res

Lo-res mode just re-uses the text mode memory described above. Instead of describing an ASCII char, each byte is split in half: the bottom 4 bits give the color of the top pixel in a row and the top 4 bits give the color of the bottom pixel.

There are 16 color values but only 15 distinct colors (the two greys at 5 and 10 are more or less the same).
Apple II Lo-res colors
Even though they are solid colors, you do get minor fringing where colors join up due to the 0/1 patterns changing. You can sometimes use this intentionally to make effects.
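To make the nibble layout concrete, here is a small Python sketch (a model of the byte format only, not code that runs on the Apple II). All function names are hypothetical.

```python
def pack_lores(top, bottom):
    """Combine two 4-bit colors into one lo-res byte (low nibble = top pixel)."""
    return ((bottom & 0x0F) << 4) | (top & 0x0F)


def unpack_lores(value):
    """Split a lo-res byte into (top, bottom) colors."""
    return value & 0x0F, (value >> 4) & 0x0F


def plot_lores(screen_byte, y, color):
    """Return screen_byte with the pixel for 40x48 row y replaced by color.

    Even y is the top pixel (low nibble), odd y is the bottom pixel
    (high nibble). The other nibble is preserved.
    """
    if y & 1:
        return (screen_byte & 0x0F) | ((color & 0x0F) << 4)
    return (screen_byte & 0xF0) | (color & 0x0F)
```

The mask-and-merge in `plot_lores` is the extra work a 40x48 plot needs compared with writing a whole byte at 40x24.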

Hi-res

Hi-res is a pain to program, even worse than lo-res. On a monochrome screen it's 280x192 pixels, but in color you really only get 140x192 if you want guaranteed color values. You get six distinct colors: black, white, orange, blue, green, purple. There are actually two blacks and two whites (one for each palette).
Apple II hi-res colors

Page1 starts at $2000 and goes for 8k, and Page2 starts at $4000 and goes for 8k. Again it's not a straight bitmap; the rows are interleaved in a weird 64:1 way (this is the reason for the traditional "miniblinds" effect when loading an image from disk). Row 0 is at $2000...$2027, but Row 1 is at $2400...$2427, and the rows increase by 1K until Row 7. Then things drop back and start over at Row 8 at $2080...$20A7, and again increase by 1K for 8 rows. Then, since the screen is divided into thirds like lo-res, it starts over *again* at Row 64 at $2028...$204F, etc.
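The row layout just described reduces to three terms: 1K per row within a group of 8, $80 per group of 8 within a third, and $28 per third. Here is a hedged Python sketch of that calculation, meant for generating or checking tables on the host side. The function name and `page_base` parameter are illustrative.

```python
def hires_row_address(y, page_base=0x2000):
    """Start address of hi-res row y (0..191).

    page_base is $2000 for Page1 or $4000 for Page2.
    """
    if not 0 <= y < 192:
        raise ValueError("y must be 0..191")
    return (page_base
            + (y & 7) * 0x400          # row within a group of 8: 1K steps
            + ((y >> 3) & 7) * 0x80    # group of 8 within a third: $80 steps
            + (y >> 6) * 0x28)         # which third of the screen: $28 steps
```

Rows 0, 1, 8 and 64 come out as $2000, $2400, $2080 and $2028, agreeing with the addresses given in the text.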

Once you find the proper row (not a trivial exercise) you have to find the pixel to write to. Each row is 40 bytes which map to the 280 pixels. It is two bytes for 14 pixels (this can involve dividing by 7, not an easy task on the 6502).
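As a rough illustration of the divide-by-7 step, this Python sketch maps a monochrome x coordinate (0..279) to a byte column and a bit mask. It assumes the lowest bit of each byte is the leftmost pixel and that bit 7 is not a pixel. The function name is hypothetical.

```python
def hires_pixel(x):
    """Return (byte_column, bit_mask) for monochrome pixel x in 0..279."""
    if not 0 <= x < 280:
        raise ValueError("x must be 0..279")
    column, bit = divmod(x, 7)   # 7 pixels per byte, 40 bytes per row
    return column, 1 << bit      # bit 0 is the leftmost pixel in the byte
```

On the 6502 the `divmod` is the expensive part, which is one reason size-coded programs often lean on a table or the ROM routines instead.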


Start with a byte in the upper left at $2000. The high bit of the byte selects the palette for the 3.5 color pixels (7 bits) in that byte: either blue/orange or purple/green. You can't mix the two sets within a single 3.5-pixel chunk. (The bit actually shifts the pixels by half a pixel, which causes the NTSC color change.) Then the 2-bit values, read from lowest bit to highest bit, indicate the color. White (11) is both pixels on and black (00) is both pixels off; 01 and 10 are the colors, and in that case only the odd or even pixel is on. If you change colors and the adjacent bits are 11 or 00, you'll get a white or black artifact line between them.

All of this means it takes a very skilled artist to make good looking hi-res art, having to keep in mind the color rules and such.

If you're size coding it's often easiest to use the built-in firmware HCOLOR and HPLOT routines even though they are a bit slow.

Later Graphics Modes

80 column text

The IIe and later have an 80-column text mode. You need an expansion card with extra AUX (auxiliary) RAM for this mode. The memory for the odd/even columns is split between normal text mode at $400 and AUX text mode at AUX $400. You can bank switch between these manually, or you can access a soft switch to remap this into $800 of normal RAM.

You can enter the mode by entering "PR#3" from BASIC, or from assembly jumping to $C300. You can also mess with the soft switches directly but that can get a bit complicated. Once the 80-column firmware is active you can switch modes and other things by printing control characters through COUT.

Double Lo-res

Double Lo-res is a mode on the Apple IIe and newer. Just as 80-column text uses the AUX RAM to hold the extra interleaved columns, double Lo-res uses it for the extra 40 columns of graphics. In theory it is possible to page-flip in this mode but it's complicated. I personally have not done much with this mode.

Some of the switches involved:
        sta     80STOREOFF      ; $C000 page2 switches page1/page2
        sta     80STOREON       ; $C001 page2 switches main/aux video
        sta     80COLON         ; $C00D 80 column/double-res mode
        sta     ALTCHARSETON    ; $C00F Enable mouse text
        sta     AN3             ; $C05E set double graphics

Double Hi-res

This mode again uses AUX RAM, but in this case to have more colors rather than more columns. An extra 8k in AUX RAM is used to expand to 15 colors with no color-clash.

It is complicated though because the format of the data has changed. The high bit of each byte is now ignored. The 4-bit colors from Lo-res are used, but they are spread across the 7-bit values in AUX and normal RAM. The first color starts in the upper left with AUX location 0, which holds the 4 bits of the first pixel and then the first 3 bits of the next pixel. The remaining bit is stored at the beginning of the first byte in normal RAM (but I want to say the bits are inverted somehow?). Then come the next 4 bits, and the pattern repeats, overflowing back to AUX again.

Switching from AUX to Normal for writing is a pain. There's a way you can map the AUX pages overtop of where PAGE2 would be in normal RAM to make things slightly easier.

To get into double hires mode, after calling HGR do something like this:
        sta     $C05E           ; set double hires
        sta     $C00D           ; 80 column
        sta     $C001           ; 80 store

It in theory is possible to page flip in this mode too, but it's complex.

Advanced Graphics Programming

BASIC/Monitor Firmware Routines

If doing size-coding it can be useful to call into routines in the BASIC firmware. Some of the monitor routines can be assumed to always be there. Usually you assume that Applesoft BASIC is there, though the original Apple II had Woz's Integer BASIC instead, which is not compatible.

There are nice routines for graphics (hi-res and lo-res) there, as well as text and screen scrolling. My 64-byte flame demo takes advantage of the text/lo-res duality and scrolls up the graphics by actually calling into the firmware scroll-text routine.

You can find commented disassemblies of the firmware online. Here's one for Applesoft

Here are some useful entry points:

Lo-res Plotting

To enable Lo-res manually:
        bit     SET_GR  ; $C050 3 bytes
        bit     LORES   ; $C056 3 bytes
        bit     FULLGR  ; $C052 3 bytes
        bit     PAGE1   ; $C054 3 bytes

Instead you could call into ROM. The Apple II has some well-defined and stable entry points:
        jsr     SETGR  ; $FB40 3 bytes (same as Applesoft GR)
                       ;               set Lo-res graphics, Page1
                       ;               split text/graphics, clear to black

To plot a point:
        ; Plot light green point at 10,10
        lda     #$CC    ; load color hi/lo (light green here)
        sta     COLOR   ; store to zero page $30
        ldy     #10
        lda     #10
        jsr     PLOT    ; $F800 plots at screen location in Y, A


Those routines are slow. For fast plotting you want something like:
        lda     YPOS            ; load y-coordinate
        and     #$FE            ; make even
        tay                     ; put in Y register
        lda     gr_offsets,Y    ; get address from lookup
        sta     GBASL
        lda     gr_offsets+1,Y
        sta     GBASH           ; if page-flipping, should add $0/$4

        lda     COLOR           ; get color (note: 40x24 faster and smaller!)
                                ; for 40x48 need to load, mask, logical-or
        ldy     XPOS            ; load x-coordinate
        sta     (GBASL),Y


gr_offsets: .word $400,$480,$500,$580,$600,$680,$700,$780
            .word $428,$4a8,$528,$5a8,$628,$6a8,$728,$7a8
            .word $450,$4d0,$550,$5d0,$650,$6d0,$750,$7d0

However that wastes a lot of room. You can do a sort of hybrid: when iterating across the screen, calculate the row address once and then offset from it. There are also Applesoft/ROM routines for VLIN (draw a vertical line) and HLIN (draw a horizontal line). You can use the text positioning firmware routines to set the proper row pointer (usually in BASL:BASH $28/$29) rather than having a lookup table to find the right address. SETCOL will also set COLOR ($30) to have the same top and bottom nibble (effectively multiplying the bottom 4 bits by 17).
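The two ideas in this paragraph can be modeled in a few lines of Python: doubling a nibble is the same as multiplying by 17, and a row fill only needs the base address once. This is a host-side sketch, not the ROM code. `setcol` here is only a model of the behavior described above, and `fill_text_row` is a made-up helper.

```python
def setcol(color):
    """Model of SETCOL as described: copy the low nibble into both halves."""
    return (color & 0x0F) * 17      # 17 = $11, so $C becomes $CC


def fill_text_row(memory, row, color):
    """Fill one 40-byte text row (two lo-res rows) with a solid color.

    memory is any mutable byte sequence covering $400..$7FF.
    Returns the row's base address.
    """
    base = 0x400 + (row % 8) * 0x80 + (row // 8) * 0x28   # computed once
    value = setcol(color)
    for x in range(40):             # then just offset across the row
        memory[base + x] = value
    return base
```

For instance, `fill_text_row(bytearray(0x800), 8, 0xC)` writes $CC to $428 through $44F and leaves the neighboring rows untouched.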

Page Flipping

Page flipping makes for smooth animations (drawing offscreen). Unfortunately the lo-res ROM routines are not page aware (the hi-res ones are). Flipping generally takes around 20 bytes:
        ldx     #0              ; x already 0
        lda     draw_page_smc+1 ; DRAW_PAGE
        beq     done_page
        inx
done_page:
        ldy     PAGE0,X         ; set display page to PAGE1 or PAGE2
        eor     #$4             ; flip draw page between $400/$800
        sta     draw_page_smc+1 ; DRAW_PAGE

Hi-res Plotting

Hi-res is complex enough that it's hard to do much when size constrained. Using the ROM routines helps, and you can do some fancy effects if you really take the time to learn what's going on.
HGR     $F3E2   set hires/mixed/page1/clear to 0
HGR2    $F3D8   set hires/full/page2/clear to 0
HCLR    $F3F2   clear page in $E6 to 0
BKGND   $F3F6   clear page in $E6 to last color plotted
HPOSN   $F411   move to Xcoord (Y,X) Ycoord (A)
HPLOT0  $F457   plot point at (Y,X), (A)
HGLIN   $F53A   draw line to (A,X), (Y)
HLINRL  $F530   draw relative (A,X), (Y)
Some useful Zero page addresses:

Shape Tables

Applesoft BASIC has built in vector-drawing "shape table" routines.

It's a bit beyond the scope of this document, but you point a pointer in RAM to values that describe a shape, by UP DOWN LEFT RIGHT (with draw) or pen-up UP DOWN LEFT RIGHT. It takes 3 bits for each of these, and you can squeeze two or three (three if the top value fits in two bits) commands per byte.

The BASIC routines DRAW and XDRAW (draw with xor) draw these patterns and you can set the SCALE or ROT (rotation) when drawing these.

You can call into these routines from assembly language to get a somewhat slow but powerful compact shape drawing capability.

Other Graphics Tricks


Floppy Disk

The Disk II floppy disk has a lot of interesting stories. For background see my 2020 Demosplash talk. Woz (Steve Wozniak) optimized down to the bare minimum number of parts, implementing a lot in software (a lot of cycle counted routines).

You can fit 140k on a floppy (35 tracks x 16 sectors x 256 bytes = 143,360 bytes). It was all software controlled, so you could do all kinds of crazy copy protection by messing with the stepper motors in real time. See 4am's writeups on this.

Demos are often distributed as 140k .dsk images, which are more or less a raw image of the disk (there are complications involving sector interleave). Usually the format is Apple DOS3.3 (note, this is *not* MS-DOS 3.3). This is a very simple but relatively slow filesystem (though still many times faster than the C64 1541) with lots of weird quirks, as it was written by amateurs. It is relatively easy to write files to the filesystem with various tools you can get.

There's a more advanced and faster filesystem called ProDOS that Apple released later in the II lifetime. This is still maintained by volunteers to this day, though some versions don't work on the older machines (due to 65c02 opcodes).

There's a new .woz file format that is an actual flux capture and can image disks exactly, copy protection and all. There's usually little need for using this instead of .dsk if you're just writing simple demos.

If your demo is large, it can take a while to load from DOS3.3 (it's about 1k/s). If you need faster loading, seek out Qkumba's various disk loader routines. You can easily get at least 8x the speed with very little overhead.

Bootsector Demos

The Disk II firmware will load a 256 byte sector from track 0 / sector 0 of disk into memory at $800 and then jump to $801. (Byte 0 at $800 indicates how many sectors to load, this is usually 1).

The first thing you want to do is turn off the floppy motor, which you can do with LDA $C0X8, where X is the slot number (usually 6) or'd with 8, so $C0E8 for slot 6. The slot number times 16 is in the X register at entry, so LDA $C088,X works for any slot.

You can save some bytes by not turning off the drive motor, but then the drive will spin forever.

Using Apple DOS

If you boot into a disk image you might want to poke around a bit.

To list files use the CATALOG command. It will list the filename, the filesize (in number of 256 byte sectors; this will always be at least 2 because it also counts the track-sector-list filesystem metadata) and the file type.

File type 'A' means a BASIC program. You can run it with RUN FILENAME. If you want to see the source code you can LOAD FILENAME first and then run LIST.

File type 'B' is a binary program. You can run it with BRUN FILENAME. You can load it first with BLOAD FILENAME, though you'll have to go into the machine language monitor if you want to list it. There is an A option if you want to load the file to an alternate address (for example BLOAD FILENAME,A$6000).

When you 'init' (format) a disk, a file is set as the boot program. By default it is a BASIC program called HELLO but it doesn't have to have that name. If you want to have the HELLO program automatically run your program at boot, you'd make it look like this:
10 PRINT CHR$(4);"BRUN FILENAME"
When DOS loads it intercepts the output handler and adds code to watch for the ASCII 4 (control-D) character and treats what follows as a DOS command.

Executable Format

An Apple II executable is just a raw 6502 program with a 4-byte header.

The four-byte header is two bytes giving the address the program should be loaded to, followed by two bytes giving the length of the program (both are little-endian 16-bit values). These values really should be filesystem metadata, but they are stored in the file because DOS3.3 is oddly designed.

It's sort of an open question whether these 4 bytes should count against size-counted code for compo purposes. The bytes are in the binary file on disk, but never get loaded into memory.
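If you are building the file on a modern machine, the header is easy to generate. The Python sketch below assumes the on-disk order is load address first, then length, both little-endian (the usual DOS 3.3 binary file layout). The function name is made up, and writing the result into a disk image is left to whatever tool you use.

```python
import struct


def dos33_binary(code, load_address):
    """Prefix raw 6502 code with a 4-byte DOS 3.3 binary file header."""
    if not 0 <= load_address <= 0xFFFF or len(code) > 0xFFFF:
        raise ValueError("address and length must fit in 16 bits")
    # "<HH" packs two little-endian unsigned 16-bit values:
    # the load address, then the code length (assumed order, see above).
    return struct.pack("<HH", load_address, len(code)) + bytes(code)
```

A one-byte program consisting of RTS ($60) loaded at $300 becomes the five bytes 00 03 01 00 60.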

To run a binary file you use BRUN FILENAME

Apple II Monitor

The Apple II by default boots into Applesoft BASIC (you might need to press control-reset if you don't have a disk in the drive). You can drop into the machine language monitor with CALL -151. You can use commands like 300L to disassemble memory starting at $300 and 300G to execute code starting at $300.

Compo Issues


Assembling Code

If you ask in the Apple II community what assembler to use you'll get a lot of different answers. I like using ca65 from the cc65 project, but that's not necessarily the most popular choice. I also do all of my development under Linux, use a custom tool of mine (dos33fs-utils) to put the executables into a disk image, and test using AppleWin emulator (under wine) before transferring to an actual Apple II that has a USB/SD disk emulator (cffa3k or floppy-emu).

Keyboard

The Apple II has very simple keyboard support. No buffer.

Read $C000 and if it's positive (high bit not set) no key was pressed.

If it's negative, a key was pressed and you'll need to mask off the high bit to get ASCII.

You only get a key press, no key push/release events.

On the IIe and later, keypresses autorepeat. On the II/II+ there was a separate REPT key you had to press to get repeats.

Once you read the value, you need to read the keyboard strobe $C010 to reset for the next keypress.
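The high-bit convention is simple enough to model in Python. This sketch only shows how to interpret a value that was read from $C000. It does not touch hardware, and the function name is hypothetical.

```python
def decode_keyboard(value):
    """Interpret a byte read from $C000.

    Returns None if no key is waiting (high bit clear),
    otherwise the 7-bit ASCII code of the key.
    """
    if value & 0x80 == 0:
        return None
    return value & 0x7F    # mask off the high bit to get ASCII
```

So a read of $C1 decodes to $41 ('A'), while a read of $41 means nothing new has been pressed.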

Random Number Generator

There's a built-in RNG in the Firmware but it's fairly awful. The built in keypress routines update a zero-page address you can use as a seed.

You can write a shift/xor based 8-bit linear feedback pseudo-rng in a handful of bytes.
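As one possible instance of such a generator, here is a Python model of an 8-bit Galois LFSR that shifts right and XORs in $B8 when a 1 bit falls off the end. The tap value $B8 is a choice made for this sketch, not something taken from any particular demo. On the 6502 the same step is roughly LSR / BCC / EOR #$B8.

```python
def lfsr8(state):
    """One step of an 8-bit Galois LFSR (shift right, taps $B8)."""
    carry = state & 1
    state >>= 1
    if carry:
        state ^= 0xB8
    return state


def lfsr8_period(seed=1):
    """Count steps until the sequence returns to seed (seed must be non-zero)."""
    count, state = 0, seed
    while True:
        state = lfsr8(state)
        count += 1
        if state == seed:
            return count
```

Note that a state of 0 stays at 0 forever, so the seed has to be non-zero.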

If you're really constrained for space, indexing and reading from ROM $D000 (for example) can get you an approximation of randomness.

Sine/Cosine Trig Functions

Applesoft has some in ROM but you have to use the weird 5-byte floating point format.

The ROM has a table of COS(90*X/16 DEGREES)*$100 - 1 at $F5BA
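To see what values that formula produces, you can evaluate it on the host. The Python sketch below assumes X runs from 0 to 16 and that results are rounded to the nearest integer and wrapped to one byte. Both are assumptions of this sketch, so compare against a ROM disassembly before relying on exact values.

```python
import math


def cos_table(entries=17):
    """Evaluate COS(90*X/16 degrees)*$100 - 1 for X = 0..entries-1 as bytes."""
    table = []
    for x in range(entries):
        value = round(math.cos(math.radians(90 * x / 16)) * 0x100 - 1)
        table.append(value & 0xFF)   # -1 at 90 degrees wraps to $FF
    return table
```

The first entry is $FF and the entry for 45 degrees (X = 8) is $B4.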

Delay

The Apple II by default has no timers. (If you have a Mockingboard sound board installed you can get some there, but that was a somewhat rare expansion).

A quick way to delay is the firmware WAIT call at $FCA8 which delays for (26+27A+5A^2)/2 microseconds.
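When picking a value for A it helps to tabulate the formula. This Python sketch evaluates it and searches for the smallest A in 1..255 that reaches a target delay. Both function names are made up for illustration.

```python
def wait_microseconds(a):
    """Delay of the firmware WAIT call per (26 + 27A + 5A^2) / 2."""
    return (26 + 27 * a + 5 * a * a) / 2


def smallest_a_for(target_us):
    """Smallest A in 1..255 whose delay is at least target_us, or None."""
    for a in range(1, 256):
        if wait_microseconds(a) >= target_us:
            return a
    return None
```

By this formula A=1 gives 29 microseconds and A=255 gives about 166 milliseconds, so longer delays need a loop around the call.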

Sound

By default, the Apple II machines only have a memory mapped speaker. You can make it click by accessing $C030. You can play tones with cycle-counted timing loops. There is no timer interrupt on a stock II which means you are stuck cycle-counting.

Demo programmers often assume a Mockingboard with AY-3-8910 chips is installed. This gives you a timer. It also gives you the AY chips connected via VIA6522 chips. Unfortunately accessing those takes a lot of bytes which makes size-coding difficult (also, some obvious size-code techniques can break the sound, as there's a 10us (10 cycle) limit on some of the card timings and it can actually be hard to hit that on a 1MHz Apple II).

Interrupts

As mentioned previously, there is no source of interrupts on a stock II. With a Mockingboard you can get interrupts from an onboard 6522 timer. Be aware that generally the firmware captures the interrupts before passing them to the user handler, and depending on the model it can waste a lot of cycles doing that.

Also note that the Apple II disk code is cycle-counted, so interrupts are disabled during disk access which makes it difficult to have things like music playing during disk access.

RAM Banking

A default Apple II maxes out at 48k of RAM. 64k systems use something called the language card to provide a 16k expansion by banking RAM in over the ROM (as one 12k chunk at $D000, then another 4k chunk again at $D000; you can't bank out $C000 as that's where the I/O lives).

Banking the RAM involves certain patterns of accesses to an I/O address.

The Apple IIe and later can have 128k of RAM: 64k normal and 64k of AUX. Switching back and forth between these is a bit complex, especially because the code at the program counter has to exist in the bank you switch to, and the stack and zero page can switch too.

You don't have to do a full switch of all of RAM, there are various soft switches to hit to switch over various combinations of the pages. This is all rather complex and I won't detail it all here.

65C02

The Apple IIe enhanced / platinum and the IIc models have 65C02 processors with many additional useful opcodes that can help code density. Unless you know your code is going to be run on one of these newer machines you are stuck using the original 6502 opcodes.

VBLANK/Vapor Lock/Floating Bus

By default the Apple II has no way of detecting HBLANK and VBLANK. The IIe, IIc, and IIgs have a register you can read, but it's a different one on each machine, so they are not compatible.

There is a crazy hack sometimes called Vapor Lock. The Apple II is constantly scanning the display memory in the off-phases of the 6502 clock cycle. The last byte read from RAM to display on the screen is still there on the capacitance of the "floating" data bus. If you access an I/O address without RAM or I/O connected, you can read out this floating value and find out the most recent byte written to screen. So by writing a pattern to the screen and then reading the floating bus you can find where the beam is on the screen, then cycle count from there to find HBLANK/VBLANK. This can lead to some amazing race-the-beam effects, but it is really difficult to start and maintain.

Applesoft BASIC Size Coding

I have a page here where I discuss size-coding Applesoft BASIC. This is to fit in the 280-byte twitter limit. We have found you can use plain text BASIC programs to load 128 bytes of machine code (if only uppercase chars are used) or 140 bytes (if lowercase allowed).

Example Code

I post the code for all of my demos on github https://github.com/deater/dos33fsprogs under the "demos" directory

Some of note:

Useful References


Link to my "ll" page where I sizecode for 30 different platforms