Wednesday, July 18, 2018

High speed full-screen drawing on the ZX Spectrum

This is another retro-programming idea that occurred to me, harking back to my old ZX Spectrum days.  If you're not au fait with the speccy's screen layout, I refer you to this posting of mine about rendering sprites.

The problem is that the old Z80 isn't fast enough to update all of the video memory before the CRT beam catches up with you.  The screen has a 64 pixel upper border (requiring no video memory), followed by a 192 pixel high by 256 pixel wide bitmap, followed by a lower border (also requiring no video memory).  The bitmap is monochrome, with each consecutive set of 8 pixels being represented in a single byte.  There is a separate 24 row by 32 column attribute map specifying foreground and background colours for each 8x8 pixel cell.

The only interrupt you get on the speccy occurs once every 50th of a second when the CRT beam starts from the top of the upper border (so this is pretty much when you have to start drawing).  It takes 224 T-states (i.e., clock ticks) for the CRT beam to scan one pixel line.  Since it takes around 16 Ts to move a single byte (give or take), the Z80 can update at most 14 of the 32 bytes on each row of pixels (224 / 16 = 14, a little under half) before the CRT beam catches up (which would appear as "tearing" on the screen).
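To put rough numbers on that, here is a small Python sketch of the arithmetic.  It uses only the figures quoted above (224 Ts per scan line, roughly 16 Ts per byte, a 64-line upper border, 192 lines of 32 bytes).  The constant and function names are made up for illustration, and interrupt overhead and contended memory are ignored.

```python
T_PER_SCANLINE = 224   # T-states for the beam to scan one pixel line
T_PER_BYTE = 16        # rough cost of moving one byte (figure from the text)
BORDER_LINES = 64      # upper border: the head start before the bitmap begins
BITMAP_LINES = 192
BYTES_PER_LINE = 32


def bytes_per_scanline():
    """Bytes the Z80 can move in the time the beam scans one pixel line."""
    return T_PER_SCANLINE // T_PER_BYTE


def full_rewrite_cost():
    """T-states needed to rewrite every byte of the bitmap."""
    return BITMAP_LINES * BYTES_PER_LINE * T_PER_BYTE


def beam_reaches_bottom():
    """T-states from the top of the upper border to the end of the bitmap."""
    return (BORDER_LINES + BITMAP_LINES) * T_PER_SCANLINE


if __name__ == "__main__":
    print(bytes_per_scanline(), full_rewrite_cost(), beam_reaches_bottom())
```

On these figures a straight rewrite of the bitmap costs 98,304 Ts, while the beam passes the bottom of the bitmap after 57,344 Ts, so even the head start from the upper border is not enough.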

[Note: some games did manage a full 50 FPS refresh rate, but only by severely restricting the area of the screen that was updated.]

However, let's say you only wanted to update about a third of the bitmap at 50 FPS, but didn't want to be restricted as to which third.  The remainder would be painted in the background colour, which is easily arranged by setting the corresponding cells in the attribute map to have the same foreground and background colour (i.e., the set pixels and the clear pixels in such an 8x8 cell will have the same colour).
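As a quick illustration of such a "blank" attribute, here is a Python sketch that builds the attribute byte.  It assumes the standard Spectrum attribute layout (bit 7 flash, bit 6 bright, bits 5-3 background, bits 2-0 foreground).  The helper names are hypothetical.

```python
def attribute(ink, paper, bright=False, flash=False):
    """Pack foreground (ink) and background (paper) colours, 0-7, into one byte."""
    return (int(flash) << 7) | (int(bright) << 6) | ((paper & 7) << 3) | (ink & 7)


def blank_attribute(colour, bright=False):
    """Attribute whose set and clear pixels look identical: ink == paper."""
    return attribute(colour, colour, bright)


if __name__ == "__main__":
    print(hex(blank_attribute(1)))   # a solid blue cell, whatever the bitmap holds
```

Whatever bytes happen to be in the bitmap under such a cell, nothing of them shows, which is why a blank cell never needs its bitmap touched.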

What occurs to me is this.  For a "blank" cell, we just need to set the cell attribute.  For a "bitmap" cell, we need to set the cell attribute and set the corresponding bitmap bytes in the video memory.  Assuming that we have the address of the "current" attribute entry in DE and the bitmap entry in HL (and the magic number 1 - $0700, i.e. $F901, in BC, which undoes seven increments of H and steps HL on to the next cell), then we can do the following:

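DrawSomeBlankCell:
    ld A, SomeAttr      ;  7 Ts: attribute with foreground = background
    ld (DE), A          ;  7 Ts: write the attribute
    inc E               ;  4 Ts: next attribute entry
    inc L               ;  4 Ts: next bitmap column
    ret                 ; 10 Ts: on to the next routine in the draw list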

(for 32 Ts per blank cell)

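DrawSomeBitmapCell:
    ld A, SomeAttr                          ;  7 Ts
    ld (DE), A                              ;  7 Ts: write the attribute
    inc E                                   ;  4 Ts: next attribute entry
    ld (HL), SomeBitmapByte0 : inc H        ; 10 + 4 Ts: write a pixel row, step down one line
    ld (HL), SomeBitmapByte1 : inc H        ; 10 + 4 Ts
    ld (HL), SomeBitmapByte2 : inc H        ; 10 + 4 Ts
    ld (HL), SomeBitmapByte3 : inc H        ; 10 + 4 Ts
    ld (HL), SomeBitmapByte4 : inc H        ; 10 + 4 Ts
    ld (HL), SomeBitmapByte5 : inc H        ; 10 + 4 Ts
    ld (HL), SomeBitmapByte6 : inc H        ; 10 + 4 Ts
    ld (HL), SomeBitmapByte7 : add HL, BC   ; 10 + 11 Ts: back up seven lines, across one column
    ret                                     ; 10 Ts: on to the next routine in the draw list

(for 147 Ts per bitmap cell).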
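The magic number in BC can be checked with a little address arithmetic.  The following Python sketch assumes the standard Spectrum bitmap layout starting at $4000 and models only the HL updates of the bitmap routine.  The function names are invented for illustration, and the last column of each row is left out because the end-of-line handling is a separate matter.

```python
def cell_address(row, col):
    """Address of the top pixel line of character cell (row 0-23, col 0-31)."""
    y = row * 8
    return (0x4000
            | ((y & 0xC0) << 5)    # which third of the screen
            | ((y & 0x07) << 8)    # pixel line within the cell (0 at the top)
            | ((y & 0x38) << 2)    # character row within the third
            | col)


MAGIC_BC = (1 - 0x0700) & 0xFFFF   # the value held in BC, as a 16-bit word


def after_bitmap_cell(hl):
    """HL after seven `inc H` instructions followed by `add HL, BC`."""
    for _ in range(7):
        hl = (hl + 0x0100) & 0xFFFF    # inc H: one pixel line down within the cell
    return (hl + MAGIC_BC) & 0xFFFF    # back up seven lines, step one byte right


if __name__ == "__main__":
    print(hex(MAGIC_BC), hex(after_bitmap_cell(cell_address(0, 0))))
```

In this model the routine always leaves HL at the top line of the next cell to the right, for every row and every column except the last.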

The "draw list" would simply be a consecutive list of pointers to these drawing functions:

DrawList:
    dw Draw0, Draw1, Draw2, ..., Draw31, [fix up]
    dw Draw32, Draw33, Draw34, ..., Draw63, [fix up]
    ....
    dw ..., Draw767, [finish up]

We would redraw the entire screen by simply pointing DE at the display attributes, HL at the display bitmap, SP at DrawList and executing a 'ret'.  (The [fix up] calls would simply deal with the odd end-of-line addressing fixes that are needed when handling the ZX Spectrum display.)
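To make the dispatch mechanism concrete, here is a toy Python model of it (not the author's code).  An index stands in for SP, and each "routine" returns True where the Z80 routine would execute `ret`, which pops the next address off the draw list.  The names `blank_cell`, `finish_up` and `run` are invented for the sketch, and only the attribute writes and T-state costs of blank cells are modelled.

```python
def blank_cell(attr):
    """Build a stand-in for one DrawSomeBlankCell routine."""
    def routine(screen):
        screen["attrs"].append(attr)   # ld A, attr : ld (DE), A : inc E
        screen["t_states"] += 32       # cost of the whole routine, `ret` included
        return True                    # `ret`: carry on with the next list entry
    return routine


def finish_up(screen):
    """Stand-in for [finish up]; a real one would restore SP and return."""
    return False


def run(draw_list):
    screen = {"attrs": [], "t_states": 0}
    sp = 0                             # index standing in for the Z80's SP
    while True:
        routine = draw_list[sp]        # what `ret` does: pop the next address...
        sp += 1                        # ...leaving SP at the following entry
        if not routine(screen):
            return screen


if __name__ == "__main__":
    print(run([blank_cell(9), blank_cell(18), finish_up]))
```

The point of the arrangement is that there is no loop counter or call overhead on the Z80 side: the 10 Ts `ret` at the end of each routine is the entire cost of moving on to the next cell.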

How much could we redraw, assuming an even distribution of action across the screen?  A simple calculation shows we can have a good 11 bitmap cells in each row of 32 and still update the entire display before the CRT beam catches up.  That is 11/32, or about 34% of the entire display, just over a third!
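The calculation itself is not spelled out, so here is one way to arrive at a figure like that, as a Python sketch under stated assumptions.  Drawing starts when the beam is at the top of the 64-line upper border, each row of cells must be finished before the beam reaches its first pixel line, and the fix-up entries, interrupt overhead and memory contention are taken to cost nothing.  The per-cell costs are parameters (32 Ts for a blank cell and about 146 to 147 Ts for a bitmap cell; the answer is the same either way).  All names are hypothetical.

```python
T_PER_SCANLINE = 224
BORDER_LINES = 64
ROWS, COLUMNS = 24, 32


def row_cost(bitmap_cells, blank_ts=32, bitmap_ts=147):
    """T-states to draw one row holding the given number of bitmap cells."""
    return bitmap_cells * bitmap_ts + (COLUMNS - bitmap_cells) * blank_ts


def keeps_ahead_of_beam(bitmap_cells, blank_ts=32, bitmap_ts=147):
    """True if every row is finished before the beam reaches its first line."""
    cost = row_cost(bitmap_cells, blank_ts, bitmap_ts)
    for row in range(ROWS):
        finished_at = (row + 1) * cost
        beam_arrives = (BORDER_LINES + 8 * row) * T_PER_SCANLINE
        if finished_at > beam_arrives:
            return False
    return True


def max_bitmap_cells(blank_ts=32, bitmap_ts=147):
    """Largest per-row count of bitmap cells that still beats the beam."""
    return max(k for k in range(COLUMNS + 1)
               if keeps_ahead_of_beam(k, blank_ts, bitmap_ts))


if __name__ == "__main__":
    print(max_bitmap_cells(), row_cost(11))
```

With these assumptions the limit comes out at 11 bitmap cells per row, which agrees with the figure quoted above.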

(I'm currently working on a full-screen Pac Man demo to illustrate this idea.  Time is scarce, so don't hold your breath.)

A very belated update: you can find the working source code at https://github.com/ralphbecket/Z80/tree/master/MegaPacMan showing a full-screen 50 FPS scrolling maze with four ghosts running around at random.  It could support many more and/or much larger sprites, but I'm happy to call it a day here!

9 comments:

joe said...

do it, patiently waiting.

Bipboi said...

I like the theory, looking forward to seeing what you come up with.

I am currently working on some 128k screen paging examples, but it would be great to see this implemented

Rob “Bipboi” Pearmain

Rafe said...

Source code now available 😁

Bipboi said...

Downloaded, Built, highly impressed.

Tip for Zeus, you need to enable dynamic labels for it to build.

I am working through the code today, and going to load on to my OMNI to test

Thanks for making the code available

Rob

Rafe said...

Very kind words, thank you. This idea has been percolating in the back of my mind for ages. The devil is always in the details (e.g., getting the end of row adjustments for the "draw list" working). Coming up with an elegant navigation mechanism for the ghosts took a surprising amount of mental energy.

JonMac said...

I'm trying to get this to build on Zeus, and get the error Label Not Found : "Maze".

So I think I need to do what Bipboi suggests, but don't know how to enable dynamic labels - any tips on how to do this?

Thanks,

Jon.

Rafe said...

Hi @JonMac, under the 'Config' tab on Zeus you can find a checkbox for "Allow floating labels". Let me know if that works for you.

JonMac said...

Hi @Rafe, yes that worked thanks. I'm currently trying to learn a bit of Z80 assembly to get closer to the metal, being a web dev by day I want to understand a bit more about the lower level workings. It's really useful and interesting to read stuff like this.

By the way this is amazing, I hope you can find the time to turn it into a game some day - I've never seen anything look this fast and smooth before on the spectrum.

All the best,

Jon.

Rafe said...

Hi @JonMac, glad that worked for you and thank you for the compliment. I don't currently have the inclination to turn it into a full game, but I might after my next project is finished. Right now I'm working on a higher-level 16-bit "virtual assembly language" for the old 8-bit CPUs (targeting the Z80 in the first place). One of my pet peeves has always been the popularity of Forth: I find it nearly impossible to read and the performance on a Z80 or 6502 is *terrible*! There must be something better we can do, so I'm interested in demonstrating that. My back of the envelope calculations show my current scheme to be 2x to 3x the speed of Forth.