I said I’d post an update (with a picture) when I got the VGA console working, so here it is:
It’s very preliminary; the screen does not scroll yet, and input is still coming from the serial port (there’s no keyboard port yet), but it does work.
One thing that is rather apparent in the above screenshot is the weird issue with some of the punctuation symbols. I’ll dig into that after I get scrolling working.
Hardware Changes
The image above was taken using my updated hardware design that I talked about in my last post, which uses clock stretching in lieu of the 65816 RDY pin to pause the CPU during propeller accesses. The previous design rarely lasted more than a few minutes without random serial port garbage, and always crashed after a few hours at most. This new design has so far gone nearly 48 hours glitch-free, and probably would have gone longer had I not powered it down for a ROM update.
Next Steps
With basic video output working, my next task is to get console scrolling working so that it’s truly usable. After that I’ll implement the keyboard port, which I’m still planning to do using a small ATmega328. It’s going to connect through the VIA, so it should be a lot less work than getting the propeller integrated was.
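Scrolling a text console usually comes down to moving every row up by one and blanking the bottom row. Here is a minimal Python sketch of that idea, assuming an 80×25 screen held as a flat list of character codes. The names and the flat layout are illustrative assumptions, not the actual firmware.

```python
COLS, ROWS = 80, 25   # assumed text geometry
BLANK = 0x20          # space character

def scroll_up(screen):
    """Return a new screen with every row moved up one and the last row blanked."""
    assert len(screen) == COLS * ROWS
    return screen[COLS:] + [BLANK] * COLS

screen = [BLANK] * (COLS * ROWS)
screen[COLS] = ord("A")          # first cell of row 1
screen = scroll_up(screen)
print(chr(screen[0]))            # the "A" is now at the start of row 0
```

On the real hardware the same move would be done on the frame buffer in hub RAM, most likely by the propeller itself rather than the main CPU.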
As I mentioned in my previous post, I am using a Parallax Propeller in my COLE-2 SBC project. The propeller is a neat little chip. I won’t go into a whole lot of detail about it here, since that is well-covered elsewhere, but the basics are that it’s an 8-core (or “cog”) processor running at 80 MHz, and is very hobbyist-friendly. It’s a bit of an odd duck from a programming point of view, but once you get used to it you can do some amazing things with it. There are literally hundreds of open source modules available implementing all sorts of software-defined peripherals, so you can do a lot while writing little to no code of your own, if you so choose.
What initially drew me to the propeller is its ability to generate video signals, both composite (NTSC or PAL) and analog VGA. It’s able to do this thanks to some custom hardware included in each cog that facilitates the generation of the proper timing and the shifting out of pixel data based on that timing. You can do this with as little as one cog, although multiple cogs working together will allow you to get higher resolutions and/or better color depth. The only limitations are that it is limited to 6-bit RRGGBB color, and there is only 32 KB of shared RAM (called “hub RAM”) available for holding your frame buffer and other shared data.
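To get a feel for those two limits, here is a small Python sketch that packs a 6-bit RRGGBB color and checks whether a bitmap frame buffer would fit in hub RAM. The function names are made up for illustration, and the exact bit positions a given video driver expects within a byte are not specified here, so treat the packing as an assumption.

```python
HUB_RAM = 32 * 1024  # bytes of shared hub RAM on the Propeller

def pack_rrggbb(r, g, b):
    """Pack three 2-bit channels (0-3 each) into one 6-bit RRGGBB value."""
    for c in (r, g, b):
        if not 0 <= c <= 3:
            raise ValueError("each channel is 2 bits (0-3)")
    return (r << 4) | (g << 2) | b

def bitmap_bytes(width, height, bits_per_pixel):
    """Bytes needed for a packed bitmap frame buffer."""
    return (width * height * bits_per_pixel + 7) // 8

print(bin(pack_rrggbb(3, 0, 1)))            # 0b110001
print(bitmap_bytes(320, 240, 2))            # 19200, fits in hub RAM
print(bitmap_bytes(640, 480, 1) > HUB_RAM)  # True: 38400 bytes does not fit
```

Even a 1-bit 640×480 bitmap overflows the 32 KB, which is one reason text and tile modes are attractive on this chip.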
Once you have the propeller integrated into your design, however, you might as well put all eight cogs to use; the only cost incurred is adding the extra driver code and maybe some I/O pins (more on that later). With that in mind I decided to see how much I could pack into that single chip.
The Bus Interface
To start with I need a way to interface the propeller to the rest of the system. Many projects just do this over a serial link; the chip can bit-bang serial at modest bit rates, and I could easily have attached it to the second port of the UART. But serial can be a real bottleneck for graphics modes, so I wanted some sort of parallel interface. In an ideal world I would let the chip pretend to be a 32 KB RAM chip, but this would eat up every single available I/O pin and would probably be overkill for the modest graphics modes the chip is capable of producing. So, I decided to borrow from the 1970s-era TMS9918, which did all configuration and VRAM access through just two addressable registers. This means I only need 1 address input instead of 15.
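As a rough illustration of how two ports can stand in for a whole address bus, here is a Python model of a hypothetical device with one data port and one address-setup port. The auto-incrementing pointer and the low-byte-then-high-byte address write are assumptions loosely modelled on the TMS9918 style, not the register layout actually used in this project.

```python
class TwoPortVideo:
    """Hypothetical two-port device: port 0 = VRAM data, port 1 = address setup."""

    def __init__(self, size=32 * 1024):
        self.vram = bytearray(size)
        self.addr = 0
        self._low = None  # pending low byte of a two-byte address write

    def write(self, port, value):
        value &= 0xFF
        if port == 0:                       # data port: store and auto-increment
            self.vram[self.addr] = value
            self.addr = (self.addr + 1) % len(self.vram)
        elif self._low is None:             # address port, first byte (low)
            self._low = value
        else:                               # address port, second byte (high)
            self.addr = ((value << 8) | self._low) % len(self.vram)
            self._low = None

    def read(self, port):
        if port != 0:
            return 0                        # no readable registers in this sketch
        value = self.vram[self.addr]
        self.addr = (self.addr + 1) % len(self.vram)
        return value

dev = TwoPortVideo()
dev.write(1, 0x00)
dev.write(1, 0x10)                          # point at VRAM address 0x1000
for ch in b"HI":
    dev.write(0, ch)                        # address auto-increments per write
```

The trade-off is an extra setup step before each burst of accesses, in exchange for needing only one address pin.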
To implement this I connected the propeller to 14 bus signals: D0-D7, a single address line, Φ2, /IOSEL2 (from my address decoder GAL), RWB, RDY, and /IRQ. The RDY line would be held low to halt the CPU while the propeller responds to a bus request, since it’s not fast enough to keep up with the CPU at full speed. All of these signals would be fed through a pair of 74LVC245 buffers, because the propeller is a 3.3V part but the rest of my system is 5V. So far so good, or so I thought…
RDY and Waiting
As it turns out using RDY with the 65816 is not quite as straightforward as I had hoped, due to the way it multiplexes the bank address onto the data bus. The bank address is emitted during the first half of the CPU cycle, when Φ2 is low. The data bus is connected to a 74ACT573 latch, which is kept open (transparent) during Φ2 low, but which closes and captures the bank address when Φ2 goes high.
Normally this setup works fine, and in fact it’s the exact design recommended by WDC. The problem comes in when you start trying to use RDY. When RDY is pulled low, the CPU halts as soon as Φ2 transitions from high to low. The actual Φ2 clock, however, does not stop. If RDY is kept low long enough for Φ2 to go high again, the bank address latch will capture some random data bus data as the bank address, and when the CPU finally resumes it will likely access the wrong memory address.
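The failure mode can be shown with a toy Python model of the transparent latch. Each cycle is reduced to two values: the RDY level, and whatever is on the data bus while Φ2 is low. This is a simplified sketch that follows the description above rather than a timing-accurate simulation, and the `gate_with_rdy` option stands for any scheme that keeps the latch closed while RDY is low.

```python
def run(cycles, gate_with_rdy):
    """Return the bank value held by the latch at the end of each cycle.

    cycles: list of (rdy, bus_during_phi2_low) tuples.
    gate_with_rdy: if True, the latch stays closed whenever RDY is low.
    """
    latched = None
    held = []
    for rdy, bus_low in cycles:
        latch_open = rdy or not gate_with_rdy
        if latch_open:
            latched = bus_low   # transparent while PHI2 is low, captured as it rises
        held.append(latched)
    return held

# One normal cycle in bank 0x01, then two halted cycles with unrelated bus data.
cycles = [(True, 0x01), (False, 0xEA), (False, 0x55)]
print(run(cycles, gate_with_rdy=False)[-1])  # 85 (0x55): wrong bank when the CPU resumes
print(run(cycles, gate_with_rdy=True)[-1])   # 1: bank preserved
```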
For my test implementation I solved this by using some extra lines on a GAL to construct a latch enable signal such that the latch remains closed as long as RDY is low. Unfortunately this seems to have made the system slightly unstable, even when the propeller is not being accessed (during which times it’s not even on the bus, as its data bus buffer is disabled until /IOSEL2 goes low while Φ2 is high). My GALs are fast (7 ns parts), but it’s possible the extra delay is causing the strange behavior.
For my next attempt I am going to try a different approach: halting the CPU’s Φ2 clock during the high phase using a circuit like this:
My current clock generator is the top half of that circuit, which means I already have the second half of the flip-flop available to use to add the bottom half; I just need to do the wiring. Once that’s done the propeller will pull /STP low instead of RDY, and my bank address latch will go back to being directly qualified by Φ2. I am hoping this will result in a stable system.
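Behaviourally, clock stretching amounts to a divide-by-two clock that refuses to leave the high phase while /STP is low. The Python sketch below models only that behaviour. It is not a description of the actual flip-flop wiring, and the function and parameter names are invented for illustration.

```python
def stretch_clock(stp_n_samples, phi2=False):
    """Divide-by-two clock that holds PHI2 high while /STP is low.

    stp_n_samples: level of /STP (True = high, inactive) at each master-clock edge.
    Returns the PHI2 level after each edge.
    """
    trace = []
    for stp_n in stp_n_samples:
        if phi2 and not stp_n:
            pass                 # stretch: stay in the high phase
        else:
            phi2 = not phi2      # normal toggle
        trace.append(phi2)
    return trace

trace = stretch_clock([True, True, True, False, False, True, True])
print("".join("H" if level else "L" for level in trace))  # HLHHHLH
```

Because the CPU never sees a Φ2 edge while it is paused, the bank address latch cannot be re-triggered mid-access, which is the whole point of the change.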
As a bonus, I am hoping this new setup will solve another issue I have so far ignored: when first powering on, the propeller takes a few seconds to boot, during which it is not properly asserting RDY. This causes boots to randomly fail. With the new setup /STP will be low during this time, so the CPU will not even try to boot until the propeller is up and running.
The Software
Writing the PASM code to implement the 65xx bus interface turned out to be much easier than I was expecting; I was able to hack out a working proof-of-concept implementation in a few hours. It isn’t even a lot of code; the basic code forming the main loop is just this:
mainloop        waitpeq Pin_PHI2, Pin_CS_PHI2   'Wait for /CS to go low with PHI2 high
                andn    outa, Pin_RDY           'Pull RDY low
                mov     _in, ina                'Capture the input port
                and     _in, Pin_RS WZ,NR       'Check RS bit (0 = vram, 1 = registers)
                and     _in, Pin_RWB            'Mask RWB bit for later
        if_e    jmp     #:vram
                tjz     _in, #write_register
                jmp     #read_register
:vram           tjz     _in, #write_vram
                jmp     #read_vram

'' Common code for all ops; unhalts the CPU, waits for /CS to go high and then loops
finish_request
                or      outa, Pin_RDY           'Unpause the CPU
                waitpeq Pin_CS, Pin_CS          'Wait for /CS to go high again
                andn    dira, Pins_Data         'Set data bus pins to high-Z (input state)
                jmp     #mainloop               'Rinse and repeat
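For readers who don't speak PASM, the branch structure of that loop boils down to a two-bit decode. This Python sketch mirrors it: the returned strings match the PASM labels, while the function itself is only an illustration.

```python
def dispatch(rs, rwb):
    """Mirror the branch structure of the PASM main loop.

    rs:  0 = VRAM port, 1 = register port
    rwb: 0 = CPU write, 1 = CPU read (65xx RWB convention)
    """
    target = "vram" if rs == 0 else "register"
    op = "write" if rwb == 0 else "read"
    return f"{op}_{target}"

for rs in (0, 1):
    for rwb in (0, 1):
        print(rs, rwb, dispatch(rs, rwb))
```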
In the end I had a working setup in which reading the propeller on either I/O port would return a constantly incrementing byte, which I could also change by writing to either port. This allowed me to verify that the bus interface was working properly.
With the test code working I’ve started adding useful functionality. So far I’ve gotten the VRAM read and write working, and I’ve been able to successfully fill the screen with characters using assembly code running on the main CPU.
Video
At the moment the video output is being driven by the “80×25 C0DF” driver from the waitvid.2048 repository. It generates 80×25 text using a 9×16 font; each character has an attribute byte associated with it that points to a 256-entry color palette. Each color entry in turn consists of a foreground/background color pair and a blink bit. The driver also supports two independent hardware cursors that can be a block or an underline, with or without blinking. The video buffer, color palette, and the font are all in hub RAM, so in theory they could all be made changeable by the main CPU.
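To show how a character and its attribute could be addressed in such a buffer, here is a short Python sketch. It assumes each cell is stored as a character byte followed by an attribute byte. That interleaved layout is an assumption for illustration, and the driver's real memory layout may differ.

```python
COLS, ROWS = 80, 25
BYTES_PER_CELL = 2   # assumption: character byte followed by attribute byte

def cell_offset(col, row):
    """Byte offset of a cell within the text buffer."""
    if not (0 <= col < COLS and 0 <= row < ROWS):
        raise ValueError("cell outside the 80x25 screen")
    return (row * COLS + col) * BYTES_PER_CELL

def put_char(buf, col, row, char, attr):
    """Store a character and its palette index at the given cell."""
    off = cell_offset(col, row)
    buf[off] = char & 0xFF
    buf[off + 1] = attr & 0xFF   # index into the 256-entry palette

buf = bytearray(COLS * ROWS * BYTES_PER_CELL)   # 4000 bytes under this layout
put_char(buf, 79, 24, ord("#"), 7)              # bottom-right cell
```

Under this assumption the whole text buffer takes 4000 bytes, a small slice of the 32 KB of hub RAM.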
I would like to offer the ability to switch to an alternate video driver (either a limited resolution bitmap, or perhaps a tiled driver with sprites), but as of yet I have not worked out how to accomplish this.
Sound
Since I have video, the most natural thing to add next is audio. As it turns out, someone has written a Propeller module called SIDcog that emulates the C64 SID chip. It takes only a single cog to run, uses very little hub RAM (just a couple dozen bytes for registers), and needs only two I/O pins.
The SIDcog module is very simple to use; you tell it what pins to use for left/right audio, and it returns a pointer to a block of emulated SID registers in hub memory. Reading or writing those locations will affect the emulated SID just like it would a real one. So, in theory, once I’ve finished implementing the write_register function in my bus interface I will be able to play sound.
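As an example of what writing those registers involves, the Python sketch below computes the 16-bit frequency value for voice 1 from a pitch in Hz, using the standard SID relation Fn = Fout × 2^24 / Fclk. The PAL clock rate is an assumption (SIDcog may emulate a different rate), and the `regs` array is just a stand-in for the register block in hub memory.

```python
SID_CLOCK_HZ = 985_248   # PAL C64 clock; assumption, SIDcog may use a different rate
FREQ_LO, FREQ_HI = 0, 1  # voice 1 frequency registers in the SID register map

def sid_freq_value(note_hz, clock_hz=SID_CLOCK_HZ):
    """16-bit SID frequency value: Fn = Fout * 2^24 / Fclk."""
    value = round(note_hz * (1 << 24) / clock_hz)
    if not 0 <= value <= 0xFFFF:
        raise ValueError("frequency out of range")
    return value

regs = bytearray(32)     # stand-in for the hub-RAM register block
fn = sid_freq_value(440.0)
regs[FREQ_LO] = fn & 0xFF
regs[FREQ_HI] = fn >> 8
print(hex(fn))           # 0x1d45
```

On the real system the two byte writes would go through the bus interface's register port instead of a local array.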
SPI
An SPI cog is the last piece I plan to add, since at that point I will be almost out of I/O pins. The propeller will handle the actual SPI transfers and signal the main CPU via interrupt once the transfer is complete. To allow reading or writing of entire SDcard blocks I plan to implement a small 512-byte buffer. I am not yet sure how this will be implemented at the bus interface side but I have some ideas.
Future Work
At the moment I’m hard at work getting the bus interface fully implemented. My focus is on getting the video registers implemented enough that I can try redirecting the console output to VGA. Expect another post (with pictures) once that VGA boot screen is working!
NOTE: I’m going to experiment with turning on commenting, starting with this post. We’ll see how it goes.
The nice thing about creating things as a hobby is that you can switch to a new thing whenever you get bored or frustrated with the current thing. That is exactly what happened to me a few months ago: I started getting frustrated with the limited features I had available on COLE-1, so I decided to just jump ahead and start the next iteration. The result is this:
That ugly mess of wires is actually a functional 2 MHz 65816-based SBC, called, not surprisingly, COLE-2.
Hardware Overview
COLE-2 will be a substantial upgrade from its predecessor:
6 MHz 65C816 CPU (largely limited by EEPROM speeds)
1 MB of RAM
256 KB of ROM w/ full system monitor, OS, and 16-bit BASIC
ATmega328p for PS/2 keyboard/mouse and two NES-compatible game ports
Dual 65C22 Versatile Interface Adapters
SD card reader
A user port
A single expansion slot
When I started this project my goal was to produce another headless SBC similar to COLE-1, but with an expansion slot that would act as a platform for developing my custom CPLD-based video system. However, it became apparent that I was trying to take too big a leap there, so I decided to add basic on-board video as an interim step.
My current prototype is scaled down from the final specs, since it’s implemented on solderless breadboards. It only runs at 2 MHz, and has only 512 KB of RAM, 32 KB of ROM, one VIA, and no ATmega. Much of this is just due to lack of space on the breadboard; as you can see I’m already spilling out onto additional boards, and all those long wire runs take their toll on stability, especially at higher speeds. My plan is to squeeze the ATmega in there somewhere to get a working keyboard port, and then move on to designing the PCB and building a full prototype.
In my next post I will dive into how I integrated the Propeller into my design.