Here we describe the hardware in REMEMOTECH :-
The normal Memotech memory map is implemented. As far as the CPU is concerned, it believes it has :-
The Altera DE1 has 4MB of Flash, 512KB of SRAM and 8MB of SDRAM.
During normal operation, what the CPU sees as ROM and RAM is provided by SRAM. The astute reader will note that 8KB+8x8KB+64KB+320KB is less than 512KB. In the remaining SRAM space sits REMON (8KB) and the read/write virtual cassette data space (48KB).
The FPGA also contains 1KB of on-chip ROM which contains a program called REBOOT.
In these pictures, ROMs are shown with their names and are 8KB in size and RAM pages are assigned letters and are 16KB in size. RAM pages α to δ are the normal 64KB present in an MTX512. RAM pages a to t are extra pages, which are used as 320KB of RAM Disc.
REMEMOTECH logical memory map, as seen in RELCPMH=0 mode :-
|any SRAM||any SRAM||E|
|any SRAM||any Flash||F|
REMEMOTECH logical memory map, as seen in RELCPMH=1 mode :-
|on-chip ROM||any SRAM||any Flash||F|
The on-chip memory is only 1KB in size, and so repeats 16 times.
The IOBYTE register is initialised to 0x8f on reset, so as to ensure execution starts from the on-chip memory.
In RAM pages 14 and 15, it is possible to address any 16KB page of SRAM or Flash.
Which page is visible at
0x4000..0x7fff is controlled by page
register 1 (port
which page is visible at
0x8000..0xbfff is controlled by page
register 2 (port
Just write the SRAM page number (range 0x00..0x1f) to the page register.
REMEMOTECH SRAM physical memory map (as a set of 16KB pages) :-
|SRAM address||SRAM page(s)||Content|
|0x00000..0x0ffff||0x00..0x03||RAM pages α to δ|
|0x10000..0x13fff||0x04||BASIC and ASSEM|
|0x14000..0x17fff||0x05||ROM 2 and ROM 3|
|0x18000..0x1bfff||0x06||CP/M boot and SDX ROM|
|0x1c000..0x1ffff||0x07||ROM 6 and ROM 7|
|0x20000..0x23fff||0x08||OS and REMON|
|0x24000..0x2ffff||0x09..0x0b||read/write virtual cassette area|
|0x30000..0x7ffff||0x0c..0x1f||RAM pages a to t, RAM Disc area|
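The table above can be captured as a small lookup, which also makes it easy to check that the regions account for the whole 512KB of SRAM. This is only a Python sketch for illustration; the names `SRAM_REGIONS`, `sram_page` and `sram_region` are made up here and do not come from the REMEMOTECH sources.

```python
PAGE_SIZE = 0x4000   # 16KB, the unit used by the page registers
SRAM_SIZE = 0x80000  # 512KB of SRAM on the DE1

# (first page, last page, content), transcribed from the table above
SRAM_REGIONS = [
    (0x00, 0x03, "RAM pages α to δ"),
    (0x04, 0x04, "BASIC and ASSEM"),
    (0x05, 0x05, "ROM 2 and ROM 3"),
    (0x06, 0x06, "CP/M boot and SDX ROM"),
    (0x07, 0x07, "ROM 6 and ROM 7"),
    (0x08, 0x08, "OS and REMON"),
    (0x09, 0x0b, "read/write virtual cassette area"),
    (0x0c, 0x1f, "RAM pages a to t, RAM Disc area"),
]


def sram_page(address):
    """Return the 16KB page number holding a physical SRAM address."""
    if not 0 <= address < SRAM_SIZE:
        raise ValueError("address outside the 512KB SRAM")
    return address // PAGE_SIZE


def sram_region(address):
    """Return the table entry describing a physical SRAM address."""
    page = sram_page(address)
    for first, last, content in SRAM_REGIONS:
        if first <= page <= last:
            return content
    raise LookupError("table does not cover page 0x%02x" % page)


# Total size covered by the table, in KB (pages are 16KB each)
total_kb = sum(last - first + 1 for first, last, _ in SRAM_REGIONS) * 16
```

For example, `sram_page(0x24000)` is 9, the first page of the virtual cassette area, and `total_kb` comes to 512.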
0x10000..0x7ffff of the REMEMOTECH Flash physical
memory map exactly matches the SRAM physical memory map.
0x00000..0x0ffff of the Flash isn't used as we don't initialise
the MTX512 RAM, only ROM images and Initial RAM Disc (pages a to t).
The first 64KB of the Flash chip has 8x8KB sectors and the rest of the
Flash is arranged in 64KB sectors - as we don't use the first 64KB,
we don't need any special case code for this.
0x10000..0x7ffff on the "Flash image" SD Card are copied to
0x10000..0x7ffff in Flash during initial setup, and
then from there to
0x10000..0x7ffff in SRAM during first startup
(if you press the right combination of keys on the DE1).
0x80000 and above in the Flash are divided into
56 64KB virtual tape slots.
As these are flash, they are read-only to MTX BASIC.
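As a quick check of the numbers above, the slot count and the Flash address range of each slot can be worked out as follows. This Python sketch assumes slots are numbered from 0 in ascending address order, which the text does not state; `tape_slot_range` is a name invented for the example.

```python
FLASH_SIZE = 4 * 1024 * 1024  # 4MB of Flash on the DE1
TAPE_BASE = 0x80000           # first Flash address used for tape slots
SLOT_SIZE = 0x10000           # 64KB per slot

NUM_SLOTS = (FLASH_SIZE - TAPE_BASE) // SLOT_SIZE


def tape_slot_range(slot):
    """Return (first, last) Flash address of a virtual tape slot.

    Assumes slot 0 is the one at 0x80000 (numbering is an assumption).
    """
    if not 0 <= slot < NUM_SLOTS:
        raise ValueError("slot out of range")
    start = TAPE_BASE + slot * SLOT_SIZE
    return start, start + SLOT_SIZE - 1
```

`NUM_SLOTS` works out at 56, and the last slot ends at 0x3fffff, the top of the 4MB Flash.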
REMEMOTECH uses the
clone from OpenCores.
It runs it in the
FastZ80 mode, in which non-M1 CPU cycles
execute in 3T, so for a given speed, it should be slightly faster than a
It runs this at integer divisions of 25MHz, as controlled by switches SW9 to SW7 and reflected in LEDs LEDR9 to LEDR7 :-
The speed may be changed during operation. It's safe to do this because the design avoids glitch problems associated with gated clocks.
4.166MHz is the closest to 4MHz that I could easily obtain, whilst retaining the ability to switch to faster speeds.
The CPU can discern the current 3 bit clock divider value (minus 1) by inputting from port 0xd8.
When RAM Page 15 is selected, Flash is visible in the address space.
If the switches select
000 (ie: 25MHz) then in fact the system
will be slowed to 12.5MHz, and the LEDs and port
This is to ensure the CPU does not go too fast for the 70ns Flash memory.
I do this because I couldn't get wait-states to work properly.
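The relationship between the switches and the CPU clock can be sketched as below. This is Python for illustration only, and it assumes a 3 bit switch value v selects a divide by v+1, which is consistent with 000 giving 25MHz, with 4.166MHz being 25MHz/6, and with port 0xd8 returning the divider minus 1. The function name and the `flash_visible` parameter are invented for the example.

```python
BASE_HZ = 25_000_000


def cpu_clock_hz(switches, flash_visible=False):
    """Return the CPU clock for a 3 bit switch value (SW9..SW7).

    Assumes switch value v means divide-by-(v+1). When Flash is visible
    (RAM Page 15 selected), a request for 25MHz is slowed to 12.5MHz.
    """
    divider = (switches & 0x07) + 1
    if flash_visible and divider == 1:
        divider = 2  # too fast for 70ns Flash, so fall back to 12.5MHz
    return BASE_HZ / divider
```

With this model, `cpu_clock_hz(5)` is about 4.166MHz and `cpu_clock_hz(0, flash_visible=True)` is 12.5MHz.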
Implements a useful subset of the TMS9918A (datasheet) VDP chip, and the PAL TMS9929A equivalent.
This implementation outputs 256x192 pixels doubled to 512x384, with border, to VGA.
It can output a non-standard 640x480 @ 50Hz signal, which is preferred as it means that VDP interrupts will occur at 50Hz, as they did on Memotech computers sold everywhere outside the US. You may have difficulty finding a monitor which copes with this, and if so, try looking for a UK LCD TV which also has a VGA input.
Alternatively, the VDP can be switched into 60Hz mode using SW4, which causes a 640x480 @ 60Hz VGA signal to be produced and VDP interrupts to occur at 60Hz. This can cause some games to go 20% faster. Some games (those which require 0.02s of processing time between frames) can miss the first end of frame and end up waiting for the next, and thus go twice as slow. Games written for the US market (if there are any) should be fine.
SW5 can be used to switch between the palette as I and Richard F. Drushel remember it, and the palette that Marat Fayzullin suggests better reflects the actual values used in the VDP chip itself.
This implementation of the VDP has debug features, which in REMEMOTECH are activated by certain keys :-
These can be useful to give you a quick idea as to how certain games are constructed. In Text Mode, F10 doesn't work too well, as each character cell is only 6 pixels wide, and the hex code needs 8 pixels. Pressing F12 shifts the hex code so you can see the right 2 pixels of it.
16KB of Cyclone II M4K is used as the VDP memory, rather than external SRAM, thus neatly avoiding competing with T80 for the same external SRAM.
The implementation is a little unusual in that the processor interface to the VRAM is not via the VDP chip, it's coded externally. This VDP implementation is an engine which reads from dual port VRAM and, based on its registers, renders up a picture on VGA. In theory it's possible to memory map the VRAM into the T80 memory space and avoid all the messing around with control and data port based access. In fact, REMEMOTECH faithfully implements the port 1 and 2 based access to memory, with the auto-incrementing address register.
Alternative VHDL implementations of the VDP include
FPGA Colecovision Project,
(aka ESE MSX System 3),
and FPGA Arcade.
REMEMOTECH works with a UK PS/2 keyboard. It maps this to the UK MTX keyboard arrangement.
REMEMOTECH attempts to map the PS/2 keyboard to the MTX keyboard as closely as possible, but there are several problems :-
( is shift 9 on the PS/2 keyboard, but it is shift 8 on the MTX keyboard. This isn't a problem for REMEMOTECH. If you type
( by typing shift 9 on the PS/2 keyboard, REMEMOTECH presses shift 8 on the MTX keyboard, thus producing the expected result.
^ is shift 6 on the PS/2 keyboard, but it is unshifted on the MTX keyboard. eg:
= is unshifted on the PS/2 keyboard, but it is shift - on the MTX keyboard. This is a huge problem, because if I try to work around this by lying about the shift state, due to the way the keyboard sense bits are typically scanned in MTX software, I observe intermittent typing errors. So, to type certain MTX characters, you must type different things on the PS/2 keyboard. This remapping is done to ensure an unshifted PS/2 keypress corresponds to an unshifted MTX keypress and a shifted PS/2 keypress corresponds to a shifted MTX keypress.
Insert key presses MTX
HOME key. The MTX numeric keypad is physically laid out so that the
HOME button is in the middle and the arrows are grouped around it. Therefore REMEMOTECH maps the PS/2 numeric keypad (on the far right) so that 5 corresponds to
HOME and the arrows around it work as advertised. So that games like Qogo and Reveal work as advertised, the whole PS/2 keypad maps to the whole MTX keypad, based on key position, not what is written on the keys.
Tables showing the effect of the above follow...
The effect of the shift-state problem :-
    Use PS/2 keypress  to produce MTX keystroke
    -----------------  ------------------------
    ^                  =
    =                  ^
    '                  @
    @                  '
    #                  :
    shift `            `
Mapping of the middle part of the host PC keyboard :-
    Middle part of PS/2 keyboard  MTX keypad
    ----------------------------  ----------
    PgUp    End     Pause         7 PAGE  8 EOL   9 BRK
    Tab     Up      Delete        4 TAB   5 UP    6 DEL
    Left    Home    Right         1 LEFT  2 HOME  3 RIGHT
    Insert  Down    PgDn          0 INS   . DOWN  ENT CLS
Mapping of the number pad of the PS/2 keyboard :-
    PS/2 number pad               MTX keypad
    ---------------               ----------
    Num Lock  /       *           7 PAGE  8 EOL   9 BRK
    Home      Up      PgUp        4 TAB   5 UP    6 DEL
    Left      Middle  Right       1 LEFT  2 HOME  3 RIGHT
    End       Down    PgDn        0 INS   . DOWN  ENT CLS
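Because both mappings are purely positional, they can be expressed as grids of the same shape and zipped together. The following Python sketch just restates the two tables above; the key names are the labels used in those tables, and `by_position` is a helper invented for the example, not something from the REMEMOTECH VHDL.

```python
# MTX keypad, 4 rows of 3 keys, as laid out in the tables above
MTX_KEYPAD = [
    ["7 PAGE", "8 EOL", "9 BRK"],
    ["4 TAB", "5 UP", "6 DEL"],
    ["1 LEFT", "2 HOME", "3 RIGHT"],
    ["0 INS", ". DOWN", "ENT CLS"],
]

# Middle part of the PS/2 keyboard, same shape
PS2_MIDDLE = [
    ["PgUp", "End", "Pause"],
    ["Tab", "Up", "Delete"],
    ["Left", "Home", "Right"],
    ["Insert", "Down", "PgDn"],
]

# PS/2 number pad, same shape
PS2_NUMPAD = [
    ["Num Lock", "/", "*"],
    ["Home", "Up", "PgUp"],
    ["Left", "Middle", "Right"],
    ["End", "Down", "PgDn"],
]


def by_position(ps2_grid):
    """Map each PS/2 key to the MTX keypad key in the same position."""
    mapping = {}
    for ps2_row, mtx_row in zip(ps2_grid, MTX_KEYPAD):
        for ps2_key, mtx_key in zip(ps2_row, mtx_row):
            mapping[ps2_key] = mtx_key
    return mapping
```

So `by_position(PS2_NUMPAD)["Middle"]` is `"2 HOME"`, which is why the 5 key on the number pad acts as the fire key in most games.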
REMEMOTECH doesn't cope well with keyboards that don't have a number pad. To try to ease this a little, the Alt and AltGr keys on a PS/2 keyboard are treated as the Home key. This is important as Home is used by most games as the fire key.
PS/2 keyboard F1-F8 become the MTX keyboard F1-F8.
Certain special keys have no equivalent on the MTX keyboard, and are available internally to control REMEMOTECH hardware. In particular, the left and right Windows keys, when pressed together, reset the system. And F9 to F12 control debug features in the VDP. Special keys on the numeric pad are available to the processor.
Older PC keyboards have a limitation in that 3 keypresses at once
can cause the phantom appearance of a 4th keypress.
Newer keyboards detect when this would be the case, and suppress the
The MTX joystick appears to press the arrow keys.
This means moving diagonally and pressing fire counts as
3 keypresses, and will not work as expected.
Read the article on why
Keyboards Are Evil
for a full explanation.
This is an unavoidable limitation, and REMEMOTECH suffers from it.
REMEMOTECH implements the SN76489A sound chip (datasheet).
This implementation produces a signed sound value. However, the Altera DE1 does not provide direct access to a DAC which converts this value into a voltage on the line-out sound jack. Instead it has a WM8731 audio CODEC in the way.
So I used some VHDL from Mike Stirling's BBC Micro on an FPGA project. One VHDL entity programs registers into the CODEC, and I found I needed to tweak one register to raise the sampling/processing frequency from 8KHz to 48KHz, as the sound chip can generate higher frequencies than 4KHz. Another VHDL entity sends the signed sound value to the CODEC.
In the original SN76489A sound chip, the outputs of the 4 sound channels are analog summed to produce the final analog output. A straight digital "sum of square waves" implementation produces some unwanted noise in the final output signal. So in my sound chip I implement a simple smoothing algorithm, to try to take the "edges" off the square waves, thus producing a nicer sound output.
Switches SW3 and SW2 provide 4 volume levels: Off, 1/4, 1/2, full.
An alternative VHDL implementation of the SN76489A can be found in the
FPGA Colecovision Project.
REMEMOTECH implements a useful approximation to a Z80 CTC (datasheet). It is modelled on the CTC implementation in MEMU, and so is known to be enough of an implementation to keep all known software happy, but it is acknowledged that it is a subset of the real thing. It doesn't differentiate between rising and falling edges, it doesn't support the "timer trigger" bit, and it doesn't support daisy chaining of CTCs (the MTX only had one anyway).
A comparison of CTC inputs :-
|Counter||0||VDP interrupt||VDP interrupt|
|1||4MHz/13 exactly||4MHz/13 approx|
|2||4MHz/13 exactly||4MHz/13 approx|
Channels 1 and 2 were typically used to generate clocks for the Z80 DART. REMEMOTECH doesn't include serial port support, but similar inputs to these channels are provided in case programs expect to generate interrupts at time intervals computed from them.
The CTC has a special non-standard hack built-in.
The PANEL and
VDEB.COM debuggers write to channel 2 and set it
up in timer mode with a prescaler of 16 and a counter of 13 to ensure there
is an interrupt raised immediately after the next single stepped instruction.
This clever trick allows ROM to be single stepped.
The CTC spots when channel 2 is programmed in this way and then ensures there
will be an interrupt 13*16=208 CPU clocks later, regardless of the fact
that the CTC timer input may not match the CPU clock speed.
Anyone wanting to use my CTC VHDL in their project would need to remove
I wrote this CTC because I was unable to source free VHDL for one.
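To make the single step hack described above more concrete, here is a Python sketch of how a channel's programming could be decoded and the 13*16=208 case recognised. The bit positions are those of the standard Z80 CTC channel control word (bit 0 marks a control word, bit 5 selects prescaler 256 rather than 16, bit 6 selects counter rather than timer mode). The function names, and the example control word 0x85, are illustrative assumptions; this is not a transcription of the REMEMOTECH VHDL, nor necessarily the exact value the debuggers write.

```python
def ctc_timer_clocks(control_word, time_constant):
    """Return the timer interval in input clocks, or None in counter mode."""
    if not control_word & 0x01:
        raise ValueError("bit 0 clear: this is a vector, not a control word")
    if control_word & 0x40:
        return None  # counter mode, prescaler not used
    prescaler = 256 if control_word & 0x20 else 16
    count = time_constant if time_constant != 0 else 256  # 0 means 256
    return prescaler * count


def is_single_step_setup(control_word, time_constant):
    """True for timer mode, prescaler 16, time constant 13 (208 clocks)."""
    if control_word & 0x40 or control_word & 0x20:
        return False
    return time_constant == 13
```

For instance, `ctc_timer_clocks(0x85, 13)` gives 208, the number of CPU clocks the hack guarantees before the interrupt.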
REMEMOTECH does not support loading or saving to cassette tape.
Almost all of the Memotech library on cassette has been converted into
.MTX file format.
Instead, REMEMOTECH supports "virtual cassette tapes".
A hidden 48KB area of SRAM is used as a read/write virtual cassette tape. 56 64KB areas of Flash are used as read-only virtual cassette tapes. Looking at the known library of Memotech cassettes, almost all of them will fit within 48KB.
Virtual cassette tapes are accessed from CP/M using the
I have no plans to provide support for the printer ports.
I imagine it would be hard to source a Centronics printer nowadays.
I have no plans to provide support for the PIO port.
I have no plans to provide serial port support.
Floppy Disc Controllers
I have no plans to implement floppy disk drive support of either
the FDX or SDX variety.
I certainly wouldn't want to hook up real drives to the Altera DE1.
Just think of the power drain.
REMEMOTECH can access the SD Card on the DE1 instead. SD Cards between 64MB and 1GB are supported. Only 64MB of data may be stored on them. REMEMOTECH considers them to contain 8 8MB partitions. This is somewhat generous, as the entire Memotech software library will fit comfortably within one 8MB partition.
It accesses this using the SPI interface.
It has hardware support for driving the SPI interface so that byte transfer
speed is effectively limited by the T80.
It has a novel feature in that reading data from SPI on one port triggers
the sending of an
0xff byte to trigger the next transfer.
This means that reading of data from SD Card needn't be twice as slow
as writing it (as it would otherwise be).
Unfortunately the fact that CP/M sectors are 128 bytes and SD Card blocks are 512 bytes makes the whole thing somewhat inefficient. To read a 128 byte sector, we must read the enclosing 512 byte block. And to write a 128 byte sector, we must read the enclosing 512 byte block, modify a part of it, then write it back. Even with this handicap, it's still usable. Clever driver software helps improve things.
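The read-modify-write described above can be sketched as follows. This is Python for illustration, with a dict standing in for the SD Card. It assumes the 8 partitions are laid out back to back from block 0 and that sectors are numbered linearly within a partition; neither detail is stated here, and the real driver may differ. All names are invented for the example.

```python
CPM_SECTOR = 128                    # bytes per CP/M sector
SD_BLOCK = 512                      # bytes per SD Card block
PARTITION_BYTES = 8 * 1024 * 1024   # 8MB per partition
NUM_PARTITIONS = 8
BLOCKS_PER_PARTITION = PARTITION_BYTES // SD_BLOCK
SECTORS_PER_BLOCK = SD_BLOCK // CPM_SECTOR


def locate(partition, sector):
    """Return (SD block number, byte offset in block) for a CP/M sector."""
    if not 0 <= partition < NUM_PARTITIONS:
        raise ValueError("partition out of range")
    if not 0 <= sector < PARTITION_BYTES // CPM_SECTOR:
        raise ValueError("sector out of range")
    block = partition * BLOCKS_PER_PARTITION + sector // SECTORS_PER_BLOCK
    offset = (sector % SECTORS_PER_BLOCK) * CPM_SECTOR
    return block, offset


def read_sector(card, partition, sector):
    """Read the enclosing 512 byte block, return the 128 bytes wanted."""
    block, offset = locate(partition, sector)
    data = card.get(block, bytes(SD_BLOCK))  # unwritten blocks read as 0s
    return data[offset:offset + CPM_SECTOR]


def write_sector(card, partition, sector, data):
    """Read the enclosing block, patch 128 bytes, write the block back."""
    if len(data) != CPM_SECTOR:
        raise ValueError("a CP/M sector is exactly 128 bytes")
    block, offset = locate(partition, sector)
    buf = bytearray(card.get(block, bytes(SD_BLOCK)))
    buf[offset:offset + CPM_SECTOR] = data
    card[block] = bytes(buf)
```

Every `write_sector` costs one block read and one block write, which is the inefficiency referred to above; a driver that caches the most recently used block can avoid many of these transfers.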
LEDR0 flashes when SD Card is being accessed and for a couple of seconds afterwards, and the intent is that the user doesn't remove the SD Card until the LED goes off. This simple feature allows the SD Card driver code to go faster.
Note that the net suggests that 8MB is the largest disk size CP/M 2.2
can cope with, due to how it does its internal arithmetic.
Even if you could go larger than this, you'd start to have memory problems,
as CP/M keeps allocation and check vectors in (scarce) high memory,
and these are related to the size of the disk.
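The arithmetic behind this can be illustrated with a short Python sketch. It relies on two things I believe to be true of CP/M 2.2 but which are not stated above: that record numbers within a drive are 16 bit counts of 128 byte records, and that the allocation vector needs (DSM/8)+1 bytes, where DSM is the highest block number. The block sizes used in the example are assumptions, not necessarily what REMEMOTECH uses.

```python
RECORD_BYTES = 128
MAX_RECORDS = 65536  # assumed: 16 bit record arithmetic in CP/M 2.2


def max_disk_bytes():
    """Largest drive addressable with 16 bit record numbers."""
    return RECORD_BYTES * MAX_RECORDS


def alv_bytes(disk_bytes, block_bytes):
    """High memory needed for the allocation vector, (DSM/8)+1 bytes."""
    dsm = disk_bytes // block_bytes - 1  # highest block number
    return dsm // 8 + 1
```

With these assumptions, `max_disk_bytes()` is exactly 8MB, and an 8MB drive with 4KB blocks needs 256 bytes of allocation vector (512 bytes with 2KB blocks), per drive.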
I had originally planned to support these by mapping I/O requests to accesses to SDRAM. To do this I would have had to integrate an SDRAM controller.
Back in the day, Silicon Discs were a lot faster than floppy disks, but now the benefit of SDRAM access over SD Card access is less clear. Also, Silicon Discs could be a lot bigger than floppy disks, but now the SD Card support provides access to more storage than there is SDRAM.
In the end I decided it isn't worthwhile to support Silicon Discs.
As a result, precious high memory has been freed up, which allows
the user to make effective use of the partitions on SD Cards.
80 column card
REMEMOTECH implements a video card which is largely compatible with the original FDX 80 column card.
It outputs in 8 colours to VGA, 640x480 at 60Hz. I have no plans to output RGB or Composite video, like the FDX did.
In addition to the normal 80x24 mode, it also supports 80x48 mode. To do this it has 8KB of memory, rather than 4KB.
It supports accesses to ports 0x30, 0x31, 0x32, 0x33, 0x38 and 0x39. Inputting from port 0x30 does not cause the bell to ring.
It emulates a subset of the 6845 CRTC registers (as per datasheet), specifically registers 10, 12, 13, 14 and 15. In addition, it has REMEMOTECH special register 31, in which bit 0 controls whether it is 80x24 or 80x48.
The normal Memotech alphanumeric font is present in on-chip ROM in
The graphics characters are programmatically generated from the graphic
character number, saving 2.5KB of scarce on-chip memory.
VGA monitor support
The SW6 switch determines whether the VDP signal or the 80 column card signal is output to the VGA connector on the Altera DE1.
This is good enough for most purposes, as usually you are interested
in the text, or the graphics.
But occasionally, you might be doing something that would benefit from seeing
both at the same time, such as using
VDEB to debug a game.
Whichever signal is not being output to the VGA connector is output on certain pins on the GPIO_1 JP2 socket. With a suitable cable and adapter, this can be wired up to another VGA monitor. When wired up, it looks like this :-
Do not use a 40 pin IDE/ATA/UDMA cable, even though these have 40 pins and fit nicely, as these cables short together various pins (which are all supposed to be GND), per this diagram. Instead, I used an old floppy disk connector, trimmed to size, and with the plastic casing trimmed with a pen-knife near the topmost pins (as per the photograph), so that inserting the cable does not push nearby pins in the GPIO_1 JP2 socket to the side.
I obtained a VGA breakout board. I use this upside down.
The mapping is like this :-
|Purpose||GPIO_1 signal (0-35)||GPIO connector PIN (1-40)||Label on breakout board||Resistor value|
|none **||19||22||B1||Not populated|
|GND *||30||G1||Not populated|
|none **||33||38||R1||Not populated|
* Notice that GND annoyingly appears on the connector where I would like to emit part of the green signal. So this limits me to only passing 3 bits (even though the VDP generates 4).
** As I can only do 3 bits of green, I also only do 3 bits of red and blue.
The connections are made like this (looking from above) :-
The Altera DE1 has 47 Ω resistors between the FPGA and the GPIO socket already.
The VGA connector on the DE1 has 120 Ω resistors for horizontal and vertical sync. This adapter is effectively working with 147 Ω which seems to be ok.
The VGA connector on the DE1 has 1K Ω and 2K Ω resistors, which it combines in series and parallel to produce resistances of 500 Ω, 1K Ω, 2K Ω, and 4K Ω for bits 3, 2, 1 and 0 of each of red, green and blue. This adapter is using resistances of 511 Ω, 1047 Ω and 2247 Ω for bits 3 2 and 1 of each of red, green and blue.
This means that at full brightness, the resistance is
1/(1/511+1/1047+1/2247) = 298 Ω.
As the GPIO outputs are 3.3V LVTTL, and the VGA monitor has 75 Ω internal
resistance, the voltage delivered is
3.3*75/(298+75) = 0.66V.
This is reasonable, given it is supposed to be 0.7V.
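The calculation can be checked with a few lines of Python. The sketch only evaluates the two expressions given above (three resistances in parallel, then a voltage divider into the monitor's 75 Ω); the helper names are invented for the example.

```python
def parallel(*resistances):
    """Combined resistance of resistors wired in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


def divider_volts(v_source, r_series, r_load):
    """Voltage across r_load when fed from v_source via r_series."""
    return v_source * r_load / (r_series + r_load)


# Bits 3, 2 and 1 all driven high: 511, 1047 and 2247 ohms in parallel
r_full = parallel(511, 1047, 2247)

# 3.3V LVTTL output into the 75 ohm input of the VGA monitor
v_full = divider_volts(3.3, r_full, 75)
```

Evaluating this gives `r_full` of roughly 298 Ω and `v_full` of roughly 0.66V, a little under the nominal 0.7V.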
REMEMOTECH r2 or later includes a rudimentary arithmetic accelerator. Once enabled, this appears at ports 0A0H to 0A5H.
The accelerator uses quite a lot of FPGA resources.
It supports 32 bit integers (unsigned and signed).
It also supports the MTX BASIC floating point format. This is a 5 byte format, comprised of
A floating point value is of the form :-
(-1)^s * 1.m * 2^(e-81H)
so 5.0 would be :-
(-1)^0 * 1.01 * 2^(83H-81H)
and would be represented by MTX BASIC in memory as :-
    offset  value  meaning
    0       00     mantissa bits -24..-31
    1       00     mantissa bits -16..-23
    2       00     mantissa bits -8..-15
    3       20     sign is 0, and mantissa bits -1..-7
    4       83     exponent
Zero (both integer and floating point) has the special representation
00 00 00 00 00.
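The format described above can be modelled in a few lines of Python, which is handy for working out the bytes to feed to the accelerator. This is a sketch under assumptions: it rounds to nearest when packing, and it treats stored exponents 01H to FFH as the usable range, neither of which is confirmed here. The function names are invented for the example.

```python
import math

BIAS = 0x81


def mtx_encode(value):
    """Pack a Python float into the 5 byte MTX BASIC format."""
    if value == 0:
        return bytes(5)  # zero has the special all-zero representation
    sign = 1 if value < 0 else 0
    m, e = math.frexp(abs(value))          # abs(value) = m * 2**e, 0.5 <= m < 1
    exponent = (e - 1) + BIAS              # rewrite as 1.f * 2**(e-1)
    fraction = round((2 * m - 1) * 2**31)  # 31 mantissa bits after the "1."
    if fraction == 2**31:                  # rounding carried into the "1."
        fraction = 0
        exponent += 1
    if not 1 <= exponent <= 0xFF:          # assumed usable exponent range
        raise OverflowError("exponent out of range")
    word = (sign << 31) | fraction
    # offsets 0..3 hold the mantissa, least significant byte first
    return word.to_bytes(4, "little") + bytes([exponent])


def mtx_decode(data):
    """Unpack 5 bytes in MTX BASIC format into a Python float."""
    if len(data) != 5:
        raise ValueError("expected exactly 5 bytes")
    if data == bytes(5):
        return 0.0
    word = int.from_bytes(data[:4], "little")
    sign = word >> 31
    fraction = word & 0x7FFFFFFF
    return (-1) ** sign * (1 + fraction / 2**31) * 2.0 ** (data[4] - BIAS)
```

For example, `mtx_encode(5.0).hex()` is `0000002083`, matching the table above, byte for byte from offset 0 to offset 4.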
The hardware supports an 8 element stack and includes forth-like operations to manipulate it.
The C_LIT operation pushes 0 onto the stack.
The top-of-stack can then be modified to your desired value
by writing to ports, or by using other operations that explicitly set it.
The hardware doesn't bounds check the use of the stack. It's up to you to ensure you don't push or pop too many times.
The hardware supports these operations :-
C_INIT (sets result to
R_OK and empties stack),
C_OK (sets result to
C_1 (sets top of stack to 1)
C_1P0 (sets top of stack to 1.0),
C_2P0 (sets top of stack to 2.0), ... and many other useful constants
Division by zero is detected.
You may wonder why there are separate
They do produce the same bit pattern, but only in the bottom 32 bits.
The accelerator computes a full 64 bit product, and you can use the
C_HMUL operation to push the high 32 bits on to the stack.
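The point about the bottom 32 bits can be demonstrated with a short Python model, assuming (as the text suggests) that the operations in question are the signed and unsigned 32 bit multiplies. `mul64` is a name invented for this sketch and says nothing about how the hardware is built.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(bits):
    """Interpret a 32 bit pattern as a two's complement signed value."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits


def mul64(a_bits, b_bits, signed):
    """Multiply two 32 bit patterns, return (low 32 bits, high 32 bits)."""
    if signed:
        a, b = to_signed32(a_bits), to_signed32(b_bits)
    else:
        a, b = a_bits & MASK32, b_bits & MASK32
    product = (a * b) & MASK64  # full 64 bit product, two's complement
    return product & MASK32, product >> 32


low_u, high_u = mul64(0xFFFFFFFF, 2, signed=False)  # 4294967295 * 2
low_s, high_s = mul64(0xFFFFFFFF, 2, signed=True)   # -1 * 2
```

Here `low_u` and `low_s` are both 0xFFFFFFFE, but `high_u` is 1 while `high_s` is 0xFFFFFFFF, so the two kinds of multiply only differ in what the high half holds.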
The floating point calculations incorporate rounding, so (1.0/3.0)*3.0 does evaluate to 1.0, rather than 0.9999..
The floating point calculations do also detect overflow and underflow conditions.
After instructing an operation, reading result register returns
R_BUSY until the operation completes, and then it finally returns
Most operations take a cycle or two, and as this is much quicker than
the Z80 can issue instructions, there is no point in polling.
However, the divide and modulo related instructions take 34 cycles.
        INCLUDE PORTS.INC       ; P_ port values
        INCLUDE NUMACCEL.INC    ; C_ command and R_ result values

; enable accelerator
        IN      A,(P_RIZEQ)
        OR      40H
        OUT     (P_RIZEQ),A

; push 1.0, ie: + 1.0 x 2^0
        LD      A,C_LIT
        OUT     (P_NCMD),A
        LD      A,081H
        OUT     (P_EXP),A
        LD      A,000H
        OUT     (P_MAN3),A
        OUT     (P_MAN2),A
        OUT     (P_MAN1),A
        OUT     (P_MAN0),A

; push 3.0, ie: + 1.1 x 2^1
        LD      A,C_LIT
        OUT     (P_NCMD),A
        LD      A,082H
        OUT     (P_EXP),A
        LD      A,040H
        OUT     (P_MAN3),A
        LD      A,000H
        OUT     (P_MAN2),A
        OUT     (P_MAN1),A
        OUT     (P_MAN0),A

; fdiv
        LD      A,C_FDIV
        OUT     (P_NCMD),A
WAIT:   IN      A,(P_NRES)
        CP      R_BUSY
        JR      Z,WAIT
; with these operands, the result will be R_OK
; with other operands, could be R_DIV0, R_OVER or R_UNDR

; query the top-of-stack value
        IN      A,(P_EXP)       ; will be 7F
        IN      A,(P_MAN3)      ; will be 2A
        IN      A,(P_MAN2)      ; will be AA
        IN      A,(P_MAN1)      ; will be AA
        IN      A,(P_MAN0)      ; will be AB (note rounding)
; ie: + 1.01010101.. x 2^-2

; drop the result
        LD      A,C_DROP
        OUT     (P_NCMD),A
RENUMT.COM is a test for the
RENUM.COM is a program which
enables the accelerator and patches the MTX BASIC ROM to use it.
REMEMOTECH r2 or later includes support for port 7.
When you output to port 7, it presents bits 7 to 0 of the data byte on the GPIO_0 connector on PINs 13,15,..,25.
When you input from port 7, it reads bits 7 to 0 from GPIO_0 connector PINs 14,16,...,26.
These PINs were chosen as they are contiguous runs of 8 PINs, sandwiched conveniently between 5V, 3.3V and GND lines.
At this time, this feature is untested.
Be sure to study the Altera DE1 manual before connecting the GPIO header
to any homebrew electronics.
REMEMOTECH also has other miscellaneous bits of hardware not found on real Memotechs. Most of these take advantage of bells and whistles on the Altera DE1.
0xc0 to read/write HEX1 and HEX0 seven segment displays
0xc1 to read/write HEX3 and HEX2 seven segment displays
0xc4 to read/write LEDG7 to LEDG0 green LEDs
0xc5 to read KEY2 to KEY0 (REBOOT reads these)
0xc7 to read PS/2 various special number pad keys (REMON reads this)