Friday, December 22, 2017

Pocket Calculator: Stack

For this calculator, I am working on a more sophisticated stack system than I had for my RPN Scientific Calculator. Instead of stack items being fixed at the maximum length, which wastes a lot of memory, item length varies. With only 4k to work with on the LPC1114, it will be very important to conserve memory wherever possible. As I have it now, each item has a header that indicates its position on the RPN stack, its name, and an ID number for variables in functions. RPN stack positions start at one so that variables marked zero will not appear on the RPN stack. This is useful for named variables, so that they can use the same stack system internally. The ID numbers will allow me to keep track of variables local to functions, so I can easily clear them when the function exits.

The system I have now writes the result of an operation directly to the top of the stack then marks the two consumed items as free memory. This has the advantage of allowing the result to be larger than the operands, which can be useful for things like string concatenation. The screenshot above shows how I was testing the system in C# with a simple calculator that can only add. Freeing items like this will make the stack grow fairly quickly and eventually require garbage collection. The advantage, however, is that calculations should be very fast until garbage collection is required. A slower method would be to scan the stack for a free section that is large enough for the result before adding to the stack, but this will eventually require garbage collection as well. My suspicion is that just adding to the top of the stack will be faster overall, even if garbage collection happens more often.

To keep track of the memory, I decided to implement my own garbage collection system, since malloc in C would lead to unrecoverable fragmentation. On IRC one day I made the mistake of asking about how to implement garbage collected stacks like these internally and ended up wasting over an hour arguing with someone who was convinced I was going about everything totally wrong. In the end, one of the suggestions was to have the calculator user push a 0 onto the stack manually before entering values to be operated on so that the system can overwrite the 0 with the result. This recommendation (and a few of the others given) were just laughably horrible and would seriously cripple the functionality of the calculator. Sometimes people on IRC can give good suggestions, but I also have to remind myself not to argue with people who don't really know what they are talking about but won't admit it.

One of the useful suggestions given by a different person was to use two stacks for calculations. The result goes on the second stack and is then copied back to the main stack overwriting the consumed operands. In theory this system will never need garbage collection, although it seems to me that it will take the same amount of time on average. It may be simpler since calculating free memory, for example, is comparing just one pointer, rather than crawling the stack adding up free chunks. This system assumes, however, that operations always work on the top of the stack, which is not the case when named variables share the same internal stack as the RPN stack. It might make more sense in that case to use the second stack only for named variables and do garbage collection on that stack, while keeping the first stack RPN only to prevent garbage collection.

Sunday, December 3, 2017

New Project: Pocket Calculator

Recently I got interested in building a very small calculator I can carry around with me. The other calculator projects I worked on are neat, but I don't carry them around because they are too big and fragile. This time I want to make something really practical but also small enough to fit in my pocket. The smallest way I know how to do this with the chips I'm familiar with is using the same LPC1114 in DIP28 package I used for my Programmable RPN Calculator. There are a few changes I would like to make for this calculator, though. First, I want to be able to manipulate all kinds of data on the calculator's stack, such as strings, instead of just floating point BCD numbers. Second, I plan to redo the BCD format I used in the other calculators to store the least significant digits first, which should make them a little easier to deal with. Third, I plan to make all items on the stack variable length, rather than fixed length like on my other calculators. This will be necessary for this calculator since I plan to store the whole stack in the 4k RAM the microcontroller has, instead of on an external SPI RAM chip.

The main new feature I want this calculator to have is a BASIC interpreter. My experience with the first RPN calculator I worked on showed how much of the microcontroller's flash can be eaten up by interface code, so I really don't think I could fit all the calculator functions and interpreter in the LPC1114's 32k of flash. Instead, I plan to store a virtual machine on the chip that can interpret code stored on a large capacity flash or FRAM chip. Retrieving the code over SPI and interpreting it will be much slower than executing code natively on the chip, but I think it will be fast enough for the UI, BASIC editor, and maybe even the tokenizing part of the BASIC editor. On one hand I could switch to a chip with bigger flash, which I am not anxious to do, but I will need some kind of virtual machine anyway to interpret the code generated by the BASIC editor, so it is no loss to use the same virtual machine for the UI. The BASIC programs and editor data will be stored on an SPI SRAM chip, which should work fast enough to edit and tokenize programs, while all the stack and program variables will be stored on the microcontroller.

Tuesday, November 21, 2017

BCD Addition Speed Test: 6502 vs 8051

12345.6789 + 9999.99999 packed BCD on 8051
A few years ago I decided to revisit the microcontroller comparison I had done to see how much of a speed up I could get recoding my BCD addition routine in 8051 assembly. At the time I was surprised that it was 3-4 times faster than the C version. That original BCD routine, which I used for my RPN Scientific Calculator, uses unpacked BCD characters arranged with the most significant bytes first. For the calculator I want to build with a 6502, I wrote a new routine that stores the least significant bytes first in packed BCD. This takes advantage of the 6502's decimal mode to quickly add packed BCD bytes and avoids having to shift all the bytes when an extra byte has to be added for the final carry. It turns out that the 8051 has an instruction, DA, to correct the result of BCD additions from their binary result back into BCD, so it is easy to do packed BCD for it as well. To compare the speed of the two chips for use in a calculator, I translated the 6502 packed BCD routine to 8051 assembly and ran both versions in their respective simulators.

Result after the break:

Wednesday, November 15, 2017

6502 BCD Multiplication

In preparation for the 6502-based calculator I am planning on building, I started thinking about the best way to do BCD multiplication. One of the reasons I was interested in working with the 6502 is its built in BCD mode. This works great for add and subtract but the chip does not have any sort of multiply or divide, BCD or otherwise. (Some of the 6800 family I was researching before do have hardware multiply, as does the 8051.) There are several routines that can make BCD multiplication faster, so I tried some of them on the 6502 to see which one works best.

To test these I used the Kowalski 6502 Simulator, which is neat because you can assemble in the simulator and then single step or run the program and see which line is being executed. Another convenient thing is that an input/output window is mapped into memory starting at at $E000. I also like the help window which automatically gives you information on op codes when you type them. On the other hand, the included macro system is extremely primitive and you can't do much with it, especially compared to the excellent macro system in CA65. It has a lot of other shortcomings and is fairly buggy, but I don't want to complain since it is free software. After I got frustrated at the lack of basic macro functionality, I tried assembling with CA65 and loading the binary into the simulator. The diassembler seems to work well but of course there is no way for the simulator to know label or function names. For these multiplication tests I decided to just make do with the simulator since I don't need much macro functionality. In the future, though, I will go back to using the 6502 trainer I made before.

To run the tests I made a loop that multiplies all combinations of BCD numbers $00 x $00 to $99 x $99. The Kowalski simulator has a function to count cycles, so I used that to find the time of an empty loop. For each test I recorded the cycles it took, subtracted the loop overhead time, and divided by 10,000 to get average cycles per calculation. After I was done, I went back and used look up tables for operations like shifting by 4 and halving numbers to speed things up a little. I also tried checking for 0 and 1 as arguments to skip the rest of the calculation, but this was not faster. In the table below, the table size column takes into account all the space the tables take up, including unused padding bytes inside the table. Here are the results in the order I did the tests:

MethodCycle averageCode sizeTable size
1Repeated additions1,596.23340
2Repeated additions, minimize loops1,236.66480
44x8 lookup table101.006510k
54x8 lookup table, byte aligned97.006114.8k
6Russian peasant algorithm267.79660
7Quarter square table (half table)115.3385816
8Quarter square table (full table)118.00781,632
9Quarter square table (BCD to dec to BCD)86.00511,816
10Quarter square table (BCD to dec only)60.0034792
11Half squares113.60126200
12Simulated ROM lookup31.002046k

Descriptions of methods below:

Friday, November 3, 2017

6502 Address Decoding

For the 6502 graphing calculator I started working on two years ago I wanted to use an ATF1508 CPLD to do the address decoding. In another post, I wrote about how difficult it was to get this chip going. For one, the program I found to generate SVF files for it is 16-bit only. At the time I was running it in a virtual machine, but I don't have my Windows XP disk any more. If I keep using the homemade setup for it I had before, instead of buying a programmer (which is expensive), I think I might be able to run the program in a DOS emulator without Windows.

6502 address decoder with SRAM
Another way I was thinking about doing the decoding is with a 32k 12ns SRAM. For each address, the SRAM would act as a lookup table for the chip selects and other signals that the CPLD normaly does. This would not be as fast as the 7.5ns of the ATF1508, but it would still be very fast. One way would be to hold the 6502 in reset and cycle a counter through all 16-bits of the address space copying information from an EEPROM to the SRAM. After a certain amount of time, a timer would fire to disable that EEPROM and the counter and enable the SRAM, EEPROM, and 6502 of the main system. I'm not sure I got all of the signals exactly right but I made this schematic to illustrate approximately what I was thinking. The disadvantages of this setup is that the extra SRAM consumes 160ma, which would be a lot for the battery-based calculator I want to build, and the extra space the counters and second EEPROM would take.

Another way I thought about was disabling the main SRAM on startup but directing all reads to the EEPROM and all writes to the address decoding SRAM. This would let me write the whole lookup table into that SRAM with the 6502 . A transparent buffer would separate that SRAM from the data bus and prevent it from being written to further after a timer fires. If I got everything right, this would require more glue logic, but it would get rid of the counters and extra EEPROM of the last design.

A third way I thought of is to skip writing the memory decoder SRAM and use an EEPROM. Unfortunately, that would need 45ns or so to do the decoding, which is a lot slower than a CPLD. One alternative I found is 8k 25ns NVRAM, which stores its value permanently in an EEPROM but loads its content into SRAM when it turns on. This would give me 8 byte resolution, so I would probably lose less than 200 bytes of address space for peripherals. Also, if I drive the SPI clock with the NVRAM then I could set it low for one 8 byte stretch and fill that stretch with NOP instructions. Then to pull the clock low for a given number of cycles, I would just have to jump to the corresponding NOP in that stretch of memory.

On the 6502 forum I found a really great page on 6502 bus timing. If I understand the timing correctly, the 6502 needs up to 30ns to output its address and then the address decoding logic can go to work (either 12 or 25ns in this case). After that, the 74HC670 I have in the diagram will need 16ns up to nearly 40ns in the worst case to drive the high bits of the 512k SRAM. If I can get away with only two windows into the SRAM then it would only be about 6ns max if I replace the 74HC670 with a buffer and 74AS244. In order to avoid accidentally writing to wrong addresses, those delays will have to propagate before the clock goes high and the glue logic brings the WE\ of the SRAM low. This would allow me to run between 5 and 10MHz, depending on the combination. This is a lot less than the 14 MHz maximum the W65C02 I am using is rated for but it might work until I get the kinks out of working with the ATF1508.

Wednesday, November 1, 2017

6800 Family Tree

After about two years away from electronics, I have some time for my projects now that I am almost finished with my degree. Lately, I've been thinking about a lot of things and I want to start some new calculator projects and pick up where I left off on some other projects. One thing I have been thinking about is working on a simpler 6502 based calculator than I was working on before to get more experience working with the chip. I think the 6502 graphing calculator I was working on did not turn out very well because I rushed through everything too fast trying to get ready for Makevention in 2015.

Along those lines I recently got curious about other chips I could use to make a calculator. The last time I was curious about that I compared some of the microcontrollers I was interested in to see how fast they compute. One of my plans now is to do some other comparisons, which I think will be more useful. This time I want to compare processors like the 6502 as well, so I looked at other chips that would be fun to use. One group that I have been interested in for a while is the 6800 family like the 6809 and 6309. I could not find a family tree on the internet, so I started making one myself. When I was almost done, I found this tree in a document about the 6804:

Sunday, December 27, 2015

Makevention 2015

Makevention went really well! I was able to show everything I have been working on, although I didn't have time to finish everything. I made a poster board explaining some concepts related to calculators like RPN, BCD, and CORDIC on the left side with information about my RPN Scientific Calculator on the right. I also had a print out of how my external RAM preprocessor works. A few of the people who came by my table were programmers and understood the solution.

On the table itself I had different calculators and projects laid out in roughly chronological order. The first was an MK-61 RPN calculator I added to my collection over 10 years ago. Of course I didn't make it but I wanted to show an example of an RPN calculator and what inspired me to first make calculators. When I first got the calculator out to test it, it didn't show anything on the screen and I was afraid it was broken. I took it completely apart and looked for burst capacitors. I also brushed off the battery contacts since they didn't look very clean. When I turned it on, it still didn't show anything but when I pressed a number it showed up! It seems the calculator just doesn't display anything when first turned on, instead of the zero I was expecting. Somehow I had forgotten that in the 10 years its been since I last used it.

The Casio Algebra FX 2.0 was included to show the RPN stack program I wrote for it in 2002 or so. That program was designed to make the calculator function more like the HP-48GX I had at the time which I admired so much. First, I had to buy a calculator to replace the Algebra FX 2.0+ I sold in college. Loading the program onto the calculator took quite a while to figure out! In 2002 I used a serial cable, which I think was an SB-87, to load programs without any trouble. Nowadays I don't have a serial port, so I bought an FA-124 USB cable. It turns out that this cable doesn't work on Windows 7. I also couldn't get it to work correctly under Windows XP running in an emulator. Next I tried a lot of different third-party software and finally got the program to load using FlashCOM 1.4v and an FTDI cable. For the link port I bought a 2.5mm plug that I could easily wire to the FTDI cable. After that, the program I loaded kept crashing and I thought the transfer was corrupt. After more fiddling I got Turbo C up and running in an emulator and compiled the program source myself, which produced a fairly stable program that was good enough for exhibiting, although it did crash a couple times during Makevention. Someday I would like to fix the program up and make it more stable.

The next thing I showed was a launchpad hooked up to a breadboard with a button and blinking LED to show how easy it is to get started with electronics. The next step was my binary LED calculator. I soldered a coin battery holder on to the board. After that I showed the RPN Scientific Calculator and explained that it was not too difficult to move to this stage after I managed to finish the binary LED calculator. The Programmable RPN Scientific Calculator was next. I printed temporary a colored keypad on poster paper which I taped to it, but it did not work very well. Unexpectedly, the calculator showed random keypresses when I put my hand near it. All the keys were in the same column on the keyboard, which makes me thing there is a bad connection in that row of the key matrix.

Next I showed the Improved 6502 Virtual Trainer up and running on my laptop, along with the first 6502 Virtual Trainer. I also showed what I had accomplished with the 6502 graphing calculator. So far, I have most of the hardware for it installed but I am having problems keeping the buffer chips for driving the LCD at the appropriate voltage level. Maybe the pins joining the sections of the calculator together aren't making a good connection. In any case, I showed the calculator taken apart into its sections. I also showed the Logic Tool I have been using to debug the graphing calculator and the EEPROM programmer I built. The last thing I showed was the TI-86 I bought to hook up to an ESP8266. My plan was to relay text from the calculator over WiFi to a computer running Matlab or similar software to allow the calculator to solve more complex equations. I still haven't gotten the ESP8266 voltage levels to work right, but I want to finish this project when I have time.

All in all it was a great day and I really enjoyed showing the public all the things I have been working