One of the projects I started on for my Summer Hackathon is a new version of my RPN calculator. This version uses an LPC1114 instead of the two MSP430s the last version used. Hopefully this version will be much smaller and faster. Even though I'm not yet at a good stopping place I wanted to make a post about it so that the last post is not overly long.
 Porting the code from the MSP430 was not too difficult. The only real trouble was moving some of the memory registers from the external RAM to the internal RAM of the LPC1114. This really increases calculation speed since it eliminates the bottleneck of accessing external memory. On the MSP430 version I used a preprocessor function to identify and replace any array access to the external RAM with functions calls. This worked well since all of the registers were stored externally but caused a few headaches on the new version since internally stored registers need to be handled differently. One idea I had was to use 32 bit pointers, which wouldn't cause the performance hit they do on a 16-bit system, to somehow encode not only the memory address, but also the memory bank and whether an address is stored externally or internally. It also seems like it would make more sense to allocate memory buffers on a stack as they are needed, rather than hardcoding them in memory. This got me thinking about how other systems handle this problem and I decided to add a way to do that to my 6502 project with a CPLD.
Porting the code from the MSP430 was not too difficult. The only real trouble was moving some of the memory registers from the external RAM to the internal RAM of the LPC1114. This really increases calculation speed since it eliminates the bottleneck of accessing external memory. On the MSP430 version I used a preprocessor function to identify and replace any array access to the external RAM with functions calls. This worked well since all of the registers were stored externally but caused a few headaches on the new version since internally stored registers need to be handled differently. One idea I had was to use 32 bit pointers, which wouldn't cause the performance hit they do on a 16-bit system, to somehow encode not only the memory address, but also the memory bank and whether an address is stored externally or internally. It also seems like it would make more sense to allocate memory buffers on a stack as they are needed, rather than hardcoding them in memory. This got me thinking about how other systems handle this problem and I decided to add a way to do that to my 6502 project with a CPLD.
Along the way I also noticed a few areas where I could improve my code. For some reason I got in the habit of reusing generic counter variables like x or i in an effort to save memory. This just makes everything harder to read and does not save any memory since GCC optimizes how variables are used anyway. BCD numbers should also be stored with the smallest places first since that way extending the number by a decimal place only involves incrementing the length field rather than moving every byte of the number one space. In retrospect 255 decimal places is probably way too much for such a simple calculator. This didn't make much difference when I had lots of external RAM but is inconvenient now that I am trying to fit as many registers as I can into the LPC1114's memory.
To read and write the external RAM I'm using the chip's SPI peripheral. It turned out to be somewhat difficult to configure since several of the steps to get it to work have to be done in a certain order. Debugging this was a lot easier with a logic analyzer. After I had it working and got the memory registers straightened out, I was a little surprised to see that calculating sine was slower than on the MSP430. This was a shock since it was so much faster in my microcontroller comparison. I tried raising the SPI peripheral speed to 24MHz, even though the max speed of the RAM is 20MHz, but it was still slower. My next step is to try and measure the clock speed of the SPI with the logic analyzer to make sure it really is going that fast. After I got sine to work, I uncommented other parts of the program in parts to see how much would fit in the flash. Compiling with GCC's -O3 option for speed optimization quickly overflowed the flash but all of it fit in less than 20KB with -Os. One of my gripes with GCC is that it doesn't tell you by how much the firmware is too large, so you have no clue how much you would need to shave off for it to fit.
Initializing and toggling the pins of the LPC1114 is a bit more complicated than on the MSP430 since different pins have different configuration bits and the pins themselves are not mapped into memory in order. I had a look at the HAL library and it seems to use quite a bit of code for what I want to accomplish so I wrote a simple class for handling pins. Luckily I can use C++ since I am just compiling with GCC, unlike with OpenSTM32 where I was stuck with plain C. One neat thing someone showed me was how to use operator overloading so that something like "Pin=1;" does the same as "Pin.high();"
Since I have so much space left over in the flash, I started adding a keystroke programming function. This was one of the neat things I have always liked about earlier calculators. I am nearly done with the editor on the screen for it, which I'm working on with the PC version of the program since it is much faster to test. The next time I do a project like this I will put the PC and embedded version in the same file and enable them with include statements instead of copying the changed sections of the PC version. So far I am only adding the ability to repeat keys and not loop or branch since I want to finish this and at least one other project in time for Makevention in August. When I have more time I hope to add that functionality.
 Porting the code from the MSP430 was not too difficult. The only real trouble was moving some of the memory registers from the external RAM to the internal RAM of the LPC1114. This really increases calculation speed since it eliminates the bottleneck of accessing external memory. On the MSP430 version I used a preprocessor function to identify and replace any array access to the external RAM with functions calls. This worked well since all of the registers were stored externally but caused a few headaches on the new version since internally stored registers need to be handled differently. One idea I had was to use 32 bit pointers, which wouldn't cause the performance hit they do on a 16-bit system, to somehow encode not only the memory address, but also the memory bank and whether an address is stored externally or internally. It also seems like it would make more sense to allocate memory buffers on a stack as they are needed, rather than hardcoding them in memory. This got me thinking about how other systems handle this problem and I decided to add a way to do that to my 6502 project with a CPLD.
Porting the code from the MSP430 was not too difficult. The only real trouble was moving some of the memory registers from the external RAM to the internal RAM of the LPC1114. This really increases calculation speed since it eliminates the bottleneck of accessing external memory. On the MSP430 version I used a preprocessor function to identify and replace any array access to the external RAM with functions calls. This worked well since all of the registers were stored externally but caused a few headaches on the new version since internally stored registers need to be handled differently. One idea I had was to use 32 bit pointers, which wouldn't cause the performance hit they do on a 16-bit system, to somehow encode not only the memory address, but also the memory bank and whether an address is stored externally or internally. It also seems like it would make more sense to allocate memory buffers on a stack as they are needed, rather than hardcoding them in memory. This got me thinking about how other systems handle this problem and I decided to add a way to do that to my 6502 project with a CPLD.Along the way I also noticed a few areas where I could improve my code. For some reason I got in the habit of reusing generic counter variables like x or i in an effort to save memory. This just makes everything harder to read and does not save any memory since GCC optimizes how variables are used anyway. BCD numbers should also be stored with the smallest places first since that way extending the number by a decimal place only involves incrementing the length field rather than moving every byte of the number one space. In retrospect 255 decimal places is probably way too much for such a simple calculator. This didn't make much difference when I had lots of external RAM but is inconvenient now that I am trying to fit as many registers as I can into the LPC1114's memory.
To read and write the external RAM I'm using the chip's SPI peripheral. It turned out to be somewhat difficult to configure since several of the steps to get it to work have to be done in a certain order. Debugging this was a lot easier with a logic analyzer. After I had it working and got the memory registers straightened out, I was a little surprised to see that calculating sine was slower than on the MSP430. This was a shock since it was so much faster in my microcontroller comparison. I tried raising the SPI peripheral speed to 24MHz, even though the max speed of the RAM is 20MHz, but it was still slower. My next step is to try and measure the clock speed of the SPI with the logic analyzer to make sure it really is going that fast. After I got sine to work, I uncommented other parts of the program in parts to see how much would fit in the flash. Compiling with GCC's -O3 option for speed optimization quickly overflowed the flash but all of it fit in less than 20KB with -Os. One of my gripes with GCC is that it doesn't tell you by how much the firmware is too large, so you have no clue how much you would need to shave off for it to fit.
Initializing and toggling the pins of the LPC1114 is a bit more complicated than on the MSP430 since different pins have different configuration bits and the pins themselves are not mapped into memory in order. I had a look at the HAL library and it seems to use quite a bit of code for what I want to accomplish so I wrote a simple class for handling pins. Luckily I can use C++ since I am just compiling with GCC, unlike with OpenSTM32 where I was stuck with plain C. One neat thing someone showed me was how to use operator overloading so that something like "Pin=1;" does the same as "Pin.high();"
Since I have so much space left over in the flash, I started adding a keystroke programming function. This was one of the neat things I have always liked about earlier calculators. I am nearly done with the editor on the screen for it, which I'm working on with the PC version of the program since it is much faster to test. The next time I do a project like this I will put the PC and embedded version in the same file and enable them with include statements instead of copying the changed sections of the PC version. So far I am only adding the ability to repeat keys and not loop or branch since I want to finish this and at least one other project in time for Makevention in August. When I have more time I hope to add that functionality.
 
No comments:
Post a Comment