Here are the results for the tests I did with the MSP430 and the LPC1114. Lately I have been busy with work so I haven't had a chance to do anything with the other three chips. I do have tool chains set up for them and I plan to run the same tests on them soon too.
To time the tests I used an MSP430 LaunchPad which waited for a pulse from the tested chips to begin and end timing. The timer was run from the VLO which is accurate to ±3% so the results are not exact but they give a rough idea of the difference in speed. The tests were done inside two nested loops which iterate altogether 2,400,000 times. Each operation is done once per iteration except adding and GPIO operations which are done 10 times since they should be much faster. At first I did each test 5 times and took the average but then I did them only once since the results were usually very close and some of the later tests took quite a while. To begin with I timed how long the empty loop took then I subtracted that time from the results to get the actual time the operations took. For both chips I used gcc to compile with both -Os and -O3, although there wasn't really a difference except for the BCD routine tests.
As you can see from the table, the LPC1114 was much faster than the MSP430, even running from the bottlenecked flash. The results would be even faster if the tests had been copied to RAM and run from there. I was also surprised to realize that the LPC1114 has a barrel shifter which is very handy. Another neat feature is GPIO masking. Each mask is memory mapped so individuals pins can be toggled with a single write to memory. This allows GPIO operations to be much faster than the traditional load, modify, and store routine.
To time the tests I used an MSP430 LaunchPad which waited for a pulse from the tested chips to begin and end timing. The timer was run from the VLO which is accurate to ±3% so the results are not exact but they give a rough idea of the difference in speed. The tests were done inside two nested loops which iterate altogether 2,400,000 times. Each operation is done once per iteration except adding and GPIO operations which are done 10 times since they should be much faster. At first I did each test 5 times and took the average but then I did them only once since the results were usually very close and some of the later tests took quite a while. To begin with I timed how long the empty loop took then I subtracted that time from the results to get the actual time the operations took. For both chips I used gcc to compile with both -Os and -O3, although there wasn't really a difference except for the BCD routine tests.
As you can see from the table, the LPC1114 was much faster than the MSP430, even running from the bottlenecked flash. The results would be even faster if the tests had been copied to RAM and run from there. I was also surprised to realize that the LPC1114 has a barrel shifter which is very handy. Another neat feature is GPIO masking. Each mask is memory mapped so individuals pins can be toggled with a single write to memory. This allows GPIO operations to be much faster than the traditional load, modify, and store routine.