1806 Dhrystone Results – Solid If Unspectacular
The 1806 port seems to be working reliably now. Almost all of the adaptation is done in the macros, conditioned on setting CPU to 1805 at assembly time (the 1804/5/6 share the enhanced instruction set; 1805 is just what the assembler recognizes). The only changes to the machine description file are to use new epilog/prolog files and to add one to the stack offsets for variables and formal parameters. I could probably fold that back into the macros as well, and if I make any backwards-compatible improvements to the macros or other code I’ll probably do so. As it stands, code created by the new target XR18NW won’t work on an 1802.
Rerunning the Dhrystone benchmark, the 1806 clocks 81 Dhrystones/sec vs 68 on the 1802 (both at 4 MHz), an improvement of roughly 19%. These compare to over 300 Dhrystones/sec for a Z80 with a much better compiler. The 1806 executes 2900 instructions per pass vs 3660 for the 1802, which sounds more impressive, but some of the 1806 instructions take longer than the 1802’s, which makes up the difference.
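To make the arithmetic behind those numbers explicit, here is a rough sketch in Python that works out the speedup and the average cost per instruction from the figures above. The one thing it assumes that isn't stated in the post is that a machine cycle is 8 clock periods on both parts; the function and variable names are just for illustration.

```python
# Figures quoted in the post.
CLOCK_HZ = 4_000_000
RESULTS = {
    "1802": {"dhrystones_per_sec": 68, "instructions_per_pass": 3660},
    "1806": {"dhrystones_per_sec": 81, "instructions_per_pass": 2900},
}

# Assumption (not stated in the post): 8 clock periods per machine cycle
# on both the 1802 and the 1806.
CLOCKS_PER_MACHINE_CYCLE = 8


def summarize(cpu):
    """Derive instructions/sec and average machine cycles per instruction."""
    r = RESULTS[cpu]
    instr_per_sec = r["dhrystones_per_sec"] * r["instructions_per_pass"]
    machine_cycles_per_sec = CLOCK_HZ / CLOCKS_PER_MACHINE_CYCLE
    return {
        "instructions_per_sec": instr_per_sec,
        "machine_cycles_per_instruction": machine_cycles_per_sec / instr_per_sec,
    }


def compare():
    """Relative gain in Dhrystones/sec and relative drop in instructions per pass."""
    old, new = RESULTS["1802"], RESULTS["1806"]
    return {
        "speedup": new["dhrystones_per_sec"] / old["dhrystones_per_sec"] - 1,
        "instruction_reduction": 1 - new["instructions_per_pass"] / old["instructions_per_pass"],
    }


if __name__ == "__main__":
    for cpu in RESULTS:
        s = summarize(cpu)
        print(cpu, s["instructions_per_sec"], "instr/sec,",
              round(s["machine_cycles_per_instruction"], 2), "machine cycles/instr")
    c = compare()
    print("speedup: {:.1%}, fewer instructions: {:.1%}".format(
        c["speedup"], c["instruction_reduction"]))
```

Under that assumption the 1802 averages about 2 machine cycles per instruction while the 1806 averages a bit more, which is consistent with the longer 1806 instructions eating part of the saving in instruction count.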
So, the changes for the 1806 are:
- adding one to stack offsets for parameters and variables
- using SCAL/SRET for call/return
- using RLDI for loading 16-bit constants
- using RLXA and RSXD for 16-bit stack access where practical
- getting rid of places where I inc/dec the stack pointer to make a work area, because the stack pointer now points below the last used byte
- fixing up the odd place where the stack discipline was getting me in trouble, usually to do with calls from one assembly routine to another
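The "add one to the stack offsets" item follows directly from where the stack pointer rests. Here is a toy Python model of the two conventions, not the actual macros or machine description: the function names and the 16-byte memory are made up for illustration, and it assumes the old convention left the stack pointer on the last used byte.

```python
def push_old(mem, sp, byte):
    """Old convention (assumed): SP rests on the last used byte,
    so decrement first, then store."""
    sp -= 1
    mem[sp] = byte
    return sp


def push_new(mem, sp, byte):
    """New convention: SP rests one below the last used byte,
    so store first, then decrement."""
    mem[sp] = byte
    return sp - 1


def build_frame(push, sp, values, size=16):
    """Push each value onto a fresh toy memory and return (memory, final SP)."""
    mem = [0] * size
    for v in values:
        sp = push(mem, sp, v)
    return mem, sp


def frame_offset(k, sp_points_below):
    """Offset from SP to the k-th byte of the frame under each convention."""
    return k + 1 if sp_points_below else k


if __name__ == "__main__":
    values = [0xAA, 0xBB, 0xCC]
    mem_old, sp_old = build_frame(push_old, 16, values)  # empty stack: SP just past the top
    mem_new, sp_new = build_frame(push_new, 15, values)  # empty stack: SP on the first free byte
    for k in range(len(values)):
        old_byte = mem_old[sp_old + frame_offset(k, False)]
        new_byte = mem_new[sp_new + frame_offset(k, True)]
        print(k, hex(old_byte), hex(new_byte))
```

Both columns print the same bytes, which is the point: the frame contents are identical and only the offset from the stack pointer shifts by one.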
The remaining obvious 1806 instruction is decrement and branch if non-zero (DBNZ), which I will probably incorporate, but it’s surprisingly tough to do and probably wouldn’t affect the Dhrystone results at all. All of the 1806 instructions seem more advantageous to someone writing tight code in assembly than to a compiler.
Looking at the generated code reminds me of just how clunky some of it is. I’m tempted to go back and incorporate liveness analysis in my optimizer and have a better look at chaining primitives in the machine description file.