The Rhinestone C Compiler on Google Sites

October 25, 2013

The latest version of the LCC1802 compiler has been optimized based on the Dhrystone benchmark program.

The Dhrystone integer benchmark dates from the 1980s and is not much used anymore, but I was able to find the source, which is written in C: http://www.netlib.org/benchmark/dhry-c  It is very well documented and fairly easy to understand at a high level. It exercises integer math, procedure calls, and string manipulation. It ends with a printout of predicted and actual results so you can tell it worked.

Compiling it for the 1802, I only had to comment out the parts that call the operating system for time information. Other than that, it compiled and ran without a hitch. (So yay!)

My initial run took about 7.5 seconds for 100 passes, or 13.333 Dhrystones/second. This compares to a classic VAX 11/780, which did 1757 Dhrystones/sec, i.e. the VAX score is about 132 times the 1802’s.

I think the VAX executed 500,000 instructions per second vs. 100,000 for the MC, so the VAX did 500000/1757, or roughly 284 VAX instructions per pass, and the 1802 is doing 100000/13.333, or about 7500. Frankly, looking at the benchmark code, 284 seems like an impossibly small number for the work being done, but 7500 certainly left room for improvement.

The most recent run scored 27, executing 3697 instructions per pass. The 27 compares favorably with the 36 score of the 6502 in an Apple IIe, since the 1802 is running well below its rated speed.
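The arithmetic behind these figures can be sketched in a few lines of C. This is only an illustration: the function names are made up for the example, and the instruction rates (500,000/sec for the VAX, 100,000/sec for the 1802) are the rough guesses quoted above, not measured values.

```c
#include <assert.h>

/* Dhrystones per second from a timed run of `passes` passes. */
double dhrystones_per_sec(double passes, double seconds)
{
    assert(seconds > 0.0);
    return passes / seconds;
}

/* Rough instructions executed per pass, given an ASSUMED
   instruction rate for the machine. */
double instructions_per_pass(double instr_per_sec, double dhry_per_sec)
{
    assert(dhry_per_sec > 0.0);
    return instr_per_sec / dhry_per_sec;
}
```

With these, `dhrystones_per_sec(100, 7.5)` gives about 13.33, `instructions_per_pass(100000, 13.333)` gives about 7500, and `instructions_per_pass(500000, 1757)` gives a little under 285. Plugging in a rounded score of 27 gives about 3704 rather than 3697, so the 3697 figure presumably comes from an unrounded score.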

By far the biggest improvements came from improving support routines rather than directly optimizing the emitted code.
- The compiler now generates inline shifts and adds for multiplications by small constants, and the multiplication routine has been optimized for small operands (which are common);
- The division routine was re-coded to better use 1802 instructions and, again, expedite smaller divides;
- Two common string routines (copy and compare) were re-written in assembly.
Changing these routines pulled out thousands of instructions from each pass.
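To make the multiply and divide changes concrete, here is a minimal C sketch of the general techniques named in the list. It is not the LCC1802 runtime, which is written in 1802 assembly; the function names `times10`, `mul16` and `div16`, and the choice of 16-bit unsigned operands, are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply by a small constant with shifts and adds:
   10*x = 8*x + 2*x. Result wraps modulo 2^16. */
uint16_t times10(uint16_t x)
{
    return (uint16_t)((x << 3) + (x << 1));
}

/* Shift-and-add multiply that stops as soon as the multiplier
   has no set bits left, so small multipliers finish quickly. */
uint16_t mul16(uint16_t a, uint16_t b)
{
    uint16_t product = 0;
    while (b != 0) {
        if (b & 1)
            product = (uint16_t)(product + a);
        a = (uint16_t)(a << 1);
        b >>= 1;
    }
    return product;
}

/* Shift-and-subtract divide that first lines the divisor up with
   the dividend, so the loop runs only as many steps as the
   operands need instead of a fixed 16. */
uint16_t div16(uint16_t n, uint16_t d)
{
    uint16_t quotient = 0;
    uint16_t bit = 1;

    assert(d != 0);
    /* Shift the divisor left until it reaches the dividend
       or its top bit is set. */
    while (d < n && !(d & 0x8000u)) {
        d = (uint16_t)(d << 1);
        bit = (uint16_t)(bit << 1);
    }
    /* Subtract back down, one quotient bit per step. */
    while (bit != 0) {
        if (n >= d) {
            n = (uint16_t)(n - d);
            quotient |= bit;
        }
        d >>= 1;
        bit >>= 1;
    }
    return quotient;
}
```

For example, `mul16(7, 6)` loops three times rather than sixteen, and `div16(100, 7)` needs five subtract steps. The payoff in both cases comes from the loop count tracking the size of the operands rather than the width of the word.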

The peephole optimizer runs over the code after it’s been emitted by the compiler and before it’s assembled, looking for simple changes such as combining multiple accesses to the same storage location or eliminating unnecessary register loads. By comparison, the peephole rules pulled out 500-600 instructions.
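As a rough illustration of what one such rule might look like, the sketch below drops a load that immediately follows a store of the same register to the same location. The `st`/`ld` line syntax, the function names, and the rule itself are hypothetical; the real LCC1802 peephole rules work on 1802 assembly and are not shown in this post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `next` reloads exactly what `prev` just stored.
   Lines use a made-up syntax: "st <operands>" and "ld <operands>". */
static int is_redundant_reload(const char *prev, const char *next)
{
    return strncmp(prev, "st ", 3) == 0 &&
           strncmp(next, "ld ", 3) == 0 &&
           strcmp(prev + 3, next + 3) == 0;
}

/* Compacts the listing in place and returns the new line count. */
size_t peephole(const char *code[], size_t count)
{
    size_t out = 0;
    size_t i;

    assert(code != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (out > 0 && is_redundant_reload(code[out - 1], code[i]))
            continue;               /* register already holds the value */
        code[out++] = code[i];
    }
    return out;
}
```

Given the four lines `st R11,count`, `ld R11,count`, `add R11,R12`, `ld R11,count`, only the first reload is removed: the last one follows an instruction that changes the register, so it has to stay. A real rule set would also need to respect labels and branch targets, which this sketch ignores.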

At the same time, the object module got about 10-20% smaller so overall I’m calling this a win.

 
