1806 Speed Can Impress

March 9, 2017

I fought my way past the stack discipline issue and cleaned up the use of SCAL/SRET, and it seems that some aspects of the 1806 do give noticeable speed improvements. I was running the test suite on the latest compiler version and I got a persistent, mysterious failure in the float test. The test program just seemed to hang toward the end and stop printing. I assumed something was goobering the stack and going off into la-la land, but it worked in the emulator and, when I put a blink loop in the compiler cleanup routine, it showed that main() was in fact completing.

It seemed to relate to printing a long string, and for a while I was imagining something to do with bumping some code over a page boundary. Before getting out the logic analyzer, though, I stuck some more probe points in and convinced myself that all the code was in fact executing, but the OUT 7’s weren’t driving the AVR to pass them on.

I started to think about timing and did instruction counts relative to baud rate. The AVR is sending data to the host at 57600 baud, which is 5,760 bytes/second at 10 bits per byte. The 1802/1806 at 4 MHz executes 250,000 instructions/sec, so I needed at least 44 instructions between OUT 7’s (250,000 / 5,760 is about 43.4) for the AVR to keep up. Counting instruction times, the printstr() routine was using about 58 instructions per character printed on the 1802, and the 1806 cut that down to 40. It wasn’t an issue for most programs, but the floating point test prints a single string of 300 bytes, and that was enough to blow out the AVR’s buffer.
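Here is a rough sketch of that budget arithmetic in C, in case anyone wants to plug in a different clock or baud rate. It assumes 16 clock periods per instruction time and 10 bits on the wire per byte (start bit, 8 data bits, stop bit), which is what the numbers above work out to. The function names are made up for illustration and are not part of the compiler or its runtime.

```c
#include <stdint.h>

/* Assumption: one 1802/1806 instruction time is 16 clock periods. */
#define CLOCKS_PER_INSTRUCTION 16u
/* Assumption: 8N1 serial framing, so each byte costs 10 bits on the wire. */
#define BITS_PER_BYTE_ON_WIRE 10u

/* Instruction times per second for a given clock frequency. */
uint32_t instructions_per_second(uint32_t clock_hz)
{
    return clock_hz / CLOCKS_PER_INSTRUCTION;
}

/* Bytes per second the serial link can carry at a given baud rate. */
uint32_t bytes_per_second(uint32_t baud)
{
    return baud / BITS_PER_BYTE_ON_WIRE;
}

/* Smallest whole number of instruction times that must separate two
 * output bytes so the sender never gets ahead of the serial link. */
uint32_t min_instructions_between_outs(uint32_t clock_hz, uint32_t baud)
{
    uint32_t ips = instructions_per_second(clock_hz);
    uint32_t bps = bytes_per_second(baud);
    return (ips + bps - 1u) / bps; /* integer division, rounded up */
}

/* Nonzero when a print loop emits bytes faster than the link drains them. */
int outruns_link(uint32_t instr_per_char, uint32_t budget)
{
    return instr_per_char < budget;
}
```

With a 4,000,000 Hz clock and 57600 baud, min_instructions_between_outs() comes out at 44. A loop at 58 instructions per character stays inside that budget, while one at 40 does not.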

So: in an actual working program, the 1806 knocks about 30% off the time of an 1802 for the same clock speed!

[Following 14 lines are the inner loop of printstr()]
L22:
;    while(*ptr){
;	putc(*ptr++);
	ldaD R12,7    **1806 RLDI is 2.5 inst vs 4 on 1802
	cpy2 R11,R7   **4
	incm R7,1     **1
	ldn1 R13,R11  **2
	zExt R13      **2
	Ccall _out    **1806 SCAL is 5 inst times vs 17 for 1802
;	}
L23:
	ldn1 R11,R7   **2
	jnzU1 R11,L22 **2.5
;}
[following lines are the body of the OUT routine]
_out:	;raw port output **16 instructions plus Cretn which is 4 on 1806, 10 on 1802
	;stores a small tailored program on the stack and executes it
	dec	sp	;work backwards
	ldi	0xD3	;return instruction
	stxd
	cpy2	rt1,sp	;rt1 will point to the OUT instruction
	glo	regarg1	;get the port number
	ani	0x07	;clean it
	ori	0x60	;make it an out instruction - 60 is harmless
	stxd		;store it for execution
	glo	regarg2	;get the byte to be written
	str	sp	;store it where sp points
	sep	rt1	;execute it
;we will come back to here with sp stepped up by one
	inc	sp	;need to get rid of the 6x instruction
	inc	sp	;and the D3
	Cretn		;and we're done ** 1806 SRET is 4 inst times vs 10 on 1802
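
For anyone who doesn't read 1802 assembly, here is a small C model of the three bytes the _out routine leaves on the stack before it jumps into them. It is only an illustration of the byte layout, lowest address first. The helper name build_out_stub is hypothetical, and the note that 0xD3 (SEP R3) gets back to the main program assumes R3 is the normal program counter, as in the usual call/return convention.

```c
#include <stdint.h>

enum { STUB_LEN = 3 };

/* Hypothetical helper: fills stub[] with the bytes _out places on the
 * stack, in ascending address order. Not part of the compiler runtime. */
void build_out_stub(uint8_t port, uint8_t value, uint8_t stub[STUB_LEN])
{
    /* Byte to be written. sp points here when the OUT executes,
     * and the OUT instruction itself steps sp up by one. */
    stub[0] = value;

    /* OUT n opcode: mask the port to 3 bits and OR in 0x60.
     * Port 0 yields 0x60, which is IRX (just increments the X register),
     * hence "60 is harmless" in the listing. */
    stub[1] = (uint8_t)((port & 0x07u) | 0x60u);

    /* 0xD3 is SEP R3. Assumption: R3 is the main program counter,
     * so this hands control back to the code following "sep rt1". */
    stub[2] = 0xD3u;
}
```

For port 7 and the character 'A', the stub is 0x41, 0x67, 0xD3. The two inc sp instructions at the end of _out then step past the 0x67 and the 0xD3.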

I always want the last word so that wordpress doesn’t eat my code!
