More Optimization work

March 15, 2013

When I was working on the Ethernet interface, speed was right on the edge of practicality: it was taking 4 seconds or so to respond to a ping. I’ve decided to use the ping execution path through the code as another optimization test case. My going-in assumption is that the biggest gain will come from the xferSPI() routine, which looks like this:

//bit-bang one byte over SPI, MSB first, and return the byte clocked in
uint8_t xferSPI(uint8_t value){
  int i;

  for(i=0;i<8;i++){
    digitalWrite(mosi,(value&0x80));      //by setting mosi for each bit
    value=(value<<1)|digitalRead(miso);   //shifting in the bit on miso
    digitalWrite(sck,HIGH);              //then pulsing the clock
    digitalWrite(sck,LOW);
  }
  return value;
}
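
As a rough way to see how many pin calls one byte costs, the same loop can be run on a desktop compiler against stand-in pin functions. This is only a sketch: the stubs, the pin numbers, and the counters `write_calls`, `read_calls` and `mosi_level` are hypothetical and are not part of the 1802 library. The stub ties miso back to mosi, so a transfer should return the byte that was sent. The counts come from the C source alone and need not match what the compiled assembly does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side stand-ins for the pin API (not the real library). */
enum { mosi = 11, miso = 12, sck = 13 };   /* placeholder pin numbers */
enum { LOW = 0, HIGH = 1 };

int mosi_level;    /* last level driven on mosi */
int write_calls;   /* number of digitalWrite calls so far */
int read_calls;    /* number of digitalRead calls so far */

void digitalWrite(int pin, int level) {
  write_calls++;
  if (pin == mosi) mosi_level = (level != 0);  /* any nonzero value counts as HIGH */
}

int digitalRead(int pin) {
  assert(pin == miso);   /* the loop below only ever reads miso */
  read_calls++;
  return mosi_level;     /* loopback: miso mirrors mosi */
}

/* Same loop as the routine above. */
uint8_t xferSPI(uint8_t value) {
  int i;
  for (i = 0; i < 8; i++) {
    digitalWrite(mosi, (value & 0x80));
    value = (uint8_t)((value << 1) | digitalRead(miso));
    digitalWrite(sck, HIGH);
    digitalWrite(sck, LOW);
  }
  return value;
}
```

With the loopback stub, each bit written to mosi is shifted straight back in, so the byte rotates through all 8 positions and comes back unchanged, while the counters record three digitalWrite calls and one digitalRead call per bit.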

I also thought that the bulk of the time would be soaked up in digitalRead and digitalWrite.

So if I just wanted to fix the speed problem I could tackle digitalRead and digitalWrite, but I want to look critically at all the code, so I’ll go at it top down.

For starters, I printed out the assembly code for the whole packet-receive execution path and went over it, marking in instruction counts. As an example, here’s the xferspi routine:
[Image: assembly listing of the xferspi routine with instruction counts marked in]

This showed I had 21 instructions before the loop and 28 after, with the loop body being 131 instructions plus 30*2 for 2 digitalwrites and 18 for a digitalread. The whole routine takes 21+28+8*(131+60+18) = 1721 instructions. Remember that the 1802 is executing about 100,000 instructions per second, so that’s about 17 milliseconds per byte. I built up a picture of the execution time for the packet-receive loop, which looked like this:
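
The arithmetic above can be written out as a small sketch. The constant and function names are made up for illustration; the numbers are the ones quoted in the paragraph, including its tally of two digitalwrites and one digitalread per bit.

```c
#include <assert.h>

/* Instruction counts quoted in the text above. */
enum {
  BEFORE_LOOP = 21,
  AFTER_LOOP = 28,
  LOOP_BODY = 131,
  DIGITALWRITE_COST = 30,
  DIGITALREAD_COST = 18,
  BITS = 8
};

/* Instructions for one xferspi call: fixed overhead plus 8 loop passes. */
long xferspi_instructions(void) {
  long per_bit = LOOP_BODY + 2 * DIGITALWRITE_COST + DIGITALREAD_COST;
  return BEFORE_LOOP + AFTER_LOOP + BITS * per_bit;
}

/* Convert an instruction count to milliseconds at a given execution rate. */
double instructions_to_ms(long instructions, long per_second) {
  assert(per_second > 0);
  return 1000.0 * (double)instructions / (double)per_second;
}
```

Calling `instructions_to_ms(xferspi_instructions(), 100000)` gives the per-byte cost in milliseconds at the assumed 100,000 instructions per second.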

routine                         body     callees
encpacketreceive                 580    **364729**
  readregbyte                    110       14690
    setbank                      235       10890
      2x writeop                 295        3250
        2x xferspi              1097         528
          16x digitalread         18          18
          8x digitalwrite         30          30
    readop                       315        3250
      2x xferspi                1097         528
  writereg                       194       29654
    2x writeregbyte              157       14670
      writeop                    295        3250
      setbank                    235       10890
  readbuf(16)                   1652       53376
  readbuf(60)                   1652      200160
  2x writereg                    194       29654
  writeop                        295        3250

The table is confusing, but a number in front of a routine is how many times it gets called, and the two numbers after each routine are the instruction count for that routine’s body and the instructions spent in the routines it calls. So xferspi is 1097 instructions in itself (21+28+8*131) and 528 in the routines it calls.
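
The way the second column rolls up from the rows beneath it can be sketched as follows. The `struct row` type and the helper names are invented for illustration; the figures are copied from the table.

```c
#include <assert.h>

/* One table row: instructions in the routine's own body, and instructions
   spent in the routines it calls. */
struct row { long body; long callees; };

/* A few rows copied from the table. */
const struct row XFERSPI      = {1097, 528};
const struct row WRITEOP      = {295, 3250};
const struct row READOP       = {315, 3250};
const struct row SETBANK      = {235, 10890};
const struct row WRITEREGBYTE = {157, 14670};

/* Full cost of one call to a routine. */
long total(struct row r) { return r.body + r.callees; }

/* Cost that `count` calls of a child routine add to its caller. */
long callee_cost(long count, struct row child) {
  assert(count >= 0);
  return count * total(child);
}
```

For example, readop calls xferspi twice, so its second column is `callee_cost(2, XFERSPI)`, and readregbyte calls setbank and readop once each, so its second column is `total(SETBANK) + total(READOP)`.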
The numbers change every time I look at them, but the bottom (or top) line is that the loop to receive a 16-byte header and a 60-byte packet takes almost 365,000 instructions. That’s 3.65 seconds! Never mind the time to respond to the ping. As expected, the time is almost all soaked up in xferspi, but not in digitalread: the loop body of xferspi and the call overhead for digitalread/digitalwrite are at least as much of the problem.
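
A quick check of those two claims, using figures from the table: the helper names are hypothetical, and the 100,000 instructions per second rate is the approximate figure quoted earlier.

```c
#include <assert.h>

/* Seconds needed to execute `instructions` at `per_second` instructions/s. */
double seconds_for(long instructions, long per_second) {
  assert(per_second > 0);
  return (double)instructions / (double)per_second;
}

/* Percentage of a routine's total cost that is spent in its own body,
   as opposed to the routines it calls. */
double body_share_percent(long body, long callees) {
  assert(body + callees > 0);
  return 100.0 * (double)body / (double)(body + callees);
}
```

`seconds_for(580 + 364729, 100000)` gives the time for the whole receive loop, and `body_share_percent(1097, 528)` shows that roughly two thirds of each xferspi call is spent in its own loop rather than in digitalread and digitalwrite.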
