
1806 Speed Can Impress

I fought my way past the stack discipline issue and cleaned up the use of SCAL/SRET, and it seems that some aspects of the 1806 do give noticeable speed improvements. I was running the test suite on the latest compiler version and I got a persistent, mysterious failure in the float test. The test program just seemed to hang toward the end and stop printing. I assumed something was goobering the stack and going off into lala land, but it worked in the emulator and, when I put a blink loop in the compiler cleanup routine, it showed that main() was in fact completing.

It seemed to relate to printing a long string, and for a while I was imagining something to do with bumping some code over a page boundary. Before getting out the logic analyzer, though, I stuck some more probe points in and convinced myself that all the code was in fact executing but the OUT 7’s weren’t driving the AVR to pass them on.

I started to think about timing and did instruction counts relative to baud rate. The AVR is sending data to the host at 57600 baud, which is 5,760 bytes/second at 10 bits per byte. The 1802/1806 at 4 MHz is executing 250,000 instructions/sec, so I needed at least 44 instructions between OUT 7’s for the AVR to keep up. Counting instruction times, the printstr() routine was using about 58 instructions per character printed on the 1802 and the 1806 cut that down to about 40, which is under the 44 the AVR needs. It wasn’t an issue for most programs, but the floating point test prints a single string of 300 bytes and that was enough to blow out the AVR’s buffer.

So: in an actual working program, the 1806 knocks about 30% off the time of an 1802 for the same clock speed!
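As a rough check on the arithmetic above, here is a small Python sketch. The 16 clocks per instruction and 10 bits per serial byte are my assumptions (they are what make the quoted 250,000 and 5,760 figures come out); the 58 and 40 instruction counts are the ones quoted in the text.

```python
import math

CLOCK_HZ = 4_000_000         # 1802/1806 clock, from the post
CLOCKS_PER_INSTRUCTION = 16  # assumption: ordinary two-machine-cycle instruction
BAUD = 57_600                # AVR-to-host serial rate, from the post
BITS_PER_BYTE = 10           # assumption: 8 data bits plus start and stop bits

instructions_per_sec = CLOCK_HZ // CLOCKS_PER_INSTRUCTION  # 250000
bytes_per_sec = BAUD // BITS_PER_BYTE                      # 5760

# Fewest instruction times between OUT 7's that the serial link can absorb.
min_gap = math.ceil(instructions_per_sec / bytes_per_sec)  # 44


def keeps_up(instructions_per_char):
    """True if the sender is slow enough for the AVR to forward every byte."""
    return instructions_per_char >= min_gap


# Instruction counts per printed character quoted in the post.
COUNT_1802 = 58
COUNT_1806 = 40
saving = 1 - COUNT_1806 / COUNT_1802

print("minimum gap:", min_gap, "instructions")
print("1802 keeps up:", keeps_up(COUNT_1802))
print("1806 keeps up:", keeps_up(COUNT_1806))
print("time saved by the 1806: about", round(saving * 100), "percent")
```

The same numbers explain both halves of the story: the 1802 build sat comfortably above the 44-instruction floor, and the 1806 build dipped just below it.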

[Following 14 lines are the inner loop of printstr()]
L22:
;    while(*ptr){
;	putc(*ptr++);
	ldaD R12,7    **1806 RLDI is 2.5 inst vs 4 on 1802
	cpy2 R11,R7   **4
	incm R7,1     **1
	ldn1 R13,R11  **2
	zExt R13      **2
	Ccall _out    **1806 SCAL is 5 inst times vs 17 for 1802
;	}
L23:
	ldn1 R11,R7   **2
	jnzU1 R11,L22 **2.5
;}
[following lines are the body of the OUT routine]
_out:	;raw port output **16 instructions plus Cretn which is 4 on 1806, 10 on 1802
	;stores a small tailored program on the stack and executes it
	dec	sp	;work backwards
	ldi	0xD3	;return instruction
	stxd
	cpy2	rt1,sp	;rt1 will point to the OUT instruction
	glo	regarg1	;get the port number
	ani	0x07	;clean it
	ori	0x60	;make it an out instruction - 60 is harmless
	stxd		;store it for execution
	glo	regarg2	;get the byte to be written
	str	sp	;store it where sp points
	sep	rt1	;execute it
;we will come back to here with sp stepped up by one
	inc	sp	;need to get rid of the 6x instruction
	inc	sp	;and the D3
	Cretn		;and we're done ** 1806 SRET is 4 inst times vs 10 on 1802

I always want the last word so that wordpress doesn’t eat my code!

The 1806 – So Far Meh

So I have the 1806 running reliably and I’ve done a couple of instruction adaptations. So far not so good. My implementation of SCAL/SRET is clumsy and as a result the code is much bulkier. My “Hello From The Other Side” program goes from 5,941 bytes to 6,321. My hope would be that if I get the clumsiness fixed this would come more nearly even. The only other new instruction that looked like an easy win is RLDI, a 16-bit immediate register load; that brought me back to 6,265 bytes. There are just a couple of other instructions that have potential for performance improvement, like RSXD (register store via X and decrement) and its companion RLXA. There are a bunch of decimal instructions which are probably completely useless.

One good thing is that the way I packaged the compiler output as macros really pays off for instruction changes. RLDI, RSXD, and RLXA went into exactly one spot each in the prolog file. The combination of those brought me back to 6,161 bytes so, if I started even after SCAL/SRET, I’d be down a couple of percent in size and probably a bit better in execution time.

ldiReg:	macro	reg,value
 if MOMCPU=$1805
 	RLDI	reg,value
 else
	ldi	(value)&255
	plo	reg
	ldi	(value)>>8; was/256
	phi	reg
 endif
	endm
popr:	macro	reg
 if MOMCPU=$1805
 	RLXA	reg
 else
	lda	sp
	phi	reg
	lda	sp
	plo 	reg
 endif
	endm
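To put numbers on "a couple of percent", here is a short Python sketch using only the byte counts quoted above. The "break-even" size is hypothetical: it is what the program would weigh if SCAL/SRET cost nothing extra and the other three instructions saved the same number of bytes.

```python
# Program sizes in bytes, as quoted in the post.
SIZE_1802 = 5941        # original 1802 build
SIZE_SCAL_SRET = 6321   # 1806 build with the clumsy SCAL/SRET
SIZE_WITH_RLDI = 6265   # after adding RLDI
SIZE_WITH_ALL = 6161    # after RLDI, RSXD and RLXA

scal_sret_cost = SIZE_SCAL_SRET - SIZE_1802    # bytes added by the kludge
rldi_saving = SIZE_SCAL_SRET - SIZE_WITH_RLDI  # bytes recovered by RLDI alone
total_saving = SIZE_SCAL_SRET - SIZE_WITH_ALL  # bytes recovered by all three

# Hypothetical: same savings applied to a build where SCAL/SRET is size-neutral.
break_even_size = SIZE_1802 - total_saving
percent_smaller = 100 * total_saving / SIZE_1802

print("SCAL/SRET kludge cost:", scal_sret_cost, "bytes")
print("RLDI saved:", rldi_saving, "bytes")
print("RLDI + RSXD + RLXA saved:", total_saving, "bytes")
print("hypothetical size:", break_even_size, "bytes,",
      round(percent_smaller, 1), "percent under the 1802 build")
```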

In the image below the 1806 is displaying the four bytes of machine code for the body of a do-nothing function. For the 1802 it would be a single byte D5.
[Image: 17-02-27-1806]
The functional payoff for the 1806 may come from the counter/timer instructions.

An altogether better idea

It occurred to me that the whole xmodem thing was a waste of time. It’s trivial to have the 1806 fake a load mode like the 1802’s. All I need is a little program that runs at startup and accepts binary data whenever IN is pressed. Then I can use the existing avrdude on Windows and more-or-less the same program in the AVR bridge processor. Instead of talking to the 1802 load mode, the AVR talks to my fake loader program in the 1802/1806. A piece of cake.

//fakeloader simulates 1802 load mode in run for 1806
void main(){
	asm(" 	b4 run\n"		//bypass bootloader if IN pressed
	" 	ldAD 14,0x2000\n"	//starting address
	" 	sex 14\n"		//in X register
	"noEF4:	bn4 noEF4\n"		//loop til IN pressed
	"yEF4:	b4 yEF4\n"		//wait til switch released
	"	inp 6\n"		//load memory
	"   	nop\n"
	"	out 7\n"		//and echo
	"	br noEF4\n"		//back for more
	"run:	lbr 0x2000\n");	//finally - off we go
}

The 1802/1806 loader code really is simple. On startup, it tests EF4 to see if the bootloader should be bypassed. Then it loads bytes starting at location 0x2000, one for every time it sees EF4 go low and then high. To invoke it, the AVR resets the 1802/1806, then puts it in run with /EF4 high. It feeds it the code from avrdude; then, when that’s done, it restarts the 1802/1806 with /EF4 held low to bypass the loader and execute the application at 0x2000.

The changes to the AVR loader were simple compared to what was done for xmodem. It looks at the first address avrdude sends for program loading. If it’s 0, it assumes this is an 1802 and puts it in load mode. If it’s not 0, it puts the assumed 1806 in pseudo-load mode, then programs it the same way.
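The two halves of that handshake can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not the AVR or 1806 code itself: the function names are made up for illustration, memory is a plain dict, and each item in `pulses` stands for one press-and-release of IN with that byte on the bus.

```python
LOAD_ADDRESS = 0x2000  # where the fake loader starts storing, from the post


def choose_load_mode(first_address):
    """AVR side: pick a strategy from the first address avrdude sends."""
    if first_address == 0:
        return "1802 load mode"
    return "1806 pseudo-load mode"


def fake_loader(in_pressed_at_reset, pulses, memory):
    """1806 side: returns (next address, echoed bytes).

    If IN is held at reset the loader is bypassed (the 'b4 run' branch)
    and nothing is stored.
    """
    if in_pressed_at_reset:
        return LOAD_ADDRESS, []
    x = LOAD_ADDRESS          # ldAD 14,0x2000 / sex 14
    echoed = []
    for byte in pulses:       # one bn4/b4 wait pair per byte
        memory[x] = byte      # inp 6: bus byte goes to M(R(X)), X unchanged
        echoed.append(memory[x])  # out 7: echo M(R(X)) ...
        x += 1                # ... and OUT steps R(X) to the next location
    return x, echoed


memory = {}
mode = choose_load_mode(0x2000)
next_free, echo = fake_loader(False, [0x7B, 0x30, 0x00], memory)
print(mode, hex(next_free), echo)
```

One thing the model makes visible: INP leaves the pointer alone, so it is the echoing OUT 7 that advances to the next byte. The real loader never exits on its own; it keeps waiting for IN until the AVR resets the chip.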

Of course, getting this working involved four programs running on three platforms (avrdude on Windows, my loader application on the AVR, the fake loader and the loaded application on the 1802/1806). Much head-scratching and logic-analyzer poking was required. But now that it works, it’s all much simpler and, once the fake loader is in EPROM, it should be pretty reliable.

[Image: 17-02-26-stamps]

Ugly But It Works!

With a couple of false starts I have the 1806 loading and running code and using the 1806 native subroutine call/return instructions.

The 1806 runs pretty well any 1802 code but has a bunch of new instructions, including hardware subroutine call/return. It lacks the load mode of the 1802, which is what the AVR uses to bootload it. To get around that I wrote a simple xmodem receiver and loaded it while I had the 1802 chip in the olduino. I then swapped in the 1806 and made the video.

I made a new target for the LCC1802 compiler (XR18NW) that will be the platform for the 1806 adaptation. At the moment it only has my horrible kludge for the subroutine call/return.

//ssxload super simple xmodem loader
//loads a single 32 byte block with checksum from a custom host program
//17-02-08 timeout on initial receives
//17-02-09 no timeouts, single start ack, no error checking
//17-02-09 header read moved into main
//17-02-12 block level error checking and NAK
//17-02-14 send stx before using xmodem. esc is now 0x1b
//17-02-16 slimming down, axloader.h
//17-02-17 target=0x2000, stripping out diagnostic arrays 7621 bytes
#include <nstdlib.h>
#include <olduino.h>
#include <cpu1802spd4port7.h>
#include "xloader.h"
#define blocksize 128
#define target 0x2000
unsigned char *targetdata=(unsigned char *)0x2000; //allows 8K for loader
typedef void (*funptr)();
funptr targetcode=(funptr) target;
unsigned char blkno, hdr,bseq,iseq,ciseq,csum,calcsum,valid;
int readblk(unsigned char * where){//read a block
	unsigned int cnt=blocksize;
	unsigned char ch,rc=0;
	bseq=readch();
	iseq=readch();
	calcsum=0;
	while(cnt>0){
		ch=readch();
		*where++=ch;
		calcsum+=ch;
		cnt--;
	}
	csum=readch(); //read the checksum
	if (calcsum!=csum){//check it
		rc+=4;
	}
	if (bseq!=(blkno+1)){
		rc+=1;
	}
	if (iseq!=(254-blkno)){
		rc+=2;
	}
	return rc;
}

void main(){
	int ch='?',eot=0;
	unsigned char *t=targetdata;
	unsigned int i, tries=0,maxtries=1000;
	unsigned char thishdr;
	blkno=0;
	asm(" seq\n nop\n req\n");
	out(7,STX); //enable interrupts for host data
	putch(NAK); //send NAK to start
	thishdr=readch(); //get the header character

	while(thishdr!=EOT){
		if (SOH==thishdr){
			hdr=thishdr;
			valid=readblk(t);
			if (0==valid){
				blkno++;
				t+=blocksize;
				putch(ACK);
			}else{
				putch(NAK);
			}
		}
		thishdr=getchOrTo(1000);
	}

	putch(ACK);
//and we're done - god willing
	out(7,ETX); //disable interrupts for host data
	//if (++tries<maxtries){
		delay(1000);//let python clear out
		printf("terminating header was 0x%cx\n",thishdr);
		dump(targetdata,16);
		printf("number of blocks %d\n",blkno);
		printf("hh:bn/in c:i r:cs c:cs valid?\n");
		printf("%cx:%cx/%cx c:%cx r:%cx c:%cx %d\n",hdr,bseq,iseq,ciseq,csum,calcsum,valid);
	//}
	printf("\nRun1806(%d)\n",tries);
	asm(" ccall 0x2000\n");
	while(1);
}
#include <nstdlib.c>
#include <olduino.c>

from __future__ import print_function
import sys
import logging
logging.basicConfig()
import serial
try:
    from cStringIO import StringIO
except:
    from StringIO import StringIO
from time import sleep
import os
if len(sys.argv)>1:
    filename=sys.argv[1]
else:
    filename="test32.txt"
print ("File Name is",filename)
fileSize=os.path.getsize(filename)
print ("File Size is",fileSize)

def xmodem_send(serial, file):
	blocksize=128
	t=0
	#	t, anim ='|/-\\'
	while 1:
	    if serial.read(1) != 'N':
		t = t + 1
		print ('.')
		if t == 3 : return False
	    else:
		break

	p = 1
	s = file.read(blocksize)
	while s:
	    s = s + '\xFF'*(blocksize - len(s))
	    chk = 0
	    for c in s:
		chk+=ord(c)
		#print (c,ord(c),chk,chk%256)
	    while 1:
		serial.write('S') #SOH)
		serial.write(chr(p))
		serial.write(chr(255 - p))
		serial.write(s)
		serial.write(chr(chk%256))
		serial.flush()
		print ('checksum is ',format(chk%256, '02X'),end=' ')
		answer = serial.read(1)
		if  answer == 'N': continue
		if  answer == 'K': break
		print ("unknown answer - length is ",len(answer))
		for character in answer:
  			print (character, character.encode('hex'))
		return False
	    s = file.read(blocksize)
	    p = (p + 1)%256
	    print ('.')
	serial.write('T')
	serial.flush()
	sleep(.1)
	serial.write('T')
	serial.flush()

	return True

#Main program starts here - define the serial port, set RTS off, then open it
#open the file to be loaded
stream = open(filename,'rb')
print("file is open")
port = serial.Serial(parity=serial.PARITY_NONE,
                     bytesize=serial.EIGHTBITS,
                     stopbits=serial.STOPBITS_ONE,timeout=10,xonxoff=0,
                     rtscts=0,dsrdtr=0,baudrate=57600)
port.rts=False
port.port='COM3'
port.open()
port.rts=True
sleep(0.001)
port.rts=False
port.flushInput()
sleep(2) # give avrisp a couple of seconds to clear out
#transfer the file
result=xmodem_send(port, stream)

stream.close()
port.close()

if result:
    print ("\ntransfer successful")
else:
    print ("\ntransfer unsuccessful")
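For reference, the block format the two listings above agree on can be sketched in Python 3 as follows. The layout (header 'S', sequence byte, 255 minus sequence, 128 data bytes padded with 0xFF, one-byte additive checksum) and the error bits (1, 2, 4) are taken from the sender and from readblk(); the function names are hypothetical.

```python
BLOCKSIZE = 128


def build_block(seq, payload):
    """Frame one block the way the sender does: 'S', seq, 255-seq, data, checksum."""
    if len(payload) > BLOCKSIZE:
        raise ValueError("payload longer than one block")
    data = payload + b"\xff" * (BLOCKSIZE - len(payload))  # pad short blocks
    checksum = sum(data) % 256
    return b"S" + bytes([seq, 255 - seq]) + data + bytes([checksum])


def check_block(frame, blkno):
    """Mirror readblk(): 0 means good, otherwise a sum of error bits.

    blkno is the number of blocks already accepted, so the next block
    should carry sequence blkno+1 and inverse sequence 254-blkno.
    """
    bseq, iseq = frame[1], frame[2]
    data = frame[3:3 + BLOCKSIZE]
    csum = frame[3 + BLOCKSIZE]
    rc = 0
    if sum(data) % 256 != csum:
        rc += 4  # checksum mismatch
    if bseq != blkno + 1:
        rc += 1  # wrong sequence number
    if iseq != 254 - blkno:
        rc += 2  # wrong inverse sequence number
    return rc


frame = build_block(1, b"hello")
print(len(frame), check_block(frame, 0))
```

A receiver would ACK when check_block() returns 0 and NAK otherwise, which is what the main loop of the loader does with the result of readblk().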

Final Pro Tip of the day: when you are swapping CPU chips, pay attention to which end goes where!
I first put the 1806 in backwards but it survived the assault.
[Image: 17-02-23-wrongo]

1806 Adaptation – Ugly On the Inside

I noted that the 1806 hardware subroutine call and return operations use the stack a bit differently than my software implementation. It’s a matter of whether you decrement the stack pointer before you store or after. I chose before and they chose after. I remember thinking long and hard about this and deciding my way was better. Oh well. It’s essentially arbitrary but, because parameters and local variables are stored on the stack, I have to pick one regime and stick to it. As a short-term kludge I am attempting to paper over the differences as follows:

  • Before each SCAL instruction I decrement the stack pointer so it’s pointing to free memory. On entry to each function I increment the stack pointer to reverse this.
  • Before each return I decrement the stack pointer again and after each I increment it (the after case is embedded in the Ccall macro).

This is awful but I think it will work in the short term to keep me going. When I get home with the text available I’ll attempt to unwind this and do the right thing: changing my push/pop regime and changing the offsets from the stack pointer for variables and parameters.

;the 1806 SCAL does not decrement SP before pushing the return address
;and the SRET increments it before reloading it
;to accommodate my convention I bracket call/returns with inc/dec as required
;note that the function prolog contains a call to adjspfor1806 which completes the cycle

adjspfor1806: macro	;need to inc the stack to accommodate 1806 - kludge
 if MOMCPU=$1805
 	inc 	2	;leaves sp pointing to last byte of return address
 endif
	endm
Ccall:	macro	target
 if MOMCPU=$1805
 	sex	2
 	dec	2
 	SCAL	6
	dw	target
 	inc	2
 else
	sep	RCALL
	dw	target
 endif
	endm
Cretn:	macro
 if MOMCPU=$1805
 	dec	2
 	sret	6
 else
	sep	RRET
 endif
	endm
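Here is a small Python model of the stack-pointer bookkeeping those macros do. It is a sketch under assumptions: memory is a dict, the addresses are made up, and `scal`/`sret` model only the stack side of the instructions as the comments above describe it (SCAL stores then decrements, SRET increments then loads). Which 16-bit value gets pushed is outside the model.

```python
def scal(mem, sp, value):
    """Stack side of SCAL: store a byte, then decrement, twice."""
    mem[sp] = value & 0xFF
    sp -= 1
    mem[sp] = value >> 8
    sp -= 1
    return sp


def sret(mem, sp):
    """Stack side of SRET: increment, then load a byte, twice."""
    sp += 1
    hi = mem[sp]
    sp += 1
    lo = mem[sp]
    return sp, (hi << 8) | lo


def call_with_kludge(mem, sp, value):
    sp -= 1                   # 'dec 2' in Ccall: move to free memory
    sp = scal(mem, sp, value)
    sp += 1                   # adjspfor1806 in the function prolog
    return sp


def return_with_kludge(mem, sp):
    sp -= 1                   # 'dec 2' in Cretn
    sp, value = sret(mem, sp)
    sp += 1                   # trailing 'inc 2' in Ccall
    return sp, value


# Compiler convention: sp addresses the last USED byte (0xAA here).
mem = {0x7F00: 0xAA}
inside = call_with_kludge(mem, 0x7F00, 0x1234)
after, value = return_with_kludge(mem, inside)
print(hex(inside), hex(after), hex(value), hex(mem[0x7F00]))

# Without the leading decrement, SCAL would overwrite the caller's live byte.
clobbered = {0x7F00: 0xAA}
scal(clobbered, 0x7F00, 0x1234)
print(hex(clobbered[0x7F00]))
```

Inside the function the pointer sits on the high byte of the saved value, which is the "last used byte" the compiler expects, and after the return it is back where it started with the caller's byte intact.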


1806 Standard Call and Return Instructions

[Image: 17-02-19-1806-scal]

CDP1806 Standard Call/Return Instructions

The 1802 has no native subroutine call and return operations. For small programs there’s a very quick way of switching program counters that’s arguably better, and for more general cases there’s a Standard Call and Return convention (SCRT) that uses registers 4, 5 and 6. It’s very well designed, though a bit slow, and it does tie up 3 registers. The 1806 incorporates two new instructions, SCAL (688N) and SRET (689N), that speed up the process and avoid dedicating registers 4, 5 and 6 (you still need a register for the return address but it doesn’t have to be 6).

Unfortunately for me the 1806 SCAL/SRET are different from the routines I use in the compiler. They assume that the stack pointer points to the first free byte below the stack, which can be used freely. The compiler assumes that the stack pointer addresses the last USED byte on the stack and needs to be decremented to point to free memory.

This affects the way parameters and variables are stored and addressed on the stack, one of the parts of LCC that I had the toughest time coming to grips with. I think I could paper over part of the difference by decrementing SP before each SCAL and incrementing it before each SRET. I would then have to add one to the offset for accessing parameters, but variable addressing wouldn’t change. I’ll have to fire up some test cases and try them on an emulator that supports the 1806 instruction set.

For insurance I’ll also make sure that R2 is the X register on each call.  It’s an extra instruction but it has saved me from myself in the past.

;Standard Call routine invoked as D4xxxx - big-endian stack convention
	sep     R3 ;go to subroutine
_call	sex	SP ;make sure X=SP
	glo	retAddr ;save previous return pointer on stack
	dec	sp
	stxd
	ghi	retAddr
	str	sp
	glo	RPC ;copy old PC to retAddr
	plo	retAddr
	ghi	RPC
	phi	retAddr
	lda	retAddr ;pick up subroutine address into RPC
	phi	RPC
	lda	retAddr
	plo	RPC
	br	_call-1

;Standard subroutine return
	sep	RPC	;return to the original program
_return	glo	retAddr	;transfer the current return address to RPC
	plo	RPC
	ghi	retAddr
	phi	RPC
	lda	SP	;pick up old return address
	phi	retAddr
	lda	SP
	plo	retAddr
	br	_return-1

 

Host I/O Updates to the AVR

Over the last week or so I’ve made a bunch of changes to the AVR code to do with host I/O, basically to support the xmodem bootloader but more generally to allow bidirectional serial data.

The external changes are:

  • the AVR program can be put in quiet mode by entering ‘q’ quickly after startup (the sketch’s avrisp() treats ‘q’ as quiet and ‘v’ as verbose). Quiet mode eliminates the run1802 message and shortens the startup delay to 1 second.
  • the AVR recognizes STX in the input stream from the 1802 and enables interrupts so host serial data can be handled properly.  ETX disables interrupts so that SPI doesn’t get screwed up.
  • The AVR recognizes 0x12 in the input stream as a request to read serial.available(). The AVR puts the value into the input shift register so the 1802 can read it with an INP(6).
  • The AVR recognizes 0x00 as a request to read a character from the serial buffer.
  • The AVR recognizes 0x1B as an escape so that the following character is passed on without interpretation.
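The control bytes in the list above amount to a small state machine. Here is a Python sketch of it, modelled on the switch in run1802() further down; the `Bridge` class and its attribute names are hypothetical, and the host serial buffer is just a bytearray.

```python
STX, ETX = 0x02, 0x03  # enable / disable interrupts
QUERY = 0x12           # ask how many host bytes are waiting
READ = 0x00            # ask for one host byte
ESC = 0x1B             # pass the next byte through untouched


class Bridge:
    """Toy model of how the AVR interprets bytes written by the 1802."""

    def __init__(self, host_input=b""):
        self.host_input = bytearray(host_input)  # bytes waiting from the host
        self.to_host = bytearray()               # bytes forwarded to the host
        self.to_1802 = []                        # bytes placed in the shift register
        self.interrupts = False                  # default is interrupts off
        self.escape = False

    def from_1802(self, byte):
        if self.escape:              # escaped: forward without interpretation
            self.to_host.append(byte)
            self.escape = False
        elif byte == QUERY:
            self.to_1802.append(len(self.host_input) & 0xFF)
        elif byte == READ:
            # The real getch() waits for data; this model raises IndexError
            # if nothing is waiting.
            self.to_1802.append(self.host_input.pop(0))
        elif byte == ESC:
            self.escape = True
        elif byte == STX:
            self.interrupts = True
        elif byte == ETX:
            self.interrupts = False
        else:
            self.to_host.append(byte)  # ordinary character


bridge = Bridge(b"hi")
for b in [STX, ord("A"), QUERY, READ, ESC, 0x00, ETX]:
    bridge.from_1802(b)
print(bytes(bridge.to_host), bridge.to_1802, bridge.interrupts)
```

The escape is what lets a program print a literal 0x00 or 0x12: in the example the 0x00 after ESC goes to the host instead of being treated as a read request.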

Additionally, there are a couple of housekeeping changes related to moving to Arduino 1.6.5: char1802 is defined as unsigned char and output with _BYTE; a #define of MISO was removed because it’s in the standard defines.

ALSO, I increased the size of the input serial buffer from 64 to 256 bytes.  I did that by editing
C:\Program Files (x86)\Arduino\hardware\arduino\avr\cores\arduino\HardwareSerial.h and replacing the 64 with 256.

This will get lost when I upgrade the Arduino software but there doesn’t seem to be a better option for the moment.

//now needs to be compiled with 1.6.5 and the minicore addon.
//ISPV6 meant for use with olduino_2.0_V4E/V8 hardware AND NEW AVRDUDE
//this sketch turns the Arduino into a loader for the 1802 membership card
//adapted from the arduinoisp sketch included with arduino-022 
//with credit to David Mellis and Randall Bohen
//July 12 2012 - revamped from aduinoisp1802v8
//mostly pin changes to start
//August 1 - dinking around expanding ? code trying to debug verification failure
//aug 9 changing ! to ~ in interface() ddr settings
//aug 10/11 changed run code to make memrd output and low.
//saved back to olduinoispv2_2
//Aug 16 trying shorter timeout delay (1 sec)
//August 18 trying 57600 baud
//Jan 15, 2014 first build for Olduino V2.0 - all serial/spi
//Jan 26, fixing spi clocking logic in justclock
//Mar 10, 2014 speeding up busout/busin for program loading
//April 3, 2014 moving clock pins to PORTC SCK=PC1, /SCK=PC0, circuit olduino_2.0_v2b.sch
//May 25, 2014 pin definitions changed to match olduino_2.0_v4e schematic
//ldsw,clrsw,datapin=mosi,spiclk,/srclk,srclk,clockbvs
//June 12 speeding up clocking and tightening up timing - 1st try @20mhz
//Aug 10 - increasing presstimeu to 20 us
//aug 10 -presstimeu 
//15-01-13 avrdude is requesting version with a 5630 sequence instead of 75'
//17-01-23 experimenting with sending data from host to 1802 program
//         also had to change MISO to MISOX to avoid compile error
//??-??-?? changed char1802 in run1802 to unsigned char cout <<_BYTE(...
//17-02-06 verbose/quiet runmode controls run1802 msg
//17-02-14 stx/etx trigger enable/disable interrupts
#define INsw 5 // IN bus pin 27 - normally high
#define presstimeu 20 //us to hold input switch down
#define LDsw 8 // /LOAD(/WAIT) bus pin 29 - arduino PB0 14
#define CLRsw 7 // /CLEAR bus pin 28 - ard pd7 13
#define MEMrd  0 //not connected
//all comm to 1802 is serial.  datapin is connected to the output of MOSR
//MISOX is connected to the input of the 1802's MISR
//sr_sck loads a bit into MISR 
//spi_sck tells other devices to grab top bit of MOSR
//notsr_sck advances MOSR and clocks the data in MISR To the storage register
#define datapin 9 //pin to receive shift register data
#define MISOX 12   //output pin hooked to 1802's input shift register
#define MISOPORT PORTB
#define MISODDR DDRB
#define MISOBIT 4  //miso is on port B bit 4
#define sr_sck 15 //pin to clock the data in the shift registers
#define notsr_sck 16 //inverse of sr_sck
#define spi_sck 14 //clock seen at the spi connector.
#define clockport PORTC
#define clockddr DDRC
#define clockbv 3  //both clocks high
#define notclockbv 4 //clocks low, notclock high
#define N0pin 2    //N0 is the 1802 signalling to do host output
#define OUT2pin 6  //signals the 1802 is writing to MOSR
#include <Streaming.h>
#define cout Serial
#define endl '\n'

#define HWVER 2
#define SWMAJ 5
#define SWMIN 0

// STK Definitions
#define STK_OK 0x10
#define STK_FAILED 0x11
#define STK_UNKNOWN 0x12
#define STK_INSYNC 0x14
#define STK_NOSYNC 0x15
#define CRC_EOP 0x20 //ok it is a space...

unsigned long timeoutstart, timeoutinterval=2000; //used to timeout the loader and start the olduino
byte state1802=0;
#define running 1
#define reset 2
#define loading 3
//char runmode; //setting this to 'v' means verbose
char runmode __attribute__ ((section (".noinit")));

void setup() {
  Serial.begin(57600);
  //delay(2000);
  //cout<<"OlduinoII_ISPV6.1 here with runmode="<<runmode<<'\n';
  DDRB=0; //all B pins are inputs (PB5 will have been left as an output by the bootloader)
  digitalWrite(sr_sck,LOW); digitalWrite(spi_sck,LOW);digitalWrite(notsr_sck,HIGH);
  pinMode(sr_sck,OUTPUT); pinMode(notsr_sck,OUTPUT);pinMode(spi_sck,OUTPUT);//set up to drive the shift register
 //doitall(); 
   //while(1); //loop
  timeoutstart=millis(); //if we don't get anything within timeouttime ms we start the 1802 running
  reset1802();

  //while(1){
  //  justclock();
  //}
}
int error=0;
int pmode=0;
int here; // address for reading and writing, set by 'U' command
uint8_t buff[256]; // global block storage

#define beget16(addr) (*addr * 256 + *(addr+1) )
typedef struct param {
  uint8_t devicecode;
  uint8_t revision;
  uint8_t progtype;
  uint8_t parmode;
  uint8_t polling;
  uint8_t selftimed;
  uint8_t lockbytes;
  uint8_t fusebytes;
  int flashpoll;
  int eeprompoll;
  int pagesize;
  int eepromsize;
  int flashsize;
} 
parameter;

parameter param;  

void loop(void) {
  uint8_t char1802=0x55;
  if (Serial.available()) {
    avrisp();
    timeoutstart=millis();
  }
  if(millis() - timeoutstart > timeoutinterval){
      run1802(); //run - does not return.
  }
}

uint8_t getch() {
  while(!Serial.available());
  return Serial.read();
}
void readbytes(int n) {
  for (int x = 0; x < n; x++) {
    buff[x] = getch(); //was Serial.read();
  }
}


void empty_reply() {
  char gc=getch();
  if (CRC_EOP == gc) {
    Serial.print((char)STK_INSYNC);
    Serial.print((char)STK_OK);
  } 
  else {
    Serial.print((char)STK_NOSYNC);
  }
}

void breply(uint8_t b) {
  if (CRC_EOP == getch()) {
    Serial.print((char)STK_INSYNC);
    Serial.print((char)b);
    Serial.print((char)STK_OK);
  } 
  else {
    Serial.print((char)STK_NOSYNC);
  }
}

void get_version(uint8_t c) {
  switch(c) {
  case 0x80:
    breply(HWVER);
    break;
  case 0x81:
    breply(SWMAJ);
    break;
  case 0x82:
    breply(SWMIN);
    break;
  case 0x93:
    breply('S'); // serial programmer
    break;
  default:
    breply(0);
  }
}

void set_parameters() {
  // call this after reading parameter packet into buff[]
  param.devicecode = buff[0];
  param.revision = buff[1];
  param.progtype = buff[2];
  param.parmode = buff[3];
  param.polling = buff[4];
  param.selftimed = buff[5];
  param.lockbytes = buff[6];
  param.fusebytes = buff[7];
  param.flashpoll = buff[8]; 
  // ignore buff[9] (= buff[8])
  //getch(); // discard second value
  
  // WARNING: not sure about the byte order of the following
  // following are 16 bits (big endian)
  param.eeprompoll = beget16(&buff[10]);
  param.pagesize = beget16(&buff[12]);
  param.eepromsize = beget16(&buff[14]);

  // 32 bits flashsize (big endian)
  param.flashsize = buff[16] * 0x01000000
    + buff[17] * 0x00010000
    + buff[18] * 0x00000100
    + buff[19];

}

void start_pmode() {
  start_pmode_1802();
  //cout<<"entering program mode";
  pmode = 1;
}

void end_pmode() {
  end_pmode_1802();
  //cout<<"leaving program mode";
  pmode = 0;
}


uint8_t write_flash(int length) {
  write_flash_1802(length);
  return STK_OK;
}

void program_page() {
  char result = (char) STK_FAILED;
  int length = 256 * getch() + getch(); //gets length 1-256
  if (length > 256) {
      Serial.print((char) STK_FAILED);
      return;
  }
  char memtype = getch();
  for (int x = 0; x < length; x++) {
    buff[x] = getch();
  }
  if (CRC_EOP == getch()) {
    Serial.print((char) STK_INSYNC);
    if (memtype == 'F') result = (char)write_flash(length);
    Serial.print(result);
  } 
  else {
    Serial.print((char) STK_NOSYNC);
  }
}

void read_signature() {
  if (CRC_EOP != getch()) {
    Serial.print((char) STK_NOSYNC);
    return;
  }
  Serial.print((char) STK_INSYNC);
  #define sig_high 0x1e //dummy (code for part ATMEGA644P)
  #define sig_middle 0x96 //device
  #define sig_low 0x0A //signature
  Serial.print((char) sig_high);
  Serial.print((char) sig_middle);
  Serial.print((char) sig_low);
  Serial.print((char) STK_OK);
}
void universal() {//15-01-13 this is being used to retrieve the signature
  int w;
  uint8_t ch;

  for (w = 0; w < 4; w++) { //take the data
    buff[w] = getch();
  }
  if ((buff[0]==0x30)  && (buff[2]==0x00)){
    breply(sig_high);
  }
  else if ((buff[0]==0x30)  && (buff[2]==0x01)){
     breply(sig_middle);
  }
  else if ((buff[0]==0x30)  && (buff[2]==0x02)){
    breply(sig_low);
  }
  else{
    breply(STK_FAILED); //see if this gets a rise out of avrdude
  }
}
//////////////////////////////////////////
//////////////////////////////////////////



void read_page() {
  char result = (char)STK_FAILED;
  int length = 256 * getch() + getch();
  char memtype = getch();
  //cout<<"reading "<<length<<" bytes of "<<memtype<<" ";
  if (CRC_EOP != getch()) {
    Serial.print((char) STK_NOSYNC);
    return;
  }
  Serial.print((char) STK_INSYNC);
  if (memtype == 'F') result = read_page_1802(length);
  Serial.print(result);
  return;
}

////////////////////////////////////
////////////////////////////////////



int avrisp() { 
  uint8_t data, low, high, gc;
  uint8_t ch =getch();    //cout<<"<"<<_HEX(ch)<<" ";
  switch (ch) {
  case '?': //debug
    doitall();
    break;
  case 'v': //verbose mode
    runmode='v';
    break;
  case 'q': //quiet mode
    runmode='q';
    cout<<"going silent\n";
    break;
  case '!': //run
    cout<<"!=runq\n";
    runmode='q';
    run1802();
    break;
  case '0': // signon
    empty_reply();
    break;
  case '1':
    gc=getch(); //cout<<":"<<_HEX(gc)<<" ";
    if (gc == CRC_EOP) {
      Serial.print((char) STK_INSYNC);
      Serial.print("AVR ISP");
      Serial.print((char) STK_OK);
    }
    break;
  case 'A':
    gc=getch(); //cout<<":"<<_HEX(gc)<<" ";
    get_version(gc);
    break;
  case 'B':
    readbytes(20);
    set_parameters();
    empty_reply();
    break;
  case 'E': // extended parameters - ignore for now
    readbytes(5);
    empty_reply();
    break;

  case 'P':
    start_pmode();
    empty_reply();
    break;
  case 'U':
    here = getch() + 256 * getch();
    sethere_1802();
    //cout<<"here="<<_HEX(here);
    empty_reply();
    break;

  case 0x60: //STK_PROG_FLASH
    low = getch();
    high = getch();
    empty_reply();
    break;
  case 0x61: //STK_PROG_DATA
    data = getch();
    empty_reply();
    break;

  case 0x64: //STK_PROG_PAGE
    program_page();
    break;
    
  case 0x74: //STK_READ_PAGE
    read_page();    
    break;

  case 'V':
    universal();
    break;
  case 'Q':
    error=0;
    end_pmode();
    empty_reply();
    //run1802();
    break;
    
  case 0x75: //STK_READ_SIGN
    read_signature();
    break;

  // expecting a command, not CRC_EOP
  // this is how we can get back in sync
  case CRC_EOP:
    Serial.print((char) STK_NOSYNC);
    break;
    
  // anything else we will return STK_UNKNOWN
  default:
    if (CRC_EOP == getch()) 
      Serial.print((char)STK_UNKNOWN);
    else
      Serial.print((char)STK_NOSYNC);
  }
  return 0; //nothing useful to report; a non-void function must not fall off the end
}


#define sbi(sfr,bit) (_SFR_BYTE(sfr) |= _BV(bit))
#define cbi(sfr,bit) (_SFR_BYTE(sfr) &= ~_BV(bit))
void busout(uint8_t t){ //put a byte on the bus
  sbi(MISODDR,MISOBIT);
  for (int i=0;i<8;i++){
      if(t&0x80){ //top bit first; the old "MISOX," comma operand had no effect
        sbi(MISOPORT,MISOBIT);
      }else{
        cbi(MISOPORT,MISOBIT);
      }
      clockport=clockbv;//clock up, notclock down
      asm("  nop\n");
      clockport=notclockbv;//clock down, notclock up
      t<<=1;
  }
  cbi(MISODDR,MISOBIT);
}
uint8_t busin(){ //read a byte from the 1802
  uint8_t buschar;
//bit 7 of the byte is already in position to be read, the others have to be clocked in
  buschar=0; //clear the character
  for (uint8_t i = 0; i < 8; ++i) {  //get the 8 bits in sequence
	buschar = (buschar<<1)|digitalRead(datapin); //read bits 0-7 in sequence and set them in the byte
      clockport=clockbv;//clock up, notclock down
      asm("  nop\n");
      clockport=notclockbv;//clock down, notclock up
  }
  return buschar;
}
void justclock(){ //just run the spi clock
      clockport=clockbv;//clock up, notclock down
      clockport=notclockbv;//clock down, notclock up
      asm("nop\n");
      clockport=clockbv;//clock up, notclock down
      asm("nop\n");
      clockport=notclockbv;//clock down, notclock up
      asm("nop\n");
      clockport=clockbv;//clock up, notclock down
      asm("nop\n");
      clockport=notclockbv;//clock down, notclock up
      asm("nop\n");
      clockport=clockbv;//clock up, notclock down
      asm("nop\n");
      clockport=notclockbv;//clock down, notclock up
      asm("nop\n");
      clockport=clockbv;//clock up, notclock down
      asm("nop\n");
      clockport=notclockbv;//clock down, notclock up
      asm("nop\n");
      clockport=clockbv;//clock up, notclock down
      asm("nop\n");
      clockport=notclockbv;//clock down, notclock up
      asm("nop\n");
      clockport=clockbv;//clock up, notclock down
      asm("nop\n");
      clockport=notclockbv;//clock down, notclock up
      asm("nop\n");
      clockport=clockbv;//clock up, notclock down
      asm("nop\n");
      clockport=notclockbv;//clock down, notclock up
}
void run1802(){
  unsigned char char1802=21, pindsave,escmode=false; //char from 1802, captured PIND;
  if ('v'==runmode){
    cout<<" run1802.6.1("<<timeoutinterval<<")"<<"\n";
  }
  //Serial.end(); Serial.begin(9600);
  interface(INPUT); //release the pins(really just INsw)
  digitalWrite(CLRsw,LOW);   digitalWrite(LDsw,LOW);//ensure wait state
  delay(5);
  digitalWrite(LDsw,HIGH); //reset
  delay(5);
  pinMode(MEMrd,OUTPUT); digitalWrite(MEMrd,LOW); //AUG 10 -make sure 1802 can write
  digitalWrite(CLRsw,HIGH); //run
  state1802=running;
  cbi(TIMSK0,TOIE0);
  noInterrupts(); //default is no interrupts
  while(true){ //loop for 1802 program output
    while (PIND&0x40); //wait for SPI signal to go low
    pindsave=PIND; //grab the n0 line while it's valid
    while (0==(PIND&0x40)); //wait for trailing edge - signal goes high again
    if (pindsave&0x04){ //if it's for me  ** have to test this now
      char1802=busin(); //grab the character and clock the SPI bus
      if (escmode==1){//escaping this character
        cout<<_BYTE(char1802); //send it to the host
        escmode=0;
      }else{
        switch(char1802){
          case 0x12: //querying for avail characters
            busout(Serial.available());
            break;
          case 0: //wanting host characters
            busout(getch());
            break;
          case 0x1b: //escape next character
            escmode=1;
            break;
          case 0x02: // allow interrupts
            interrupts(); break;
          case 0x03: // disallow interrupts
            noInterrupts(); break;            
          default:
            cout<<_BYTE(char1802); //send it to the host
        }
      }
   } else{ //not for me, have to do this fast
      justclock(); //just clock the SPI bus
    }
  }
}
void domeall(){
 }
  
void doitall(){
  uint8_t c18;
  cout<<"doing it all\n";
  interface(OUTPUT);
   //start the write sequence
  reset1802();
  digitalWrite(LDsw,LOW);  //then go to load mode
 // busout(0xaa);
  //while(1);
  busout(0x7b);
  pressin();
  busout(0x7a);
  pressin();
  busout(0x7a);
  pressin();
  busout(0x7a);
  pressin();
  busout(0x7a);
  pressin();
  busout(0x30);
  pressin();
  busout(0x00);
  pressin();
  reset1802();
  run1802();
while(1);
}  
void sethere_1802(){
  if (here==0){ //if it wants to read from the beginning, fine
    reset1802();
    digitalWrite(LDsw,LOW);  //then go to load mode
  } //otherwise ignore it
}
void start_pmode_1802() {
  interface(OUTPUT); //set required pins to OUTPUT
  reset1802();
  digitalWrite(LDsw,LOW);  //then go to load mode
  //cout<<"entering program mode";
  pmode = 1;
}

void end_pmode_1802() {
  //cout<<"leaving program mode";
  pmode = 0;
}
void pressin(){
  digitalWrite(INsw,LOW); //toggle
  delayMicroseconds(presstimeu);
  digitalWrite(INsw,HIGH);
  delayMicroseconds(presstimeu);
}  
void reset1802(){
  pinMode(CLRsw,OUTPUT); pinMode(LDsw,OUTPUT);
  digitalWrite(CLRsw,LOW);   digitalWrite(LDsw,LOW);//reset
  digitalWrite(LDsw,HIGH); //1802
  state1802=reset;
}

void interface(uint8_t io){ //setting pins to INPUT or OUTPUT
 //really need to rethink this
  digitalWrite(MISOX,LOW);
  digitalWrite(INsw,HIGH);
  digitalWrite(spi_sck,LOW);digitalWrite(sr_sck,LOW);digitalWrite(notsr_sck,HIGH);
  digitalWrite(LDsw,LOW);digitalWrite(CLRsw,LOW);
  pinMode(MISOX,INPUT);//MISO is controlled by busout
  pinMode(spi_sck,OUTPUT);pinMode(sr_sck,OUTPUT);pinMode(notsr_sck,OUTPUT); //clocks are always output
  pinMode(CLRsw,OUTPUT);pinMode(LDsw,OUTPUT);  //ld & clr are always output
  pinMode(INsw,io);  //in controlled here
}



uint8_t read_byte_1802() {//first try
  digitalWrite(INsw,LOW);//push input button
  delayMicroseconds(presstimeu);
  digitalWrite(INsw,HIGH);
  delayMicroseconds(presstimeu);
  uint8_t buschar=busin();  
  return buschar;
}
char read_page_1802(int length) {//first try
  digitalWrite(MEMrd,HIGH); //make sure we're read only
  for (int x = 0; x < length; x+=1) {
    Serial.print((char) read_byte_1802());
  }
  return STK_OK;
}

uint8_t write_flash_1802(int length) {
  int x = 0;
  digitalWrite(MEMrd,LOW); //allow memory write
  while (x < length) {
    write_byte_1802(buff[x++]); //we can only write bytes sequentially from address 0
  }
  digitalWrite(MEMrd,HIGH); //make sure we're read only
  return STK_OK;
}

void write_byte_1802(char data){ //write a byte to the 1802
  busout(data); //put it on the data bus
  digitalWrite(INsw,LOW);//push input button
  delayMicroseconds(presstimeu);
  digitalWrite(INsw,HIGH);
  delayMicroseconds(presstimeu);
}
