Reading an AVR Bootloader From the Application Section

What follows builds on a brilliant hack created by Julian Skidmore (aka Snial), the talented mind behind the FIGnition project. His hack, called BootJacker, is documented on his blog, OneWeekWonder. I can’t explain his algorithm as well as he did, so go read his blog. In brief, Snial wanted to install a new, smaller boot loader from the application portion of his program in order to reclaim unused flash. He accomplished what ATMEL and their datasheet claim is impossible:

“The Application section can never store any Boot Loader code since the SPM instruction is disabled when executed from the Application section.”

More precisely, what ATMEL claims is impossible is for an SPM instruction executed from the application section to write to the boot loader section (BLS). This protection is normally a good thing. On the Arduino, it prevents errant programs from overwriting the boot loader and rendering the AVR chip useless (requiring low-level ISP programming to revive it).

However, I was recently trying to read the BLS from the application section and found that this, too, is blocked. Try it and see if you can:

//168 memory location
void setup(void) {
  uint16_t address = 0x3800;
  
  Serial.begin(9600);
  for (uint8_t i=0; i<32; i++)
    Serial.println(pgm_read_byte_near(address++));
}

void loop(void) { }

This program produces gibberish.

If I understand the Arduino lock bit settings correctly, they are configured to prevent exactly this. They block all SPM writes to the BLS and additionally prevent LPM instructions in the application section from reading the BLS. But the lock bits stop there: nothing prevents an LPM executed from inside the BLS from reading the BLS. Am I wrong here?
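To make the lock bit reasoning concrete, here is a minimal sketch that decodes a lock byte into the two boot lock bit modes. It assumes the lock byte layout and mode numbering from the ATmega168 datasheet, and the values 0x0F (locked) and 0x3F (unlocked) that I believe Arduino's boards.txt uses for the 168. The names blbMode, blb0Mode, blb1Mode, appCannotReadBls and readLockByte are made up for this example.

```cpp
#include <stdint.h>

// Assumed lock byte layout (ATmega168 datasheet): bit 2 = BLB01, bit 3 = BLB02
// (application section), bit 4 = BLB11, bit 5 = BLB12 (boot loader section).
// A programmed bit reads as 0.
static uint8_t blbMode(uint8_t bit2, uint8_t bit1) {
  if (bit2 && bit1)   return 1;  // no restrictions
  if (bit2 && !bit1)  return 2;  // SPM may not write to the section
  if (!bit2 && !bit1) return 3;  // no SPM writes, no LPM reads from the other section
  return 4;                      // no LPM reads from the other section
}

// Protection mode of the application section.
uint8_t blb0Mode(uint8_t lockByte) {
  return blbMode((lockByte >> 3) & 1, (lockByte >> 2) & 1);
}

// Protection mode of the boot loader section.
uint8_t blb1Mode(uint8_t lockByte) {
  return blbMode((lockByte >> 5) & 1, (lockByte >> 4) & 1);
}

// True when an LPM executed from the application section cannot read the BLS.
bool appCannotReadBls(uint8_t lockByte) {
  uint8_t m = blb1Mode(lockByte);
  return m == 3 || m == 4;
}

#if defined(__AVR__)
#include <avr/boot.h>
// On the chip itself, avr-libc can read the real lock byte from the application.
uint8_t readLockByte(void) {
  return boot_lock_fuse_bits_get(GET_LOCK_BITS);
}
#endif
```

With 0x0F this reports BLB1 mode 3, which matches the behaviour described above: no SPM writes to the BLS and no application-section LPM reads of it. With 0x3F it reports mode 1, meaning no restrictions.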

So I read about the inspired method Snial devised on his blog and thought it could be used to do just that: use an LPM inside the BLS to read the BLS. Snial’s technique involves devious and potentially catastrophic stack manipulation, combined with precise timing, to redirect code execution. I simply jacked Snial’s jack of the SPM instruction and applied it to the LPM instruction. All the credit goes to Snial, and all the glory goes to God.
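The stack manipulation is easier to follow in a toy model. The sketch below is plain C++ (not AVR code) that only mimics the byte order used by PUSH, RET and RETI on a part with a 16-bit program counter. ToyStack, Trace and simulateHijack are hypothetical names for this illustration, and the addresses passed in are arbitrary word addresses.

```cpp
#include <stdint.h>

// Toy model of the AVR hardware stack; top counts the bytes currently pushed.
struct ToyStack {
  uint8_t bytes[16];
  uint8_t top;

  ToyStack() : top(0) {}
  void push(uint8_t b) { bytes[top++] = b; }
  uint8_t pop() { return bytes[--top]; }

  // Same order as my code: low byte first, then high byte.
  void pushAddress(uint16_t wordAddr) {
    push((uint8_t)(wordAddr & 0xFF));
    push((uint8_t)(wordAddr >> 8));
  }
  // RET and RETI pop the high byte first, then the low byte.
  uint16_t ret() {
    uint16_t hi = pop();
    uint16_t lo = pop();
    return (uint16_t)((hi << 8) | lo);
  }
};

struct Trace {
  uint16_t afterRet;   // where the RET in the application lands
  uint16_t afterReti;  // where the RETI in the timer ISR lands
};

Trace simulateHijack(uint16_t returnHere, uint16_t lpmInBootloader) {
  ToyStack s;
  s.pushAddress(returnHere);       // pushed first, so it is consumed last
  s.pushAddress(lpmInBootloader);  // pushed last, so RET consumes it
  Trace t;
  t.afterRet = s.ret();            // execution jumps to the LPM inside the BLS

  // The timer interrupt fires right after the LPM; the hardware pushes the
  // address of the next bootloader instruction.
  s.pushAddress((uint16_t)(lpmInBootloader + 1));
  s.pop();                         // the ISR pops twice to discard that address
  s.pop();
  t.afterReti = s.ret();           // RETI now returns to ReturnHere
  return t;
}
```

Calling simulateHijack with any two addresses shows the RET landing on the second one and the RETI landing on the first, which is the whole trick: the bootloader runs exactly one instruction on our behalf and never gets control back.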

My code is below, and it sort of works, but only if all of the Arduino lock bits are disabled. If the lock bits are set in the normal Arduino manner, the code doesn’t work. For the moment I give up. But here is my code:

//arduino 168 memory locations
#define kTCCR0B   0x25    //these defines required for inline asm
#define kTCNT0    0x26
#define kTIFR0    0x15
//timer 0 settings
#define T0_TIFR0  ((1<<OCF0B) | (1<<OCF0A) | (1<<TOV0))
#define T0_CYCLES 22
#define BLS_START 0x3800  //start of bls on atmega168

uint16_t ReadAddr;

void SetupTimer0B(void) {
  TCCR0B = 0;           // stop the timer
  TCCR0A = 0;           // mode 0, no OCR outputs
  TCNT0 = 0;            // reset the timer
  TIFR0 = T0_TIFR0;     //clear all pending t0 interrupts
  OCR0B = T0_CYCLES;    // clock cycles from now
  TIMSK0 = (1<<OCIE0B); // OCR0B interrupt enabled
}

uint8_t LpmCmd(void) {
  uint8_t result;

  asm volatile(
    "push r0 \n"
    "push r1 \n"
    "push r16 \n"
    "push r30 \n"
    "push r31 \n"

    "ldi r16,1 \n"                  // timer 0 start at fClk
    "out %1,r16 \n"                 // set TCCR0B so off we go. This is time 0c

    "ldi r30,pm_lo8(ReturnHere) \n" //(1c)
    "ldi r31,pm_hi8(ReturnHere) \n" //(1c)
    "push r30 \n"                   //(2c)
    "push r31 \n"                   //(2c) these addresses must be pushed big-endian
	
                                    //byte address 0x3de6, i.e. word address 0x1ef3, is the location of the lpm instruction in the bls
    "ldi r30,0xf3 \n"               //(1c) lo byte
    "ldi r31,0x1e \n"               //(1c) hi byte
    "push r30 \n"                   //(2c)
    "push r31 \n"                   //(2c)

    "lds r30,ReadAddr \n"           //(2c) lpm instruction needs a byte address in Z
    "lds r31,ReadAddr+1 \n"         //(2c)

    "ldi r25,0x00 \n"               //(1c)
    "ret \n"                        //(4c) goto (return to) bootloader via address we pushed onto stack
//   lpm r25,z+                       (3c) 24c total, timer set to 22 due to ISR latency (x-2)

    "ReturnHere: \n"                // interrupt returns to this location
    "mov %0,r25 \n"                 // save byte that lpm instruction fetched 

    "pop r31 \n"
    "pop r30 \n"
    "pop r16 \n"
    "pop r1 \n"
    "pop r0 \n"
    : "=r" (result) : "I" (kTCCR0B)
    : "r16", "r25", "r30", "r31"    //clobbers keep result out of registers the asm overwrites
  );
  return(result);
}

// This timer interrupt fires during bootloader execution, immediately after the lpm instruction.
// If we simply returned (reti), we would go back to the bootloader. So we first pop the
// interrupt's return address (discarding it) and then do a reti, which takes us back to the
// "ReturnHere" location, whose address LpmCmd() previously pushed onto the stack.
ISR(__vector_15, ISR_NAKED) {
  asm volatile(
    "ldi r30,0  \n"
    "out %0,r30 \n" //stop timer 0
    "out %1,r30 \n" //reset timer 0
    "ldi r30,%2 \n"
    "out %3,r30 \n" //clear interrupts on timer 0
    "pop r30 \n"    //pop ISR return, so we return to LpmCmd
    "pop r30 \n"    //understand we are trashing value in r30 here, but that shouldn't matter...
    "reti \n"
    : : "I" (kTCCR0B), "I" (kTCNT0), "I" (T0_TIFR0), "I" (kTIFR0)
  );
}

void setup(void) {
  uint8_t b;
  
  Serial.begin(9600);
  ReadAddr = BLS_START;
  SetupTimer0B();
  for (uint8_t i=0; i<32; i++) {
    Serial.print(ReadAddr, HEX); 
    Serial.print(": ");
    for (uint8_t j=0; j<8; j++) {
      b = LpmCmd();      //read byte
      OCR0B = T0_CYCLES; //reset timer
      ReadAddr++;        //advance to next byte
      if (b < 0x10)
        Serial.print("0");
      Serial.print(b, HEX);
    }
    Serial.println(); 
  }
}

void loop(void) { }
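Two constants in the code above were worked out by hand: the word address of the LPM instruction and the timer compare value. The following sketch redoes that arithmetic so it can be checked. The cycle counts are copied from my comments in LpmCmd(), and byteToWordAddress, cyclesUntilLpmDone and timerCompareValue are hypothetical helper names.

```cpp
#include <stdint.h>

// The program counter (and therefore RET) addresses flash in 16-bit words,
// while LPM addresses it in bytes through the Z register.
uint16_t byteToWordAddress(uint16_t byteAddr) {
  return byteAddr >> 1;
}

// Cycles from the OUT that starts timer 0 until the LPM has finished.
uint8_t cyclesUntilLpmDone(void) {
  const uint8_t counts[] = {
    1, 1, 2, 2,  // ldi, ldi, push, push  (ReturnHere)
    1, 1, 2, 2,  // ldi, ldi, push, push  (LPM address in the BLS)
    2, 2,        // lds, lds              (ReadAddr into Z)
    1,           // ldi r25,0
    4,           // ret
    3            // lpm r25,Z+ inside the bootloader
  };
  uint8_t total = 0;
  for (uint8_t i = 0; i < sizeof(counts); i++) {
    total += counts[i];
  }
  return total;
}

// The compare value is set 2 cycles early, per the "(x-2)" note in LpmCmd().
uint8_t timerCompareValue(void) {
  return (uint8_t)(cyclesUntilLpmDone() - 2);
}
```

byteToWordAddress(0x3DE6) gives 0x1EF3, the pair loaded into r30/r31 before the ret, and timerCompareValue() gives the 22 used for T0_CYCLES. If the LPM sits at a different address in your bootloader, only the first number changes.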

With the Arduino lock bits disabled, the program output correctly shows the boot loader code:

3800: 0C94341C0C94511C
3808: 0C94511C0C94511C
3810: 0C94511C0C94511C
3818: 0C94511C0C94511C
3820: 0C94511C0C94511C
3828: 0C94511C0C94511C
3830: 0C94511C0C94511C
3838: 0C94511C0C94511C
3840: 0C94511C0C94511C
3848: 0C94511C0C94511C
3850: 0C94511C0C94511C
3858: 0C94511C0C94511C
3860: 0C94511C0C94511C
3868: 11241FBECFEFD4E0
3870: DEBFCDBF11E0A0E0
3878: B1E0E4EAFFE302C0
3880: 05900D92A230B107
3888: D9F712E0A2E0B1E0
3890: 01C01D92AD30B107
3898: E1F70E94361D0C94
38A0: D01F0C94001C982F
38A8: 9595959595959595
38B0: 905D8F708A307CF0
38B8: 282F295A8091C000
38C0: 85FFFCCF9093C600
38C8: 8091C00085FFFCCF
38D0: 2093C6000895282F
38D8: 205DF0CF982F8091
38E0: C00085FFFCCF9093
38E8: C6000895EF92FF92
38F0: 0F931F93EE24FF24
38F8: 87018091C00087FD
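One way to sanity-check a dump like this is to decode the repeated 4-byte pattern at the start, which looks like an interrupt vector table made of JMP instructions. The helper below is a minimal sketch of that decoding. jmpTargetByteAddress is a hypothetical name, and it assumes a part with at most 64K words of flash, so the high target bits embedded in the first opcode word are ignored.

```cpp
#include <stdint.h>

// Decode a 4-byte AVR JMP as it appears in the dump (each 16-bit word is
// stored low byte first). Returns the target as a byte address, or 0xFFFF
// if the first word is not a JMP opcode.
uint16_t jmpTargetByteAddress(const uint8_t b[4]) {
  uint16_t w0 = (uint16_t)(b[0] | (b[1] << 8));
  uint16_t w1 = (uint16_t)(b[2] | (b[3] << 8));
  if ((w0 & 0xFE0E) != 0x940C) {  // JMP opcode: 1001 010k kkkk 110k
    return 0xFFFF;
  }
  return (uint16_t)(w1 << 1);     // word address to byte address
}
```

Fed the first four bytes of the dump (0C 94 34 1C), this returns 0x3868, which is the byte address where the repeated pattern ends and different code begins. The repeated entry 0C 94 51 1C decodes to 0x38A2, which lands inside the dumped range as well.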

About Jim Eli

µC experimenter