Not Reading an AVR Bootloader From the Application Section


See this post first.

My program fails because interrupts are disabled inside the BLS. Running interrupts in the BLS requires a different lock bit setting than the standard Arduino one. The Arduino boot loader doesn’t use any interrupts, so this lock bit setting is fine for it.
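To make the lock bit dependence concrete, here is a minimal host-side sketch that decodes the boot lock bits for the boot loader section from a raw lock byte. It assumes the bit layout and mode table of the ATmega48/88/168 datasheet (BLB12 in bit 5, BLB11 in bit 4, a programmed bit reads 0). The struct and function names are made up for illustration, and the 0x0F and 0x3F values used below are assumed to be the stock Arduino "lock" and "unlock" bytes.

```cpp
#include <stdint.h>

// Restrictions implied by the BLB1 boot lock bits (assumed layout from the
// ATmega48/88/168 datasheet: BLB12 = bit 5, BLB11 = bit 4, programmed = 0).
struct Blb1Mode {
  int  mode;                  // datasheet mode number, 1..4
  bool spmWriteBlsBlocked;    // SPM may not write the BLS
  bool appLpmReadBlsBlocked;  // LPM running in the app section may not read the BLS
  bool irqOffInBls;           // interrupts disabled while executing in the BLS
                              // (when the vectors live in the app section)
};

// Hypothetical helper: decode the BLB1 mode from a raw lock byte.
Blb1Mode decodeBlb1(uint8_t lockByte) {
  bool blb12 = (lockByte >> 5) & 1;
  bool blb11 = (lockByte >> 4) & 1;
  if (blb12 && blb11)   return {1, false, false, false};  // no restrictions
  if (blb12 && !blb11)  return {2, true,  false, false};  // SPM write blocked
  if (!blb12 && !blb11) return {3, true,  true,  true};   // everything blocked
  return {4, false, true, true};                          // BLB12 = 0, BLB11 = 1
}
```

Under these assumptions, decodeBlb1(0x0F) reports mode 3, the combination that blocks the application-section LPM and also disables interrupts inside the BLS, while decodeBlb1(0x3F) reports mode 1 with no restrictions.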

Since interrupts are disabled inside the BLS, when my program calls the LPM instruction, my timer interrupt never fires. Hence the program never returns from the LPM instruction. It probably continues to execute the boot loader code until it crashes, restarts the boot loader, starts my program, etc. It continues to do this in an endless loop.

As a confirmation of the disabled interrupts and to validate my code, I modified the instruction after the LPM in the BLS to a “RET”, then ran my program without using the timer interrupt. I discovered it now works, with and without the lock bits set.

Here is the section of the boot loader code that I modified:

3de2: 00 23 and r16, r16
3de4: 39 f4 brne SkipRead ;.+14: 0x3df4
3de6: 94 91 lpm r25, Z+
//we need to insert this return command here
3de8: 08 95 ret
3de8: 80 91 c0 00 lds r24, UCSR0A ;0x00C0
3dec: 85 ff sbrs r24, 0b00000101 ;5
3dee: fc cf rjmp waituart ;.-8: 0x3de8

I downloaded the entire flash from my Arduino (with both my program and a standard boot loader installed). Then I opened the file (Intel Hex format) in a hex editor, searched for the LPM instruction, and changed the following two bytes:

[Image: hex editor]

Note that I also needed to change the checksum byte of the modified record, because otherwise avrdude complains when attempting the upload:

avrdude.exe: input file C:\Users\James\168.hex auto detected as Intel Hex
avrdude.exe: ERROR: checksum mismatch at line 991 of "C:\Users\James\168.hex"
avrdude.exe: checksum=0x99, computed checksum=0x0d
avrdude.exe: write to file 'C:\Users\James\168.hex' failed

Luckily for me, avrdude also computes and displays the correct checksum, so this was easy to fix.
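If you would rather not rely on the error message, the checksum is easy to compute yourself. The sketch below is a minimal host-side version, assuming the standard Intel HEX rule: the checksum is the two's complement of the low byte of the sum of all preceding record bytes. The function name is hypothetical.

```cpp
#include <stdint.h>
#include <stddef.h>

// Intel HEX record checksum. `record` holds every byte of the record before
// the checksum itself: byte count, address high, address low, record type,
// then the data bytes.
uint8_t intelHexChecksum(const uint8_t *record, size_t len) {
  uint8_t sum = 0;
  for (size_t i = 0; i < len; i++) {
    sum += record[i];              // wraps modulo 256, which is what we want
  }
  return static_cast<uint8_t>(-sum);  // two's complement of the low byte
}
```

For example, the record ":0300300002337A1E" has the bytes 03 00 30 00 02 33 7A, which sum to 0xE2, so the checksum byte is 0x1E. After editing any data byte in a record, recompute this value and replace the last two hex digits of that line.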

So the answer to my question is, “Yes, maybe.” An LPM instruction in the BLS can read the BLS; it just can’t do it with the method I attempted.

BTW, I’m a little annoyed with my new ATMEL-ICE (~$40). When using debugWire with an arduino, calls into the BLS are prevented, as ATMEL Studio depicts the BLS as solid “FF FF FF…”. However, when you return from the debug session, the bootloader is still there. Also, you scare yourself about once per debug session into thinking you bricked the chip. It seems like it takes a magical combination of nearly hidden menu selections to get in and out of debug mode. The combination of ATMEL Studio 6.2, debugWire and arduino is very fragile.


Reading an AVR Bootloader From the Application Section

What follows is a brilliant hack created by Julian Skidmore (aka Snial), the talented mind behind the FIGnition project. His ingenious hack, called BootJacker, is documented on his blog, OneWeekWonder. I can’t begin to explain his algorithm as well as he did, so go read his blog. But, in brief, Snial wanted to inject a new, smaller boot loader from the application portion of his program in order to reclaim unused flash. He accomplished what ATMEL and their datasheet claim is impossible:

“The Application section can never store any Boot Loader code since the SPM instruction is disabled when executed from the Application section.”

More precisely, what ATMEL claims is impossible is for the SPM instruction to write the boot loader section (BLS) from the application section. This protection is normally a good thing. On the Arduino, it prevents errant programs from overwriting the boot loader and rendering the AVR chip useless (requiring low-level ISP programming to revive it).

However, I was recently trying to read the BLS from the application section, and found this too is blocked. Try it and see if you can:

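//168 memory location
#include <avr/pgmspace.h>
void setup(void) {
  uint16_t address = 0x3800;
  Serial.begin(9600); //baud rate assumed; loop body below is reconstructed
  for (uint8_t i=0; i<32; i++) {
    Serial.print(pgm_read_byte(address + i), HEX); //LPM read of one BLS byte
  }
}

void loop(void) { }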

This program produces gibberish.

That result is expected if I correctly understand the Arduino lock bit settings, which are set to prevent exactly this. They block all SPM writes to the BLS and additionally prevent LPMs in the application section from reading the BLS. But the lock bits stop there: nothing prevents an LPM inside the BLS from reading the BLS. Am I wrong here?

So I read about the inspired method Snial devised on his blog and thought it could be used to do just that. Use an LPM inside the BLS to read the BLS. Snial’s technique involves devious and potentially catastrophic stack manipulation combined with precise timing to cleverly redirect code execution. I simply jacked Snial’s jack of the SPM instruction, and applied it to the LPM instruction. All the credit goes to Snial, and all the glory goes to God.
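One detail of the stack trick is easy to get wrong: RET takes a word address from the stack, not a byte address, and it pops the high byte first. The host-side sketch below shows how a flash byte address such as 0x3de6 (where the bootloader's LPM sits in my build) turns into the two bytes that get pushed. It assumes a 16-bit program counter, as on the ATmega168, and the helper names are made up for illustration.

```cpp
#include <stdint.h>

// The AVR program counter holds word addresses, so a flash byte address is
// halved before it can be used as a return address.
uint16_t byteToWordAddr(uint16_t byteAddr) {
  return byteAddr >> 1;
}

// Bytes in the order they must be pushed so that RET, which pops the high
// byte first, reassembles the word address: low byte first, then high byte.
struct PushOrder {
  uint8_t first;   // pushed first  (low byte of the word address)
  uint8_t second;  // pushed second (high byte of the word address)
};

PushOrder returnAddrPushOrder(uint16_t byteAddr) {
  uint16_t w = byteToWordAddr(byteAddr);
  return { static_cast<uint8_t>(w & 0xFF), static_cast<uint8_t>(w >> 8) };
}
```

For byte address 0x3de6 this gives word address 0x1ef3, pushed as 0xf3 and then 0x1e. If your bootloader build places the LPM somewhere else, these two constants are the values that have to change.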

My code is below, and it sort-of works, but only if all of the arduino lock bits are disabled. If the lock bits are set in the normal arduino manner the code doesn’t work. For the moment I give up. But here is my code:

//arduino 168 memory locations
#define kTCCR0B   0x25    //these defines required for inline asm
#define kTCNT0    0x26
#define kTIFR0    0x15
//timer 0 settings
#define T0_TIFR0  ((1<<OCF0B) | (1<<OCF0A) | (1<<TOV0))
#define T0_CYCLES 22
#define BLS_START 0x3800  //start of bls on atmega168 (no trailing semicolon in a #define)

uint16_t ReadAddr;

void SetupTimer0B(void) {
  TCCR0B = 0;           // stop the timer
  TCCR0A = 0;           // mode 0, no OCR outputs
  TCNT0 = 0;            // reset the timer
  TIFR0 = T0_TIFR0;     //clear all pending t0 interrupts
  OCR0B = T0_CYCLES;    // clock cycles from now
  TIMSK0 = (1<<OCIE0B); // OCR0B interrupt enabled
}

uint8_t LpmCmd(void) {
  uint8_t result;

  asm volatile(
    "push r0 \n"
    "push r1 \n"
    "push r16 \n"
    "push r30 \n"
    "push r31 \n"

    "ldi r16,1 \n"                  // timer 0 start at fClk
    "out %1,r16 \n"                 // set TCCR0B so off we go. This is time 0c

    "ldi r30,pm_lo8(ReturnHere) \n" //(1c)
    "ldi r31,pm_hi8(ReturnHere) \n" //(1c)
    "push r30 \n"                   //(2c)
    "push r31 \n"                   //(2c) these addresses must be pushed big-endian
                                    //0x3de6 or 0x1ef3 (word address) is location in bls of lpm instruction
    "ldi r30,0xf3 \n"               //(1c) lo byte 
    "ldi r31,0x1e \n"               //(1c) hi byte
    "push r30 \n"                   //(2c)
    "push r31 \n"                   //(2c)
    "lds r30,ReadAddr \n"           //(2c) lpm instruction needs a byte address in Z
    "lds r31,ReadAddr+1 \n"         //(2c)
    "ldi r25,0x00 \n\t"             //(1c)
    "ret \n"                        //(4c) goto (return to) bootloader via address we pushed onto stack
//   lpm r25,z+                       (3c) 24c total, timer set to 22 due to ISR latency (x-2)

    "ReturnHere: \n"                // interrupt returns to this location
    "mov %0,r25 \n"                 // save byte that lpm instruction fetched 

    "pop r31 \n"
    "pop r30 \n"
    "pop r16 \n"
    "pop r1 \n"
    "pop r0 \n"
    : "=r" (result) : "I" (kTCCR0B)
    : "r25"                         // r25 is overwritten by the lpm in the bootloader
  );
  return result;
}

// This timer interrupt fires during bootloader execution immediately after the lpm instruction.
// Then, if we would simply return (reti), we would go back to the bootloader. So, first we pop the
// return address (discarding it) and then do a reti, which takes us back to the "ReturnHere" location 
// the address of which, previously the LpmCmd() pushed onto the stack.
ISR(__vector_15, ISR_NAKED) {
  asm volatile(
    "ldi r30,0  \n"
    "out %0,r30 \n" //stop timer 0
    "out %1,r30 \n" //reset timer 0
    "ldi r30,%2 \n"
    "out %3,r30 \n" //clear interrupts on timer 0
    "pop r30 \n"    //pop ISR return, so we return to LpmCmd
    "pop r30 \n"    //understand we are trashing value in r30 here, but that shouldn't matter...
    "reti \n"
    : : "I" (kTCCR0B), "I" (kTCNT0), "I" (T0_TIFR0), "I" (kTIFR0)
  );
}

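void setup(void) {
  uint8_t b;
  Serial.begin(9600);   //baud rate assumed; this call is missing from the listing
  SetupTimer0B();       //assumed: timer 0 must be configured before the first read
  ReadAddr = BLS_START;
  for (uint8_t i=0; i<32; i++) {
    Serial.print(ReadAddr, HEX); 
    Serial.print(": ");
    for (uint8_t j=0; j<8; j++) {
      b = LpmCmd();      //read byte
      OCR0B = T0_CYCLES; //reset timer
      ReadAddr++;        //advance to next byte
      if (b < 0x10)
        Serial.print('0'); //pad to two hex digits
      Serial.print(b, HEX);
    }
    Serial.println();    //8 bytes per output line
  }
}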

void loop(void) { }

Program output with the arduino lock bits disabled, correctly shows the boot loader code:

3800: 0C94341C0C94511C
3808: 0C94511C0C94511C
3810: 0C94511C0C94511C
3818: 0C94511C0C94511C
3820: 0C94511C0C94511C
3828: 0C94511C0C94511C
3830: 0C94511C0C94511C
3838: 0C94511C0C94511C
3840: 0C94511C0C94511C
3848: 0C94511C0C94511C
3850: 0C94511C0C94511C
3858: 0C94511C0C94511C
3860: 0C94511C0C94511C
3868: 11241FBECFEFD4E0
3870: DEBFCDBF11E0A0E0
3878: B1E0E4EAFFE302C0
3880: 05900D92A230B107
3888: D9F712E0A2E0B1E0
3890: 01C01D92AD30B107
3898: E1F70E94361D0C94
38A0: D01F0C94001C982F
38A8: 9595959595959595
38B0: 905D8F708A307CF0
38B8: 282F295A8091C000
38C0: 85FFFCCF9093C600
38C8: 8091C00085FFFCCF
38D0: 2093C6000895282F
38D8: 205DF0CF982F8091
38E0: C00085FFFCCF9093
38E8: C6000895EF92FF92
38F0: 0F931F93EE24FF24
38F8: 87018091C00087FD

Cascading Timers to Create a Long Delay

Here is a demonstration program that runs on an Arduino which creates a 1 minute long delay by cascading timers. The procedure is outlined in Atmel Application Note AVR133.

I’ve set timer #1 up to toggle the OC1A pin (D9) and wired that to the T0 pin (D4), which clocks timer #0. Then I put the Arduino to sleep and wait for the timer #0 interrupt to wake it. The values I’m using should toggle the pin 13 LED once per minute.

A much longer delay is easily possible by increasing the counter values and the timer #1 prescaler. For example, with a 1024 prescale and maximum counter values, a delay of over 35 minutes is possible (with a 16MHz system clock).
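To check these numbers without a stopwatch, here is a small host-side sketch that evaluates the same formula used in the program comments, T = 2/Fs x T1P x OCR1A x (256 - TCNT0). The function name and parameter names are my own; the formula is taken as given and not corrected for the exact CTC period.

```cpp
#include <stdint.h>

// Delay in seconds for the cascaded timers.
//   fs          system clock in Hz
//   t1Prescale  timer 1 prescaler (T1P)
//   ocr1a       timer 1 compare value
//   tcnt0Start  value preloaded into TCNT0 before each delay
double cascadedDelaySeconds(double fs, uint32_t t1Prescale,
                            uint32_t ocr1a, uint32_t tcnt0Start) {
  return 2.0 / fs * t1Prescale * ocr1a * (256.0 - tcnt0Start);
}
```

With the values used in the demo, cascadedDelaySeconds(16e6, 256, 7500, 6) evaluates to 60 seconds. With a 1024 prescaler, OCR1A at 65535 and TCNT0 preloaded with 0, the result is about 2147 seconds, or just under 36 minutes.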

Note: reprogramming timers 0 and 1 breaks the Arduino functions that normally depend on them, such as millis() and delay() on timer 0 and PWM on pins 9 and 10 on timer 1.

// AVR 133: Atmel Application Note Long Delay Generation Demo
// Blinks LED on a 1 minute period:
// T = 2/Fs x T1P x OCR1A x (256 - TCNT0)
// T = 2/16000000x256x7500x(256-6)
// T = 60 or 1 minute
// connect arduino D9 to D4: 
//  T0   = PD4 (Arduino D4 as input)
//  OC1A = PB1 (Arduino D9 as output)
#include <avr/sleep.h>
#include <avr/power.h>

void setup() {
  //set pins
  DDRB |= (1<<PINB1) | (1<<PINB5); //set arduino D9 and D13 as outputs
  PORTB &= ~(1<<PINB1);            //set D9 low
  PORTD &= ~(1<<PIND4);            //set D4 low

  //timer #1 toggles OC1A on compare match, in turn toggling T0
  TCCR1A = (1<<COM1A0); //TCCR1A toggle OC1A on compare match
  TCCR1B = 0;
  TCCR1C = 0;
  OCR1A = 7500;       //output compare register on division ratio of 7500
  TIMSK1 = 0;

  //timer #0 fires interrupt when TCNT0=0 waking arduino 
  TCCR0A = 0;
  TCCR0B = 0;
  TIMSK0 = (1<<TOIE0);  //enable timer0 interrupt
}

void loop() {
  //toggle led
  PORTB ^= (1<<PINB5); //toggle led pin 

  //reset timers
  TCNT0 = 6;
  TCCR0B = (1<<CS00) | (1<<CS01) | (1<<CS02); //external source (t0) rising edge
  TCNT1 = 0UL;
  TCCR1B = (1<<WGM12) | (1<<CS12); //CTC mode 4 and 256 prescaler

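  //sleep and power down setup
  //(sleep calls reconstructed; idle mode assumed so both timers keep running)
  set_sleep_mode(SLEEP_MODE_IDLE);
  //go to sleep here
  sleep_mode();

  //wake upon timer #0 interrupt here
  //stop timers
  TCCR1B = 0;
  TCCR0B = 0; 
}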

Hall Effect Sensor BoB


I made a tiny breakout board for a Melexis US5881 hall effect sensor. A HES detects whether a magnet is near, and is useful for non-contact/waterproof type switches, position sensors and rotary/shaft encoders.


Here is the circuit utilized on the BoB:


The BoB has been sent out for fabrication. 3 copies cost me $1.40 (shipped). I find that unbelievable.

A US5881 HES is available from Adafruit here.

The Melexis datasheet is located here.


NavSpark and GPS Predictive Lap Timer

This cool little programmable (via a modified Arduino IDE) GNSS device should be capable of implementing all of the functionality needed for a predictive Lap Timer. There are models which integrate GPS/Beidou, GPS/GLONASS and GPS/Galileo. SkyTraq Technology Inc., a leading GNSS technology company, developed the device via an Indiegogo crowdfunding campaign. I have several on the way.


NavSpark features:
100MHz 32bit RISC Processor with 16Kbyte I-Cache and 2Kbyte D-Cache
IEEE-754 Compliant Floating Point Unit
1MByte Flash Memory
212Kbyte SRAM
GPS Receiver
UART x 2
SPI x 2
I2C x 1
17 Digital I/O (shared with above functional pins)
1 Pulse Per Sec Timing Reference with +/-10nsec Accuracy
Customized Arduino IDE with GPS SDK Seamlessly Integrated


Boardtrack Racer


Slightly off topic, but relevant to using Eagle to design a PCB. I’ve built a gas-powered bicycle. It uses a Chinese 2-stroke engine purchased off the internet for approximately $125. The kit includes everything needed to convert a basic bicycle into a motorized version. Here is my Beach Cruiser:


I’ve made a few improvements to the package over time, the most recent being a high-power CDI/coil electronic ignition system. The CDI/coil package costs about $20 each. This was the first time I used a potting compound to seal the board from the elements. The PCBs were made in a batch of 3 for approximately $14. Anyone interested in the files and list of materials can email me.



New Project Under Wraps

I’ve been busy working on a new project. I am hesitant to publish any details yet. It has required making several PCBs in an iterative process, and I needed to learn SMD soldering. I will soon post about the lessons I learned. Here are a few photos:

ADXL-377 Eagle file:

ADXL-377 Breakout Board:

RXM418LR Breakout Board:

xminilab SMD soldering:

Secret Project:
HITsafe BoB
