Arduino Inline Assembly Tutorial #3 (Clobbers)

Clobbered

Guess what? Our previous tutorial example (Tutorial #2) has a problem. Here is the inline portion of that code:

asm (
  "ldi r26, 42  \n"
  "sts (a), r26 \n"
);

Notice that our example uses register 26, or r26. Even though we used this register only temporarily, we have trashed (or “clobbered”) any value that was previously stored there. The compiler may have been using r26 elsewhere in the program, and we have inadvertently replaced its contents with our value of 42. This may have introduced a bug into our program, or worse, caused it to crash.

Remember, the compiler simply passes our assembly code on to the avr-as assembler. It really has no idea what we are doing. Because of this, we need a way to tell the compiler which registers we use, hence the clobber list.

If you recall the general form of the extended inline assembler statement:

asm("code" : output operand list : input operand list : clobber list);

The fourth part is a list of “clobbered” or “accessed” registers. The format is simple: name each register we clobber inside quotation marks. Like so:

"r26"

Our inline code should have looked like this:

asm (
  "ldi r26, 42  \n" 
  "sts (a), r26 \n" 
  : : : "r26"
);

Don’t forget the clobber list is the fourth part of the asm statement, and we separate the parts with colons. If we clobbered additional registers, we would simply add them to the list, separating them with commas, like so:

"r16", "r17", "r25", "r26"
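As a sketch of a statement that clobbers more than one register, the function below loads two constants and stores them, listing both scratch registers. The name store_two and the second variable b are made up for illustration. The assembly only builds for AVR targets, so a plain C fallback is included on the assumption that you may also want to compile the file on a desktop machine.

```cpp
#include <stdint.h>

volatile uint8_t a = 0;
volatile uint8_t b = 0;  // hypothetical second variable, for illustration only

// Hypothetical helper: writes 42 to a and 7 to b.
void store_two(void) {
#if defined(__AVR__)
  asm volatile (
    "ldi r24, 42  \n"  // r24 = 42
    "ldi r25, 7   \n"  // r25 = 7
    "sts (a), r24 \n"  // a = r24
    "sts (b), r25 \n"  // b = r25
    : : : "r24", "r25" // both scratch registers go in the clobber list
  );
#else
  // Not an AVR build: same effect in plain C so the file still compiles.
  a = 42;
  b = 7;
#endif
}
```

Calling store_two() from setup() and printing a and b should show 42 and 7.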

The Chicken or the Egg

Let’s introduce a minor addition to our previous inline tutorial program. Instead of dealing with an 8-bit byte value, let’s use a 16-bit (2-byte) integer value. Obviously, an integer value requires two byte-sized memory locations to store itself completely. This introduces a conundrum: which byte comes first?

Lows, Highs and Endians

Endianness refers to the order of the bytes in computer memory. An integer may be represented in big-endian or little-endian format. The Arduino uses little-endian, which means the least significant byte (LSB) is stored at the lower memory address while the most significant byte (MSB) is stored at the higher memory address.

Before we get to our inline assembly program, here’s a diversionary program for the Arduino which demonstrates endianness:

//program demonstrating arduino endianness [little endian]
char text[32];

void setup() {
  uint16_t n16 = 0x1234;     //declare & initialize 16-bit number
  uint32_t n32 = 0x12345678; //declare & initialize 32-bit number

  Serial.begin(9600);

  uint8_t* pn16 = (uint8_t *)&n16; //declare uint8_t pointer to 1st byte of 16-bit number
  
  Serial.println(n16, HEX);
  for (uint8_t i=0; i<2; i++) {
    //iterate through both bytes of n16, noting order of digits
    sprintf(text, "%p: %02x \n", (void *)pn16, *pn16); //print address and byte
    pn16++; //advance separately: argument evaluation order is unspecified
    Serial.print(text);
  }
  Serial.println();

  uint8_t* pn32 = (uint8_t *)&n32; //declare uint8_t pointer to 1st byte of 32-bit number

  Serial.println(n32, HEX);
  for (uint8_t i=0; i<4; i++) {
    //iterate through all 4-bytes of n32, noting order of digits
    sprintf(text, "%p: %02x \n", (void *)pn32, *pn32); //print address and byte
    pn32++; //advance separately: argument evaluation order is unspecified
    Serial.print(text);
  }

  Serial.println();

}

void loop(void) { }

For the above program, you should receive output similar to this:

1234
0x8f1: 34
0x8f2: 12

12345678
0x8ed: 78
0x8ee: 56
0x8ef: 34
0x8f0: 12
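To read such a dump back into a number, start at the lowest address and give each following byte eight more bits of weight. A minimal sketch, with from_le16 and from_le32 as made-up helper names; it does not depend on the byte order of the machine it runs on:

```cpp
#include <stdint.h>

// Hypothetical helper: rebuild a 16-bit value from 2 bytes stored low byte first.
uint16_t from_le16(const uint8_t *p) {
  return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

// Hypothetical helper: rebuild a 32-bit value from 4 bytes stored low byte first.
uint32_t from_le32(const uint8_t *p) {
  return (uint32_t)p[0]
       | ((uint32_t)p[1] << 8)
       | ((uint32_t)p[2] << 16)
       | ((uint32_t)p[3] << 24);
}
```

Feeding it the bytes in the order listed above (34, 12 and 78, 56, 34, 12) gives back 0x1234 and 0x12345678.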

From Low to High

It is helpful to use a couple of assembly operators which easily determine the LSB and MSB of a 16-bit integer:

  • lo8() Takes the least significant 8 bits of a 16-bit integer
  • hi8() Takes the most significant 8 bits of a 16-bit integer
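In C terms, the two operators amount to a mask and a shift. A minimal sketch follows; lo8_c and hi8_c are made-up names, since the real lo8() and hi8() are assembler operators rather than C functions:

```cpp
#include <stdint.h>

// Hypothetical C equivalent of lo8(): keep only the low 8 bits.
static inline uint8_t lo8_c(uint16_t x) { return (uint8_t)(x & 0xff); }

// Hypothetical C equivalent of hi8(): shift the high 8 bits down.
static inline uint8_t hi8_c(uint16_t x) { return (uint8_t)(x >> 8); }
```

For example, lo8_c(0x1234) is 0x34 and hi8_c(0x1234) is 0x12.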

Lucky for us, when using these operators, we don’t need to perform the math to determine that the LSB of 32,767 is 255 (0xff in hexadecimal), and the MSB is 127 (0x7f). The lo8 and hi8 operators do this for us. Armed with this new information let’s store the value of 32,767 into our integer variable (a):

16-bit Integer Example

volatile int a = 0;

void setup() {
  Serial.begin(9600);

  asm (
    "ldi r24, lo8(32767) \n" //0xff
    "ldi r25, hi8(32767) \n" //0x7f
    "sts (a), r24        \n" //lsb
    "sts (a + 1), r25    \n" //msb
    : : : "r24", "r25"
  );

  Serial.print("a = "); Serial.println(a);
}

void loop(void) { }

First, notice how we address the 2-byte memory location representing (a): (a) refers to the LSB, while (a + 1) refers to the MSB. It’s vitally important to keep the correct order, or endianness. Had we swapped the two bytes, our number would have become 0xff7f in hexadecimal, which is 65,407 as an unsigned integer, or -129 as a signed 16-bit integer. If all of this sounds foreign to you, you might want to study up on hexadecimal notation and signed vs. unsigned integers.

Second, we didn’t forget to include the “clobber” list this time.
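The byte-order arithmetic from the first point can be checked on any compiler. A minimal sketch, with join_bytes as a made-up helper that takes the byte stored at (a) and the byte stored at (a + 1):

```cpp
#include <stdint.h>

// Hypothetical helper: value of a little-endian 16-bit variable whose
// first byte (lower address) is `low` and second byte is `high`.
uint16_t join_bytes(uint8_t low, uint8_t high) {
  return (uint16_t)(low | ((uint16_t)high << 8));
}

// Correct order:  join_bytes(0xff, 0x7f) is 0x7fff, i.e. 32767.
// Swapped order:  join_bytes(0x7f, 0xff) is 0xff7f, i.e. 65407 unsigned,
//                 or -129 when reinterpreted as int16_t.
```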

Final Answer

Our final example is just a simple adaptation of our very first inline program. Here we are again dealing with byte values, and we are just going to perform a simple variable swap. In C, we would code this something like:

byte c, b=20, a=10;

c = a; 
a = b; 
b = c;

In inline assembler:

volatile byte a = 10;
volatile byte b = 20;

void setup() {
  Serial.begin(9600);

  asm (
    "lds r24, (a) \n"
    "lds r26, (b) \n"
    "sts (b), r24 \n" //exchange registers
    "sts (a), r26 \n"
    : : : "r24", "r26"
  );

  Serial.print("a = "); Serial.println(a);
  Serial.print("b = "); Serial.println(b);
}

void loop(void) { }

Notice, instead of loading an immediate value with the LDI instruction, we use LDS. LDS is the mnemonic for “Load Direct from data Space”. LDS loads one byte from the data space (SRAM) into a register.

In our program, in the process of loading and then storing, we simply exchange the registers (r24 for r26) in order to perform the swap. Notice we don’t need to burden ourselves with the actual addresses of the variables a and b. In both the LDS and STS instructions, the assembler inserts the SRAM memory addressing for us. Furthermore, we correctly identify the two registers used, as “Clobbered”.
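For comparison with the three-step C swap, here is a plain C model of what the four instructions do. It is only a sketch: swap_ab is a made-up name, and the local variables r24 and r26 merely stand in for the registers.

```cpp
#include <stdint.h>

volatile uint8_t a = 10;
volatile uint8_t b = 20;

// Hypothetical C model of the inline assembly swap.
void swap_ab(void) {
  uint8_t r24 = a;  // lds r24, (a)
  uint8_t r26 = b;  // lds r26, (b)
  b = r24;          // sts (b), r24
  a = r26;          // sts (a), r26
}
```

Because each value sits in its own register before anything is stored, no third variable is needed.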

Spoiler Alert

In our next tutorial, we will reduce the previous byte swap program into the following rather odd-looking inline assembler code. You might even be tempted to exclaim, “What code?” Believe it or not, this works. Get ready to travel the winding path of input and output operands!

asm (
  "" : "=r" (a), "=r" (b) : "0" (b), "1" (a) 
);

Reference

AVR 8-bit Instruction Set
AVR-GCC Inline Assembler Cookbook
Extended Asm – Assembler Instructions with C Expression Operands

Also available as a book, with greatly expanded coverage!


About Jim Eli

µC experimenter