Arduino Inline Assembly Tutorial (Functions)

At first consideration, the topic of functions seems simple and trite. Just discuss how to “CALL” and “RETURN” to and from a function, right? However, there are many subtopics involved as well. For example, passing and returning parameters, prologue and epilogue code, the stack frame and mixing assembly and C are topics deserving of separate tutorials. Hopefully, we can do all of these justice, but first, the basics…

Convert Snippet Into a Function

How about a simple demonstration of turning an inline code snippet into a function? In a previous tutorial on indirect addressing, several inline pieces of code were developed to perform various string operations. One such operation determined the character length of a string. The code is below.

String Length, Sounds Like strlen

const char src[4] = "abc";
volatile uint8_t len;
 
asm (
  "_loop:               \n"
  "ld   __tmp_reg__, Z+ \n"
  "tst  __tmp_reg__     \n"
  "brne _loop           \n"
  //Z points one character past the terminating NUL
  "subi %A1, 1          \n" //subtract post-increment
  "sbci %B1, 0          \n"
  "sub  %A1, %A2        \n" //length = end - start
  "sbc  %B1, %B2        \n"
  "mov  %0, %A1         \n" //save len (uint8_t)
  : "=r" (len) : "z" (src), "x" (src)
);

While this code could easily be included “inline”, it certainly would be more useful if it was defined as a general function. This would make it much easier to use throughout a program, and also reduce overall program size by incorporating only one instance of the code. So how is this accomplished?

Stub Your Code

The official Cookbook refers to this technique as a “C Stub Function,” which is nothing more than a function definition containing only inline assembler code. Typically, in a “C Stub Function”, the function parameters and local variables define the data used in, and the value returned (if any) by, the function. This is an easy method to pass data to and from the inline code, without the need to understand the underlying details of how it’s done. Therefore, we eliminate the necessity of writing additional supporting code.

The above “string length” snippet easily becomes a full-blown function, _strlen(), using this method. Notice the transformed function below receives a string (s) as a parameter and returns the length, which is held in a local variable (len). We refer to these same variables in the input and output constraints:

inline uint8_t _strlen(const char *s) {
  uint8_t len;

  asm (
    "1:                  \n" //local label, safe if inlined twice
    "ld  __tmp_reg__, Z+ \n"
    "tst __tmp_reg__     \n"
    "brne 1b             \n"
    //len = Z - 1 - src = (-1 - src) + Z = ~src + Z
    "com %A2             \n"
    "com %B2             \n"
    "add %A2, %A1        \n"
    "adc %B2, %B1        \n"
    "mov %0, %A2         \n" //save len (uint8_t)
    : "=r" (len) : "z" (s), "x" (s)
  );

  return len;
}

Here is a look at the code generated for the loop and length calculation of the above C Stub Function (notice the only “stub” code the compiler needs to add is two MOVW instructions, which copy the pointer into Z and X):

  MOVW r30, r24
  MOVW r26, r24
loop:
  LD r0,Z+
  TST r0
  BRNE loop
  COM r26
  COM r27
  ADD r26, r30
  ADC r27, r31
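
To convince ourselves the shortcut in the comment (len = Z - 1 - src = ~src + Z) is sound, the arithmetic can be checked on a desktop compiler. The following is only a sketch in portable C, not AVR code: model_strlen is a hypothetical name, and a uintptr_t stands in for the 16-bit X register pair. Because unsigned arithmetic wraps around, ~start + end equals end - 1 - start at any register width.

```c
#include <stdint.h>

/* Portable model of the string length arithmetic.
   Hypothetical helper for illustration; this is not AVR code. */
uint8_t model_strlen(const char *s) {
  const char *z = s;            /* plays the role of Z */
  uintptr_t x = (uintptr_t)s;   /* plays the role of X (start address) */

  while (*z++ != 0) {           /* ld __tmp_reg__, Z+ / tst / brne */
  }
  /* z now points one character past the terminating NUL */

  x = ~x;                       /* com %A2 / com %B2 */
  x += (uintptr_t)z;            /* add / adc: ~start + end */
  return (uint8_t)x;            /* keep only the low byte (len is a uint8_t) */
}
```

Calling model_strlen("abc") walks z four bytes forward, and the complement-and-add step leaves 3. As in the assembly version, only the low byte is kept, so strings longer than 255 characters wrap around.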

Placing a Call

An extension to the “C Stub Function” technique is calling another C function from inside inline assembly code. The following bit of code demonstrates the CALL instruction. This instruction “calls” a subroutine located within the program memory (if we remember to properly define the function to avoid linkage errors). The C Stub Function even handles the return (RET) for us.

An additional detail required here is the need to enclose the “called” function inside an extern “C” { } declaration (see the example below). In C++, extern “C” prevents the function name from being “mangled”. A mangled name would prevent the linker from locating the called function.

extern "C" {
  void foo() {
    // do something here...
  }
}

void test() {
  asm (
    "call foo \n"
  );
}

Playing Catch

Next, we present a basic example of passing and returning parameters to and from C Stub Functions. The purpose of the following code is to convert an upper case ASCII character into its lower case equivalent. We’ve created two functions here, _isupper and _tolower, which validate the input character and then perform the conversion.

Take a look at the code below.

Notice the first thing _tolower does is call the function _isupper. Since _tolower hasn’t done anything yet, the C Stub Function simply hands the input character (c), the parameter to _tolower, directly on to the _isupper function. Neat!

Next, _isupper checks the character to confirm it’s actually an upper case character. If so, it returns the character; otherwise, it returns zero. Upon returning to _tolower, the next instruction executed is “tst r24”, a test of the contents of register r24. If r24 is not zero, the character is converted to lower case. Either way, the function then returns.

Again, notice the use of extern “C” { } here:

extern "C" {
  unsigned char _isupper(unsigned char c) {
    //bind variable to a specific register r18
    register unsigned char ch asm("r18");
    
    asm (
      "mov  %1, %0 \n" //save input
      "subi %1, 'A'\n" //subtract 0x41
      "brmi 2f     \n" //branch if minus
      "subi %1, 26 \n" //26 letters
      "brpl 2f     \n" //branch if plus
      "ret         \n" //c==upper, return
      "2: clr  %0  \n" //false
      //ch is written by the asm, so it is an output;
      //"d" (r16-r31) is required by subi
      : "+r" (c), "=d" (ch)
    );
    
    return c;
  }
}

char _tolower(unsigned char c) {
  asm (
    "call _isupper \n" //validate char
    "tst r24       \n" //0 = not upper case
    "breq 1f       \n" //not upper case
    "ori %0, 0x20  \n" //make lower
    "1:            \n"
    : "+d" (c)         //ori needs r16-r31
    :
    : "r18"            //r18 is used by _isupper
  );
  
  return c;
}
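
The range check in _isupper is easy to misread, so here is a rough desktop model of both routines. This is a sketch in portable C, not AVR code: the model_ names are hypothetical, bit 7 of the result stands in for the N flag tested by brmi and brpl, and we assume (as the tutorial does) that c sits in r24 both before and after the call. Under that assumption, a rejected character comes back from model_tolower as zero.

```c
#include <stdint.h>

/* Portable models of the two routines (hypothetical names, not AVR code).
   Bit 7 of an 8-bit result stands in for the N (negative) flag. */
uint8_t model_isupper(uint8_t c) {
  uint8_t ch = c;                 /* mov  %1, %0            */
  ch = (uint8_t)(ch - 'A');       /* subi %1, 'A'           */
  if (ch & 0x80) return 0;        /* brmi 2f, then clr %0   */
  ch = (uint8_t)(ch - 26);        /* subi %1, 26            */
  if (!(ch & 0x80)) return 0;     /* brpl 2f, then clr %0   */
  return c;                       /* ret, c still in r24    */
}

uint8_t model_tolower(uint8_t c) {
  uint8_t r24 = model_isupper(c); /* call _isupper          */
  if (r24 == 0) return r24;       /* tst r24 / breq 1f      */
  return (uint8_t)(r24 | 0x20);   /* ori %0, 0x20           */
}
```

After the first subtraction, the letters 'A' through 'Z' map to 0 through 25. Subtracting 26 then turns only those values negative, which is why the second branch uses brpl to reject everything else.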

Insider Information

Why did function _tolower choose to test register r24? The above two functions relied on “insider” information when using register r24. These routines knew that an 8-bit, byte-sized value is passed to and returned from a function via the r24 register. The C compiler always passes function arguments and return values in specific register locations. Knowing these locations is essential to writing efficient inline assembly code, especially when interfacing with the C language.

This is a good time to review the data type sizes: a char is 8 bits, an int is 16 bits, a long is 32 bits, a long long is 64 bits, floats are 32 bits, and pointers are 16 bits (function pointers are word addresses). Arguments are allocated left to right, starting in register r25 and descending through register r8. All arguments are aligned to start in even-numbered registers, so odd-sized arguments, like char, have one free register above them. For example, a single 8-bit value is passed via the r24 register (r25 is left unused), a single 16-bit value is passed via the r25:r24 register pair, and a 32-bit value is passed via the r25:r24:r23:r22 register combination.
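
The allocation rule above can be sketched as a small calculator that runs on a desktop compiler. This is a hedged model, not compiler source: arg_low_reg is a hypothetical name, sizes are given in bytes, and we assume that once an argument no longer fits above r8, it and every later argument are passed on the stack instead.

```c
#include <stddef.h>

/* Model of register allocation for function arguments (hypothetical helper).
   Returns the lowest-numbered register holding argument `index`,
   or -1 if, under this model, it would be passed on the stack. */
int arg_low_reg(const unsigned char *sizes, size_t count, size_t index) {
  int reg = 26;                               /* one past r25 */

  for (size_t i = 0; i < count && i <= index; i++) {
    int size = sizes[i] + (sizes[i] & 1);     /* round odd sizes up to even */
    reg -= size;                              /* allocate downward */
    if (reg < 8) {
      return -1;                              /* assumed: out of registers */
    }
    if (i == index) {
      return reg;                             /* low byte lives here */
    }
  }
  return -1;                                  /* index out of range */
}
```

For a hypothetical function taking (char, int, long), the model places the char in r24, the int in r23:r22, and the long in r21:r20:r19:r18.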

Return values are passed in a similar fashion. An 8-bit value is returned via r24, a 16-bit value in r25:r24, and a 32-bit value in r25:r24:r23:r22. An 8-bit return value may be zero/sign-extended to 16 bits by the called function.

What’s the Use of a Register?

Function “call-used” registers are r18-r27, and r30-r31. Any, or all of these registers may be allocated by the compiler for local data. However, we may use them freely in assembler subroutines. Calling C subroutines can clobber any of them, and the caller is responsible for saving and restoring before and after use.

Function “call-saved” registers are r2-r17, and r28-r29. They may also be allocated by the compiler for local data, but C subroutines leave them unchanged. Assembler subroutines are responsible for saving and restoring any of these registers, if changed. The Y register pair (r29:r28) is used as a frame pointer (pointing to local data placed on the stack) if necessary.

The fixed registers, r0 and r1, are never allocated by the compiler for local data. The temporary register, r0, can be clobbered by any C code (except interrupt handlers, which save it), and may be used freely. The zero register is r1, and it is assumed to always be zero in any C code. It may be used for other purposes within a piece of assembler code, but must then be cleared after use (clr r1). Interrupt handlers save and clear r1 on entry, and restore r1 on exit (in case it was non-zero).
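
As a quick reference, the three groups above can be folded into a lookup function. This is just a sketch for desktop experimentation: the reg_class_t type and the reg_class name are made up for illustration and are not part of any AVR toolchain.

```c
/* Register groups as described in the text (hypothetical names). */
typedef enum {
  REG_FIXED,       /* r0 (temporary) and r1 (zero register) */
  REG_CALL_SAVED,  /* r2-r17 and r28-r29: callee must preserve */
  REG_CALL_USED,   /* r18-r27 and r30-r31: callee may clobber */
  REG_INVALID      /* not one of r0-r31 */
} reg_class_t;

reg_class_t reg_class(unsigned reg) {
  if (reg <= 1)  return REG_FIXED;
  if (reg <= 17) return REG_CALL_SAVED;
  if (reg <= 27) return REG_CALL_USED;
  if (reg <= 29) return REG_CALL_SAVED;   /* Y pair, the frame pointer */
  if (reg <= 31) return REG_CALL_USED;    /* Z pair */
  return REG_INVALID;
}
```

Read this way, r18 (used as scratch space by _isupper) lands in the call-used group, so an assembler routine may change it without saving it first.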

References

AVR 8-bit Instruction Set
AVR-GCC Inline Assembler Cookbook
Extended Asm – Assembler Instructions with C Expression Operands
Mixing C and Assembly Language

About Jim Eli

µC experimenter

1 Response to Arduino Inline Assembly Tutorial (Functions)

  1. Jared Butcher says:

    I have been searching for the information about how to pass arguments and return within the “Insider Information” section for a few hours now. Thank you, you have saved the rest of my hair.
