Arduino Inline Assembly Tutorial (Bit Shifts)


Introduction

In this tutorial we continue our examination of bit operations. Specifically we discuss left and right shifts. We’ll end this tutorial with an assortment of short C language MACRO code segments that demonstrate checking bits.

Shift to Multiply

The instruction that performs a left bit shift (try saying that three times fast) is LSL. LSL is a mnemonic for Logical Shift Left, which shifts all bits inside the register one place to the left. Bit 0 is cleared, and bit 7 is loaded into the C flag of the Status Register (SREG). This operation effectively multiplies signed and unsigned values by two, provided the result still fits in the register.

Why would we want to move each bit in a register one place to the left? Let’s look at the example of left-shifting the value of three:

volatile uint8_t foo = 0;

asm (
  "ldi %0, 3 \n" //foo = 0b00000011 = 3
  "lsl %0    \n" //foo = 0b00000110 = 3x2 = 6
  "lsl %0    \n" //foo = 0b00001100 = 3x2x2 = 12
  : "=d" (foo)   //"d" = r16-r31, which LDI requires
);

Each left shift is also the equivalent of multiplying by 2. If we want to multiply something by eight, we would left-shift it three times, and so on.
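For readers without a board at hand, the effect of LSL can be sketched in plain C. This is only a model of the instruction, not AVR code: the function names shift_left_model and shift_left_n_model are hypothetical, and the C flag is represented by an ordinary variable.

```c
#include <stdint.h>

/* Portable model of one AVR LSL step (hypothetical helper, not AVR code).
   Bit 7 is copied to the carry, then every bit moves one place left. */
static uint8_t shift_left_model(uint8_t reg, uint8_t *carry)
{
    *carry = (uint8_t)(reg >> 7);   /* old bit 7 goes to the C flag */
    return (uint8_t)(reg << 1);     /* bit 0 is cleared by the shift */
}

/* Multiply by 2^count by repeating the single-bit shift,
   the same way the tutorial repeats the LSL instruction. */
static uint8_t shift_left_n_model(uint8_t reg, uint8_t count, uint8_t *carry)
{
    *carry = 0;
    while (count--)
        reg = shift_left_model(reg, carry);
    return reg;
}
```

Calling shift_left_n_model(3, 2, &c) gives 12, matching the listing above. Shifting 200 once gives 144 with the carry set, which shows why the multiply only holds while the result fits in eight bits.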

LSL has a cousin named ROL, or ROtate Left through carry. ROL shifts all bits one place to the left. The C flag is shifted into bit 0, and bit 7 is shifted into the C flag.

Since LSL always shifts a zero into the lowest bit, and ROL shifts the contents of the carry flag into the lowest bit, using both allows us to multiply numbers larger than one byte. To multiply a 16-bit number by two, first LSL the lower byte, then ROL the high byte. This has the net effect of “rolling” the high bit of the lower byte into the first bit of the 2nd byte. This technique can be expanded to multiply even larger numbers.

Multiplying a 32-bit number by two:

volatile uint32_t foo = 0x00f0f0f0;

asm (
  "lsl %A0      \n" //foo = 0x00f0f0f0 (15790320) x2 = 0x01e1e1e0 (31580640)
  "rol %B0      \n" //bit shifted out of %A0 rolls into %B0
  "rol %C0      \n" //bit shifted out of %B0 rolls into %C0
  "rol %D0      \n" //bit shifted out of %C0 rolls into %D0
  : "+r" (foo)
);
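The 16-bit case described earlier (LSL the lower byte, then ROL the high byte) can be traced with a small C model. This is a sketch only: lsl8, rol8 and double16_model are hypothetical names, and the carry flag is passed around as a plain variable rather than living in SREG.

```c
#include <stdint.h>

/* Model of LSL: bit 7 to carry, zero into bit 0 */
static uint8_t lsl8(uint8_t reg, uint8_t *carry)
{
    *carry = (uint8_t)(reg >> 7);
    return (uint8_t)(reg << 1);
}

/* Model of ROL: old carry into bit 0, bit 7 to carry */
static uint8_t rol8(uint8_t reg, uint8_t *carry)
{
    uint8_t old_carry = *carry;
    *carry = (uint8_t)(reg >> 7);
    return (uint8_t)((reg << 1) | old_carry);
}

/* Double a 16-bit value one byte at a time, low byte first */
static uint16_t double16_model(uint16_t value)
{
    uint8_t carry = 0;
    uint8_t lo = (uint8_t)(value & 0xFF);
    uint8_t hi = (uint8_t)(value >> 8);

    lo = lsl8(lo, &carry);   /* high bit of the low byte lands in carry */
    hi = rol8(hi, &carry);   /* carry rolls into bit 0 of the high byte */

    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

With an input of 0x00F0, the low byte becomes 0xE0 and its high bit rolls into the high byte, giving 0x01E0. The 32-bit listing above is the same idea with two more ROL steps.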

Right Shift to Divide and Conquer

The right-shift operation moves each bit in a register, one place to the right. Why would we want to do this? Again, examine this example of right-shifting the value of twelve:

volatile uint8_t foo;

asm (
  "ldi %0, 12 \n" //foo = 0b00001100 = 12
  "lsr %0     \n" //foo = 0b00000110 = 12/2 = 6
  "lsr %0     \n" //foo = 0b00000011 = 12/4 = 3
  : "=d" (foo)    //"d" = r16-r31, which LDI requires
);

Each time we shift an unsigned value one bit to the right, we are dividing that value by two and discarding the remainder. If we wanted to divide something by eight, we would right-shift three times.

The Logical Shift Right instruction (LSR) shifts every bit one place to the right: a zero enters the highest bit, and the lowest bit moves into the carry flag. The C flag can then be used to round the result.
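To make the rounding remark concrete, here is a small C model of LSR followed by adding the carry back in. It is a sketch under assumptions: lsr_model and halve_rounded are hypothetical names, and on a real AVR the final addition would be done with an add-with-carry instruction rather than C arithmetic.

```c
#include <stdint.h>

/* Model of LSR: bit 0 to carry, zero into bit 7 */
static uint8_t lsr_model(uint8_t reg, uint8_t *carry)
{
    *carry = (uint8_t)(reg & 1u);   /* the bit that falls off the end */
    return (uint8_t)(reg >> 1);
}

/* Divide by two, rounding halves up by adding the carry to the result */
static uint8_t halve_rounded(uint8_t value)
{
    uint8_t carry;
    uint8_t half = lsr_model(value, &carry);
    return (uint8_t)(half + carry);
}
```

Here 12 halves to 6 with the carry clear, while 13 halves to 6 with the carry set, so the rounded result is 7.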

LSR always shifts a zero into the highest bit, while ROR (ROtate Right through carry) shifts the contents of the carry flag into the highest bit. Using both, we can divide numbers larger than one byte. To divide a 16-bit number by two, first LSR the high byte, then ROR the lower byte. This has the net effect of “rolling” the low bit of the high byte into the high bit of the lower byte. This technique can be expanded to divide even larger numbers.

Dividing a 32-bit number by two:

volatile uint32_t foo = 0x01e1e1e0;

asm (
  "lsr %D0      \n" //foo = 0x01e1e1e0 (31580640) /2 = 0x00f0f0f0 (15790320)
  "ror %C0      \n" //bit shifted out of %D0 rolls into %C0
  "ror %B0      \n" //bit shifted out of %C0 rolls into %B0
  "ror %A0      \n" //bit shifted out of %B0 rolls into %A0
  : "+r" (foo)
);
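The multi-byte divide works from the top down, the mirror image of the multiply. A C model of the 16-bit case might look like the following. The names lsr_byte, ror_byte and halve16_model are hypothetical, and the carry is again an ordinary variable standing in for the C flag.

```c
#include <stdint.h>

/* Model of LSR: bit 0 to carry, zero into bit 7 */
static uint8_t lsr_byte(uint8_t reg, uint8_t *carry)
{
    *carry = (uint8_t)(reg & 1u);
    return (uint8_t)(reg >> 1);
}

/* Model of ROR: old carry into bit 7, bit 0 to carry */
static uint8_t ror_byte(uint8_t reg, uint8_t *carry)
{
    uint8_t old_carry = *carry;
    *carry = (uint8_t)(reg & 1u);
    return (uint8_t)((reg >> 1) | (uint8_t)(old_carry << 7));
}

/* Halve a 16-bit value one byte at a time, high byte first */
static uint16_t halve16_model(uint16_t value)
{
    uint8_t carry = 0;
    uint8_t lo = (uint8_t)(value & 0xFF);
    uint8_t hi = (uint8_t)(value >> 8);

    hi = lsr_byte(hi, &carry);   /* low bit of the high byte lands in carry */
    lo = ror_byte(lo, &carry);   /* carry rolls into bit 7 of the low byte */

    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

Feeding in 0x01E0 returns 0x00F0, which undoes the doubling from the multiply section.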

Note that we are not covering general multiplication and division operations. That topic is beyond the scope of this tutorial. Furthermore, it would be very difficult to improve upon the highly efficient routines found inside the Arduino/AVR library. Additional information on general integer multiply and divide routines can be found in ATMEL Application Note 200.

Let’s Jump

Now we introduce the concept of “jumping.” The RJMP (Relative JuMP) instruction jumps to an address within roughly 2K words (4 KB) of the current program counter, in either direction. In assembly, we don’t calculate the relative operand portion of the jump, or the number of instructions to jump over. We use a label as the destination of our jump and let the assembler/linker calculate the distance for us.

A jump instruction causes the μC to start execution of instructions at a location defined by the label. The label’s location can be anywhere (forward, backward, and even in the same place) within the distance limits defined. However, for inline assembly, the label must be located inside the same block of inline code.

Take a close look at the following code. It does nothing except cause execution to jump between the two labels, “here” and “there”. The 3 “NOP” instructions in between are never executed.

here:
rjmp there

nop
nop
nop

there:
rjmp here
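To see the arithmetic the assembler does for us, the C sketch below computes the RJMP operand from two word addresses. It relies on the instruction set definition (PC becomes PC + k + 1, with k a signed 12-bit value stored in the low bits of the opcode 1100 kkkk kkkk kkkk). The function name rjmp_encode and the example addresses are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* Compute the RJMP opcode for a jump from word address pc to word
   address target. Returns false when the target is out of range.
   (Hypothetical helper; the assembler normally does this for us.) */
static bool rjmp_encode(int32_t pc, int32_t target, uint16_t *opcode)
{
    int32_t k = target - (pc + 1);      /* offset is relative to the next instruction */

    if (k < -2048 || k > 2047)
        return false;                   /* does not fit in 12 signed bits */

    *opcode = (uint16_t)(0xC000u | ((uint16_t)k & 0x0FFFu));
    return true;
}
```

If we assume “here” sits at word address 0, then “there” sits at word address 4 (one RJMP plus three NOPs, each one word long). The first jump encodes k = 3 and the second encodes k = -5, giving opcodes 0xC003 and 0xCFFB.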

You’re Both Write

In a previous tutorial we demonstrated a routine that operates like the Arduino’s digitalWrite function. It compares the output value with 0 (CPI) and uses a conditional branch (BREQ) to jump to the clear bit instruction (CBI) when the two are equal. Otherwise it executes the set bit instruction (SBI), then uses an unconditional jump (RJMP) to skip over the CBI. Here it is again:

//digitalWrite(output)
volatile uint8_t output = HIGH; //LOW or HIGH

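asm (
  "cpi %2, 0     \n" //is output LOW?
  "breq 1f       \n" //yes, branch forward to label 1
  "sbi %0, %1    \n" //no, set the bit
  "rjmp 2f       \n" //and jump over the cbi
"1:              \n"
  "cbi %0, %1    \n" //clear the bit
"2:              \n"
  : : "I" (_SFR_IO_ADDR(PORTB)), "I" (PORTB5), "d" (output) //"d" = r16-r31, which CPI requires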
);
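The control flow of that routine can be followed in portable C by using goto in place of the branch and jump instructions. This is only a model for reading along: write_pin_model is a hypothetical name, and the port is an ordinary byte passed in and returned rather than a real I/O register.

```c
#include <stdint.h>

/* Model of the digitalWrite-style routine, with one goto per jump.
   Returns the new value of the (simulated) port byte. */
static uint8_t write_pin_model(uint8_t port, uint8_t bit, uint8_t output)
{
    if (output == 0)                    /* cpi %2, 0 / breq 1f */
        goto clear_bit;

    port |= (uint8_t)(1u << bit);       /* sbi: set the bit */
    goto done;                          /* rjmp 2f: skip the clear */

clear_bit:                              /* label 1 */
    port &= (uint8_t)~(1u << bit);      /* cbi: clear the bit */

done:                                   /* label 2 */
    return port;
}
```

Only one of the two bit operations ever runs: a zero output lands on the clear, and anything else runs the set and then jumps past the clear.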

Bit Checking MACROs

These C language MACROs are useful for checking the status of particular bits. The first two use relative jump instructions. Newer, C-language versions of these MACROs are defined inside the file “sfr_defs.h”.

#define _loop_until_bit_is_set(port, bit)           \
  asm volatile (                                    \
    "L_%=: sbis %0, %1 \n"                          \
    "rjmp L_%=         \n"                          \
    : : "I" ((uint8_t)(port)), "I" ((uint8_t)(bit)) \
)

#define _loop_until_bit_is_clear(port, bit)         \
  asm volatile (                                    \
    "L_%=: sbic %0, %1 \n"                          \
    "rjmp L_%=         \n"                          \
    : : "I" ((uint8_t)(port)), "I" ((uint8_t)(bit)) \
)

#define _bit_is_clear(port, bit) ({               \
  uint8_t t;                                      \
  asm volatile (                                  \
    "clr %0      \n"                              \
    "sbis %1, %2 \n"                              \
    "inc %0      \n"                              \
    : "=r" (t)                                    \
    : "I" ((uint8_t)(port)), "I" ((uint8_t)(bit)) \
  );                                              \
  t;                                              \
})

#define _bit_is_set(port, bit)                    \
  ({ uint8_t t;                                   \
  asm volatile (                                  \
    "clr %0      \n"                              \
    "sbic %1, %2 \n"                              \
    "inc %0      \n"                              \
    : "=r" (t)                                    \
    : "I" ((uint8_t)(port)), "I" ((uint8_t)(bit)) \
  );                                              \
  t; })

An example using the above MACROs:


Wait_Until_Clear:
  //pause here in a loop, checking until the bit is clear
  if ( _bit_is_set(_SFR_IO_ADDR(PORTB), PORTB5) )
    goto Wait_Until_Clear;

  //or, this which performs the same thing
  _loop_until_bit_is_clear(_SFR_IO_ADDR(PORTB), PORTB5);
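For comparison, bit checks of this kind can also be written in plain C with a mask and a test. The sketch below is written in that spirit but is not copied from “sfr_defs.h”. The names BIT_MASK, reg_bit_is_set, reg_bit_is_clear and reg_loop_until_bit_is_clear are hypothetical, and they operate on any volatile byte so the idea can be tried off-target.

```c
#include <stdint.h>

/* Hypothetical C-only bit checks, operating on any volatile byte */
#define BIT_MASK(bit)               (1u << (bit))

/* Non-zero when the bit is set (the mask value, not necessarily 1) */
#define reg_bit_is_set(reg, bit)    ((reg) & BIT_MASK(bit))

/* 1 when the bit is clear, 0 otherwise */
#define reg_bit_is_clear(reg, bit)  (!((reg) & BIT_MASK(bit)))

/* Spin until the bit reads as clear; the do/while keeps the
   MACRO usable as a single statement followed by a semicolon */
#define reg_loop_until_bit_is_clear(reg, bit) \
  do { } while (reg_bit_is_set(reg, bit))
```

One difference from the assembly versions above: reg_bit_is_set yields the mask value rather than exactly 1, so it should be tested for zero or non-zero, not compared against 1.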

C MACROs offer a convenient way to insert assembly code into your program, but they demand extra care because a MACRO expands into a single logical line. Writing a proper C MACRO is an art. If you want to reuse your assembly language parts, it is useful to define them as MACROs and put them into include files. AVR Libc comes with a bunch of them, which can be found in the directory “avr/include.” You may want to peruse these files and study a few examples.

Do note that a problem with reused MACROs arises if you are using labels. In such cases you may make use of the special pattern %=, which is replaced by a unique number in each asm statement. As in the above _loop_until_bit_is_clear MACRO, notice how the relative jump label is both defined ( L_%=: ) and referenced ( rjmp L_%= ). When the MACRO is used for the first time, for example, L_%= may be translated to L_1404, while the next usage might create L_1405. This results in each instance of the label becoming unique.

Reference

AVR 8-bit Instruction Set
AVR-GCC Inline Assembler Cookbook
Extended Asm – Assembler Instructions with C Expression Operands
AVR Bit and Bit-Test Instructions
ATMEL Application Note 200
AVR200 and AVR201 as inline assembly
MACROs and the GCC Preprocessor

Also available as a book, with greatly expanded coverage!


About Jim Eli

µC experimenter