Fixed Point Math on the Arduino Platform

Fixed-Point Math
Here is an overly simplified comparison of 24.8 fixed point vs. floating math on the AVR168/328 chip. I compared the elapsed time of the following two multiplications running at 16Mhz:

#define FIXED_BITS        32
#define FIXED_WBITS       24
#define FIXED_FBITS       8
#define FIXED_TO_INT(a)   ((a) >> FIXED_FBITS)
#define FIXED_FROM_INT(a) (int32_t)((a) << FIXED_FBITS)
#define FIXED_MAKE(a)     (int32_t)((a*(1 << FIXED_FBITS)))

int32_t a, b, c;
float x, y, z;

static int32_t FIXED_Mul(int32_t a, int32_t b) {
  return(((int32_t)a*(int32_t)b) >> FIXED_FBITS);
int main(void) {
  //floating point
  x = 8.0;  //11 clock cycles
  y = 2.5;  //12 clock cycles
  z = x*y;  //1651 clock cycles (104.69us)
  //fixed point
  a = FIXED_MAKE(8.0);  //11 clock cycles
  b = FIXED_MAKE(2.5);  //12 clock cycles
  c = FIXED_Mul(a, b);  //175 clock cycles (10.94us)
  return 0;

That’s nearly 9.5 times faster. However, my comparison is not entirely without deception. Note the multiply function is using two uint32_t size values. This nearly guarantees the multiplication will overflow for all but the smallest numbers. The multiply should promote the values to uint64_t in order to prevent an overflow. A 64-bit multiply results in an extremely slow function, much slower than the heavily optimized internal assembly language routine found inside the GCC library. Morale of the story is, you better know what you’re doing if you want to improve upon the optimized math already used by the compiler.

Here is a comparison of float-point and fixed-point math on an AVR.

Here is some more interesting information I found on this topic.

1. avrfix: fixed point library.

2. libfixmath: cross platform fixed point routines in C.

3. fixedptc: fixed point math library for C.

4. GNU MPFR: C library for multiple-precision floating-point computations.

More to follow…

Follow up posts are here and here.

And see my posting here for basic arduino compatible routines.

About Jim Eli

µC experimenter
This entry was posted in Uncategorized and tagged , , , , , . Bookmark the permalink.

11 Responses to Fixed Point Math on the Arduino Platform

  1. I have a hunch some important things you left out or I’m out on some things. I did the test using your code and the Arduino function ‘micros()’ and both floating and fixed point gave me the same measure of timelapse.

    • Jim Eli says:

      The default compiler optimization level on the arduino will alter the code and make the comparison appear oddly incorrect.

      • That’s funny, ok then. What is true is that when you don’t use results of calculations by outputting them via whatever means the compiler discards code. I use the default compiler and it appears that the
        ‘#define FIXED_MAKE(a) (int32_t)((a*(1 << FIXED_FBITS)))
        isn't even usable in the execution because the decimal fraction is truncated. So to undo the truncation the '(int32_t)((a*(1 << FIXED_FBITS)))' had to be changed to 'a*(pow(2,FIXED_FBITS))'. Since the input value 'a' is a float type there's no use to cast it into a int32_t type to enable the bit shifting since that will cut off the fractional part in the process. Tested and observed. Anyway, so does it happen with the default compiler and an ATMega168 on a duemilanove.

        So how did you measure the microseconds then?

  2. Jim Eli says:

    If I remember correctly, with the default arduino optimization level, the compiler pre-calculates these values, eliminates the actual math and simply inserts constants into the output code. I think I timed it with a Saleae Logic Analyzer by toggling a pin before/after the relevant code sections.

    • Hmmm…that’s strange. We’re dealing with variables right? So how can the compiler decide for the resulting constant out of the formula (math) if we’re dealing with a variable in that same formula that will change during execution time rendering the pre-calculates these value ‘null and void’?

  3. Jim Eli says:

    For example, here is what the optimizing compiler does:

    uint8_t a, b, c;

    void loop() {
    a = 1;
    b = 2;
    c = a + b;

    Note in the following disassembly, the answer was pre-computed, no math is performed, and a constant (not the variable) is sent to the serial print function:

    ldi r24, 0x01 ; 1
    sts 0x0110, r24
    ldi r24, 0x02 ; 2
    sts 0x0111, r24
    ldi r24, 0x03 ; 3
    sts 0x0112, r24
    ldi r24, 0x9B ; 155
    ldi r25, 0x01 ; 1
    ldi r22, 0x03 ; 3
    ldi r20, 0x0A ; 10
    ldi r21, 0x00 ; 0
    call 0x6e4 ; 0x6e4

    • No problem. But the latter example given is not applicable to the main subject presented above I believe to be sure. The example doesn’t have decimal fractions. I do understand what is happening. Once the compiler figures out the ‘variables’ will not change during execution time it steers its optimasation to that fact. And the compiler doesn’t need to figure that out if we use #define to create constants in which also math may be applied. That’s clear to us. The compiler won’t even allow mixing defined constants with variables if the code defies the change of these variables during execution time.

      In the code above we have:
      a = FIXED_MAKE(8.0); //11 clock cycles
      b = FIXED_MAKE(2.5); //12 clock cycles
      c = FIXED_Mul(a, b); //175 clock cycles (10.94us)
      which is doing nothing since:
      #define FIXED_MAKE(a) (int32_t)((a*(1 << FIXED_FBITS)))
      isn't doing anything neither. You cast ((a*(1 << FIXED_FBITS))) in an integer and as this happens all decimal fractions are gone, zip, nullified, truncated, goodbyed, tataa!

      Notwithstanding your code has led me to scrutinize the fixed point subject with Arduino and has helped me somehow because I've tested it. For what counts for me is that the variables must be able to change via external (realworld) elements during execution as part of the quality (speed) of fixed point calculation coding.

      For now I use the default compiler coming with the Arduino IDE 1.0.5. I'll get to Eclipse later to have a view behind the curtains.

      • Jim Eli says:

        “You cast ((a*(1 << FIXED_FBITS))) in an integer and as this happens all decimal fractions are gone, zip, nullified, truncated, goodbyed, tataa!"

        Your statement doesn't make sense. I suggest you read up on fixed point math. The above program works with 24.8 fixed point non-integer values, and the multiply/shift/cast converts a float to a fixed point value as expected.

  4. Jim Eli is living in another dimension. His example code above doesn’t do anything. It’s dummy code doing nothing. And he is still claiming it does what it does: fixed point calculations. Dellusional.

    • Jim Eli says:

      Here is an extended example of fixed point math. Refer to the code below, in which I have inserted some serial print functions so an individual might closely examine what happens:

      #define FIXED_BITS        32
      #define FIXED_WBITS       24
      #define FIXED_FBITS       8
      #define FIXED_TO_INT(a)   ((a) >> FIXED_FBITS)
      #define FIXED_FROM_INT(a) (int32_t)((a) << FIXED_FBITS)
      #define FIXED_MAKE(a)     (int32_t)((a*(1 << FIXED_FBITS)))
      #define FIXED_ONE         ((int32_t)((int32_t)1 << FIXED_FBITS))
      #define FIXED_TO_FLOAT(a) (float) a / FIXED_ONE
      int32_t a, b, c;
      static int32_t FIXED_Mul(int32_t a, int32_t b) {
        return(((int32_t)a*(int32_t)b) >> FIXED_FBITS);
      void setup(void) { 
        a = FIXED_MAKE(3.5);
        Serial.print("whole part = "); Serial.println((a&0xffffff00)); 
        Serial.print("whole part /256 = "); Serial.println((a&0xffffff00)>>8); 
        Serial.print("fractional part = "); Serial.println(a&0xff);
        Serial.print("fractional part /256 = "); Serial.println((float)(a&0xff)/256);
        b = FIXED_MAKE(2.5);
        Serial.print("whole part = "); Serial.println((b&0xffffff00)); 
        Serial.print("whole part /256 = "); Serial.println((b&0xffffff00)>>8); 
        Serial.print("fractional part = "); Serial.println(b&0xff);
        Serial.print("fractional part /256 = "); Serial.println((float)(b&0xff)/256);
        c = FIXED_Mul(a, b);
        Serial.print("whole part = "); Serial.println((c&0xffffff00)); 
        Serial.print("whole part /256 = "); Serial.println((c&0xffffff00)>>8); 
        Serial.print("fractional part = "); Serial.println(c&0xff);
        Serial.print("fractional part /256 = "); Serial.println((float)(c&0xff)/256);
        Serial.print("c = ");  Serial.println(FIXED_TO_FLOAT(c));
      void loop(void) {

      Here is the resultant printout when the above code is run on arduino:

      a = 896
      whole part = 768
      whole part /256 = 3
      fractional part = 128
      fractional part /256 = 0.50
      b = 640
      whole part = 512
      whole part /256 = 2
      fractional part = 128
      fractional part /256 = 0.50
      c = 2240
      whole part = 2048
      whole part /256 = 8
      fractional part = 192
      fractional part /256 = 0.75 
      c = 8.75


      3.5 * 2.5 = 8.75

      In 24.8 fixed point (in which we multiply the whole number and fractional part by 256):

      (3 * 256) + (0.5 * 256) = 768 + 128 = 896 
      (2 * 256) + (0.5 * 256) = 512 + 128 = 640
      (896 * 640) / 256 = 2240
      2240 / 256 = 8.75

      And that is how fixed point math works for the deluded.

  5. Pingback: Embed with Elliot: Keeping it Integral | Hackaday

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s