A Tale of Two Atoi(s)

All due respect to Charles Dickens.

When you call a built-in function, do you ever wonder where it comes from? An Arduino-specific function, like digitalRead or attachInterrupt, comes from the Arduino core libraries. The source code for these functions can be browsed here. However, many standard C library functions can also be used.

What are the standard C library functions? They are a core set of basic C functions, macros, and type definitions for tasks like string handling, math computations, memory management, and many other services. To use one, you typically “include” the respective header file (e.g. #include <stdio.h>), which declares the function; the code itself is “linked” into your program at build time. However, in most cases you don’t need to explicitly include the header files in your Arduino sketch, because the Arduino core header (Arduino.h) already includes the common ones. The reference for the avr-libc library can be found here.

Let’s examine one of the standard library functions and compare it to a version of the same function that we write ourselves. We will use the atoi function for our example. The name atoi is an abbreviation of “ASCII to integer”. The function converts a string to an integer and is declared in the stdlib.h header file:

```
int atoi(const char *s);
```
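As a quick illustration of that behavior, here is a minimal sketch that calls the library atoi. The helper name parse_delay_ms and the command strings are made up for this example; only atoi itself comes from stdlib.h.

```c
#include <stdlib.h>

/* Hypothetical helper for this example: pull a delay value out of a
 * text command such as "  250ms". */
int parse_delay_ms(const char *command)
{
    /* atoi skips leading white space, accepts an optional sign, and
     * stops at the first character that is not a digit. It has no way
     * to report an error: text with no leading number gives 0. */
    return atoi(command);
}
```

With this helper, "  250ms" gives 250, "-17" gives -17, and "fast" gives 0, which cannot be told apart from a genuine "0".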

Here is the C code for a basic atoi function:

```
int16_t atoi(char s[]) {
    uint8_t i;
    int16_t n, sign;

    // skip leading white space
    for (i = 0; s[i] == ' ' || s[i] == '\n' || s[i] == '\t'; i++)
        ;

    // optional sign
    sign = 1;
    if (s[i] == '+' || s[i] == '-')
        sign = (s[i++] == '+') ? 1 : -1;

    // convert digits until the first non-digit
    for (n = 0; s[i] >= '0' && s[i] <= '9'; i++)
        n = 10 * n + (s[i] - '0');

    return (sign * n);
}
```
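Before loading a function like this onto a board, it can be handy to try it on a desktop compiler next to the library version. The sketch below is one way to do that, under a few assumptions of mine: the function is copied under the hypothetical name basic_atoi so it can coexist with the real atoi, it takes const char * so string literals can be passed, and agrees_with_library is an invented helper.

```c
#include <stdint.h>
#include <stdlib.h>

/* Copy of the basic function under a hypothetical name. */
int16_t basic_atoi(const char *s)
{
    uint8_t i;
    int16_t n, sign;

    /* skip leading white space (space, newline, and tab only) */
    for (i = 0; s[i] == ' ' || s[i] == '\n' || s[i] == '\t'; i++)
        ;

    /* optional sign */
    sign = 1;
    if (s[i] == '+' || s[i] == '-')
        sign = (s[i++] == '+') ? 1 : -1;

    /* convert digits until the first non-digit */
    for (n = 0; s[i] >= '0' && s[i] <= '9'; i++)
        n = (int16_t)(10 * n + (s[i] - '0'));

    return (int16_t)(sign * n);
}

/* Hypothetical helper: returns 1 when both versions give the same
 * result for the given text, 0 otherwise. */
int agrees_with_library(const char *text)
{
    return basic_atoi(text) == atoi(text);
}
```

The two versions agree on inputs such as "  42" and "-123abc". They part ways on "\r42": the basic version returns 0 because its loop does not treat a carriage return as white space, while the library returns 42.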

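Simple enough. Below is the assembly listing that results when the above function is included in an Arduino sketch. It comes to 129 bytes of machine code. In the address comments the symbol shows up as _Z4AtoiPc, which is the C++-mangled form of Atoi(char*).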

```
atoi:
movw	r26, r24
ldi 	r18, 0x00	; 0
rjmp	.+2      	; 0x162 <_Z4AtoiPc+0x8>

subi	r18, 0xFF	; 255
movw	r30, r26
ld  	r30, Z
cpi 	r30, 0x20	; 32
breq	.-14     	; 0x160 <_Z4AtoiPc+0x6>
cpi 	r30, 0x0A	; 10
breq	.-18     	; 0x160 <_Z4AtoiPc+0x6>
cpi 	r30, 0x09	; 9
breq	.-22     	; 0x160 <_Z4AtoiPc+0x6>

cpi 	r30, 0x2B	; 43
breq	.+4      	; 0x17e <_Z4AtoiPc+0x24>
cpi 	r30, 0x2D	; 45
brne	.+12     	; 0x18a <_Z4AtoiPc+0x30>

subi	r18, 0xFF	; 255
cpi 	r30, 0x2B	; 43
breq	.+6      	; 0x18a <_Z4AtoiPc+0x30>
ldi 	r22, 0xFF	; 255
ldi 	r23, 0xFF	; 255
rjmp	.+4      	; 0x18e <_Z4AtoiPc+0x34>
ldi 	r22, 0x01	; 1
ldi 	r23, 0x00	; 0
ldi 	r20, 0x00	; 0
ldi 	r21, 0x00	; 0
rjmp	.+38     	; 0x1ba <_Z4AtoiPc+0x60>

movw	r24, r20
ldi 	r31, 0x03	; 3
dec 	r31
brne	.-8      	; 0x198 <_Z4AtoiPc+0x3e>
subi	r20, 0x30	; 48
sbci	r21, 0x00	; 0
mov 	r24, r30
eor 	r25, r25
sbrc	r24, 7
com 	r25

subi	r18, 0xFF	; 255
movw	r30, r26
ld  	r30, Z
mov 	r24, r30
subi	r24, 0x30	; 48
cpi 	r24, 0x0A	; 10
brcs	.-54     	; 0x194 <_Z4AtoiPc+0x3a>
mul 	r20, r22
movw	r18, r0
mul 	r20, r23
mul 	r21, r22
eor 	r1, r1

movw	r24, r18
ret
```

For comparison, the avr-libc atoi function is hand-written in assembly. The highly optimized source code for this function can be found here. Below is the disassembly listing from inside an Arduino sketch. Notice that it results in only 73 bytes of machine code, about 43% smaller than our basic implementation of the function!

```
atoi:
movw	r30, r24
eor 	r24, r24
eor 	r25, r25
clt

ld  	r18, Z+
cpi 	r18, 0x20	; 32
breq	.-6      	; 0x118 <atoi+0x8>
cpi 	r18, 0x09	; 9
brcs	.+4      	; 0x126 <atoi+0x16>
cpi 	r18, 0x0E	; 14
brcs	.-14     	; 0x118 <atoi+0x8>

cpi 	r18, 0x2B	; 43
breq	.+14     	; 0x138 <atoi+0x28>
cpi 	r18, 0x2D	; 45
brne	.+12     	; 0x13a <atoi+0x2a>
set
rjmp	.+6      	; 0x138 <atoi+0x28>

rcall	.+22     	; 0x14a <__mulhi_const_10>

ld  	r18, Z+

subi	r18, 0x30	; 48
cpi 	r18, 0x0A	; 10
brcs	.-14     	; 0x132 <atoi+0x22>

brtc	.+6      	; 0x148 <atoi+0x38>
com 	r25
neg 	r24
sbci	r25, 0xFF	; 255

ret

__mulhi_const_10:
ldi 	r23, 0x0A	; 10
mul 	r25, r23
mov 	r25, r0
mul 	r24, r23
mov 	r24, r0
eor 	r1, r1
ret
```
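To make the listing easier to follow, here is my reading of its control flow written back out as C. This is a paraphrase of the disassembly, not the avr-libc source; the names atoi_from_listing and mul10 are invented for this sketch, and the accumulate step is written the usual way, as result * 10 + digit.

```c
#include <stdint.h>

/* Stands in for __mulhi_const_10: multiply a 16-bit value by 10. */
static int16_t mul10(int16_t value)
{
    return (int16_t)(value * 10);
}

/* Hypothetical name: a C paraphrase of the listing's structure. */
int16_t atoi_from_listing(const char *s)
{
    int16_t result = 0;    /* r25:r24, cleared by the two eor instructions */
    uint8_t negative = 0;  /* the T flag, cleared by clt */
    uint8_t c, digit;

    /* Skip spaces and the control characters 0x09..0x0D with one
     * range check (the cpi 0x09 / cpi 0x0E pair). */
    do {
        c = (uint8_t)*s++;
    } while (c == ' ' || (c >= 0x09 && c < 0x0E));

    /* Optional sign: '-' sets the flag, '+' is simply skipped. */
    if (c == '+' || c == '-') {
        if (c == '-')
            negative = 1;
        c = (uint8_t)*s++;
    }

    /* Subtract '0' first, then a single unsigned compare against 10
     * rejects everything that is not a digit (subi / cpi / brcs). */
    digit = (uint8_t)(c - '0');
    while (digit < 10) {
        result = (int16_t)(mul10(result) + digit);
        digit = (uint8_t)(*s++ - '0');
    }

    /* The sign is applied once at the end (com / neg / sbci). */
    if (negative)
        result = (int16_t)-result;
    return result;
}
```

Written this way, two of the size savings stand out: the digit test is one subtraction and one compare instead of two compares, and the sign is a single flag applied at the end rather than a 16-bit variable that needs a full multiply. The white-space check is also broader than in the basic version, since it accepts vertical tab, form feed, and carriage return as well.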

You can browse the entire avr-libc source code here.

I suggest the following links for further information on alternative standard C libraries:

Newlib – A small C library including an equally small math library designed for use in embedded applications.

uClibc – Probably the best known and most complete alternate library designed for embedded systems.