AVR Hex File Dissection (or, why is my hex file so big?)

If you compile the blinky example program for an Arduino Uno, you might notice that the resulting hex file, which is essentially the machine code that gets loaded into the Arduino, is 2,918 bytes in size.

[Screenshot: blinky hex file size]

Yet, immediately after compilation, the Arduino IDE reports a program size of only 1,030 bytes. The hex file is almost three times larger than this.


Sketch uses 1,030 bytes (3%) of program storage space. Maximum is 32,256 bytes.
Global variables use 9 bytes (0%) of dynamic memory, leaving 2,039 bytes for local variables. Maximum is 2,048 bytes.

Does that mean the Arduino is loaded with a much larger program than the IDE claims?

Hex Editor
Let's examine the hex file contents with a hex editor (like HxD) to see if we can find the reason. Here is a screenshot of the beginning of the blinky hex file:

[Screenshot: blinky hex file in a hex editor]

Records
The first thing we notice is that the hex file consists of 45-byte records, except at the very end of the file, where the records are shorter. Here are the first 14 records of the file (the trailing ".." stands for the two line-ending bytes):


:100000000C945C000C946E000C946E000C946E00CA..
:100010000C946E000C946E000C946E000C946E00A8..
:100020000C946E000C946E000C946E000C946E0098..
:100030000C946E000C946E000C946E000C946E0088..
:100040000C9488000C946E000C946E000C946E005E..
:100050000C946E000C946E000C946E000C946E0068..
:100060000C946E000C946E00000000080002010069..
:100070000003040700000000000000000102040863..
:100080001020408001020408102001020408102002..
:10009000040404040404040402020202020203032E..
:1000A0000303030300000000250028002B000000CC..
:1000B0000000240027002A0011241FBECFEFD8E043..
:1000C000DEBFCDBF21E0A0E0B1E001C01D92A930AC..
:1000D000B207E1F70E94F1010C9401020C940000B8..

Here is the first record with spaces added between the component parts (or fields):


: 10 0000 00 0C945C000C946E000C946E000C946E00 CA ..

Each record begins with a RECORD MARK field containing 3A, which is the ASCII code for the colon (“:”) character.

The next two characters form the RECLEN field, which specifies the number of data bytes in the record. The maximum value of the RECLEN field is hexadecimal ’FF’, or 255. Here, the length is hexadecimal 10, which is decimal 16. Note that each data byte is written as two ASCII characters, so these 16 data bytes occupy 32 bytes of the file.

The next four characters are the LOAD OFFSET field, which specifies the 16-bit starting offset at which the data bytes are loaded. Since this is the first record in the file, the load offset is 0000. The following record continues 16 bytes later, so its load offset is 0010 (hex).

The next field specifies the record type. This RECTYP field is used to interpret the remaining information within the record. The RECTYP of this record is “00”, which indicates a data record. Valid record types are:
’00’ Data Record
’01’ End of File Record
’02’ Extended Segment Address Record
’03’ Start Segment Address Record
’04’ Extended Linear Address Record
’05’ Start Linear Address Record

The next 32 characters are the actual machine code of the program: 16 bytes, each written as two hexadecimal digits. This is the data that is loaded into the Arduino's flash memory.

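The last field, ‘CA’, is a checksum, followed by the ASCII carriage return/line feed characters “0D 0A”. The checksum is the two's complement of the low byte of the sum of all the preceding bytes in the record, so the bytes of a valid record, checksum included, add up to zero modulo 256.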
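The field layout described above can be checked with a few lines of C. The following is only a minimal sketch: the names hex_record, hex_byte and hex_parse_record are made up for this illustration, and the function expects one record with the trailing CR/LF already stripped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One decoded Intel HEX record (hypothetical type, named after the fields above). */
typedef struct {
  uint8_t  reclen;      /* RECLEN: number of data bytes */
  uint16_t load_offset; /* LOAD OFFSET: 16-bit load address */
  uint8_t  rectyp;      /* RECTYP: 0x00 = data, 0x01 = end of file, ... */
  uint8_t  data[255];   /* decoded data bytes */
  uint8_t  checksum;    /* checksum byte as stored in the record */
} hex_record;

/* Convert one ASCII hex digit to its value, or -1 if it is not a hex digit. */
static int hex_nibble(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return -1;
}

/* Decode two ASCII characters into one byte, or -1 on a bad digit. */
static int hex_byte(const char *s) {
  int hi = hex_nibble(s[0]);
  int lo = (hi < 0) ? -1 : hex_nibble(s[1]);
  return (lo < 0) ? -1 : ((hi << 4) | lo);
}

/* Parse one record. Returns 0 on success, -1 if the text is malformed,
   -2 if the checksum does not match. */
int hex_parse_record(const char *line, hex_record *rec) {
  assert(line != NULL && rec != NULL);
  size_t len = strlen(line);
  if (len < 11 || line[0] != ':') return -1;      /* shortest record: ":LLAAAATTCC" */
  int n = hex_byte(line + 1);                     /* RECLEN */
  if (n < 0 || len < 11 + 2 * (size_t)n) return -1;

  uint8_t raw[4 + 255 + 1];
  size_t total = 4 + (size_t)n + 1;               /* reclen, offset (2), rectyp, data, checksum */
  uint8_t sum = 0;
  for (size_t i = 0; i < total; i++) {
    int b = hex_byte(line + 1 + 2 * i);
    if (b < 0) return -1;
    raw[i] = (uint8_t)b;
    sum = (uint8_t)(sum + b);                     /* running sum modulo 256 */
  }
  rec->reclen = raw[0];
  rec->load_offset = (uint16_t)((raw[1] << 8) | raw[2]);
  rec->rectyp = raw[3];
  memcpy(rec->data, raw + 4, (size_t)n);
  rec->checksum = raw[total - 1];
  return (sum == 0) ? 0 : -2;                     /* valid records sum to zero */
}
```

Feeding it the first record, ":100000000C945C000C946E000C946E000C946E00CA", should give reclen 0x10, load offset 0x0000, record type 0x00 and a checksum that verifies.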

Disassembly
Here we see the data from the first record (which I divided up into 4-byte chunks) in a disassembly of the program:

0C945C00 0C946E00 0C946E00 0C946E00

00000000 <__vectors>:
   0:	0c 94 5c 00 	jmp	0xb8	; 0xb8 <__ctors_end>
   4:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
   8:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
   c:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
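
Note that the bytes 5c 00 in the first instruction do not spell 0xb8 directly. AVR flash is addressed in 16-bit words, so the JMP operand holds the word address 0x005c, and the disassembler prints the byte address, which is twice that. A small sketch of the decoding, using the JMP encoding from the AVR instruction set manual (the function name avr_jmp_target is hypothetical):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 4-byte AVR JMP instruction as it sits in flash (two little-endian
   16-bit words). Returns the byte address of the jump target, or -1 if the
   bytes are not a JMP. Encoding: 1001 010k kkkk 110k  kkkk kkkk kkkk kkkk */
int32_t avr_jmp_target(const uint8_t insn[4]) {
  assert(insn != NULL);
  uint16_t w0 = (uint16_t)(insn[0] | (insn[1] << 8));
  uint16_t w1 = (uint16_t)(insn[2] | (insn[3] << 8));
  if ((w0 & 0xFE0E) != 0x940C)
    return -1;                                   /* opcode bits do not match JMP */
  /* upper 6 bits of k live in w0 (bits 8..4 and bit 0), lower 16 bits in w1 */
  uint32_t k_high = (uint32_t)((((w0 >> 4) & 0x1F) << 1) | (w0 & 1));
  uint32_t k = (k_high << 16) | w1;              /* word address */
  return (int32_t)(k * 2);                       /* byte address */
}
```

For the bytes 0c 94 5c 00 this yields 0xb8, and for 0c 94 6e 00 it yields 0xdc, matching the disassembly.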

File Size Math
So, each of the program’s 1,030 bytes is stored as two ASCII characters, which requires a total of 2,060 bytes. These 2,060 bytes are stored in 32-byte sections, one per record (2060 / 32 = 64.375, which rounds up to 65 records), and each record adds 9 bytes of header and 4 bytes of footer (65 * 13 = 845). Finally, a 13-byte “end of file record” follows. Adding all of this together yields 2,918 (2060 + 845 + 13).
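
The same arithmetic as a small C function, for checking other program sizes. This is a sketch that assumes the file contains nothing but data records and a single end of file record (no extended address records) and that lines end in CR/LF; the name ihex_file_size is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Estimate the size in bytes of an Intel HEX file holding program_bytes of
   data, split into records of at most bytes_per_record data bytes each. */
size_t ihex_file_size(size_t program_bytes, size_t bytes_per_record) {
  assert(bytes_per_record > 0 && bytes_per_record <= 255);
  const size_t overhead = 9 + 4;   /* ":LLAAAATT" header + checksum and CR/LF footer */
  const size_t eof_record = 13;    /* ":00000001FF" plus CR/LF */
  /* number of data records, rounded up so a partial record still counts */
  size_t records = (program_bytes + bytes_per_record - 1) / bytes_per_record;
  return 2 * program_bytes + records * overhead + eof_record;
}
```

With 1,030 program bytes and 16 data bytes per record, this comes to 2,918.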

References
https://en.wikipedia.org/wiki/Intel_HEX
http://microsym.com/editor/assets/intelhex.pdf

Of course, there are more file formats than just Intel HEX (“:ihex”) in use with the AVR architecture. A few notable ones are raw binary (little-endian byte order, in the case of the flash ROM data), binary, Motorola S-record, and ELF. Google is your friend.

STM32F411RE Nucleo PCD8544/Nokia5110 Simplistic Chronograph (Timer) Program

A very simplistic timer implemented on an STM32 Nucleo board using a PCD8544 controller/Nokia 5110 LCD for display output. The display uses a memory buffer to construct the screen display, and sends the complete buffer to the LCD instead of drawing directly on the LCD screen. This method would be suitable for displaying video or gaming.

A finite state machine for the main loop results in rather compact code:

  while(1) {
    if (!ignore_input) {
      //button press?			
      if ((GPIOC->IDR & GPIO_Pin_13) == (uint32_t)Bit_RESET) {
        ignore_input = 100;  //start debounce
        switch (state) {     //change timer state
          case STOPPED:
            state = RUNNING;
            break;
          case RUNNING:
          default:
            state = STOPPED;
            break;
        }
      }
    }
    if (ignore_input)     
      ignore_input--;     //decrement debounce timer
    ClearDisplayBuffer(); //erase
    DisplayTime(msTicks); //draw
    PCD8544Update();      //display
  }

Admittedly, this FSM has only 2 states; however, adding more states is as simple as inserting another case into the switch statement. For example, if we added a reset button, we might add the following code:

          case RESET:
            state = STOPPED;
            msTicks = 0UL;
            break;
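
The transition logic can also be pulled out of the main loop and exercised on a PC. The sketch below is one possible arrangement, not the code from this project: it treats the reset button as an input rather than as a state, and all of the names (stopwatch, stopwatch_step, the TIMER_ prefix, DEBOUNCE_LOOPS) are hypothetical. The TIMER_ prefix is there only to stay clear of identifiers that vendor headers may already define.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { TIMER_STOPPED, TIMER_RUNNING } timer_state;

typedef struct {
  timer_state state;
  uint32_t ms_ticks;      /* elapsed time in milliseconds */
  uint16_t ignore_input;  /* debounce countdown, in loop iterations */
} stopwatch;

#define DEBOUNCE_LOOPS 100

/* One pass of the main loop. The two arguments are non-zero while the
   corresponding button is held down. */
void stopwatch_step(stopwatch *sw, int start_stop_pressed, int reset_pressed) {
  assert(sw != NULL);
  if (!sw->ignore_input) {
    if (reset_pressed) {
      sw->state = TIMER_STOPPED;          /* reset always stops and zeroes */
      sw->ms_ticks = 0UL;
      sw->ignore_input = DEBOUNCE_LOOPS;  /* start debounce */
    } else if (start_stop_pressed) {
      switch (sw->state) {                /* change timer state */
        case TIMER_STOPPED:
          sw->state = TIMER_RUNNING;
          break;
        case TIMER_RUNNING:
        default:
          sw->state = TIMER_STOPPED;
          break;
      }
      sw->ignore_input = DEBOUNCE_LOOPS;  /* start debounce */
    }
  }
  if (sw->ignore_input)
    sw->ignore_input--;                   /* decrement debounce timer */
}
```

Keeping the transitions in one function like this makes it easy to test that a held button only toggles the timer once per debounce period.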

The font was constructed using the MikroElektronika GLCD Font Creator program. One could easily extract the PCD8544/Nokia5110 routines and incorporate them into another project, or make a library. Here is the full source code:

//pcd8544/nokia5110 test
#include "stm32f4xx_rcc.h"
#include "stm32f4xx_gpio.h"
#include "stm32f4xx_spi.h"
#include <stdlib.h>
#include "string.h"

//pcd8544/nokia5110 pins
//CS  GPIOA, GPIO_Pin_8
//RST GPIOB, GPIO_Pin_10
//DC  GPIOB, GPIO_Pin_4
//MO  GPIOB, GPIO_Pin_5
//SCK GPIOB, GPIO_Pin_3

//possible timer states
#define STOPPED 0
#define RUNNING 1

//timer state
volatile uint16_t state;
//counts 1ms timeTicks
volatile uint32_t msTicks; 
//debounce button by ignoring input after state change
uint16_t ignore_input;

void SysTick_Handler(void) {
  if (state)
    msTicks++;    
}

//
// PCD8544/Nokia5110 Display stuff
//
#define BLACK                       1
#define WHITE                       0
#define LCDWIDTH                    84
#define LCDHEIGHT                   48
#define PCD8544_POWERDOWN           0x04
#define PCD8544_ENTRYMODE           0x02
#define PCD8544_EXTENDEDINSTRUCTION 0x01
#define PCD8544_DISPLAYBLANK        0x0
#define PCD8544_DISPLAYNORMAL       0x4
#define PCD8544_DISPLAYALLON        0x1
#define PCD8544_DISPLAYINVERTED     0x5
#define PCD8544_FUNCTIONSET         0x20
#define PCD8544_DISPLAYCONTROL      0x08
#define PCD8544_SETYADDR            0x40
#define PCD8544_SETXADDR            0x80
#define PCD8544_SETTEMP             0x04
#define PCD8544_SETBIAS             0x10
#define PCD8544_SETVOP              0x80

//bit value
#define _BV(x) (1<<(x))

//buffer for the LCD screen
uint8_t pcd8544_buffer[LCDWIDTH*LCDHEIGHT/8] = {
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 128, 0,
  0, 16, 0, 0, 16, 16, 224, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 128, 128, 192,
  192, 192, 192, 96, 96, 96, 255, 252, 254, 254, 206, 204, 230, 243, 97, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  128, 192, 224, 48, 24, 12, 14, 6, 3, 3, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
  3, 7, 31, 255, 255, 254, 252, 248, 128, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 240, 252, 255, 7, 1, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 192, 240, 252, 255, 255, 255, 255, 63, 3, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 31, 127, 255, 248, 224,
  192, 192, 128, 128, 128, 128, 128, 128, 128, 128, 128, 192, 192, 224, 224, 240,
  248, 248, 252, 254, 127, 127, 63, 31, 15, 7, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 7, 7, 7, 15, 15, 15,
  15, 15, 15, 15, 15, 15, 7, 7, 7, 3, 3, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
};

//Font Generated by MikroElektronika GLCD Font Creator 1.2.0.0
//GLCD FontName : Courier_New
//GLCD FontSize : 18 x 23
static const unsigned char Courier_New18x23[] = {
  0x11, 0x00, 0x00, 0x00, 0xC0, 0xFF, 0x01, 0xF0, 0xFF, 0x07, 0xF8, 0xFF, 0x0F, 0x7E, 0x00, 0x3F, 0x1E, 0x00, 0x3C, 0x0F, 0x00, 0x78, 0x07, 0x00, 0x70, 0x07, 0x00, 0x70, 0x07, 0x00, 0x70, 0x07, 0x00, 0x70, 0x0F, 0x00, 0x78, 0x1E, 0x00, 0x3C, 0x7C, 0x00, 0x3F, 0xF8, 0xFF, 0x0F, 0xF0, 0xFF, 0x07, 0xC0, 0xFF, 0x01, 0x00, 0x00, 0x00,  // Code for char 0
  0x11, 0x00, 0x00, 0x70, 0x1C, 0x00, 0x70, 0x1C, 0x00, 0x70, 0x1C, 0x00, 0x70, 0x1E, 0x00, 0x70, 0x0E, 0x00, 0x70, 0x0E, 0x00, 0x70, 0xFE, 0xFF, 0x7F, 0xFF, 0xFF, 0x7F, 0xFF, 0xFF, 0x7F, 0x00, 0x00, 0x70, 0x00, 0x00, 0x70, 0x00, 0x00, 0x70, 0x00, 0x00, 0x70, 0x00, 0x00, 0x70, 0x00, 0x00, 0x70, 0x00, 0x00, 0x70, 0x00, 0x00, 0x00,  // Code for char 1
  0x11, 0x00, 0x00, 0x70, 0xF0, 0x00, 0x78, 0xF8, 0x00, 0x7C, 0xFC, 0x00, 0x7E, 0x1E, 0x00, 0x77, 0x0E, 0x00, 0x73, 0x0F, 0x80, 0x73, 0x07, 0xC0, 0x71, 0x07, 0xE0, 0x70, 0x07, 0xF0, 0x70, 0x07, 0x70, 0x70, 0x07, 0x38, 0x70, 0x0E, 0x1C, 0x70, 0x1E, 0x1F, 0x70, 0xFC, 0x0F, 0x78, 0xF8, 0x07, 0x78, 0xE0, 0x01, 0x70, 0x00, 0x00, 0x00,  // Code for char 2
  0x11, 0x00, 0x00, 0x18, 0x0C, 0x00, 0x38, 0x1E, 0x00, 0x38, 0x1E, 0x00, 0x78, 0x0E, 0x00, 0x70, 0x0F, 0x00, 0x70, 0x07, 0x0E, 0x70, 0x07, 0x0E, 0x70, 0x07, 0x0E, 0x70, 0x07, 0x0E, 0x70, 0x07, 0x0E, 0x70, 0x0F, 0x1F, 0x70, 0x8E, 0x3F, 0x38, 0xFE, 0x7B, 0x3C, 0xFC, 0xF1, 0x1F, 0xF0, 0xF0, 0x0F, 0x00, 0xC0, 0x07, 0x00, 0x00, 0x00,  // Code for char 3
  0x11, 0x00, 0x00, 0x00, 0x00, 0x80, 0x03, 0x00, 0xE0, 0x03, 0x00, 0xF8, 0x03, 0x00, 0xFE, 0x03, 0x00, 0x9F, 0x03, 0xC0, 0x87, 0x03, 0xF0, 0x81, 0x73, 0x7C, 0x80, 0x73, 0x3E, 0x80, 0x73, 0x0F, 0x80, 0x73, 0xFF, 0xFF, 0x7F, 0xFF, 0xFF, 0x7F, 0xFF, 0xFF, 0x7F, 0x00, 0x80, 0x73, 0x00, 0x80, 0x73, 0x00, 0x80, 0x73, 0x00, 0x00, 0x00,  // Code for char 4
  0x11, 0x00, 0x00, 0x18, 0x00, 0x00, 0x3C, 0xFF, 0x0F, 0x3C, 0xFF, 0x1F, 0x38, 0xFF, 0x1F, 0x78, 0x07, 0x0E, 0x70, 0x07, 0x0F, 0x70, 0x07, 0x07, 0x70, 0x07, 0x07, 0x70, 0x07, 0x07, 0x70, 0x07, 0x07, 0x70, 0x07, 0x07, 0x70, 0x07, 0x0E, 0x38, 0x07, 0x1E, 0x3C, 0x07, 0xFC, 0x1F, 0x03, 0xF8, 0x0F, 0x00, 0xE0, 0x07, 0x00, 0x00, 0x00,  // Code for char 5
  0x12, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0x00, 0xC0, 0xFF, 0x07, 0xE0, 0xFF, 0x1F, 0xF0, 0x79, 0x3E, 0x78, 0x3C, 0x38, 0x3C, 0x1C, 0x70, 0x1E, 0x1E, 0x70, 0x0E, 0x0E, 0x70, 0x0F, 0x0E, 0x70, 0x07, 0x0E, 0x70, 0x07, 0x0E, 0x70, 0x07, 0x1C, 0x78, 0x07, 0x3C, 0x3C, 0x0F, 0xF8, 0x3F, 0x0F, 0xF0, 0x1F, 0x06, 0xC0, 0x07,  // Code for char 6
  0x11, 0x00, 0x00, 0x00, 0x3F, 0x00, 0x00, 0x3F, 0x00, 0x00, 0x3F, 0x00, 0x00, 0x07, 0x00, 0x00, 0x07, 0x00, 0x00, 0x07, 0x00, 0x00, 0x07, 0x00, 0x70, 0x07, 0x00, 0x7E, 0x07, 0xC0, 0x7F, 0x07, 0xF0, 0x0F, 0x07, 0xFE, 0x01, 0x87, 0x7F, 0x00, 0xF7, 0x0F, 0x00, 0xFF, 0x03, 0x00, 0x7F, 0x00, 0x00, 0x0F, 0x00, 0x00, 0x00, 0x00, 0x00,  // Code for char 7
  0x11, 0x00, 0x00, 0x00, 0xF0, 0xC1, 0x07, 0xF8, 0xF7, 0x1F, 0xFC, 0xFF, 0x1F, 0x1E, 0x7F, 0x3C, 0x0E, 0x3E, 0x38, 0x07, 0x3C, 0x70, 0x07, 0x1C, 0x70, 0x07, 0x1C, 0x70, 0x07, 0x1C, 0x70, 0x07, 0x1C, 0x70, 0x07, 0x3C, 0x70, 0x0E, 0x3E, 0x38, 0x1E, 0x7F, 0x3C, 0xFC, 0xFF, 0x1F, 0xF8, 0xF7, 0x1F, 0xF0, 0xC1, 0x07, 0x00, 0x00, 0x00,  // Code for char 8
  0x12, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xE0, 0x03, 0x30, 0xF8, 0x0F, 0x78, 0xFC, 0x1F, 0x78, 0x1E, 0x3C, 0x70, 0x0F, 0x38, 0x70, 0x07, 0x70, 0x70, 0x07, 0x70, 0x70, 0x07, 0x70, 0x78, 0x07, 0x70, 0x38, 0x07, 0x78, 0x3C, 0x0E, 0x38, 0x1E, 0x1E, 0x3C, 0x1F, 0x3C, 0x9E, 0x0F, 0xF8, 0xFF, 0x07, 0xF0, 0xFF, 0x03, 0xC0, 0x7F, 0x00   // Code for char 9
};

//holds number to display in ascii
uint8_t d[5];

#define swap(a, b) { int16_t t = a; a = b; b = t; }

void PCD8544Write(uint8_t d) {
  uint8_t i;
  
  GPIOA->BSRR |= (GPIO_Pin_8<<16);   //clear cs
  //bit bang
  for (i=0; i<8; i++) {
    if (d & 0x80)                    //output bit
      GPIOB->BSRR |= GPIO_Pin_5;     //din
    else
      GPIOB->BSRR |= (GPIO_Pin_5<<16);
    d <<= 1;                         //next bit
    GPIOB->BSRR |= (GPIO_Pin_3<<16); //toggle clock
    GPIOB->BSRR |= GPIO_Pin_3;
  }
  //transmission complete
  GPIOA->BSRR |= GPIO_Pin_8;         //set cs
}

void PCD8544SendCommand(uint8_t c) {
  //clear dc
  GPIOB->BSRR |= (GPIO_Pin_4<<16);
  //send command
  PCD8544Write(c);
}

//write buffer to lcd
void PCD8544Update(void) {
  uint8_t column, max_column, p;
  
  max_column = LCDWIDTH - 1;
  for (p=0; p<6; p++) {
    PCD8544SendCommand(PCD8544_SETYADDR | p);
    PCD8544SendCommand(PCD8544_SETXADDR);
    GPIOB->BSRR |= GPIO_Pin_4;                //set dc
    for (column=0; column<=max_column; column++)
      PCD8544Write(pcd8544_buffer[(LCDWIDTH*p) + column]);
  }
  //no idea why this is necessary but it is to finish the last byte?
  PCD8544SendCommand(PCD8544_SETYADDR );  
}

//set a single pixel
void SetPixel(int16_t x, int16_t y) {
  if ((x < 0) || (x >= LCDWIDTH) || (y < 0) || (y >= LCDHEIGHT))
    return;
  pcd8544_buffer[x + (y/8)*LCDWIDTH] |= _BV(y%8);
}

//clear a single pixel
void ClearPixel(int16_t x, int16_t y) {
  if ((x < 0) || (x >= LCDWIDTH) || (y < 0) || (y >= LCDHEIGHT))
    return;
  pcd8544_buffer[x + (y/8)*LCDWIDTH] &= ~_BV(y%8);
}

//Bresenham's algorithm
void DrawLine(int16_t x0, int16_t y0, int16_t x1, int16_t y1) {
  int16_t steep, dx, dy, y_step, err;

  steep = abs(y1 - y0) > abs(x1 - x0);
  if (steep) {
    swap(x0, y0);
    swap(x1, y1);
  }
  if (x0 > x1) {
    swap(x0, x1);
    swap(y0, y1);
  }
  dx = x1 - x0;
  dy = abs(y1 - y0);
  err = dx/2;
  if (y0 < y1)
    y_step = 1;
  else
    y_step = -1;
  for (; x0<=x1; x0++) {
    if (steep)
      SetPixel(y0, x0);
    else
      SetPixel(x0, y0);
    err -= dy;
    if (err < 0) {
      y0 += y_step;
      err += dx;
    }
  }
}

void FillRect(int16_t x, int16_t y, int16_t w, int16_t h) {
  int16_t i;
  
  for (i=x; i<x + w; i++)
    DrawLine(i, y, i, y + h - 1);
}

//font data: Courier_New18x23
#define FONT_WIDTH      18
#define FONT_HEIGHT     23
#define FONT_START_CHAR 48
#define FONT_END_CHAR   57

//draw character
uint8_t DrawCharXY(uint8_t x, uint8_t y, char c) {
  uint8_t i, char_width, bytes_high = FONT_HEIGHT/8 + 1;
  uint8_t bytes_per_char = FONT_WIDTH*(FONT_HEIGHT/8+1) + 1; //+1 for width byte at start
  const unsigned char *p;

  if (c < FONT_START_CHAR || c > FONT_END_CHAR)
    c = '0';
  p = Courier_New18x23 + (c - FONT_START_CHAR)*bytes_per_char;
  //first byte of character is always width of character
  char_width = *p;
  p++;                                //step over width field
  for (i=0; i<char_width; i++) {
    uint8_t j;

    for (j=0; j<bytes_high; j++) {
      uint8_t b, dat;

      dat = *(p + i*bytes_high + j);
      for (b=0; b<8; b++) {
        if (x + i >= LCDWIDTH || y + j*8 + b >= LCDHEIGHT)
          return 0;                   //don't write past dimensions of LCD, skip entire char
        if ((j*8 + b) >= FONT_HEIGHT)  //we should not write if y bit exceeds font height
          continue;                   //skip bit
        if (dat & (1<<b))
          SetPixel(x + i, y + j*8 + b);
        else
          ClearPixel(x + i, y + j*8 + b);
      }
    }
  }
  return char_width;
}

//convert uint32_t millis value to array of ascii digits (hundredths of a second)
void ExtractDigits(uint32_t number) {
  uint8_t i;
  
  number /= 10;  //drop the milliseconds digit
  i = 0;
  while (number > 0 && i < 5) {  //never write past the end of 'd'
    //pull individual digits from 'number' and stuff into 'd' array
    d[i++] = (uint8_t)(number%10) + 48;
    number /= 10;
  }
  while (i <= 4) //display uses d[0]..d[3], fill the rest of 'd'
    d[i++] = 48; //fill with leading '0's 
}

void DisplayTime(uint32_t number) {
  ExtractDigits(number);    //convert UL to byte array
  DrawCharXY(1, 10, d[3]);
  DrawCharXY(21, 10, d[2]);
  DrawCharXY(46, 10, d[1]);
  DrawCharXY(66, 10, d[0]);
  //decimal point
  FillRect(41, 30, 3, 3);
}

void PCD8544SetContrast(uint8_t val) {
  if (val > 0x7f)
    val = 0x7f;
  PCD8544SendCommand(PCD8544_FUNCTIONSET | PCD8544_EXTENDEDINSTRUCTION);
  PCD8544SendCommand(PCD8544_SETVOP | val); 
  PCD8544SendCommand(PCD8544_FUNCTIONSET);
}

//clear buffer
void ClearDisplayBuffer(void) {
  memset(pcd8544_buffer, 0, LCDWIDTH*LCDHEIGHT/8);
}

void PCD8544GPIOConfig(void) {
  //cs
  GPIOA->MODER &= ~(GPIO_MODER_MODER0<<(8*2));
  GPIOA->MODER |= (((uint32_t)GPIO_Mode_OUT)<<(8*2));
  GPIOA->OSPEEDR &= ~(GPIO_OSPEEDER_OSPEEDR0<<(8*2));
  GPIOA->OSPEEDR |= ((uint32_t)(GPIO_High_Speed)<<(8*2));
  GPIOA->OTYPER &= ~((GPIO_OTYPER_OT_0)<<((uint16_t)8)); 
  GPIOA->OTYPER |= (uint16_t)(((uint16_t)GPIO_OType_PP)<<((uint16_t)8));
  GPIOA->PUPDR &= ~(GPIO_PUPDR_PUPDR0<<((uint16_t)8*2));
  GPIOA->PUPDR |= (((uint32_t)GPIO_PuPd_NOPULL)<<(8*2));
  //rst
  GPIOB->MODER &= ~(GPIO_MODER_MODER0<<(10*2));
  GPIOB->MODER |= (((uint32_t)GPIO_Mode_OUT)<<(10*2));
  GPIOB->OSPEEDR &= ~(GPIO_OSPEEDER_OSPEEDR0<<(10*2));
  GPIOB->OSPEEDR |= ((uint32_t)(GPIO_High_Speed)<<(10*2));
  GPIOB->OTYPER &= ~((GPIO_OTYPER_OT_0)<<((uint16_t)10)); 
  GPIOB->OTYPER |= (uint16_t)(((uint16_t)GPIO_OType_PP)<<((uint16_t)10));
  GPIOB->PUPDR &= ~(GPIO_PUPDR_PUPDR0<<((uint16_t)10*2));
  GPIOB->PUPDR |= (((uint32_t)GPIO_PuPd_NOPULL)<<(10*2));
  //dc
  GPIOB->MODER &= ~(GPIO_MODER_MODER0<<(4*2));
  GPIOB->MODER |= (((uint32_t)GPIO_Mode_OUT)<<(4*2));
  GPIOB->OSPEEDR &= ~(GPIO_OSPEEDER_OSPEEDR0<<(4*2));
  GPIOB->OSPEEDR |= ((uint32_t)(GPIO_High_Speed)<<(4*2));
  GPIOB->OTYPER &= ~((GPIO_OTYPER_OT_0)<<((uint16_t)4)); 
  GPIOB->OTYPER |= (uint16_t)(((uint16_t)GPIO_OType_PP)<<((uint16_t)4));
  GPIOB->PUPDR &= ~(GPIO_PUPDR_PUPDR0<<((uint16_t)4*2));
  GPIOB->PUPDR |= (((uint32_t)GPIO_PuPd_NOPULL)<<(4*2));
  //mo
  GPIOB->MODER &= ~(GPIO_MODER_MODER0<<(5*2));
  GPIOB->MODER |= (((uint32_t)GPIO_Mode_OUT)<<(5*2));
  GPIOB->OSPEEDR &= ~(GPIO_OSPEEDER_OSPEEDR0<<(5*2));
  GPIOB->OSPEEDR |= ((uint32_t)(GPIO_High_Speed)<<(5*2));
  GPIOB->OTYPER &= ~((GPIO_OTYPER_OT_0)<<((uint16_t)5)); 
  GPIOB->OTYPER |= (uint16_t)(((uint16_t)GPIO_OType_PP)<<((uint16_t)5));
  GPIOB->PUPDR &= ~(GPIO_PUPDR_PUPDR0<<((uint16_t)5*2));
  GPIOB->PUPDR |= (((uint32_t)GPIO_PuPd_NOPULL)<<(5*2));
  //sck
  GPIOB->MODER &= ~(GPIO_MODER_MODER0<<(3*2));
  GPIOB->MODER |= (((uint32_t)GPIO_Mode_OUT)<<(3*2));
  GPIOB->OSPEEDR &= ~(GPIO_OSPEEDER_OSPEEDR0<<(3*2));
  GPIOB->OSPEEDR |= ((uint32_t)(GPIO_High_Speed)<<(3*2));
  GPIOB->OTYPER &= ~((GPIO_OTYPER_OT_0)<<((uint16_t)3)); 
  GPIOB->OTYPER |= (uint16_t)(((uint16_t)GPIO_OType_PP)<<((uint16_t)3));
  GPIOB->PUPDR &= ~(GPIO_PUPDR_PUPDR0<<((uint16_t)3*2));
  GPIOB->PUPDR |= (((uint32_t)GPIO_PuPd_NOPULL)<<(3*2));
}

void PCD8544Init(void) {
  uint32_t i;
  
  PCD8544GPIOConfig();  
  GPIOB->BSRR |= GPIO_Pin_4;        //dc
  GPIOB->BSRR |= GPIO_Pin_5;        //mo
  GPIOB->BSRR |= GPIO_Pin_3;        //clock
  GPIOA->BSRR |= GPIO_Pin_8;        //cs
  GPIOB->BSRR |= (GPIO_Pin_10<<16); //rst
  //slight delay
  for (i=0;i<10000; i++) 
    ;
  GPIOB->BSRR |= GPIO_Pin_10;       //rst
  //get into the EXTENDED mode
  PCD8544SendCommand(PCD8544_FUNCTIONSET | PCD8544_EXTENDEDINSTRUCTION );
  //contrast
  PCD8544SendCommand(PCD8544_SETVOP | 0x40);
  //LCD bias select (4 is optimal?)
  PCD8544SendCommand(PCD8544_SETBIAS | 0x04);
  //normal mode
  PCD8544SendCommand(PCD8544_FUNCTIONSET);
  PCD8544SendCommand(PCD8544_DISPLAYCONTROL | PCD8544_DISPLAYNORMAL);
  //push out pcd8544_buffer to the display
  PCD8544Update();
}

//configure SystemCoreClock using HSI (HSE is not populated on Nucleo board)
void SystemCoreClockConfigure(void) {
  //enable HSI
  RCC->CR |= ((uint32_t)RCC_CR_HSION);                     
  while ((RCC->CR & RCC_CR_HSIRDY) == 0)
    ; //wait for HSI ready
  RCC->CFGR = RCC_CFGR_SW_HSI;         //select HSI as system clock
  while ((RCC->CFGR & RCC_CFGR_SWS) != RCC_CFGR_SWS_HSI)
    ; //wait for HSI used as system clock
  FLASH->ACR  = FLASH_ACR_PRFTEN;      //enable Prefetch Buffer
  FLASH->ACR |= FLASH_ACR_ICEN;        //instruction cache enable
  FLASH->ACR |= FLASH_ACR_DCEN;        //data cache enable
  FLASH->ACR |= FLASH_ACR_LATENCY_5WS; //flash 5 wait state
  //HCLK = SYSCLK
  RCC->CFGR |= RCC_CFGR_HPRE_DIV1;                         
  //APB1 = HCLK/2
  RCC->CFGR |= RCC_CFGR_PPRE1_DIV2;                        
  //APB2 = HCLK/1
  RCC->CFGR |= RCC_CFGR_PPRE2_DIV1;                        
  //disable PLL
  RCC->CR &= ~RCC_CR_PLLON;                                
  //PLL configuration: VCO=HSI/M*N, Sysclk=VCO/P
  //PLL_M=16, PLL_N=320, PLL_P=4, PLL_SRC=HSI, PLL_Q=8 for 80MHz SYSCLK/APB2, 40MHz APB1
  RCC->PLLCFGR = (16ul | (320ul<<6) | (1ul<<16) | (RCC_PLLCFGR_PLLSRC_HSI) | (8ul<<24));
  //enable PLL
  RCC->CR |= RCC_CR_PLLON;                                 
  //Wait till PLL is ready
  while((RCC->CR & RCC_CR_PLLRDY) == 0) 
    __NOP();
  //select PLL as system clock source
  RCC->CFGR &= ~RCC_CFGR_SW;                               
  RCC->CFGR |=  RCC_CFGR_SW_PLL;
  while ((RCC->CFGR & RCC_CFGR_SWS) != RCC_CFGR_SWS_PLL)
    ; //wait till PLL is system clock src
}

int main(void) {
  //configure PLL (fed by HSI) as System Clock
  SystemCoreClockConfigure();                              
  SystemCoreClockUpdate();

  //enable GPIOA peripheral clock
  RCC->AHB1ENR |= (1ul<<0);

  //configure nucleo led (pa5) pin as output, push-pull, no pull-up/down 
  GPIOA->MODER &= ~((3ul<<2*5));                //clear both mode bits
  GPIOA->MODER |= ((GPIO_Mode_OUT<<2*5));       //set as general purpose output
  GPIOA->OTYPER &= ~((1ul<<5));                 //clear (push/pull)
  GPIOA->OSPEEDR &= ~((3ul<<2*5));              //clear both speed bits
  GPIOA->OSPEEDR |= ((GPIO_Medium_Speed<<2*5)); //set medium speed
  GPIOA->PUPDR &= ~((3ul<<2*5));                //clear both pull up/down status (none)
  //configure nucleo blue button (pc13) pin as input with pull-down 
  RCC->AHB1ENR |= (1ul<<2);                     //enable GPIOC clock
  GPIOC->MODER &= ~((3ul<<2*13));               //clear both mode bits
  GPIOC->MODER |= ((GPIO_Mode_IN<<2*13));       //set as input
  GPIOC->OTYPER &= ~((1ul<<13));                //clear (push/pull)
  GPIOC->OSPEEDR &= ~((3ul<<2*13));             //clear both speed bits
  GPIOC->OSPEEDR |= ((GPIO_High_Speed<<2*13));  //set high speed
  GPIOC->PUPDR |= (GPIO_PuPd_DOWN<<2*13);       //set pull down status

  //SysTick 1 msec interrupts
  SysTick_Config(SystemCoreClock/1000);                  

  //enable GPIOB peripheral clock
  RCC->AHB1ENR |= (1ul<<1);
  //configure pcd8544 pins & lcd setup
  PCD8544Init();

  //initial states
  state = STOPPED;
  msTicks = 0UL;
  ignore_input = 0;

  //endless loop
  while(1) {
    if (!ignore_input) {
      if ((GPIOC->IDR & GPIO_Pin_13) == (uint32_t)Bit_RESET) {
        ignore_input = 100;
        switch (state) {
          case STOPPED:
            state = RUNNING;
            break;
          case RUNNING:
            state = STOPPED;
            break;
          default:
            state = STOPPED;
            break;
        }
      }
    }
    if (ignore_input)     
      ignore_input--;     //debounce button
    ClearDisplayBuffer(); //erase
    DisplayTime(msTicks); //draw
    PCD8544Update();      //display
  }
}

STM32F411RE Nucleo 40MHz SPI with Cypress FM25CL64B FRAM

Previously, I tested a Cypress FRAM memory chip with the Arduino. A feature of FRAM memory is the speed at which it can be accessed. Cypress claims the FM25xxx chips can operate at 40MHz; however, the maximum speed for SPI on an Arduino Uno is only 8MHz, or half the system clock frequency.

In order to test the FRAM at higher speeds, we need a µC capable of operating at much higher clock frequencies. The easiest option I had available was an STM32F411 Nucleo board.

The highly affordable STM32 Nucleo boards (only $10.33 USD at mouser.com) are available with an assortment of ARM Cortex-M µCs, share Arduino connectors, come with an integrated USB debugger/programmer, and can be used with a wide range of development environments. Admittedly, learning to program the ARM µC requires climbing a much steeper learning curve, but the Nucleo boards offer incredible performance at a fraction of the cost of an Arduino. For example, my STM32F411RE based Nucleo board incorporates an ARM Cortex-M4 processor (with FPU), 512kB of flash, 128kB of SRAM, up to 81 I/O ports, up to 13 communication interfaces (USB, USART, SPI, I2C and SDIO), up to 11 timers, a 12-bit ADC, and an RTC. Check them out.

While the F411 µC Nucleo board utilizes only a 16MHz internal clock (there is no external crystal installed on the Nucleos), it has provisions for a phase-locked loop (PLL). The PLL, in conjunction with the internal clock, can run the chip at up to 100MHz! However, getting 100MHz out of a 16MHz oscillator takes just a little programming magic.

I decided to use the PLL to drive the system clock at 80MHz, and to operate the APB2 peripheral bus at this same 80MHz frequency. The µC SPI peripherals are clocked from either the APB1 or the APB2 bus. The maximum speed for the APB1 bus is half that of the system bus. Selecting SPI #5, which is on the APB2 bus, and using a divide-by-2 prescaler creates an SPI peripheral operating at 40MHz. There is both a spreadsheet-based clock configuration tool and a graphical µC configuration program available for the STM32 µCs to assist with clock setup.
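
The clock arithmetic can be sanity-checked on a PC before touching the RCC registers. The sketch below uses the PLL relations given in the comments of the test program (VCO = source / M * N, SYSCLK = VCO / P) and assumes an AHB prescaler of 1. The names clock_tree and pll_clocks are made up for this example, and the function does not check the limits from the datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Derived clock frequencies in Hz (hypothetical helper type). */
typedef struct {
  uint32_t vco_hz;
  uint32_t sysclk_hz;
  uint32_t apb1_hz;
  uint32_t apb2_hz;
} clock_tree;

/* Compute the clock tree from the PLL source frequency, the PLL factors
   M, N and P, and the two APB prescalers. Assumes HCLK = SYSCLK. */
clock_tree pll_clocks(uint32_t src_hz, uint32_t m, uint32_t n, uint32_t p,
                      uint32_t apb1_div, uint32_t apb2_div) {
  assert(m > 0 && p > 0 && apb1_div > 0 && apb2_div > 0);
  clock_tree c;
  c.vco_hz    = src_hz / m * n;         /* VCO = source / M * N */
  c.sysclk_hz = c.vco_hz / p;           /* SYSCLK = VCO / P */
  c.apb1_hz   = c.sysclk_hz / apb1_div; /* APB1 = HCLK / prescaler */
  c.apb2_hz   = c.sysclk_hz / apb2_div; /* APB2 = HCLK / prescaler */
  return c;
}
```

With the 16MHz internal oscillator, M=16, N=320, P=4, APB1 divided by 2 and APB2 divided by 1, this gives a 320MHz VCO, an 80MHz system clock, a 40MHz APB1 and an 80MHz APB2. The SPI clock is then the APB2 frequency divided by the SPI baud rate prescaler of 2, or 40MHz.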

Unfortunately, my Saleae Logic Analyzer is maxed out at 24MHz, so I can’t confirm the SPI speed. The following test program is our only proof. As expected, if the SPI frequency is increased to 50MHz in the program, the FRAM reads and writes become unreliable and the error LED starts to flash.

SPI Test:

//spi test
#include "stm32f4xx_rcc.h"
#include "stm32f4xx_gpio.h"
#include "stm32f4xx_spi.h"

//FM25CL64b FRAM opcodes
#define WREN  0x06 //set write enable latch
#define RDSR  0x05 //read status register
#define WRDI  0x04 //write disable
#define READ  0x03 //read memory data
#define WRITE 0x02 //write memory data
#define WRSR  0x01 //write status register
//pinouts:
//  CS 1 - 5 VCC
//  SO 2 - 6 HOLD
//  WP 3 - 7 SCK
// GND 4 - 8 SI
//use 10K pull up on CS, tie WP & HOLD to VCC
//Nucleo: pb1=ss, pb0=sck, pa12=miso, pa10=mosi 

//counts 1ms timeTicks
volatile uint32_t msTicks; 

void SysTick_Handler(void) {
  msTicks++;
}

//delay a number of Systicks
void Delay (uint32_t dlyTicks) {
  uint32_t curTicks;

  curTicks = msTicks;
  while ((msTicks - curTicks) < dlyTicks) { 
    __NOP(); 
  }
}

//configure SystemCoreClock using HSI (HSE is not populated on Nucleo board)
void SystemCoreClockConfigure(void) {
  //enable HSI
  RCC->CR |= ((uint32_t)RCC_CR_HSION);                     
  while ((RCC->CR & RCC_CR_HSIRDY) == 0)
    ; //wait for HSI ready
  RCC->CFGR = RCC_CFGR_SW_HSI;         //select HSI as system clock
  while ((RCC->CFGR & RCC_CFGR_SWS) != RCC_CFGR_SWS_HSI)
    ; //wait for HSI used as system clock
  FLASH->ACR  = FLASH_ACR_PRFTEN;      //enable Prefetch Buffer
  FLASH->ACR |= FLASH_ACR_ICEN;        //instruction cache enable
  FLASH->ACR |= FLASH_ACR_DCEN;        //data cache enable
  FLASH->ACR |= FLASH_ACR_LATENCY_5WS; //flash 5 wait state
  //HCLK = SYSCLK
  RCC->CFGR |= RCC_CFGR_HPRE_DIV1;                         
  //APB1 = HCLK/2
  RCC->CFGR |= RCC_CFGR_PPRE1_DIV2;                        
  //APB2 = HCLK/1
  RCC->CFGR |= RCC_CFGR_PPRE2_DIV1;                        
  //disable PLL
  RCC->CR &= ~RCC_CR_PLLON;                                
  //PLL configuration: VCO=HSI/M*N, Sysclk=VCO/P
  //PLL_M=16, PLL_N=320, PLL_P=4, PLL_SRC=HSI, PLL_Q=8 for 80MHz SYSCLK/APB2, 40MHz APB1
  RCC->PLLCFGR = (16ul | (320ul<<6) | (1ul<<16) | (RCC_PLLCFGR_PLLSRC_HSI) | (8ul<<24));
  //enable PLL
  RCC->CR |= RCC_CR_PLLON;                                 
  //Wait till PLL is ready
  while((RCC->CR & RCC_CR_PLLRDY) == 0) 
    __NOP();
  //select PLL as system clock source
  RCC->CFGR &= ~RCC_CFGR_SW;                               
  RCC->CFGR |=  RCC_CFGR_SW_PLL;
  while ((RCC->CFGR & RCC_CFGR_SWS) != RCC_CFGR_SWS_PLL)
    ; //wait till PLL is system clock src
}

//configure spi5 using: spi5 pb1=ss, pb0=sck, pa12=miso, pa10=mosi 
void SPIConfigure(void) {
  uint32_t temp = 0x0;

  //gpio pin setup
  //enable GPIOA peripheral clock for MISO & MOSI
  RCC->AHB1ENR |= (1ul<<0);
  //enable GPIOB clock for SS and SCK
  RCC->AHB1ENR |= (1ul<<1);
  //alternate pin functions (AF06)
  //sck
  GPIOB->AFR[0] &= ~((uint32_t)0xf);
  GPIOB->AFR[0] |= ((uint32_t)0x6);
  //miso
  GPIOA->AFR[1] &= ~((uint32_t)0xf0000);
  GPIOA->AFR[1] |= ((uint32_t)0x60000);
  //mosi
  GPIOA->AFR[1] &= ~((uint32_t)0xf00);
  GPIOA->AFR[1] |= ((uint32_t)0x600);
  //ss
  GPIOB->MODER &= ~(GPIO_MODER_MODER0<<(1*2));
  GPIOB->MODER |= (((uint32_t)GPIO_Mode_OUT)<<(1*2));
  GPIOB->OSPEEDR &= ~(GPIO_OSPEEDER_OSPEEDR0<<(1*2));
  GPIOB->OSPEEDR |= ((uint32_t)(GPIO_High_Speed)<<(1*2));
  GPIOB->OTYPER &= ~((GPIO_OTYPER_OT_0)<<((uint16_t)1));
  GPIOB->OTYPER |= (uint16_t)(((uint16_t)GPIO_OType_PP)<<((uint16_t)1));
  GPIOB->PUPDR &= ~(GPIO_PUPDR_PUPDR0<<((uint16_t)1*2));
  GPIOB->PUPDR |= (((uint32_t)GPIO_PuPd_NOPULL)<<(1*2));
  GPIOB->BSRR |= GPIO_Pin_1;  //set chip select to high
  //sck
  GPIOB->MODER &= ~(GPIO_MODER_MODER0<<(0*2));
  GPIOB->MODER |= (((uint32_t)GPIO_Mode_AF)<<(0*2));
  GPIOB->OSPEEDR &= ~(GPIO_OSPEEDER_OSPEEDR0<<(0*2));
  GPIOB->OSPEEDR |= ((uint32_t)(GPIO_High_Speed)<<(0*2));
  GPIOB->OTYPER &= ~((GPIO_OTYPER_OT_0)<<((uint16_t)0)); 
  GPIOB->OTYPER |= (uint16_t)(((uint16_t)GPIO_OType_PP)<<((uint16_t)0));
  GPIOB->PUPDR &= ~(GPIO_PUPDR_PUPDR0<<((uint16_t)0*2));
  GPIOB->PUPDR |= (((uint32_t)GPIO_PuPd_NOPULL)<<(0*2));
  //miso
  GPIOA->MODER &= ~(GPIO_MODER_MODER0<<(12*2));
  GPIOA->MODER |= (((uint32_t)GPIO_Mode_AF)<<(12*2));
  GPIOA->OSPEEDR &= ~(GPIO_OSPEEDER_OSPEEDR0<<(12*2));
  GPIOA->OSPEEDR |= ((uint32_t)(GPIO_High_Speed)<<(12*2));
  GPIOA->OTYPER &= ~((GPIO_OTYPER_OT_0)<<((uint16_t)12)); 
  GPIOA->OTYPER |= (uint16_t)(((uint16_t)GPIO_OType_PP)<<((uint16_t)12));
  GPIOA->PUPDR &= ~(GPIO_PUPDR_PUPDR0<<((uint16_t)12*2));
  GPIOA->PUPDR |= (((uint32_t)GPIO_PuPd_NOPULL)<<(12*2));
  //mosi
  GPIOA->MODER &= ~(GPIO_MODER_MODER0<<(10*2));
  GPIOA->MODER |= (((uint32_t)GPIO_Mode_AF)<<(10*2));
  GPIOA->OSPEEDR &= ~(GPIO_OSPEEDER_OSPEEDR0<<(10*2));
  GPIOA->OSPEEDR |= ((uint32_t)(GPIO_High_Speed)<<(10*2));
  GPIOA->OTYPER &= ~((GPIO_OTYPER_OT_0)<<((uint16_t)10));
  GPIOA->OTYPER |= (uint16_t)(((uint16_t)GPIO_OType_PP)<<((uint16_t)10));
  GPIOA->PUPDR &= ~(GPIO_PUPDR_PUPDR0<<((uint16_t)10*2));
  GPIOA->PUPDR |= (((uint32_t)GPIO_PuPd_NOPULL)<<(10*2));

  //enable spi peripheral clock 
  RCC->APB2ENR |= (1ul<<20);
  //spi polarity, phase, first data, baud prescale, master, mode
  temp = SPI5->CR1; 
  //clear BIDIMode, BIDIOE, RxONLY, SSM, SSI, LSBFirst, BR, MSTR, CPOL and CPHA bits
  temp &= (uint16_t)0x3040; //CR1_CLEAR_MASK
  temp |= (uint16_t)((uint32_t)SPI_Direction_2Lines_FullDuplex | SPI_Mode_Master |
          SPI_DataSize_8b | SPI_CPOL_Low | SPI_CPHA_1Edge | SPI_NSS_Soft | SPI_NSSInternalSoft_Set |
          SPI_BaudRatePrescaler_2 | SPI_FirstBit_MSB);
  SPI5->CR1 = temp;
  //enable ss output
  //SPI5->CR2 |= (uint16_t)SPI_CR2_SSOE;
  //activate spi mode
  SPI5->I2SCFGR &= (uint16_t)~((uint16_t)SPI_I2SCFGR_I2SMOD);
  //enable spi
  SPI5->CR1 |= SPI_CR1_SPE;
}

//bare send
void SPISend(uint8_t data) {
  SPI5->DR = data;
}
//bare receive
uint8_t SPIReceive(void) {
  return SPI5->DR;
}

//spi xmit with busy wait
uint8_t SPIt(uint8_t data) {
  SPI5->DR = data; //write data for transmit to data register
  while(!(SPI5->SR & SPI_I2S_FLAG_TXE))
    ; //wait until send complete
  while(!(SPI5->SR & SPI_I2S_FLAG_RXNE))
    ; //wait until receive complete
  while(SPI5->SR & SPI_I2S_FLAG_BSY)
    ; //wait until SPI is not busy anymore
  return SPI5->DR; //return received data from SPI data register
}

uint8_t SpiFRAMRead8(uint16_t address) {
  uint8_t data;
 
  //cs low
  GPIOB->BSRR |= (GPIO_Pin_1<<16);
  SPIt(READ);
  SPIt((uint8_t)((address>>8)&0xff));
  SPIt((uint8_t)address);
  data = SPIt(0xff);
  //cs high
  GPIOB->BSRR |= GPIO_Pin_1;
  return (data);
}

void SpiFRAMWrite8(uint16_t address, uint8_t data) {
  //cs low
  GPIOB->BSRR |= (GPIO_Pin_1<<16);
  SPIt(WREN);
  //cs high
  GPIOB->BSRR |= GPIO_Pin_1;
  //cs low
  GPIOB->BSRR |= (GPIO_Pin_1<<16);
  SPIt(WRITE);
  //13-bit address MSB, LSB
  SPIt((uint8_t)((address>>8)&0xff));
  SPIt((uint8_t)address);
  SPIt(data);
  //cs high
  GPIOB->BSRR |= GPIO_Pin_1;
}

int main(void) {
  uint8_t rxbuf[5], txbuf[5] = { 'H', 'e', 'l', 'l', 'o' };
	
  //configure HSI as System Clock
  SystemCoreClockConfigure();                              
  SystemCoreClockUpdate();

  //initialize spi5
  SPIConfigure();

  //configure nucleo led (pa5) pin as output, push-pull, no pull-up/down 
  GPIOA->MODER &= ~((3ul<<2*5));                //clear both mode bits
  GPIOA->MODER |= ((GPIO_Mode_OUT<<2*5));       //set as general purpose output
  GPIOA->OTYPER &= ~((1ul<<5));                 //clear (push/pull)
  GPIOA->OSPEEDR &= ~((3ul<<2*5));              //clear both speed bits
  GPIOA->OSPEEDR |= ((GPIO_Medium_Speed<<2*5)); //set medium speed
  GPIOA->PUPDR &= ~((3ul<<2*5));                //clear both pull up/down status (none)
  //configure nucleo blue button (pc13) pin as input with pull-down 
  RCC->AHB1ENR |= (2ul<<1);                     //enable GPIOC clock
  GPIOC->MODER &= ~((3ul<<2*13));               //clear both mode bits
  GPIOC->MODER |= ((GPIO_Mode_IN<<2*13));       //set as input
  GPIOC->OTYPER &= ~((1ul<<13));                //clear (push/pull)
  GPIOC->OSPEEDR &= ~((3ul<<2*13));             //clear both speed bits
  GPIOC->OSPEEDR |= ((GPIO_High_Speed<<2*13));  //set high speed
  GPIOC->PUPDR |= (GPIO_PuPd_DOWN<<2*13);       //set pull down status
  
  //SysTick 1 msec interrupts
  SysTick_Config(SystemCoreClock/1000);                  

  //test it
  while(1) {
    uint8_t i;
		
    while ((GPIOC->IDR & GPIO_Pin_13) != (uint32_t)Bit_RESET)
      ; //wait until button press
    for (i=0; i<5; i++) {
      //send data
      SpiFRAMWrite8((uint16_t)i, txbuf[i]);
      //receive data
      rxbuf[i] = SpiFRAMRead8((uint16_t)i);
    }
    for (i=0; i<5; i++) {
      if (rxbuf[i] != txbuf[i]) {
        //flash led for fail
        GPIOA->BSRR |= GPIO_Pin_5;
        Delay(500);
        GPIOA->BSRR |= (GPIO_Pin_5<<16);
        Delay(500);
      }
    }
  }
}

LPC81x ARM Cortex-M0 Basics

arm logo

ARM Cortex-M0+ Architecture Basics

The LPC812 uses an ARM Cortex-M0+ processor. The Cortex-M0+ is a von Neumann design rather than a Harvard one: instructions (flash) and data (SRAM) live in separate memories, but both are reached over a single bus in one address space. The basic architecture includes the core components and peripherals.

The core consists of:

  • Processor
  • Memories
  • GPIO
  • Pin interrupts
  • SCTimer/PWM

Peripherals:

  • USARTs
  • SPIs
  • I2C
  • ADC
  • IOCON
  • Multi-rate Timer
  • Watchdog Timer

All output is routed through the Switch Matrix to the individual pins. The switch matrix allows the flexibility of swapping the digital peripheral functions amongst the pins. Obviously, the basic functions like GPIO, power, ground and some others cannot be swapped.

LPC81x Block Diagram

block diagram

Memory Mapping

Fortunately for us, we don’t need to focus too much on the internal design. The main factor to remember is the peripherals are memory mapped. This means our interaction with them (configuration, control, input, and output) is accomplished through an address. Accessing a peripheral is just like writing or reading a value in memory.

It is good practice to access peripherals using a read-modify-write strategy. This strategy is seen throughout ARM and LPC examples.

Read-Modify-Write Example:

GPIO_DIR |= (1<<9);  //proper method preserves unaffected bits of register
//assembler translation:
         0xd6: 0x4813         LDR.N     R0, [PC, #0x4c]         ; [0x124] DIR0
         0xd8: 0x6800         LDR       R0, [R0]
         0xda: 0x2180         MOVS      R1, #128                ; 0x80
         0xdc: 0x0089         LSLS      R1, R1, #2
         0xde: 0x4301         ORRS      R1, R1, R0
         0xe0: 0x4810         LDR.N     R0, [PC, #0x40]         ; [0x124] DIR0
         0xe2: 0x6001         STR       R1, [R0]
…
        0x124: 0xa0002000     DC32      DIR0
//
//
//
GPIO_DIR = (1<<9);  //improper method clobbers all bits of the register
//assembler translation:
         0xc2: 0x2080         MOVS      R0, #128                ; 0x80
         0xc4: 0x0080         LSLS      R0, R0, #2
         0xc6: 0x4911         LDR.N     R1, [PC, #0x44]         ; [0x10c] DIR0
         0xc8: 0x6008         STR       R0, [R1]
…
        0x10c: 0xa0002000     DC32      DIR0
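
The same read-modify-write idea extends to clearing bits and to replacing a multi-bit field. Here is a minimal sketch that runs on a PC. The helper names and the `fake_dir0` variable are made up for illustration; on the LPC812 the pointer would refer to the real register address instead of a plain variable.

```c
#include <stdint.h>

/* Stand-in for a memory-mapped register so the sketch runs on a PC.
   (Hypothetical: on real hardware this is a fixed peripheral address.) */
static volatile uint32_t fake_dir0 = 0;

/* Set the bits in mask, leave all other bits untouched. */
static inline void reg_set_bits(volatile uint32_t *reg, uint32_t mask) {
  *reg |= mask;
}

/* Clear the bits in mask, leave all other bits untouched. */
static inline void reg_clear_bits(volatile uint32_t *reg, uint32_t mask) {
  *reg &= ~mask;
}

/* Replace a multi-bit field: clear it first, then OR in the new value. */
static inline void reg_write_field(volatile uint32_t *reg, uint32_t mask,
                                   uint32_t value) {
  *reg = (*reg & ~mask) | (value & mask);
}
```

Calling `reg_set_bits(&fake_dir0, 1u << 9)` has the same effect as the `GPIO_DIR |= (1<<9)` line above: bit 9 is set and every other bit keeps its previous value.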

Flash, SRAM and ROM memory

The LPC81xM contains up to 16kB of flash program memory, up to 4kB of static RAM data memory, and 8kB of on-chip ROM. The ROM contains the boot loader, In-System Programming (ISP) and In-Application Programming (IAP) support for flash programming, profiles for configuring power consumption and PLL settings, USART driver API routines, and I2C-bus driver routines.

The Very Basic Memory Map

0x00000000 - 0x00004000: Flash program memory
0x10000000 - 0x10001000: SRAM memory
0x1FFF0000 - 0x1FFF2000: Boot ROM (8kB)
0x40000000 - 0x40070000: All APB peripherals
0x50004000 - 0x50008000: SCTimer/PWM
0xA0000000 - 0xA0008000: GPIO
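
As a quick sanity check, the map can be written as a table and the region sizes computed from it. This is only a sketch: the struct and function names are invented, and it assumes the second address on each line is the first address past the region, which is the reading that gives 16kB of flash, 4kB of SRAM and 8kB of ROM.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
  const char *name;
  uint32_t start;
  uint32_t end;   /* assumed exclusive: first address past the region */
} mem_region_t;

/* Addresses copied from the memory map above */
static const mem_region_t lpc81x_map[] = {
  { "Flash program memory", 0x00000000u, 0x00004000u },
  { "SRAM memory",          0x10000000u, 0x10001000u },
  { "Boot ROM",             0x1FFF0000u, 0x1FFF2000u },
  { "APB peripherals",      0x40000000u, 0x40070000u },
  { "SCTimer/PWM",          0x50004000u, 0x50008000u },
  { "GPIO",                 0xA0000000u, 0xA0008000u },
};

#define LPC81X_MAP_COUNT (sizeof lpc81x_map / sizeof lpc81x_map[0])

/* Size of a region in kB */
static uint32_t region_size_kb(const mem_region_t *r) {
  return (r->end - r->start) / 1024u;
}

/* Find the region holding addr, or NULL if it falls in a gap */
static const mem_region_t *region_for(uint32_t addr) {
  for (size_t i = 0; i < LPC81X_MAP_COUNT; i++) {
    if (addr >= lpc81x_map[i].start && addr < lpc81x_map[i].end) {
      return &lpc81x_map[i];
    }
  }
  return NULL;
}
```

For example, `region_for(0xA0002000u)` lands in the GPIO region, which matches the DIR0 address in the assembler listing earlier in this post.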

My next post about ARM Cortex-M0.


Arduino Mode0 SPI Bit Bang and Bare Metal Hardware SPI

bare metal truck

Here are two additional versions of the SPI program from my previous post. The first of these programs uses a “bare-metal” version of hardware SPI. The second is a bit-bang version using different pins.

How does the speed compare between the two versions? Not even close. The hardware SPI runs at 8MHz (half the system clock speed) and, on average, transfers one byte in 2.438us. The bit-bang version takes about 12.56us to transfer a byte. I timed the period during which the CS pin is pulled low. Note that under hardware SPI the delay between CS going low and the first clock pulse is 0.8125us, while in the bit-bang version the delay is approximately 2.125us (the y axis scale of the two screen captures is not the same).
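
To put those two measurements side by side, here is a small back-of-the-envelope sketch. The function names are mine, and the inputs are just the scope readings quoted above, so the results include the CS setup overhead and are not pure clocking time.

```c
/* How many times faster the quicker transfer is */
static double speedup(double slow_us, double fast_us) {
  return slow_us / fast_us;
}

/* Effective throughput for a given time per byte */
static double bytes_per_second(double us_per_byte) {
  return 1e6 / us_per_byte;
}

/* Time to clock out 8 bits at a given SCK rate, ignoring all overhead */
static double ideal_byte_time_us(double sck_hz) {
  return 8.0 * 1e6 / sck_hz;
}
```

With the numbers from this post, `speedup(12.56, 2.438)` comes out a little above 5, and `ideal_byte_time_us(8e6)` is 1us, so more than half of the measured 2.438us for hardware SPI is overhead around the eight clock pulses.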

Hardware SPI:
hardware spi

Bit Bang SPI:
bit bang spi

Bare-Metal Version:


//
//FM25CL64B SPI F-RAM
//64-Kbit
//
//using bare-metal hardware spi
//
/*
Arduino--Logic Conv--FRAM
D13------TXH/TXL-----6.SCK
D12------------------2.MISO
D11------TXH/TXL-----5.MOSI
D10------------------1.CS
3V3------LV
5V-------HV
GND------HV GND
GND------------------4.VSS
3V3------------------8.VCC
3V3------------------7.HOLD (tie to Vcc if not used)
3V3------[10KR]------1.CS
                     3.WP (active low, tie to Vcc if not used)
*/

#ifndef LSBFIRST
#define LSBFIRST 0
#endif
#ifndef MSBFIRST
#define MSBFIRST 1
#endif

#define CLOCK_DIV4 0x00
#define CLOCK_DIV16 0x01
#define CLOCK_DIV64 0x02
#define CLOCK_DIV128 0x03
#define CLOCK_DIV2 0x04
#define CLOCK_DIV8 0x05
#define CLOCK_DIV32 0x06
#define MODE0 0x00
#define MODE1 0x04
#define MODE2 0x08
#define MODE3 0x0C
#define MODE_MASK 0x0C // CPOL = bit 3, CPHA = bit 2 on SPCR
#define CLOCK_MASK 0x03 // SPR1 = bit 1, SPR0 = bit 0 on SPCR
#define CLOCKX2_MASK 0x01 // SPI2X = bit 0 on SPSR

//spi hardware transfer
inline static uint8_t SpiTransfer(uint8_t data) {
  SPDR = data;
  asm volatile("nop");
  while (!(SPSR & _BV(SPIF)))
    ; // wait
  return SPDR;
}

//SRAM opcodes
#define WREN 0b00000110 //set write enable latch
#define WRDI 0b00000100 //write disable
#define RDSR 0b00000101 //read status register
#define WRSR 0b00000001 //write status register
#define READ 0b00000011 //read memory data
#define WRITE 0b00000010 //write memory data

uint8_t SpiRAMRead8(uint16_t address) {
  uint8_t read_byte;

  PORTB &= ~(1<<PORTB2); //set CS low
  SpiTransfer(READ);
  //13-bit address MSB, LSB
  SpiTransfer((char)((address>>8)&0xff));
  SpiTransfer((char)address);
  read_byte = SpiTransfer(0xff);
  PORTB |= (1<<PORTB2); //set CS high
  return read_byte;
}

void SpiRAMWrite8(uint16_t address, uint8_t data) {
  PORTB &= ~(1<<PORTB2); //set CS low
  SpiTransfer(WREN);
  PORTB |= (1<<PORTB2); //set CS high
  PORTB &= ~(1<<PORTB2); //set CS low
  SpiTransfer(WRITE);
  //13-bit address MSB, LSB
  SpiTransfer((char)((address>>8)&0xff));
  SpiTransfer((char)address);
  SpiTransfer(data);
  PORTB |= (1<<PORTB2); //set CS high
}

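void setup(void) {
  uint16_t addr;
  uint8_t i, sreg;

  Serial.begin(9600);
  sreg = SREG;
  noInterrupts();

  //pin setup
  pinMode(10, OUTPUT); //CS
  pinMode(11, OUTPUT); //MOSI
  pinMode(12, INPUT); //MISO
  pinMode(13, OUTPUT); //SCK
  PORTB |= (1<<PORTB2); //set CS high

  //spi register setup
  //NOTE: most of this part was garbled in the post; only the tail of the
  //SPSR line survived. The rest is reconstructed, assuming master mode,
  //MSB first, mode 0 and fosc/2 (the 8MHz clock mentioned above)
  SPCR |= _BV(MSTR) | _BV(SPE); //master mode, enable spi
  SPCR &= ~_BV(DORD); //MSB first
  SPCR = (SPCR & ~MODE_MASK) | MODE0;
  SPCR = (SPCR & ~CLOCK_MASK) | (CLOCK_DIV2 & CLOCK_MASK);
  SPSR = (SPSR & ~CLOCKX2_MASK) | ((CLOCK_DIV2 >> 2) & CLOCKX2_MASK);
  SREG = sreg; //restore interrupt state

  //test it
  for (addr=0; addr<4; addr++) {
    SpiRAMWrite8(addr, (uint8_t)addr);
    Serial.print("Addr: ");
    Serial.print(addr);
    i = SpiRAMRead8(addr);
    Serial.print(" | Read: ");
    Serial.println((uint16_t)i);
  }
}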

void loop() { }

Bit Bang Version:

//
//FM25CL64B SPI F-RAM
//64-Kbit
//
//bit-bang
//
/*
Arduino--Logic Conv--FRAM
D7-------TXH/TXL-----6.SCK
D6-------------------2.MISO
D5-------TXH/TXL-----5.MOSI
D4-------------------1.CS
3V3------LV
5V-------HV
GND------HV GND
GND------------------4.VSS
3V3------------------8.VCC
3V3------------------7.HOLD (tie to Vcc if not used)
3V3------[10KR]------1.CS
                     3.WP (active low, tie to Vcc if not used)
*/

//bitbang
uint8_t SpiTransfer(uint8_t _data) {
  for (uint8_t bit=0; bit<8; bit++) {
    if (_data & 0x80) //set/clear mosi bit
      PORTD |= (1<<PORTD5);
    else
      PORTD &= ~(1<<PORTD5);
    _data <<= 1; //shift for next bit
    if (PIND & (1<<PIND6)) //capture miso bit
      _data |= 1;
    PORTD |= (1<<PORTD7); //pulse clock
    asm volatile ("nop \n\t"); //pause
    PORTD &= ~(1<<PORTD7);
  }
  return _data;
}

//SRAM opcodes
#define WREN 0b00000110 //set write enable latch
#define WRDI 0b00000100 //write disable
#define RDSR 0b00000101 //read status register
#define WRSR 0b00000001 //write status register
#define READ 0b00000011 //read memory data
#define WRITE 0b00000010 //write memory data

uint8_t SpiRAMRead8(uint16_t address) {
  uint8_t read_byte;

  PORTD &= ~(1<<PORTD4); //set CS low
  SpiTransfer(READ);
  //13-bit address MSB, LSB
  SpiTransfer((char)((address>>8)&0xff));
  SpiTransfer((char)address);
  read_byte = SpiTransfer(0xff);
  PORTD |= (1<<PORTD4); //set CS high
  return read_byte;
}

void SpiRAMWrite8(uint16_t address, uint8_t data) {
  PORTD &= ~(1<<PORTD4); //set CS low
  SpiTransfer(WREN);
  PORTD |= (1<<PORTD4); //set CS high
  PORTD &= ~(1<<PORTD4); //set CS low
  SpiTransfer(WRITE);
  //13-bit address MSB, LSB
  SpiTransfer((char)((address>>8)&0xff));
  SpiTransfer((char)address);
  SpiTransfer(data);
  PORTD |= (1<<PORTD4); //set CS high
}

void setup(void) {
  uint16_t addr;
  uint8_t i, sreg;

  Serial.begin(9600);
  //configure pins
  pinMode(4, OUTPUT); //CS
  pinMode(5, OUTPUT); //MOSI
  pinMode(6, INPUT); //MISO
  pinMode(7, OUTPUT); //SCK
  PORTD |= (1<<PORTD4); //set CS high
  PORTD &= ~_BV(PORTD7); //set clock low

  //test it
  for (addr=0; addr<32; addr++) {
    SpiRAMWrite8(addr, (uint8_t)addr);
    Serial.print("Addr: ");
    Serial.print(addr);
    i = SpiRAMRead8(addr);
    Serial.print(" | Read: ");
    Serial.println((uint16_t)i);
  }
}

void loop() { }
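
If you want to check the bit-bang shift logic without any hardware, the loop can be exercised on a PC. The sketch below is not the program above. It is a rough host-side model of a mode 0, MSB-first transfer with the pin accesses replaced by hypothetical callback hooks (`spi_pins_t` and the `sim_*` functions are made up), and a loopback "wire" that ties MOSI to MISO. Unlike the AVR version, it samples MISO after the rising clock edge.

```c
#include <stdint.h>

/* Hypothetical pin hooks so the loop can run without an AVR */
typedef struct {
  void (*set_mosi)(int level);
  int  (*get_miso)(void);
  void (*set_sck)(int level);
} spi_pins_t;

/* Mode 0, MSB first: present the bit while SCK is low, sample on the rise */
static uint8_t spi_mode0_transfer(const spi_pins_t *pins, uint8_t out) {
  uint8_t in = 0;
  for (uint8_t bit = 0; bit < 8; bit++) {
    pins->set_mosi((out & 0x80) != 0);   /* put MSB on MOSI */
    out <<= 1;                           /* line up the next bit */
    pins->set_sck(1);                    /* rising edge */
    in = (uint8_t)((in << 1) | (pins->get_miso() ? 1 : 0));
    pins->set_sck(0);                    /* falling edge */
  }
  return in;
}

/* Loopback simulation: MISO simply reads back whatever MOSI drives */
static int sim_wire = 0;
static void sim_set_mosi(int level) { sim_wire = level; }
static int  sim_get_miso(void)      { return sim_wire; }
static void sim_set_sck(int level)  { (void)level; }

static const spi_pins_t loopback_pins = {
  sim_set_mosi, sim_get_miso, sim_set_sck
};
```

With the loopback in place, `spi_mode0_transfer(&loopback_pins, 0xA5)` should hand back the same byte it sent, which is a cheap way to confirm the shift direction and bit count before going near a scope.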


Arduino and Cypress SPI FM25CL64B FRAM

lightning
The FM25CL64B is a 64K-bit ferroelectric RAM (F-RAM or FRAM) memory chip. Unlike typical flash and EEPROM memory, FRAM is capable of performing write operations at bus speed. According to the Cypress datasheet, this FRAM chip is capable of being clocked at up to 40MHz. I purchased a few SOIC-8 (150mils) chips for testing. I soldered the chip to a dipmicro SMT SOIC-to-DIP adapter PCB and kludged together a simple test program for my Arduino:

FM25CL64Ba

FM25CL64Bb

Since the chip is not 5V tolerant, I used a SparkFun 12009 level shifter to perform the 5V to 3V3 logic conversions. Here is how I connected the Arduino, FM25CL64B and level shifter:

Arduino--Logic Conv--FRAM
D13------TXH/TXL-----6.SCK
D12------------------2.MISO
D11------TXH/TXL-----5.MOSI
D10------------------1.CS
3V3------LV
5V-------HV
GND------HV GND
GND------------------4.VSS
3V3------------------8.VCC
3V3------------------7.HOLD (tie to Vcc if not used)
3V3------[10KR]------1.CS
                     3.WP (active low, tie to Vcc if not used)

Arduino program:

//
//FM25CL64B SPI F-RAM
//64-Kbit simple test
//
#include <SPI.h>

//SRAM opcodes
#define WREN  0b00000110 //set write enable latch
#define WRDI  0b00000100 //write disable
#define RDSR  0b00000101 //read status register
#define WRSR  0b00000001 //write status register
#define READ  0b00000011 //read memory data
#define WRITE 0b00000010 //write memory data
 
uint8_t SpiRAMRead8(uint16_t address) {
  uint8_t read_byte;
 
  PORTB &= ~(1<<PORTB2);              //set CS low
  SPI.transfer(READ);
  //13-bit address MSB, LSB
  SPI.transfer((char)((address>>8)&0xff));
  SPI.transfer((char)address);
  read_byte = SPI.transfer(0xFF);
  PORTB |= (1<<PORTB2);               //set CS high
  return read_byte;
}
 
void SpiRAMWrite8(uint16_t address, uint8_t data_byte) {
  PORTB &= ~(1<<PORTB2);              //set CS low
  SPI.transfer(WREN);
  PORTB |= (1<<PORTB2);               //set CS high
  PORTB &= ~(1<<PORTB2);              //set CS low
  SPI.transfer(WRITE);
  //13-bit address MSB, LSB
  SPI.transfer((char)((address>>8)&0xff));
  SPI.transfer((char)address);
  SPI.transfer(data_byte);
  PORTB |= (1<<PORTB2);               //set CS high
}
 
void setup(void) {
  uint16_t addr;
  uint8_t i;

  Serial.begin(9600);
  pinMode(10, OUTPUT);                //CS
  pinMode(11, OUTPUT);                //MOSI 
  pinMode(12, INPUT);                 //MISO
  pinMode(13, OUTPUT);                //SCK
  PORTB |= (1<<PORTB2);               //set CS high
  SPI.begin();
  SPI.setDataMode(SPI_MODE0);
  SPI.setBitOrder(MSBFIRST);
  SPI.setClockDivider (SPI_CLOCK_DIV2);
  for (addr=0; addr<32; addr++) {
    SpiRAMWrite8(addr, (uint8_t)addr);
    Serial.print("Addr: ");
    Serial.print(addr);
    i = SpiRAMRead8(addr);
    Serial.print(" | Read: ");
    Serial.println((uint16_t)i);
  }
}
 
void loop() { }
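
Both the read and the write start with the same three bytes on the wire: an opcode followed by the address, high byte first. Here is a small sketch of that framing that can be tested off-target. The function and macro names are my own. Masking the address to 13 bits is a defensive assumption on my part, based on 64K bits being 8,192 bytes; the program above sends the address unmasked.

```c
#include <stdint.h>
#include <stddef.h>

#define FRAM_OP_READ   0x03u   /* same value as READ above */
#define FRAM_OP_WRITE  0x02u   /* same value as WRITE above */
#define FRAM_ADDR_MASK 0x1FFFu /* 8,192 bytes -> 13 address bits */

/* Fill out[] with opcode, address MSB, address LSB; returns bytes written */
static size_t fram_build_header(uint8_t opcode, uint16_t address,
                                uint8_t out[3]) {
  address &= FRAM_ADDR_MASK;           /* keep only the 13 usable bits */
  out[0] = opcode;
  out[1] = (uint8_t)(address >> 8);    /* MSB first */
  out[2] = (uint8_t)(address & 0xFFu); /* then LSB */
  return 3;
}
```

Each byte of the header would then go out through `SPI.transfer()` while CS is held low, followed by the data byte for a write or a dummy byte for a read.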


Convert an ASCII String to Fixed Point: atofp()

conversion

Here is a small utility routine which converts an ASCII string floating point number into an s16.15 format fixed point number. Most fixed point libraries neglect this conversion. However, in practice, this routine is very useful. If the conversion process is not efficient, the gains of using fixed point over floating point math can be eliminated. Having said that, this is not pretty code and neither is it efficient. And it breaks a few coding rules too.

It is also important to note, the routine does very little (almost no) validity testing of the input values (size of integer/fixed point numbers, valid characters, sufficient string space, etc.). So there is plenty of opportunity here for spectacular failure.

The complementary conversion, fptoa() is also included.

//atol function ignores sign
int32_t _atol(const char* s) {
  int32_t v=0;
  
  while (*s == ' ' || (uint16_t)(*s - 9) < 5u) {
    ++s;
  }
  if (*s == '-' || *s == '+') {
    ++s;
  }
  while ((uint16_t)(*s - '0') < 10u) {
    v = v*10 + *s - '0';
    ++s;
  }
  return v;
}

#define MAX_STRING_SIZE 8

//basic string copy
static inline void _strcpy(char *d, const char *s) {
  uint8_t n=0;
  
  while (*s != '\0') {
    if (n++ >= MAX_STRING_SIZE) {
      //destination max size
      return;
    }
    *d++ = *s++;
  }
}

//basic string concatenation
void _concat(char *d, char *s) {
  uint8_t n=0;
  
  while(*d) {
    d++;
  }
  while(*s && n<MAX_STRING_SIZE) {
    *d++ = *s++;
    n++;
  }
  *d = '\0';
}

//int32_t atofp(char *)
int32_t FP_StrToFix(char *s) {
  int32_t f, fpw, fpf, bit, r[15] = {
    0x2faf080, 0x17d7840, 0xbebc20, 0x5f5e10, 0x02faf08, 0x017d784, 0x0bebc2, 0x05f5e1,
    0x002faf1, 0x0017d78, 0x00bebc, 0x005f5e, 0x0002faf, 0x00017d8, 0x000bec //0x0005f6
  };
  uint8_t sign, i;
  char *p=s, temp[9] = "00000000";

  sign = 0;
  //separate whole & fraction portions
  while (*p != '.') {
    //check for negative sign
    if (*p == '-') {
      sign = 1;
    }
    if (*p == '\0') {
      //no decimal found, return integer as fixed point
      return sign ? -(_atol(s)<<FP_FBITS) : (_atol(s)<<FP_FBITS);
    }
    p++;
  }

  //whole part
  *p = '\0';
  fpw = (_atol(s)<<FP_FBITS);

  //pad fraction part with trailing zeros
  _strcpy(temp, (p + 1));
  //get fraction
  f = _atol(temp);
  //re-insert decimal point
  *p = '.';

  fpf = 0;
  bit = 0x4000;
  //convert base10 fraction to fixed point base2
  for (i=0; i<15; i++) {
    if (f - r[i] >= 0) { //>= so exact fractions such as 0.5 set their bit
      f -= r[i];
      fpf += bit;
    }
    bit >>= 1;
  }

  //join fixed point whole and fractional parts
  return sign ? -(fpw + fpf) : (fpw + fpf);
}

//void fptoa(int32_t, char *)
void FP_FixToStr(int32_t f, char *s) {
  int32_t fp, bit=0x4000, r[16] = { 50000, 25000, 12500, 6250, 3125, 1563, 781, 391, 195, 98, 49, 24, 12, 6, 3 };
  int32_t d[5] = { 10000, 1000, 100, 10 };
  char *p=s, *sf, temp[12];
  uint8_t i;
  
  //get whole part
  fp = ktoi(f);
  if (fp == 0) {
    *p = '0';
    *(p + 1) = '\0'; //terminate so the end-of-string scan below stops
  } else {
    p = ltoa(fp, s, 10);
  }

  //get fractional part
  fp = FP_FRAC_PART(f);
  if (fp == 0) {
    return;
  }
  //iterate to end of string
  while (*p != '\0') p++;
  *p++ = '.'; //add decimal to end of s
  *p = '\0';  //terminate string
  
  f = 0;
  //convert fraction base 2 to base 10
  for (i=0; i<15; i++) {
    if (fp & bit) {
      f += r[i];
    }
    bit >>= 1;
  }
  //temporary string storage space
  sf = temp;
  sf = ltoa(f, sf, 10);
  
  // if needed, add leading zeros to fractional portion
  for (i=0; i<4; i++) {
    if (f < d[i]) {
      *p++ = '0';
      *p = '\0';
    } else {
      break;
    }
  }
  
  //combine whole & fractional parts
  _concat(s, sf);
}
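
For comparison, the fraction step of FP_StrToFix can also be done without the lookup table, by repeated doubling. This is a different technique from the one used above, shown only as a sketch. The function name is made up, it assumes the same 15 fraction bits, and like the original it reads at most 8 decimal digits and does no input validation.

```c
#include <stdint.h>

/* Convert the digits after the decimal point into a 15-bit binary fraction.
   The digits are read as an integer scaled by 10^8 (padded with trailing
   zeros), then doubled 15 times; each time the value reaches 10^8 the
   corresponding fraction bit is set. The result is truncated, not rounded. */
static int32_t frac_digits_to_q15(const char *digits) {
  int32_t f = 0;
  int32_t out = 0;
  int32_t bit = 0x4000;   /* weight of 0.5 in a 15-bit fraction */
  int i;

  /* read up to 8 digits, padding with zeros when the string is shorter */
  for (i = 0; i < 8; i++) {
    f *= 10;
    if (*digits >= '0' && *digits <= '9') {
      f += *digits++ - '0';
    }
  }
  /* f < 10^8, so f*2 always fits in an int32_t */
  for (i = 0; i < 15; i++) {
    f *= 2;
    if (f >= 100000000) {
      f -= 100000000;
      out |= bit;
    }
    bit >>= 1;
  }
  return out;
}
```

Passing `"5"` (for 0.5) gives 0x4000 and `"25"` (for 0.25) gives 0x2000. The trade-off against the table version is 15 doublings and compares instead of 15 table reads, with no rounding error baked into table entries.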