Writing code robust to SD card failures
Posted: Mon Dec 13, 2021 12:05 pm
Hello ESP community,
I'd like to share my code and ideas on how to continuously read a sensor and write the measurements to an SD card in a robust way. Robustness means that:
1) the sensor sampling intervals are not affected by slow SD card I/O operations;
2) all measurements are guaranteed to be saved to an SD card in the correct order.
The post is based on the SD lib issue in arduino-esp32 https://github.com/espressif/arduino-esp32/issues/5998 although the code was originally written in ESP-IDF.
The first criterion calls for a thread-based implementation, with one thread for reading (polling) the sensor and at least one thread for dumping the measurements to the SD card. Here is the framework I've outlined for doing this:
- Start two threads: read_sensor (the sender) pinned to core 0 and write_data (the receiver) pinned to core 1. Swapping the cores would be preferable, because some system code hidden from the developer always runs on core 0, but I found that some commands and libraries, such as Arduino Wire, don't like being run on core 1.
- The "read_sensor" thread is sensor-specific, but if the data needs to be sampled at a high frequency (> 1000 Hz), you can't use the vTaskDelay() function in the thread, because its resolution is limited to one FreeRTOS tick. Instead, I:
  - start a timer with the period I need (say, 500 us);
  - call "xTaskNotifyGive(read_sensor_task_handle)" when the timer fires;
  - wait on "ulTaskNotifyTake(pdTRUE, portMAX_DELAY)" inside the "read_sensor" task body.
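To illustrate why vTaskDelay() cannot pace a loop faster than the tick rate, here is a minimal host-side sketch of the tick conversion. It assumes the same truncating integer arithmetic as FreeRTOS's pdMS_TO_TICKS(); the helper name period_us_to_ticks and the tick_rate_hz parameter (standing in for configTICK_RATE_HZ) are illustrative, not part of FreeRTOS.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a period in microseconds to whole FreeRTOS ticks, truncating
 * the same way pdMS_TO_TICKS() does for milliseconds.
 * tick_rate_hz is an assumed stand-in for configTICK_RATE_HZ. */
static uint32_t period_us_to_ticks(uint32_t period_us, uint32_t tick_rate_hz)
{
    assert(tick_rate_hz > 0);
    return (uint32_t)(((uint64_t)period_us * tick_rate_hz) / 1000000u);
}
```

With a 1000 Hz tick, a 500 us period truncates to zero ticks, so a delay-based loop has no way to express it. That is why a separate timer plus a task notification is used instead.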
In either case, the measurements should be sent to the receiving thread ("write_data") via the "xQueueSend()" function.
- The receiver, "write_data", waits for messages with the "xQueueReceive()" function and accumulates them in a temporary buffer array. Ideally, the size of this array should be a multiple of the SD card sector size, which is 512 bytes, but I found that an exact match is not crucial. Once the array is filled with the desired number of samples, a write is issued to a previously opened file. I spent a whole week trying different options for the Arduino SD lib, and here is the pseudo-code I've come up with:
Code:
#include <stdio.h>
#include <unistd.h>   // fsync()
#include "SD.h"
#include "esp_log.h"
#include "esp_timer.h"

// sometimes we can get away without a (long) SD card restart
#define SD_WAIT_UNTIL_RESTART_MS 100

static const char *TAG = "sd_writer";   // log tag; the name is arbitrary

// Record and RECORDS_BUFFER_SIZE are application-specific
typedef struct Record ...

static FILE* open_file() {
    static int trial = 0;
    char fpath[128];
    snprintf(fpath, sizeof(fpath), "/sd/file-%03d.BIN", trial++);
    FILE *file = fopen(fpath, "w");
    int64_t t_last = esp_timer_get_time();
    while (file == NULL) {
        int64_t t_curr = esp_timer_get_time();
        if (t_curr - t_last > SD_WAIT_UNTIL_RESTART_MS * 1000L) {
            // this is Arduino-specific code to restart the SD card
            SD.end();
            while (!SD.begin()) delay(10);
            t_last = t_curr;
        }
        // a delay is needed to keep the watchdog happy
        // and to let other threads do their work
        vTaskDelay(pdMS_TO_TICKS(10));
        file = fopen(fpath, "w");
    }
    return file;
}

// returns true only if the whole buffer was written, flushed and synced
static bool write_records(const Record *records, FILE *file) {
    size_t written_cnt = fwrite(records, sizeof(Record), RECORDS_BUFFER_SIZE, file);
    return written_cnt == RECORDS_BUFFER_SIZE
        && fflush(file) == 0
        && fsync(fileno(file)) == 0;
}

// FreeRTOS task body, hence the unused void* argument
static void write_data(void *arg) {
    Record records[RECORDS_BUFFER_SIZE];
    FILE *file = open_file();
    while (1) {
        // fill in the records here with the xQueueReceive
        // ...
        // On any failure, rewrite the WHOLE buffer to a fresh file: records that
        // only reached the stdio buffer of the old file may never hit the card.
        // The old file may therefore end with a partial copy of this buffer.
        while (!write_records(records, file)) {
            ESP_LOGE(TAG, "fwrite failed. Reopening...");
            fclose(file);
            file = open_file();
        }
        vTaskDelay(pdMS_TO_TICKS(10));
    }
}
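The retry logic can be checked on a PC before it goes anywhere near real hardware. Below is a minimal sketch that assumes the rule "after any failed write, reopen and rewrite the whole block". FakeCard, card_write, card_reopen and write_block_with_retry are hypothetical stand-ins for the SD card, for fwrite + fflush + fsync, and for fclose + open_file(); none of them belong to any library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the SD card with injected write failures. */
typedef struct {
    unsigned char data[256]; /* contents of the currently open "file" */
    size_t len;              /* bytes stored in the current file */
    int failures_left;       /* how many upcoming writes will fail */
    int reopen_count;        /* how many times a fresh file was opened */
} FakeCard;

/* Mimics fwrite + fflush + fsync: true only if the whole block was stored.
 * A failed write leaves half of the block behind, like a torn write. */
static bool card_write(FakeCard *card, const void *buf, size_t len)
{
    assert(card->len + len <= sizeof(card->data));
    if (card->failures_left > 0) {
        card->failures_left--;
        memcpy(card->data + card->len, buf, len / 2);
        card->len += len / 2;
        return false;
    }
    memcpy(card->data + card->len, buf, len);
    card->len += len;
    return true;
}

/* Mimics fclose + open_file(): continue in a new, empty file. */
static void card_reopen(FakeCard *card)
{
    card->len = 0;
    card->reopen_count++;
}

/* Rewrite the whole block after every failure; returns the attempt count. */
static int write_block_with_retry(FakeCard *card, const void *buf, size_t len)
{
    int attempts = 1;
    while (!card_write(card, buf, len)) {
        card_reopen(card);
        attempts++;
    }
    return attempts;
}
```

Rewriting the whole block trades possible duplicates (a partial copy at the end of the abandoned file) for the guarantee that the new file always holds complete blocks in order. A post-processing step would need to discard the torn tail of any file that was abandoned.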
Suggestions? How are you handling file IO errors?
Best,
Danylo