FAT FS on External Flash example

derricksenva
Posts: 21
Joined: Tue Aug 02, 2022 8:19 pm

Postby derricksenva » Mon Nov 13, 2023 7:26 pm

We're currently prototyping the addition of an external flash chip (Winbond W25Q128JV) to an ESP32-PICO-V3-02, based on the FAT FS on External Flash example. The manufacturer ID, device ID, and memory size are read correctly during initialization, but the task watchdog eventually times out on the idle task when we attempt to erase all 16 Mbytes of memory in 64-kbyte blocks. According to the data sheet, the flash would need more than 40 seconds to erase all blocks, but the task watchdog triggers about 5 seconds into the call to:

ESP_ERROR_CHECK(esp_partition_erase_range(fat_partition, offset, ext_flash->size));

I've tried adjusting CONFIG_SPI_FLASH_ERASE_YIELD_TICKS to 20 ticks and CONFIG_SPI_FLASH_ERASE_YIELD_DURATION_MS to 150 ms, but I didn't notice any difference. An SPI capture still shows the ESP32 continuously polling the flash chip for a "write finished" status. I'm confused by the descriptions as to what these two config options actually do. It doesn't seem like the CPU yields at all to allow the idle task to run.

Is the problem that the SPI flash driver is intended for code storage and not data storage? Is this not configurable? ESP-IDF uses it for the internal SPI flash chip used for code, and the SPI Flash API documentation says:
"OS functions can also help to avoid a watchdog timeout when erasing large flash areas. During this time, the CPU is occupied with the flash erasing task. This stops other tasks from being executed. Among these tasks is the idle task to feed the watchdog timer (WDT)."

This seems to line up with what we're observing. Since we're only using this external flash for data and it's not directly memory mapped by the MMU, we have no requirement to stop every task from running. We simply need our read/write/erase calls to block the calling task until they finish (or provide a callback when they finish). Should we be using a different API or configuring/customizing the current one for our application? Any guidance would be appreciated.
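
As a quick sanity check on the numbers in this post, the arithmetic can be written out in a few lines of C. This is only a back-of-the-envelope sketch: the helper names are made up for illustration, the 40-second total is the lower bound quoted above rather than a measured value, and the 5-second figure is the approximate point at which the watchdog fired.

```c
#include <assert.h>

/* Number of erase blocks in a flash of the given size. */
unsigned erase_block_count(unsigned long flash_bytes, unsigned long block_bytes)
{
    assert(block_bytes > 0);
    return (unsigned)(flash_bytes / block_bytes);
}

/* Average time per block if the whole erase takes total_erase_ms. */
unsigned avg_ms_per_block(unsigned total_erase_ms, unsigned blocks)
{
    assert(blocks > 0);
    return total_erase_ms / blocks;
}

/* How many whole blocks finish before a timeout of timeout_ms expires. */
unsigned blocks_before_timeout(unsigned timeout_ms, unsigned ms_per_block)
{
    assert(ms_per_block > 0);
    return timeout_ms / ms_per_block;
}
```

With the figures from this thread, 16 Mbytes in 64-kbyte blocks is 256 blocks, 40 seconds spread over 256 blocks is about 156 ms per block, and only around 32 blocks (2 Mbytes) can be erased inside a 5-second window. A single uninterrupted call covering the whole chip therefore cannot finish before the watchdog fires.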

For reference, this is the output log dump:

I (0) cpu_start: App cpu up.
I (310) cpu_start: Pro cpu start user code
I (310) cpu_start: cpu freq: 160000000 Hz
I (310) cpu_start: Application information:
I (315) cpu_start: Project name:     ext_flash_fatfs
I (321) cpu_start: App version:      1
I (325) cpu_start: Compile time:     Nov 13 2023 09:25:18
I (331) cpu_start: ELF file SHA256:  e756ddec5326d3ed...
I (337) cpu_start: ESP-IDF:          v5.0.3-1-ga1b780f6a4
I (343) cpu_start: Min chip rev:     v3.0
I (348) cpu_start: Max chip rev:     v3.99
I (353) cpu_start: Chip rev:         v3.0
I (358) heap_init: Initializing. RAM available for dynamic allocation:
I (365) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (371) heap_init: At 3FFB2920 len 0002D6E0 (181 KiB): DRAM
I (377) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (383) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (390) heap_init: At 4008C324 len 00013CDC (79 KiB): IRAM
I (398) spi_flash: detected chip: generic
I (401) spi_flash: flash io: dio
W (405) flash_encrypt: Flash encryption mode is DEVELOPMENT (not secure)
I (413) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.
I (423) example: Initializing external SPI Flash
I (433) example: Pin assignments:
I (433) example: MOSI:  8   MISO: 34   SCLK:  5   CS:  2
I (443) example: DMA CHANNEL: 3
I (443) spi_flash: detected chip: winbond
I (443) spi_flash: flash io: fastrd
I (453) example: Initialized external Flash, size=16384 KB, ID=0xef4018
I (463) example: Adding external Flash as a partition, label="storage", size=16384 KB
I (463) example: Erasing partition range, offset=0 size=16384 KB
E (5423) task_wdt: Task watchdog got triggered. The following tasks/users did not reset the watchdog in time:
E (5423) task_wdt:  - IDLE (CPU 0)
E (5423) task_wdt: Tasks currently running:
E (5423) task_wdt: CPU 0: main
E (5423) task_wdt: CPU 1: IDLE
E (5423) task_wdt: Print CPU 0 (current core) backtrace


Backtrace: 0x400DF2DF:0x3FFB0E30 0x400DF466:0x3FFB0E50 0x40082775:0x3FFB0E70 0x40089294:0x3FFB49D0 0x4008B166:0x3FFB4A10 0x4008BBD7:0x3FFB4A50 0x4008BC25:0x3FFB4A70 0x4008C02B:0x3FFB4AA0 0x400845E5:0x3FFB4AC0 0x400DE77B:0x3FFB4B00 0x400D61D1:0x3FFB4B20 0x400D62C4:0x3FFB4B60 0x400F1ABA:0x3FFB4C20 0x40088AE1:0x3FFB4C50
0x400df2df: task_wdt_timeout_handling at C:/Espressif/frameworks/esp-idf-v5.0.3/components/esp_system/task_wdt/task_wdt.c:461 (discriminator 3)

0x400df466: task_wdt_isr at C:/Espressif/frameworks/esp-idf-v5.0.3/components/esp_system/task_wdt/task_wdt.c:585

0x40082775: _xt_lowint1 at C:/Espressif/frameworks/esp-idf-v5.0.3/components/freertos/FreeRTOS-Kernel/portable/xtensa/xtensa_vectors.S:1118

0x40089294: spi_flash_ll_set_usr_address at C:/Espressif/frameworks/esp-idf-v5.0.3/components/hal/esp32/include/hal/spi_flash_ll.h:360
 (inlined by) spi_flash_hal_common_command at C:/Espressif/frameworks/esp-idf-v5.0.3/components/hal/spi_flash_hal_common.inc:147

0x4008b166: memspi_host_read_status_hs at C:/Espressif/frameworks/esp-idf-v5.0.3/components/spi_flash/memspi_host_driver.c:120

0x4008bbd7: spi_flash_chip_generic_read_reg at C:/Espressif/frameworks/esp-idf-v5.0.3/components/spi_flash/spi_flash_chip_generic.c:404

0x4008bc25: spi_flash_chip_generic_wait_idle at C:/Espressif/frameworks/esp-idf-v5.0.3/components/spi_flash/spi_flash_chip_generic.c:454

0x4008c02b: spi_flash_chip_winbond_erase_block at C:/Espressif/frameworks/esp-idf-v5.0.3/components/spi_flash/spi_flash_chip_winbond.c:142

0x400845e5: esp_flash_erase_region at C:/Espressif/frameworks/esp-idf-v5.0.3/components/spi_flash/esp_flash_api.c:615
0x400de77b: esp_partition_erase_range at C:/Espressif/frameworks/esp-idf-v5.0.3/components/esp_partition/partition_target.c:134

0x400d61d1: example_add_partition at C:/ext_flash_fatfs/main/ext_flash_fatfs_example_main.c:179 (discriminator 13)

0x400d62c4: app_main at C:/ext_flash_fatfs/main/ext_flash_fatfs_example_main.c:77

0x400f1aba: main_task at C:/Espressif/frameworks/esp-idf-v5.0.3/components/freertos/FreeRTOS-Kernel/portable/port_common.c:131 (discriminator 2)

0x40088ae1: vPortTaskWrapper at C:/Espressif/frameworks/esp-idf-v5.0.3/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:154

MicroController
Posts: 1708
Joined: Mon Oct 17, 2022 7:38 pm
Location: Europe, Germany

Re: FAT FS on External Flash example

Postby MicroController » Mon Nov 13, 2023 10:36 pm

ESP_ERROR_CHECK(esp_partition_erase_range(fat_partition, offset, ext_flash->size));

I'd suggest erasing in multiple smaller ranges instead of all at once, for example:

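const size_t page_size = 4096;  // smallest erasable unit (4 KiB sector)
size_t offset = 0;
size_t pagesLeft = ext_flash->size / page_size;
while (pagesLeft) {
  // Erase at most 32 sectors (128 KiB) per call so that no single call runs for long.
  const size_t pages = (pagesLeft >= 32) ? 32 : pagesLeft;
  ESP_ERROR_CHECK(esp_partition_erase_range(fat_partition, offset, pages*page_size));
  offset += pages*page_size;
  pagesLeft -= pages;
  vTaskDelay(1);  // block for one tick so the idle task can run and feed the watchdog
}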
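
The same idea can be factored into a small helper that can be exercised off-target. The sketch below is plain C with no ESP-IDF dependency; `erase_in_chunks`, the callback types, and the counting stubs are hypothetical names introduced here for illustration. On the target, the erase callback would presumably wrap esp_partition_erase_range and the yield callback would call vTaskDelay(1).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback types standing in for the real erase and yield calls. */
typedef int  (*erase_fn_t)(void *ctx, size_t offset, size_t len);
typedef void (*yield_fn_t)(void *ctx);

/* Erase [0, total) in pieces of at most chunk bytes, yielding after each piece.
 * Returns 0 on success, or the first non-zero code returned by erase_fn. */
int erase_in_chunks(size_t total, size_t chunk,
                    erase_fn_t erase_fn, yield_fn_t yield_fn, void *ctx)
{
    assert(chunk > 0 && erase_fn != NULL);
    size_t offset = 0;
    while (offset < total) {
        size_t len = total - offset;
        if (len > chunk) {
            len = chunk;            /* the last piece may be shorter */
        }
        int err = erase_fn(ctx, offset, len);
        if (err != 0) {
            return err;             /* stop at the first failure */
        }
        offset += len;
        if (yield_fn != NULL) {
            yield_fn(ctx);          /* let lower-priority tasks run */
        }
    }
    return 0;
}

/* Host-side stubs that only count what would have happened. */
struct erase_stats {
    size_t calls;   /* number of erase calls issued */
    size_t bytes;   /* total bytes covered by those calls */
    size_t yields;  /* number of yields between calls */
};

int count_erase(void *ctx, size_t offset, size_t len)
{
    struct erase_stats *s = ctx;
    (void)offset;
    s->calls++;
    s->bytes += len;
    return 0;
}

void count_yield(void *ctx)
{
    struct erase_stats *s = ctx;
    s->yields++;
}
```

Unlike the loop above, this version returns an error code instead of aborting through ESP_ERROR_CHECK, which leaves the caller free to retry or report the failure.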

Re: FAT FS on External Flash example

Postby derricksenva » Tue Nov 14, 2023 5:26 pm

MicroController wrote (Mon Nov 13, 2023 10:36 pm):
"I'd suggest trying to erase in multiple smaller ranges instead of all at once"

Your suggestion does fix the immediate problem. The example works now that we're manually ensuring that no single call to esp_partition_erase_range lasts longer than the task watchdog timeout.

Our underlying concern remains, though. If the low-level flash erase calls block all tasks from executing, then we're considering either modifying the SPI Flash API provided by ESP-IDF or developing a different solution. Modifying the current API seems like it would be difficult, since it's also used for the internal flash that code is executed from. But since there are a lot of underlying layers to this API, I'm wondering if there's a way to use or modify it that suits an external flash used only for data storage (i.e. one that does not block all tasks from executing).

Re: FAT FS on External Flash example

Postby MicroController » Tue Nov 14, 2023 6:10 pm

derricksenva wrote:
"Modifying the current API seems like it would be difficult since it's also already used for the internal flash where code is executed from"

If you access the external flash over the same SPI bus that the internal flash uses, I'm not sure that's even possible. If the CPU kept running other tasks, it would likely have to fetch instructions from the internal flash at some point, which won't work while the SPI bus is in use for the external flash.
However, I too think that CONFIG_SPI_FLASH_ERASE_YIELD_TICKS and CONFIG_SPI_FLASH_ERASE_YIELD_DURATION_MS should actually resolve the issue. Not sure why they didn't in your case.

(esp_flash_erase_region(...) seems to do just an erase-yield loop as well.)

Re: FAT FS on External Flash example

Postby derricksenva » Wed Nov 15, 2023 5:32 pm

MicroController wrote (Tue Nov 14, 2023 6:10 pm):
"If you access the external flash over the same SPI bus which the internal flash uses I'm not sure that's even possible. If the CPU kept running other tasks it would likely have to fetch instructions from the internal flash at some point, which won't work while the SPI bus is in use for the external flash.
However, I too think that CONFIG_SPI_FLASH_ERASE_YIELD_TICKS and CONFIG_SPI_FLASH_ERASE_YIELD_DURATION_MS should actually resolve the issue. Not sure why it didn't in your case.

(esp_flash_erase_region(...) seems to do just an erase-yield loop as well.)"

I should have clarified: we're using a different bus, SPI3/VSPI, for this external flash.

I took a brief look at the source where CONFIG_SPI_FLASH_ERASE_YIELD_TICKS and CONFIG_SPI_FLASH_ERASE_YIELD_DURATION_MS are referenced, but couldn't immediately tell how a CPU yield is supposedly accomplished. I guess I can spend a little more time looking at that.

I was hoping someone from Espressif could also at least comment or give some guidance in case I should be looking elsewhere. The initial idea to explore the FAT option for external flash was mostly driven by the fact that we're already using that API and would prefer to reuse components rather than add new complexity. We currently use a FAT partition on internal flash for separate data storage, but that use case specifically required the VFS interface, built-in encryption, and wear leveling, and the partition is smaller than 1 Mbyte. For this external flash, we really only require wear leveling, but having a VFS interface would be very helpful. Maybe we should consider using SPIFFS instead? Outside of ESP-IDF, I've seen some other users in this forum suggest littlefs. Maybe that could be paired with the VFS component?

Re: FAT FS on External Flash example

Postby MicroController » Wed Nov 15, 2023 6:29 pm

derricksenva wrote:
"how a CPU yield is supposedly accomplished"

vTaskDelay(...) is how this kind of yielding is done.

derricksenva wrote:
"Maybe we should consider using SPIFFS instead?"

The choice of file system will hardly have any influence on whether or not flash writes (need to) disable task scheduling.
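
As a rough illustration of how the two options discussed in this thread would be expected to interact with vTaskDelay-style yielding, here is a small host-side model in C. It is a sketch based on the option names and their descriptions, not the ESP-IDF implementation: the function `model_erase_with_yield`, the result struct, and the tick period passed in are all assumptions made for illustration. The behaviour modelled is that erase time accumulates until it reaches CONFIG_SPI_FLASH_ERASE_YIELD_DURATION_MS, at which point the calling task blocks for CONFIG_SPI_FLASH_ERASE_YIELD_TICKS ticks.

```c
#include <assert.h>

/* Outcome of the simulated erase. */
struct yield_model_result {
    unsigned yields;      /* times the erasing task gave up the CPU */
    unsigned busy_ms;     /* time spent erasing */
    unsigned yielded_ms;  /* time spent blocked so other tasks could run */
};

/* Simulate erasing `blocks` blocks of ms_per_block each. After every block,
 * if at least yield_duration_ms of erase time has built up since the last
 * yield, the task blocks for yield_ticks ticks of ms_per_tick each. */
struct yield_model_result model_erase_with_yield(unsigned blocks,
                                                 unsigned ms_per_block,
                                                 unsigned yield_duration_ms,
                                                 unsigned yield_ticks,
                                                 unsigned ms_per_tick)
{
    assert(ms_per_block > 0 && ms_per_tick > 0);
    struct yield_model_result r = {0, 0, 0};
    unsigned since_yield_ms = 0;

    for (unsigned i = 0; i < blocks; i++) {
        r.busy_ms += ms_per_block;
        since_yield_ms += ms_per_block;
        if (since_yield_ms >= yield_duration_ms) {
            since_yield_ms = 0;
            r.yields++;
            r.yielded_ms += yield_ticks * ms_per_tick;
        }
    }
    return r;
}
```

With the settings tried earlier in the thread (150 ms and 20 ticks), roughly 156 ms per 64-kbyte block (40 seconds spread over 256 blocks), and an assumed 10 ms tick, this model yields after every block, i.e. 256 times over the whole chip. That is the behaviour one would expect if the options were taking effect, which is not what the SPI capture showed.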

Re: FAT FS on External Flash example

Postby derricksenva » Wed Nov 15, 2023 6:50 pm

MicroController wrote (Wed Nov 15, 2023 6:29 pm), in reply to "how a CPU yield is supposedly accomplished":
"vTaskDelay(...) is how this kind of yielding is done."

You're right. That is why I'm confused that the original example code still triggered the watchdog timeout when erasing all 16 Mbytes (in blocks of 64 kbytes) in a single call. The yield appears to be pointless if every task is still forcibly suspended, as the documentation says.

MicroController wrote, in reply to "Maybe we should consider using SPIFFS instead?":
"The choice of file system will hardly have any influence on whether or not flash writes (need to) disable task scheduling."

Yes. I was just wondering whether ESP-IDF's SPIFFS implementation is also built on top of the same SPI Flash API with no adjustments. In that case, it would have the same issue of suspending all tasks during erase operations.
