I'm seeing an issue on a small percentage of boards where the flash config appears to become corrupted in the field, which essentially bricks the device.
When this happens, the console starts looping with this message:
rst:0x10 (RTCWDT_RTC_RESET),boot:0x3b (SPI_FAST_FLASH_BOOT)
flash read err, 1000
ets_main.c 371
ets Jun 8 2016 00:22:57
I pulled a complete flash image off a device displaying this behavior using esptool and saved it. I can share this with Espressif if needed. It contains proprietary information, of course, so that will need to be done privately.
I tried reflashing individual sections only (the ota_data, the partition table, and the bootloader), and none of these fixed the issue.
I then ran `make flash`, which fixed the issue. Reflashing the original "bad" image with esptool.py puts the board back into the bricked state, so this does appear to be caused by something stored in flash.
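Since both a "bad" image and a working image can be read back with esptool, one way to narrow this down is to compare the two dumps and list the byte ranges that differ. The following is only a minimal sketch: the file names `good.bin` and `bad.bin` and the helper `diff_ranges` are hypothetical, and it assumes both dumps start at flash offset 0. NVS and ota_data are expected to differ between any two devices, so differences outside those regions are the interesting ones.

```python
import sys


def diff_ranges(good: bytes, bad: bytes, gap: int = 16):
    """Return (start, end) offsets (end exclusive) where the images differ.

    Differing bytes that are at most `gap` bytes apart are merged into
    a single range so the report stays readable.
    """
    ranges = []
    n = min(len(good), len(bad))
    start = None  # start of the range currently being built
    last = None   # offset of the most recent differing byte
    for i in range(n):
        if good[i] != bad[i]:
            if start is None:
                start = i
            elif i - last > gap:
                # Too far from the previous difference: close that range.
                ranges.append((start, last + 1))
                start = i
            last = i
    if start is not None:
        ranges.append((start, last + 1))
    if len(good) != len(bad):
        # Report the tail that exists in only one of the dumps.
        ranges.append((n, max(len(good), len(bad))))
    return ranges


if __name__ == "__main__":
    # Hypothetical usage: python diff_images.py good.bin bad.bin
    with open(sys.argv[1], "rb") as f:
        good_image = f.read()
    with open(sys.argv[2], "rb") as f:
        bad_image = f.read()
    for lo, hi in diff_ranges(good_image, bad_image):
        print(f"0x{lo:06x}-0x{hi:06x} ({hi - lo} bytes)")
```

The reported offsets can then be matched against the partition table to see which region (bootloader, partition table, NVS, ota_data, or an app slot) each difference falls in.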
I suspect this has to do with the "flash config" structure, but I have not had any luck finding more information about that. It seems that it is referenced from ROM, perhaps?
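If the "flash config" is the image header at the start of the bootloader, it could be checked directly in the dump. The sketch below assumes the default ESP32 bootloader offset of 0x1000 and the image header layout described in the esptool firmware image format documentation (magic byte 0xE9, segment count, SPI mode, a combined flash size/frequency byte, then the entry address). The helper name `parse_image_header` is hypothetical, and I have not confirmed that this header is what the ROM is complaining about.

```python
import struct

BOOTLOADER_OFFSET = 0x1000  # assumed: default ESP32 bootloader offset
IMAGE_MAGIC = 0xE9          # first byte of a valid image header

# Value tables for the ESP32 image header fields.
SPI_MODES = {0: "QIO", 1: "QOUT", 2: "DIO", 3: "DOUT"}
FLASH_SIZES = {0: "1MB", 1: "2MB", 2: "4MB", 3: "8MB", 4: "16MB"}
FLASH_FREQS = {0x0: "40MHz", 0x1: "26MHz", 0x2: "20MHz", 0xF: "80MHz"}


def parse_image_header(image: bytes, offset: int = BOOTLOADER_OFFSET) -> dict:
    """Decode the first 8 bytes of the image header found at `offset`."""
    magic, segments, spi_mode, size_freq, entry = struct.unpack_from(
        "<BBBBI", image, offset
    )
    return {
        "magic_ok": magic == IMAGE_MAGIC,
        "segments": segments,
        "spi_mode": SPI_MODES.get(spi_mode, f"unknown ({spi_mode})"),
        # High four bits: flash size. Low four bits: flash frequency.
        "flash_size": FLASH_SIZES.get(size_freq >> 4, f"unknown ({size_freq >> 4})"),
        "flash_freq": FLASH_FREQS.get(size_freq & 0x0F, f"unknown ({size_freq & 0x0F})"),
        "entry": entry,
    }


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python check_header.py bad.bin
    with open(sys.argv[1], "rb") as f:
        header = parse_image_header(f.read())
    for key, value in header.items():
        print(f"{key}: {hex(value) if key == 'entry' else value}")
```

A header that reads back as all 0xFF (erased) or all 0x00, or whose mode/size/frequency fields differ from those on a working board, would point to corruption in the first sector of the bootloader rather than in NVS or ota_data.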
I believe this occurs after the board has been power cycled repeatedly, but I don't have definitive proof of that.
My firmware does not use any flash writing features except for OTA and NVS.
So, my questions are:
1. Any ideas on how this could happen in the field? Could power cycling the module repeatedly cause some kind of flash corruption?
2. Would providing my "broken" image to Espressif help me debug this issue?
3. Are there things I can look for in the broken image that might lead me to understand what went wrong?
Thanks,
Jason