ESP32-WROOM first stage bootloader failure

JamesZT
Posts: 4
Joined: Fri Aug 27, 2021 2:10 am

ESP32-WROOM first stage bootloader failure

Postby JamesZT » Fri Aug 27, 2021 2:44 am

One of our development boards that gets a fair amount of abuse has recently developed a boot failure:
ets Jul 29 2019 12:21:46

rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0xff001cff,len:267327
1150 mmu set 00010000, pos 00010000
1150 mmu set 00020000, pos 00020000
1150 mmu set 00030000, pos 00030000
1150 mmu set 00040000, pos 00040000
ho 0 tail 11 room 5
load:0x00201929,len:36306944
ets Jul 29 2019 12:21:46

rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0xff001cff,len:267327
1150 mmu set 00010000, pos 00010000
1150 mmu set 00020000, pos 00020000
1150 mmu set 00030000, pos 00030000
1150 mmu set 00040000, pos 00040000
ho 0 tail 11 room 5
load:0x00201929,len:36306944

It looks like the address that the second stage should be loaded to has been read incorrectly, offset by one byte.
Reporting:
load:0xff001cff,len:267327

Should be:
load:0x3fff001c,len:1044

This device has been subjected to brownout situations with CHIP_PU high, and this may be the root cause based on others' experiences:
https://github.com/espressif/esp-idf/issues/4968

Using esptool, I have re-flashed this device with a full flash image taken from a working device, and also compared all the eFuse settings between the two, but the issue remains.

Whilst I am not too concerned about fixing this specific device, I would like to understand the failure mechanism and whether this state is recoverable. I suspect we have production units in the field that have suffered a similar fate, and we are looking to get them back for forensics.

Any insights would be most appreciated.

WiFive
Posts: 3529
Joined: Tue Dec 01, 2015 7:35 am

Re: ESP32-WROOM first stage bootloader failure

Postby WiFive » Fri Aug 27, 2021 6:30 am

The read address of the second segment header is off by one byte, but it is not actually reading any corrupt bytes. That is strange, because it read the first segment correctly. If you dumped the flash and it matches the image, then I would put a logic analyzer or scope on the SPI lines.
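
For anyone repeating the dump-versus-image check, a minimal sketch in Python follows. The helper name `first_mismatch`, the file names, and the 0x1000 bootloader offset used in `compare_files` are assumptions for illustration (0x1000 matches the dump address quoted in this thread); adjust them to your own layout.

```python
from typing import Optional


def first_mismatch(dump: bytes, image: bytes) -> Optional[int]:
    """Return the offset of the first byte where dump differs from image.

    Only the range covered by image is compared. If the dump is shorter
    than the image, the offset where the dump ends is reported.
    Returns None when every compared byte matches.
    """
    for off, (a, b) in enumerate(zip(dump, image)):
        if a != b:
            return off
    return len(dump) if len(dump) < len(image) else None


def compare_files(dump_path: str, image_path: str, flash_offset: int = 0x1000) -> Optional[int]:
    """Compare an image file against a full flash dump at flash_offset.

    Both paths are hypothetical examples, e.g. "flash_dump.bin" and
    "bootloader.bin". The returned offset is relative to the image start.
    """
    with open(dump_path, "rb") as f:
        dump = f.read()
    with open(image_path, "rb") as f:
        image = f.read()
    return first_mismatch(dump[flash_offset:], image)
```

A result of `None` means the flash contents match the image over its full length, which would point away from stored-data corruption and toward the read path.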

Re: ESP32-WROOM first stage bootloader failure

Postby JamesZT » Fri Aug 27, 2021 6:53 am

Thanks WiFive, good suggestion.

Yes, I've dumped the second stage header from flash and it looks fine to me.

Code:

00001000: e904 024f a806 0840 ee00 0000 0000 0000  ...O...@........
00001010: 0000 0000 0000 0001 1800 ff3f 0400 0000  ...........?....
00001020: ffff ffff 1c00 ff3f 1404 0000 0000 0000  .......?........
00001030: 0000 0080 0000 00a0 0000 00c0 0000 00e0  ................

It's as if a bit in ROM has been flipped on the header location address to increment it by one. The above behavior is now 100% repeatable for this device, so it doesn't appear to be a random timing issue or edge case.

Re: ESP32-WROOM first stage bootloader failure

Postby WiFive » Fri Aug 27, 2021 5:06 pm

It is decremented by 1: it reads from 0x1023 instead of 0x1024, which is more than a 1-bit flip. It is also not a fixed address; it is based on len(segment_1).
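
This can be checked against the hex dump posted earlier in the thread. The sketch below assumes the standard ESP32 image layout (a 24-byte image header followed by segments, each with an 8-byte little-endian header of load address and length), which is consistent with the dump. The function and constant names are illustrative only.

```python
import struct

# Flash bytes 0x1000..0x102f, copied from the dump above (raw byte order).
DUMP = bytes.fromhex(
    "e904024fa8060840ee00000000000000"
    "00000000000000011800ff3f04000000"
    "ffffffff1c00ff3f1404000000000000"
)

IMAGE_HEADER_LEN = 24    # assumed size of the image header
SEGMENT_HEADER_LEN = 8   # load address (4 bytes) + data length (4 bytes)


def read_segment_header(image: bytes, offset: int):
    """Decode a little-endian (load_addr, data_len) pair at offset."""
    return struct.unpack_from("<II", image, offset)


# The first segment header sits directly after the image header.
seg1_off = IMAGE_HEADER_LEN
seg1 = read_segment_header(DUMP, seg1_off)

# The second header follows the first segment's data, so its position
# depends on len(segment_1) rather than being a fixed address.
seg2_off = seg1_off + SEGMENT_HEADER_LEN + seg1[1]

seg2_good = read_segment_header(DUMP, seg2_off)      # read at 0x1024
seg2_bad = read_segment_header(DUMP, seg2_off - 1)   # read at 0x1023

for name, (addr, length) in (("seg1", seg1), ("good", seg2_good), ("bad", seg2_bad)):
    print(f"{name}: load:0x{addr:08x},len:{length}")
```

Reading one byte early pulls in the trailing 0xff of the first segment's data, which reproduces the `load:0xff001cff,len:267327` line in the failing log, while the read at the expected offset gives `load:0x3fff001c,len:1044`.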

Re: ESP32-WROOM first stage bootloader failure

Postby JamesZT » Fri Aug 27, 2021 9:46 pm

Good catch, yes, decremented by 1. Does anyone know if the source for the first stage bootloader is available?

Re: ESP32-WROOM first stage bootloader failure

Postby WiFive » Sat Aug 28, 2021 11:42 pm

It is not. I would first capture the SPI transaction to see if it is sending the correct read address; then you know whether to blame the ESP32 or the flash chip.

Re: ESP32-WROOM first stage bootloader failure

Postby JamesZT » Mon Sep 06, 2021 1:33 am

I'll have to see if we can get the shield off to get to the SPI bus. This is the ESP32-WROOM-32UE variant with onboard external flash and the SPI bus is not connected to the package's pins. Sorry, I should have picked up on this earlier.

phatpaul
Posts: 110
Joined: Fri Aug 24, 2018 1:14 pm

Re: ESP32-WROOM first stage bootloader failure

Postby phatpaul » Tue Sep 21, 2021 4:46 pm

We've found a similar error on a unit returned from a customer.
Curiously, the garbage load address is the same as the OP reported.
0xff001cff looks like the correct 0x3fff001c shifted left by one byte.

How does this happen to code that's in ROM?

Code:

rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0xff001cff,len:1809471
LLMmmu set 00010000, pos 00010000
LLMmmu set 00020000, pos 00020000
LLMmmu set 00030000, pos 00030000
LLMmmu set 00040000, pos 00040000
LLMmmu set 00050000, pos 00050000
LLMmmu set 00060000, pos 00060000
LLMmmu set 00070000, pos 00070000
LLMmmu set 00080000, pos 00080000
LLM[[]®Y]0090000, pos 00090000
LLMmmu set 000a0000, pos 000a0000
LLMmmu set 000b0000, pos 000b0000
LLMmmu set 000c0000, pos 000c0000
LLMmmu set 000d0000, pos 000d0000
LLMmmu set 000e0000, pos 000ýets Jul 29 2019 12:21:46

rst:0x10 (RTCWDT_RTC_RESET),boot:0x3 (DOWNLOAD_BOOT(UART0/UART1/SDIO_REI_REO_V2))
waiting for download
Yº.H©][9 2019 12:21:46

rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0xff001cff,len:1809471
LLMmmu set 00010000, pos 00010000
LLMmmu set 00020000, pos 00020000
LLMmmu set 00030000, pos 00030000
LLMmmu set 00040000, pos 00040000
LLMmmu set 00050000, pos 00050000
LLMmmu set 00060000, pos 00060000
LLMmmu set 00070000, pos 00070000
LLMmmu set 00080000, pos 00080000
LLMmmu set 00090000, pos 00090000
LLMmmu set 000a0000, pos 000a0000
LLMmmu set 000b0000, pos 000b0000
LLMmmu set 000c0000, pos 000c0000
LLMmmu set 000d0000, pos 000d0000
LLMmmu set 000e0000, pos 000eets Jul 29 2019 12:21:46

rst:0x10 (RTCWDT_RTC_RESET),boot:0x3 (DOWNLOAD_BOOT(UART0/UART1/SDIO_REI_REO_V2))
waiting for download

ESP-WROVER-E
Chip is ESP32-D0WD-V3 (revision 3)
Features: WiFi, BT, Dual Core, 240MHz, VRef calibration in efuse, Coding Scheme None
Crystal is 40MHz
MAC: e0:e2:e6:4b:be:f8

Re: ESP32-WROOM first stage bootloader failure

Postby phatpaul » Mon Nov 15, 2021 4:43 pm

We've recently received at least 2 more bricked devices returned from customers with exactly the same symptoms as above. Of course, my boss wants a root cause and a solution. Is this a known bug?

Serial console shows:

Code:

ets Jul 29 2019 12:21:46

rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0xff001cff,len:1809471
1150 mmu set 00010000, pos 00010000
1150 mmu set 00020000, pos 00020000
1150 mmu set 00030000, pos 00030000
1150 mmu set 00040000, pos 00040000
1150 mmu set 00050000, pos 00050000
1150 mmu set 00060000, pos 00060000
1150 mmu set 00070000, pos 00070000
1150 mmu set 00080000, pos 00080000
1150 mmu set 00090000, pos 00090000
1150 mmu set 000a0000, pos 000a0000
1150 mmu set 000b0000, pos 000b0000
1150 mmu set 000c0000, pos 000c0000
1150 mmu set 000d0000, pos 000d0000
1150 mmu set 000e0000, pos 000e0000
ets Jul 29 2019 12:21:46

rst:0x10 (RTCWDT_RTC_RESET),boot:0x3 (DOWNLOAD_BOOT(UART0/UART1/SDIO_REI_REO_V2))
waiting for download

Re: ESP32-WROOM first stage bootloader failure

Postby WiFive » Wed Nov 24, 2021 6:44 pm

I'm interested to know if there are any updates on, or a resolution to, these issues.
