Hi,
I have a nice product using ESP-WROOM-32 in AP + STA mode. The application mostly works as expected, but sometimes the system reboots and from that moment on it freezes... Looking under the hood, there is an infinite sequence of retries, as shown below:
..........
ets Jun 8 2016 00:22:57
rst:0x7 (TG0WDT_SYS_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0x4f3f0045,len:4195573
1162 mmu set 00010000, pos 00010000
1162 mmu set 00020000, pos 00020000
1162 mmu set 00030000, pos 00030000
1162 mmu set 00040000, pos 00040000
1162 mmu set 00050000, pos 00050000
1162 mmu set 00060000, pos 00060000
1162 mmu set 00070000, pos 00070000
1162 mmu set 00080000, pos 00080000
1162 mmu set 00090000, pos 00090000
1162 mmu set 000a0000, pos 000a0000
1162 mmu set 000b0000, pos 000b0000
1162 mmu set 000c0000, pos 000c0000
1162 mmu set 000d0000, pos 000d0000
1162 mmu set 000e0000, pos 000e0000
1162 mmu set 000f0000, pos 000f0000
1162 mmu set 00100000, pos 00100000
1162 mmu set 00110000, pos 00110000
1162 mmu set 00120000, pos 00120000
1162 mmu set 00130000, pos 00130000
1162 mmu set 00140000, pos 00140000
1162 mmu set 00150000, pos 00150000
1162 mmu set 00160000, pos 00160000
1162 mmu set 00170000, pos 00170000
1162 mmu set 00180000, pos 00180000
1162 mmu set 00190000, pos 00190000
1162 mmu set 001a0000, pos 001a0000
ets Jun 8 2016 00:22:57
rst:0x7 (TG0WDT_SYS_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0x4f3f004d,len:4195573
1162 mmu set 00010000, pos 00010000
1162 mmu set 00020000, pos 00020000
1162 mmu set 00030000, pos 00030000
1162 mmu set 00040000, pos 00040000
1162 mmu set 00050000, pos 00050000
1162 mmu set 00060000, pos 00060000
1162 mmu set 00070000, pos 00070000
1162 mmu set 00080000, pos 00080000
1162 mmu set 00090000, pos 00090000
1162 mmu set 000a0000, pos 000a0000
1162 mmu set 000b0000, pos 000b0000
1162 mmu set 000c0000, pos 000c0000
1162 mmu set 000d0000, pos 000d0000
1162 mmu set 000e0000, pos 000e0000
1162 mmu set 000f0000, pos 000f0000
1162 mmu set 00100000, pos 00100000
1162 mmu set 00110000, pos 00110000
1162 mmu set 00120000, pos 00120000
1162 mmu set 00130000, pos 00130000
1162 mmu set 00140000, pos 00140000
1162 mmu set 00150000, pos 00150000
1162 mmu set 00160000, pos 00160000
1162 mmu set 00170000, pos 00170000
1162 mmu set 00180000, pos 00180000
1162 mmu set 00190000, pos 00190000
1162 mmu set 001a0000, pos 001a0000
ets Jun 8 2016 00:22:57
rst:0x7 (TG0WDT_SYS_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0x6f3f0045,len:4195573
1162 mmu set 00010000, pos 00010000
1162 mmu set 00020000, pos 00020000
1162 mmu set 00030000, pos 00030000
1162 mmu set 00040000, pos 00040000
1162 mmu set 00050000, pos 00050000
1162 mmu set 00060000, pos 00060000
1162 mmu set 00070000, pos 00070000
1162 mmu set 00080000, pos 00080000
1162 mmu set 00090000, pos 00090000
1162 mmu set 000a0000, pos 000a0000
1162 mmu set 000b0000, pos 000b0000
1162 mmu set 000c0000, pos 000c0000
1162 mmu set 000d0000, pos 000d0000
1162 mmu set 000e0000, pos 000e0000
1162 mmu set 000f0000, pos 000f0000
1162 mmu set 00100000, pos 00100000
1162 mmu set 00110000, pos 00110000
1162 mmu set 00120000, pos 00120000
1162 mmu set 00130000, pos 00130000
1162 mmu set 00140000, pos 00140000
1162 mmu set 00150000, pos 00150000
1162 mmu set 00160000, pos 00160000
1162 mmu set 00170000, pos 00170000
1162 mmu set 00180000, pos 00180000
1162 mmu set 00190000, pos 00190000
1162 mmu set 001a0000, pos 001a0000
ets Jun 8 2016 00:22:57
..........
The weird thing is the baud rate of the serial console: to correctly capture the console messages above after the first system reboot, we need to change the serial baud rate to about 170 kbps (our application normally uses the more usual 115200 bps).
I guess that after the first reboot the ESP32 can't correctly load the 2nd-stage bootloader from external SPI flash, but I don't know exactly what is and why there could be a TG0WDT_SYS_RESET...
In addition, why does the serial baud rate change?
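On the baud rate question: a UART's baud rate is its source clock divided by a divider, so if the divider stays fixed while the clock differs from what was assumed, the baud rate scales by the same factor. The sketch below only does that arithmetic with the two figures from the post. The 26 MHz and 40 MHz values are illustrative assumptions (common ESP32 crystal frequencies), not something measured on this board, and the helper `actual_baud` is hypothetical.

```python
# Figures taken from the post; "about 170kbps" is only approximate.
NOMINAL_BAUD = 115_200
OBSERVED_BAUD = 170_000

# How much faster the console runs after the bad reset.
ratio = OBSERVED_BAUD / NOMINAL_BAUD


def actual_baud(intended_baud, assumed_clock_hz, real_clock_hz):
    """Baud rate produced when the divider was computed for
    assumed_clock_hz but the UART is really clocked at real_clock_hz."""
    divider = assumed_clock_hz / intended_baud
    return real_clock_hz / divider


# Hypothetical scenario, NOT confirmed anywhere in this thread:
# divider computed for a 26 MHz reference, real reference at 40 MHz.
hypothetical = actual_baud(NOMINAL_BAUD, 26_000_000, 40_000_000)

print(f"observed/nominal ratio: {ratio:.3f}")
print(f"26 MHz assumed, 40 MHz real: {hypothetical:.0f} baud")
```

If the hypothetical figure lands close to the rate needed to read the console, a wrong clock assumption at boot becomes a candidate worth checking. It is not a diagnosis on its own.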
Thank you in advance for your ideas!
rayf15
ESP32 reboots, SPI flash issues and console baudrates...
Re: ESP32 reboots, SPI flash issues and console baudrates...
Hi all,
I can't believe that nobody has ever faced such a situation...
It is really important to understand if a wrong status of ESP32 bootstrap pins could be a possible reason for such a locked system.
Moreover, just to be more explicit in my questions, is the baud rate of ESP32's console set only using a fixed prescaler ratio, hardcoded in the chip's ROM used for first internal bootloader?
If so, there must be a real problem with the boot mechanism of the chip, at least in the described situation.
Thank you... any suggestion is always welcome!
rayf15
Re: ESP32 reboots, SPI flash issues and console baudrates...
Suggest you tidy the post a little. You have several questions buried in the post & summarising might help.
I'll pick out the more obvious questions:
rayf15 wrote: "but I don't know exactly what is and why there could be a TG0WDT_SYS_RESET"
Well no & neither will we. All we have to work on is that you have a nice AP + STA mode application.
The watchdog reset is likely to be a software bug and is likely the mechanism which starts the FLASH/Bootstrap issue(s).
rayf15 wrote: "if a wrong status of ESP32 bootstrap pins could be a possible reason for such a locked system."
Yes absolutely. I suggest you look here: https://github.com/espressif/esptool/wi ... -Selection
Not all resets cause the strapping pins to be sampled again but a watchdog reset does.
It follows then that if you have a peripheral connected to an important strapping pin and that peripheral is not also reset - well you can see how that might ruin your day.
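To see what the strapping pins looked like at each reset, the `boot:0xNN` value in the ROM banner can be decoded bit by bit. This is a minimal sketch: the bit-to-pin table is my reading of the esptool boot mode selection page linked above and should be treated as an assumption to verify against that page. The function name `decode_boot_line` is made up for illustration.

```python
import re

# Assumed meaning of each bit in "boot:0xNN" (verify against the esptool
# boot mode selection page before relying on it).
STRAP_BITS = {
    0x01: "GPIO5",
    0x02: "MTDO (GPIO15)",
    0x04: "GPIO4",
    0x08: "GPIO2",
    0x10: "GPIO0",
    0x20: "MTDI (GPIO12)",
}

BANNER_RE = re.compile(
    r"rst:(0x[0-9a-fA-F]+) \((\w+)\),boot:(0x[0-9a-fA-F]+) \((\w+)\)"
)


def decode_boot_line(line):
    """Return reset reason, boot mode and sampled strapping levels,
    or None if the line is not a ROM reset banner."""
    m = BANNER_RE.search(line)
    if m is None:
        return None
    strap = int(m.group(3), 16)
    pins = {name: int(bool(strap & mask)) for mask, name in STRAP_BITS.items()}
    return {"reset": m.group(2), "mode": m.group(4), "strap": strap, "pins": pins}


# Banner line copied from the log in the first post.
info = decode_boot_line("rst:0x7 (TG0WDT_SYS_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)")
print(info["reset"], info["mode"])
for pin, level in info["pins"].items():
    print(f"{pin}: {level}")
```

In the capture from the first post the value is 0x13 on every attempt and the reported mode is SPI_FAST_FLASH_BOOT, which suggests the strapping pins were sampled the same way each time in that particular log. A differing value on a bad boot would point towards the strapping pins.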
I suggest that you stream & capture your ESP_LOGs which will usually take you to the offending line. EDIT you seem to have a 232 port so start by running with that connected.
Re: ESP32 reboots, SPI flash issues and console baudrates...
It thinks the 2nd stage bootloader is 4 MB, so it takes too long trying to load it, which causes a WDT reset. This could be a bit error in the flash read which, considering the baud rate issue, may be due to an unstable xtal clock or a bad register value. If a hard reset fixes it, that implies something corrupts one of the clock registers during execution.
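The numbers in the log can be checked directly. The sketch below parses the `load:` lines copied from the first post, flags any segment longer than the ESP32's 520 KB of on-chip SRAM, and counts how many bits differ between the suspicious load addresses of the three attempts. It is only a sanity check on the log values. Single-bit differences would be consistent with the bit-error idea but do not prove it.

```python
import re

ESP32_INTERNAL_SRAM_BYTES = 520 * 1024  # total on-chip SRAM of the ESP32

# "load:" lines from the boot attempts shown in the first post.
LOAD_LINES = [
    "load:0x3fff0018,len:4",
    "load:0x4f3f0045,len:4195573",
    "load:0x4f3f004d,len:4195573",
    "load:0x6f3f0045,len:4195573",
]


def parse_load(line):
    """Split a ROM 'load:0xADDR,len:N' line into (address, length)."""
    m = re.fullmatch(r"load:0x([0-9a-fA-F]+),len:(\d+)", line.strip())
    if m is None:
        raise ValueError(f"not a load line: {line!r}")
    return int(m.group(1), 16), int(m.group(2))


segments = [parse_load(line) for line in LOAD_LINES]

# A segment bigger than all on-chip SRAM cannot be a valid bootloader segment.
suspect = [(addr, n) for addr, n in segments if n > ESP32_INTERNAL_SRAM_BYTES]

# Compare the first suspicious address with the later ones, bit by bit.
first_addr = suspect[0][0]
bit_diffs = [bin(first_addr ^ addr).count("1") for addr, _ in suspect[1:]]

for addr, n in suspect:
    print(f"load 0x{addr:08x}: len {n} bytes ({n / 2**20:.2f} MiB)")
print("bits differing from first suspicious address:", bit_diffs)
```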
Re: ESP32 reboots, SPI flash issues and console baudrates...
Thank you PeterR and WiFive for your answers.
@PeterR
I'll try to better explain the situation. The ESP32 hardware watchdog is enabled and is linked to both the CPU0 and CPU1 idle tasks and to the lowest-priority application task. After some investigation we discovered that these reboots are caused by the switching of a couple of power relays placed on the same PCB as the ESP-WROOM-32. So it appears to be a classic EMC self-immunity issue.
Clearly, our current efforts are aimed at solving this mutual interference between the digital section and the power section of the board, using suitable filters/decoupling methods. But meanwhile we must be sure that even if an ESP32 hardware watchdog (WDT) reset occurs, the application restarts correctly, because a locked-up state like the one described is not acceptable for the application.
We have therefore carried out a long sequence of tests in which the ESP32 hardware watchdog (WDT) deliberately resets the system at random times... even with such a large number of resets we never observed a locked-up state like the one described in my first post.
@WiFive
The ESP-WROOM-32 module has 8 MB of flash and I confirm that a hard reset makes the system serviceable again. I think that the first ESP32 reset recorded in the log was due to electrical noise hitting the ESP32... the subsequent infinite sequence of reboots is caused by TG0WDT_SYS_RESET.
If electrical noise hits the ESP32 only during the first reboot, I don't understand why the serial console is still misconfigured at the following reboots... There can be only one reason: TG0WDT_SYS_RESET does not completely reset all internal ESP32 circuitry, so the wrong serial console baud rate configuration persists.
You wrote about a possible unstable xtal clock or bad register value and I agree with you... when you stated that there could be a bit error in the flash read, do you mean in the internal flash (ROM) of the ESP32 or the external SPI flash?
Thank you again.
rayf15