I2C crash with release/v3.0 - what's an effective way to debug this?

meowsqueak
Posts: 151
Joined: Thu Jun 15, 2017 4:54 am
Location: New Zealand

I2C crash with release/v3.0 - what's an effective way to debug this?

Postby meowsqueak » Fri Jan 19, 2018 11:40 am

I'm struggling with a persistent and frequent crash scenario and I'd like to request some advice with debugging it please, which seems to happen at or around the time of calling i2c_driver_install().

Code: Select all

Guru Meditation Error: Core  0 panic'ed (Interrupt wdt timeout on CPU0)
Register dump:
PC      : 0x400847e7  PS      : 0x00060034  A0      : 0x80084c94  A1      : 0x3ffb05b0  
0x400847e7: i2c_master_cmd_begin_static at /Users/david/esp32/esp-idf-v3.0/components/driver/./i2c.c:1023

A2      : 0x00000000  A3      : 0x00000001  A4      : 0x80088f18  A5      : 0x3ffbaee0  
A6      : 0x3ffb9138  A7      : 0x00000000  A8      : 0x3ffb6d88  A9      : 0x3ffb0590  
A10     : 0x00000001  A11     : 0x00000000  A12     : 0x3ffb05b4  A13     : 0x00000002  
A14     : 0x3ffbccec  A15     : 0x00060021  SAR     : 0x00000020  EXCCAUSE: 0x00000005  
EXCVADDR: 0x00000000  LBEG    : 0x4000c2e0  LEND    : 0x4000c2f6  LCOUNT  : 0xffffffff  

Backtrace: 0x400847e7:0x3ffb05b0 0x40084c91:0x3ffb05e0 0x40082bd1:0x3ffb0610 0x4000bfed:0x00000000
0x400847e7: i2c_master_cmd_begin_static at /Users/david/esp32/esp-idf-v3.0/components/driver/./i2c.c:1023

0x40084c91: i2c_isr_handler_default at /Users/david/esp32/esp-idf-v3.0/components/driver/./i2c.c:1023

0x40082bd1: _xt_lowint1 at /Users/david/esp32/esp-idf-v3.0/components/freertos/./xtensa_vectors.S:1105


Rebooting...
I'm using the latest code on the release/v3.0 branch (e7dc749e2f953a976325964b5ac49b53c8bafc42).

What happens is that the application will happily configure and use the I2C bus for sometimes an hour or so, then suddenly crash and go into a repeating loop of crashing every time the app restarts, which is where i2c_driver_install() is called. Every crash shows pretty much the same panic output, although the PC changes each time.

I've tried a hardware JTAG debugger but it doesn't seem to catch the issue - with newer versions of the ESP-IDF it doesn't halt on a crash and as I've mentioned in another thread I have had all sorts of problems with debugging with v3.0.

Based on error message I'd say it looks like an interrupt handler is taking too long to execute, but the line number of i2c.c:1023 doesn't seem to mean much in relation to an interrupt handler - it's the end of the i2c_master_read function.

I've turned on heap corruption detection and stack smashing detection - nothing obvious shows up.

Can someone familiar with the I2C code please advise me on how to tackle this problem? Are there additional options I can configure in 'sdkconfig' to get more information? Are there any tricks or approaches anyone can recommend for narrowing this down? Any GDB "magic" that might help here? I'm inexperienced with debugging at this level on the ESP32 so any suggestions on how to proceed would be very helpful, please.

I would dearly like to solve this problem because a CPU reset doesn't clear it - once it starts crashing it just crashes over and over until intervention occurs (power cycle or a long press on the board's reset button).

Note that I didn't see this error with v2.1.1 - this only seems to happen with release/v3.0 and master. However the I2C lock-up problem in 2.1.1 may have been shielding me from this issue.

meowsqueak
Posts: 151
Joined: Thu Jun 15, 2017 4:54 am
Location: New Zealand

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby meowsqueak » Thu Jan 25, 2018 11:18 pm

An update (of sorts): I'm pretty much at a dead end with this. It can sometimes run happily for a day or two before it crashes, other times it crashes after 10 minutes. There seems to be no way out of the repeated crash cycle without human intervention. I suspect (guess) that something in the I2C hardware is getting stuck and the interrupt handler blocks unexpectedly, and a short reset doesn't clear it, but I can only guess because I can't get the debugger working properly with v3.0. Back to the drawing board...

chegewara
Posts: 2364
Joined: Wed Jun 14, 2017 9:00 pm

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby chegewara » Fri Jan 26, 2018 5:43 am

Sometimes im using this tool to decode errors even from esp-idf projects. Its enough to start new arduino project and start this tool, then it will ask you to load elf file you want to decode and then provide backtrace. Thats all.

ginodecock
Posts: 30
Joined: Thu Jun 29, 2017 7:10 pm

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby ginodecock » Fri Jan 26, 2018 4:55 pm

Hi,

I've got a simular error when the i2c sensor is disconnected.

meowsqueak
Posts: 151
Joined: Thu Jun 15, 2017 4:54 am
Location: New Zealand

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby meowsqueak » Wed Jan 31, 2018 8:59 am

chegewara wrote:Sometimes im using this tool to decode errors even from esp-idf projects. Its enough to start new arduino project and start this tool, then it will ask you to load elf file you want to decode and then provide backtrace. Thats all.
That sounds useful, but which tool do you mean?

meowsqueak
Posts: 151
Joined: Thu Jun 15, 2017 4:54 am
Location: New Zealand

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby meowsqueak » Wed Jan 31, 2018 9:01 am

ginodecock wrote:Hi,

I've got a simular error when the i2c sensor is disconnected.
What kind of sensor are you using? Can you paste your exact error message? When it crashes, does it crash over-and-over?

ESP_Angus
Posts: 2344
Joined: Sun May 08, 2016 4:11 am

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby ESP_Angus » Wed Jan 31, 2018 9:41 am

Hi meowsqueak,

Sorry it's taken a while to get back to you.
meowsqueak wrote: Based on error message I'd say it looks like an interrupt handler is taking too long to execute, but the line number of i2c.c:1023 doesn't seem to mean much in relation to an interrupt handler - it's the end of the i2c_master_read function.
Due to a binutils bug of some kind, the line numbers in the monitor output (from addr2line) are sometimes wrong. The function names are always correct, so it's definitely somewhere in the interrupt handler.

You can get an exact line number by running "xtensa-esp32-elf-gdb build/path-to-your.elf" and then running the command "info line *0x400847e7" (replace the hex address with one that matches your binary, if you've recompiled it. Note the * character as well. Please do the same for the other i2c.c frame in the call stack.

The exception cause "interrupt wdt timeout" is also a clue - something is spending too long in an interrupt context (or with interrupts disabled). You can read about the interrupt watchdog here:
http://esp-idf.readthedocs.io/en/v3.0-r ... t-watchdog
meowsqueak wrote:Are there additional options I can configure in 'sdkconfig' to get more information?
What timeout do you have set for the interrupt watchdog timer in menuconfig? You can look for CONFIG_INT_WDT_TIMEOUT_MS in the sdkconfig file to know this.
meowsqueak wrote: I would dearly like to solve this problem because a CPU reset doesn't clear it - once it starts crashing it just crashes over and over until intervention occurs (power cycle or a long press on the board's reset button).

Note that I didn't see this error with v2.1.1 - this only seems to happen with release/v3.0 and master. However the I2C lock-up problem in 2.1.1 may have been shielding me from this issue.
This makes it sound like something in the I2C peripheral is getting stuck in a way which isn't being correctly reset by the driver after a CPU restart.

If you can get the accurate line numbers from the backtrace, as described above, then this will help the driver team look into this.

Thanks,

Angus

chegewara
Posts: 2364
Joined: Wed Jun 14, 2017 9:00 pm

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby chegewara » Wed Jan 31, 2018 5:30 pm

meowsqueak wrote:
chegewara wrote:Sometimes im using this tool to decode errors even from esp-idf projects. Its enough to start new arduino project and start this tool, then it will ask you to load elf file you want to decode and then provide backtrace. Thats all.
That sounds useful, but which tool do you mean?
https://github.com/me-no-dev/EspExceptionDecoder

meowsqueak
Posts: 151
Joined: Thu Jun 15, 2017 4:54 am
Location: New Zealand

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby meowsqueak » Wed Jan 31, 2018 6:59 pm

Hi ESP_Angus,

The error I'm currently getting (which I believe is the same error) is currently displayed in the log as:

Code: Select all

ets Jun  8 2016 00:22:57

rst:0xc (SW_CPU_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:5576
load:0x40078000,len:0
ho 12 tail 0 room 4
load:0x40078000,len:13716
entry 0x40078fb4
I (30) boot: ESP-IDF v3.0-rc1-151-gb6b8af49-dirty 2nd stage bootloader
I (30) boot: compile time 12:35:35
I (32) boot: Enabling RNG early entropy source...
I (37) boot: SPI Speed      : 40MHz
I (41) boot: SPI Mode       : DIO
I (45) boot: SPI Flash Size : 4MB
I (49) boot: Partition Table:
I (53) boot: ## Label            Usage          Type ST Offset   Length
I (60) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (68) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (75) boot:  2 factory          factory app      00 00 00010000 00100000
I (83) boot: End of partition table
I (87) esp_image: segment 0: paddr=0x00010020 vaddr=0x3f400020 size=0x17d74 ( 97652) map
I (130) esp_image: segment 1: paddr=0x00027d9c vaddr=0x3ffb0000 size=0x033a0 ( 13216) load
I (135) esp_image: segment 2: paddr=0x0002b144 vaddr=0x40080000 size=0x00400 (  1024) load
I (137) esp_image: segment 3: paddr=0x0002b54c vaddr=0x40080400 size=0x04ac4 ( 19140) load
I (153) esp_image: segment 4: paddr=0x00030018 vaddr=0x400d0018 size=0x64ae4 (412388) map
I (297) esp_image: segment 5: paddr=0x00094b04 vaddr=0x40084ec4 size=0x0de24 ( 56868) load
I (321) esp_image: segment 6: paddr=0x000a2930 vaddr=0x400c0000 size=0x00000 (     0) load
I (332) boot: Loaded app from partition at offset 0x10000
I (332) boot: Disabling RNG early entropy source...
I (333) cpu_start: Pro cpu up.
I (336) cpu_start: Starting app cpu, entry point is 0x4008102c
I (328) cpu_start: App cpu up.
I (347) heap_init: Initializing. RAM available for dynamic allocation:
D (354) heap_init: New heap initialised at 0x3ffae6e0
I (359) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
D (366) heap_init: New heap initialised at 0x3ffb9140
I (370) heap_init: At 3FFB9140 len 00026EC0 (155 KiB): DRAM
I (376) heap_init: At 3FFE0440 len 00003BC0 (14 KiB): D/IRAM
I (383) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
D (390) heap_init: New heap initialised at 0x40092ce8
I (394) heap_init: At 40092CE8 len 0000D318 (52 KiB): IRAM
I (400) cpu_start: Pro cpu start user code
D (412) clk: RTC_SLOW_CLK calibration value: 3300275
D (86) intr_alloc: Connected src 46 to int 2 (cpu 0)
D (87) intr_alloc: Connected src 57 to int 3 (cpu 0)
D (88) stack_chk: Intialize random stack guard
D (92) intr_alloc: Connected src 24 to int 9 (cpu 0)
I (97) cpu_start: Starting scheduler on PRO CPU.
D (0) intr_alloc: Connected src 25 to int 2 (cpu 1)
I (0) cpu_start: Starting scheduler on APP CPU.
D (112) heap_init: New heap initialised at 0x3ffe0440
D (122) heap_init: New heap initialised at 0x3ffe4350
D (122) intr_alloc: Connected src 16 to int 12 (cpu 0)
W (2142) app_main: about to run i2c_master_init
W (2142) i2c: about to run i2c_param_config
W (2142) i2c: about to run i2c_driver_install
Guru Meditation Error: Core  0 panic'ed (Interrupt wdt timeout on CPU0)
Register dump:
PC      : 0x40086653  PS      : 0x00060034  A0      : 0x80085b98  A1      : 0x3ffb0570  
A2      : 0x3ffb9e80  A3      : 0x00060023  A4      : 0x00060021  A5      : 0x000000ff  
A6      : 0x00000002  A7      : 0x3ffba0e5  A8      : 0x3ffb5d3c  A9      : 0x3ffb0560  
A10     : 0x00000003  A11     : 0x00060023  A12     : 0x00060023  A13     : 0x00000001  
A14     : 0x00060023  A15     : 0x00000000  SAR     : 0x0000001f  EXCCAUSE: 0x00000005  
EXCVADDR: 0x00000000  LBEG    : 0x4000c2e0  LEND    : 0x4000c2f6  LCOUNT  : 0xffffffff  

Backtrace: 0x40086653:0x3ffb0570 0x40085b95:0x3ffb0590 0x40084852:0x3ffb05b0 0x40084c7d:0x3ffb05e0 0x40082bbd:0x3ffb0610 0x4000bfed:0x00000000

Rebooting...
I plugged in the Backtrace addresses to get:

Code: Select all

$ xtensa-esp32-elf-gdb build/esp32-poolmon.elf 
GNU gdb (crosstool-NG crosstool-ng-1.22.0-80-g6c4433a) 7.10
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-build_apple-darwin16.3.0 --target=xtensa-esp32-elf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from build/esp32-poolmon.elf...done.
(gdb) info line *0x40085b9b
Line 1184 of "/Users/david/esp32/esp-idf-v3.0/components/freertos/./queue.c"
   starts at address 0x40085b98 <xQueueGenericSendFromISR+96> and ends at 0x40085ba5 <xQueueGenericSendFromISR+109>.
(gdb) info line *0x40084852
Line 1045 of "/Users/david/esp32/esp-idf-v3.0/components/driver/./i2c.c"
   starts at address 0x40084848 <i2c_master_cmd_begin_static+136>
   and ends at 0x40084855 <i2c_master_cmd_begin_static+149>.
(gdb) info line *0x40084c7d
Line 375 of "/Users/david/esp32/esp-idf-v3.0/components/driver/./i2c.c"
   starts at address 0x40084c7b <i2c_isr_handler_default+155> and ends at 0x40084c84 <i2c_isr_handler_default+164>.
(gdb) info line *0x40082bbd
Line 1105 of "/Users/david/esp32/esp-idf-v3.0/components/freertos/./xtensa_vectors.S"
   starts at address 0x40082b6e <_xt_lowint1+42> and ends at 0x40082bcd <_xt_lowint1+137>.
(gdb) info line *0x4000bfed
No line number information available for address 0x4000bfed
I'm not sure what to do with the second part of each address (i.e. the ":0x3ffb0570" in "0x40086653:0x3ffb0570").

Also:

Code: Select all

$ grep CONFIG_INT_WDT sdkconfig
CONFIG_INT_WDT=y
CONFIG_INT_WDT_TIMEOUT_MS=300
CONFIG_INT_WDT_CHECK_CPU1=y
Please also refer to: https://github.com/espressif/esp-idf/issues/1503

meowsqueak
Posts: 151
Joined: Thu Jun 15, 2017 4:54 am
Location: New Zealand

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby meowsqueak » Wed Jan 31, 2018 7:06 pm

ESP_Angus wrote: The exception cause "interrupt wdt timeout" is also a clue - something is spending too long in an interrupt context (or with interrupts disabled). You can read about the interrupt watchdog here:
http://esp-idf.readthedocs.io/en/v3.0-r ... t-watchdog
Ah, thanks for that link, in particular:
Therefore OpenOCD will disable the hardware timers of both the interrupt and task watchdogs at every breakpoint. Moreover, OpenOCD will not reenable them upon leaving the breakpoint. This means that interrupt watchdog and task watchdog functionality will essentially be disabled. No warnings or panics from either watchdogs will be generated when the ESP32 is connected to OpenOCD via JTAG.
That explains why I wasn't able to catch this issue with my JTAG debugger - the watchdog was disabled. Is there a good strategy for debugging this? Maybe a way to prevent OpenOCD from disabling the watchdog until it first fires? I'll need to revisit using my debugger with this new info.

Who is online

Users browsing this forum: Google [Bot] and 141 guests