esp_timer_start_once - LoadProhibited on the second timer_armed call 1 in a million times
Posted: Tue Jul 30, 2019 6:35 pm
Code: Select all
Guru Meditation Error: Core 0 panic'ed (LoadProhibited). Exception was unhandled.
Core 0 register dump:
PC : 0x40089a1b PS : 0x00060130 A0 : 0x80081ee3 A1 : 0x3ffaf950
0x40089a1b: timer_armed at C:/msys32/home/jcsba/esp/esp-idf/components/esp32/esp_timer.c:443
A2 : 0x00000000 A3 : 0x000007e0 A4 : 0x3ffaf960 A5 : 0x00000008
A6 : 0x2616bdb6 A7 : 0x2616bdb6 A8 : 0x3ffafa80 A9 : 0x00000001
A10 : 0x00000000 A11 : 0x00000000 A12 : 0x00060120 A13 : 0x00000001
A14 : 0x00060123 A15 : 0x00000000 SAR : 0x00000008 EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000000 LBEG : 0x4008e7b9 LEND : 0x4008e7ed LCOUNT : 0xffffffff
0x4008e7b9: memcpy at /home/jeroen/esp8266/esp32/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/machine/xten
sa/../../../../.././newlib/libc/machine/xtensa/memcpy.S:291
0x4008e7ed: memcpy at /home/jeroen/esp8266/esp32/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/machine/xten
sa/../../../../.././newlib/libc/machine/xtensa/memcpy.S:324
ELF file SHA256: 3faba7506ae8f3a2274420df831d2d67cb00ec1d93b56dcdc10a065e478ad173
Backtrace: 0x40089a1b:0x3ffaf950 0x40081ee0:0x3ffaf970 0x400db723:0x3ffaf990 0x400d32a4:0x3ffaf9b0 0x400d330b:0x3ffaf9d0 0x40095285:0x
3ffaf9f0
0x40089a1b: timer_armed at C:/msys32/home/jcsba/esp/esp-idf/components/esp32/esp_timer.c:443
0x40081ee0: esp_timer_start_once at C:/msys32/home/jcsba/esp/esp-idf/components/esp32/esp_timer.c:443
0x400db723: continue_send at C:/Users/jcsba/Documents/GitHub/new1/main/isotp.c:281
0x400d32a4: timer_process_alarm at C:/msys32/home/jcsba/esp/esp-idf/components/esp32/esp_timer.c:443
0x400d330b: timer_task at C:/msys32/home/jcsba/esp/esp-idf/components/esp32/esp_timer.c:443
0x40095285: vPortTaskWrapper at C:/msys32/home/jcsba/esp/esp-idf/components/freertos/port.c:435
Rebooting...
I take precautions to try to not have race conditions, but could not get on with critical sections around deleting and setting of timers or it would produce interrupt watchdog timeouts about 1 in 100000 times.
I'm looking at esp_timer.c to try to work out if it is something I could be doing wrong, but it is the second timer_armed call that does it. I do check the pointer passed to esp_timer_start once and cannot see anything else I'm doing wrong, but both my and the timer code look to have opportunities for tasks on different cores to fight each other where checks are done outside a criticial section.
Previously some similar timer delete bugs were resolved by changing to unicore, but the performance suffered and I am presently in dual core this is my last known remaining bug. I have critical sections and state variables protecting what I can but have to exit them to do timer activities or the WDT timeout (even when extended from 300ms).
esp-idf release 3.3 (beta 3) with WROVER 16MB/8MB.
I have not applied the PSRAM patch as could not work out the details on how it worked in Windows, but prob not related because only big mallocs will go in PSRAM by default, but I heavily use PSRAM with no evident crashes from that so far.
Any hints further where to look appreciated. It is a difficult one to solve.
I cannot produce a minimal example, the project is over 10000 lines of code with many tasks running. I can run things continuously for a long time with no signs of memory leaks, corrupt data, heap corruption, heap fragmentation etc. Then this randomly, it seems like a race condition between the cores.
I might be able to solve it by pinning certain tasks to the same core, but would like to solve it "properly" before doing kludges.