Diagnosing heap corruption
Re: Diagnosing heap corruption
Ok, that's what I thought. Thanks.
Re: Diagnosing heap corruption
CORRUPT HEAP: Bad head at 0x3ffd42dc. Expected 0xabba1234 got 0xfefefefe
CORRUPT HEAP: Bad head at 0x3ffd424c. Expected 0xabba1234 got 0xfefefefe
CORRUPT HEAP: Bad head at 0x3ffd5158. Expected 0xabba1234 got 0xfefefefe
CORRUPT HEAP: Bad head at 0x3ffd424c. Expected 0xabba1234 got 0xfefefefe
CORRUPT HEAP: Bad head at 0x3ffd424c. Expected 0xabba1234 got 0xfefefefe
CORRUPT HEAP: Bad head at 0x3ffd4270. Expected 0xabba1234 got 0xfefefefe
CORRUPT HEAP: Bad head at 0x3ffd424c. Expected 0xabba1234 got 0xfefefefe
CORRUPT HEAP: Bad head at 0x3ffd424c. Expected 0xabba1234 got 0xfefefefe
CORRUPT HEAP: Bad head at 0x3ffd4270. Expected 0xabba1234 got 0xfefefefe
The above messages are fairly reproducible, but which of the two cases in the docs do they indicate? Also, there is no mention of 0xabba1234; is that value of any significance? The following two descriptions are very similar, and I can't make out the difference:
If an application crashes reading/writing an address related to 0xFEFEFEFE, this indicates it is reading heap memory after it has been freed (a “use after free bug”). The application should be changed to not access heap memory after it has been freed.
If the IDF heap allocator fails because the pattern 0xFEFEFEFE was not found in freed memory then this indicates the app has a use-after-free bug where it is writing to memory which has already been freed.
I'm guessing the latter, since the application continues to run until I eventually get a crash, though I don't know whether the two are actually related:
Guru Meditation Error of type InstrFetchProhibited occurred on core 0. Exception was unhandled.
Register dump:
PC : 0x00000000 PS : 0x00060530 A0 : 0x8013c1d6 A1 : 0x3ffd3a60
A2 : 0x3ffc7bc0 A3 : 0x3f40f774 A4 : 0x3ffd3b20 A5 : 0x00000000
A6 : 0x3ffd3a70 A7 : 0x3ffca3a0 A8 : 0x80144368 A9 : 0x3ffd3a30
A10 : 0x3ffc7bc0 A11 : 0x00000000 A12 : 0x00000004 A13 : 0x3ffd3b20
A14 : 0x00000000 A15 : 0xff000000 SAR : 0x00000018 EXCCAUSE: 0x00000014
EXCVADDR: 0x00000000 LBEG : 0x4000c28c LEND : 0x4000c296 LCOUNT : 0x00000000
Backtrace: 0x00000000:0x3ffd3a60 0x4013c1d3:0x3ffd3c10 0x4013c259:0x3ffd3c30 0x4013fbb9:0x3ffd3c50 0x40145829:0x3ffd3c80 0x4015a15d:0x3ffd3ca0 0x4013c26c:0x3ffd3cc0 0x40138161:0x3ffd3ce0 0x401381dd:0x3ffd3d40
0x4013c1d3: _ZN6smooth11application7network4mqtt10MqttClient11send_packetERNS2_6packet10MQTTPacketE at /home/permal/esp/xtensa-esp32-elf/xtensa-esp32-elf/include/c++/5.2.0/xtensa-esp32-elf/bits/gthr-default.h:778
0x4013c259: _ZThn80_N6smooth11application7network4mqtt10MqttClient11send_packetERNS2_6packet10MQTTPacketE at ??:?
0x4013fbb9: _ZN6smooth11application7network4mqtt11Publication12publish_nextERNS2_11IMqttClientE at /home/permal/code/SmoothTest/components/Smooth/application/network/mqtt/Publication.cpp:116
0x40145829: _ZN6smooth11application7network4mqtt5state8RunState4tickEv at /home/permal/code/SmoothTest/components/Smooth/application/network/mqtt/state/RunState.cpp:25
0x4015a15d: _ZN6smooth11application7network4mqtt5state7MqttFSMINS3_13MQTTBaseStateEE4tickEv at /home/permal/code/SmoothTest/components/Smooth/include/smooth/application/network/mqtt/state/MqttFSM.h:81
0x4013c26c: _ZN6smooth11application7network4mqtt10MqttClient4tickEv at /home/permal/esp/xtensa-esp32-elf/xtensa-esp32-elf/include/c++/5.2.0/xtensa-esp32-elf/bits/gthr-default.h:778
0x40138161: _ZN6smooth4core4Task4execEv at /home/permal/code/SmoothTest/components/Smooth/core/Task.cpp:106
0x401381dd: _ZZN6smooth4core4Task5startEvENUlPvE_4_FUNES2_ at /home/permal/code/SmoothTest/components/Smooth/core/Task.cpp:63
(inlined by) _FUN at /home/permal/code/SmoothTest/components/Smooth/core/Task.cpp:63
Re: Diagnosing heap corruption
permal wrote: The above messages are fairly reproducible, but which of the two cases in the docs do they indicate? The following two descriptions are very similar, I can't make out the difference.
Also, there is no mention of 0xabba1234, is that value of any significance?
I have some good news and some bad news. I was just looking into a bug which looked very similar, and it turns out it's a race condition bug in the "Comprehensive" level heap allocator. This error does not indicate heap corruption and the program can keep running normally after it is displayed.
0xABBA1234 is the poison "head" word which is written before any block of memory allocated from the heap, when the debug level is set to "Light Impact" or "Comprehensive". When the debug level is set to "Comprehensive", memory is also overwritten with 0xFEFEFEFE when freed. heap_caps_check_integrity() verifies both these things for all heap blocks, at these heap debugging levels.
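To make the two patterns concrete, here is a minimal sketch in plain C of a head word check and a fill-on-free. It is a simplified model only, not the IDF implementation: the block layout and the names model_block_t, model_alloc, model_free_fill and model_head_ok are hypothetical and chosen just for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define HEAD_CANARY 0xABBA1234u  /* head word named in the post above */
#define FREE_FILL   0xFE         /* fill byte; four of them read as 0xFEFEFEFE */

/* Hypothetical block layout: a head word followed by the user's data. */
typedef struct {
    uint32_t head;
    uint8_t  data[16];
} model_block_t;

/* "Allocate": stamp the head word in front of the data. */
static void model_alloc(model_block_t *b)
{
    b->head = HEAD_CANARY;
    memset(b->data, 0, sizeof(b->data));
}

/* "Free" with fill-on-free: overwrite the whole block, head word included. */
static void model_free_fill(model_block_t *b)
{
    memset(b, FREE_FILL, sizeof(*b));
}

/* Integrity check for a block that is still considered allocated. */
static bool model_head_ok(const model_block_t *b)
{
    return b->head == HEAD_CANARY;
}
```

In this model, a block that has been filled but is still treated as allocated fails the head check with 0xFEFEFEFE in place of 0xABBA1234, which is the same shape as the "Bad head" messages in the log.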
The race is that the heap poisoning implementation doesn't lock the heap before setting 0xFEFEFEFE in multi_heap_free(). This means that there is a brief window where all the data (including the 0xABBA1234 "head word") is overwritten with 0xFEFEFEFE. This doesn't matter for normal operation, because the memory is in the process of being freed so no one should be using it. But there is a race where heap_caps_check_integrity() may come to verify the block at this exact moment, and it sees all 0xFEFEFEFE instead of the expected head word.
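The window can be illustrated with a toy model in plain C. This is a sketch under assumptions, not the IDF code and not necessarily the shape of the actual fix: the in_use flag stands in for the heap's own bookkeeping, a pthread mutex stands in for the heap lock, and every toy_* name is hypothetical. Calling the checker between the two steps of the racy free plays the role of another task running at exactly that moment.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define HEAD_CANARY 0xABBA1234u

typedef struct {
    uint32_t head;
    uint8_t  data[16];
    bool     in_use;   /* hypothetical stand-in for the heap's bookkeeping */
} toy_block_t;

static pthread_mutex_t toy_heap_lock = PTHREAD_MUTEX_INITIALIZER;

/* Checker: takes the heap lock, then verifies blocks still marked in use. */
static bool toy_check(toy_block_t *b)
{
    pthread_mutex_lock(&toy_heap_lock);
    bool ok = !b->in_use || b->head == HEAD_CANARY;
    pthread_mutex_unlock(&toy_heap_lock);
    return ok;
}

/* Racy free, step 1: fill with 0xFE WITHOUT holding the heap lock. */
static void toy_free_fill_unlocked(toy_block_t *b)
{
    memset(&b->head, 0xFE, sizeof(b->head));
    memset(b->data, 0xFE, sizeof(b->data));
}

/* Racy free, step 2: only now take the lock and mark the block free. */
static void toy_free_unlink(toy_block_t *b)
{
    pthread_mutex_lock(&toy_heap_lock);
    b->in_use = false;
    pthread_mutex_unlock(&toy_heap_lock);
}

/* One possible remedy: fill and unlink inside a single locked section. */
static void toy_free_locked(toy_block_t *b)
{
    pthread_mutex_lock(&toy_heap_lock);
    memset(&b->head, 0xFE, sizeof(b->head));
    memset(b->data, 0xFE, sizeof(b->data));
    b->in_use = false;
    pthread_mutex_unlock(&toy_heap_lock);
}
```

If toy_check() runs after toy_free_fill_unlocked() but before toy_free_unlink(), it reports a bad head even though nothing is corrupted. With toy_free_locked(), a checker that also takes the lock can only observe the block before the fill or after it has been marked free.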
Will have a fix ASAP, but for now you can disregard "errors" that meet the form "Bad head at X. Expected 0xabba1234 got 0xfefefefe".
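When going through a captured log, something like the following sketch could separate messages of this exact form from anything else. It is plain C using only sscanf; the function name is_known_false_positive is hypothetical, and the format string is taken from the log lines quoted earlier in this thread. It only makes sense for builds that predate the fix; a "Bad head" line with any other "got" value should still be treated as a real problem.

```c
#include <stdbool.h>
#include <stdio.h>

/* Returns true only for lines of the form
 * "CORRUPT HEAP: Bad head at 0x<addr>. Expected 0xabba1234 got 0xfefefefe". */
static bool is_known_false_positive(const char *line)
{
    unsigned int addr = 0, expected = 0, got = 0;

    /* sscanf returns the number of fields converted; all three must match. */
    int n = sscanf(line,
                   "CORRUPT HEAP: Bad head at 0x%x. Expected 0x%x got 0x%x",
                   &addr, &expected, &got);

    return n == 3 && expected == 0xABBA1234u && got == 0xFEFEFEFEu;
}
```

For example, is_known_false_positive("CORRUPT HEAP: Bad head at 0x3ffd42dc. Expected 0xabba1234 got 0xfefefefe") is true, while the same line ending in a different value is not.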
Re: Diagnosing heap corruption
Alright. Then I can concentrate on looking at the logic surrounding the crash instead of trying to find a non-existent corrupted heap. Thanks for the quick reply.
Re: Diagnosing heap corruption
ESP_Angus wrote: race condition bug in the "Comprehensive" level heap allocator. This error does not indicate heap corruption and the program can keep running normally after it is displayed.
ESP_Angus wrote: Will have a fix ASAP, but for now you can disregard "errors" that meet the form "Bad head at X. Expected 0xabba1234 got 0xfefefefe".
This fix is now in the IDF master branch on GitHub.
Re: Diagnosing heap corruption
Thanks. Good job.