How do I upload a core dump to a server and analyze it?

Postby jcolebaker » Sun Aug 06, 2023 8:28 pm

We have a requirement to upload core dumps to a server from production devices, and analyze them for debugging.

I've configured the core dump system to store dumps to a flash partition. We're using AWS IoT services, and when a core dump is found, I plan to upload it as a stream of MQTT messages (containing Base64 encoded chunks of the core dump data).

I've written code which reads the core dump partition in 3 KiB chunks, encodes each chunk as Base64 text, and prints the text to the debug UART (for testing).

I'm trying to simulate what we would do with these chunks once we got them from the cloud service. I concatenated the chunks together into a txt file, and decoded the Base64 to give a binary file. It looks OK - I can see the right amount of data, and it includes the string "ELF" near the start.

I tried to analyze this file using the espcoredump.py tool from the ESP-IDF SDK:

& "$Env:IDF_PYTHON_ENV_PATH\Scripts\python.exe" `
    "$Env:IDF_PATH\components\espcoredump\espcoredump.py" info_corefile --core core_dump.bin --core-format elf .\build\wireless-controller.elf

However, this gave an error:

construct.core.ConstError: Error in path (parsing) -> elf_header -> e_ident -> EI_MAG
parsing expected b'\x7fELF' but parsed b'\xa4\xac\x00\x00'

I noticed that the b'\x7fELF' sequence starts 20 bytes in, so I trimmed the leading bytes (some kind of header?). However, I still get an error:

construct.core.StreamError: Error in path (parsing) -> program_headers -> p_flags
stream read less than specified amount, expected 4, found 0

For comparison, I turned on the "print core dump to UART" option in the config, and generated a crash. The Base64 printed out at the console looks different:

IKUAAAABAAAAAAAAAAAAAAAAAAA=
f0VMRgEBAQAAAAAAAAAAAAQAXgABAAAAAAAAADQAAAAAAAAAAAAAADQAIAA6ACgA
AAAAAA==
BAAAAHQHAAAAAAAAAAAAAIBCAACAQgAABgAAAAAAAAA=
AQAAAPRJAADEzv4/xM7+P2QBAABkAQAABgAAAAAAAAA=
AQAAAFhLAAAAzf4/AM3+P7ABAACwAQAABgAAAAAAAAA=
AQAAAAhNAACcvfs/nL37P2QBAABkAQAABgAAAAAAAAA=
...

That is, each line is Base64-encoded separately; the output is not simply the Base64 encoding of the whole core dump partition.
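A minimal server-side sketch of turning such a per-line capture back into binary. It assumes every non-empty line is an independent Base64 string and that banner lines made of `=` characters should be skipped; the function name `decode_b64_lines` is illustrative and not part of ESP-IDF.

```python
import base64


def decode_b64_lines(text: str) -> bytes:
    """Decode text in which every line is its own Base64 string."""
    out = bytearray()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and banner lines such as "==== CORE DUMP START ====".
        if not line or line.startswith("="):
            continue
        out += base64.b64decode(line)
    return bytes(out)


# First three lines of the UART capture quoted above.
sample = """IKUAAAABAAAAAAAAAAAAAAAAAAA=
f0VMRgEBAQAAAAAAAAAAAAQAXgABAAAAAAAAADQAAAAAAAAAAAAAADQAIAA6ACgA
AAAAAA==
"""
data = decode_b64_lines(sample)
print(data.find(b"\x7fELF"))  # offset of the ELF magic in the decoded bytes
```

Decoded this way, the first line looks like a short header of its own, and the ELF magic only begins with the second line.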

I copied this from the serial terminal to a text file, and analyzed with the espcoredump.py tool:

& "$Env:IDF_PYTHON_ENV_PATH\Scripts\python.exe" `
    "$Env:IDF_PATH\components\espcoredump\espcoredump.py" info_corefile --core uart_core_dump_b64.txt --core-format b64 .\build\wireless-controller.elf

That got a little further but still failed. The tool printed out the following, then hung:

espcoredump.py v0.4-dev
===============================================================
==================== ESP32 CORE DUMP START ====================

Crashed task handle: 0x3ffbbd9c, name: '', GDB name: 'process 1073462684'

================== CURRENT THREAD REGISTERS ===================
exccause       0x1d (StoreProhibitedCause)
excvaddr       0x0
epc1           0x40128fdb
epc2           0x0
epc3           0x401a3136
epc4           0x40095bc6
epc5           0x0
epc6           0x0
eps2           0x0
eps3           0x60320
eps4           0x60f23
eps5           0x0
eps6           0x0


==================== CURRENT THREAD STACK =====================


======================== THREADS INFO =========================

Traceback (most recent call last):
  File "C:\Users\Jeremycb\esp-idf\components\espcoredump\espcoredump.py", line 350, in <module>
    temp_core_files = info_corefile()
  File "C:\Users\Jeremycb\esp-idf\components\espcoredump\espcoredump.py", line 200, in info_corefile
    threads, _ = gdb.get_thread_info()
  File "C:\Users\Jeremycb\esp-idf\components\espcoredump\corefile\gdb.py", line 114, in get_thread_info
    current_thread_id = result['current-thread-id']
KeyError: 'current-thread-id'

Not very useful.

So is there a good method for uploading core dumps to a server for analysis? How do I get useful info from the core dump?

There is this page, but it only gives a general overview: https://docs.espressif.com/projects/esp ... rnals.html

We are using ESP-IDF 4.4.4

Postby MicroController » Mon Aug 07, 2023 12:42 pm

"I concatenated the chunks together into a txt file, and decoded the Base64 to give a binary file"
Due to padding at the end of a Base64 string you cannot do that. You need to decode each string to binary first, then concat the binary bytes.
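A small sketch of the point about padding, using stand-in data rather than a real core dump. The helper names `encode_in_chunks` and `decode_chunks` are made up for illustration. It shows that `=` padding appears inside the stream when the chunk size is not a multiple of 3, and that decoding chunk by chunk works in either case.

```python
import base64


def decode_chunks(b64_chunks):
    """Decode each Base64 chunk on its own, then join the raw bytes."""
    return b"".join(base64.b64decode(chunk) for chunk in b64_chunks)


def encode_in_chunks(data: bytes, chunk_size: int):
    """Mimic a device that Base64-encodes fixed-size slices of a buffer."""
    return [
        base64.b64encode(data[i:i + chunk_size]).decode("ascii")
        for i in range(0, len(data), chunk_size)
    ]


payload = bytes(range(256)) * 40  # 10240 bytes of stand-in data

# A chunk size that is not a multiple of 3 puts '=' padding inside the stream.
odd_chunks = encode_in_chunks(payload, 1000)
assert odd_chunks[0].endswith("=")

# A multiple of 3 (such as 3 KiB = 3072) pads only the final chunk, if at all.
even_chunks = encode_in_chunks(payload, 3072)
assert not any(c.endswith("=") for c in even_chunks[:-1])

# Decoding chunk by chunk is correct in both cases.
assert decode_chunks(odd_chunks) == payload
assert decode_chunks(even_chunks) == payload
```

Decoding each received message separately is the safer default on the server, since it does not depend on the chunk size chosen on the device.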

Postby jcolebaker » Mon Aug 07, 2023 6:57 pm

"Due to padding at the end of a Base64 string you cannot do that. You need to decode each string to binary first, then concat the binary bytes."
Note that I was not concatenating the Base64 from the normal UART output of core dumps (although your solution might work for using that output). I was concatenating my own Base64 which was a representation of the entire content of the core dump partition (containing a core dump). We want to upload this core dump in Base64 encoded chunks of 3-4 KiB.

So my binary file should be a mirror of the content from the partition. What I want to know is how to use that data. Feeding it to the espcoredump.py tool doesn't work.

Postby DrMickeyLauer » Sat Mar 30, 2024 1:04 pm

Did you figure it out? Stumbling over the same problem now.

Postby DrMickeyLauer » Sat Mar 30, 2024 1:34 pm

Ok, I found out that -- for whatever reason -- the contents of the partition contain 24 bytes before the actual coredump in ELF format starts. Stripping these makes it work. Can anyone comment on why this is so?
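A hedged sketch of that stripping step. Since the first post reported the ELF magic 20 bytes in and this one reports 24, the sketch searches for the magic near the start instead of hard-coding an offset. `strip_to_elf` and the `max_header` limit are illustrative choices, not anything defined by ESP-IDF, and anything stored after the ELF data is left untouched.

```python
ELF_MAGIC = b"\x7fELF"


def strip_to_elf(raw: bytes, max_header: int = 64) -> bytes:
    """Return raw starting at the ELF magic, searching only the first bytes."""
    offset = raw.find(ELF_MAGIC, 0, max_header + len(ELF_MAGIC))
    if offset < 0:
        raise ValueError("no ELF magic near the start of the image")
    return raw[offset:]


# Stand-in image: a 24-byte header followed by the start of an ELF file.
image = bytes(24) + ELF_MAGIC + b"\x01\x01\x01\x00"
elf_part = strip_to_elf(image)
```

The espcoredump.py error message quoted elsewhere in this thread also lists `raw` as a core format next to `elf` and `b64`, so passing the unmodified partition image with `--core-format raw` may be worth trying before stripping anything by hand.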

Postby iParcelBox » Sun Oct 20, 2024 8:06 pm

I'm trying to do exactly this, but really struggling to get it to work. If I cause a panic crash dump to UART, I see it starts with something like:

`
================= CORE DUMP START =================
rI0AAAIBCQAAAAAAAAAAAAAAAAACAAAA
f0VMRgEBAQAAAAAAAAAAAAQAXgABAAAAAAAAADQAAAAAAAAAAAAAADQAIAAqACgA
AAAAAA==
BAAAAHQFAAAAAAAAAAAAAIAvAACALwAABgAAAAAAAAA=
AQAAAPQ0AABYfM0/WHzNP1wBAABcAQAABgAAAAAAAAA=
`
and ends
`h0AnN7DTBZU04aj5/L6p2rsm/Rq3DcGBB8R2Yl+xkMU44XuI12ZOkcNow==`


If I swap the firmware to dump crash to flash, and call the same panic, I occasionally get success. However, if the device then crashes itself, although I'm able to read it fine, when I base64 encode it, I get a different output, which tends to start:

`/////wAAAAAAAAAAAAAAANPS+tkEvgRlV+84OaivrXkkHptZ/JVN6clSQ/dwEQdFwVqRdy70R36`
and ends
`H25nYtGgwh0AnN7DTBZU04aj5/L6p2rsm/Rq3DcGBB8R2Yl+xkMU44XuI12ZOkcNow==`
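The `/////w` prefix decodes to four 0xFF bytes, which is what erased flash looks like, whereas the working dump starts with `rI0A...`. A sketch of a server-side sanity check follows. It assumes the image begins with a little-endian 32-bit total length, which is consistent with the headers quoted in this thread but is an assumption about the on-flash layout; `declared_length` and `trim_image` are hypothetical helper names.

```python
import base64
import struct

ERASED = 0xFFFFFFFF


def declared_length(raw: bytes):
    """Return the length field from the first 4 bytes, or None if implausible.

    Assumption: the image starts with a little-endian uint32 total length.
    """
    if len(raw) < 4:
        return None
    (length,) = struct.unpack_from("<I", raw, 0)
    if length in (0, ERASED) or length > len(raw):
        return None
    return length


def trim_image(raw: bytes) -> bytes:
    """Cut a whole-partition read down to the declared image length."""
    length = declared_length(raw)
    if length is None:
        raise ValueError("header does not describe a stored core dump")
    return raw[:length]


# Start of the failing dump quoted above: the length field reads 0xFFFFFFFF.
bad_prefix = base64.b64decode("/////wAA")
print(declared_length(bad_prefix))  # None

# Start of the working dump, padded out to stand in for a partition read.
good = base64.b64decode("rI0AAAIB") + bytes(40000)
print(hex(declared_length(good)))
```

If the length field is implausible, the upload can be rejected early instead of being handed to the analysis tool.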

This is my function:

char *rpc_coredump(void *params)
{
    ESP_LOGI(TAG, "Generating core dump...");

    // Find the coredump partition
    const esp_partition_t *core_dump_partition = esp_partition_find_first(
        ESP_PARTITION_TYPE_DATA, ESP_PARTITION_SUBTYPE_DATA_COREDUMP, "coredump");

    // Check for NULL before dereferencing the partition pointer
    if (!core_dump_partition)
    {
        ESP_LOGE(TAG, "Core dump partition not found!");
        return strdup("Core dump partition not found!"); // freed in calling function
    }

    ESP_LOGI(TAG, "Core dump partition: %s, size: %lu", core_dump_partition->label,
             (unsigned long)core_dump_partition->size);

    // Read the core dump size
    size_t core_dump_size = core_dump_partition->size;
    uint8_t *core_dump_data = malloc(core_dump_size + 1);
    if (!core_dump_data)
    {
        ESP_LOGE(TAG, "Failed to allocate memory for core dump");
        return NULL;
    }

    esp_err_t err = esp_partition_read(core_dump_partition, 0, core_dump_data, core_dump_size);
    if (err != ESP_OK)
    {
        ESP_LOGE(TAG, "Failed to read core dump data: %s", esp_err_to_name(err));
        free(core_dump_data);
        return strdup("Failed to read core dump data"); // freed in calling function
    }

    // Calculate Base64 encoded size
    size_t encoded_size = 0;
    mbedtls_base64_encode(NULL, 0, &encoded_size, core_dump_data, core_dump_size);
    uint8_t *encoded_data = malloc(encoded_size + 1); // used as return, so freed in calling function
    if (!encoded_data)
    {
        ESP_LOGE(TAG, "Failed to allocate memory for encoded data");
        free(core_dump_data);
        return NULL;
    }

    // Encode core dump to Base64
    err = mbedtls_base64_encode(encoded_data, encoded_size, &encoded_size, core_dump_data, core_dump_size);
    if (err != 0)
    {
        ESP_LOGE(TAG, "Failed to encode core dump data to Base64");
        free(core_dump_data);
        free(encoded_data);
        return NULL;
    }
    encoded_data[encoded_size] = '\0'; // Null-terminate the Base64 string

    ESP_LOGI(TAG, "Core dump encoded size: %u", (unsigned)encoded_size);

    printf("Core dump data (encoded): %s\n", (const char *)encoded_data);
    free(core_dump_data); // raw copy is no longer needed once encoded
    return (char *)encoded_data;
}

If I call `idf.py coredump-info -c coredump.bin` on the resulting file, I get an error:
`The format of the provided core-file is not recognized. Please ensure that the core-format matches one of the following: ELF (“elf”), raw (raw) or base64-encoded (b64) binary`

In particular, I've noticed that although both files are 87kb (encoded), the working version has a whole section at the top with lots of AAAAAAAA, whereas the non-working version doesn't. Could it be that because my flash is encrypted, something is going wrong when the device is saving the coredump to the flash?

Any help you could provide @DrMickeyLauer based on how you managed to get it to work would be very much appreciated!

Postby iParcelBox » Sun Oct 20, 2024 8:36 pm

As an update, I've realised that most times when I force an abort() it saves the crash dump correctly and I'm able to download it from flash.

However, occasionally I get a crash and, despite coredump being set to write to flash, I see the below in the console:

`
[Oct 20 21:33:45.868] I (5861) esp_core_dump_flash: Core dump data checksum is correct
[Oct 20 21:33:45.876] D (5863) esp_core_dump_elf: ELF ident 7f E L F
[Oct 20 21:33:45.879] D (5867) esp_core_dump_elf: Ph_num 42 offset 34
[Oct 20 21:33:45.884] D (5871) esp_core_dump_elf: PHDR type 4 off 574 vaddr 0 paddr 0 filesz 2f80 memsz 2f80 flags 6 align 0
[Oct 20 21:33:45.893] D (5881) esp_core_dump_elf: PHDR type 1 off 34f4 vaddr 3fccf738 paddr 3fccf738 filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:45.904] D (5891) esp_core_dump_elf: PHDR type 1 off 3650 vaddr 3fccf3e0 paddr 3fccf3e0 filesz 340 memsz 340 flags 6 align 0
[Oct 20 21:33:45.914] D (6383) esp_core_dump_elf: PHDR type 1 off 3990 vaddr 3fcb46a0 paddr 3fcb46a0 filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:45.926] D (6394) esp_core_dump_elf: PHDR type 1 off 3aec vaddr 3fcb43e0 paddr 3fcb43e0 filesz 2a0 memsz 2a0 flags 6 align 0
[Oct 20 21:33:45.935] D (6404) esp_core_dump_elf: PHDR type 1 off 3d8c vaddr 3fcb3ee0 paddr 3fcb3ee0 filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:45.946] D (6415) esp_core_dump_elf: PHDR type 1 off 3ee8 vaddr 3fcb3c20 paddr 3fcb3c20 filesz 2a0 memsz 2a0 flags 6 align 0
[Oct 20 21:33:45.956] D (6425) esp_core_dump_elf: PHDR type 1 off 4188 vaddr 3fce699c paddr 3fce699c filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:45.967] D (6436) esp_core_dump_elf: PHDR type 1 off 42e4 vaddr 3c19caa0 paddr 3c19caa0 filesz 330 memsz 330 flags 6 align 0
[Oct 20 21:33:45.977] D (6446) esp_core_dump_elf: PHDR type 1 off 4614 vaddr 3fcc02b0 paddr 3fcc02b0 filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:45.988] D (6457) esp_core_dump_elf: PHDR type 1 off 4770 vaddr 3fcbff80 paddr 3fcbff80 filesz 310 memsz 310 flags 6 align 0
[Oct 20 21:33:45.998] D (6467) esp_core_dump_elf: PHDR type 1 off 4a80 vaddr 3fce7dd4 paddr 3fce7dd4 filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:46.009] D (6478) esp_core_dump_elf: PHDR type 1 off 4bdc vaddr 3fcb2ce0 paddr 3fcb2ce0 filesz 460 memsz 460 flags 6 align 0
[Oct 20 21:33:46.019] D (6488) esp_core_dump_elf: PHDR type 1 off 503c vaddr 3fcc6058 paddr 3fcc6058 filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:46.030] D (6499) esp_core_dump_elf: PHDR type 1 off 5198 vaddr 3fcc5d60 paddr 3fcc5d60 filesz 2e0 memsz 2e0 flags 6 align 0
[Oct 20 21:33:46.040] D (6509) esp_core_dump_elf: PHDR type 1 off 5478 vaddr 3fcb5024 paddr 3fcb5024 filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:46.051] D (6520) esp_core_dump_elf: PHDR type 1 off 55d4 vaddr 3fcb4d80 paddr 3fcb4d80 filesz 280 memsz 280 flags 6 align 0
[Oct 20 21:33:46.061] D (6530) esp_core_dump_elf: PHDR type 1 off 5854 vaddr 3fcc7674 paddr 3fcc7674 filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:46.072] D (6541) esp_core_dump_elf: PHDR type 1 off 59b0 vaddr 3fcc9cc0 paddr 3fcc9cc0 filesz 2b0 memsz 2b0 flags 6 align 0
[Oct 20 21:33:46.082] D (6551) esp_core_dump_elf: PHDR type 1 off 5c60 vaddr 3fccb388 paddr 3fccb388 filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:46.093] D (6562) esp_core_dump_elf: PHDR type 1 off 5dbc vaddr 3fccb0a0 paddr 3fccb0a0 filesz 2d0 memsz 2d0 flags 6 align 0
[Oct 20 21:33:46.103] D (6572) esp_core_dump_elf: PHDR type 1 off 608c vaddr 3fcd3774 paddr 3fcd3774 filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:46.114] D (6583) esp_core_dump_elf: PHDR type 1 off 61e8 vaddr 3fcd3490 paddr 3fcd3490 filesz 2c0 memsz 2c0 flags 6 align 0
[Oct 20 21:33:46.124] D (6593) esp_core_dump_elf: PHDR type 1 off 64a8 vaddr 3fcd4dbc paddr 3fcd4dbc filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:46.135] D (6604) esp_core_dump_elf: PHDR type 1 off 6604 vaddr 3fcd4ae0 paddr 3fcd4ae0 filesz 2c0 memsz 2c0 flags 6 align 0
[Oct 20 21:33:46.145] D (6614) esp_core_dump_elf: PHDR type 1 off 68c4 vaddr 3fce53e0 paddr 3fce53e0 filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:46.156] D (6625) esp_core_dump_elf: PHDR type 1 off 6a20 vaddr 3fcebde0 paddr 3fcebde0 filesz 410 memsz 410 flags 6 align 0
[Oct 20 21:33:46.166] D (6635) esp_core_dump_elf: PHDR type 1 off 6e30 vaddr 3fcc43b0 paddr 3fcc43b0 filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:46.177] D (6646) esp_core_dump_elf: PHDR type 1 off 6f8c vaddr 3fcc4090 paddr 3fcc4090 filesz 300 memsz 300 flags 6 align 0
[Oct 20 21:33:46.187] D (6656) esp_core_dump_elf: PHDR type 1 off 728c vaddr 3fcd1da8 paddr 3fcd1da8 filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:46.198] D (6667) esp_core_dump_elf: PHDR type 1 off 73e8 vaddr 3fcd19b0 paddr 3fcd19b0 filesz 3e0 memsz 3e0 flags 6 align 0
[Oct 20 21:33:46.208] D (6677) esp_core_dump_elf: PHDR type 1 off 77c8 vaddr 3fce4dfc paddr 3fce4dfc filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:46.219] D (6688) esp_core_dump_elf: PHDR type 1 off 7924 vaddr 3fce4b20 paddr 3fce4b20 filesz 2c0 memsz 2c0 flags 6 align 0
[Oct 20 21:33:46.229] D (6698) esp_core_dump_elf: PHDR type 1 off 7be4 vaddr 3fcae1a8 paddr 3fcae1a8 filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:46.240] D (6227) esp_core_dump_elf: PHDR type 1 off 7d40 vaddr 3fcadf00 paddr 3fcadf00 filesz 290 memsz 290 flags 6 align 0
[Oct 20 21:33:46.250] D (6719) esp_core_dump_elf: PHDR type 1 off 7fd0 vaddr 3fcb092c paddr 3fcb092c filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:46.261] D (6249) esp_core_dump_elf: PHDR type 1 off 812c vaddr 3fcb0680 paddr 3fcb0680 filesz 290 memsz 290 flags 6 align 0
[Oct 20 21:33:46.271] D (6259) esp_core_dump_elf: PHDR type 1 off 83bc vaddr 3fcadb24 paddr 3fcadb24 filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:46.282] D (6269) esp_core_dump_elf: PHDR type 1 off 8518 vaddr 3fcad870 paddr 3fcad870 filesz 290 memsz 290 flags 6 align 0
[Oct 20 21:33:46.292] D (6761) esp_core_dump_elf: PHDR type 1 off 87a8 vaddr 3fcd7c5c paddr 3fcd7c5c filesz 15c memsz 15c flags 6 align 0
[Oct 20 21:33:46.303] D (6772) esp_core_dump_elf: PHDR type 1 off 8904 vaddr 3fcd7960 paddr 3fcd7960 filesz 2e0 memsz 2e0 flags 6 align 0
[Oct 20 21:33:46.313] D (6782) esp_core_dump_elf: PHDR type 4 off 8be4 vaddr 0 paddr 0 filesz 16c memsz 16c flags 6 align 0
[Oct 20 21:33:46.323] D (6792) esp_core_dump_elf: 72 bytes target note (204A) found in the note section
[Oct 20 21:33:46.330] D (6799) esp_core_dump_elf: 152 bytes target note (2A5) found in the note section
[Oct 20 21:33:46.338] D (6807) esp_core_dump_port: Crash TCB 0x3fccf738
[Oct 20 21:33:46.342] D (6811) esp_core_dump_port: excvaddr 0x0
[Oct 20 21:33:46.346] D (6815) esp_core_dump_port: exccause 0x1d
[Oct 20 21:33:46.351] D (6820) esp_core_dump_elf: Core dump version 0x90102
[Oct 20 21:33:46.356] D (6825) esp_core_dump_elf: App ELF SHA2 0b4354a4fc517d59
[Oct 20 21:33:46.361] D (6830) esp_core_dump_elf: Crashing task RPCCommandTask
[Oct 20 21:33:46.367] D (6836) esp_core_dump_port: Crashing PC 0x40375f7a
[Oct 20 21:33:46.372] D (6841) esp_core_dump_port: A[0] 0x80383f24
[Oct 20 21:33:46.376] D (6845) esp_core_dump_port: A[1] 0x3fccf4a0
[Oct 20 21:33:46.380] D (6849) esp_core_dump_port: A[2] 0x3fccf4eb
[Oct 20 21:33:46.385] D (6854) esp_core_dump_port: A[3] 0x3fccf518
[Oct 20 21:33:46.389] D (6858) esp_core_dump_port: A[4] 0xa
[Oct 20 21:33:46.393] D (6862) esp_core_dump_port: A[5] 0x3fccf530
[Oct 20 21:33:46.397] D (6866) esp_core_dump_port: A[6] 0x3fccf510
[Oct 20 21:33:46.401] D (6870) esp_core_dump_port: A[7] 0xc
[Oct 20 21:33:46.405] D (6874) esp_core_dump_port: A[8] 0x0
[Oct 20 21:33:46.409] D (6878) esp_core_dump_port: A[9] 0x1
[Oct 20 21:33:46.413] D (6882) esp_core_dump_port: A[10] 0x3fccf4e9
[Oct 20 21:33:46.417] D (6886) esp_core_dump_port: A[11] 0x3fccf4e9
[Oct 20 21:33:46.421] D (6890) esp_core_dump_port: A[12] 0xa
[Oct 20 21:33:46.425] D (6894) esp_core_dump_port: A[13] 0x0
[Oct 20 21:33:46.429] D (6898) esp_core_dump_port: A[14] 0x0
[Oct 20 21:33:46.433] D (6902) esp_core_dump_port: A[15] 0x1
[Oct 20 21:33:46.437] D (6906) esp_core_dump_port: Crash Backtrace
[Oct 20 21:33:46.441] D (6910) esp_core_dump_port: 0x40375f7a
[Oct 20 21:33:46.445] D (6914) esp_core_dump_port: 0x40383f21
[Oct 20 21:33:46.449] D (6918) esp_core_dump_port: 0x40389b09
[Oct 20 21:33:46.453] D (6922) esp_core_dump_port: 0x42021e8b
[Oct 20 21:33:46.457] D (6926) esp_core_dump_port: 0x4202359f
[Oct 20 21:33:46.461] D (6930) esp_core_dump_port: 0x403842ca
`

The dump is then also written to flash; however, I don't get a coredump.bin that I'm able to download and review.
