ESP32 hangs on stack overflow

Post by vazkewel » Tue Nov 22, 2022 4:50 am

Hi Guys,

We are facing a problem where the ESP32 halts after a stack overflow. If we trigger stack overflows repeatedly, the ESP32 eventually stays halted and does not reset.
Once it has halted, the GPIO states are similar to those in reset.

We placed a gpio_set_level() call inside panic_handler() and confirmed that the code enters the panic handler but never leaves it.

Code:

static void panic_handler(void *frame, bool pseudo_excause)
{
    panic_info_t info = { 0 };

    /*
     * Setup environment and perform necessary architecture/chip specific
     * steps here prior to the system panic handler.
     * */
    int core_id = cpu_hal_get_core_id();

    // If multiple cores arrive at panic handler, save frames for all of them
    g_exc_frames[core_id] = frame;

#if !CONFIG_ESP_SYSTEM_SINGLE_CORE_MODE
    // These are cases where both CPUs both go into panic handler. The following code ensures
    // only one core proceeds to the system panic handler.
    if (pseudo_excause) {
#define BUSY_WAIT_IF_TRUE(b)                { if (b) while(1); }
        // For WDT expiry, pause the non-offending core - offending core handles panic
        BUSY_WAIT_IF_TRUE(panic_get_cause(frame) == PANIC_RSN_INTWDT_CPU0 && core_id == 1);
        BUSY_WAIT_IF_TRUE(panic_get_cause(frame) == PANIC_RSN_INTWDT_CPU1 && core_id == 0);

        // For cache error, pause the non-offending core - offending core handles panic
        if (panic_get_cause(frame) == PANIC_RSN_CACHEERR && core_id != esp_cache_err_get_cpuid()) {
            // Only print the backtrace for the offending core in case of the cache error
            g_exc_frames[core_id] = NULL;
            while (1) {
                ;
            }
        }
    }

    // Need to reconfigure WDTs before we stall any other CPU
    esp_panic_handler_reconfigure_wdts();

    esp_rom_delay_us(1);
    SOC_HAL_STALL_OTHER_CORES();
    // gpio_set_level(4, 1);
#endif

    esp_ipc_isr_stall_abort();

    if (esp_cpu_in_ocd_debug_mode()) {
#if __XTENSA__
        if (!(esp_ptr_executable(cpu_ll_pc_to_ptr(panic_get_address(frame))) && (panic_get_address(frame) & 0xC0000000U))) {
            /* Xtensa ABI sets the 2 MSBs of the PC according to the windowed call size
             * Incase the PC is invalid, GDB will fail to translate addresses to function names
             * Hence replacing the PC to a placeholder address in case of invalid PC
             */
            panic_set_address(frame, (uint32_t)&_invalid_pc_placeholder);
        }
#endif
        if (panic_get_cause(frame) == PANIC_RSN_INTWDT_CPU0
#if !CONFIG_ESP_SYSTEM_SINGLE_CORE_MODE
                || panic_get_cause(frame) == PANIC_RSN_INTWDT_CPU1
#endif
           ) {
            wdt_hal_write_protect_disable(&wdt0_context);
            wdt_hal_handle_intr(&wdt0_context);
            wdt_hal_write_protect_enable(&wdt0_context);
        }
    }

    // Convert architecture exception frame into abstracted panic info
    frame_to_panic_info(frame, &info, pseudo_excause);

    // Call the system panic handler
    esp_panic_handler(&info);
}

The code gets stuck at:

Code:

SOC_HAL_STALL_OTHER_CORES();

That is, a gpio_set_level() call placed right after this line never executes, so the pin is never set.

We came across this issue when our main application hit a stack overflow and got stuck on site.
We have reproduced it with a simple program that has WiFi enabled and a large number of print statements.
In this example (attached below), we trigger a stack-overflow panic every 9 seconds, and within about 10-20 minutes the ESP32 hangs.
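
For readers who cannot open the attachment, a reproducer of this kind can be sketched roughly as below. This is not the attached project: burn_stack(), overflow_task(), FRAME_BYTES, the 2048-byte task stack and the recursion depth of 64 are all hypothetical values chosen for illustration, and WiFi initialization and the long print statements are left out. Whether the fault is reported as a stack overflow or as some other exception depends on the stack-check options in sdkconfig, which the sketch does not set.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#ifdef ESP_PLATFORM               /* defined by the ESP-IDF build system */
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#endif

#define FRAME_BYTES 256           /* hypothetical stack use per recursion level */

/* Recurse `depth` times, writing a local buffer in every frame so that each
 * level really occupies stack. Returns the total number of buffer bytes
 * used, i.e. (depth + 1) * FRAME_BYTES. */
static uint32_t burn_stack(uint32_t depth)
{
    volatile uint8_t pad[FRAME_BYTES];
    for (size_t i = 0; i < sizeof(pad); i++) {
        pad[i] = (uint8_t)depth;
    }
    uint32_t below = (depth > 0) ? burn_stack(depth - 1) : 0;
    /* Read pad after the recursive call so the buffer stays live and the
     * call cannot be turned into a tail call. The difference is always 0. */
    return below + (uint32_t)sizeof(pad) + (uint32_t)(pad[0] - (uint8_t)depth);
}

#ifdef ESP_PLATFORM
static void overflow_task(void *arg)
{
    (void)arg;
    vTaskDelay(pdMS_TO_TICKS(9000));   /* wait 9 s, as in the post */
    /* 65 frames * 256 bytes is far more than the 2048-byte stack below,
     * so the recursion is expected to fault before this print completes. */
    printf("used %u bytes\n", (unsigned)burn_stack(64));
    vTaskDelete(NULL);
}

void app_main(void)
{
    /* WiFi initialization and heavy logging would go here (omitted). */
    xTaskCreate(overflow_task, "overflow", 2048, NULL, 5, NULL);
}
#endif
```

Because the chip normally reboots after the panic and app_main() runs again, the overflow repeats roughly every 9 seconds without any extra loop, until a reset fails to happen.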

We have noticed that the issue does not occur if WiFi is not initialized, and that it occurs sooner when we print long strings of data.
We think it is similar to https://github.com/espressif/esp-idf/issues/8033; however, we do not call esp_restart().

Our development environment:
ESP-IDF: v4.4
OS: Ubuntu
Module: ESP32-WROOM-32E 8MB

Please find the code attached. We have also raised a GitHub issue about this, which includes some debug logs: https://github.com/espressif/esp-idf/issues/10110
Attachments
Basic_Project.zip (36.73 KiB)