Unfortunately it's not easy to create a minimal working example, because the issue seems to occur only when the processor has lots of work to do. After some further debugging, we have found a little more information, which I have described below:
----------------------------------------------------------------------------------------------------
Issue Description
----------------------------------------------------------------------------------------------------
We have an issue where the ipc1 task locks up when running spi_flash_op_block_func(), causing the
watchdog to report IDLE1. It happens when using the NVS API and is because the s_flash_op_complete
flag never gets set in spi_flash_enable_interrupts_caches_and_other_cpu().
Typically this is an infrequent issue but we can make it occur more quickly by adding NVS writes
at higher rates in the main task.
It only seems to occur when the system is loaded, but we do not know whether this is because of total
processor loading, the number of context switches or something else. We have tried to replicate the
issue in a simpler application but have not yet managed to do so.
The issue occurs when our application is simultaneously:
- Receiving and processing 25 Hz input from a GNSS engine.
- Reading from two I2C devices at 10 Hz (accelerometer and GPIO expander).
- Updating a display at 25 Hz.
- Logging 25 Hz data to an SD card.
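To make the lock-up mechanism concrete, here is a minimal host-side model of the busy-wait that ipc1 sits in. It is only a sketch under assumptions: flash_op_complete is a hypothetical stand-in for s_flash_op_complete, the function names are invented for illustration, and the spin limit is not part of ESP-IDF. It is added only so the stuck case returns instead of hanging.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the s_flash_op_complete flag named above. */
static volatile bool flash_op_complete;

/* Models the blocked core clearing the flag before it starts waiting. */
void reset_flash_op(void)
{
    flash_op_complete = false;
}

/* Models the core that performed the flash operation releasing the waiter. */
void signal_flash_op_complete(void)
{
    flash_op_complete = true;
}

/*
 * Models the busy-wait on the blocked core.
 * Assumption: the real loop has no limit; max_spins exists only in this
 * sketch so that a flag which is never set shows up as a false return
 * value rather than an endless loop.
 */
bool wait_for_flash_op(uint32_t max_spins)
{
    for (uint32_t i = 0; i < max_spins; i++) {
        if (flash_op_complete) {
            return true;   /* released by the other side */
        }
    }
    return flash_op_complete;  /* still false: this is the lock-up case */
}
```

In this model, a wait that returns false corresponds to ipc1 spinning without ever yielding, which is consistent with the watchdog reporting IDLE1.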
----------------------------------------------------------------------------------------------------
Setup Description
----------------------------------------------------------------------------------------------------
We are running ESP-IDF v5.1.1 with the following minor changes:
- Ported esp_intr_dump().
- Enabled FF_USE_FIND in the fatfs component.
- Added custom code to second stage bootloader.
- Added fatfs component support for SD card with invalid BIOS parameter block file system name.
We are using an ESP32-S3 on custom PCB and running the following tasks:
Core 0 Core 1 tskNO_AFFINITY
------ ------ ------
IDLE0 IDLE1 display_task
ipc0 ipc1 keys_task
main log_task serial_rx_task
esp_timer can_task gnss_rx_task
sys_evt data_processing_task
----------------------------------------------------------------------------------------------------
Debug process so far
----------------------------------------------------------------------------------------------------
To debug the issue we have added call counters which get printed in esp_task_wdt_isr_user_handler()
when the watchdog triggers. These suggest that when the issue occurs:
The call to esp_ipc_call() in spi_flash_disable_interrupts_caches_and_other_cpu() when called from
core 0 does not return:
ESP_ERROR_CHECK(esp_ipc_call(other_cpuid, &spi_flash_op_block_func, (void *) other_cpuid));
The call to xTaskNotify() in esp_ipc_call_and_wait() when called from core 1 does not return:
xTaskNotify(s_ipc_task_handle[cpu_id], wait_for, eSetValueWithOverwrite);
The call to vPortExitCritical() in xTaskGenericNotify() does not return:
taskEXIT_CRITICAL( &xKernelLock );
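
For reference, the call counters mentioned above can be sketched roughly as follows. This is a simplified, host-runnable illustration rather than our exact code: the probe_* names and the enum are hypothetical, and on the target the dump would be called from esp_task_wdt_isr_user_handler() using an ISR-safe print routine instead of printf(). A call site counts as "did not return" when its enter count is ahead of its exit count.

```c
#include <inttypes.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical probe IDs, one per call site under suspicion. */
enum probe_id {
    PROBE_IPC_CALL,
    PROBE_TASK_NOTIFY,
    PROBE_EXIT_CRITICAL,
    PROBE_COUNT
};

static const char *const probe_names[PROBE_COUNT] = {
    "esp_ipc_call", "xTaskNotify", "taskEXIT_CRITICAL"
};

/* Atomic so that increments coming from both cores are not lost. */
static atomic_uint_fast32_t probe_enter[PROBE_COUNT];
static atomic_uint_fast32_t probe_exit[PROBE_COUNT];

/* Call immediately before the instrumented call. */
void probe_before(enum probe_id id)
{
    atomic_fetch_add(&probe_enter[id], 1);
}

/* Call immediately after the instrumented call returns. */
void probe_after(enum probe_id id)
{
    atomic_fetch_add(&probe_exit[id], 1);
}

/* Number of calls at this site that were entered but have not returned. */
uint32_t probe_pending(enum probe_id id)
{
    return (uint32_t)(atomic_load(&probe_enter[id]) - atomic_load(&probe_exit[id]));
}

/* Prints every probe and returns how many sites have a pending call. */
int probe_dump(void)
{
    int stuck = 0;
    for (int i = 0; i < PROBE_COUNT; i++) {
        uint32_t pending = probe_pending((enum probe_id)i);
        printf("%-18s pending=%" PRIu32 "\n", probe_names[i], pending);
        if (pending != 0) {
            stuck++;
        }
    }
    return stuck;
}
```

Each suspect call is wrapped with probe_before() and probe_after() using its own ID, so a non-zero pending count at watchdog time points at a call that was entered and never came back.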