UART/DPORT reads with SPI flash access hangs both cores

Post by kbaud1 » Fri Jun 10, 2022 6:01 pm

This simple test program demonstrates a malfunction on ESP32-WROOM modules when one core performs UART reads while the other core performs SPI flash reads. Within seconds of starting, both cores hang forever (with the WDT disabled). The hang is visible when the numbers stop printing; LEDs we have on IO pins also stop flashing, which confirms that both cores are stuck.

This malfunction has been verified on both Rev 1 and Rev 3 silicon, with both IDF 3.3.6 and IDF 4.4.1. The project was built from the hello_world sample with the task watchdog setting disabled.

You can change how frequently the malfunction happens by changing the task priorities, the CPU frequency, and the vTaskDelay counters, or by commenting the delays out (do this if you have trouble seeing the malfunction). The malfunction is very sensitive to timing, so a subtle timing change can make it look as if the problem has gone away. It was very hard to trace when it first appeared, infrequently, in a much more complex program. So if you see "task starvation," understand that the test program was deliberately set up this way to make the malfunction happen more often for troubleshooting purposes.

Why does this malfunction happen, and what should be done to prevent it? Through trial and error we found that disabling interrupts during READ_PERI_REG seems to stop it, as does using DPORT_READ_PERI_REG [esp_dport_access_reg_read()] instead. Just adding a "MEMW" instruction before each READ_PERI_REG does not help. We also found that reading RTC_CNTL_TIME0_REG (also in the DPORT memory space) instead of the UART FIFO does not trigger the malfunction. It seems there is an underlying problem that we have only hidden, not resolved.
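For reference, the two workarounds described above might look roughly like the sketch below. This is not the code from the attached main.c. The function names uart_fifo_read_masked() and uart_fifo_read_dport() are made up for illustration, and using a portENTER_CRITICAL section is only one assumed way of "disabling interrupts" around the read. The #else branch contains host-only stand-ins so the sketch compiles off-target; they are not the ESP-IDF definitions.

```c
#include <stdint.h>

#ifdef ESP_PLATFORM
/* On target: headers as shipped with ESP-IDF for the ESP32. */
#include "freertos/FreeRTOS.h"
#include "soc/soc.h"            /* READ_PERI_REG */
#include "soc/dport_access.h"   /* DPORT_READ_PERI_REG */
#include "soc/uart_reg.h"       /* UART_FIFO_REG */
#else
/* Host-only stand-ins (assumptions for illustration, NOT ESP-IDF code). */
#include <assert.h>
typedef int portMUX_TYPE;
#define portMUX_INITIALIZER_UNLOCKED 0
static int s_crit_depth;    /* > 0 models "interrupts masked" */
static int s_masked_reads;  /* reads performed while masked */
#define portENTER_CRITICAL(mux) do { (void)(mux); s_crit_depth++; } while (0)
#define portEXIT_CRITICAL(mux) \
    do { (void)(mux); assert(s_crit_depth > 0); s_crit_depth--; } while (0)
static uint32_t fake_reg_read(uint32_t addr)
{
    (void)addr;
    if (s_crit_depth > 0) {
        s_masked_reads++;
    }
    return 0x1A5u;          /* arbitrary value; low byte is 0xA5 */
}
#define READ_PERI_REG(addr)       fake_reg_read(addr)
#define DPORT_READ_PERI_REG(addr) fake_reg_read(addr)
#define UART_FIFO_REG(i)          (0x3FF40000u)  /* placeholder address */
#endif

static portMUX_TYPE s_uart_read_mux = portMUX_INITIALIZER_UNLOCKED;

/* Workaround A: plain register read with interrupts masked on this core.
 * Note: reading the FIFO register pops one byte, so the caller should
 * first check that the RX FIFO is not empty. */
uint32_t uart_fifo_read_masked(int uart_num)
{
    (void)uart_num;  /* unused by the host placeholder macro */
    portENTER_CRITICAL(&s_uart_read_mux);
    uint32_t value = READ_PERI_REG(UART_FIFO_REG(uart_num));
    portEXIT_CRITICAL(&s_uart_read_mux);
    return value & 0xFFu;   /* received byte is in bits 7:0 */
}

/* Workaround B: use the DPORT read helper instead of READ_PERI_REG. */
uint32_t uart_fifo_read_dport(int uart_num)
{
    (void)uart_num;  /* unused by the host placeholder macro */
    return DPORT_READ_PERI_REG(UART_FIFO_REG(uart_num)) & 0xFFu;
}
```

Either wrapper would replace the bare READ_PERI_REG(UART_FIFO_REG(n)) call in the UART task. As noted above, both only appear to stop the hang; neither explains it.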

Research suggests that READ_PERI_REG of the UART FIFO may be the culprit. The TRM warns that DPORT reads may cause programs to crash if the bug workarounds 3.3, 3.10, 3.16, and 3.17 are not carefully followed:
https://www.espressif.com/sites/default ... ual_en.pdf
https://www.espressif.com/sites/default ... p32_en.pdf

In addition, we have studied discussions of UART FIFO read problems such as this one:
https://github.com/espressif/esp-idf/issues/5101

A careful study of each issue above suggests that none of them can explain this malfunction. Is it caused by something undocumented? Tracing through the driver code shows that spi_flash_read() calls spi_flash_guard_start() to halt both CPUs, but this gets stuck in esp_ipc_call_and_wait(). The other CPU gets stuck in ipc_task(), which calls spi_flash_op_block_func(). This roughly agrees with the debug output.

Perhaps there is some memory corruption from a collision between READ_PERI_REG and the IPC interrupt? If a fix is required for this, what is it, and is it also needed when reading something like RTC_CNTL_TIME0_REG?

The following references "shared registers not protected from sharing":
https://github.com/espressif/esp-idf/co ... a6359e546a

Is some step required to protect the UART FIFO DPORT read from sharing also? None of the UART driver code seems to have anything like this.

Attachments: main.c (2.3 KiB), debug output.txt (3.39 KiB)

Re: UART/DPORT reads with SPI flash access hangs both cores

Post by WiFive » Fri Jun 10, 2022 8:14 pm

This seems more appropriate for a GitHub issue.