Are there any tools to help debug watchdog timeouts
Re: Are there any tools to help debug watchdog timeouts
Yes, the backtrace on panic is rubbish; the panic-on-watchdog-timeout option is mostly meant to reboot the system when something abnormal happens. We are aware that this doesn't help when debugging; I ran into this myself a while ago. We have an internal feature request to stop in the active process when the task watchdog triggers, which should be in esp-idf master eventually.
Re: Are there any tools to help debug watchdog timeouts
So there is no way to figure out what the backtrace is pointing to?
Re: Are there any tools to help debug watchdog timeouts
Not using the serial port gdbstub. I seem to remember that JTAG may be a better option; it either gives you a good backtrace out of the box, or you can get there by reading the exception registers (EPC1, if memory serves).
Re: Are there any tools to help debug watchdog timeouts
I am experiencing a similar issue but in my case esp_timer is the currently running task.
The funny thing is that I am not using esp_timer in my code, so it must be one of the libraries my code depends on.
I have spent over a month trying to debug this issue without any success. After some changes in my code, the task_wdt now triggers after a few days of uptime (it used to trigger within a few hours).
So, after this many years is there finally a better option to debug this?
Code:
E (143150756) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (143150756) task_wdt: - IDLE (CPU 0)
E (143150756) task_wdt: Tasks currently running:
E (143150756) task_wdt: CPU 0: esp_timer
Re: Are there any tools to help debug watchdog timeouts
That's impossible to say without at least looking at your code, sorry.
Re: Are there any tools to help debug watchdog timeouts
The corrupted backtrace on abort is fixed in the latest toolchain release, 2021r2. https://github.com/espressif/esp-idf/issues/6124
If you are using an IDF release older than 4.4, you can still try to use the gdb from the latest toolchain.
Re: Are there any tools to help debug watchdog timeouts
SDK: ESP-IDF v5.0.1
I want to log the stack trace when I get an ESP_RST_TASK_WDT task watchdog timeout. I have a custom panic handler that copies the stack trace to RAM, and startup code that reads it back. I thought this would work once I enabled "Invoke panic handler on Task Watchdog timeout" in the config editor, but it does not: I simply get ESP_RST_TASK_WDT as the restart reason and no trace. The behaviour is the same regardless of the config setting; it seems to have no effect. Any idea how I can log the stack or determine the stuck task on a task watchdog timeout?
Re: Are there any tools to help debug watchdog timeouts
The stack trace may be a bit hard to come by as the watchdog is handled via an interrupt, i.e. the stack of the interrupted code may be in some "intermediate" state and the call stack prior to the interrupt not available.
Note that you "can define the function esp_task_wdt_isr_user_handler in the user code, in order to receive the timeout event and extend the default behavior." It may also help to look at the implementation of task_wdt_isr(void*).
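As a rough illustration of that hook, here is a minimal sketch that only counts watchdog timeouts in memory that is expected to survive the reset, so startup code can check it afterwards. Everything except esp_task_wdt_isr_user_handler and RTC_NOINIT_ATTR is a hypothetical name made up for this example (twdt_marker_t, TWDT_MARKER_MAGIC, twdt_marker_take). It assumes the handler takes no arguments and runs in interrupt context, so it avoids logging and anything that blocks. The #else branch exists only so the sketch also compiles off-target.

```c
#include <stdint.h>

#ifdef ESP_PLATFORM
#include "esp_attr.h"              /* RTC_NOINIT_ATTR */
#else
#define RTC_NOINIT_ATTR            /* host build: ordinary RAM, compile check only */
#endif

/* Hypothetical tag used to tell "written by the handler" from random contents. */
#define TWDT_MARKER_MAGIC 0x54574454u

/* Hypothetical record layout for this sketch. */
typedef struct {
    uint32_t magic;   /* equals TWDT_MARKER_MAGIC once the handler has run */
    uint32_t hits;    /* number of task watchdog timeouts seen */
} twdt_marker_t;

/* Not initialised at boot on target, so the contents can outlive a reset. */
static RTC_NOINIT_ATTR twdt_marker_t s_twdt_marker;

/* Hook named in the quote above; assumed to be called from the TWDT ISR. */
void esp_task_wdt_isr_user_handler(void);

void esp_task_wdt_isr_user_handler(void)
{
    /* Keep this short: no logging, no blocking calls. */
    if (s_twdt_marker.magic != TWDT_MARKER_MAGIC) {
        s_twdt_marker.magic = TWDT_MARKER_MAGIC;
        s_twdt_marker.hits = 0;
    }
    s_twdt_marker.hits++;
}

/* Hypothetical helper: call early at startup. Returns the number of
 * recorded timeouts (0 if the marker is invalid) and clears the record. */
uint32_t twdt_marker_take(void)
{
    uint32_t hits = 0;
    if (s_twdt_marker.magic == TWDT_MARKER_MAGIC) {
        hits = s_twdt_marker.hits;
    }
    s_twdt_marker.magic = 0;
    s_twdt_marker.hits = 0;
    return hits;
}
```

At startup you would call twdt_marker_take() once, before anything else touches the record. The same pattern could be extended to store whatever task information you can safely read from interrupt context, but that part is not shown here.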
Re: Are there any tools to help debug watchdog timeouts
Also worth noting: The "offending" task for a WDT timeout may not even be at fault, especially when it appears to be the IDLE task. One task may well miss its watchdog because another, higher-priority task is starving it of the CPU.
A more manual approach is to review your code for sections where significant time may be spent (in a loop) without calling vTaskDelay or another blocking FreeRTOS function (taking semaphores, reading from queues, ...). Busy-waiting or "spinning" on a flag is a candidate for reconsideration. Busy spinning may also happen accidentally when a timeout is shorter than portTICK_PERIOD_MS (10 ms by default): a timeout (xTicksToWait) of 5 ms cannot be represented and becomes a timeout of 0 (= (int)(5 / portTICK_PERIOD_MS)).
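To make the truncation concrete, here is a small host-side sketch of the arithmetic. It does not use FreeRTOS itself: TICK_RATE_HZ and TICK_PERIOD_MS are stand-ins that assume the default 10 ms tick mentioned above, and both helper functions are hypothetical names for this example.

```c
#include <stdint.h>

/* Assumed values mirroring a 100 Hz tick, i.e. a 10 ms tick period. */
#define TICK_RATE_HZ   100u
#define TICK_PERIOD_MS (1000u / TICK_RATE_HZ)

/* Plain integer division, as in (ms / portTICK_PERIOD_MS):
 * anything shorter than one tick period becomes 0 ticks. */
uint32_t ms_to_ticks_truncating(uint32_t ms)
{
    return ms / TICK_PERIOD_MS;
}

/* Hypothetical safer variant: round up and never return 0, so a
 * delay or timeout built from the result always blocks. */
uint32_t ms_to_ticks_at_least_one(uint32_t ms)
{
    uint32_t ticks = (ms + TICK_PERIOD_MS - 1u) / TICK_PERIOD_MS;
    return (ticks == 0u) ? 1u : ticks;
}
```

With these assumptions, a 5 ms request truncates to 0 ticks, which turns a "wait" into a non-blocking poll, while the rounded-up variant yields 1 tick. Note that a 1-tick wait only blocks until the next tick interrupt, so it can be shorter than 10 ms, but it still gives lower-priority tasks such as IDLE a chance to run.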