vTaskDelete sometimes hangs current task

maldus
Posts: 83
Joined: Wed Jun 06, 2018 1:41 pm

Postby maldus » Tue Jun 04, 2019 10:30 am

Hello everyone,
I've been chasing an infrequent and seemingly random problem that occurs on my ESP32-powered board.
After reproducing the problem, I think I have narrowed down what is happening.

I am using both cores, with multiple tasks accessing the same memory. To manage concurrent access I am using a shared mutex semaphore.
At some point a "main" task has to delete other running tasks. Their task handles are saved in a shared data structure, access to which is controlled by a mutex semaphore.
Every once in a while, the vTaskDelete call appears to delete or block the calling task as well. From debug prints I am sure that the task handle passed to the function is neither NULL nor the current task's handle. Is this expected behaviour?

ESP_Angus
Posts: 2344
Joined: Sun May 08, 2016 4:11 am

Postby ESP_Angus » Wed Jun 05, 2019 1:35 am

Is it possible the task handle is corrupted and it doesn't point to a valid task at all? This could result in the behaviour you describe.

Are you certain that the main task is blocking in vTaskDelete() and not in some nearby function? For example, if a task is deleted while holding the semaphore that protects the shared "task handle info" data, then any subsequent attempt to take that semaphore will deadlock.
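
The deadlock scenario above can be illustrated with a small host-side model. This is only a sketch, not ESP-IDF code: model_mutex_t, model_try_take, model_give and model_delete_task are hypothetical names standing in for a FreeRTOS mutex and for task deletion, under the assumption that deleting a task does not release a mutex the task was holding.

```c
#include <assert.h>
#include <stdbool.h>

/* Host-side model only: owner == 0 means free, otherwise the id of the holder. */
typedef struct {
    int owner;
} model_mutex_t;

/* Non-blocking take, comparable to a zero-timeout xSemaphoreTake() on the target. */
static bool model_try_take(model_mutex_t *m, int task_id)
{
    assert(task_id != 0);
    if (m->owner != 0) {
        return false;               /* somebody still holds the mutex */
    }
    m->owner = task_id;
    return true;
}

/* Only the current holder may give the mutex back. */
static void model_give(model_mutex_t *m, int task_id)
{
    assert(m->owner == task_id);
    m->owner = 0;
}

/* Deleting a task stops it from running; nothing gives back what it held. */
static void model_delete_task(bool *alive)
{
    *alive = false;
}
```

If task 2 takes the mutex and is then deleted, every later model_try_take() by task 1 fails, which with a blocking take would be a permanent hang.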

If you are able to post code which exhibits this behaviour, we may be able to give some more detailed suggestions.

Postby maldus » Wed Jun 05, 2019 6:51 am

This is the specific function that deletes the task:

void __delete_task(pcb_t *todel) {
    timer_args_t *ptr;
    uint8_t map;

    ESP_LOGI(TAG, "deleting process %d (%d) with alarm %i",(uint32_t) todel->task, (uint32_t)xTaskGetCurrentTaskHandle(), todel->alarm);
    ptr = (timer_args_t*) todel->args;
    takeStateSemaphore();
    ESP_LOGI(TAG, "semaphore taken");
    vTaskDelete(todel->task); // The following print is never issued
    ESP_LOGI(TAG, "deleted");
    giveStateSemaphore();
    clear_activity_bitmap_output(ptr->bitmap);
    map = ptr->dac >= 0 ? 1 << ptr->dac : 0;
    clear_activity_bitmap_dac(map);
    // Clear outputs
    clear_output_state(ptr->bitmap);
    //if (ptr->dac >= 0 && ptr->dac < TOT_DAC)
        //update_single_dac_state(ptr->dac, 0);

    free(todel->args);
    out_procq(&process_list, todel);
    free_pcb(todel);
    ESP_LOGI(TAG, "freed memory");
}

Before calling vTaskDelete I take the shared semaphore to avoid the situation you described. I also tried making the semaphore calls non-blocking, with no difference in behaviour.
By printing the task handles I know they are not corrupted (at least, not immediately before the call) and that they do point to the correct task.
In the next few days I'll see if I can make a reduced example that is reproducible on a common demo board.

Postby maldus » Thu Jun 13, 2019 5:03 pm

I have tried, and the problem appears to be reproducible on a simple ESP32 DevKitC. My code uses a lot of peripherals (I2C, SPI and RS-232 serial), but they can all be ignored.

The error is reproducible by programming a devkit with my program and running the included stress.py Python script (python stress.py --deaf). My board is a slave that answers serial commands, and the script simply sends a barrage of random orders.
Eventually (it might take a few minutes) an important process blocks while trying to delete another one. This is evident at lines 83, 84 and 85 of asynctasks.c: the print before the vTaskDelete call is reached, but the one just after it is never seen before the watchdog reset eventually triggers.

I'd really appreciate it if someone could give it a try and tell me whether I'm missing something obvious.

Attachment: project.tar.gz (41.71 KiB)

fivdiAtESP32
Posts: 47
Joined: Thu Dec 20, 2018 9:47 am

Postby fivdiAtESP32 » Thu Jun 13, 2019 6:18 pm

If the following line is added directly before the call to vTaskDelete, does it print the name of the task that should be removed?

ESP_LOGI(TAG, "deleting task with name %s", pcTaskGetTaskName(todel->task));

If the correct name is not printed, todel->task doesn't contain the correct handle.

Postby ESP_Angus » Fri Jun 14, 2019 12:06 am

Hi maldus,

Sorry, this is too much code for us to try and use it to debug an OS-level bug. If you have a simple example (maybe by deleting code from this example until it's only a few short source files), then we can happily look at it. But maybe someone else from the forum can help.

I did notice one thing: it's unclear to me whether "out_procq()" and "remove_procq()" both remove from the list the entry that they return. The two functions seem to do slightly different things, although maybe they are equivalent once the linked list structures are taken into account.

The reason I'm mentioning that is that if there are stale entries in the list of tasks, there could be a race where one task is calling vTaskDelete(NULL) on itself while another task is calling delete_all_tasks() leading to a vTaskDelete(that_task).

Best of luck debugging.

Postby maldus » Sun Jun 16, 2019 7:06 am

Hi Angus,
I understand, and I have managed to reduce the project to a smaller one that still displays the problem. Please let me know whether this is enough. As before, to reproduce the issue you need to program a DevKitC (it probably works on other demo boards as well, but I haven't tried) and run the script stress.py (python stress.py --deaf); it might take as much as 10 minutes, but the watchdog eventually triggers.

You are right, there is an issue with delete_all_tasks but it's not related to this one; I almost never used that function and it is missing from this version.

Attachment: project.tar.gz (24.04 KiB)

Postby ESP_Angus » Tue Jun 18, 2019 6:51 am

Sorry, this is still not anything close to the kind of minimal example that we could use to reproduce the problem and show if it's likely to be an ESP-IDF bug. The only way I can debug with this example would be to debug your application logic, tasks framework, etc. We don't have the resources to debug that.

(Given that we have no other bug reports for vTaskDelete() hanging FreeRTOS, the chances that it's a bug somewhere in the application logic are high. It's not guaranteed, but we don't have the resources to debug your app to determine that.)

With a quick look I did see at least one more race condition:

- Various commands may cause the main task to call delete_tasks_by_output() which may cause a task to be deleted by the main task.
- The task itself may time out and decide to call vTaskDelete(NULL) to delete itself.

A race condition where both of these things happen at the same time will almost certainly hang FreeRTOS. I suggest changing the structure so that either only the main task is responsible for stopping tasks, or tasks only ever call vTaskDelete(NULL) on themselves.
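
The first option (only the main task ever deletes) could be sketched roughly as follows. This is a host-side model rather than code from the project: task_slot_t, slot_start, worker_report_finished and main_reap are hypothetical names, and the deletions counter stands in for an actual vTaskDelete(handle) call. On the target, every access to the slot state would also need to happen under the shared mutex.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { SLOT_FREE, SLOT_RUNNING, SLOT_FINISHED } slot_state_t;

/* Hypothetical bookkeeping entry for one worker task. */
typedef struct {
    slot_state_t state;
    int deletions;      /* counts modelled vTaskDelete(handle) calls */
} task_slot_t;

/* Main side: mark the slot as occupied when a worker is created. */
static void slot_start(task_slot_t *s)
{
    assert(s->state == SLOT_FREE);
    s->state = SLOT_RUNNING;
}

/* Worker side: on timeout the worker only reports that it is done.
 * It never deletes itself; on the target it would then block
 * (for example with vTaskSuspend(NULL)) until the main task reaps it. */
static void worker_report_finished(task_slot_t *s)
{
    if (s->state == SLOT_RUNNING) {
        s->state = SLOT_FINISHED;
    }
}

/* Main side: the single place where deletion happens.
 * Returns true if a task was deleted, false for a stale request. */
static bool main_reap(task_slot_t *s)
{
    if (s->state == SLOT_FREE) {
        return false;   /* already gone: do not delete twice */
    }
    s->deletions++;     /* on the target: vTaskDelete(handle) */
    s->state = SLOT_FREE;
    return true;
}
```

Because only main_reap() changes a slot back to SLOT_FREE, a second delete request for the same worker is recognised as stale instead of reaching vTaskDelete() with a dead handle.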

Postby ESP_Angus » Tue Jun 18, 2019 6:52 am

BTW, the usual way to structure the kind of "worker tasks" arrangement you're building is not to create and delete tasks at all, but to have a pool of workers that are each either idle (blocked receiving from a command queue) or currently working on a command. There's no reason what you're doing can't work, but you have to consider a lot more checks for live/dead tasks, stale handles, bad pointers, etc. compared to having a worker pool where the same tasks are running for the life of the firmware.
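
A rough host-side sketch of that worker pool idea is shown below. Everything in it is an assumption for illustration: command_t and its fields, the fixed QUEUE_CAPACITY, and the helper names are hypothetical, and the small ring buffer stands in for a FreeRTOS queue (on the target, queue_receive would correspond to a blocking xQueueReceive() inside each worker's endless loop).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4

/* Hypothetical command: which output to drive and for how long. */
typedef struct {
    int output;
    int duration_ms;
} command_t;

/* Ring buffer standing in for a FreeRTOS queue. */
typedef struct {
    command_t items[QUEUE_CAPACITY];
    size_t head;    /* index of the oldest queued command */
    size_t count;   /* number of queued commands */
} command_queue_t;

/* Returns false when the queue is full. */
static bool queue_send(command_queue_t *q, command_t cmd)
{
    assert(q != NULL);
    if (q->count == QUEUE_CAPACITY) {
        return false;
    }
    q->items[(q->head + q->count) % QUEUE_CAPACITY] = cmd;
    q->count++;
    return true;
}

/* Returns false when the queue is empty. */
static bool queue_receive(command_queue_t *q, command_t *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0) {
        return false;
    }
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return true;
}

/* A worker lives for the whole run; it is never created or deleted per command. */
typedef struct {
    int handled;
    int last_output;
} worker_t;

/* One iteration of a worker loop: handle a command if there is one. */
static bool worker_poll(worker_t *w, command_queue_t *q)
{
    command_t cmd;
    if (!queue_receive(q, &cmd)) {
        return false;   /* idle: on the target the worker would block here */
    }
    w->handled++;
    w->last_output = cmd.output;
    return true;
}
```

With this structure the main task only ever sends commands, so there are no task handles that can go stale.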

Postby maldus » Tue Jun 18, 2019 11:40 am

ESP_Angus wrote: "BTW The usual way to structure the kind of "worker tasks" arrangement you're building is not to create/delete tasks at all, but to have a worker pool who can either be idle (blocked receiving from a command queue) or currently working on a command. There's no reason what you're doing can't work, but you have to consider a lot more checks for live/dead tasks and stale handles, bad pointers, etc. compared to having a worker pool where the same tasks are running for the life of the firmware."

Yes, I too have since realized this is not the optimal solution. I was convinced I had everything sorted out by using mutex semaphores to regulate access to the worker data structures (the situation you described should not be possible, as only one task can be reading or writing the list of processes at any given time).
Anyway, the race condition must be somewhere in my code, so I solved the problem by simply notifying each task when it is scheduled for deletion and then leaving it to the task to terminate itself, with no further action from the main task.

Next time I'll probably follow your advice from the start.
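
For anyone finding this later, the "notify and let the task terminate itself" approach can be modelled roughly like this. It is a host-side sketch, not the code from my project: worker_ctx_t, worker_init, request_stop and worker_step are hypothetical names, and a C11 atomic flag stands in for whatever notification mechanism is used on the target (a task notification would be one option).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-worker context shared between the main task and the worker. */
typedef struct {
    atomic_bool stop_requested; /* written by main, read by the worker */
    int steps_done;             /* work iterations completed so far */
    bool exited;                /* set by the worker when it terminates */
} worker_ctx_t;

static void worker_init(worker_ctx_t *w)
{
    atomic_init(&w->stop_requested, false);
    w->steps_done = 0;
    w->exited = false;
}

/* Main side: only asks the worker to stop; it never deletes the task. */
static void request_stop(worker_ctx_t *w)
{
    atomic_store(&w->stop_requested, true);
}

/* Worker side: one loop iteration. Returns false once the worker has exited. */
static bool worker_step(worker_ctx_t *w)
{
    assert(!w->exited);
    if (atomic_load(&w->stop_requested)) {
        /* Release resources here; on the target the worker would
         * then call vTaskDelete(NULL) as its very last action. */
        w->exited = true;
        return false;
    }
    w->steps_done++;
    return true;
}
```

The point of the design is that exactly one party (the worker itself) ever performs the deletion, so the main task and a timing-out worker can no longer race on vTaskDelete.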
