Memory corruption when using a queue between two cores?

chris1seto
Posts: 20
Joined: Tue Jun 12, 2018 5:05 pm

Postby chris1seto » Wed Feb 27, 2019 11:08 pm

Hi,

I have a queue of length 1 which holds a struct of some status fields. The queue is written on core 1 and read on core 0.

Here is how I am writing to the queue:

xQueueOverwrite(thing_telemetry_queue, &tlm);

Here is how I am reading from the queue:

bool Thing_GetTelemetry(ThingTelemetry_t* state)
{
  return (xQueuePeek(thing_telemetry_queue, &state, 0) == pdPASS);
}

I am seeing lots of strange crashes when I read and write this queue at high speed: everything from "stack overflow has been detected" (even though uxTaskGetStackHighWaterMark() shows plenty of free stack right up until the event) to a dump that takes almost a full minute to print and usually leads back to some unhandled core exception, such as an illegal load or illegal instruction.

If I remove the queue read, the code is stable. Any ideas as to what is going on here? FreeRTOS should guarantee that there is never a data hazard when one core overwrites the queue while the other core is reading it, correct?

chris1seto
Posts: 20
Joined: Tue Jun 12, 2018 5:05 pm

Re: Memory corruption when using a queue between two cores?

Postby chris1seto » Wed Feb 27, 2019 11:24 pm

Here is an example of the type of crash I am seeing...
[Attachment: Capture.PNG]

ESP_Angus
Posts: 2344
Joined: Sun May 08, 2016 4:11 am

Re: Memory corruption when using a queue between two cores?

Postby ESP_Angus » Wed Feb 27, 2019 11:29 pm

The queue itself is thread safe and SMP safe: items will be copied into the queue and out of the queue safely. However, you have to make sure the code is using it in a safe way. As FreeRTOS queues don't do any type checking at compile time (a restriction of C, really), this can lead to some traps.

- What is the "Item Size" of the queue when you create it? (Not the number of items, but the size of each item.)

- In the posted code, it looks like you are peeking for a "ThingTelemetry_t *" in the queue, so a pointer will be copied out of the queue. This is OK if intended, in which case the queue item size should be 4, i.e. sizeof(ThingTelemetry_t *). Note, however, that in this case your code is still responsible for making sure the pointer is valid when the other task gets it out of the queue.

- If you intend to pass the full ThingTelemetry_t structure through the queue, make sure the queue item size is sizeof(ThingTelemetry_t) when you create it and call xQueuePeek(thing_telemetry_queue, state, 0) instead - no &state. This will copy the ThingTelemetry_t structure at the head of the queue into the buffer pointed to by 'state'. Check the other queue functions are passing the correct type of pointer as well.

(If you are passing pointers through the queue, consider passing the full structure instead - it's a little less performant but a lot easier to debug!)
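A minimal sketch of the full-structure variant described above. So that it compiles without FreeRTOS, a hypothetical single-slot `Mailbox` stands in for a queue created with `xQueueCreate(1, sizeof(ThingTelemetry_t))`. The `mailbox_*` functions only mimic the byte-copy behaviour of `xQueueOverwrite()` and `xQueuePeek()`; they are not FreeRTOS APIs and do no locking. The `ThingTelemetry_t` fields are invented, since the real struct was not posted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical telemetry fields; the original struct was not posted. */
typedef struct {
    uint32_t tick;
    int16_t  temperature_c;
    uint8_t  status;
} ThingTelemetry_t;

/* Single-threaded stand-in for a FreeRTOS queue of length 1. */
typedef struct {
    size_t  item_size;                          /* bytes copied per item */
    bool    full;                               /* true once an item was written */
    uint8_t storage[sizeof(ThingTelemetry_t)];  /* the single slot */
} Mailbox;

void mailbox_init(Mailbox *q, size_t item_size)
{
    assert(item_size <= sizeof q->storage);
    q->item_size = item_size;
    q->full = false;
}

/* Like xQueueOverwrite(): copies item_size bytes from *item into the slot. */
void mailbox_overwrite(Mailbox *q, const void *item)
{
    memcpy(q->storage, item, q->item_size);
    q->full = true;
}

/* Like xQueuePeek() with a zero timeout: copies item_size bytes to *out
 * and leaves the item in the slot. */
bool mailbox_peek(const Mailbox *q, void *out)
{
    if (!q->full) {
        return false;
    }
    memcpy(out, q->storage, q->item_size);
    return true;
}

Mailbox thing_telemetry_mailbox;

bool Thing_GetTelemetry(ThingTelemetry_t *state)
{
    /* Pass state itself (the caller's buffer), not &state. */
    return mailbox_peek(&thing_telemetry_mailbox, state);
}
```

The point to carry over to the real code is that the item size given at creation and the buffer handed to the peek call must describe the same type: the queue copies exactly that many bytes into whatever address it is given.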

ESP_Angus
Posts: 2344
Joined: Sun May 08, 2016 4:11 am

Re: Memory corruption when using a queue between two cores?

Postby ESP_Angus » Wed Feb 27, 2019 11:30 pm

Wow, I've not seen a crash like that before!

It looks like stack corruption, so I think you probably meant to pass the full structure through the queue, not a pointer, and the xQueuePeek() call is clobbering the stack.
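To make the clobbering concrete, here is a rough sketch of the size mismatch, plus one possible compile-time guard. Everything in it is illustrative: the struct fields, `peek_overrun_bytes()` and the `CHECK_ITEM_DEST` macro are hypothetical names, not part of FreeRTOS, and the guard assumes a C11 compiler. On the ESP32, where a pointer is 4 bytes, peeking a 16-byte item into `&state` would write 12 bytes past the pointer variable on the stack.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte telemetry struct; the real one was not posted. */
typedef struct {
    uint32_t tick;
    float    values[3];
} ThingTelemetry_t;

/* Bytes a peek would write past the end of the destination object. */
size_t peek_overrun_bytes(size_t item_size, size_t dest_size)
{
    return item_size > dest_size ? item_size - dest_size : 0;
}

/* Hypothetical guard: refuses to compile unless dst points at an object
 * whose size equals the queue item size. It cannot catch the mistake if
 * the struct happens to be exactly pointer-sized. */
#define CHECK_ITEM_DEST(dst, item_type) \
    static_assert(sizeof(*(dst)) == sizeof(item_type), \
                  "destination size does not match queue item size")

size_t overrun_with_correct_call(ThingTelemetry_t *state)
{
    (void)state;
    CHECK_ITEM_DEST(state, ThingTelemetry_t);   /* passes: *state is the struct */
    return peek_overrun_bytes(sizeof(ThingTelemetry_t), sizeof(*state));
}

size_t overrun_with_buggy_call(ThingTelemetry_t *state)
{
    (void)state;
    /* CHECK_ITEM_DEST(&state, ThingTelemetry_t); would not compile here,
     * because *(&state) is only a pointer. */
    return peek_overrun_bytes(sizeof(ThingTelemetry_t), sizeof(*(&state)));
}
```

Placing a check like this next to each queue call is one way to get back some of the type checking that the `void *` interface gives up.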

chris1seto
Posts: 20
Joined: Tue Jun 12, 2018 5:05 pm

Re: Memory corruption when using a queue between two cores?

Postby chris1seto » Thu Feb 28, 2019 2:42 am

Excellent spot, ESP_Angus!!

Yep, I totally fat-fingered the address-of operator (&) in the peek call. That's what copying and pasting will do... That fixed it!

Yes, this issue did indeed result in some spectacular crashes. The ESP32 does not like it when the stack is poisoned in this way. You can also get it into a state where it loops (nearly) forever while dumping the stack trace, which is annoying because it clutters up the terminal log file and makes it hard to find out what was happening right beforehand.
