Advice on application architecture - Sensors/WiFi/MQTT
Posted: Mon Sep 04, 2017 12:28 am
Can anyone help me figure out the "best practice" architecture for a relatively simple ESP32 application? I'm using ESP-IDF with C on a "DOIT" ESP32 board.
I have a OneWire bus with five DS18B20 temperature sensors attached, and my testing with these shows the set-up to be robust and reliable. I'm getting error-free readings and can reliably take measurements from all five sensors every second, sustained, for many hours. I detect CRC errors so I typically know when communication with a sensor has failed.
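For reference, the CRC check mentioned above can be sketched as follows. This is a minimal, standalone version of the Dallas/Maxim CRC-8 used by 1-Wire devices, assuming the usual 9-byte DS18B20 scratchpad layout (8 data bytes followed by the CRC byte). The function names `onewire_crc8` and `scratchpad_crc_ok` are made up for illustration and are not taken from any particular driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Dallas/Maxim CRC-8: polynomial x^8 + x^5 + x^4 + 1, processed LSB first
 * (reflected form 0x8C), initial value 0, no final XOR. */
uint8_t onewire_crc8(const uint8_t *data, size_t len)
{
    uint8_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x01) {
                crc = (uint8_t)((crc >> 1) ^ 0x8C);
            } else {
                crc = (uint8_t)(crc >> 1);
            }
        }
    }
    return crc;
}

/* Assumed layout: scratchpad[0..7] is data, scratchpad[8] is the CRC
 * the sensor computed over those 8 bytes. */
bool scratchpad_crc_ok(const uint8_t scratchpad[9])
{
    return onewire_crc8(scratchpad, 8) == scratchpad[8];
}
```

A reading whose scratchpad fails `scratchpad_crc_ok` would be discarded rather than printed or published. With this CRC, running `onewire_crc8` over all 9 bytes of a good scratchpad gives 0, which is an equivalent way to do the check.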
What I'd like to do is integrate this simple read/print loop with an MQTT client so that it can do two new things:
[*] subscribe to a topic on an external MQTT broker and use values published on that topic to change the operation, for example to adjust the sampling period or the sampling resolution.
[*] publish the temperature readings to topics on the MQTT broker that higher-level applications can subscribe to and see.
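To make the first item concrete, here is a rough sketch of how an incoming control message might be mapped onto sampling settings. Everything in it is an assumption for illustration: the topic names, the `sampler_config_t` struct, the `apply_setting` function and the period limits are hypothetical, and the payload is assumed to be a plain decimal number. The only fixed fact is that the DS18B20 supports resolutions of 9 to 12 bits.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical settings that the sampling task would read. */
typedef struct {
    unsigned period_ms;       /* time between sampling rounds */
    unsigned resolution_bits; /* DS18B20 supports 9 to 12 bits */
} sampler_config_t;

/* Hypothetical topic names, chosen only for this example. */
#define TOPIC_PERIOD     "sensors/config/period_ms"
#define TOPIC_RESOLUTION "sensors/config/resolution"

/* MQTT payloads are length-delimited, not NUL-terminated, so copy the
 * payload into a small buffer before parsing it as a decimal number. */
static bool parse_uint(const char *payload, size_t len, unsigned long *out)
{
    char buf[12];
    if (len == 0 || len >= sizeof buf) {
        return false;
    }
    memcpy(buf, payload, len);
    buf[len] = '\0';
    if (buf[0] < '0' || buf[0] > '9') {
        return false; /* reject signs and leading whitespace */
    }
    char *end;
    unsigned long v = strtoul(buf, &end, 10);
    if (*end != '\0') {
        return false; /* trailing garbage */
    }
    *out = v;
    return true;
}

/* Returns true if the message was recognised and the value accepted. */
bool apply_setting(sampler_config_t *cfg, const char *topic,
                   const char *payload, size_t payload_len)
{
    unsigned long v;
    if (!parse_uint(payload, payload_len, &v)) {
        return false;
    }
    if (strcmp(topic, TOPIC_PERIOD) == 0) {
        if (v < 100 || v > 3600000) { /* assumed limits: 100 ms to 1 hour */
            return false;
        }
        cfg->period_ms = (unsigned)v;
        return true;
    }
    if (strcmp(topic, TOPIC_RESOLUTION) == 0) {
        if (v < 9 || v > 12) {
            return false;
        }
        cfg->resolution_bits = (unsigned)v;
        return true;
    }
    return false;
}
```

Rejecting out-of-range values here means the sampling side never has to deal with a setting it cannot honour. How the updated struct is handed to the sampling task (a queue, a mutex-protected copy, etc.) is a separate design choice.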
I've also managed to get the MQTT side of things working over WiFi, using this component:
https://github.com/tuanpmt/espmqtt
Where I'm now running into problems is integrating these two sides of the application. My initial thought was to run the temperature sensing routine as one task and the MQTT side as a separate task (or at least a task that reads from a queue written by the temperature sensing task and calls `mqtt_publish`). However, my attempts to do this have failed because of two problems:
[*] The MQTT task blocks on a socket read and does not yield the CPU, which seems to block all other tasks on that CPU.
[*] The one-wire protocol is timing sensitive so microsecond delays must be carefully honoured - but interruptions from the WiFi/MQTT side cause bad readings and therefore CRC errors (which I can see).
I've tried wrapping my temperature sensing code in vTaskSuspendAll()/xTaskResumeAll() to avoid interruptions during the time-sensitive GPIO work. However, this causes the MQTT side to assert (and reset the micro) because it's not expecting the scheduler to be suspended. I assume the MQTT task is running on the other CPU, so it is not itself suspended, but its check for a non-suspended scheduler fails and it asserts.
I also tried setting the priority of the temperature sensing task to be higher than the MQTT task, but this seems to cause random problems - sometimes the MQTT task cannot connect to the remote server, and sometimes the temperature sensing task doesn't even run! This seems very strange to me.
In some cases when there's clearly a clash between tasks, I've seen a simple "printf" loop from 0 to 4 simply not print anything for, say, values 3 and 4. But the application doesn't crash - so I have no idea where the output from that loop has gone - it vanished.
I suspect I'm going to need to add a timeout to the blocked MQTT read() so that it can check a queue for any outgoing MQTT publications, send them, then go back to reading the socket in case of incoming values.
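A minimal sketch of that read-with-timeout idea, using the BSD-style `select()` and `recv()` calls. The function name `recv_with_timeout` and the `RECV_TIMEOUT` return code are made up for this example, and how espmqtt structures its own read loop is not shown here. The code below is written against POSIX sockets.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

#define RECV_TIMEOUT (-2) /* hypothetical return code for "nothing arrived" */

/* Wait up to timeout_ms for data on fd, then read it.
 * Returns the number of bytes read (> 0), 0 if the peer closed the
 * connection, -1 on error, or RECV_TIMEOUT if nothing arrived in time. */
long recv_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int ready = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (ready < 0) {
        return -1;
    }
    if (ready == 0) {
        return RECV_TIMEOUT;
    }
    return (long)recv(fd, buf, len, 0);
}
```

The MQTT task's loop would then call this with a short timeout, and on `RECV_TIMEOUT` check the outgoing queue without blocking, publish anything waiting, and go back to reading. The timeout sets the worst-case delay before a queued reading is sent.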
What I'd like to know is what would be the best approach for this kind of application. Is it wise to split the workload into separate tasks? How can I ensure a task isn't interrupted for ~1 millisecond? Should I explicitly bind each task to a CPU to avoid random issues with the scheduler? I'd prefer to find a solution that would work regardless of which task runs on which CPU though, as I'd also like to eventually run this on a single CPU.
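To put rough numbers on the "~1 millisecond" figure, the arithmetic below uses the minimum 1-Wire timings as I understand them from the DS18B20 datasheet (480 µs reset low, 480 µs reset high/presence window, 60 µs per bit slot, 1 µs recovery). Treat these constants as assumptions to be checked against the datasheet. The helper names are hypothetical, and which stretches a given driver actually needs to protect depends on how it is written.

```c
#include <stdint.h>

/* Minimum timings in microseconds, assumed from the DS18B20 datasheet. */
enum {
    T_RESET_LOW_US  = 480, /* master holds the bus low */
    T_RESET_HIGH_US = 480, /* master releases, sensor signals presence */
    T_SLOT_US       = 60,  /* one read or write bit slot */
    T_RECOVERY_US   = 1    /* bus idle time between slots */
};

/* Duration of one reset + presence-detect sequence. */
uint32_t reset_sequence_us(void)
{
    return T_RESET_LOW_US + T_RESET_HIGH_US;
}

/* Minimum time to move nbytes over the bus, one bit per slot. */
uint32_t transfer_us(uint32_t nbytes)
{
    return nbytes * 8u * (T_SLOT_US + T_RECOVERY_US);
}
```

With these values a reset sequence takes about 960 µs, which is where the ~1 ms window comes from, and reading a full 9-byte scratchpad takes at least about 4.4 ms. A single bit slot is much shorter than either.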
If anyone can offer me some advice on how to go about structuring this kind of application I'd really appreciate it.
Also, if anyone knows of a good FreeRTOS book that covers such topics, with good advice on how to organise such applications, especially with regard to sockets/LWIP and multi-CPU systems, please let me know.