More efficient UART receive options
-
- Posts: 52
- Joined: Mon Oct 24, 2022 9:37 pm
More efficient UART receive options
All the serial-handling code I've seen uses the IDF driver, which has numerous niceties such as a ring buffer, core synchronization, etc. I want to receive each byte as quickly as possible, because I use timing information for framing. Right now I use the existing driver with a 1-byte interrupt threshold, and the interrupt simply dumps each byte into a queue. This generally works, but with so many interrupts the queue worker that frames the data can be starved during periods of high data reception. The queue fills up and data is then lost.
I've been considering different solutions and would like advice.
One possibility is throttling, and there are two possible methods. The IDF driver has its own buffer, so when the queue is getting full, the queue reader could temporarily disable interrupts and let the driver use its own buffer, then re-enable the interrupt when the queue has room again. Alternatively, the interrupt could look at the queue space available and decide whether to queue the byte or hold it in the buffer. Either might work as long as high-data burst periods fit in a reasonably sized buffer.
Another option is to not use the IDF driver, or to strip it down to make it more efficient, with assumptions such as being locked to a particular core and needing no internal buffering. Effectively, this replaces the current solution with a lower-level interrupt that avoids the IDF driver features that slow it down. Is there any such example? Or a suggestion for how to accomplish this?
Another option is polling for received data and doing the processing when there is none. I've noticed most Arduino examples do this, but they still go through a high-level driver. I'd like to do this at a lower level. Any advice? Perhaps I can find the relevant part of the driver and use it.
My preference is the second option; more efficient interrupts would be ideal.
There's also a naive solution: just making the queue much bigger. I'm going to try it, but it's kind of distasteful, as 99% of the time there's just a handful of bytes on the queue.
-
- Posts: 1708
- Joined: Mon Oct 17, 2022 7:38 pm
- Location: Europe, Germany
Re: More efficient UART receive options
"I use timing information for framing"
How much of inter-frame time are we talking about? *hints at RX_TOUT*
-
- Posts: 52
- Joined: Mon Oct 24, 2022 9:37 pm
Re: More efficient UART receive options
MicroController wrote: ↑Thu Dec 07, 2023 10:13 pm
"How much of inter-frame time are we talking about? *hints at RX_TOUT*"
That's a good thought. I tried that at the beginning, but didn't understand the protocol as well then. I'm going to revisit that along with a double or higher multi-buffering frames pattern.
I'm still interested in the other paths if anyone has experience in those.
-
- Posts: 1708
- Joined: Mon Oct 17, 2022 7:38 pm
- Location: Europe, Germany
Re: More efficient UART receive options
From what you described, you likely won't need any more buffering than the UART (and driver) already have.
The UART has (default) 128 bytes of hardware FIFO, and the driver can have as much as you have RAM available. One could call that a double buffer.
If there is a significant inter-frame gap (a couple of byte-lengths), compared to any intra-frame delays, you should be able to set a high RX threshold and an RX_TOUT of some value less than the inter-frame gap. If a full frame fits into the HW FIFO you/the driver will get only a single interrupt after each frame.
Note that, at least in IDF v5.1 as of now, uart_read_bytes(...) is broken when it comes to returning data in a timely manner. You can work around that issue by never trying to read more than uart_get_buffered_data_len(...)+1 bytes at a time. This may be relevant in your case because I assume you want to process a frame a.s.a.p. after it is received.
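The read-size workaround described above could be sketched roughly as follows. This is only an illustration: `clamp_read_len` is a hypothetical helper name, not part of ESP-IDF, and the target-side calls are shown in a comment rather than compiled here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not an ESP-IDF function): given the number of bytes
 * the driver reports as buffered and the number of bytes the caller still
 * wants, return a request length that never exceeds buffered + 1.
 *
 * Assumed use on the target (untested sketch):
 *   size_t buffered = 0;
 *   uart_get_buffered_data_len(port, &buffered);
 *   int n = uart_read_bytes(port, buf,
 *                           clamp_read_len(buffered, wanted), ticks_to_wait);
 */
size_t clamp_read_len(size_t buffered, size_t wanted)
{
    size_t limit = buffered + 1;  /* at most one byte beyond what is buffered */
    return (wanted < limit) ? wanted : limit;
}
```

With nothing buffered this asks for a single byte, so the read can return as soon as the first byte of a frame shows up instead of waiting for a larger request to fill.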
-
- Posts: 52
- Joined: Mon Oct 24, 2022 9:37 pm
Re: More efficient UART receive options
I was initially thinking I would have to parse a partial frame to determine the variable size, but that may not be necessary if the timeout works effectively. I will try depending on just the timeouts and letting the driver FIFO do the buffering; that makes sense.
Good to know those gotchas about the API, thanks. But my current interrupt uses the UART registers directly to read the FIFO and data. I will just continue to do it that character-by-character way but loop a full frame of data. I thought the double-buffering might still be necessary because while processing the current frame, another frame might arrive, and the interrupt could potentially overwrite the frame before it's finished processing. I'm not totally certain that could happen, but frames can vary in size from 4 bytes to 255 bytes, so it seems prudent. I will use a simple notify signal out of the interrupt rather than queueing bytes, which I don't believe makes sense with frame-level interrupts.
What would be nice, if you could just take over the buffer from the driver, and then tell the driver to use a different buffer, to avoid copying data.
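The double-buffering idea from this post might look something like the sketch below. All names (`pingpong_t`, `pp_put`, `pp_commit`, `pp_acquire`, `pp_release`) are made up for illustration. It is single-threaded host code: on the real target the shared fields would need volatile/atomic handling or a critical section, and a successful `pp_commit` is where the ISR would send its notify signal to the task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_MAX 255  /* largest frame size mentioned in the thread */

typedef struct {
    uint8_t data[2][FRAME_MAX];
    size_t  len[2];
    bool    ready[2];  /* true while a completed frame awaits the task */
    int     fill;      /* index of the buffer the ISR is currently filling */
} pingpong_t;

void pp_init(pingpong_t *pp)
{
    *pp = (pingpong_t){0};
}

/* ISR side: append one received byte; false if the frame buffer is full. */
bool pp_put(pingpong_t *pp, uint8_t b)
{
    if (pp->len[pp->fill] >= FRAME_MAX) {
        return false;
    }
    pp->data[pp->fill][pp->len[pp->fill]++] = b;
    return true;
}

/* ISR side: frame complete. Hands the buffer to the task and switches to the
 * other one. If the task still holds the other buffer, the new frame is
 * dropped instead of overwriting the one being processed. */
bool pp_commit(pingpong_t *pp)
{
    int other = 1 - pp->fill;
    if (pp->ready[other]) {
        pp->len[pp->fill] = 0;  /* overrun: discard this frame */
        return false;
    }
    pp->ready[pp->fill] = true;
    pp->fill = other;
    pp->len[other] = 0;
    return true;
}

/* Task side: pointer to the completed frame, or NULL if none is pending. */
const uint8_t *pp_acquire(const pingpong_t *pp, size_t *len)
{
    int done = 1 - pp->fill;
    if (!pp->ready[done]) {
        return NULL;
    }
    *len = pp->len[done];
    return pp->data[done];
}

/* Task side: processing finished, the buffer may be reused. */
void pp_release(pingpong_t *pp)
{
    pp->ready[1 - pp->fill] = false;
}
```

The task works directly on the returned pointer, so the frame is not copied again after the ISR has stored it. Dropping on overrun is just one possible policy; a third buffer or a short queue of buffers would trade RAM for fewer drops.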
-
- Posts: 52
- Joined: Mon Oct 24, 2022 9:37 pm
Re: More efficient UART receive options
Ah, now I remember why I gave up on the timeout before. I've reduced the timeout thresh to 1 and I still get frames concatenated together. That doesn't really make sense; that should be 1 UART symbol period. But I know from my byte-by-byte solution that the frame gaps are way larger than this, more like 20 to 40 times the time between consecutive symbols. The API says the symbol period is based on the current baud rate, but there's clearly something off.
-
- Posts: 1708
- Joined: Mon Oct 17, 2022 7:38 pm
- Location: Europe, Germany
Re: More efficient UART receive options
Did you remember to enable and handle both the RX_FULL and the RX_TOUT interrupts?
-
- Posts: 52
- Joined: Mon Oct 24, 2022 9:37 pm
Re: More efficient UART receive options
I did some more testing, and remembered in my byte-by-byte solution, I partially parse the frame to determine the length and terminate the frame when the length is complete, regardless of timeout.
In fact, it seems some frames come in rapid succession, so they get concatenated even with the smallest threshold. Do you know if that threshold gets converted to ms or µs resolution? Adding microsecond timing measurements to my byte-by-byte solution shows there's still a significant (should be detectable) delay at the start of a new frame, on the order of 50 to 100% longer than normal successive bytes.
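A rough back-of-the-envelope check of those numbers, under stated assumptions: the timeout threshold counts whole symbol periods at the current baud rate (as the API description says), the timeout only sees idle line time after the last byte, and normal bytes arrive back to back. The function names and the example baud rates are hypothetical and not taken from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Duration of one UART symbol (start + data + parity + stop bits) in
 * microseconds, rounded down. */
uint32_t symbol_time_us(uint32_t baud, uint32_t bits_per_symbol)
{
    assert(baud > 0);
    return (uint32_t)(((uint64_t)bits_per_symbol * 1000000u) / baud);
}

/* Idle line time between two bytes, given the measured start-to-start
 * interval. Back-to-back bytes have an interval of one symbol, so no idle. */
uint32_t idle_gap_us(uint32_t interval_us, uint32_t sym_us)
{
    return (interval_us > sym_us) ? interval_us - sym_us : 0;
}

/* Assumed model: the timeout can fire only if the line stays idle for at
 * least tout_thresh symbol periods. */
bool timeout_can_fire(uint32_t interval_us, uint32_t sym_us, uint32_t tout_thresh)
{
    return idle_gap_us(interval_us, sym_us) >= tout_thresh * sym_us;
}
```

Under this model, a first byte that arrives 50% later than usual leaves only half a symbol of idle time, which is below a threshold of 1, and a byte that arrives 100% later sits exactly on the boundary. That would be consistent with closely spaced frames still being concatenated, while gaps of 20 to 40 symbols are detected easily.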
-
- Posts: 1708
- Joined: Mon Oct 17, 2022 7:38 pm
- Location: Europe, Germany
Re: More efficient UART receive options
"Do you know if that threshold gets converted to ms resolution or micro?"
The RX_TOUT threshold is actually measured in symbols, i.e. bytes.
By now I lost track of whether there is or isn't enough inter-frame time in your protocol.
However, as the frame size can be determined from the data, you can also look into dynamically adjusting the RX threshold. First, set a low RX threshold so that you're notified when enough data has been received to calculate the length of the frame. Once you know the expected size of the remainder of the frame, you can update the RX threshold to put it right at the expected end of the frame. Then, when the frame is complete, you prepare for the next frame by setting the threshold low again.
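The threshold juggling described above can be sketched as a small pure function. The frame layout is an assumption made only for illustration (total length known once the first `HDR_BYTES` bytes are in), as is the cap `THRESH_MAX`, chosen to stay below the 128-byte hardware FIFO. The sketch also assumes the interrupt handler drains the FIFO every time it runs, so the threshold is always "bytes still to wait for".

```c
#include <assert.h>
#include <stdint.h>

#define HDR_BYTES    2u   /* hypothetical: frame length known after 2 bytes */
#define THRESH_MAX 120u   /* assumed cap, below the 128-byte HW FIFO */

/*
 * received:  bytes of the current frame already read out of the FIFO
 * frame_len: total frame length, or 0 while it is not yet known
 *
 * Returns the RX-full threshold to program next. On the target this value
 * would be written to the UART, e.g. via uart_set_rx_full_threshold() or
 * the corresponding register (not done in this host-side sketch).
 */
uint32_t next_rx_threshold(uint32_t received, uint32_t frame_len)
{
    uint32_t target = (frame_len == 0) ? HDR_BYTES : frame_len;
    /* Frame finished (or nothing left to wait for): arm for the next header. */
    uint32_t remaining = (received < target) ? target - received : HDR_BYTES;
    return (remaining > THRESH_MAX) ? THRESH_MAX : remaining;
}
```

A 255-byte frame does not fit in the FIFO, so with this cap it would be collected in several interrupts (2 header bytes, then chunks of up to 120), while a 4-byte frame needs only two.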
-
- Posts: 1708
- Joined: Mon Oct 17, 2022 7:38 pm
- Location: Europe, Germany
Re: More efficient UART receive options
"What would be nice, if you could just take over the buffer from the driver, and then tell the driver to use a different buffer, to avoid copying data."
That's actually how the IDF's ring buffers work. They give you a pointer into the buffer from which you can consume the data as you like, while the producer can concurrently keep pushing new data into another part of the buffer.
The ESP32 will happily copy a couple of hundred MB per second from and to RAM, so performance is not an issue with the measly UART producing the data.
OTOH, I too recently got annoyed by the http server API wanting me to provide 'enough' buffers for it to needlessly copy each request's immutable data to, while it also keeps holding a copy in its internal structure. Had to go ahead and 'fix' that issue.