More efficient UART receive options

HighVoltage
Posts: 52
Joined: Mon Oct 24, 2022 9:37 pm

Postby HighVoltage » Thu Dec 07, 2023 1:16 am

All the serial-handling code I've seen uses the IDF driver, which has numerous niceties such as a ring buffer, core synchronization, etc. I want to receive each byte as quickly as possible, because I use timing information for framing. Right now, I use the existing driver with a 1-byte interrupt threshold, and the interrupt simply dumps each byte to a queue. This generally works, but with so many interrupts the queue worker that frames the data can be starved during periods of high data reception. The queue fills up and data is then lost.

I've been considering different solutions and would like advice.

One possibility would be throttling. There are two possible methods. The IDF driver has its own buffer, so when the queue is getting full, the queue reader could temporarily disable the interrupt and let the driver use its own buffer, then re-enable the interrupt when the queue has room again. Or, the interrupt could look at the queue space available and decide whether to queue the data or hold it in the buffer. These might work as long as high-data burst periods fit in a reasonably sized buffer.

Another option is to not use the IDF driver, or to strip it down to make it more efficient, with assumptions such as being locked to a particular core and needing no internal buffering. Effectively, this would replace the current solution with a lower-level interrupt that avoids all the IDF driver features that slow it down. Is there any such example? Or a suggestion for how to accomplish this?

Another option is polling for received data and doing the processing when there is none. I've noticed most Arduino examples do this, but they still go through a high-level driver. I'd like to do this at a lower level. Any advice? Perhaps I can find the relevant part of the driver and use it.

My preference is the second option; more efficient interrupts would be ideal.

There's also a naive solution, just making the queue much bigger. I'm going to try it, but it's kind of distasteful as 99% of the time there's just a handful of bytes on the queue.

MicroController
Posts: 1708
Joined: Mon Oct 17, 2022 7:38 pm
Location: Europe, Germany

Re: More efficient UART receive options

Postby MicroController » Thu Dec 07, 2023 10:13 pm

HighVoltage wrote:
I use timing information for framing
How much of inter-frame time are we talking about? *hints at RX_TOUT*

HighVoltage
Posts: 52
Joined: Mon Oct 24, 2022 9:37 pm

Re: More efficient UART receive options

Postby HighVoltage » Fri Dec 08, 2023 6:37 pm

MicroController wrote:
Thu Dec 07, 2023 10:13 pm
How much of inter-frame time are we talking about? *hints at RX_TOUT*
That's a good thought. I tried that at the beginning, but didn't understand the protocol as well then. I'm going to revisit it, along with a double-buffering (or deeper multi-buffering) scheme for frames.

I'm still interested in the other paths if anyone has experience in those.

MicroController
Posts: 1708
Joined: Mon Oct 17, 2022 7:38 pm
Location: Europe, Germany

Re: More efficient UART receive options

Postby MicroController » Fri Dec 08, 2023 7:58 pm

From what you described, you likely won't need any more buffering than the UART (and driver) already have.
The UART has (default) 128 bytes of hardware FIFO, and the driver can have as much as you have RAM available. One could call that a double buffer.
If there is a significant inter-frame gap (a couple of byte-lengths), compared to any intra-frame delays, you should be able to set a high RX threshold and an RX_TOUT of some value less than the inter-frame gap. If a full frame fits into the HW FIFO you/the driver will get only a single interrupt after each frame.
Note that, at least as of IDF v5.1, uart_read_bytes(...) is broken when it comes to returning data in a timely manner. You can work around that issue by never trying to read more than uart_get_buffered_data_len(...)+1 bytes at a time. This may be relevant in your case because I assume you want to process a frame as soon as possible after it is received.
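
The read-size workaround above can be sketched as a small helper. This is only a minimal sketch, not driver code: `clamp_read_len` is a hypothetical name, and the trailing comment merely indicates where the two IDF calls named in the post would sit in an application.

```c
#include <stddef.h>

/* Hypothetical helper: limit a read request to what the driver already has
 * buffered plus one byte, following the workaround described above. */
size_t clamp_read_len(size_t wanted, size_t buffered)
{
    size_t limit = buffered + 1;
    return wanted < limit ? wanted : limit;
}

/* Intended use on target (not compiled here; assumes the driver is installed
 * and that uart_num, buf and ticks are defined by the application):
 *
 *   size_t buffered = 0;
 *   uart_get_buffered_data_len(uart_num, &buffered);
 *   int n = uart_read_bytes(uart_num, buf,
 *                           clamp_read_len(sizeof buf, buffered), ticks);
 */
```

With nothing buffered the helper asks for a single byte, so the read can still block for the first byte of the next frame without requesting more than the workaround allows.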

HighVoltage
Posts: 52
Joined: Mon Oct 24, 2022 9:37 pm

Re: More efficient UART receive options

Postby HighVoltage » Sat Dec 09, 2023 5:12 am

I was initially thinking I would have to parse a partial frame to determine the variable size, but that may not be necessary if the timeout works effectively. I will try depending on the timeouts alone and letting the FIFO do the buffering; that makes sense.

Good to know those gotchas about the API, thanks. But my current interrupt uses the UART registers directly to read the FIFO status and data. I will continue to read it character by character that way, but loop over a full frame of data. I thought double-buffering might still be necessary because, while the current frame is being processed, another frame might arrive and the interrupt could overwrite the first frame before processing has finished. I'm not totally certain that could happen, but frames can vary in size from 4 bytes to 255 bytes, so it seems prudent. I will use a simple notify signal out of the interrupt rather than queueing bytes, which I don't believe makes sense with frame-level interrupts.
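
The double-buffering idea above could look roughly like the following. It is a sketch under assumptions, not the poster's code: all names (`pingpong_t`, `pp_push`, `pp_commit`, `pp_release`) are hypothetical, the 255-byte limit is taken from the frame sizes mentioned above, and a real interrupt handler would additionally need `volatile` or critical-section protection plus the actual task notification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_MAX 255u  /* largest frame size mentioned in the thread */

typedef struct {
    uint8_t  buf[2][FRAME_MAX];
    size_t   len[2];
    bool     busy[2];   /* true while the worker still owns that buffer */
    unsigned fill;      /* index of the buffer the interrupt writes to */
} pingpong_t;

/* Interrupt side: append one received byte to the buffer being filled. */
bool pp_push(pingpong_t *pp, uint8_t byte)
{
    if (pp->len[pp->fill] >= FRAME_MAX)
        return false;                       /* frame too long, byte dropped */
    pp->buf[pp->fill][pp->len[pp->fill]++] = byte;
    return true;
}

/* Interrupt side: the frame is complete. Returns the index to hand to the
 * worker, or -1 if the other buffer is still being processed, in which case
 * the new frame is discarded instead of overwriting the old one. */
int pp_commit(pingpong_t *pp)
{
    unsigned done = pp->fill, other = done ^ 1u;
    if (pp->busy[other]) {
        pp->len[done] = 0;
        return -1;
    }
    pp->busy[done] = true;
    pp->fill = other;
    pp->len[other] = 0;
    return (int)done;   /* on target, notify the worker task here */
}

/* Worker side: processing finished, give the buffer back. */
void pp_release(pingpong_t *pp, int idx)
{
    pp->busy[idx] = false;
}
```

Dropping the newer frame on overrun is one possible policy; the point of the sketch is only that a frame handed to the worker is never written to until it has been released.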

It would be nice if you could just take over the buffer from the driver and then tell the driver to use a different buffer, to avoid copying data.

HighVoltage
Posts: 52
Joined: Mon Oct 24, 2022 9:37 pm

Re: More efficient UART receive options

Postby HighVoltage » Sat Dec 09, 2023 7:10 am

Ah, now I remember why I gave up on the timeout before. I've reduced the timeout threshold to 1 and I still get frames concatenated together. That doesn't really make sense, since that should be 1 UART symbol period, and I know from my byte-by-byte solution that the frame gaps are way larger than this, more like 20 to 40 times the time between consecutive symbols. The API says the symbol period is based on the current baud rate, but there's clearly something off.

MicroController
Posts: 1708
Joined: Mon Oct 17, 2022 7:38 pm
Location: Europe, Germany

Re: More efficient UART receive options

Postby MicroController » Sat Dec 09, 2023 8:59 pm

Did you remember to enable and handle both the RX_FULL and the RX_TOUT interrupts?

HighVoltage
Posts: 52
Joined: Mon Oct 24, 2022 9:37 pm

Re: More efficient UART receive options

Postby HighVoltage » Sat Dec 09, 2023 11:30 pm

I did some more testing, and remembered that in my byte-by-byte solution I partially parse the frame to determine the length and terminate the frame when the length is complete, regardless of timeout.

In fact, some frames seem to come in rapid succession, so they get concatenated even with the smallest threshold. Do you know if that threshold gets converted to millisecond or microsecond resolution? Adding microsecond timing measurements to my byte-by-byte solution shows there's still a significant (and should be detectable) delay at the start of a new frame, on the order of 50 to 100% longer than between normal successive bytes.

MicroController
Posts: 1708
Joined: Mon Oct 17, 2022 7:38 pm
Location: Europe, Germany

Re: More efficient UART receive options

Postby MicroController » Sun Dec 10, 2023 1:18 am

HighVoltage wrote:
Do you know if that threshold gets converted to millisecond or microsecond resolution?
The RX_TOUT threshold is actually measured in symbols, i.e. bytes.
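
To relate a measured gap to a threshold given in symbols, the arithmetic can be sketched as below. The function names are hypothetical, and both the baud rate in the usage note (115200) and the 10 bits per symbol (8N1 framing) are assumptions, since the thread states neither; the exact symbol length the hardware counts may differ.

```c
#include <stdint.h>

/* Time to transmit one symbol in microseconds, rounded down. */
uint32_t symbol_time_us(uint32_t baud, uint32_t bits_per_symbol)
{
    return (uint32_t)((uint64_t)bits_per_symbol * 1000000u / baud);
}

/* Number of whole symbol periods that fit into an idle gap on the line. */
uint32_t idle_gap_symbols(uint32_t baud, uint32_t bits_per_symbol,
                          uint32_t idle_us)
{
    return (uint32_t)((uint64_t)idle_us * baud /
                      ((uint64_t)bits_per_symbol * 1000000u));
}
```

Under those assumptions, `symbol_time_us(115200, 10)` gives 86, an idle gap of 2000 µs spans 23 whole symbols, and an idle gap of 80 µs spans 0. If the delay measured at the start of a frame is a byte-to-byte arrival interval only 50 to 100% longer than usual, the line itself may be idle for only about half a symbol to one symbol, which a threshold of 1 would not reliably detect.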

By now I've lost track of whether there is or isn't enough inter-frame time in your protocol.
However, as the frame size can be determined from the data, you can also look into dynamically adjusting the RX threshold. First, set a low RX threshold so that you're notified when enough data has been received to calculate the length of the frame. Once you know the expected size of the remainder of the frame, you can update the RX threshold to put it right at the expected end of the frame. Then, when the frame is complete, you prepare for the next frame by setting the threshold low again.
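
The threshold-adjusting scheme described above can be modeled as a small state machine. This is a sketch with loudly assumed details: the frame layout (a 2-byte header whose second byte is the total frame length) is purely hypothetical because the thread does not describe the protocol, the names are invented, and `MAX_STEP` is an arbitrary value chosen to stay below the 128-byte hardware FIFO. It also assumes the interrupt drains the FIFO each time, so the returned value is the number of further bytes to wait for.

```c
#include <stddef.h>
#include <stdint.h>

#define HEADER_LEN 2u    /* hypothetical: bytes needed to learn the length */
#define LEN_OFFSET 1u    /* hypothetical: offset of the total-length byte */
#define MIN_FRAME  4u    /* smallest frame size mentioned in the thread */
#define MAX_STEP   120u  /* assumed cap, below the 128-byte hardware FIFO */

typedef struct {
    size_t frame_len;    /* 0 while the header has not been parsed yet */
} frame_rx_t;

/* Call whenever the RX threshold fires, with the bytes of the current frame
 * collected so far. Returns the next threshold to program, or 0 when the
 * frame is complete (the caller then goes back to HEADER_LEN). */
size_t frame_rx_next_threshold(frame_rx_t *rx, const uint8_t *frame,
                               size_t received)
{
    if (rx->frame_len == 0) {
        if (received < HEADER_LEN)
            return HEADER_LEN - received;     /* still waiting for header */
        rx->frame_len = frame[LEN_OFFSET];
        if (rx->frame_len < MIN_FRAME)        /* a real parser would resync */
            rx->frame_len = MIN_FRAME;
    }
    if (received >= rx->frame_len) {
        rx->frame_len = 0;                    /* ready for the next frame */
        return 0;
    }
    size_t remaining = rx->frame_len - received;
    return remaining < MAX_STEP ? remaining : MAX_STEP;
}
```

On target, the returned value would be what gets written to the RX full threshold (for example through the driver's `uart_set_rx_full_threshold()`, or the equivalent register when handling the interrupt directly). A 255-byte frame then costs four interrupts (2, 122, 242 and 255 bytes) instead of 255.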

MicroController
Posts: 1708
Joined: Mon Oct 17, 2022 7:38 pm
Location: Europe, Germany

Re: More efficient UART receive options

Postby MicroController » Sun Dec 10, 2023 1:35 am

HighVoltage wrote:
It would be nice if you could just take over the buffer from the driver and then tell the driver to use a different buffer, to avoid copying data.
That's actually how the IDF's ring buffers work. They give you a pointer into the buffer from which you can consume the data as you like, while the producer can concurrently keep pushing new data into another part of the buffer.
The ESP32 will happily copy a couple of hundred MB per second from and to RAM, so performance is not an issue with the measly UART producing the data.
OTOH, I too recently got annoyed by the HTTP server API wanting me to provide 'enough' buffers for it to needlessly copy each request's immutable data to, while it also keeps holding a copy in its internal structure. Had to go ahead and 'fix' that issue ;-)
