Why are WebSockets so painfully slow?
I've recently implemented OTA over HTTP in a project running on an ESP32-S3-WROOM-2-N32R8V. With 8MB of PSRAM available, my first approach was to simply send the whole firmware as a single HTTP PUT request. That would still be my preferred method if it weren't for how easily this runs into an HTTP receive timeout once the firmware is several MB in size. By default this timeout is set to 5s. Of course one could increase it, but that negatively affects all other HTTP requests running on the server.
So I rewrote my app to use WebSockets instead. At first I used packet sizes of 32kB. Much to my surprise, transmitting (and asynchronously responding to) a single packet already triggers the previously mentioned 5s timeout! How is that possible? Transmitting a 1.3MB firmware with a single HTTP PUT request took less than 60s in my network. How can 32kB of data suddenly take 5s?
Does anyone have any deeper insight on the WebSocket implementation? Where could this bottleneck be?
Re: Why are WebSockets so painfully slow?
What's the size of the buffer you're trying to read data into (with httpd_req_recv(...), presumably)?
You may get a timeout result if the buffer cannot be completely filled within the given timeframe. Trying to read 1MB at once may trigger a timeout where 100 reads @ 10kB may not.
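The idea of many small reads instead of one huge read can be sketched as a loop that pulls the body in fixed-size pieces. This is a host-side sketch, not ESP-IDF code: recv_fn, sink_fn, recv_in_chunks, mem_source, mem_recv, count_sink and count_write are hypothetical names made up for illustration. On the device, the receive callback would presumably wrap httpd_req_recv and the sink something like esp_ota_write.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: returns bytes read (>0), 0 if the peer closed the
 * connection, or a negative value on error/timeout. */
typedef int (*recv_fn)(void *ctx, char *buf, size_t len);

/* Hypothetical sink for each received piece. Returns 0 on success. */
typedef int (*sink_fn)(void *ctx, const char *buf, size_t len);

/* Read `total` bytes in pieces of at most `chunk_len` bytes.
 * Returns `total` on success or -1 if the source or the sink fails. */
long recv_in_chunks(recv_fn rx, void *rx_ctx, sink_fn sink, void *sink_ctx,
                    char *chunk, size_t chunk_len, size_t total)
{
    size_t remaining = total;
    while (remaining > 0) {
        size_t want = remaining < chunk_len ? remaining : chunk_len;
        int got = rx(rx_ctx, chunk, want);   /* may return fewer than `want` */
        if (got <= 0) {
            return -1;                       /* closed, timed out, or failed */
        }
        if (sink(sink_ctx, chunk, (size_t)got) != 0) {
            return -1;
        }
        remaining -= (size_t)got;
    }
    return (long)total;
}

/* Stand-ins so the loop can be tried on a PC without hardware. */
struct mem_source {
    const char *data;
    size_t len;
    size_t pos;
    size_t max_per_call;   /* simulates a socket delivering short reads */
};

int mem_recv(void *ctx, char *buf, size_t len)
{
    struct mem_source *s = ctx;
    size_t left = s->len - s->pos;
    size_t n = len < left ? len : left;
    if (n > s->max_per_call) {
        n = s->max_per_call;
    }
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return (int)n;         /* 0 once the source is exhausted */
}

struct count_sink {
    size_t bytes;
    size_t calls;
};

int count_write(void *ctx, const char *buf, size_t len)
{
    struct count_sink *c = ctx;
    (void)buf;             /* a real sink would write the data somewhere */
    c->bytes += len;
    c->calls++;
    return 0;
}
```

The loop only ever asks for one small buffer's worth of data, so no single call has to wait for the whole upload. This sketch aborts on any non-positive return; a real handler may prefer to retry on a timeout instead.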
Re: Why are WebSockets so painfully slow?
Oh... that's news to me. I've simply allocated the whole thing in one go. Thanks for the hint, I'll try that.
/edit
Thank you, doing multiple calls to httpd_req_recv works like a charm.
I now still wonder though... why the f. is the WebSocket implementation SO slow?
Transmitting a 1.3MB firmware to my device with an HTTP request takes 5.05s.
Transmitting the same firmware with WebSockets takes 8min! The packet size hardly matters in that case...
Re: Why are WebSockets so painfully slow?
Does the WS client send all data in one go, or does it wait for some server response every X kB?
Re: Why are WebSockets so painfully slow?
Ok, I totally misjudged that, sorry. It turns out the bottleneck is writing to flash memory, not anything HTTP related.
Transmitting the 1.3MB firmware takes just about 5s with HTTP, but writing it to an OTA partition then takes ~7.6min which is almost what I got before when using WebSockets (plus some minor delays because every packet got acknowledged).
I'm currently using the "OTA_WITH_SEQUENTIAL_WRITES" macro as second parameter when invoking esp_ota_begin as everything else gives me a TG1WDT_SYS_RST panic. I assume the flash write times could be improved upon if one could erase the entire partition at once instead of sequentially... the question is how though.
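To put the two figures side by side, here is a small arithmetic sketch. It assumes 1.3MB means 1.3e6 bytes and uses the 5.05s HTTP time and the ~7.6min flash time quoted in this thread; the function names are made up for illustration.

```c
#include <stdio.h>

/* Firmware size from the thread, taken as 1.3e6 bytes (assumption). */
#define FW_BYTES 1.3e6

/* Average throughput in bytes per second. */
double throughput_bytes_per_s(double bytes, double seconds)
{
    return bytes / seconds;
}

/* Print the two rates discussed above. */
void print_thread_figures(void)
{
    printf("HTTP transfer: %.0f B/s\n",
           throughput_bytes_per_s(FW_BYTES, 5.05));
    printf("flash write  : %.0f B/s\n",
           throughput_bytes_per_s(FW_BYTES, 7.6 * 60.0));
}
```

Under these assumptions the network side moves roughly 257kB/s while the flash side manages under 3kB/s, a factor of about 90, which is why the transport hardly matters here.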
Re: Why are WebSockets so painfully slow?
vinci1989 wrote: ↑Tue Jun 20, 2023 3:38 am
writing it to an OTA partition then takes ~7.6min
That's an abnormally long time. Is there anything else going on that might be locking up the flash for extended periods while OTA is busy? No higher priority task taking a huge amount of CPU time? Can you lower the log level?
vinci1989 wrote:
I'm currently using the "OTA_WITH_SEQUENTIAL_WRITES" macro as second parameter when invoking esp_ota_begin as everything else gives me a TG1WDT_SYS_RST panic. I assume the flash write times could be improved upon if one could erase the entire partition at once instead of sequentially...
If you know the binary size in advance (usually you can get this from the Content-Length header) then you can provide that to esp_ota_begin. Alternately, use 0 to erase the entire partition in advance.
Ensure you have enabled CONFIG_SPI_FLASH_YIELD_DURING_ERASE.
Re: Why are WebSockets so painfully slow?
You were right again. My gut feeling was way off about how much CPU time I already spend elsewhere.
I've increased the OTA task priority and voilà, the 1.3MB update now takes only 3min.
As for passing the size to esp_ota_begin: for some reason neither variant works for me. Both the true size and 0 trigger a TG1WDT_SYS_RST panic.
I've also tried the config option "CONFIG_SPI_FLASH_CHECK_ERASE_TIMEOUT_DISABLED" without success.
Can't say where this panic comes from. I've also tried to disable every watchdog (core0&1 and interrupts)... weird.
As for CONFIG_SPI_FLASH_YIELD_DURING_ERASE: this option is enabled.
Re: Why are WebSockets so painfully slow?
It sounds like the task watchdog. I suspect that, even with SPI_FLASH_YIELD_DURING_ERASE, these other very busy tasks still don't have enough time to get through their work. The obvious and easy thing to do is increase SPI_FLASH_ERASE_YIELD_TICKS so that the OTA task will give other tasks more time to jump in and do their thing while erasing flash.
Other things to do are to see if you can reduce CPU load in these other tasks, at least while OTA is active, and consider if pinning/unpinning to CPU(s) can shuffle workloads around in a way that reduces contention.
Re: Why are WebSockets so painfully slow?
I had another task on the same core performing ADC conversions at 4ksps. Suspending this task before performing the erase operation solved the issue of the triggered watchdog.
Thank you very much for all the useful replies!
Re: Why are WebSockets so painfully slow?
How is your latency?
Packet latency can be a few hundred ms in low power mode. Without low power mode it's ~15ms. This may account for the poor throughput.
Code:
esp_wifi_set_ps(WIFI_PS_NONE);
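For a rough feel of how much latency alone could cost when every WebSocket message is acknowledged before the next one is sent, here is a simplified estimate. It assumes one round trip per message, treats the quoted packet latency as that round trip, takes 32kB as 32768 bytes, and uses 300ms as an illustrative stand-in for "a few hundred ms". The function names are made up, and transmission and flash time are ignored.

```c
#include <stddef.h>

/* Number of messages needed to carry `total` bytes (rounded up). */
size_t message_count(size_t total, size_t msg_len)
{
    return (total + msg_len - 1) / msg_len;
}

/* Time spent only on waiting for acknowledgements, in milliseconds,
 * if each message costs one round trip of `rtt_ms`. */
double ack_wait_ms(size_t total, size_t msg_len, double rtt_ms)
{
    return (double)message_count(total, msg_len) * rtt_ms;
}
```

For a 1.3MB image (1300000 bytes assumed) in 32kB messages that is 40 messages, so about 12s of waiting at 300ms versus 0.6s at 15ms. Under this model, disabling power save helps noticeably with small messages, but it accounts for seconds rather than minutes.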