Disappointing WiFi performance ESP32-S3
Hi,
This may be a complex issue, with a lot of variables, so I appreciate all possible input. I must say I already tried a lot, but who knows!
So this is the issue. I have been developing and maintaining a firmware image for the ESP8266 for years now. It's based on the ESP8266_NONOS_SDK, so almost bare hardware, with little influence from SDK/IDF/RTOS or whatever, just the usual WiFi binary blobs and a self-compiled LWIP. The processor runs at 80 or 160 MHz; it doesn't make much difference for WiFi performance. Because RAM is tight, most of the memory used for networking is zero-copy and is passed along through the firmware.
On this firmware, I get a throughput (including a little bit of processing) of at most 800 kbyte/second (depending a little on the location relative to the access point, and it also varies somewhat per module). This is when the module is associated at the full 802.11n 65 Mbps MCS.
I have made something similar for the ESP32-S3. Some things have to be done a little differently, because the ESP-IDF takes a completely different approach. One of them is using the POSIX networking API (recv/send) instead of raw LWIP calls. Another is using threads and queues instead of LWIP callbacks.
With this approach I get a throughput of at most 200 kbyte/second. This is also with an association at 65 Mbps to the same access point. I cannot get my head around why the ESP32, with dual cores and a higher CPU speed (240 MHz), would be slower at this.
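To put the two figures side by side, here is a small sketch that converts the measured payload throughput to Mbit/s and relates it to the 65 Mbps association rate. It assumes 1 kbyte = 1000 bytes, and it treats 65 Mbps as the nominal PHY rate only; the achievable payload rate is always lower than that because of protocol overhead. The function names are made up for illustration.

```c
#include <stdio.h>

/* Convert payload throughput from kbyte/s to Mbit/s.
 * Assumption: 1 kbyte = 1000 bytes. */
double kbyte_per_s_to_mbit_per_s(double kbyte_per_s)
{
    return kbyte_per_s * 1000.0 * 8.0 / 1e6;
}

/* Fraction of the nominal PHY rate taken up by the payload throughput. */
double link_utilisation(double kbyte_per_s, double phy_mbit_per_s)
{
    return kbyte_per_s_to_mbit_per_s(kbyte_per_s) / phy_mbit_per_s;
}

/* Print both platforms against the same 65 Mbps association rate. */
void print_comparison(void)
{
    const double phy_mbit_per_s = 65.0;

    printf("ESP8266:  %.1f Mbit/s, %.1f%% of PHY rate\n",
           kbyte_per_s_to_mbit_per_s(800.0),
           100.0 * link_utilisation(800.0, phy_mbit_per_s));
    printf("ESP32-S3: %.1f Mbit/s, %.1f%% of PHY rate\n",
           kbyte_per_s_to_mbit_per_s(200.0),
           100.0 * link_utilisation(200.0, phy_mbit_per_s));
}
```

So 800 kbyte/second is about 6.4 Mbit/s (roughly 10% of the PHY rate) and 200 kbyte/second is about 1.6 Mbit/s (roughly 2.5%).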
I will post some details and things I tried next.
Re: Disappointing WiFi performance ESP32-S3
More details here.
I am testing with 4096-byte packets, using UDP and TCP. Over UDP, LWIP fragments them into WLAN-sized frames and then reassembles them; over TCP, LWIP does the segmentation and the firmware reassembles the segments into the original 4096-byte packets. There is little difference in performance between UDP and TCP (on both platforms).
This ESP32-S3 has 2 Mbytes of PSRAM, which I am using. I have tested with the LWIP and WiFi buffers in PSRAM and in internal RAM: only a very slight difference. I also tried running the PSRAM at 40 MHz instead of 80 MHz, also with very little change, so I do not think the PSRAM is the issue here.
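For reference, the TCP side of this (turning the byte stream back into fixed 4096-byte packets) usually comes down to a loop like the one below. This is a generic POSIX sketch, not the actual firmware code; `recv_exact` and `PACKET_SIZE` are names made up for illustration, and on ESP-IDF the socket declarations would typically come from `lwip/sockets.h`.

```c
#include <stddef.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

#define PACKET_SIZE 4096 /* application packet size used in the tests */

/*
 * Read exactly `len` bytes from a stream socket, however TCP happened
 * to segment them on the wire.
 * Returns 1 on success, 0 if the peer closed the connection before a
 * full packet arrived, and -1 on error (errno is set by recv()).
 */
int recv_exact(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = recv(fd, p + got, len - got, 0);
        if (n > 0) {
            got += (size_t)n;   /* partial read: keep filling the buffer */
        } else if (n == 0) {
            return 0;           /* orderly shutdown in the middle of a packet */
        } else if (errno != EINTR) {
            return -1;          /* real error */
        }
        /* n < 0 with EINTR: interrupted by a signal, just retry */
    }
    return 1;
}
```

Each recv() call may return anything from one byte up to the remaining length, so the number of calls per 4096-byte packet depends on how the segments arrive.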
Then I tried with various CPU speeds: 80/160/240 MHz. Also little change in performance. What the hell? This suggests any of the actual running code is the issue.
I have been benchmarking the memcpy calls (4k each). The minimum time is 1 microsecond (probably from a smaller chunk) and the maximum is 7 microseconds, which isn't that bad, I think?
I have been benchmarking heap_caps_malloc too. That may be interesting. The shortest time is 7 microseconds and the longest is 274 microseconds, which I think is substantial. I guess it also tries to do some garbage collection at the same time. To rule out this influence, I changed the source to not call malloc but pass the original buffer (and guard it). If malloc were the culprit, I'd expect to see a clear difference. But no, no difference at all. The influence of malloc appears to be negligible.
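A quick back-of-the-envelope check of these numbers, as a sketch. It assumes exactly one memcpy and one heap_caps_malloc per 4096-byte packet, which is a simplification and may not match the real code path; `busy_fraction` is a name made up for illustration.

```c
#include <stdio.h>

/*
 * Fraction of every second spent in an operation that costs `cost_us`
 * microseconds per packet, at the given throughput and packet size.
 * Assumption: the operation runs exactly once per packet.
 */
double busy_fraction(double bytes_per_s, double packet_bytes, double cost_us)
{
    double packets_per_s = bytes_per_s / packet_bytes;
    return packets_per_s * cost_us / 1e6;
}

/* Print the worst-case share of time for the measured memcpy and malloc. */
void print_budget(void)
{
    const double throughput = 200e3;  /* 200 kbyte/second, as measured */
    const double packet = 4096.0;     /* bytes per packet */

    printf("memcpy (7 us worst case):   %.3f%%\n",
           100.0 * busy_fraction(throughput, packet, 7.0));
    printf("malloc (274 us worst case): %.3f%%\n",
           100.0 * busy_fraction(throughput, packet, 274.0));
}
```

At 200 kbyte/second that is only about 49 packets per second, so even if every malloc took the worst-case 274 microseconds it would add up to roughly 13 ms per second (about 1.3%), and the memcpy to well under 0.1%. That is in line with the observation that removing the malloc made no difference.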
I also tried:
- enable/disable various "put wifi functions in IRAM"
- enable/disable AMPDU/AMSDU
- running wifi on the other core
- yielding the processor right after submitting the packet to the queue
None of these makes a difference.
So I am very very interested in suggestions. Could it be the POSIX interface layer (socket/bind/listen/recv/send/close) that slows things down tremendously? Could I avoid this layer? I am also using Bluetooth on the ESP32 (normally just idle listening), is that an issue? How about USB?
Thanks!
-
- Posts: 9
- Joined: Sun Jun 16, 2024 2:25 pm
Re: Disappointing WiFi performance ESP32-S3
I think you could try disabling PSRAM completely. If that helps, you could then use octal PSRAM at 80 MHz or 120 MHz.
-
- Posts: 9723
- Joined: Thu Nov 26, 2015 4:08 am
Re: Disappointing WiFi performance ESP32-S3
You may want to look at the iperf example in esp-idf; I believe it has a bunch of configuration options that make it prefer raw speed over balanced resource usage.
Re: Disappointing WiFi performance ESP32-S3
Linetkux Wang wrote (Tue Jun 18, 2024 2:20 pm):
"I think you may totally disable PSRAM. If that helps, then you may use octal PSRAM with 80MHz or 120MHz."
I can't really do that. The ESP32-S3 I am using has QIO PSRAM at 80 MHz. Nothing to gain there. But it means memory access should be possible at somewhere around 10 Mbytes/second, if not even hit by the cache. Actually I don't think the bottleneck is there. As said, I tried running the PSRAM at half the speed and it didn't really matter...
Re: Disappointing WiFi performance ESP32-S3
ESP_Sprite wrote (Fri Jun 21, 2024 6:14 am):
"You may want to look at the iperf example in esp-idf; I believe it has a bunch of configuration options that make it prefer raw speed over balanced resource usage."
Has anyone (especially at Espressif) done that? What kind of numbers should I be able to expect? If they are much better, I think I should have a look at the code to see if there are any smart(er) approaches.
-
- Posts: 1700
- Joined: Mon Oct 17, 2022 7:38 pm
- Location: Europe, Germany
Re: Disappointing WiFi performance ESP32-S3
Interesting, so I am getting only 1/200th of what I should be getting (~200 kb/s, 4096 byte packets)...
And that is compared to the minimum rank there.
-
- Posts: 1700
- Joined: Mon Oct 17, 2022 7:38 pm
- Location: Europe, Germany
Re: Disappointing WiFi performance ESP32-S3
Btw, I would first look into my application's code, probably measure some times (esp_cpu_get_cycle_count()), and check if there may be some unnoticed lags anywhere which would prevent me pushing data to the IP stack as fast as possible. Synchronization/inter-task communication (queues, ringbuffers,...), and most of all logging can consume much more time than one might expect.
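A minimal sketch of what such a measurement could look like: keep count/min/max/total per stage and only print the numbers after the transfer, so the logging itself does not distort the timing. `now_ticks`, `stage_stats_t` and `stage_record` are hypothetical names; the host fallback using clock() is only there so the sketch can be compiled off-target, and it counts clock() ticks instead of CPU cycles.

```c
#include <stdint.h>

#ifdef ESP_PLATFORM
#include "esp_cpu.h"
/* On the ESP32-S3: raw CPU cycle counter. */
static inline uint32_t now_ticks(void)
{
    return (uint32_t)esp_cpu_get_cycle_count();
}
#else
#include <time.h>
/* Host fallback (assumption, for off-target builds only): clock() ticks. */
static inline uint32_t now_ticks(void)
{
    return (uint32_t)clock();
}
#endif

/* Accumulated timing for one stage, e.g. "queue send" or "send()". */
typedef struct {
    uint32_t count;   /* number of samples */
    uint32_t min;     /* shortest sample, in ticks */
    uint32_t max;     /* longest sample, in ticks */
    uint64_t total;   /* sum of all samples, in ticks */
} stage_stats_t;

/* Record one sample; `start` and `end` come from now_ticks(). */
void stage_record(stage_stats_t *s, uint32_t start, uint32_t end)
{
    uint32_t d = end - start; /* unsigned subtraction survives one counter wrap */

    if (s->count == 0 || d < s->min) {
        s->min = d;
    }
    if (d > s->max) {
        s->max = d;
    }
    s->total += d;
    s->count++;
}
```

Typical use would be to call now_ticks() right before and right after the step of interest (a queue send, the send() call, a buffer copy), pass both values to stage_record(), and dump the structs once the test run is over. A stage whose max is far above its min is a candidate for the unnoticed lag.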