Ethernet packet loss with ESP-NOW (idf v4.4.6)

biterror
Posts: 31
Joined: Thu Apr 30, 2020 11:00 am

Ethernet packet loss with ESP-NOW (idf v4.4.6)

Postby biterror » Thu Jan 11, 2024 1:41 pm

I have been writing a communication system which uses ESP-NOW and the protocol is working quite nicely now. However, on the "router" system which passes messages between ESP-NOW nodes and Ethernet, I see UDP packet loss whenever there are packets flowing over ESP-NOW. Same with ping (ICMP) if ESP-NOW is moving packets.

My code uses several tasks (some for ESP-NOW, some for other things) and the tasks are running at priority level 0. No polling or busy looping is used, so there should be no CPU shortage.

I can flood ping the ESP32 board from two hosts with zero packet loss when there's no traffic over ESP-NOW, but if I send, say, 10 packets per second over ESP-NOW, I see some ICMP packet loss almost every second. I assume that the ICMP packets are handled at a very low level, and my own code should have nothing to do with that, so I find this problem strange.

I have tried increasing the number of some lwip buffers in menuconfig, but that didn't seem to help.

EDIT: It seems it's the Ethernet receive direction which is losing packets.

Any ideas?

biterror
Posts: 31
Joined: Thu Apr 30, 2020 11:00 am

Re: Ethernet packet loss with ESP-NOW (idf v4.4.6)

Postby biterror » Thu Jan 11, 2024 6:07 pm

So far, I have tried:
- forcing lwip to CPU1, no help
- setting mac_config.rx_task_prio = 5, no help
- setting lwip task priority to max-1, no help
- enabling ETH_CMD_S_FLOW_CTRL, no help
- measuring time spent in my ESP-NOW receive callback: max 140 µs in worst case
- setting ethernet and lwip buffers and mailboxes to max values, no help
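
For reference, the callback timing in the list above could be collected roughly like this. It is a minimal sketch, not the code from the router firmware: `cb_timing_t` and `cb_timing_record` are hypothetical names, and the timestamps are assumed to come from a microsecond clock such as `esp_timer_get_time()` on ESP-IDF.

```c
#include <stdint.h>

/* Hypothetical helper: tracks the worst-case duration of a callback.
 * Timestamps are passed in by the caller, in microseconds. On ESP-IDF
 * they could be taken with esp_timer_get_time() at the start and end
 * of the ESP-NOW receive callback (assumption, not shown here). */
typedef struct {
    int64_t  max_us;  /* longest duration seen so far */
    uint32_t calls;   /* number of recorded invocations */
} cb_timing_t;

static void cb_timing_record(cb_timing_t *t, int64_t start_us, int64_t end_us)
{
    int64_t duration = end_us - start_us;

    if (duration > t->max_us) {
        t->max_us = duration;
    }
    t->calls++;
}
```

The struct would be zero-initialised once, updated at the end of every callback invocation, and `max_us` printed from another task, so that logging does not add to the measured time.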

Running out of ideas.

biterror
Posts: 31
Joined: Thu Apr 30, 2020 11:00 am

Re: Ethernet packet loss with ESP-NOW (idf v4.4.6)

Postby biterror » Fri Jan 12, 2024 4:39 pm

Some additional information:

ESP-NOW reception is not a problem. If one ESP32 is only receiving packets over ESP-NOW, no ICMP packets are lost on Ethernet.

But if I use a simple program to transmit ESP-NOW packets at 100 ms or 50 ms intervals, some ICMP packets are lost almost every second.

I can see the same when using my ESP-NOW protocol. If the router board is transmitting over ESP-NOW and receives a UDP packet at the same time, the packet may be lost.
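
One way to quantify this kind of UDP loss on the receiving side is to put a sequence number in every datagram and count the gaps. The following is only a sketch under that assumption: the 32-bit sequence field and the names `loss_counter_t` and `loss_counter_update` are hypothetical and not part of my protocol.

```c
#include <stdint.h>

/* Hypothetical receive-side loss counter. Assumes the sender puts an
 * incrementing 32-bit sequence number into every UDP payload. */
typedef struct {
    uint32_t next_seq;      /* sequence number expected next */
    uint32_t received;      /* packets accepted in order */
    uint32_t lost;          /* packets skipped according to sequence gaps */
    uint32_t out_of_order;  /* late or duplicate packets */
    int      started;       /* set after the first packet */
} loss_counter_t;

static void loss_counter_update(loss_counter_t *c, uint32_t seq)
{
    if (c->started) {
        /* Unsigned subtraction handles wrap-around of the counter. */
        uint32_t gap = seq - c->next_seq;

        if (gap >= 0x80000000u) {
            /* Sequence number is behind the expected one. */
            c->out_of_order++;
            return;
        }
        c->lost += gap;
    }
    c->started = 1;
    c->received++;
    c->next_seq = seq + 1;
}
```

A late packet that was already counted as lost is not subtracted again, so the figure is an upper bound if the link reorders packets. On a direct Ethernet link that should rarely matter.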

biterror
Posts: 31
Joined: Thu Apr 30, 2020 11:00 am

Re: Ethernet packet loss with ESP-NOW (idf v4.4.6)

Postby biterror » Sat Jan 13, 2024 5:01 pm

I have been monitoring DMAMISSEDFR_REG, but no lost frames are reported (as far as I can tell, the driver doesn't access this register - if it does, it clears the register by reading it).
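
Because the register is cleared by reading it, every read has to be added to a running total. A minimal sketch of that bookkeeping follows. The bit layout is an assumption based on my reading of the register description (missed-frame count in bits 15:0, FIFO overflow count in bits 27:17, counter overflow flags in bits 16 and 28), and `emac_missed_stats_t` and `emac_missed_accumulate` are hypothetical names. How the raw value is read is left to the caller.

```c
#include <stdint.h>

/* Hypothetical running totals for a clear-on-read counter register.
 * The bit positions below are assumed, not verified against hardware. */
typedef struct {
    uint32_t missed_frames;       /* sum of bits 15:0 over all reads */
    uint32_t fifo_overflows;      /* sum of bits 27:17 over all reads */
    int      counter_overflowed;  /* set if bit 16 or bit 28 was ever seen */
} emac_missed_stats_t;

static void emac_missed_accumulate(emac_missed_stats_t *s, uint32_t raw)
{
    s->missed_frames  += raw & 0xFFFFu;
    s->fifo_overflows += (raw >> 17) & 0x7FFu;

    if (raw & ((1u << 16) | (1u << 28))) {
        /* A hardware counter wrapped, so the totals are a lower bound. */
        s->counter_overflowed = 1;
    }
}
```

With this, only one place in the code reads the register, and the totals stay meaningful no matter how often it is polled.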

One strange thing: When I keep calling esp_now_send(), the Ethernet LED blinks, although the software is not transmitting anything over Ethernet. As soon as I stop calling esp_now_send(), the Ethernet LED stops blinking. I don't know what is happening... :?

ESP_Sprite
Posts: 9577
Joined: Thu Nov 26, 2015 4:08 am

Re: Ethernet packet loss with ESP-NOW (idf v4.4.6)

Postby ESP_Sprite » Sun Jan 14, 2024 2:11 am

That is an odd problem, lemme check if anything is known about this. WiFi in general has a somewhat spikey power use profile; is there any way the lines to the Ethernet hardware can be affected by EMC? (Although tbh I'd expect the issues to appear on ESP-NOW Tx in that case...)

biterror
Posts: 31
Joined: Thu Apr 30, 2020 11:00 am

Re: Ethernet packet loss with ESP-NOW (idf v4.4.6)

Postby biterror » Sun Jan 14, 2024 8:35 am

Thanks for the reply! The PCB is a 6-layer one and the Ethernet signals are routed in the inner layers. I'm using an external antenna with ESP32, so (most of) the RF radiation is outside the board. Power supply has a 2 A regulator and all the chips and ESP32 have multiple bypass capacitors next to their power supply pins. The PHY chip (LAN8710A) is located near the ESP32 to minimize the trace length.

I found this yesterday: viewtopic.php?f=13&t=30141

I'm using GPIO17 as the clock output for the PHY. Is it possible that transmitting data over Wi-Fi/ESP-NOW disturbs the clock output, which in turn causes problems in the PHY chip?

I would need a new version of the PCB to test a separate oscillator for the PHY.. and the systems already installed in several countries would still be a problem :roll:

ESP_Sprite
Posts: 9577
Joined: Thu Nov 26, 2015 4:08 am

Re: Ethernet packet loss with ESP-NOW (idf v4.4.6)

Postby ESP_Sprite » Mon Jan 15, 2024 3:47 am

That indeed sounds like it's your issue... sorry, I don't know of a way to work around the problem on existing boards.

biterror
Posts: 31
Joined: Thu Apr 30, 2020 11:00 am

Re: Ethernet packet loss with ESP-NOW (idf v4.4.6)

Postby biterror » Tue Jan 16, 2024 2:16 pm

Shouldn't there be a warning in the ESP32 documentation about this? "If you plan on using WIFI and Ethernet, you CAN NOT use the ESP32 as the PHY master clock source." or something..

biterror
Posts: 31
Joined: Thu Apr 30, 2020 11:00 am

Re: Ethernet packet loss with ESP-NOW (idf v4.4.6)

Postby biterror » Thu Jun 20, 2024 4:37 pm

Umm, isn't it possible to use GPIO17 as RMII clock input? I was going to test the existing hardware with an external clock oscillator to see if Ethernet would work reliably, but I can't configure the GPIO17 pin as clock input. :-(

EDIT: I couldn't find any information about this in the Technical Reference Manual (by searching for RMII). GPIO17 has a function EMAC_CLK_180, but there is no mention of the pin being output only. (Or that the APLL-generated clock is not suitable for Ethernet if Wi-Fi is used.)

How are we supposed to know these things? I have designed two products, both suffering from the Ethernet/clock problem if I enable WIFI. Hmph.

ESP_ondrej
Posts: 198
Joined: Fri May 07, 2021 10:35 am

Re: Ethernet packet loss with ESP-NOW (idf v4.4.6)

Postby ESP_ondrej » Fri Aug 09, 2024 8:37 am

The observed instability of the generated CLK is unfortunately a real issue... The ESP32 cannot be used as the RMII CLK source when Wi-Fi is used. The only workaround is to not use Wi-Fi or to use an external CLK source :(

We are deeply sorry for the trouble caused. We have at least updated the errata https://www.espressif.com/sites/default ... ata_en.pdf, Section 3.22. We also plan to update the TRM to make the limitation clearly visible.
