So I just ported over a small arduino program to ESP32. This program creates a single WiFiServer and WiFiClient object.
A remote program on a PC connects to this server and then the ESP32 sends a 100byte message 100 times per second. The message is sent as a single client.write(buffer, length) call each time through the main loop. The same message is also sent out serial.
This worked great on an Adafruit Feather M0 WiFi board. When I ported it to their ESP32 board the TCP started having big problems.
The main loop has a spin lock, while(micros()<next_run_time); to lock it to 100hz. Sending messages out the serial is fine but when I have it start sending messages out TCP, performance is horrible. The main loop experiences unpredictable stalls of as long as 1-8 SECONDS frequently. If I stop sending TCP messages, performance goes back to normal.
Seems like the ESP32 shouldn't have problems sending just 10k/second (100 packets/second). What could I be doing wrong? The same code was running fine on the other board.
8 second stalls when TCP running, less than 10k/sec throughput
-
- Posts: 35
- Joined: Sun May 20, 2018 9:16 pm
-
- Posts: 35
- Joined: Sun May 20, 2018 9:16 pm
Re: 8 second stalls when TCP running, less than 10k/sec throughput
Forgot to mention, what I am doing is reading an I2C sensor at 100hz and then writing the values it returns to a TCP stream at 100hz.
This works fine all day long if I leave WiFi enabled but write the data to serial instead of TCP.
Just wanted to add that because I saw a problem mentioned elsewhere where it was a bad interaction between I2C and wifi.
Still digging into this and will try to put a simple reproduce case together.
This works fine all day long if I leave WiFi enabled but write the data to serial instead of TCP.
Just wanted to add that because I saw a problem mentioned elsewhere where it was a bad interaction between I2C and wifi.
Still digging into this and will try to put a simple reproduce case together.
-
- Posts: 35
- Joined: Sun May 20, 2018 9:16 pm
Re: 8 second stalls when TCP running, less than 10k/sec throughput
Here's a bit of test code, I chopped it down as much as possible and still have it outputting the same text in the same way. There is no I2C anymore. I just updated to the latest ESP32 arduino stuff this morning. In my case the AP/router is only 6 feet away from the computers and ESP32 and no other devices are using it.
To run the test just add the name of your AP and it's password below, compile and run. You should almost immediately get strings of numbers out the serial port. Then connect to port 90 using a TCP terminal program like Putty, you should get the same stream of numbers there. Let it run and every once in a while the stream of numbers will just stop for 8 seconds or more.
The first number in the string of numbers being printed is the loop time in us, it should be around 10000 but when there is a stall it will be a LOT larger.
The other thing that is odd, but you can't see with Putty, is that even when it's running smoothly the TCP traffic is very bursty. I have a PC telemetry program that ingests and graphs the stream in real time, when I use it with ESP32 the graph stutters and updates in irregular spurts. With a different MCU the graph moves smoothly.
Have I missed some setup thing that needs to be done to get this to work? Maybe buffer sizes, turning nagle mode off or something?
To run the test just add the name of your AP and it's password below, compile and run. You should almost immediately get strings of numbers out the serial port. Then connect to port 90 using a TCP terminal program like Putty, you should get the same stream of numbers there. Let it run and every once in a while the stream of numbers will just stop for 8 seconds or more.
The first number in the string of numbers being printed is the loop time in us, it should be around 10000 but when there is a stall it will be a LOT larger.
The other thing that is odd, but you can't see with Putty, is that even when it's running smoothly the TCP traffic is very bursty. I have a PC telemetry program that ingests and graphs the stream in real time, when I use it with ESP32 the graph stutters and updates in irregular spurts. With a different MCU the graph moves smoothly.
Have I missed some setup thing that needs to be done to get this to work? Maybe buffer sizes, turning nagle mode off or something?
Code: Select all
#define FOR_FEATHER_ESP32
#include <WiFi.h>
//-----------------------------------------------------------------------------------------------------------------
// WiFi Access point name and password.
//-----------------------------------------------------------------------------------------------------------------
const char ssid[] = "yourAP";
const char pass[] = "yourPASS";
#define TIME_BETWEEN_SENSOR_READS (10000) // Loop time in microseconds
#define TIME_BETWEEN_PID_COMPUTES (10) // Same as above, but in miliseconds, for the PID library
uint8_t
system_state = 1; // current state of the robots state machine
// These are tunable parameters, what works for one bot will probably not work for another, even if the design is
// exactly the same. For a new bot tt is best to set all these to zero and start tuning from there.
// When you have a tune that works well, put the values below so your bot will start up in-tune.
float
tp,ti,td; // calculated components of the pid for telemetry
double
balanceSetpoint = 2.7, // Natural balance point of the robot (usually 0 unless robot is lopsided)
totalSetpoint = 0.0,
sensorPitch = 0.0, // The current pitch value from the sensor
pitchError = 0.0, // Calculated pitch error (balanceSetpoint-sensorPitch)
pidOutput = 0.0;
int
bias_test_speed = 1;
// Timers and stats
unsigned long
next_blynk,
next_rssi,
loop_start, // Time at the top of the main loop
last_loop_start, // Time at the top of the previous main loop
sensor_frequency, // Time between sensor reads
last_sensor, // Time of last sensor read
pid_frequency, // Time between PID updates
last_pid, // Time of last PID update
blynk_time, // Time to execute the blynk processing in the main loop
sensor_time, // Time to execute the sensor read portion of the loop
next_sensor, // Next time the sensors should be read
last_loop_time; // Time spent inside the previous loop (report 1 frame late so we can capture the time required to do telemetry)
// Sensor related globals
uint8_t
sys = 0, // Calibration state of the IMU
gyro = 0, // gyro calibration state
accel = 0, // accelerometer calibration state
mag = 0; // magnitometer calibration state
//-----------------------------------------------------------------------------------------------------------------
// For the TCP based telemetry and control on port 90
// NOTE: if more than one client connects or this will get confused.
//-----------------------------------------------------------------------------------------------------------------
//
WiFiServer
server(90); // Creates Server on port 90
WiFiClient
client; // First client connected
bool
clientConnected = false, // TRUE when client is connected
clientCommand = false; // TRUE when there is a complete command in the buffer
char
clientBuffer[80]; // Buffer for incomming commands
int
clientBufsize = 0; // Size of the current line of text in the buffer
//-----------------------------------------------------------------------------------------------------------------
// For the telemetry system, used for both serial and wifi
//-----------------------------------------------------------------------------------------------------------------
// printBuffer is only initialized for test purposes. You can delete the initializer to save space.
// 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
uint8_t
printBuffer[200] = "15004,359.94,85.31,-3.94,1,0,3,0,0,15004,2818,15,0,0.00,-83.31,17.00,48.00,0.60,FAKE\n";
int
printBufferLen = 0; // Length of string in print buffer 85
bool
serialTelemetryOn = true, // true to turn on serial telemetry at startup
tcpTelemetryOn = true; // true to turn on TCP telemetry as soon as a connection is made to port 90
//-----------------------------------------------------------------------------------------------------------------
// void setup(void) --- automatically called by arduino at startup to set up the entire system
//-----------------------------------------------------------------------------------------------------------------
void setup(void)
{
// Setup Serial, to be used with the USB-serial telemetry app bit rate in the telemetry app needs to match.
Serial.begin(2000000);
Serial.println(F("BalanceBotMultiplatform7"));
// Initialize WiFi
Serial.println();
Serial.print("Connecting to ");
Serial.println(ssid);
WiFi.begin(ssid, pass);
while (WiFi.status() != WL_CONNECTED) {
delay(500);
Serial.print(".");
}
Serial.println("");
Serial.println("WiFi connected.");
Serial.println("IP address: ");
Serial.println(WiFi.localIP());
server.begin();
Serial.print("connected to wifi at ");
IPAddress myAddress=WiFi.localIP();
Serial.println(myAddress);
// Initialize all the time keeping variables.
// These are setup so the times reported first time through the loops will be close
// to zero rather than unusually large (if they were set before blink/wifi was initialized
// they would be very big because it takes blynk a lot of time to connect to wifi)
last_loop_start = micros();
last_sensor = micros();
last_pid = micros();
pid_frequency = 0; // Time between PID updates
blynk_time = 0, // Time to execute the blynk processing in the main loop
sensor_time = 0, // Time to execute the sensor portion of the loop
next_sensor = 0; // Next time the sensors should be read
}
//-----------------------------------------------------------------------------------------------------------------
// void loop(void) --- It is very important that this loop executes exactly on time. The PID library and other
// code assumes this. Confirm with telemetry that you are not going over time and adjust the TIME_BETWEEN_SENSOR_READS
// to make sure you will always run on time.
//
void loop(void)
{
// Start with a Spinlock to make sure this runs every TIME_BETWEEN_SENSOR_READS microseconds
// This could also be scheduled with an interrupt.
// Microsecond resolution is probably not required, only consistency, so this code could use
// the milisecond timer to avoid problems with wrap-around.
// I continue to use microseconds because I need the high resolution to time some of the
// execution time stats accurately.
while(micros() < next_sensor);
// Remember loop start time
loop_start = micros();
// If it's time to get a new sensor event, get it
sensor_time = micros(); // Start timer on sensor read
last_sensor = sensor_time;
next_sensor = sensor_time + TIME_BETWEEN_SENSOR_READS; // Compute time of next sensor read
sensor_time = micros()-sensor_time; // Compute how long this read took
// State handling
if(!clientConnected)
{
// Check to see if a client has connected
if(client = server.available())
{
clientConnected = true;
clientCommand = false;
clientBufsize = 0;
Serial.println("Connected to TCP telemetry/command." );
}
}
// Only generate the string if serial or TCP telemetry is on, this speeds up the loop when you aren't using telemetry.
if(tcpTelemetryOn || serialTelemetryOn)
{
// Calculate the MCU battery voltage (for feather M0)
float measuredvbat = 0;
printBufferLen = sprintf((char*)printBuffer,"%d,%4.2f,%4.2f,%4.2f,%d,%d,%d,%d,%d,%d,%d,%d,%d,%4.2f,%4.2f,%4.2f,%4.2f,%4.2f,%d,%4.2f,%4.2f,%4.2f,%4.2f,%4.2f\n",
(loop_start - last_loop_start), // Total loop time, measured from start of previous loop to start of this one
(1), // Sensor fusion x
(2), // Sensor fusion y
(3), // Sensor fusion Z
// euler.x(),
// euler.y(),
// euler.z(),
(system_state), // System state machine
(sys), // IMU calibration states
(gyro),
(accel),
(mag),
(sensor_frequency), // Time between sensor reads
(sensor_time), // Time to do a sensor read
(blynk_time), // Time to do a blynk call
(pid_frequency), // Time between PID calculations
(pidOutput), // Output of a PID calculation
(totalSetpoint - sensorPitch), // Pitch error
(1), // PID tuning parameters
(2),
(3),
(last_loop_time), // time since loop start not counting write of stats
measuredvbat, // battery voltage
totalSetpoint, // setpoint including joystick contribution
tp,
ti,
td);
}
// Send telemetry buffer out serial using Serial.write
if(serialTelemetryOn && printBufferLen > 0)
{
Serial.write(printBuffer,printBufferLen);
}
// send telemetry buffer out TCP
if(clientConnected && tcpTelemetryOn)
{
pid_frequency = micros();
client.write(printBuffer,printBufferLen);
pid_frequency = micros()-pid_frequency;
}
// Update timing stats, note that "last loop time" is the time of the PREVIOUS loop execution
printBufferLen = 0;
last_loop_time = micros()-loop_start;
last_loop_start = loop_start;
}
-
- Posts: 35
- Joined: Sun May 20, 2018 9:16 pm
Re: 8 second stalls when TCP running, less than 10k/sec throughput
I did some wireshark captures and am seeing what look like bad retransmits due to missing ACKs or timeouts.
It usually starts with a "previous segment not captured" error, followed by a number of duplicate ACKs, a TCP retransmission and then a "spurious retransmission" error.
I need to check more carefully but it appears that either the ESP dropped an ACK or timed out and decided to retransmit, but according to wireshark the PC seemed to have received and acked the data the first time.
Is there any way I can change the ESP buffer size or other TCP timeout values that might resolve this?
It usually starts with a "previous segment not captured" error, followed by a number of duplicate ACKs, a TCP retransmission and then a "spurious retransmission" error.
I need to check more carefully but it appears that either the ESP dropped an ACK or timed out and decided to retransmit, but according to wireshark the PC seemed to have received and acked the data the first time.
Is there any way I can change the ESP buffer size or other TCP timeout values that might resolve this?
Re: 8 second stalls when TCP running, less than 10k/sec throughput
Sorry for digging up this old thread again. I just wasted a week on a smiliar problem with the solution being rather simple, so maybe I can save someone a bit of hassle.
The short answer is: disable nagle.
The problem is caused by the nagle algorithm as it will only allow to push out data if the previous ACK was received.
When a single ACK is dropped, lwip will wait until the ack timeout expires before resending the packet as "spurious retransmission". As lwip's default ACK timeout is not very strict this will cause a "stall" lasting multiple seconds while lwip waits for the lost ack to arrive before resending. For sending "live" data, this is obviously very bad.
The timeout can be reduced to 500ms at which point lwip's tcp polling rate will prevent setting it any lower. By rebuilding lwip, this rate can be increased.
For sending a steady stream of data, I found that disabling nagle completely works best. When an ack is dropped it'll just send the next packet. If a packet is dropped the client will detect that it's out of sequence and resend the lost packet ("TCP fast retransmission").
The connection usually recovers within a few milliseconds without causing any notable stalls.
Disabling nagle obviously comes at the price of reduced payload throughput.
I'm not entirely sure how you would disable it using arduino. Using lwip's raw tcp api you can just call
The short answer is: disable nagle.
The problem is caused by the nagle algorithm as it will only allow to push out data if the previous ACK was received.
When a single ACK is dropped, lwip will wait until the ack timeout expires before resending the packet as "spurious retransmission". As lwip's default ACK timeout is not very strict this will cause a "stall" lasting multiple seconds while lwip waits for the lost ack to arrive before resending. For sending "live" data, this is obviously very bad.
The timeout can be reduced to 500ms at which point lwip's tcp polling rate will prevent setting it any lower. By rebuilding lwip, this rate can be increased.
For sending a steady stream of data, I found that disabling nagle completely works best. When an ack is dropped it'll just send the next packet. If a packet is dropped the client will detect that it's out of sequence and resend the lost packet ("TCP fast retransmission").
The connection usually recovers within a few milliseconds without causing any notable stalls.
Disabling nagle obviously comes at the price of reduced payload throughput.
I'm not entirely sure how you would disable it using arduino. Using lwip's raw tcp api you can just call
Code: Select all
tcp_nagle_disable(pcb);
Who is online
Users browsing this forum: No registered users and 60 guests