mqtt_client keepalive

diolupo
Posts: 11
Joined: Sat Nov 19, 2022 12:08 pm

mqtt_client keepalive

Postby diolupo » Fri Jun 09, 2023 11:27 am

Hi,
I'm using the ESP mqtt_client library with ESP-IDF v4.4.4 and v5.0.1.
In both SDKs, the keepalive interval that mqtt_client actually uses seems to be half of the value set in the client configuration.
I checked this by looking at the mosquitto logs for the PINGREQ / PINGRESP exchanges with my client.

For example, when I set a keepalive of 30 s, I see a PINGREQ/PINGRESP pair roughly every 15 s (the mosquitto log entries below are 16 s apart):

1686308238: Received PINGREQ from CLI-0000f412fad7a3e0
1686308238: Sending PINGRESP to CLI-0000f412fad7a3e0
1686308254: Received PINGREQ from CLI-0000f412fad7a3e0
1686308254: Sending PINGRESP to CLI-0000f412fad7a3e0
1686308270: Received PINGREQ from CLI-0000f412fad7a3e0
1686308270: Sending PINGRESP to CLI-0000f412fad7a3e0

Looking at the mqtt_client.c source code, I found this:

static inline bool has_timed_out(uint64_t last_tick, uint64_t timeout) {
  uint64_t next = last_tick + timeout;
  return (int64_t)(next - platform_tick_get_ms()) <= 0;
}

static esp_err_t process_keepalive(esp_mqtt_client_handle_t client)
{
    if (client->connect_info.keepalive > 0) {
        const uint64_t keepalive_ms = client->connect_info.keepalive * 1000;

        if (client->wait_for_ping_resp == true ) {
            if (has_timed_out(client->keepalive_tick, keepalive_ms)) {
                ESP_LOGE(TAG, "No PING_RESP, disconnected");
                esp_mqtt_abort_connection(client);
                client->wait_for_ping_resp = false;
                return ESP_FAIL;
            }
            return ESP_OK;
        }

        if (has_timed_out(client->keepalive_tick, keepalive_ms/2)) {
            if (esp_mqtt_client_ping(client) == ESP_FAIL) {
                ESP_LOGE(TAG, "Can't send ping, disconnected");
                esp_mqtt_abort_connection(client);
                return ESP_FAIL;
            }
            client->wait_for_ping_resp = true;
            return ESP_OK;
        }
    }
    return ESP_OK;
}

Looking at the code, it seems to me that the MQTT client sends a PINGREQ once the time elapsed since the latest keepalive_tick reaches keepalive_ms/2.
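
To make the timing concrete, here is a minimal sketch that replays the quoted logic against a simulated clock. It is not the library's code: sim_client_t, sim_timed_out() and sim_ping_interval_ms() are hypothetical names, and the model assumes that keepalive_tick is reset when the PINGRESP arrives and that no other traffic resets it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields process_keepalive() reads. */
typedef struct {
    uint64_t keepalive_ms;     /* configured keepalive, in milliseconds */
    uint64_t keepalive_tick;   /* time of the last keepalive reset */
    bool wait_for_ping_resp;   /* true while a PINGREQ is unanswered */
} sim_client_t;

/* Same comparison as has_timed_out() in the quoted source, with "now" passed in. */
static bool sim_timed_out(uint64_t last_tick, uint64_t timeout, uint64_t now)
{
    return (int64_t)(last_tick + timeout - now) <= 0;
}

/*
 * Steps a simulated clock 1 ms at a time and returns the time between the
 * first two PINGREQs. rtt_ms is the assumed delay until the PINGRESP arrives.
 * Assumption: keepalive_tick is reset when the PINGRESP is received.
 */
static uint64_t sim_ping_interval_ms(uint32_t keepalive_s, uint64_t rtt_ms)
{
    assert(keepalive_s > 0);
    sim_client_t c = { (uint64_t)keepalive_s * 1000u, 0, false };
    assert(rtt_ms < c.keepalive_ms / 2);   /* otherwise the client would disconnect */

    uint64_t ping_sent_at = 0, first_ping = 0;
    int pings = 0;

    for (uint64_t now = 0; ; now++) {
        /* Ping branch: fires keepalive_ms/2 after the last reset. */
        if (!c.wait_for_ping_resp &&
            sim_timed_out(c.keepalive_tick, c.keepalive_ms / 2, now)) {
            if (++pings == 2) {
                return now - first_ping;
            }
            first_ping = now;
            ping_sent_at = now;
            c.wait_for_ping_resp = true;
        }
        /* Simulated PINGRESP: clears the flag and resets the tick. */
        if (c.wait_for_ping_resp && now - ping_sent_at >= rtt_ms) {
            c.wait_for_ping_resp = false;
            c.keepalive_tick = now;
        }
    }
}
```

Under these assumptions, sim_ping_interval_ms(30, 0) gives 15000 ms, and any response delay is added on top, so sim_ping_interval_ms(30, 1000) gives 16000 ms. That happens to match the 16 s spacing in the mosquitto log, although the model alone does not establish the cause of the extra second.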

Why is it implemented this way? Is there a real reason for the library to internally halve the configured keepalive value?

From what I have read about MQTT keepalive, my understanding is that the client should send a PINGREQ once every keepalive interval and should wait for the PINGRESP from the broker for half of the keepalive interval.
The current implementation does exactly the opposite...

MicroController
Posts: 1734
Joined: Mon Oct 17, 2022 7:38 pm
Location: Europe, Germany

Re: mqtt_client keepalive

Postby MicroController » Fri Jun 09, 2023 5:19 pm

Random quote from the internet:
Keep Alive is [...] the maximum time in seconds allowed to elapse between MQTT protocol packets sent by the client.
Notice it says maximum. If the client were to send a ping only after the full keep alive time had expired, it would already be in violation of the timeout.
Given that the response to a ping may take up to keep_alive/2, it makes sense to send a ping keep_alive/2 after the last pong was received. That ensures the client always sends a ping between keep_alive/2 and keep_alive after the previous one.

diolupo
Posts: 11
Joined: Sat Nov 19, 2022 12:08 pm

Re: mqtt_client keepalive

Postby diolupo » Sun Jun 11, 2023 6:32 am

In my opinion this is completely misleading.
The keepalive interval is expressed in seconds, so if I set it to 30, I expect the MQTT client to ping the broker every 30 s, not every 15 s.
Moreover, MQTT brokers do not time out the connection if the ping is not received exactly when the keepalive interval elapses; they wait up to keepalive + keepalive/2 before assuming that the client is no longer connected.
The ESP MQTT client code pings the broker at an interval of keepalive/2 seconds and waits for ping responses (what you call pongs) for at most keepalive seconds.
I think keepalive and keepalive/2 are swapped in the process_keepalive() function.

a2800276
Posts: 78
Joined: Sat Jan 23, 2016 1:59 pm

Re: mqtt_client keepalive

Postby a2800276 » Sun Jun 11, 2023 7:20 am

This seems to be a clear case of "parameter does not mean what I want it to mean" :D Read section "3.1.2.10 Keep Alive" of the MQTT spec: the client has to send a packet at least once per keep alive interval, as MicroController said, and the server must close the connection if it hears nothing for one and a half times the keep alive.
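
As a rough illustration of how these limits line up, the sketch below computes three deadlines for a given keep alive: when the code quoted earlier in the thread sends its PINGREQ, the latest point at which the spec allows the client to stay silent, and the point at which the server must drop a silent client. The type and function names are hypothetical, and all times are measured from the last keepalive reset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three deadlines, in milliseconds. */
typedef struct {
    uint32_t client_ping_ms;    /* quoted ESP code sends PINGREQ here (keepalive/2) */
    uint32_t client_limit_ms;   /* spec: client must have sent something by here */
    uint32_t broker_cutoff_ms;  /* spec: server closes the connection here (1.5x) */
} keepalive_deadlines_t;

/* keepalive_s is the 16-bit value carried in the CONNECT packet. */
static keepalive_deadlines_t keepalive_deadlines(uint16_t keepalive_s)
{
    assert(keepalive_s > 0);    /* 0 turns the keep alive mechanism off */
    uint32_t ms = (uint32_t)keepalive_s * 1000u;
    keepalive_deadlines_t d = { ms / 2, ms, ms + ms / 2 };
    return d;
}
```

For a keep alive of 30 s this yields 15000, 30000 and 45000 ms, so a client that pings at the halfway point leaves a full keepalive/2 of slack before it would break the client-side limit.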

MicroController
Posts: 1734
Joined: Mon Oct 17, 2022 7:38 pm
Location: Europe, Germany

Re: mqtt_client keepalive

Postby MicroController » Sun Jun 11, 2023 5:22 pm

diolupo wrote:
Sun Jun 11, 2023 6:32 am
The ESP mqtt client code sends ping to the Broker with an interval = keepalive/2 sec. and waits for ping responses (those that you call pong) for at most keepalive sec.
Not quite. A timeout is detected if no pong is received keepalive seconds after the last pong; and a ping is sent keepalive/2 seconds after the last pong, so all is good :)

diolupo
Posts: 11
Joined: Sat Nov 19, 2022 12:08 pm

Re: mqtt_client keepalive

Postby diolupo » Mon Jun 12, 2023 7:03 am

Everything is good... OK, the MQTT client is working, but I still do not understand the choice of halving the keepalive interval... why not keepalive - 1 or keepalive / 3...
My customers pointed this behaviour out to me: they set up the device with a keepalive value and then discover that it pings the broker at half the interval they configured.
So, if this behaviour is the correct one (and I still do not understand why), I will change the configuration interface to double the keepalive interval, so that the behaviour is consistent with the configuration parameter.

lodogg
Posts: 7
Joined: Wed Sep 11, 2019 11:06 am

Re: mqtt_client keepalive

Postby lodogg » Tue Jul 02, 2024 3:09 pm

diolupo wrote:
Mon Jun 12, 2023 7:03 am
Everything is good... ok, the mqtt client is working, but I still do not understand the choice of halving the keepalive interval... why not choosing keepalive - 1 or keepalive / 3...
My customers make me notice this behaviour, they setup the device with a keepalive value and then they discover that the device pings the broker with half of the keepalive interval they set up.
So, if this behavior is the correct one (and I still do not understand why), I will change the configuration interface doubling the keepalive interval to have a consistent behaviour against the configuration parameter.
Doubling the keep alive in the configuration to get the desired ping timing may not be possible. For example, if you want to use the maximum keep alive allowed by the server (let's say 20 minutes for AWS), you would have to announce 40 minutes in the CONNECT message, and the server does not allow that.

So I'm not sure halving the keepalive interval is the right choice by esp...
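
The limitation can be sketched as a small helper that tries to double the desired ping interval and falls back to the broker's maximum when the doubled value does not fit. This is only an illustration: keepalive_for_ping_interval() is a hypothetical name, and the broker limit is passed in as a parameter because it depends on the server.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: picks the keep alive to configure so that a client
 * which pings at keepalive/2 ends up pinging every desired_ping_s seconds.
 * Returns true if the doubled value fits under broker_max_s. Otherwise it
 * stores broker_max_s, which means pings every broker_max_s/2 seconds.
 */
static bool keepalive_for_ping_interval(uint16_t desired_ping_s,
                                        uint16_t broker_max_s,
                                        uint16_t *keepalive_s)
{
    assert(keepalive_s != NULL);
    uint32_t doubled = (uint32_t)desired_ping_s * 2u;  /* cannot overflow 32 bits */

    if (doubled <= broker_max_s) {
        *keepalive_s = (uint16_t)doubled;
        return true;
    }
    *keepalive_s = broker_max_s;   /* best effort: clamp to the broker limit */
    return false;
}
```

With the 20-minute limit from the example above (1200 s), a desired ping interval of 30 s maps to a keep alive of 60 s, but a desired interval of 1200 s cannot be reached: the helper clamps to 1200 s and the client would still ping every 600 s.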