OTA Problems

Bartosz
Posts: 8
Joined: Tue Jul 09, 2019 8:06 pm

OTA Problems

Postby Bartosz » Tue Jul 09, 2019 8:40 pm

mesh config:
routerless, one device configured as MESH_ROOT another as MESH_NODE, root communicate with PC via UART

problem conditions:
OTA data is transfered to ROOT, then mlink_ota_send_task is started to program another NODE

1) NODE random crashes during OAT transfer from ROOT to NODE (approximately 1 of 3 OTA transfer fails), after NODE reebots it enter upgrade mode again and crash again in infinite loop, it stops after ROOT mupgrade_firmware_stop():
I (17371) mesh: 1888<arm>parent monitor, my layer:2(cap:16), interval:12560ms, retries:2<>
E (23801) [mwifi, 815]: <ESP_ERR_NO_MEM> Realloc size: 1456, new_ptr: 0x0, heap free: 10816
assertion "new_ptr" failed: file "src\mwifi.c", line 815, function: __mwifi_read
abort() was called at PC 0x401682ab on core 1

Backtrace: 0x4008adeb:0x3ffe2280 0x4008b0e9:0x3ffe22a0 0x401682ab:0x3ffe22c0 0x400e1743:0x3ffe22f0 0x400d2ccb:0x3ffe2380 0x40087db5:0x3ffe23d0

Rebooting...

2) after NODE enters upgrade mode always watchdog warnings appears (OTA ends with success):
I (5096421) [MDF_APP, 1202]: event_loop_cb, event: 0x3001
W (5096421) [APP, 1354]: Enter upgrade mode
E (5101411) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (5101411) task_wdt: - IDLE0 (CPU 0)
E (5101411) task_wdt: Tasks currently running:
E (5101411) task_wdt: CPU 0: wifi
E (5101411) task_wdt: CPU 1: IDLE1
E (5106401) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (5106401) task_wdt: - IDLE0 (CPU 0)
E (5106401) task_wdt: Tasks currently running:
E (5106401) task_wdt: CPU 0: wifi
E (5106401) task_wdt: CPU 1: IDLE1
I (5108121) [MDF_APP, 1202]: event_loop_cb, event: 0x3002
W (5108121) [APP, 1362]: The upgrade progress is: 10%
I (5108401) mesh: 4379<assoc>parent layer:1, channel:4, rssi:-49, assoc:1, rssi threshold<-78,-82,-85>
E (5118551) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (5118551) task_wdt: - IDLE0 (CPU 0)
E (5118551) task_wdt: Tasks currently running:
E (5118551) task_wdt: CPU 0: node_handle
E (5118551) task_wdt: CPU 1: IDLE1
I (5122691) [MDF_APP, 1202]: event_loop_cb, event: 0x3002
W (5122741) [APP, 1362]: The upgrade progress is: 20%
I (5125111) [MDF_APP, 1202]: event_loop_cb, event: 0x3002
W (5125111) [APP, 1362]: The upgrade progress is: 30%
E (5130081) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (5130081) task_wdt: - IDLE0 (CPU 0)
E (5130081) task_wdt: Tasks currently running:
E (5130081) task_wdt: CPU 0: node_handle
E (5130081) task_wdt: CPU 1: IDLE1
I (5132191) [MDF_APP, 1202]: event_loop_cb, event: 0x3002
W (5132191) [APP, 1362]: The upgrade progress is: 40%
E (5139361) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (5139361) task_wdt: - IDLE0 (CPU 0)
E (5139361) task_wdt: Tasks currently running:
E (5139361) task_wdt: CPU 0: node_handle
E (5139361) task_wdt: CPU 1: IDLE1
I (5140211) [MDF_APP, 1202]: event_loop_cb, event: 0x3002
W (5140221) [APP, 1362]: The upgrade progress is: 50%
I (5149871) [MDF_APP, 1202]: event_loop_cb, event: 0x3002
W (5149871) [APP, 1362]: The upgrade progress is: 60%
I (5158801) [MDF_APP, 1202]: event_loop_cb, event: 0x3002
W (5158801) [APP, 1362]: The upgrade progress is: 70%
I (5163891) [MDF_APP, 1202]: event_loop_cb, event: 0x3002
W (5163891) [APP, 1362]: The upgrade progress is: 80%
I (5171041) [MDF_APP, 1202]: event_loop_cb, event: 0x3002
W (5171041) [APP, 1362]: The upgrade progress is: 90%
I (5177281) [mupgrade_node, 212]: Write total_size: 969616, written_size: 969616, spend time: 81s
I (5177291) [MDF_APP, 1202]: event_loop_cb, event: 0x3002
W (5177301) [APP, 1362]: The upgrade progress is: 100%
I (5177301) esp_image: segment 0: paddr=0x00010020 vaddr=0x3f400020 size=0x29020 (167968) map
I (5177401) esp_image: segment 1: paddr=0x00039048 vaddr=0x3ffbdb60 size=0x03488 ( 13448)
I (5177411) esp_image: segment 2: paddr=0x0003c4d8 vaddr=0x40080000 size=0x00400 ( 1024)
I (5177411) esp_image: segment 3: paddr=0x0003c8e0 vaddr=0x40080400 size=0x03730 ( 14128)
I (5177431) esp_image: segment 4: paddr=0x00040018 vaddr=0x400d0018 size=0xaf614 (718356) map
I (5177831) esp_image: segment 5: paddr=0x000ef634 vaddr=0x40083b30 size=0x0d538 ( 54584)
I (5177861) [MDF_APP, 1202]: event_loop_cb, event: 0x3003
W (5177871) [APP, 1371]: Upgrade completed waiting for restart

ESP_Bond
Posts: 51
Joined: Mon Apr 15, 2019 1:37 pm

Re: OTA Problems

Postby ESP_Bond » Wed Jul 10, 2019 9:41 am

This problem is caused by insufficient memory.

```
E (23801) [mwifi, 815]: <ESP_ERR_NO_MEM> Realloc size: 1456, new_ptr: 0x0, heap free: 10816
```

1. Check if your configuration is as follows:
Image

2. If you can't provide a complete log of your child nodes, check the memory usage.

Bartosz
Posts: 8
Joined: Tue Jul 09, 2019 8:06 pm

Re: OTA Problems

Postby Bartosz » Thu Jul 11, 2019 11:27 am

Thanks for advice, surprisingly my number of mesh buffer queues was set to 64, after change to 32 there's no more crashes.
Watchdog warnings still happens.

ESP_Bond
Posts: 51
Joined: Mon Apr 15, 2019 1:37 pm

Re: OTA Problems

Postby ESP_Bond » Fri Jul 12, 2019 2:17 am

Please provide the code of `node_handle_task`

Bartosz
Posts: 8
Joined: Tue Jul 09, 2019 8:06 pm

Re: OTA Problems

Postby Bartosz » Mon Jul 15, 2019 2:50 pm

It's looks like during OTA root transfers data with maximum bandwidth - node generates watchdog warning and there is also no communication with node during OTA. Adding delay in mupgrade_firmware_send() loop resolves all problems.

mupgrade_root.c function mupgrade_firmware_send:
....
/**
* @brief Send firmware data to unfinished devide.
*/
if ((MWIFI_ADDR_IS_ANY(addrs_list) || MWIFI_ADDR_IS_BROADCAST(addrs_list))
&& result->successed_num < 2 && addrs_num == 1) {
MDF_LOGD("seq: %d, size: %d, addrs_num: %d", packet->seq, packet->size, addrs_num);

if (MWIFI_ADDR_IS_ANY(addrs_list) && result->successed_num == 1) {
uint8_t broadcast_addr[] = MWIFI_ADDR_BROADCAST;
ret = mwifi_root_write(broadcast_addr, addrs_num, &type,
packet, sizeof(mupgrade_packet_t), true);
} else {
ret = mwifi_root_write(addrs_list, addrs_num, &type,
packet, sizeof(mupgrade_packet_t), true);
}
} else {
MDF_LOGD("seq: %d, size: %d, addrs_num: %d", packet->seq, packet->size, result->requested_num);
ret = mwifi_root_write(result->requested_addr, result->requested_num, &type,
packet, sizeof(mupgrade_packet_t), true);
}

//slow down
vTaskDelay(100 / portTICK_PERIOD_MS); //!!!!!!!!!!!!!!!

MDF_ERROR_CONTINUE(ret != ESP_OK, "<%s> Mwifi root write", mdf_err_to_name(ret));
...

ESP_Bond
Posts: 51
Joined: Mon Apr 15, 2019 1:37 pm

Re: OTA Problems

Postby ESP_Bond » Thu Jul 18, 2019 9:33 am

1. I hope to provide a complete log of the problem.https://docs.espressif.com/projects/esp ... l#feedback
2. Set the mupgrade log level to reduce log output

Code: Select all

     
     Esp_log_level_set("*", ESP_LOG_INFO);
     Esp_log_level_set("mupgrade_node", ESP_LOG_INFO);
     Esp_log_level_set("mupgrade_root", ESP_LOG_INFO);

Bartosz
Posts: 8
Joined: Tue Jul 09, 2019 8:06 pm

Re: OTA Problems

Postby Bartosz » Tue Jul 23, 2019 8:33 am

I've upgrade MDF to v1.0-beta1-2-gc1cc50f and recreate my project - now everything is ok.
I was basing on examples\development_kit\light but I've mess with sdkconfig.defaults, now I notice that there are many overrides of standard settings - I suppose that it was source of all my problems.

Who is online

Users browsing this forum: No registered users and 62 guests