Crash running https_request example

mgleason_3
Posts: 44
Joined: Mon Nov 07, 2016 5:04 pm

Crash running https_request example

Postby mgleason_3 » Sat Jun 10, 2017 2:30 am

Running the "esp-idf/examples/protocols/https_request example" (on a ESP-WROOM-32) plugged into a USB port on my computer. The https_request example basically makes a request via SSL to a web site, waits 10 seconds and repeats.

It seems to run fine for a while, then BOOM! it panics and reboots. It may successfully complete the connect/wait-10-secs cycle 3 times before panic'ing and rebooting - or it might go 20 or 30 cycles. Any idea what's going on?

Code: Select all

...
I (53406) example: Connected.
I (53406) example: Performing the SSL/TLS handshake...
Guru Meditation Error: Core  0 panic'ed (Interrupt wdt timeout on CPU1)
Register dump:
PC      : 0x400d14f4  PS      : 0x00060d34  A0      : 0x80085174  A1      : 0x3ffafda0  
0x400d14f4: esp_vApplicationIdleHook at /Users/Fred/Dev/esp/esp-idf/components/esp32/./freertos_hooks.c:52

A2      : 0x00000008  A3      : 0x00000001  A4      : 0x00060023  A5      : 0x3ffafe4c  
A6      : 0x00000001  A7      : 0x00000001  A8      : 0x3ffb2c5c  A9      : 0x3ffafd80  
A10     : 0x00000000  A11     : 0x00060d20  A12     : 0x80083c99  A13     : 0x3ffafcb0  
A14     : 0x00000003  A15     : 0x00060023  SAR     : 0x00000000  EXCCAUSE: 0x00000006  
EXCVADDR: 0x00000000  LBEG    : 0x00000000  LEND    : 0x00000000  LCOUNT  : 0x00000000  

Backtrace: 0x400d14f4:0x3ffafda0 0x40085174:0x3ffafdc0
0x400d14f4: esp_vApplicationIdleHook at /Users/Fred/Dev/esp/esp-idf/components/esp32/./freertos_hooks.c:52

0x40085174: prvIdleTask at /Users/Fred/Dev/esp/esp-idf/components/freertos/./tasks.c:3439


Rebooting...

ESP_igrr
Posts: 2071
Joined: Tue Dec 01, 2015 8:37 am

Re: Crash running https_request example

Postby ESP_igrr » Sat Jun 10, 2017 3:23 am

Could you give a bit more info about the hardware? ESP-WROOM32 is a module which isn't normally connected to USB port directly. Usually there is some board involved which has a voltage regulator and USB/UART converter.

mgleason_3
Posts: 44
Joined: Mon Nov 07, 2016 5:04 pm

Re: Crash running https_request example

Postby mgleason_3 » Sat Jun 10, 2017 6:15 am

It's an Espressif ESP32-DEVKITC. Black PCB, 4Mbyte Flash. It says its based on the ESP-WROOM-32.
Here's the link: https://www.olimex.com/Products/IoT/ESP32-CoreBoard/
Assuming it matches the BOM,
It should have:
- a silicon labs CP2102-GM UART.
- one of the following voltage regulators:
  • NCP1117ST33T3G - 1A @3.3V
  • AMS1117 - 1A @3.3V

mgleason_3
Posts: 44
Joined: Mon Nov 07, 2016 5:04 pm

Re: Crash running https_request example

Postby mgleason_3 » Sat Jun 10, 2017 10:47 pm

Tried setting the CPU Speed to 160Mhz instead of 240Mhz. That did not seem to affect the crashes.

Interestingly, there was an additional crash I hadn't noticed before that was a bit different from the others:

Code: Select all

...
I (111952) example: Connected.
I (111952) example: Performing the SSL/TLS handshake...
ets Jun  8 2016 00:22:57

rst:0x8 (TG1WDT_SYS_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0x00
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0008,len:8
load:0x3fff0010,len:4400
load:0x40078000,len:11092
ho 0 tail 12 room 4
load:0x40080000,len:252
0x40080000: _iram_start at ??:?

entry 0x40080034
0x40080034: _iram_start at ??:?

W (95) boot: PRO CPU has been reset by WDT.
W (95) boot: WDT reset info: PRO CPU PC=0x4008204a
0x4008204a: .check_dport_access_end at xtensa_vectors.o:?

W (96) boot: WDT reset info: APP CPU PC=0x7a7fd376
I (114) boot: ESP-IDF v2.0-rc1-902-ge45a2885 2nd stage bootloader
I (135) boot: compile time 13:03:32
I (148) boot: Enabling RNG early entropy source...
I (165) boot: SPI Speed      : 40MHz
I (178) boot: SPI Mode       : DIO
I (191) boot: SPI Flash Size : 4MB
I (203) boot: Partition Table:
I (215) boot: ## Label            Usage          Type ST Offset   Length
I (237) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (261) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (284) boot:  2 factory          factory app      00 00 00010000 00100000
I (307) boot: End of partition table
I (320) boot: Disabling RNG early entropy source...
I (337) boot: Loading app partition at offset 00010000
So, there seems to be two types of crashes - both having to do with the WatchDogTimer:
  1. "rst:0x8 (TG1WDT_SYS_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)" / "PRO CPU has been reset by WDT"
  2. "Guru Meditation Error: Core 0 panic'ed (Interrupt wdt timeout on CPU1)"
Last edited by mgleason_3 on Sun Jun 11, 2017 6:58 am, edited 1 time in total.

mgleason_3
Posts: 44
Joined: Mon Nov 07, 2016 5:04 pm

Re: Crash running https_request example

Postby mgleason_3 » Sat Jun 10, 2017 11:31 pm

Tried adding vTaskDelay() in a couple spots. Seemed a little better but not much. Unfortunately, I suspect something needs to be done in the esp_idf and I don't want to go monkeying there yet...

Also, increased the "Interrupt watchdog timeout (ms)" from 300 to 5000. It seems to go a bit longer before crashing, but I can't say it fixes the problem.

And got a crash in a different location (outside of the SSL Handshake):

Code: Select all

I (59402) example: Verifying peer X.509 certificate...
I (59402) example: Certificate verified.
I (59412) example: Writing HTTP request...
I (59412) example: 111 bytes written
I (59412) example: Reading HTTP response...
I (59432) example: 145 bytes read
HTTP/1.0 200 OK
Content-Type: application/json; charset=UTF-8
Date: Sat, 10 Jun 2017 23:21:55 GMT
Guru Meditation Error: Core  0 panic'ed (Interrupt wdt timeout on CPU0)
Register dump:
PC      : 0x40083c4c  PS      : 0x00060034  A0      : 0x80084178  A1      : 0x3ffb0530  
0x40083c4c: vPortCPUAcquireMutex at /Users/fred/Dev/esp/esp-idf/components/freertos/./port.c:427

A2      : 0x3ffbca1c  A3      : 0x00060023  A4      : 0x00060021  A5      : 0x3ffb89b0  
A6      : 0x0000000f  A7      : 0x00000000  A8      : 0x55416a46  A9      : 0x00000046  
A10     : 0x00060023  A11     : 0x00000000  A12     : 0x801077a9  A13     : 0x3ffb8980  
A14     : 0x3ffaf800  A15     : 0x3ffae3c0  SAR     : 0x00000015  EXCCAUSE: 0x00000005  
EXCVADDR: 0x00000000  LBEG    : 0x400014fd  LEND    : 0x4000150d  LCOUNT  : 0xffffffff  

Backtrace: 0x40083c4c:0x3ffb0530 0x40084178:0x3ffb0550 0x40081778:0x3ffb0570 0x400816d8:0x3ffb05c0 0x400819b9:0x3ffb05e0 0x40081e75:0x3ffb0600
0x40083c4c: vPortCPUAcquireMutex at /Users/fred/Dev/esp/esp-idf/components/freertos/./port.c:427

0x40084178: xQueueGenericSendFromISR at /Users/fred/Dev/esp/esp-idf/components/freertos/./queue.c:2034

0x40081778: pp_post at ??:?

0x400816d8: lmacProcessRxSucData at ??:?

0x400819b9: wDev_ProcessFiq at ??:?

0x40081e75: _xt_lowint1 at xtensa_vectors.o:?


Rebooting...
So, looks like there's 3 types of crashes:
  1. rst:0x8 (TG1WDT_SYS_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)" / "PRO CPU has been reset by WDT
  2. Guru Meditation Error: Core 0 panic'ed (Interrupt wdt timeout on CPU1)
  3. Guru Meditation Error: Core 0 panic'ed (Interrupt wdt timeout on CPU0)
#1 - Occurs during "Performing the SSL/TLS handshake..."
#2 - Occurs during "Performing the SSL/TLS handshake..."
#3 - Occurs during "Reading HTTP response..." when printing the response.

f.h-f.s.
Posts: 214
Joined: Thu Dec 08, 2016 2:53 pm

Re: Crash running https_request example

Postby f.h-f.s. » Sun Jun 11, 2017 1:10 pm

Have you made any changes to the example?
Try running FreeRTOS only on core one and pin your tasks to CPU0.
Are there any shared resources across tasks on different cores? Because to me, it looks like the tasks are sharing a Queue?
pp_post is some sort of rom component?

ESP_igrr
Posts: 2071
Joined: Tue Dec 01, 2015 8:37 am

Re: Crash running https_request example

Postby ESP_igrr » Sun Jun 11, 2017 3:54 pm

Based on the varying crash cause, this looks very much like a brownout... unfortunately brownout detector support hasn't been merged yet, so hard to tell for sure.
Do you have a different USB cable to try? It doesn't happen very often, but I remember there were a few reports of brownout like issues with some USB cables.

mgleason_3
Posts: 44
Joined: Mon Nov 07, 2016 5:04 pm

Re: Crash running https_request example

Postby mgleason_3 » Mon Jun 12, 2017 1:18 am

@f.h-f.s.
>> Have you made any changes to the example?
Minor - recently in https_request_example_main.c:
  1. Commented out line 302 to avoid buffer overflow and so much output // putchar(buf);
  2. Changed the delay on Line 320 to get faster iterations
    from: vTaskDelay(1000 / portTICK_PERIOD_MS);
    to: vTaskDelay(100 / portTICK_PERIOD_MS);

>> Try running FreeRTOS only on core one and pin your tasks to CPU0.
Sorry, how would I do this? I see the menuconfig option "Run FreeRTOS only on first core" which I understand is CPU0 (AKA the "PRO CPU") - but that would put it on core-zero rather than core-one as you suggest. Also, based on viewtopic.php?t=802 I'm not clear whether there's value in running FreeRTOS on just one CPU - outside of the memory savings. But let me know if I'm wrong.

When I get a few minutes, I'll try pinning the "https_get_task" to CPU1 (AKA the "APP CPU") to see what happens.

>> Are there any shared resources across tasks on different cores? Because to me, it looks like the tasks are sharing a Queue?
No, just running the (largely) unmodified example - so there shouldn't be any shared resources.

>> pp_post is some sort of rom component?
Don't know - "https-request.map" seems to indicate its a literal in libpp.a
.literal.ic_pp_post
0x0000000000000000 0x0 /Users/fred/Dev/esp/esp-idf/components/esp32/lib/libpp.a(if_hwctrl.o)

mgleason_3
Posts: 44
Joined: Mon Nov 07, 2016 5:04 pm

Re: Crash running https_request example

Postby mgleason_3 » Mon Jun 12, 2017 2:56 am

@ESP_igrr
>> Based on the varying crash cause, this looks very much like a brownout... unfortunately brownout detector support hasn't been merged yet, so hard to tell for sure.
>>Do you have a different USB cable to try? It doesn't happen very often, but I remember there were a few reports of brownout like issues with some USB cables.

Tried 3 different cables on 2 different laptops (a MacBook Pro and a Dell Windows Laptop). Unfortunately, I don't see that changing cables or laptops made any significant difference. I've got another development module (also based on the WROOM-32) on order which should arrive tomorrow in case that's the issue. But I'm skeptical that's the problem - see "Characterization" below.

Since the vast majority are due to a wdt timeout, I wonder if it's not an issue with WatchDog Timers timing out?

Characterization
Did a bit more characterization of the crashes. Added a while loop in app_main to print the time every 100ms. Also, I believe I'd set the "Interrupt watchdog timeout" to 5000 instead of 300. Here are the results from running that for a while:
  1. Completed cycles: 674
    This is the number of times the "Starting again!" message was printed.
  2. Guru Meditation Error: Core 0 panic'ed (Interrupt wdt timeout on CPU1): 24
    Always occurred 5-8 seconds after the following was printed: "Performing the SSL/TLS handshake..."
  3. Guru Meditation Error: Core 0 panic'ed (Interrupt wdt timeout on CPU0): 24
    Always occurred essentially immediately during the printing of the HTTP response. i.e. There were no times printed by the loop in app-main, the following the following had already printed: "Reading HTTP response...", "... bytes read" and the HTTPresponse was being printed - e.g. "HTTP/1.0 200 OK..."
  4. Guru Meditation Error of type LoadProhibited occurred on core 0: 1
    Occured at the same time as 3. above (during the printing of the HTTP response)
    [*} Guru Meditation Error of type LoadProhibited occurred on core 1: 2
    Occured at the same time as 3. above (during the printing of the HTTP response)
  5. rst:0x8 ( TG1WDT_SYS_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT): 2
    0ccurred 0.7 sec. and 1.6 sec. after the following was printed: "Performing the SSL/TLS handshake..."
Also, it's suspicious to that each type of crash happens in same spot each time - although I suppose it's possible that something's always happening during "Performing the SSL/TLS handshake..." that causes a brownout. But, it only happens in 3.6% of the cycles. Also, the brownout's would have to be able to cause the processor think it's received a "Interrupt wdt timeout" (is that even possible?) - unless I'm misreading what that message means.
Last edited by mgleason_3 on Mon Jun 12, 2017 6:16 pm, edited 1 time in total.

f.h-f.s.
Posts: 214
Joined: Thu Dec 08, 2016 2:53 pm

Re: Crash running https_request example

Postby f.h-f.s. » Mon Jun 12, 2017 6:10 am

Sorry i meant CPU0 =P

Who is online

Users browsing this forum: yimeng and 66 guests