ESP32 Bootloop

lvarasquim
Posts: 3
Joined: Thu May 23, 2019 7:50 pm

ESP32 Bootloop

Postby lvarasquim » Thu May 23, 2019 8:44 pm

Hi
I've got a rather strange issue.
My coworker and I are working in the same project. It is a custom board with a u-blox NINA-W10 module, which is basically an ESP32 module with 2MB flash.

Code: Select all

Chip is ESP32D0WDQ6 (revision 1)
Features: WiFi, BT, Dual Core, 240MHz, VRef calibration in efuse, Coding Scheme None
He is using Windows 10, I'm using Ubuntu:

Code: Select all

$ uname -a
Linux pop-os-varaska 4.15.0-50-generic #54-Ubuntu SMP Mon May 6 18:46:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
When loading the image compiled on his machine, it works.
When loading the image compiled on my machine, I get a bootloop.

I even recompiled the toolchain on my machine:

Code: Select all

Toolchain path: /home/lvarasquim/esp/crosstool-NG/builds/xtensa-esp32-elf/bin/xtensa-esp32-elf-gcc
Toolchain version: crosstool-ng-1.22.0-80-g6c4433a5
Compiler version: 5.2.0
Python requirements from /home/lvarasquim/esp/esp-idf/requirements.txt are satisfied.
The following binary files are the same:
build/ota_data_initial.bin
build/partitions.bin

The following binary files are different:
build/MY_PROJECT.bin
build/bootloader/bootloader.bin

Images are signed using the exactly same file "secure_boot_signing_key.pem".

For the esp-idf, we're working on this commit:

Code: Select all

commit 6b3da6b1882f3b72e904cc90be67e9c4e3f369a9
Merge: cb7aaae35 93b4c7159
Author: Jiang Jiang Jian <jack@espressif.com>
Date:   Wed Apr 17 22:09:23 2019 +0800

    Merge branch 'bugfix/fix_signal_test_bug_v3.3' into 'release/v3.3'
For the submodules:

Code: Select all

 55efc179b76139c8f9b44bf22a4aba4803f7a7bd components/asio/asio (asio-1-12-0-57-g55efc179)
 299183238ffe7a3e6a5ca0af9db19c10eaca62cf components/aws_iot/aws-iot-device-sdk-embedded-C (v2.1.1-59-g2991832)
 70d6a277d7adb468fc4dbeed92a64144569bf080 components/bt/lib (heads/master)
 6468887a12666f88b8704d797fc176cd4f40ee4c components/coap/libcoap (v4.1.1-401-g6468887)
 fbe87f4076739cd3aa4cc7f9cd668c152279709e components/esp32/lib (fbe87f4)
 9ad444a6e06e58833d5e6044c1d5f3eb3dd56023 components/esptool_py/esptool (v2.6)
 968b8cc46dbee47b83318d5f31a8e7907199614b components/expat/expat (R_2_2_5)
 7cc52f60356909b3dd260304c7c50c0693699353 components/json/cJSON (v1.7.1)
 70170c28c844a4786e75efc626e1aeebc93caebc components/libsodium/libsodium (1.0.12)
 fe9a58cb7abdc74e7a2196b7c4e0a29ff9042f88 components/lwip/lwip (STABLE-2_0_3_RELEASE-59-gfe9a58cb)
 19eb57f7f7b2f4312d497ddbcb5f104fc6877c70 components/mbedtls/mbedtls (mbedtls-2.13.1-1-g19eb57f7)
 d037ec89546fad14b5c4d5456c2e23a71e554966 components/micro-ecc/micro-ecc (v1.0)
 f08f3b678717865234637164a29ed3a63e756ca7 components/mqtt/esp-mqtt (remotes/origin/feature/kconfig-55-gf08f3b6)
 3bcc416e13cc790e2fb45fcfe9111d38609c5032 components/nghttp/nghttp2 (v1.24.0)
 22464fe5a0a10f2b077eaba109ce1e912e4a77de components/nghttp/nghttp2/third-party/mruby (1.0.0-2902-g22464fe5)
 da5c2ab419a3bb8a4cc6c37a6c7f3e4bd4b41134 components/nghttp/nghttp2/third-party/neverbleed (remotes/origin/kazuho/change-file-ownership-on-setuid-26-gda5c2ab)
 dac1a65feac4ad72f612aab99f487056fbcf5c1a components/protobuf-c/protobuf-c (v1.3.0)
 f5e26c4e933189593a71c6b82cda381a7b21e41c components/spiffs/spiffs (0.2-221-gf5e26c4)
 7d2bf62b7e6afaf38153041a9d53c21aeeca9a25 components/unity/unity (v2.4.3-51-g7d2bf62)
 7e8e249990ec491ec15990cf95b6d871a66cf64a examples/build_system/cmake/import_lib/main/lib/tinyxml2 (2.0.2-659-g7e8e249)
With no trace enabled, there is the following output:

Code: Select all

rst:0x1 (POWERON_RESET),boot:0x33 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:160
ho 0 tail 12 room 4
load:0x40078000,len:9204
load:0x40080400,len:5728
entry 0x4008066c
ets Jun  8 2016 00:22:57

rst:0xc (SW_CPU_RESET),boot:0x33 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:160
ho 0 tail 12 room 4
load:0x40078000,len:9204
load:0x40080400,len:5728
entry 0x4008066c
ets Jun  8 2016 00:22:57

If I enable the "Default log verbosity (Verbose)", I get the following (just the end of the log):

Code: Select all

D (1073) memory_layout: Available memory region 0x4009c000 - 0x4009e000
V (1080) memory_layout: Examining memory region 0x4009e000 - 0x400a0000
D (1087) memory_layout: Available memory region 0x4009e000 - 0x400a0000
I (1094) heap_init: Initializing. RAM available for dynamic allocation:
D (1101) heap_init: New heap initialised at 0x3ffaff10
I (1106) heap_init: At 3FFAFF10 len 000000F0 (0 KiB): DRAM
D (1112) heap_init: New heap initialised at 0x3ffb6388
I (1117) heap_init: At 3FFB6388 len 00001C78 (7 KiB): DRAM
D (1124) heap_init: New heap initialised at 0x3ffb9a20
I (1129) heap_init: At 3FFB9A20 len 00004108 (16 KiB): DRAM
I (1135) heap_init: At 3FFBDB5C len 00000004 (0 KiB): DRAM
D (1141) heap_init: New heap initialised at 0x3ffcdcf8
I (1146) heap_init: At 3FFCDCF8 len 00012308 (72 KiB): DRAM
I (1153) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (1159) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
D (1166) heap_init: New heap initialised at 0x40091bc8
I (1171) heap_init: At 40091BC8 len 0000E438 (57 KiB): IRAM
I (1177) cpu_start: Pro cpu start user code
D (1189) clk: RTC_SLOW_CLK calibration value: 3516787
V (192) intr_alloc: esp_intr_alloc_intrstatus (cpu 0): checking args
V (192) intr_alloc: esp_intr_alloc_intrstatus (cpu 0): Args okay. Resulting flags 0xE
D (197) intr_alloc: Connected src 46 to int 2 (cpu 0)
V (203) intr_alloc: esp_intr_alloc_intrstatus (cpu 0): checking args
V (208) intr_alloc: esp_intr_alloc_intrstatus (cpu 0): Args okay. Resulting flags 0xC0E
D (217) intr_alloc: Connected src 57 to int 3 (cpu 0)
V (222) esp_dbg_stubs: esp_dbg_stubs_init stubs 3ffcbfb4
V (228) intr_alloc: esp_intr_alloc_intrstatus (cpu 0): checking args
V (233) intr_alloc: esp_intr_alloc_intrstatus (cpu 0): Args okay. Resulting flags 0x40E
D (242) intr_alloc: Connected src 24 to int 9 (cpu 0)
I (247) cpu_start: Starting scheduler on PRO CPU.
V (0) intr_alloc: esp_intr_alloc_intrstatus (cpu 1): checking args
V (0) intr_alloc: esp_intr_alloc_intrstatus (cpu 1): Args okay. Resulting flags 0x40E
D (10) intr_alloc: Connected src 25 to int 2 (cpu 1)
I (10) cpu_start: Starting scheduler on APP CPU.
D (297) heap_init: New heap initialised at 0x3ffe0440
D (297) heap_init: New heap initialised at 0x3ffe4350
V (307) intr_alloc: esp_intr_alloc_intrstatus (cpu 0): checking args
V (307) intr_alloc: esp_intr_�ets Jun  8 2016 00:22:57

rst:0xc (SW_CPU_RESET),boot:0x33 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:160
ho 0 tail 12 room 4
load:0x40078000,len:9204
load:0x40080400,len:5728
entry 0x4008066c
If I enable the "Panic handler behaviour: (Invoke GDB Stub)" and print the backtrace:

Code: Select all

rst:0x1 (POWERON_RESET),boot:0x33 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:160
ho 0 tail 12 room 4
load:0x40078000,len:9204
load:0x40080400,len:5728
entry 0x4008066c
���2����ʂ��т�*���
����2�������т�"��������2����"���т�"��’҂��2����ɂ��т��2��҂��2����тj��H���W��-���,�.]]��now.
$T04#b8GNU gdb (crosstool-NG crosstool-ng-1.22.0-80-g6c4433a5) 7.10
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-build_pc-linux-gnu --target=xtensa-esp32-elf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /media/lvarasquim/FILES/projetos/pbl/build/PBL_TEST.elf...done.
Remote debugging using /dev/ttyUSB0
0x40095274 in ?? ()
(gdb) bt
#0  0x40095274 in ?? ()
#1  0x4008fe03 in vTaskSwitchContext () at /home/lvarasquim/esp/esp-idf/components/freertos/tasks.c:2827
#2  0x4008f009 in xQueueGenericCreate (uxQueueLength=1, uxItemSize=0, ucQueueType=1 '\001')
    at /home/lvarasquim/esp/esp-idf/components/freertos/queue.c:392
#3  0x4008f23b in xQueueCreateMutex (ucQueueType=1 '\001')
    at /home/lvarasquim/esp/esp-idf/components/freertos/queue.c:499
#4  0x400ea43c in nvs::Lock::init () at /home/lvarasquim/esp/esp-idf/components/nvs_flash/src/nvs_platform.hpp:47
#5  nvs_flash_init_partition (part_name=0x3f403ab8 "nvs")
    at /home/lvarasquim/esp/esp-idf/components/nvs_flash/src/nvs_api.cpp:144
#6  0x400ea49d in nvs_flash_init () at /home/lvarasquim/esp/esp-idf/components/nvs_flash/src/nvs_api.cpp:165
#7  0x400dbcae in BLU_SPP_INIT () at /media/lvarasquim/FILES/projetos/pbl/main/pbl_main_task.c:830
#8  0x400dc785 in PBL_MAIN_TASK () at /media/lvarasquim/FILES/projetos/pbl/main/pbl_main_task.c:904
#9  0x4008f6bc in vPortTaskWrapper (pxCode=0x400dc74c <PBL_MAIN_TASK>, pvParameters=0x0)
    at /home/lvarasquim/esp/esp-idf/components/freertos/port.c:143
(gdb)
YES, we've tried different cables, power sources and boards, this doesn't seem to be a hardware issue.
I also deleted the esp folder, checked it out again, recompiled the toolchain. Out project is on Git, the source code is the same. All but the BUILD folder is versioned.

Any help is appreciated.

ESP_Sprite
Posts: 9766
Joined: Thu Nov 26, 2015 4:08 am

Re: ESP32 Bootloop

Postby ESP_Sprite » Fri May 24, 2019 3:55 am

Just for shits and giggles, can you try flashing your colleagues binaries on your board/computer and see if that works? Hint: if you run 'make' without arguments, the build system will tell you the correct command line to manually flash the binaries.

WiFive
Posts: 3529
Joined: Tue Dec 01, 2015 7:35 am

Re: ESP32 Bootloop

Postby WiFive » Fri May 24, 2019 7:32 am

Did you check a bdiff of the bin files and a diff of the objdumps?

lvarasquim
Posts: 3
Joined: Thu May 23, 2019 7:50 pm

Re: ESP32 Bootloop

Postby lvarasquim » Fri May 24, 2019 11:50 am

Thanks for the quick response @ESP_Sprite and @WiFive.

Yes, I tried flashing my colleague binaries manually, it works. That makes me think there's something wrong with my compilation process, I just can't spot the difference.

Regarding the diff, I noticed an option called "Application Manager --> [*] Use date/time stamp for app" is set. We'll try compiling with that option disabled and check the diff.

Meanwhile, the most noticeable difference on the objdump is the start address, shouldn't those be the same?

Code: Select all

lvarasquim@pop-os-varaska:build$ xtensa-esp32-elf-objdump -f MINE.elf HIS.elf 

MINE.elf:     file format elf32-xtensa-le
architecture: xtensa, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x4008100c

HIS.elf:     file format elf32-xtensa-le
architecture: xtensa, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x40080e80
Let me know if you need the objdump with other options.

lvarasquim
Posts: 3
Joined: Thu May 23, 2019 7:50 pm

Re: ESP32 Bootloop

Postby lvarasquim » Wed May 29, 2019 7:41 pm

Digging a little deeper, the results of $ xtensa-esp32-elf-objdump -d on both ELF files

mine:

Code: Select all

40080000 <_WindowOverflow4>:
40080040 <_WindowUnderflow4>:
40080050 <_xt_alloca_exc>:
40080080 <_WindowOverflow8>:
400800c0 <_WindowUnderflow8>:
40080100 <_WindowOverflow12>:
40080140 <_WindowUnderflow12>:
40080180 <_Level2Vector>:
400801c0 <_Level3Vector>:
40080200 <_Level4Vector>:
40080240 <_Level5Vector>:
40080280 <_DebugExceptionVector>:
400802c0 <_NMIExceptionVector>:
40080300 <_KernelExceptionVector>:
40080340 <_UserExceptionVector>:
400803c0 <_DoubleExceptionVector>:
40080400 <esp_ota_get_app_elf_sha256-0x8b4>:
40080cb4 <esp_ota_get_app_elf_sha256>:
40080d08 <esp_restart_noos>: ----------------------------------- THIS IS WHERE IT STARTS DIFFERING
40080e74 <esp_restart>:
40080e94 <start_cpu0_default>:
40080f9c <start_cpu1_default>:
40080fc8 <call_start_cpu1>:
40081028 <call_start_cpu0>:
40081114 <esp_cache_err_get_cpuid>:
4008113c <tick_hook>:
40081188 <timer_list_lock>:
40081194 <timer_list_unlock>:
400811a0 <timer_insert>:
40081210 <timer_alarm_handler>:

his:

Code: Select all

40080000 <_WindowOverflow4>:
40080040 <_WindowUnderflow4>:
40080050 <_xt_alloca_exc>:
40080080 <_WindowOverflow8>:
400800c0 <_WindowUnderflow8>:
40080100 <_WindowOverflow12>:
40080140 <_WindowUnderflow12>:
40080180 <_Level2Vector>:
400801c0 <_Level3Vector>:
40080200 <_Level4Vector>:
40080240 <_Level5Vector>:
40080280 <_DebugExceptionVector>:
400802c0 <_NMIExceptionVector>:
40080300 <_KernelExceptionVector>:
40080340 <_UserExceptionVector>:
400803c0 <_DoubleExceptionVector>:
40080400 <esp_ota_get_app_elf_sha256-0x8b4>:
40080cb4 <esp_ota_get_app_elf_sha256>:
40080d08 <start_cpu0_default>: ***-----------------------------------------------------------
40080e10 <start_cpu1_default>:
40080e3c <call_start_cpu1>:
40080e9c <call_start_cpu0>:
40080f88 <malloc_internal_wrapper>:
40080f9c <timer_arm_us_wrapper>:
40080fb0 <timer_disarm_wrapper>:
40080fbc <task_yield_from_isr_wrapper>:
40080fc4 <wifi_int_restore_wrapper>:
40080fd4 <wifi_int_disable_wrapper>:
40080fe8 <semphr_give_from_isr_wrapper>:
40080ffc <semphr_take_from_isr_wrapper>:
...
4008111a <ld_include_panic_highint_hdl>:
4008111c <tick_hook>:
40081168 <esp_restart_noos>: ***------------------------------------------
400812d4 <esp_restart>:
400812f4 <esp_dport_access_stall_other_cpu_start>:
4008138c <esp_dport_access_stall_other_cpu_end>:
Any ideas on why these are placed on such different locations?

Who is online

Users browsing this forum: No registered users and 138 guests