Bootlooping once the program exceeds a specific size

akasaka_spk
Posts: 18
Joined: Tue Oct 08, 2024 9:11 am
Location: Sapporo, Japan

Post by akasaka_spk » Mon Dec 02, 2024 10:18 am

So, my program has grown very close to the per-partition size limit (1572864 bytes, i.e. 1.5MB, per app partition) of a 4MB flash with OTA.

However, even though it technically fits, for some reason it won't boot after surpassing some specific size.

E.g. currently I have:

Checking size .pio\build\AKI_K875\firmware.elf
Advanced Memory Usage is available via "PlatformIO Home > Project Inspect"
RAM:   [==        ]  19.7% (used 64588 bytes from 327680 bytes)
Flash: [==========]  99.7% (used 1568765 bytes from 1572864 bytes)
CURRENT: upload_protocol = esptool
Looking for upload port...
Auto-detected: COM4
Uploading .pio\build\AKI_K875\firmware.bin
esptool.py v4.5.1
Serial port COM4
Connecting.......
Chip is ESP32-D0WD-V3 (revision v3.1)
Features: WiFi, BT, Dual Core, 240MHz, VRef calibration in efuse, Coding Scheme None
Crystal is 40MHz
MAC: a8:42:e3:ae:9b:bc
Uploading stub...
Running stub...
Stub running...
Changing baud rate to 921600
Changed.
Configuring flash size...
Flash will be erased from 0x00001000 to 0x00005fff...
Flash will be erased from 0x00008000 to 0x00008fff...
Flash will be erased from 0x0000e000 to 0x0000ffff...
Flash will be erased from 0x00010000 to 0x00190fff...
Compressed 17536 bytes to 12202...
Writing at 0x00001000... (100 %)
Wrote 17536 bytes (12202 compressed) at 0x00001000 in 0.4 seconds (effective 334.0 kbit/s)...
Hash of data verified.
Compressed 3072 bytes to 145...
Writing at 0x00008000... (100 %)
Wrote 3072 bytes (145 compressed) at 0x00008000 in 0.1 seconds (effective 446.5 kbit/s)...
Hash of data verified.
Compressed 8192 bytes to 47...
Writing at 0x0000e000... (100 %)
Wrote 8192 bytes (47 compressed) at 0x0000e000 in 0.1 seconds (effective 613.8 kbit/s)...
Hash of data verified.
Compressed 1575056 bytes to 952743...
Writing at 0x00010000... (1 %)
...
Writing at 0x0018fa51... (100 %)
Wrote 1575056 bytes (952743 compressed) at 0x00010000 in 16.4 seconds (effective 768.8 kbit/s)...
Hash of data verified.

Leaving...
Hard resetting via RTS pin...
--- Terminal on COM4 | 115200 8-N-1
ets Jul 29 2019 12:21:46

rst:0x3 (SW_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:13232
load:0x40080400,len:3028
entry 0x400805e4
ets Jul 29 2019 12:21:46

rst:0x3 (SW_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:13232
load:0x40080400,len:3028
entry 0x400805e4
ets Jul 29 2019 12:21:46

... many more of the same thing ...

So then I go to my device config header and turn off a feature flag. And surprisingly it starts to work!

Checking size .pio\build\AKI_K875\firmware.elf
Advanced Memory Usage is available via "PlatformIO Home > Project Inspect"
RAM:   [==        ]  19.5% (used 63940 bytes from 327680 bytes)
Flash: [==========]  98.5% (used 1549509 bytes from 1572864 bytes)
Building .pio\build\AKI_K875\firmware.bin
esptool.py v4.5.1
Creating esp32 image...
Merged 26 ELF sections
Successfully created esp32 image.
CURRENT: upload_protocol = esptool
Looking for upload port...
Auto-detected: COM4
Uploading .pio\build\AKI_K875\firmware.bin
esptool.py v4.5.1
Serial port COM4
Connecting....
Chip is ESP32-D0WD-V3 (revision v3.1)
Features: WiFi, BT, Dual Core, 240MHz, VRef calibration in efuse, Coding Scheme None
Crystal is 40MHz
MAC: a8:42:e3:ae:9b:bc
Uploading stub...
Running stub...
Stub running...
Changing baud rate to 921600
Changed.
Configuring flash size...
Flash will be erased from 0x00001000 to 0x00005fff...
Flash will be erased from 0x00008000 to 0x00008fff...
Flash will be erased from 0x0000e000 to 0x0000ffff...
Flash will be erased from 0x00010000 to 0x0018bfff...
Compressed 17536 bytes to 12202...
Writing at 0x00001000... (100 %)
Wrote 17536 bytes (12202 compressed) at 0x00001000 in 0.4 seconds (effective 327.9 kbit/s)...
Hash of data verified.
Compressed 3072 bytes to 145...
Writing at 0x00008000... (100 %)
Wrote 3072 bytes (145 compressed) at 0x00008000 in 0.1 seconds (effective 427.2 kbit/s)...
Hash of data verified.
Compressed 8192 bytes to 47...
Writing at 0x0000e000... (100 %)
Wrote 8192 bytes (47 compressed) at 0x0000e000 in 0.1 seconds (effective 615.7 kbit/s)...
Hash of data verified.
Compressed 1555808 bytes to 941391...
Writing at 0x00010000... (1 %)
...
Writing at 0x001894ca... (100 %)
Wrote 1555808 bytes (941391 compressed) at 0x00010000 in 16.1 seconds (effective 772.3 kbit/s)...
Hash of data verified.
Leaving...
Hard resetting via RTS pin...
--- Terminal on COM4 | 115200 8-N-1
[   473][I][esp32-hal-psram.c:96] psramInit(): PSRAM enabled
[   503][I][main.cpp:289] setup(): [APL_MAIN] Hello!
[   505][I][akizuki_k875.cpp:88] initialize(): [SWEEP] Expected pixel clock = 250000
[   643][I][prefs.cpp:14] init_store_if_needed(): [PREF] Initialize
[   649][E][Preferences.cpp:483] getString(): nvs_get_str len fail: v_lic NOT_FOUND
[   656][I][yukkuri.cpp:12] Yukkuri(): [AQTK] MAC = A8:42:E3:AE:9B:BC
[   662][I][yukkuri.cpp:13] Yukkuri(): [AQTK] License: XXX-XXX-XXX
[   669][I][yukkuri.cpp:23] Yukkuri(): [AQTK] Init OK
[   673][I][main.cpp:118] bringup_sound(): [APL_MAIN] Sound subsystem ready
[   694][I][main.cpp:309] setup(): [APL_MAIN] File system mounted
[   768][I][font.cpp:161] load_font_from_file(): [FONT] Load /disk/font/misaki_mincho.mofo
[   802][I][font.cpp:98] load_font_from_file_handle(): [FONT] Decompressing RngZ
[   836][I][font.cpp:145] load_font_from_file_handle(): [FONT] Decompressing BMPZ
[   863][I][font.cpp:156] load_font_from_file_handle(): [FONT] Got font: encoding=1, glyphfmt=0, cursor=5f, invalid=25c6, w=8, h=8, range_cnt=4161, data=0x3f81d1d8, ranges=0x3f80bbd4
[   868][I][font.cpp:161] load_font_from_file(): [FONT] Load /disk/font/jiskan16.mofo
[   911][I][font.cpp:98] load_font_from_file_handle(): [FONT] Decompressing RngZ
[   994][I][font.cpp:145] load_font_from_file_handle(): [FONT] Decompressing BMPZ
[  1074][I][font.cpp:156] load_font_from_file_handle(): [FONT] Got font: encoding=1, glyphfmt=0, cursor=5f, invalid=25c6, w=16, h=16, range_cnt=6342, data=0x3f857e64, ranges=0x3f82b208
[  1122][I][localize.cpp:97] _load_lang_map_if_needed(): [LOCA] Loaded language ID=0, Entries: 9
[  1124][I][main.cpp:339] setup(): [APL_MAIN] setup end.
[  1124][I][main.cpp:213] boot_task(): [APL_MAIN] LePIS-OS v5.3 is in da house now!!

... remainder of completely normal boot process ...

What I normally would do is go and disable non-essential/experimental features when I'm not using a devkit with a big (8MB/16MB) flash chip, or deprecate and delete old code. However, sadly this is now production firmware and it has to fit the new code too :/

Interestingly enough, it happens on all of my 4MB ESP32 WROVER modules, no matter the origin. Experimentally I've deduced that things go bonkers right around the 99.5% mark; the exact byte count at which it breaks varies by a few bytes.

My platformio.ini file is here: https://github.com/vladkorotnev/plasma- ... formio.ini
And the partition map is here: https://github.com/vladkorotnev/plasma- ... itions.csv

What am I doing wrong here?

boarchuz
Posts: 619
Joined: Tue Aug 21, 2018 5:28 am

Re: Bootlooping once the program exceeds a specific size

Post by boarchuz » Mon Dec 02, 2024 12:24 pm

"Flash will be erased from 0x00010000 to 0x00190fff..."
That's weird. Looks like the erase is off by one; maybe the write is too?

Can you try:
1. Increase bootloader log verbosity to hopefully capture some output telling you why it's resetting. After the monitor has started, press the reset button to get output from a clean reset and share it here. Do the same with the working binary for comparison.
2. Confirm that the bin file in your build output matches the reported size in your log and that it's <=0x180000 bytes.
3. Use esptool to manually erase the region from 0x10000 to 0x18ffff, then use esptool to manually write your binary to 0x10000.
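
Steps 2 and 3 could be sketched in Python roughly like this. It is only a sketch: the port and file path are taken from the log above, the helper names are made up for illustration, and it assumes the esptool v4.x subcommands `erase_region <address> <size>` and `write_flash <address> <file>`. It only builds and prints the command lines, so nothing is erased until you run them yourself.

```python
APP_OFFSET = 0x10000   # app partition start, per the upload log
APP_SIZE = 0x180000    # app partition size (1572864 bytes)


def image_fits(image_size, partition_size=APP_SIZE):
    """Step 2: the .bin must not be larger than the partition."""
    return image_size <= partition_size


def esptool_commands(port, image_path, offset=APP_OFFSET, size=APP_SIZE):
    """Step 3: command lines to erase the partition, then write the image."""
    base = ["esptool.py", "--chip", "esp32", "--port", port]
    erase = base + ["erase_region", hex(offset), hex(size)]
    write = base + ["write_flash", hex(offset), image_path]
    return [erase, write]


# Sizes of the two images as reported in the upload logs.
print("broken fits: ", image_fits(1575056))
print("working fits:", image_fits(1555808))

for cmd in esptool_commands("COM4", r".pio\build\AKI_K875\firmware.bin"):
    print(" ".join(cmd))
```

Erasing 0x180000 bytes starting at 0x10000 covers exactly 0x10000 to 0x18ffff, which is the region named in step 3.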

akasaka_spk

Re: Bootlooping once the program exceeds a specific size

Post by akasaka_spk » Mon Dec 02, 2024 1:52 pm

Sadly I cannot adjust anything about the bootloader, as it's precompiled in the esp32-arduino framework.

However, all of the addresses in the working version seem to line up, with no difference from the broken one:

rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:13232
load:0x40080400,len:3028
entry 0x400805e4
[   473][I][esp32-hal-psram.c:96] psramInit(): PSRAM enabled
[   501][I][main.cpp:289] setup(): [APL_MAIN] Hello!

According to Explorer, firmware.bin is 1,575,056 bytes, which is 0x180890. Huh, that is over the 0x180000 limit (by 0x890 bytes)! What gives? (And no, I'm not looking at the "Size on disk" row.)
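
For reference, here is the arithmetic behind those numbers as a small Python sketch. The 0x180000 partition size and 0x10000 offset come from the logs and the reply above; the 4 KB sector size used to reproduce the erase range is an assumption about how the erase gets rounded, and the function names are made up for illustration.

```python
PARTITION_OFFSET = 0x10000
PARTITION_SIZE = 0x180000   # 1572864 bytes, the limit PlatformIO reports
SECTOR = 0x1000             # assumed flash erase granularity (4 KB)


def overshoot(image_size):
    """Bytes by which the image exceeds the partition (negative = spare)."""
    return image_size - PARTITION_SIZE


def last_erased_address(image_size):
    """Last address erased if the erase is rounded up to whole sectors."""
    sectors = -(-image_size // SECTOR)   # ceiling division
    return PARTITION_OFFSET + sectors * SECTOR - 1


for size in (1575056, 1555808):   # broken and working firmware.bin
    print(hex(size), overshoot(size), hex(last_erased_address(size)))
```

The broken image comes out 2192 bytes over the partition. Under the sector-rounding assumption the erase ranges end at 0x190fff and 0x18bfff, the same as in the two upload logs, so the erase is probably not off by one; it seems to simply follow the image size.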

lbernstone
Posts: 857
Joined: Mon Jul 22, 2019 3:20 pm

Re: Bootlooping once the program exceeds a specific size

Post by lbernstone » Mon Dec 02, 2024 8:00 pm

TL;DR: assume that you need about 64k more space than the size report shows you.

I think you are seeing two tools underestimating the amount of space used.
1. The arduino-esp32 version uses the compiler "size" tool, which looks at the elf file and determines the size based on the output target system. This is the compiled app size. There's a header with a bit of magic info that looks like it occupies the first 0x10F bytes, which is likely not accounted for. This is the difference between what the xtensa...-size tool gives you and the file size.
2. On the flash, firmware must be aligned to 0x10000 boundaries. This has to do with how the memory gets mapped. So, if you are a single byte over a chunk boundary, that will be truncated on the flash once it is full. If you have extra space at the end because your partition table does not align properly to a 0x10000 boundary, that space is not available (rearrange the table to get it into the nvs)!
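
To put numbers on both points, here is a rough Python sketch using the figures posted earlier in the thread. The function names are made up for illustration, and the 0x10000 chunk size is simply taken from the explanation above.

```python
CHUNK = 0x10000            # 64 KB alignment unit mentioned above
PARTITION_SIZE = 0x180000  # app partition size from the thread

# (size reported by the "Checking size" step, actual firmware.bin size)
BUILDS = {
    "broken": (1568765, 1575056),
    "working": (1549509, 1555808),
}


def hidden_overhead(reported, actual):
    """Bytes present in the .bin that the usage report did not count."""
    return actual - reported


def chunks_needed(size, chunk=CHUNK):
    """Number of whole chunks needed to hold `size` bytes."""
    return -(-size // chunk)  # ceiling division


for name, (reported, actual) in BUILDS.items():
    print(name,
          "overhead:", hidden_overhead(reported, actual),
          "chunks:", chunks_needed(actual),
          "of", PARTITION_SIZE // CHUNK)
```

For these two builds the uncounted part comes out at a bit over 6 KB (6291 and 6299 bytes), and the broken image needs 25 chunks where the partition only has 24.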

If you would like to get this fixed, open an issue on the platformio repo and paste this conversation. It shouldn't take someone familiar with their tool very long to figure out how to adjust the size to give you a more accurate usage estimate.

akasaka_spk

Re: Bootlooping once the program exceeds a specific size

Post by akasaka_spk » Mon Dec 02, 2024 11:42 pm

lbernstone
Thanks for the insight. That is probably what is happening, and I have already opened an issue: https://github.com/platformio/platform- ... ssues/1500 (albeit I forgot to link the thread there; will do in a bit)

IIRC pio allows setting a custom size limit in the project settings. But I might instead work around it with a Python script that reads the partition size and then checks the firmware file size against it.
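
Something along these lines, perhaps. This is a minimal sketch only: the partition table below is a made-up example in the usual ESP-IDF CSV layout (only app0 at 0x10000 with size 0x180000 matches this thread), and the function names are hypothetical. A real version would read the project's own partitions.csv and firmware.bin instead of the embedded sample.

```python
import csv
import io

# Hypothetical partition table; only app0's offset and size come from the thread.
SAMPLE_CSV = """\
# Name,   Type, SubType, Offset,   Size
nvs,      data, nvs,     0x9000,   0x5000
otadata,  data, ota,     0xe000,   0x2000
app0,     app,  ota_0,   0x10000,  0x180000
app1,     app,  ota_1,   0x190000, 0x180000
"""


def parse_size(text):
    """Parse '0x180000', '1536K' or '2M' style values into bytes."""
    text = text.strip()
    multiplier = 1
    if text[-1:].upper() in ("K", "M"):
        multiplier = 1024 if text[-1].upper() == "K" else 1024 * 1024
        text = text[:-1]
    return int(text, 0) * multiplier


def smallest_app_partition(csv_text):
    """Return the size in bytes of the smallest 'app' partition."""
    sizes = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip blank lines, comments and rows without a size column.
        if not row or row[0].lstrip().startswith("#") or len(row) < 5:
            continue
        if row[1].strip() == "app":
            sizes.append(parse_size(row[4]))
    return min(sizes)


def check_firmware(image_size, csv_text=SAMPLE_CSV):
    """True if an image of `image_size` bytes fits every app partition."""
    return image_size <= smallest_app_partition(csv_text)


print(check_firmware(1575056))  # firmware.bin size of the broken build
print(check_firmware(1555808))  # firmware.bin size of the working build
```

Checking against the smallest app partition matters with OTA, since the same image has to fit whichever slot the update lands in. A check like this could run as a CI step right after the build.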

This is kinda sad from a project standpoint though. I assumed I had just got some configuration wrong, but it turns out I really did run out of space! Changing the partition layout isn't really an option, because the rest is also pretty cramped, and there are already deployed devices which update from CI/CD via OTA. So unless I split the tables between hardware types, there is no way other than trimming something down.

Then again, the device that gets the IR sensor (the new feature) is just two new models in the lineup, which are also one of a kind for now. So if I go the different-tables-per-device route, maybe I will just find a bigger flash in my parts bin and drop it on :P
