Issues loading code from RAM on ESP32-P4, but not on ESP32-C3

Post by SyntaxVortex » Fri Jan 24, 2025 6:53 pm

Hi!

For my project, I need to load code from an SD card into RAM and execute it.

Here is the code that I use to test this:

#include <stdio.h>
#include <stdint.h>
#include "esp_heap_caps.h"

void app_main(void) {
  heap_caps_print_heap_info(MALLOC_CAP_EXEC);

  uint32_t* code = heap_caps_aligned_alloc(0x40, 0x40, MALLOC_CAP_32BIT|MALLOC_CAP_EXEC);
  printf("%p\n", code);

  code[0] = 0x01000513; // li a0, 16
  code[1] = 0x00008067; // ret

  // not sure about this
  //esp_cache_msync(code, 0x40, ESP_CACHE_MSYNC_FLAG_DIR_M2C); // 0x40 = cache line size

  // also not sure about this
  __asm__ volatile ("fence");
  __asm__ volatile ("fence.i");

  int (*func_ptr)(void) = (int (*)(void))code;
  int result = func_ptr();
  printf("result = %d\n", result);
}
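
For readers who want to check the two hand-assembled words in the test above, here is a minimal sketch that decodes them as RISC-V I-type instructions on any host. The type and function names (`itype_t`, `bits`, `decode_itype`) are made up for this illustration and are not part of ESP-IDF.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a RISC-V I-type instruction (ADDI, JALR, ...).
 * Hypothetical helper type, for illustration only. */
typedef struct {
    uint32_t opcode;  /* bits 6:0   */
    uint32_t rd;      /* bits 11:7  */
    uint32_t funct3;  /* bits 14:12 */
    uint32_t rs1;     /* bits 19:15 */
    int32_t  imm;     /* bits 31:20, sign-extended */
} itype_t;

/* Extract bits hi..lo (inclusive) from a 32-bit word. */
static uint32_t bits(uint32_t word, unsigned hi, unsigned lo)
{
    assert(hi < 32 && lo <= hi);
    unsigned width = hi - lo + 1;
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (word >> lo) & mask;
}

/* Split an instruction word into its I-type fields. */
static itype_t decode_itype(uint32_t word)
{
    itype_t f;
    f.opcode = bits(word, 6, 0);
    f.rd     = bits(word, 11, 7);
    f.funct3 = bits(word, 14, 12);
    f.rs1    = bits(word, 19, 15);
    /* The immediate is 12 bits wide; sign-extend it by hand. */
    uint32_t imm12 = bits(word, 31, 20);
    f.imm = (imm12 & 0x800u) ? (int32_t)imm12 - 4096 : (int32_t)imm12;
    return f;
}
```

Decoding 0x01000513 gives opcode 0x13 (OP-IMM), rd = x10 (a0), rs1 = x0 and immediate 16, i.e. `addi a0, zero, 16`, which is what `li a0, 16` expands to. Decoding 0x00008067 gives opcode 0x67 (JALR), rd = x0, rs1 = x1 (ra) and immediate 0, i.e. `ret`. So the encodings themselves are fine.
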
I also set the following options using idf.py menuconfig:

ESP_SYSTEM_PMP_IDRAM_SPLIT=n (Enable IRAM/DRAM split protection)
ESP_SYSTEM_MEMPROT_FEATURE=n (Enable memory protection)

On my ESP32-C3 it works fine:
Heap summary for capabilities 0x00000001:
At 0x3fc8d990 len 206448 free 196988 allocated 8376 min_free 196988
largest_free_block 196608 alloc_blocks 36 free_blocks 1 total_blocks 37
At 0x3fcc0000 len 116496 free 115624 allocated 0 min_free 115624
largest_free_block 114688 alloc_blocks 0 free_blocks 1 total_blocks 1
At 0x3fcdc710 len 10576 free 10160 allocated 0 min_free 10160
largest_free_block 9216 alloc_blocks 0 free_blocks 1 total_blocks 1
At 0x5000021c len 7628 free 7248 allocated 0 min_free 7248
largest_free_block 7168 alloc_blocks 0 free_blocks 1 total_blocks 1
Totals:
free 330020 allocated 8376 min_free 330020 largest_free_block 196608
0x4038fec4
result = 16
On the ESP32-P4, the chip I actually plan to use, however, it crashes:
ESP-ROM:esp32p4-eco2-20240710
Build:Jul 10 2024
rst:0x17 (CHIP_USB_UART_RESET),boot:0x20f (SPI_FAST_FLASH_BOOT)
Core0 Saved PC:0x40007d6e
--- 0x40007d6e: uart_ll_get_txfifo_len at /Users/vortex/esp-idf/components/hal/esp32p4/include/hal/uart_ll.h:725
(inlined by) uart_tx_char at /Users/vortex/esp-idf/components/esp_driver_uart/src/uart_vfs.c:170

Core1 Saved PC:0x4ff0360a
--- 0x4ff0360a: esp_cpu_wait_for_intr at /Users/vortex/esp-idf/components/esp_hw_support/cpu.c:57 (discriminator 1)

SPI mode:DIO, clock div:1
load:0x4ff33ce0,len:0x162c
load:0x4ff2abd0,len:0xd70
load:0x4ff2cbd0,len:0x32fc
entry 0x4ff2abda
I (27) boot: ESP-IDF v5.4 2nd stage bootloader
I (28) boot: compile time Jan 24 2025 19:33:31
I (28) boot: Multicore bootloader
I (29) boot: chip revision: v1.0
I (30) boot: efuse block revision: v0.1
I (34) boot.esp32p4: SPI Speed : 80MHz
I (38) boot.esp32p4: SPI Mode : DIO
I (42) boot.esp32p4: SPI Flash Size : 2MB
I (45) boot: Enabling RNG early entropy source...
I (50) boot: Partition Table:
I (52) boot: ## Label Usage Type ST Offset Length
I (59) boot: 0 nvs WiFi data 01 02 00009000 00006000
I (65) boot: 1 phy_init RF data 01 01 0000f000 00001000
I (72) boot: 2 factory factory app 00 00 00010000 00100000
I (79) boot: End of partition table
I (82) esp_image: segment 0: paddr=00010020 vaddr=40020020 size=09f80h ( 40832) map
I (95) esp_image: segment 1: paddr=00019fa8 vaddr=30100000 size=0000ch ( 12) load
I (97) esp_image: segment 2: paddr=00019fbc vaddr=3010000c size=00038h ( 56) load
I (105) esp_image: segment 3: paddr=00019ffc vaddr=4ff00000 size=0601ch ( 24604) load
I (116) esp_image: segment 4: paddr=00020020 vaddr=40000020 size=18c60h (101472) map
I (134) esp_image: segment 5: paddr=00038c88 vaddr=4ff0601c size=07f64h ( 32612) load
I (141) esp_image: segment 6: paddr=00040bf4 vaddr=4ff0df80 size=01d2ch ( 7468) load
I (146) boot: Loaded app from partition at offset 0x10000
I (146) boot: Disabling RNG early entropy source...
I (159) cpu_start: Multicore app
I (168) cpu_start: Pro cpu start user code
I (168) cpu_start: cpu freq: 360000000 Hz
I (169) app_init: Application information:
I (169) app_init: Project name: hello_world
I (173) app_init: App version: 1
I (176) app_init: Compile time: Jan 24 2025 19:33:28
I (181) app_init: ELF file SHA256: 3d3b3380e...
I (185) app_init: ESP-IDF: v5.4
I (189) efuse_init: Min chip rev: v0.1
I (193) efuse_init: Max chip rev: v1.99
I (197) efuse_init: Chip rev: v1.0
I (201) heap_init: Initializing. RAM available for dynamic allocation:
I (207) heap_init: At 4FF113F0 len 00029BD0 (166 KiB): RAM
I (212) heap_init: At 4FF3AFC0 len 00004BF0 (18 KiB): RAM
I (217) heap_init: At 4FF40000 len 00060000 (384 KiB): RAM
I (223) heap_init: At 50108080 len 00007F80 (31 KiB): RTCRAM
I (228) heap_init: At 30100044 len 00001FBC (7 KiB): TCM
I (234) spi_flash: detected chip: generic
I (237) spi_flash: flash io: dio
W (240) spi_flash: Detected size(16384k) larger than the size in the binary image header(2048k). Using the size in the binary image header.
I (252) main_task: Started on CPU0
I (262) main_task: Calling app_main()
Heap summary for capabilities 0x00000001:
At 0x4ff113f0 len 170960 free 156916 allocated 12944 min_free 156916
largest_free_block 155648 alloc_blocks 40 free_blocks 1 total_blocks 41
At 0x4ff3afc0 len 19440 free 18704 allocated 0 min_free 18704
largest_free_block 18432 alloc_blocks 0 free_blocks 1 total_blocks 1
At 0x4ff40000 len 393216 free 391444 allocated 0 min_free 391444
largest_free_block 385024 alloc_blocks 0 free_blocks 1 total_blocks 1
At 0x50108080 len 32640 free 31904 allocated 0 min_free 31904
largest_free_block 31744 alloc_blocks 0 free_blocks 1 total_blocks 1
Totals:
free 598968 allocated 12944 min_free 598968 largest_free_block 385024
0x4ff14b00
Guru Meditation Error: Core 0 panic'ed (Illegal instruction). Exception was unhandled.

--- Stack dump detected
Core 0 register dump:
MEPC : 0x4ff15168 RA : 0x4000bbd0 SP : 0x4ff13800 GP : 0x4ff0e780
--- 0x4000bbd0: app_main at /Users/vortex/esp/hello_world/main/hello_world_main.c:142

TP : 0x4ff13860 T0 : 0x4fc0a9f8 T1 : 0x4ff1345c T2 : 0x00000000
S0/FP : 0x4ff14b00 S1 : 0x00000000 A0 : 0x0000000b A1 : 0x00000000
A2 : 0x7f000000 A3 : 0x00000000 A4 : 0x00000000 A5 : 0x00008067
A6 : 0x4000a4e8 A7 : 0x0000000a S2 : 0x00000000 S3 : 0x00000000
--- 0x4000a4e8: usb_serial_jtag_write at /Users/vortex/esp-idf/components/esp_driver_usb_serial_jtag/src/usb_serial_jtag_vfs.c:184

S4 : 0x00000000 S5 : 0x00000000 S6 : 0x00000000 S7 : 0x00000000
S8 : 0x00000000 S9 : 0x00000000 S10 : 0x00000000 S11 : 0x00000000
T3 : 0x00000000 T4 : 0x00000000 T5 : 0x00000000 T6 : 0x00000000
MSTATUS : 0x00011880 MTVEC : 0x4ff00003 MCAUSE : 0x00000002 MTVAL : 0x000034f4
--- 0x4ff00003: _vector_table at ??:?

MHARTID : 0x00000000


--- Backtrace:


0x4ff15168 in ?? ()
#0 0x4ff15168 in ?? ()
#1 0x4000bbd0 in app_main () at /Users/vortex/esp/hello_world/main/hello_world_main.c:142
#2 0x00000000 in ?? ()
Backtrace stopped: frame did not save the PC



ELF file SHA256: 3d3b3380e

Rebooting...
ESP-ROM:esp32p4-eco2-20240710
Build:Jul 10 2024
rst:0xc (SW_CPU_RESET),boot:0x20f (SPI_FAST_FLASH_BOOT)
Core0 Saved PC:0x4ff034ae
--- 0x4ff034ae: cpu_utility_ll_reset_cpu at /Users/vortex/esp-idf/components/hal/esp32p4/include/hal/cpu_utility_ll.h:23
(inlined by) esp_cpu_reset at /Users/vortex/esp-idf/components/esp_hw_support/cpu.c:49

Core1 Saved PC:0x4ff0360a
--- 0x4ff0360a: esp_cpu_wait_for_intr at /Users/vortex/esp-idf/components/esp_hw_support/cpu.c:57 (discriminator 1)

SPI mode:DIO, clock div:1
load:0x4ff33ce0,len:0x162c
load:0x4ff2abd0,len:0xd70
load:0x4ff2cbd0,len:0x32fc
entry 0x4ff2abda
I (28) boot: ESP-IDF v5.4 2nd stage bootloader
Any ideas why this is happening? Why does it work on the C3, but not on the P4? Also, is the cache sync necessary? And what about the fences?

Thanks a lot!


Re: Issues loading code from RAM on ESP32-P4, but not on ESP32-C3

Post by ESP_Sprite » Sat Jan 25, 2025 3:05 am

It's likely a cache thing. Fences are there to make sure that everything above the fence has taken effect before anything below the fence runs. That sounds trivial, but with memory write reordering, speculative reads, and the compiler being free to reorder statements as long as it stays within the C language rules, it is not guaranteed without the fence. Unfortunately, ordering is not the issue here.

What I think is the issue is the caches. The P4 has both an L1 dcache and an L1 icache, which read from and write to the internal memory (or the L2 cache for external memory, if you're using external memory, but I don't think you are here). What I think happens is that the words you write get stuck in the L1 dcache, and when you then execute the code you wrote, the icache fetches the instructions from the underlying main memory, which has not been touched yet.

In other words: you need to flush (write back) your dcache to main memory first, and then for good measure invalidate your icache as well, in case that address range already happens to be cached there.
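
To make the sequence above concrete, here is a toy host-side model of a write-back data cache and a separate instruction cache sitting in front of the same RAM. It is only a sketch of the failure mode and has nothing to do with the real P4 cache hardware or the ESP-IDF API; every name in it (`toy_mem_t`, `toy_store`, `toy_fetch`, ...) is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_WORDS 16

/* Toy model: RAM plus a write-back dcache and an icache, one word per line. */
typedef struct {
    uint32_t ram[MEM_WORDS];
    uint32_t dcache[MEM_WORDS];
    uint8_t  d_dirty[MEM_WORDS];  /* word was written but not yet in RAM */
    uint32_t icache[MEM_WORDS];
    uint8_t  i_valid[MEM_WORDS];  /* icache holds a copy of the word */
} toy_mem_t;

static void toy_init(toy_mem_t *m)
{
    memset(m, 0, sizeof *m);
}

/* CPU store: the data lands in the dcache only (write-back policy). */
static void toy_store(toy_mem_t *m, unsigned idx, uint32_t value)
{
    assert(idx < MEM_WORDS);
    m->dcache[idx] = value;
    m->d_dirty[idx] = 1;
}

/* Instruction fetch: served from the icache, which fills from RAM on a
 * miss. It never looks into the dcache. */
static uint32_t toy_fetch(toy_mem_t *m, unsigned idx)
{
    assert(idx < MEM_WORDS);
    if (!m->i_valid[idx]) {
        m->icache[idx] = m->ram[idx];
        m->i_valid[idx] = 1;
    }
    return m->icache[idx];
}

/* Step 1: write all dirty dcache words back to RAM (cache -> memory). */
static void toy_dcache_writeback(toy_mem_t *m)
{
    for (unsigned i = 0; i < MEM_WORDS; i++) {
        if (m->d_dirty[i]) {
            m->ram[i] = m->dcache[i];
            m->d_dirty[i] = 0;
        }
    }
}

/* Step 2: drop all icache contents so the next fetch re-reads RAM. */
static void toy_icache_invalidate(toy_mem_t *m)
{
    memset(m->i_valid, 0, sizeof m->i_valid);
}
```

In this model, a `toy_fetch` right after a `toy_store` returns the old RAM contents. After `toy_dcache_writeback` alone, an address that was already fetched once still returns the stale icache copy. Only write-back followed by `toy_icache_invalidate` makes the fetch see the new word, which is why both steps are needed in that order.
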


Re: Issues loading code from RAM on ESP32-P4, but not on ESP32-C3

Post by SyntaxVortex » Sat Jan 25, 2025 11:18 am

You're absolutely right. I had already tried flushing the cache with esp_cache_msync, but I was using the wrong direction.

This did the trick:

esp_cache_msync(code, 0x40, ESP_CACHE_MSYNC_FLAG_DIR_C2M | ESP_CACHE_MSYNC_FLAG_TYPE_DATA);
esp_cache_msync(code, 0x40, ESP_CACHE_MSYNC_FLAG_DIR_M2C | ESP_CACHE_MSYNC_FLAG_TYPE_INST);
I also added a fence before the sync, just to be sure.
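
A note for anyone adapting this to a real loader: the test buffer here is exactly one 0x40-byte line and is allocated with 0x40 alignment, so the synced range is trivially line-aligned. For code of arbitrary size and position, the range passed to the sync calls probably has to be widened to whole cache lines first (as far as I know, the M2C direction in particular expects aligned ranges). Below is a minimal host-side sketch of that rounding. `sync_range_t` and `align_to_cache_lines` are hypothetical helper names, and the line size of 0x40 is taken from the comment in the original test code, not queried from the chip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: a cache-line-aligned address range. */
typedef struct {
    uintptr_t start;  /* rounded down to a line boundary */
    size_t    size;   /* multiple of the line size, covers the input range */
} sync_range_t;

/* Widen [addr, addr + len) to whole cache lines.
 * `line` must be a power of two (0x40 is assumed in this thread). */
static sync_range_t align_to_cache_lines(uintptr_t addr, size_t len, size_t line)
{
    assert(line != 0 && (line & (line - 1)) == 0);
    uintptr_t mask  = ~(uintptr_t)(line - 1);
    uintptr_t start = addr & mask;                     /* round start down */
    uintptr_t end   = (addr + len + line - 1) & mask;  /* round end up */
    sync_range_t r = { start, (size_t)(end - start) };
    return r;
}
```

With the buffer address from the log above (0x4ff14b00) and 8 bytes of code, this yields start 0x4ff14b00 and size 0x40, matching the calls that worked. If the same 8 bytes started at 0x4ff14b3c they would straddle a line boundary, and the helper returns size 0x80 so that both lines get synced.
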

Thank you so much!
