ESP32-S3 - esp_async_memcpy not working with PSRAM using GDMA

NZ Gangsta
Posts: 16
Joined: Wed Jul 20, 2022 8:32 am

Post by NZ Gangsta » Wed Nov 15, 2023 7:59 am

I'm struggling to get the esp_async_memcpy API to copy memory between buffers in external PSRAM using GDMA. The ESP32-S3 Technical Reference Manual says this is possible with GDMA, and the esp-camera library does it by using the GDMA library directly. I cannot find a complete example that uses esp_async_memcpy, so I put this code together from the API documentation to try to make it work.

I can successfully transfer data within internal RAM allocated with MALLOC_CAP_DMA. If I try to word-align (32-bit align) the memory, it fails, reporting that the memory is not aligned, even though the memory addresses and lengths look aligned to me.

Finally, I noticed that the GDMA functionality for esp_async_memcpy is only available on the master branch of ESP-IDF and doesn't exist in IDF 5.1.1.

Can anyone tell me what I'm doing wrong? Or is there a bug in the implementation of the esp_async_memcpy API?
/*
 * Attempts to use the async_memcpy driver to copy data from one buffer to another
 *
 * Works for MALLOC_CAP_DMA but not if we try to align the buffers to 32 bits
 * Does not work for MALLOC_CAP_SPIRAM no matter what I try.
 * Reporting "invalid argument" on the call to esp_async_memcpy
 *
 * Using an ESP32-S3-WROOM-1U-N8R8 module
 * Using the master branch of ESP-IDF cloned on 15/11/2023
 *
 */


#include <inttypes.h>
#include <string.h>

#include "malloc.h"
#include "esp_log.h"
#include "esp_heap_caps.h"
#include "esp_async_memcpy.h"

#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "freertos/semphr.h"

static const char *TAG = "async_memcpy";

/// @brief The source buffer
void* _source;

/// @brief The destination buffer
void* _dest;


/// @brief Async Memory copy callback implementation, running in ISR context
/// @param mcp_hdl Handle of async memcpy
/// @param event Event object, which contains related data, reserved for future
/// @param cb_args User defined arguments, passed from esp_async_memcpy function
/// @return Whether a high priority task is woken up by the callback function
static IRAM_ATTR bool my_async_memcpy_cb(async_memcpy_t mcp_hdl, async_memcpy_event_t *event, void *cb_args)
{

    // Get the semaphore handle from the callback arguments
    SemaphoreHandle_t sem = (SemaphoreHandle_t)cb_args;

    // Give the semaphore so the waiting task can continue execution.
    // Start from pdFALSE; the call sets it to pdTRUE if a higher priority task was woken.
    BaseType_t high_task_wakeup = pdFALSE;
    xSemaphoreGiveFromISR(sem, &high_task_wakeup);

    // Return whether a high priority task was woken up so the ISR knows if it needs to yield
    return high_task_wakeup == pdTRUE;

}


/// @brief Main application entry point
void app_main(void)
{

    // Hello world
    ESP_LOGI(TAG, "\n\nasync_memcpy!\n");

    // Allocate some RAM for the source and destination buffers
    ESP_LOGI(TAG, "Allocating memory for buffers...");
    uint32_t size = 100000;
    // ** This works **
    _source = heap_caps_malloc(size, MALLOC_CAP_DMA);
    _dest = heap_caps_malloc(size, MALLOC_CAP_DMA);
    // ** This DOES NOT work **
    // _source = heap_caps_malloc(size, MALLOC_CAP_DMA | MALLOC_CAP_32BIT);
    // _dest = heap_caps_malloc(size, MALLOC_CAP_DMA | MALLOC_CAP_32BIT);
    // ** This DOES NOT work **
    // _source = heap_caps_malloc(size, MALLOC_CAP_SPIRAM | MALLOC_CAP_DMA | MALLOC_CAP_32BIT);
    // _dest = heap_caps_malloc(size, MALLOC_CAP_SPIRAM | MALLOC_CAP_DMA | MALLOC_CAP_32BIT);
    // ** This DOES NOT work **
    //_source = heap_caps_malloc(size, MALLOC_CAP_SPIRAM);
    //_dest = heap_caps_malloc(size, MALLOC_CAP_SPIRAM);

    // Initialize the source buffer
    ESP_LOGI(TAG, "Initializing source buffer...");
    //memset(_source, 0x77, size);

    // Install the Async memcpy driver. Using the defaults + a larger backlog.
    ESP_LOGI(TAG, "Installing async_memcpy driver...");
    async_memcpy_config_t config = ASYNC_MEMCPY_DEFAULT_CONFIG();
    //uint32_t backlog = 8;
    uint32_t backlog = (size > 2500) ? size/2500 : 4;   // I think 2500 is approx the size of DMA transfer?
    config.backlog = backlog;
    // config.sram_trans_align = 32;                    // These don't work
    // config.psram_trans_align = 32;
    async_memcpy_handle_t driver = NULL;
    //ESP_ERROR_CHECK(esp_async_memcpy_install(&config, &driver));
    ESP_ERROR_CHECK(esp_async_memcpy_install_gdma_ahb(&config, &driver));

    // Create a semaphore so we can wait for the async memcpy to complete
    ESP_LOGI(TAG, "Creating semaphore...");
    SemaphoreHandle_t semaphore = xSemaphoreCreateBinary();

    // Start the async memcpy
    ESP_LOGI(TAG, "Starting the async_memcpy to transfer %" PRIu32 " bytes...", size);
    ESP_ERROR_CHECK(esp_async_memcpy(driver, _dest, _source, size, my_async_memcpy_cb, semaphore));

    // Wait for the async memcpy to complete
    ESP_LOGI(TAG, "Waiting for the DMA operation to complete...");
    xSemaphoreTake(semaphore, portMAX_DELAY);

    // Success. All done
    ESP_LOGI(TAG, "async_memcpy complete! Transferred %" PRIu32 " bytes", size);

}
This is the log when I use internal RAM and everything works:
SPIWP:0xee
mode:DIO, clock div:1
load:0x3fce3810,len:0x178c
load:0x403c9700,len:0x4
load:0x403c9704,len:0xcb8
load:0x403cc700,len:0x2d84
entry 0x403c9914
I (26) boot: ESP-IDF v5.3-dev-277-gc8243465e4 2nd stage bootloader
I (27) boot: compile time Nov 15 2023 20:47:33
I (27) boot: Multicore bootloader
I (31) boot: chip revision: v0.1
I (35) boot.esp32s3: Boot SPI Speed : 80MHz
I (40) boot.esp32s3: SPI Mode       : DIO
I (45) boot.esp32s3: SPI Flash Size : 8MB
I (49) boot: Enabling RNG early entropy source...
I (55) boot: Partition Table:
I (58) boot: ## Label            Usage          Type ST Offset   Length
I (66) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (73) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (81) boot:  2 factory          factory app      00 00 00010000 00100000
I (88) boot: End of partition table
I (92) esp_image: segment 0: paddr=00010020 vaddr=3c020020 size=0dc44h ( 56388) map
I (111) esp_image: segment 1: paddr=0001dc6c vaddr=3fc92500 size=023ach (  9132) load
I (113) esp_image: segment 2: paddr=00020020 vaddr=42000020 size=1a66ch (108140) map
I (137) esp_image: segment 3: paddr=0003a694 vaddr=3fc948ac size=00608h (  1544) load
I (138) esp_image: segment 4: paddr=0003aca4 vaddr=40374000 size=0e4b4h ( 58548) load
I (161) boot: Loaded app from partition at offset 0x10000
I (162) boot: Disabling RNG early entropy source...
I (173) cpu_start: Multicore app
I (182) cpu_start: Pro cpu start user code
I (182) cpu_start: cpu freq: 160000000 Hz
I (183) cpu_start: Application information:
I (185) cpu_start: Project name:     async_memcpy
I (191) cpu_start: App version:      d6b7f7a-dirty
I (196) cpu_start: Compile time:     Nov 15 2023 20:47:26
I (202) cpu_start: ELF file SHA256:  97614a826...
I (208) cpu_start: ESP-IDF:          v5.3-dev-277-gc8243465e4
I (214) cpu_start: Min chip rev:     v0.0
I (219) cpu_start: Max chip rev:     v0.99
I (224) cpu_start: Chip rev:         v0.1
I (228) heap_init: Initializing. RAM available for dynamic allocation:
I (236) heap_init: At 3FC95790 len 00053F80 (335 KiB): RAM
I (242) heap_init: At 3FCE9710 len 00005724 (21 KiB): RAM
I (248) heap_init: At 3FCF0000 len 00008000 (32 KiB): DRAM
I (254) heap_init: At 600FE010 len 00001FD8 (7 KiB): RTCRAM
I (261) spi_flash: detected chip: generic
I (265) spi_flash: flash io: dio
I (269) sleep: Configure to isolate all GPIO pins in sleep state
I (276) sleep: Enable automatic switching of GPIO sleep configuration
I (283) main_task: Started on CPU0
I (293) main_task: Calling app_main()
I (293) async_memcpy:

async_memcpy!

I (293) async_memcpy: Allocating memory for buffers...
I (303) async_memcpy: Initializing source buffer...
I (303) async_memcpy: Installing async_memcpy driver...
D (313) gdma: new group (0) at 0x3fccbd8c
D (313) gdma: new pair (0,0) at 0x3fccbe14
D (323) gdma: new tx channel (0,0) at 0x3fccbd54
D (323) gdma: new rx channel (0,0) at 0x3fccbe34
D (333) gdma: tx channel (0,0), (0:32) bytes aligned, burst enabled
D (333) gdma: rx channel (0,0), (0:32) bytes aligned, burst disabled
D (343) gdma: install interrupt service for rx channel (0,0)
I (353) async_memcpy: Creating semaphore...
I (353) async_memcpy: Starting the async_memcpy to transfer 100000 bytes...
I (363) async_memcpy: Waiting for the DMA operation to complete...
I (373) async_memcpy: async_memcpy complete! Transferred 100000 bytes
I (373) main_task: Returned from app_main()
This is the log when I attempt to use external PSRAM
SPIWP:0xee
mode:DIO, clock div:1
load:0x3fce3810,len:0x178c
load:0x403c9700,len:0x4
load:0x403c9704,len:0xcb8
load:0x403cc700,len:0x2d84
entry 0x403c9914
I (26) boot: ESP-IDF v5.3-dev-277-gc8243465e4 2nd stage bootloader
I (26) boot: compile time Nov 15 2023 20:47:33
I (27) boot: Multicore bootloader
I (31) boot: chip revision: v0.1
I (35) boot.esp32s3: Boot SPI Speed : 80MHz
I (39) boot.esp32s3: SPI Mode       : DIO
I (44) boot.esp32s3: SPI Flash Size : 8MB
I (49) boot: Enabling RNG early entropy source...
I (54) boot: Partition Table:
I (58) boot: ## Label            Usage          Type ST Offset   Length
I (65) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (73) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (80) boot:  2 factory          factory app      00 00 00010000 00100000
I (88) boot: End of partition table
I (92) esp_image: segment 0: paddr=00010020 vaddr=3c020020 size=0dc44h ( 56388) map
I (110) esp_image: segment 1: paddr=0001dc6c vaddr=3fc92500 size=023ach (  9132) load
I (113) esp_image: segment 2: paddr=00020020 vaddr=42000020 size=1a66ch (108140) map
I (136) esp_image: segment 3: paddr=0003a694 vaddr=3fc948ac size=00608h (  1544) load
I (137) esp_image: segment 4: paddr=0003aca4 vaddr=40374000 size=0e4b4h ( 58548) load
I (161) boot: Loaded app from partition at offset 0x10000
I (161) boot: Disabling RNG early entropy source...
I (172) cpu_start: Multicore app
I (182) cpu_start: Pro cpu start user code
I (182) cpu_start: cpu freq: 160000000 Hz
I (182) cpu_start: Application information:
I (185) cpu_start: Project name:     async_memcpy
I (190) cpu_start: App version:      d6b7f7a-dirty
I (196) cpu_start: Compile time:     Nov 15 2023 20:47:26
I (202) cpu_start: ELF file SHA256:  7c7112055...
I (207) cpu_start: ESP-IDF:          v5.3-dev-277-gc8243465e4
I (214) cpu_start: Min chip rev:     v0.0
I (218) cpu_start: Max chip rev:     v0.99
I (223) cpu_start: Chip rev:         v0.1
I (228) heap_init: Initializing. RAM available for dynamic allocation:
I (235) heap_init: At 3FC95790 len 00053F80 (335 KiB): RAM
I (241) heap_init: At 3FCE9710 len 00005724 (21 KiB): RAM
I (247) heap_init: At 3FCF0000 len 00008000 (32 KiB): DRAM
I (253) heap_init: At 600FE010 len 00001FD8 (7 KiB): RTCRAM
I (261) spi_flash: detected chip: generic
I (264) spi_flash: flash io: dio
I (269) sleep: Configure to isolate all GPIO pins in sleep state
I (275) sleep: Enable automatic switching of GPIO sleep configuration
I (283) main_task: Started on CPU0
I (293) main_task: Calling app_main()
I (293) async_memcpy:

async_memcpy!

I (293) async_memcpy: Allocating memory for buffers...
I (303) async_memcpy: Initializing source buffer...
I (303) async_memcpy: Installing async_memcpy driver...
D (313) gdma: new group (0) at 0x3fc9b05c
D (313) gdma: new pair (0,0) at 0x3fc9b0e4
D (323) gdma: new tx channel (0,0) at 0x3fc9b024
D (323) gdma: new rx channel (0,0) at 0x3fc9b104
D (333) gdma: tx channel (0,0), (0:32) bytes aligned, burst enabled
D (333) gdma: rx channel (0,0), (0:32) bytes aligned, burst disabled
D (343) gdma: install interrupt service for rx channel (0,0)
I (353) async_memcpy: Creating semaphore...
I (353) async_memcpy: Starting the async_memcpy to transfer 100000 bytes...
E (363) async_mcp: esp_async_memcpy(21): invalid argument
ESP_ERROR_CHECK failed: esp_err_t 0x102 (ESP_ERR_INVALID_ARG) at 0x420084c1
0x420084c1: app_main at D:/Ferret/Dev/esp32/Tests/Async_memcpy/main/main.c:102 (discriminator 1)

file: "./main/main.c" line 102
func: app_main
expression: esp_async_memcpy(driver, _dest, _source, size, my_async_memcpy_cb, semaphore)

abort() was called at PC 0x40379ca7 on core 0
0x40379ca7: _esp_error_check_failed at C:/esp32/esp-idf_master/esp-idf/components/esp_system/esp_err.c:50

Backtrace: 0x403758a6:0x3fc99340 0x40379cb1:0x3fc99360 0x40380441:0x3fc99380 0x40379ca7:0x3fc993f0 0x420084c1:0x3fc99420 0x42019813:0x3fc99460 0x4037a515:0x3fc99490
Waiting for the device to reconnect
0x403758a6: panic_abort at C:/esp32/esp-idf_master/esp-idf/components/esp_system/panic.c:472
0x40379cb1: esp_system_abort at C:/esp32/esp-idf_master/esp-idf/components/esp_system/port/esp_system_chip.c:93
0x40380441: abort at C:/esp32/esp-idf_master/esp-idf/components/newlib/abort.c:38
0x40379ca7: _esp_error_check_failed at C:/esp32/esp-idf_master/esp-idf/components/esp_system/esp_err.c:50
0x420084c1: app_main at D:/Ferret/Dev/esp32/Tests/Async_memcpy/main/main.c:102 (discriminator 1)
0x42019813: main_task at C:/esp32/esp-idf_master/esp-idf/components/freertos/app_startup.c:208
0x4037a515: vPortTaskWrapper at C:/esp32/esp-idf_master/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:137

ELF file SHA256: 7c7112055

Rebooting...
ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0xc (RTC_SW_CPU_RST),boot:0xf (SPI_FAST_FLASH_BOOT)
Saved PC:0x4037581c
0x4037581c: esp_restart_noos at C:/esp32/esp-idf_master/esp-idf/components/esp_system/port/soc/esp32s3/system_internal.c:158

ok-home
Posts: 78
Joined: Sun May 02, 2021 7:23 pm
Location: Russia Novosibirsk

Re: ESP32-S3 - esp_async_memcpy not working with PSRAM using GDMA

Post by ok-home » Wed Nov 15, 2023 8:23 am

Hi,
try this:

buf = heap_caps_aligned_alloc(GDMA_PSRAM_BURST,len, MALLOC_CAP_SPIRAM);

NZ Gangsta
Posts: 16
Joined: Wed Jul 20, 2022 8:32 am

Re: ESP32-S3 - esp_async_memcpy not working with PSRAM using GDMA

Post by NZ Gangsta » Wed Nov 15, 2023 8:54 am

The compiler doesn't recognise the GDMA_PSRAM_BURST define. I changed it to a hardcoded value of 64, but it didn't work for MALLOC_CAP_SPIRAM. It does, however, work for MALLOC_CAP_DMA, as do alignments of 16 and 32.

So it looks like the alignment problem is sorted, but I still can't use the GDMA controller to move data between buffers in SPIRAM (AKA PSRAM).
_source = heap_caps_aligned_alloc(64, size, MALLOC_CAP_DMA);
_dest = heap_caps_aligned_alloc(64, size, MALLOC_CAP_DMA);
.
.
.
    config.sram_trans_align = 64;                       // Only works for internal RAM at the moment
    config.psram_trans_align = 64;

ok-home
Posts: 78
Joined: Sun May 02, 2021 7:23 pm
Location: Russia Novosibirsk

Re: ESP32-S3 - esp_async_memcpy not working with PSRAM using GDMA

Post by ok-home » Wed Nov 15, 2023 11:50 am

Hi,
Are you sure that you have enabled PSRAM in menuconfig?
I don't see PSRAM being detected in your startup log.

For example, from my log:

I (788) heap_init: At 3FC9FEF8 len 00049818 (294 KiB): DRAM
I (795) heap_init: At 3FCE9710 len 00005724 (21 KiB): STACK/DRAM
I (801) heap_init: At 3FCF0000 len 00008000 (32 KiB): DRAM
I (807) heap_init: At 600FE010 len 00001FD8 (7 KiB): RTCRAM
I (814) esp_psram: Adding pool of 8192K of PSRAM memory to heap allocator
I (822) spi_flash: detected chip: gd

MicroController
Posts: 1696
Joined: Mon Oct 17, 2022 7:38 pm
Location: Europe, Germany

Re: ESP32-S3 - esp_async_memcpy not working with PSRAM using GDMA

Post by MicroController » Wed Nov 15, 2023 11:59 am

I don't think you can DMA from PSRAM to PSRAM. You can only DMA from internal RAM to PSRAM or vice versa.

Also note that MALLOC_CAP_32BIT does not align the allocated memory. Use heap_caps_aligned_alloc(...) for that.
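
To make the alignment point concrete, here is a minimal sketch of a check that could be run on a buffer pointer before handing it to the DMA driver. The helper name is_aligned is hypothetical and not part of ESP-IDF. It only tests the address itself and makes no claim about how the driver validates its arguments internally.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an ESP-IDF API): returns true when ptr sits on an
 * align-byte boundary. align must be a nonzero power of two, otherwise the
 * mask trick below is not valid and the function reports false. */
bool is_aligned(const void *ptr, size_t align)
{
    if (align == 0 || (align & (align - 1)) != 0) {
        return false;   /* not a power of two */
    }
    return ((uintptr_t)ptr & (align - 1)) == 0;
}
```

A buffer returned by heap_caps_aligned_alloc(32, ...) should pass is_aligned(buf, 32), while a buffer from plain heap_caps_malloc carries no such guarantee beyond the allocator's default alignment.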

ok-home
Posts: 78
Joined: Sun May 02, 2021 7:23 pm
Location: Russia Novosibirsk

Re: ESP32-S3 - esp_async_memcpy not working with PSRAM using GDMA

Post by ok-home » Wed Nov 15, 2023 12:54 pm

Hi,
"I don't think you can DMA from PSRAM to PSRAM. You can only DMA from internal RAM to PSRAM or vice versa."

3.4.3 Memory-to-Memory Data Transfer
The GDMA controller also allows memory-to-memory data transfer. Such data transfer can be enabled by setting GDMA_MEM_TRANS_EN_CHn, which connects the output of transmit channel n to the input of receive channel n. Note that a transmit channel is only connected to the receive channel with the same number (n).
As every transmit and receive channel can be used to access internal and external RAM, there are four data transfer modes:
• from internal RAM to internal RAM
• from internal RAM to external RAM
• from external RAM to internal RAM
• from external RAM to external RAM

MicroController
Posts: 1696
Joined: Mon Oct 17, 2022 7:38 pm
Location: Europe, Germany

Re: ESP32-S3 - esp_async_memcpy not working with PSRAM using GDMA

Post by MicroController » Wed Nov 15, 2023 7:16 pm

I just tried it, and PSRAM->PSRAM DMA does indeed work (ESP-IDF v5.1).

I used

    const uint32_t PSRAM_ALIGN = cache_hal_get_cache_line_size(CACHE_TYPE_DATA);
    const uint32_t INTRAM_ALIGN = PSRAM_ALIGN;
    const async_memcpy_config_t cfg = {
        .backlog = 8,
        .sram_trans_align = INTRAM_ALIGN,
        .psram_trans_align = PSRAM_ALIGN,
        .flags = 0
    };
and

heap_caps_aligned_alloc(PSRAM_ALIGN,...,MALLOC_CAP_SPIRAM);
Btw, re: backlog:

#define DMA_DESCRIPTOR_BUFFER_MAX_SIZE (4095) /*!< Maximum size of the buffer that can be attached to descriptor */
4092 (multiple of word size) or 4095 would be the maximum size of one "backlog" transfer, so backlog = (size+4091)/4092 should be the minimum required.
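
The backlog rule above can be sketched as a small helper. The names min_backlog and DMA_CHUNK_BYTES are made up for illustration. The 4092-byte chunk size is taken from the reasoning above (4095 rounded down to a multiple of the 4-byte word size), so treat it as an assumption rather than a documented driver constant.

```c
#include <stdint.h>

/* Assumed largest word-aligned chunk one DMA descriptor can carry:
 * DMA_DESCRIPTOR_BUFFER_MAX_SIZE (4095) rounded down to a multiple of 4. */
#define DMA_CHUNK_BYTES 4092u

/* Hypothetical helper: minimum backlog for a copy of `size` bytes,
 * i.e. size divided by the chunk size, rounded up.
 * Note: the addition wraps for sizes within 4091 bytes of UINT32_MAX. */
uint32_t min_backlog(uint32_t size)
{
    return (size + DMA_CHUNK_BYTES - 1u) / DMA_CHUNK_BYTES;
}
```

For the 100000-byte copy in the first post this gives 25, and for a 16384-byte copy it gives 5.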

NZ Gangsta
Posts: 16
Joined: Wed Jul 20, 2022 8:32 am

Re: ESP32-S3 - esp_async_memcpy not working with PSRAM using GDMA

Post by NZ Gangsta » Wed Nov 15, 2023 10:08 pm

It's encouraging to hear that it works for you. However, I still cannot get it to work with PSRAM :(

I replaced the backlog calculation with your suggested one and this works well :)

I could not get the cache_hal_get_cache_line_size function to compile, as it reports too few arguments on ESP-IDF 5.1, 5.1.1 and master. This is really strange, as it appears to need only the one argument in its definition in cache_hal.h. Go figure. For now I have hard-coded the alignment to 32 bytes, as I think this is the cache line size for the ESP32-S3. I have tried 16, 32 and 64. All of them work for internal RAM but none of them work for PSRAM.

If possible could you provide a complete working example please.

MicroController
Posts: 1696
Joined: Mon Oct 17, 2022 7:38 pm
Location: Europe, Germany

Re: ESP32-S3 - esp_async_memcpy not working with PSRAM using GDMA

Post by MicroController » Wed Nov 15, 2023 11:49 pm

This quickly cobbled-together code works for me:

#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include "esp_cpu.h"
#include "esp_heap_caps.h"
#include "esp_log.h"
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "hal/cache_hal.h"

#include "esp_async_memcpy.h"


static const char TAG[] = "DMADEMO";

static bool dmacpy_cb(async_memcpy_t hdl, async_memcpy_event_t *event, void *args) {
    BaseType_t r = pdFALSE;
    xTaskNotifyFromISR((TaskHandle_t)args,0,eNoAction,&r);
    return (r!=pdFALSE);
}


void app_main(void)
{

    const uint32_t LEN = 16384;
    const uint32_t PSRAM_ALIGN = cache_hal_get_cache_line_size(CACHE_TYPE_DATA);
    const uint32_t INTRAM_ALIGN = PSRAM_ALIGN;
    const async_memcpy_config_t cfg = {
        .backlog = (LEN+4091)/4092,
        .sram_trans_align = INTRAM_ALIGN,
        .psram_trans_align = PSRAM_ALIGN,
        .flags = 0
    };

    ESP_LOGI(TAG, "Allocating 2x %" PRIu32 "kb in PSRAM, alignment: %" PRIu32 " bytes", LEN/1024, PSRAM_ALIGN);

    void* const mem1 = heap_caps_aligned_alloc(PSRAM_ALIGN, LEN, MALLOC_CAP_SPIRAM);

    if(!mem1) {
        ESP_LOGE(TAG, "Alloc 1 failed");
        return;
    }

    void* const mem2 = heap_caps_aligned_alloc(PSRAM_ALIGN, LEN, MALLOC_CAP_SPIRAM);

    if(!mem2) {
        free(mem1);
        ESP_LOGE(TAG, "Alloc 2 failed");
        return;
    }    

    async_memcpy_t handle;
    esp_err_t r = esp_async_memcpy_install(&cfg,&handle);

    if (r == ESP_OK) {

        TaskHandle_t task = xTaskGetCurrentTaskHandle();

        ESP_LOGI(TAG, "Starting DMA copy.");

        const uint32_t tstart = esp_cpu_get_cycle_count();

        r = esp_async_memcpy(handle,mem1,mem2,LEN,&dmacpy_cb,(void*)task);

        if(r == ESP_OK) {
            if(xTaskNotifyWait(0,0,0,1000/portTICK_PERIOD_MS)) {
                ESP_LOGI(TAG, "DMA CPY EXT->EXT took %" PRIu32 " CPU cycles", esp_cpu_get_cycle_count() - tstart);
            } else {
                ESP_LOGE(TAG, "Timed out waiting for DMA CPY.");
            }
        } else {
            ESP_LOGE(TAG, "Failed to start DMA CPY: %i",r);
        }    

        esp_async_memcpy_uninstall(handle);
    } else {
        ESP_LOGE(TAG, "Failed to install DMA driver: %i",r);
    }

    free(mem2);
    free(mem1);

}
This is the PSRAM part of my sdkconfig:

#
# ESP PSRAM
#
CONFIG_SPIRAM=y

#
# SPI RAM config
#
# CONFIG_SPIRAM_MODE_QUAD is not set
CONFIG_SPIRAM_MODE_OCT=y
# CONFIG_SPIRAM_TYPE_AUTO is not set
CONFIG_SPIRAM_TYPE_ESPPSRAM64=y
CONFIG_SPIRAM_ALLOW_STACK_EXTERNAL_MEMORY=y
CONFIG_SPIRAM_CLK_IO=30
CONFIG_SPIRAM_CS_IO=26
# CONFIG_SPIRAM_FETCH_INSTRUCTIONS is not set
# CONFIG_SPIRAM_RODATA is not set
CONFIG_SPIRAM_SPEED_80M=y
# CONFIG_SPIRAM_SPEED_40M is not set
CONFIG_SPIRAM_SPEED=80
CONFIG_SPIRAM_BOOT_INIT=y
# CONFIG_SPIRAM_IGNORE_NOTFOUND is not set
# CONFIG_SPIRAM_USE_MEMMAP is not set
CONFIG_SPIRAM_USE_CAPS_ALLOC=y
# CONFIG_SPIRAM_USE_MALLOC is not set
# CONFIG_SPIRAM_MEMTEST is not set
# CONFIG_SPIRAM_TRY_ALLOCATE_WIFI_LWIP is not set
# CONFIG_SPIRAM_ALLOW_BSS_SEG_EXTERNAL_MEMORY is not set
# CONFIG_SPIRAM_ECC_ENABLE is not set
# end of SPI RAM config
# end of ESP PSRAM
And this is the log output:

I (37) boot.esp32s3: Boot SPI Speed : 80MHz
I (42) boot.esp32s3: SPI Mode       : QIO
I (47) boot.esp32s3: SPI Flash Size : 16MB
I (51) boot: Enabling RNG early entropy source...
I (57) boot: Partition Table:
I (60) boot: ## Label            Usage          Type ST Offset   Length
I (68) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (75) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (83) boot:  2 factory          factory app      00 00 00010000 00100000
I (90) boot: End of partition table
I (94) esp_image: segment 0: paddr=00010020 vaddr=3c020020 size=0b2a4h ( 45732) map
I (110) esp_image: segment 1: paddr=0001b2cc vaddr=3fc92600 size=02ae4h ( 10980) load
I (113) esp_image: segment 2: paddr=0001ddb8 vaddr=40374000 size=02260h (  8800) load
I (121) esp_image: segment 3: paddr=00020020 vaddr=42000020 size=1a058h (106584) map
I (144) esp_image: segment 4: paddr=0003a080 vaddr=40376260 size=0c29ch ( 49820) load
I (160) boot: Loaded app from partition at offset 0x10000
I (160) boot: Disabling RNG early entropy source...
I (172) octal_psram: vendor id    : 0x0d (AP)
I (172) octal_psram: dev id       : 0x02 (generation 3)
I (172) octal_psram: density      : 0x03 (64 Mbit)
I (177) octal_psram: good-die     : 0x01 (Pass)
I (182) octal_psram: Latency      : 0x01 (Fixed)
I (188) octal_psram: VCC          : 0x01 (3V)
I (193) octal_psram: SRF          : 0x01 (Fast Refresh)
I (199) octal_psram: BurstType    : 0x01 (Hybrid Wrap)
I (204) octal_psram: BurstLen     : 0x01 (32 Byte)
I (210) octal_psram: Readlatency  : 0x02 (10 cycles@Fixed)
I (216) octal_psram: DriveStrength: 0x00 (1/1)
I (222) esp_psram: Found 8MB PSRAM device
I (226) esp_psram: Speed: 80MHz
I (230) cpu_start: Pro cpu up.
I (233) cpu_start: Starting app cpu, entry point is 0x403754d8

I (0) cpu_start: App cpu up.
I (255) cpu_start: Pro cpu start user code
I (255) cpu_start: cpu freq: 240000000 Hz
I (255) cpu_start: Application information:
I (258) cpu_start: Project name:     test2
I (263) cpu_start: App version:      9c643af-dirty
I (275) cpu_start: ELF file SHA256:  c262dca9b5d25f56...
I (281) cpu_start: ESP-IDF:          v5.1-dev-4124-gbb9200acec-dirty
I (288) cpu_start: Min chip rev:     v0.0
I (292) cpu_start: Max chip rev:     v0.99 
I (297) cpu_start: Chip rev:         v0.2
I (302) heap_init: Initializing. RAM available for dynamic allocation:
I (309) heap_init: At 3FC959C8 len 00053D48 (335 KiB): DRAM
I (315) heap_init: At 3FCE9710 len 00005724 (21 KiB): STACK/DRAM
I (322) heap_init: At 3FCF0000 len 00008000 (32 KiB): DRAM
I (328) heap_init: At 600FE010 len 00001FF0 (7 KiB): RTCRAM
I (334) esp_psram: Adding pool of 8192K of PSRAM memory to heap allocator
I (342) spi_flash: detected chip: generic
I (346) spi_flash: flash io: qio
I (351) sleep: Configure to isolate all GPIO pins in sleep state
I (357) sleep: Enable automatic switching of GPIO sleep configuration
I (364) app_start: Starting scheduler on CPU0
I (369) app_start: Starting scheduler on CPU1
I (369) main_task: Started on CPU0
I (379) main_task: Calling app_main()
I (379) DMADEMO: Allocating 2x 16kb in PSRAM, alignment: 32 bytes
I (389) DMADEMO: Starting DMA copy.
I (399) DMADEMO: DMA CPY EXT->EXT took 153634 CPU cycles
I (399) main_task: Returned from app_main()

NZ Gangsta
Posts: 16
Joined: Wed Jul 20, 2022 8:32 am

Re: ESP32-S3 - esp_async_memcpy not working with PSRAM using GDMA

Post by NZ Gangsta » Thu Nov 16, 2023 4:27 am

Awesome, that works. Nice! :D

I think my original code had a missing config setting. I also noticed that adding the MALLOC_CAP_DMA flag to the memory allocation actually breaks things when using PSRAM. My code didn't pick up on this because it never checked the memory pointers to see if the allocation actually succeeded.

Once I fixed the config and the memory allocation my code works, although the code from MicroController is much better at handling errors.

Use this when allocating memory for DMA in PSRAM:
    _source = heap_caps_aligned_alloc(PSRAM_ALIGN, size, MALLOC_CAP_SPIRAM);
Not this:
    _dest = heap_caps_aligned_alloc(PSRAM_ALIGN, size, MALLOC_CAP_SPIRAM | MALLOC_CAP_DMA);
Thanks for the help everyone. Especially MicroController for the solution!
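
For anyone landing here later, a minimal sketch of the "check the pointers" lesson: allocate both buffers, verify both, and bail out cleanly before calling esp_async_memcpy. The names alloc_dma_pair, dma_buf_pair_t and PSRAM_ALLOC are hypothetical. The non-ESP branch is an assumption added only so the sketch can be compiled off-target, where it falls back to C11 aligned_alloc. A NULL result is what you would expect either when MALLOC_CAP_DMA is combined with MALLOC_CAP_SPIRAM, as described above, or when PSRAM is not enabled in menuconfig at all.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#ifdef ESP_PLATFORM
#include "esp_heap_caps.h"
/* On target: PSRAM only, without MALLOC_CAP_DMA (see the post above). */
#define PSRAM_ALLOC(align, size) heap_caps_aligned_alloc((align), (size), MALLOC_CAP_SPIRAM)
#else
/* Host fallback (assumption, for off-target testing only): C11 aligned_alloc. */
#define PSRAM_ALLOC(align, size) aligned_alloc((align), (size))
#endif

/* Hypothetical container for the two copy buffers. */
typedef struct {
    void *src;
    void *dst;
} dma_buf_pair_t;

/* Allocates both buffers with the given alignment.
 * Returns 0 on success, -1 if either allocation fails.
 * align must be a nonzero power of two. On failure nothing stays allocated. */
int alloc_dma_pair(dma_buf_pair_t *pair, size_t align, size_t size)
{
    /* Round the size up to a multiple of align so aligned_alloc accepts it. */
    const size_t rounded = (size + align - 1) / align * align;

    pair->src = PSRAM_ALLOC(align, rounded);
    pair->dst = PSRAM_ALLOC(align, rounded);

    if (pair->src == NULL || pair->dst == NULL) {
        free(pair->src);    /* free(NULL) is a no-op */
        free(pair->dst);
        pair->src = NULL;
        pair->dst = NULL;
        return -1;
    }
    return 0;
}
```

In app_main this would be called once before installing the driver, logging an error and returning if it yields -1, which is roughly what MicroController's example does inline with its two if(!mem) checks.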
