esp32s3 Cache and DMA

esp_programmer
Posts: 16
Joined: Wed Jul 12, 2023 4:26 am

esp32s3 Cache and DMA

Postby esp_programmer » Wed Jul 12, 2023 4:39 am

Hi,

I am implementing a performance-critical application on the esp32s3. I need to do signal processing on a large amount of data and write the result to PSRAM so the SPI controller can transfer it to another board.

The cache flushing procedure is poorly documented in the TRM (as far as I can tell). I assume I have to manually flush the cache before enabling the DMA transfer, but ESP-IDF only seems to have a mysterious Cache_Writeback_Addr function embedded in the ROM, which also seems to be "unsafe" to use in an SDK application (according to the comments). There is also a header that implies the flushing is done by writing to one or more peripheral registers.

I need to do this flushing as fast as possible, so I would like to know what the bare minimum instructions are to do this (ideally I would write a few lines in assembler, but it appears the Xtensa flushing instructions aren't implemented). Specifically, my questions are:

1. Is there clear documentation on how to use the registers defined in soc/extmem_reg.h to flush the data cache?
2. If not, are there proper functions that we SHOULD be using in a normal SDK application?
3. If not, can I see the source code for the Cache_Writeback_* functions?

Thanks
esp_programmer

ESP_igrr
Posts: 2072
Joined: Tue Dec 01, 2015 8:37 am

Re: esp32s3 Cache and DMA

Postby ESP_igrr » Wed Jul 12, 2023 6:54 am

Hi esp_programmer,

Incidentally, we are about to make a change which moves the Cache_Writeback_Addr code into IDF, since the version in ROM apparently had a bug. It should be merged some time this week.

esp_programmer
Posts: 16
Joined: Wed Jul 12, 2023 4:26 am

Re: esp32s3 Cache and DMA

Postby esp_programmer » Wed Jul 12, 2023 7:04 am

Hi ESP_igrr,

Thanks for the quick reply.

That sounds promising, although if there is a bug in the ROM, does that mean any hardware already shipped might have an issue (i.e. does the bootloader or something else use this function)?

Also, could I get some extra info about how the cache flushing actually works? Specifically, is it possible/safe to write one block of the cache while a different block is being flushed? Should the TRM have clearer instructions on what is needed when using external memory and DMA?

Thanks

ESP_igrr
Posts: 2072
Joined: Tue Dec 01, 2015 8:37 am

Re: esp32s3 Cache and DMA

Postby ESP_igrr » Wed Jul 12, 2023 8:04 am

esp_programmer wrote: does that mean any hardware already shipped might have an issue (i.e. does the bootloader or something else use this function)?
The bug occurs when an unaligned buffer is being flushed, the stack is in the same cache line, an interrupt happens while the writeback is in progress, and a cache miss happens. Then the CPU won't be able to access correct values on the stack. No, it's not something that affects the bootloader.
esp_programmer wrote: Also, could I get some extra info about how the cache flushing actually works? Specifically, is it possible/safe to write one block of the cache while a different block is being flushed?
Yes, as long as they are not in the same cache line. In this case flushing (or preloading) one cache line can happen concurrently with the CPU accessing another cache line.
esp_programmer wrote: Should the TRM have clearer instructions on what is needed when using external memory and DMA?
Agree this would be nice. I'll pass the request along to our documentation team.
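
Both answers above turn on cache-line boundaries: whether a buffer is aligned to them, and whether two accesses land in the same line. A minimal host-side sketch of that arithmetic follows. It assumes a 32-byte data cache line (the value that appears as psram_trans_align later in this thread; the real line size depends on the chip configuration), and the helper names are hypothetical, not ESP-IDF functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed data-cache line size in bytes; check the project configuration. */
#define CACHE_LINE_SIZE 32u

/* Round addr down to the start of the cache line that contains it. */
static inline uintptr_t cache_line_floor(uintptr_t addr)
{
    return addr & ~((uintptr_t)CACHE_LINE_SIZE - 1u);
}

/* Round addr up to the next line boundary (unchanged if already aligned). */
static inline uintptr_t cache_line_ceil(uintptr_t addr)
{
    return (addr + CACHE_LINE_SIZE - 1u) & ~((uintptr_t)CACHE_LINE_SIZE - 1u);
}

/* True when the buffer starts on a line boundary and covers whole lines only. */
static inline bool buffer_is_line_aligned(uintptr_t addr, size_t len)
{
    return (addr % CACHE_LINE_SIZE == 0) && (len % CACHE_LINE_SIZE == 0);
}

/* Number of cache lines touched by the byte range [addr, addr + len). */
static inline size_t cache_lines_spanned(uintptr_t addr, size_t len)
{
    if (len == 0) {
        return 0;
    }
    assert(addr + len >= addr); /* the range must not wrap around */
    return (cache_line_ceil(addr + len) - cache_line_floor(addr)) / CACHE_LINE_SIZE;
}
```

If buffer_is_line_aligned() is false for a buffer handed to DMA, the first or last line of the writeback is shared with unrelated data such as a neighbouring variable or the stack, which is the precondition described for the ROM bug. Allocating DMA buffers with line-sized alignment and a length rounded up with cache_line_ceil() avoids that sharing.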

esp_programmer
Posts: 16
Joined: Wed Jul 12, 2023 4:26 am

Re: esp32s3 Cache and DMA

Postby esp_programmer » Wed Jul 12, 2023 9:03 pm

Ok, thanks for your time.

jprpower104
Posts: 16
Joined: Tue Jul 18, 2023 3:59 am

Re: esp32s3 Cache and DMA

Postby jprpower104 » Fri Sep 15, 2023 9:30 pm

Can you send a GDMA code example for the esp32s3? I need to use GDMA for all communication ports (UART, SPI, I2C, TWAI, etc.) at my job, because we have many apps on multiple chips and we would like to migrate to the esp32XXXX family.

ESP_Sprite
Posts: 9764
Joined: Thu Nov 26, 2015 4:08 am

Re: esp32s3 Cache and DMA

Postby ESP_Sprite » Sat Sep 16, 2023 12:21 am

jprpower104 wrote: Can you send a GDMA code example for the esp32s3? I need to use GDMA for all communication ports (UART, SPI, I2C, TWAI, etc.) at my job, because we have many apps on multiple chips and we would like to migrate to the esp32XXXX family.
Use the ESP-IDF drivers. They will take care of any DMA capability if the peripheral has it.

jprpower104
Posts: 16
Joined: Tue Jul 18, 2023 3:59 am

Re: esp32s3 Cache and DMA

Postby jprpower104 » Mon Sep 25, 2023 8:23 pm

Now I have this code

Code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "freertos/FreeRTOS.h"
#include "freertos/task.h" 
#include "freertos/semphr.h"
//#include "soc/periph_defs.h"
//#include "soc/soc_memory_layout.h"
//#include "soc/soc_caps.h"
//#include "hal/gdma_ll.h"
//#include "hal/gdma_hal.h"
//#include "esp_private/periph_ctrl.h"
#include "esp_log.h"
//#include "esp_attr.h"
//#include "esp_err.h"
//#include <inttypes.h>
//#include <sys/param.h>

#include "C:/Users/sopor/esp/esp-idf/components/esp_hw_support/port/include/esp_async_memcpy_impl.h"
#include "C:\Users\sopor\esp\esp-idf\components\esp_hw_support\dma\gdma_priv.h"

#include "hal/gdma_ll.h"

#if SOC_APM_SUPPORTED
    #include "hal/apm_ll.h"
#endif

#define         GDMA_LL_RX_STATUS_EOF_FLAG             (28<<1)   // no trailing ';' in a macro body
volatile int token = 0;                                          // written in the ISR, polled in app_main

// //******************************************************************************************************************************************************************************************************************************************************

// void create_task(async_memcpy_impl_t *);
// SemaphoreHandle_t GlobalKey = xSemaphoreCreateMutex();

// //******************************************************************************************************************************************************************************************************************************************************


// void vSemaphoreTask(void *pvParameters)
// {
//     async_memcpy_impl_t *mcp_impl = (async_memcpy_impl_t *)pvParameters;
//     while(1)
//     {
//         if(xSemaphoreTakeFromISR(GlobalKey,(BaseType_t*)pdTRUE)==pdTRUE)
//         {
//             token=1;
//             async_memcpy_isr_on_rx_done_event(mcp_impl);
//             ESP_LOGI("ISR GDMA : ","Interrup allowcated!");
//         }
//     }
// }

//******************************************************************************************************************************************************************************************************************************************************

IRAM_ATTR static bool async_memcpy_impl_rx_eof_callback(gdma_channel_handle_t dma_chan, gdma_event_data_t *event_data, void *user_data)
{
    async_memcpy_impl_t *mcp_impl = (async_memcpy_impl_t *)user_data;
    mcp_impl->rx_eof_addr = event_data->rx_eof_desc_addr;
    // clear EOF_FLAG **********************************************************
    //gdma_ll_rx_clear_interrupt_status(dma_chan, 0, GDMA_LL_RX_STATUS_EOF_FLAG);
    //****************************************************************************
    token=1;
    async_memcpy_isr_on_rx_done_event(mcp_impl);
    return true;//mcp_impl->isr_need_yield;
}

//******************************************************************************************************************************************************************************************************************************************************

esp_err_t async_memcpy_impl_init(async_memcpy_impl_t *impl)
{
    esp_err_t ret = ESP_OK;
    // create TX channel and reserve sibling channel for future use
    gdma_channel_alloc_config_t tx_alloc_config = 
    {
        .sibling_chan   = NULL, 
        .direction      = GDMA_CHANNEL_DIRECTION_TX,
        .flags = 
        {
            .reserve_sibling = 0
        }
    };

    ret = gdma_new_channel(&tx_alloc_config, &impl->tx_channel);
    if (ret != ESP_OK) 
    {
        return ret;
    }

    // create RX channel and specify that it should reside in the same pair as TX
    gdma_channel_alloc_config_t rx_alloc_config = 
    {
        .sibling_chan   = impl->tx_channel,
        .direction      = GDMA_CHANNEL_DIRECTION_RX,
        .flags = 
        {
            .reserve_sibling = 0
        }
    };
    ret = gdma_new_channel(&rx_alloc_config, &impl->rx_channel);
    if (ret != ESP_OK) 
    {
        return ret;
    }

    gdma_trigger_t m2m_trigger = GDMA_MAKE_TRIGGER(GDMA_TRIG_PERIPH_M2M, 0);
    // get a free DMA trigger ID for memory copy
    uint32_t free_m2m_id_mask = 0;
    gdma_get_free_m2m_trig_id_mask(impl->tx_channel, &free_m2m_id_mask);
    m2m_trigger.instance_id = __builtin_ctz(free_m2m_id_mask);
    gdma_connect(impl->rx_channel, m2m_trigger);
    gdma_connect(impl->tx_channel, m2m_trigger);

    gdma_strategy_config_t strategy_config = 
    {
        .owner_check = true,
        .auto_update_desc = true,
    };
    
    gdma_transfer_ability_t transfer_ability = 
    {
        .sram_trans_align   = 4,//impl->sram_trans_align,
        .psram_trans_align  = 32,//impl->psram_trans_align,
    };

    ret = gdma_set_transfer_ability(impl->tx_channel, &transfer_ability);
    if (ret != ESP_OK) 
    {
        return ret;
    }

    ret = gdma_set_transfer_ability(impl->rx_channel, &transfer_ability);
    if (ret != ESP_OK) 
    {
        return ret;
    }
    
    gdma_apply_strategy(impl->tx_channel, &strategy_config);
    gdma_apply_strategy(impl->rx_channel, &strategy_config);

#if SOC_APM_SUPPORTED
    // APM strategy: trusted mode
    // TODO: IDF-5354 GDMA for M2M usage only need read and write permissions, we should disable the execute permission by the APM controller
    apm_tee_ll_set_master_secure_mode(APM_LL_MASTER_GDMA + m2m_trigger.instance_id, APM_LL_SECURE_MODE_TEE);
#endif // SOC_APM_SUPPORTED

    gdma_rx_event_callbacks_t cbs = 
    {
        .on_recv_eof = async_memcpy_impl_rx_eof_callback
    };

    gdma_register_rx_event_callbacks(impl->rx_channel, &cbs, impl);
    return ret;
}

//******************************************************************************************************************************************************************************************************************************************************

esp_err_t async_memcpy_impl_deinit(async_memcpy_impl_t *impl)
{
    gdma_disconnect(impl->rx_channel);
    gdma_disconnect(impl->tx_channel);
    gdma_del_channel(impl->rx_channel);
    gdma_del_channel(impl->tx_channel);
    return ESP_OK;
}

//******************************************************************************************************************************************************************************************************************************************************

esp_err_t async_memcpy_impl_start(async_memcpy_impl_t *impl, intptr_t outlink_base, intptr_t inlink_base)
{
    gdma_start(impl->rx_channel, inlink_base);
    gdma_start(impl->tx_channel, outlink_base);
    return ESP_OK;
}

//******************************************************************************************************************************************************************************************************************************************************

esp_err_t async_memcpy_impl_stop(async_memcpy_impl_t *impl)
{
    gdma_stop(impl->rx_channel);
    gdma_stop(impl->tx_channel);
    return ESP_OK;
}

//******************************************************************************************************************************************************************************************************************************************************

esp_err_t async_memcpy_impl_restart(async_memcpy_impl_t *impl)
{
    gdma_append(impl->rx_channel);
    gdma_append(impl->tx_channel);
    return ESP_OK;
}

//******************************************************************************************************************************************************************************************************************************************************

esp_err_t async_memcpy_impl_new_etm_event(async_memcpy_impl_t *impl, async_memcpy_etm_event_t event_type, esp_etm_event_handle_t *out_event)
{
    if (event_type == ASYNC_MEMCPY_ETM_EVENT_COPY_DONE) {
        // use the RX EOF to indicate the async memcpy done event
        gdma_etm_event_config_t etm_event_conf = {
            .event_type = GDMA_ETM_EVENT_EOF,
        };
        return gdma_new_etm_event(impl->rx_channel, &etm_event_conf, out_event);
    } else {
        return ESP_ERR_NOT_SUPPORTED;
    }
}

//******************************************************************************************************************************************************************************************************************************************************

bool async_memcpy_impl_is_buffer_address_valid(async_memcpy_impl_t *impl, void *src, void *dst)
{
    bool valid = true;
    if (esp_ptr_external_ram(dst)) {
        if (impl->psram_trans_align) {
            valid = valid && (((intptr_t)dst & (impl->psram_trans_align - 1)) == 0);
        }
    } else {
        if (impl->sram_trans_align) {
            valid = valid && (((intptr_t)dst & (impl->sram_trans_align - 1)) == 0);
        }
    }
    return valid;
}

//******************************************************************************************************************************************************************************************************************************************************

extern "C" void app_main(void)
{
    esp_err_t ret;
    async_memcpy_impl_t M2M_gmda_test;
    // gdma_tx_channel_t tx_ch;
    // gdma_rx_channel_t rx_ch;
    char source_buf[] = 
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis faucibus, ex nec finibus cursus, magna sem cursus est, ac interdum elit mi \n"
    "et tellus. Sed augue odio, interdum vitae pharetra id, tristique sed urna. Ut at accumsan dolor, vel pharetra metus. Cras ac neque eleifend, cursus purus nec,\n"
    "ullamcorper leo. Sed arcu lacus, placerat id blandit non, scelerisque in elit. Fusce a blandit diam. Mauris gravida consequat purus, finibus dictum neque\n" 
    "condimentum eu. Nulla consectetur arcu vel vehicula imperdiet. Phasellus placerat, libero vel vulputate dapibus, lectus magna molestie sapien, eget viverra\n" 
    "mauris orci sed ipsum. Ut ac nisi urna. Donec fermentum tellus semper, malesuada arcu nec, finibus velit. Praesent blandit elit vel consectetur rutrum.\n" 
    "Vivamus lorem justo, volutpat at nulla ut, gravida pretium purus. Morbi pharetra leo tincidunt convallis ultricies. In metus velit, volutpat in suscipit eu,\n" 
    "lobortis vitae lectus.\n";
    // "Duis elementum dignissim feugiat. Nunc imperdiet dolor metus. Ut fermentum metus sed turpis rhoncus venenatis. Cras mi quam, scelerisque id enim eget, aliquet\n" 
    // "mattis erat. Sed euismod varius leo ac porttitor. Integer gravida sapien pharetra ornare ullamcorper. Quisque non eros bibendum, vulputate justo in, mattis\n"
    // "lorem. Nullam molestie nunc tellus, ac viverra massa tempor at. Donec nec justo lacus. Sed consectetur facilisis justo, ac suscipit urna pharetra quis.\n" 
    // "Pellentesque maximus imperdiet mauris id vulputate. Interdum et malesuada fames ac ante ipsum primis in faucibus.\n"
    // "Interdum et malesuada fames ac ante ipsum primis in faucibus. Morbi id tempor ipsum. Donec erat metus, facilisis eu orci ut, condimentum tempus sapien.\n" 
    // "Donec congue, lectus nec gravida facilisis, arcu sem vehicula dolor, non finibus nisi ante eu lectus. Fusce cursus iaculis pellentesque. Nullam euismod,\n" 
    // "enim non varius egestas, lectus urna tempus dolor, non placerat velit leo quis augue. Etiam arcu massa, ullamcorper quis ligula in, efficitur efficitur neque.\n" 
    // "Cras porta sem eu ante gravida, in hendrerit risus maximus integer.\n";
    

    char receiv_buf[sizeof(source_buf)] = {"0"};//(char *)malloc(sizeof(source_buf)*sizeof(char));

    ret = async_memcpy_impl_init(&M2M_gmda_test);
    if(ret==ESP_OK)
    {
        printf("M2M GDMA init successful! \n");
    }
    else
    {
        printf("Error in M2M GDMA init! \n");
    }

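    // Zero-initialize so the reserved and error bits do not hold stack garbage
    dma_descriptor_t desc_tx = {};
    dma_descriptor_t desc_rx = {};

    // GDMA descriptor tx
    desc_tx.dw0.owner = DMA_DESCRIPTOR_BUFFER_OWNER_DMA;
    desc_tx.dw0.suc_eof = 1; // Last descriptor
    desc_tx.dw0.size = sizeof(source_buf);   // 12-bit field: the buffer must be at most 4095 bytes
    desc_tx.dw0.length = sizeof(source_buf);
    desc_tx.buffer = source_buf;
    desc_tx.next = NULL; // single descriptor: terminate the list instead of pointing back at itself


    // GDMA descriptor rx
    desc_rx.dw0.owner = DMA_DESCRIPTOR_BUFFER_OWNER_DMA;
    desc_rx.dw0.suc_eof = 1; // Last descriptor
    desc_rx.dw0.size = sizeof(receiv_buf);   // 12-bit field: the buffer must be at most 4095 bytes
    desc_rx.dw0.length = sizeof(receiv_buf);
    desc_rx.buffer = receiv_buf;
    desc_rx.next = NULL; // single descriptor: terminate the list instead of pointing back at itself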

    async_memcpy_impl_start(&M2M_gmda_test, (intptr_t)&desc_tx, (intptr_t)&desc_rx);
    //create_task(&M2M_gmda_test);
    while(true)
    {
        // printf("source  buffer = %s \n",source_buf);
        // printf("receive buffer = %s \n",receiv_buf);
        //printf("size of source  buffer = %d \n",sizeof(source_buf));
        if(token==1)
        {
            printf("size of receive buffer = %u \n\n",(unsigned)sizeof(receiv_buf)); // sizeof yields size_t, so cast to match %u
            token=0;
        }
    }
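    // Not reached while the loop above never exits; kept to show the teardown order.
    // source_buf and receiv_buf are stack arrays, so they must not be passed to free().
    async_memcpy_impl_stop(&M2M_gmda_test);   // stop the channels before deleting them
    async_memcpy_impl_deinit(&M2M_gmda_test);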
}

// //******************************************************************************************************************************************************************************************************************************************************

// void create_task(async_memcpy_impl_t *impl)
// {
//     TaskHandle_t xHandle = NULL;

//     xTaskCreatePinnedToCore(vSemaphoreTask,                 // TaskFunction_t pvTaskCode            -> the task function goes here: void __xTask_function(void *arg)__
//                             "vSemaphoreTask",               // const char *constpcName              -> an arbitrary text name to identify the task, e.g. __"Demo Task"__
//                             ( uint32_t)1024,                   // const uint32_t usStackDepth,         -> size of the task stack buffer
//                             (void*)impl,                    // void *constpvParameters,             -> parameters __(*arg)__ passed to the function __xTask_function(void *arg)__
//                             1,                              // UBaseType_t uxPriority,              -> task priority
//                             &xHandle,                       // TaskHandle_t *constpvCreatedTask,    -> task handle
//                             0);                             // const BaseType_t xCoreID             -> on a multi-core processor the task can be pinned to core0 __(0)__ or core1 __(1)__
// }

//****************************************************************************************************************************************************************************************************************************************************** 
But now this ISR routine

Code:

IRAM_ATTR static bool async_memcpy_impl_rx_eof_callback(gdma_channel_handle_t dma_chan, gdma_event_data_t *event_data, void *user_data)
{
    async_memcpy_impl_t *mcp_impl = (async_memcpy_impl_t *)user_data;
    mcp_impl->rx_eof_addr = event_data->rx_eof_desc_addr;
    // clear EOF_FLAG **********************************************************
    //gdma_ll_rx_clear_interrupt_status(dma_chan, 0, GDMA_LL_RX_STATUS_EOF_FLAG);
    //****************************************************************************
    token=1;
    async_memcpy_isr_on_rx_done_event(mcp_impl);
    return true;//mcp_impl->isr_need_yield;
}
Specifically, the line "async_memcpy_isr_on_rx_done_event(mcp_impl);" is causing the problem.
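
The descriptor setup in the code above gives each descriptor the whole buffer, which only works while the buffer fits in the descriptor's size field (12 bits, so at most 4095 bytes, as far as I understand the dma_descriptor_t layout). For larger buffers the data has to be split across a chain of descriptors. The host-side sketch below shows one way to build such a chain. The sketch_desc_t type and build_tx_chain() are hypothetical stand-ins for illustration, not the ESP-IDF types or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Largest byte count a 12-bit size field can hold. */
#define SKETCH_DESC_MAX_BYTES 4095u

/* Hypothetical, simplified descriptor; the hardware layout packs these fields into bitfields. */
typedef struct sketch_desc {
    uint32_t size;             /* capacity of the buffer in bytes */
    uint32_t length;           /* number of valid bytes in the buffer */
    bool suc_eof;              /* set on the final descriptor only */
    bool owned_by_dma;         /* true while the DMA engine may use it */
    uint8_t *buffer;           /* start of this descriptor's chunk */
    struct sketch_desc *next;  /* NULL terminates the chain */
} sketch_desc_t;

/*
 * Split buf[0..len) across descs[] in chunks of at most SKETCH_DESC_MAX_BYTES.
 * Returns the number of descriptors used, or 0 if len is 0 or n_descs is too small.
 */
static size_t build_tx_chain(sketch_desc_t *descs, size_t n_descs,
                             uint8_t *buf, size_t len)
{
    assert(descs != NULL && buf != NULL);
    size_t needed = (len + SKETCH_DESC_MAX_BYTES - 1u) / SKETCH_DESC_MAX_BYTES;
    if (needed == 0 || needed > n_descs) {
        return 0;
    }
    for (size_t i = 0; i < needed; i++) {
        size_t chunk = len > SKETCH_DESC_MAX_BYTES ? SKETCH_DESC_MAX_BYTES : len;
        bool last = (i + 1 == needed);
        descs[i].size = (uint32_t)chunk;
        descs[i].length = (uint32_t)chunk;
        descs[i].suc_eof = last;
        descs[i].owned_by_dma = true;
        descs[i].buffer = buf;
        descs[i].next = last ? NULL : &descs[i + 1];
        buf += chunk;
        len -= chunk;
    }
    return needed;
}
```

A 10000-byte buffer, for example, needs three descriptors: two full 4095-byte chunks and a final 1810-byte chunk that carries the EOF flag.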

jprpower104
Posts: 16
Joined: Tue Jul 18, 2023 3:59 am

Re: esp32s3 Cache and DMA

Postby jprpower104 » Mon Sep 25, 2023 9:49 pm

Inside "async_memcpy_isr_on_rx_done_event(mcp_impl);" I captured the attached screenshot. The next line of code is portENTER_CRITICAL_ISR(&asmcp->spinlock); and asmcp has all its data, but in

static inline void __attribute__((always_inline)) vPortEnterCritical(portMUX_TYPE *mux)
{
    xPortEnterCriticalTimeout(mux, portMUX_NO_TIMEOUT);
}

mux contains only null values.
Attachments
Captura de pantalla 2023-07-24 160849.png (317.9 KiB)
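
One possible explanation for the invalid spinlock, offered as an assumption and not a confirmed diagnosis: if async_memcpy_isr_on_rx_done_event() recovers its driver context from the impl pointer with a container-of style macro, it expects impl to be a member embedded in a larger driver structure. In the code posted above, M2M_gmda_test is a standalone stack variable, so the recovered pointer would refer to whatever memory happens to precede it. The sketch below reproduces the effect with hypothetical types (sketch_impl_t, sketch_context_t); it is not the ESP-IDF source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical low-level state, standing in for the impl structure. */
typedef struct {
    intptr_t rx_eof_addr;
} sketch_impl_t;

/* Hypothetical driver context that embeds the impl as one of its members. */
typedef struct {
    uint32_t spinlock;       /* stands in for the real lock object */
    int pending;
    sketch_impl_t mcp_impl;  /* callbacks receive a pointer to this member */
} sketch_context_t;

/* Step back from a member pointer to the structure that contains it. */
#define SKETCH_CONTAINER_OF(ptr, type, member) \
    ((type *)((uintptr_t)(ptr) - offsetof(type, member)))

/* Recover the context; only meaningful if impl really is ctx->mcp_impl. */
static sketch_context_t *context_from_impl(sketch_impl_t *impl)
{
    assert(impl != NULL);
    return SKETCH_CONTAINER_OF(impl, sketch_context_t, mcp_impl);
}
```

When impl is &ctx.mcp_impl, context_from_impl() returns &ctx and the spinlock is valid. When impl is a standalone sketch_impl_t, the result points offsetof(sketch_context_t, mcp_impl) bytes before it, at memory that was never a context, so any lock read through it is garbage. If this assumption holds, the fix would be to use the public async memcpy driver so that it allocates the context itself, or to avoid calling async_memcpy_isr_on_rx_done_event() from a hand-built impl.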

jprpower104
Posts: 16
Joined: Tue Jul 18, 2023 3:59 am

Re: esp32s3 Cache and DMA

Postby jprpower104 » Tue Oct 03, 2023 4:07 pm

Can anyone give me some support?
