Integer division performance ????

Baldhead
Posts: 468
Joined: Sun Mar 31, 2019 5:16 am

Re: Integer division performance ????

Postby Baldhead » Tue Dec 10, 2019 11:24 pm

Hi ESP_Angus,


For now I think it's good the way it is.

From 4.1 us to 3 us (-Og compiler optimization).
From 4.1 us to 1.9 us (-Os compiler optimization).


Thanks for the help.

Postby Baldhead » Tue Dec 10, 2019 11:46 pm

Hi ESP_Angus,

Below is what I'm doing.

This is called only at startup.

Code: Select all

static DMA_ATTR uint32_t buf_a[ 15360 ];
static DMA_ATTR lldesc_t dma_desc_buf_a[ 16 ];

static void init_dma_descriptors_a( void )
{
    for ( uint32_t i = 0 ; i < descriptor_size ; i++ )    // descriptor_size = 16
    {
        dma_desc_buf_a[i].size = 4092;      // buffer size in bytes (1023 words of 32 bits).
        dma_desc_buf_a[i].length = 0;
        dma_desc_buf_a[i].offset = 0;
        dma_desc_buf_a[i].sosf = 0;
        dma_desc_buf_a[i].eof = 1;          // marks the last node of the linked list.
        dma_desc_buf_a[i].owner = 1;        // the allowed operator is the DMA controller.
        dma_desc_buf_a[i].buf = (uint8_t*) ( ( &buf_a[0] ) + ( 1023 * i ) );    // steps 1023 words (4092 bytes) per node.

        if ( i == descriptor_size - 1 )
        {
            dma_desc_buf_a[i].qe.stqe_next = ( lldesc_t* ) NULL;
        }
        else
        {
            dma_desc_buf_a[i].qe.stqe_next = ( lldesc_t* ) &dma_desc_buf_a[i+1];
        }
    }
}

Code: Select all

Here I eliminated this instruction on the last node of every transfer:
dma_desc_buf_a[ i ].qe.stqe_next = ( lldesc_t* ) NULL;

I am only using:
dma_desc_buf_a[ i ].eof = 1;

This is called every time you want to send data through DMA.

Code: Select all

// len appears to count 32-bit words: the 15360 limit matches the element count of buf_a,
// and 4 * len is then the transfer length in bytes.
// With len = 15360 the function takes 3 us (-Og compiler optimization) or 1.9 us (-Os compiler optimization).
static inline int fill_dma_descriptor_a ( uint32_t len )
{
    uint32_t length;

    if ( len > 15360 )  return -1;
    if ( len == 0 )     return -2;

    length = 4 * len;    // ( 4 * len ) converts from 32-bit words to bytes (8 bits).

    if ( length <= 4092 )    // Only a single descriptor is needed.
    {
        dma_desc_buf_a[0].length = length;
        dma_desc_buf_a[0].eof = 1;    // marks the last node of the linked list.

        return 1;
    }

    // length > 4092: more than a single descriptor is needed.

    uint32_t fullBufferNum = length / 4092;         // descriptors that are filled completely.
    uint32_t remainderBufferNum = length % 4092;    // bytes left over for one more descriptor.

    for ( uint32_t i = 0 ; i < fullBufferNum ; i++ )
    {
        dma_desc_buf_a[i].length = 4092;
        dma_desc_buf_a[i].eof = 0;    // not the last node of the linked list.
    }

    if ( remainderBufferNum == 0 )    // Remainder of the division is 0: no extra descriptor is needed.
    {
        dma_desc_buf_a[ fullBufferNum - 1 ].eof = 1;    // marks the last node of the linked list.

        return 2;
    }
    else    // One more (statically allocated) descriptor is needed.
    {
        dma_desc_buf_a[fullBufferNum].length = remainderBufferNum;
        dma_desc_buf_a[fullBufferNum].eof = 1;    // marks the last node of the linked list.

        return 3;
    }
}
In my tests, this configuration is working.

Do you think it can cause any kind of problem?

My next step will be to implement the interrupt.
Can that cause any problems?


Thanks for your help.
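
As a cross-check of the division and remainder used in fill_dma_descriptor_a, here is a minimal host-side sketch in plain C that computes how many descriptors a transfer needs. It assumes the argument counts 32-bit words, as the multiplication by 4 in the posted code suggests. The names `split_t`, `split_transfer` and `CHUNK_BYTES` are hypothetical and not part of ESP-IDF; the 4092-byte chunk size and the 15360 limit are taken from the posted code.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_BYTES 4092u   /* per-descriptor payload used in the posted code */

/* Hypothetical helper type: result of splitting a transfer into chunks. */
typedef struct {
    uint32_t full;       /* descriptors carrying exactly CHUNK_BYTES */
    uint32_t remainder;  /* bytes in the final partial descriptor, 0 if none */
    uint32_t total;      /* descriptors used overall */
} split_t;

/* Split a transfer of `words` 32-bit words into 4092-byte chunks. */
static split_t split_transfer(uint32_t words)
{
    assert(words > 0 && words <= 15360u);   /* same limits as the posted code */

    uint32_t bytes = 4u * words;            /* assumption: input is in 32-bit words */
    split_t s;
    s.full = bytes / CHUNK_BYTES;
    s.remainder = bytes % CHUNK_BYTES;
    s.total = s.full + (s.remainder != 0u ? 1u : 0u);
    return s;
}
```

For the full buffer of 15360 words (61440 bytes) this gives 15 full descriptors plus a 60-byte remainder, so all 16 entries of dma_desc_buf_a are used.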

ESP_Angus
Posts: 2344
Joined: Sun May 08, 2016 4:11 am

Postby ESP_Angus » Wed Dec 11, 2019 4:03 am

Baldhead wrote:
Tue Dec 10, 2019 11:46 pm
In my tests, this configuration is working.

Do you think it can cause any kind of problem?

Looks fine to me. Of course, I can't debug your driver for you, so something may still need changing here.

Baldhead wrote:
Tue Dec 10, 2019 11:46 pm
My next step will be to implement the interrupt.
Can that cause any problems?

You're asking me if code you haven't written yet might have a problem?

Postby Baldhead » Wed Dec 11, 2019 5:45 pm

Hi ESP_Angus,


"You're asking me if code you haven't written yet might have a problem?"

I would like to know whether I need both of these "instructions" on the last node of the linked list:

dma_desc_buf_a[ last node ].eof = 1;
dma_desc_buf_a[ last node ].qe.stqe_next = ( lldesc_t* ) NULL;

I am only using:

dma_desc_buf_a[ last node ].eof = 1;


In the first stage of my driver development I was initializing every field of my linked list, i.e. all lldesc_t fields of each node,
which took a long time to fill.
So I optimized my code, and I wonder whether this way is OK.


The last stage of my driver development is to implement the interrupt and 2-buffer sync plus tearing sync, which I think will give me a lot of headaches.


Thanks for your help.

Postby ESP_Angus » Thu Dec 12, 2019 3:46 am

Hi Baldhead,
Baldhead wrote:
Wed Dec 11, 2019 5:45 pm
I would like to know whether I need both of these "instructions" on the last node of the linked list:

dma_desc_buf_a[ last node ].eof = 1;
dma_desc_buf_a[ last node ].qe.stqe_next = ( lldesc_t* ) NULL;

Right, sorry I missed that. I checked with the peripheral teams: both fields need to be set. EOF=1 causes an EOF interrupt to be triggered when the descriptor is reached, but the DMA operation will continue until it reaches a descriptor whose next-descriptor pointer is NULL.
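
A minimal sketch of what that could look like, assuming a transfer uses the first `n` descriptors of a statically allocated array. To keep the example compilable on a host PC, `desc_t` is a simplified, hypothetical stand-in for `lldesc_t` (only `length`, `eof`, and a plain `next` pointer in place of `qe.stqe_next`), and `terminate_chain` is an illustrative name, not an ESP-IDF function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for lldesc_t, reduced to the fields discussed here. */
typedef struct desc {
    uint32_t length;      /* bytes to transfer from this descriptor */
    uint32_t eof;         /* 1 on the last descriptor of the transfer */
    struct desc *next;    /* plays the role of qe.stqe_next */
} desc_t;

/* Link descriptors 0..n-1 into a chain and terminate it on the last one. */
static void terminate_chain(desc_t *d, size_t n)
{
    assert(d != NULL && n > 0);

    for (size_t i = 0; i + 1 < n; i++) {
        d[i].eof = 0;
        d[i].next = &d[i + 1];   /* restore a link that a shorter transfer may have cut */
    }
    d[n - 1].eof = 1;        /* marks the end of the transfer (EOF) */
    d[n - 1].next = NULL;    /* stops the descriptor chain here */
}
```

Because the last node moves with the transfer length, the sketch also re-links the earlier nodes on every call; otherwise a NULL left behind by a shorter transfer would end a longer one too early.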

Postby Baldhead » Thu Dec 12, 2019 6:27 pm

Hi ESP_Angus,

"Right, sorry I missed that. I checked with the peripheral teams: both fields need to be set. EOF=1 causes an EOF interrupt to be triggered when the descriptor is reached, but the DMA operation will continue until it reaches a descriptor whose next-descriptor pointer is NULL."

Strange.
For me it works with only this "instruction": dma_desc_buf_a[ last node ].eof = 1.

However, I am not currently using the interrupt; I am using polling,
and only for testing purposes.

Thanks.
