ESP32 Forum

Posted: **Fri Oct 26, 2018 2:18 pm**

Edit: changed topic title

I wrote an assembly-based RGB color mixing algorithm that blends two color values (one of which has an alpha component) into a single RGB value, following the usual alpha blending rules.
The goal was to make it as lightweight as possible.
I tried compiling it with xtensa/corebits.h and specreg.h included, but the compiler seems unable to resolve the register names. It doesn't get past the clobbered register list and stops at ACCLO, even if I remove the %.

Which includes should I put in there to be able to go a bit further?

(this does not contain the file headers)

Code:

/**
  	@brief 
	The equation below is derived from the original color mixing equation, to avoid division and fractions.
	!!!This equation assumes that the background is always opaque!!!
	Accuracy is very good: 95% of the time all 8 bits of the color components are the same.
	A difference can occur in the LSB, but that is barely noticeable.
	
	This function should evaluate the following equation (with order of execution)
	________III.________
				_____II._____
				___I.___     __IV.__
	((fg*fg.A)+(256-fg.A)*bg)/256
	The final division by 256 is replaced by a right shift by 8.
	Numerical testing shows this method is accurate enough.
	NO need to use slow floating point here.
  
	Where fg is the layer's color component (R, G or B) and bg is the background color component.
	Alpha is the alpha component of the layer; bg has no alpha (it is opaque).

	@param fg - Layer color with alpha channel, in ARGB8888 format
	@param bg - Background color to blend onto, in RGB888 format
	
	@return result - RGB888 color
 **/
uint32_t cAlphaBlendARGB8888(uint32_t *fg, uint32_t *bg)
{
  //%0: result (output)

  //Register usage
  //A0: used bitmask to mask out color values
  //A1: fg.A
  //A2: 256-fg.A
  //A3: fg color
  //A4: bg color
  //A5: intermediate masked result for FG
  //A6: intermediate masked result for BG
  //A7: BLUE
  //A8: GREEN
  //A9: RED
  
 	uint32_t result = 0; 
	 __asm__ __volatile__ (
	 "MOV %%a3, fg"			//loads fgcolor to A3 - this has Alpha value in bits 31-24.
	 "MOV %%a4, bg"			//loads bgcolor to A4 - Bits 31-24 is don't care in this function
	 //ALPHA MIXING
	 "MOVI %%a0, $0xFF000000"		//Loads Alpha mask
	 "AND %%a1, %%a3, %a0"	//ANDs mask(A0) and fg(A3) for ALPHA components of layer, stores it in A1
	 "SRLI %%a1, %%a1, 24"
	 "MOVI %%a0, $256"	
	 "SUB %%a2, %%a0, %%a1"	//256-fg.A - I.
	 //BLUE (Here mask is already set to 0x000000FF when we loaded A0 with 256)
	 "AND %%a5, %%a3, %a0"	//ANDs mask and fg for Blue components of layer, stores it in A5
	 "AND %%a6, %%a4, %a0"	//ANDs mask and bg for Blue components of background, stores it in A6
	 "MUL.AA.HH %%a1, %%a5"	//1st multiplication -  II.
	 "MULA.AA.HH %%a2, %%a6"	//2nd multiplication, result in ACCLO - III.
	 "SRLI %%a7, %%ACCLO, 8"	//store ACCLO result in A7 shifted right by 8 - IV.
	 //GREEN
	 "SLLI %%a0, 8"			//mask shifted for next component
	 "AND %%a5, %%a3, %a0"	//ANDs mask and fg for Green components of layer, stores it in A5
	 "AND %%a6, %%a4, %a0"	//ANDs mask and bg for Green components of background, stores it in A6
	 "MUL.AA.HH %%a1, %%a5"	//1st multiplication -  II.
	 "MULA.AA.HH %%a2, %%a6"	//2nd multiplication, result in ACCLO - III.
	 "SRLI %%a8, %%ACCLO, 8"	//store ACCLO result in A8 shifted right by 8 - IV.
	 //RED
	 "SLLI %%a0, 8"			//mask shifted for next component
	 "AND %%a5, %%a3, %a0"	//ANDs mask and fg for Red components of layer, stores it in A5
	 "AND %%a6, %%a4, %a0"	//ANDs mask and bg for Red components of background, stores it in A6
	 "MUL.AA.HH %%a1, %%a5"	//1st multiplication -  II.
	 "MULA.AA.HH %%a2, %%a6"	//2nd multiplication, result in ACCLO - III.
	 "SRLI %%a9, %%ACCLO, 8"	//store ACCLO result in A9 shifted right by 8 - IV.
	 //At this point all RGB components are stored in registers A7-A9, with color value stored in bit 7-0.
	 //The next lines format and generate the result .
	 "SLLI %%a9, %%a9, 16"
	 "SLLI %%a8, %%a8, 8"
	 "ADD %%a7, %%a7, %%a8"
	 "ADD %0, %%a7, %%a9" 
	 
	 : "=r" (result)
	 : 
	 : "%ACCLO", "%a0", "%a1", "%a2", "%a3", "%a4", "%a5", "%a6", "%a7", "%a8", "%a9");
	 
	return result;
}
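
For comparison, the equation from the comment block can be written in plain C, which gives a baseline to check any assembly version against. This is only a sketch: the names `blend_channel_ref` and `alpha_blend_argb8888_ref` are made up for illustration, and unlike the listing above it takes the colors by value rather than by pointer.

```c
#include <stdint.h>

/* Hypothetical reference helper: blends one 8-bit channel as
 * (fg*a + (256 - a)*bg) >> 8, i.e. the equation from the comment
 * block with the division by 256 done as a right shift. */
static uint32_t blend_channel_ref(uint32_t fg, uint32_t bg, uint32_t a)
{
    return (fg * a + (256u - a) * bg) >> 8;
}

/* Hypothetical reference blend: fg is ARGB8888, bg is RGB888
 * (bits 31-24 of bg are ignored). Returns an RGB888 value. */
uint32_t alpha_blend_argb8888_ref(uint32_t fg, uint32_t bg)
{
    uint32_t a = fg >> 24;  /* alpha lives in bits 31-24 of fg */

    uint32_t r = blend_channel_ref((fg >> 16) & 0xFFu, (bg >> 16) & 0xFFu, a);
    uint32_t g = blend_channel_ref((fg >> 8) & 0xFFu, (bg >> 8) & 0xFFu, a);
    uint32_t b = blend_channel_ref(fg & 0xFFu, bg & 0xFFu, a);

    return (r << 16) | (g << 8) | b;
}
```

With alpha 0 this returns the background unchanged. With alpha 255 a small part of the background still contributes, because the weights are `a` and `256 - a`, which is the LSB inaccuracy mentioned in the comment block.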

Posted: **Sun Oct 28, 2018 2:20 am**

ACCLO is a special function register - I'm not sure if gcc supports clobbering it, and you should read from and write to it using RSR/WSR.
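
As a rough illustration of the RSR approach, a single channel could be blended along the lines of the sketch below. It has not been tested on hardware and rests on several assumptions: the helper name `blend_channel_mac16` is made up, the clobber name `acc` is what I believe GCC calls the MAC16 accumulator on Xtensa, and the `.LL` instruction variants are used because the operands sit in the low 16 bits of each register. On non-Xtensa targets it falls back to plain C so the arithmetic can be checked on a PC.

```c
#include <stdint.h>

/* Hypothetical helper (not from the thread): blends one 8-bit channel as
 * (fg*alpha + (256 - alpha)*bg) >> 8. */
static inline uint32_t blend_channel_mac16(uint32_t fg, uint32_t bg, uint32_t alpha)
{
    uint32_t inv = 256u - alpha;
#if defined(__XTENSA__)
    uint32_t acc;
    __asm__ __volatile__(
        "mul.aa.ll  %1, %2\n\t"     /* ACC  = fg * alpha (low 16 bits of each) */
        "mula.aa.ll %3, %4\n\t"     /* ACC += inv * bg */
        "rsr        %0, acclo\n\t"  /* copy ACCLO into a general register */
        : "=r"(acc)
        : "r"(fg), "r"(alpha), "r"(inv), "r"(bg)
        : "acc");                   /* assumption: GCC's name for the MAC16 accumulator */
    return acc >> 8;
#else
    /* Portable fallback with the same arithmetic. */
    return (fg * alpha + inv * bg) >> 8;
#endif
}
```

Passing the values in as operands lets the compiler choose the registers, which avoids hard-coding a0 and a1; in the Xtensa calling convention those hold the return address and the stack pointer. Each instruction string also ends in `\n\t`, since adjacent string literals are otherwise concatenated into one long line for the assembler.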

Posted: **Mon Oct 29, 2018 7:42 am**

Thanks for the feedback. I'm gonna modify it and hope for the best.

Though I'm still not entirely sure which includes I should use when writing assembly. I assume specreg.h is enough, because GCC should know all the other registers from the Xtensa toolchain?

Posted: **Tue Oct 30, 2018 4:18 am**

No idea, sorry, I haven't done more than a few short snippets of inline assembly (I usually use separate .S-files.)

ASM code not compiling
