Hello
If you use an integrated peripheral, for example a PWM generator, its outputs are not related to the state of the GPIO_OUT_REG register, so it does not matter what you write to that register (the pin is no longer driven by GPIO_OUT_REG but by the corresponding output of the PWM generator).
To solve your problem you could also use the SPI modules in quad/octal mode, or the LCD_CAM module.
Both allow parallel data output without the data lines getting out of sync.
Changing the state of group of output pins within the same clock cycle
Well, you can also have other pins (outside the bundle) configured as outputs, for example to control other external peripherals. Writing a full 32-bit value into GPIO_OUT_REG would affect them too, unless you first read GPIO_OUT_REG, mask it with the inverse of your bundle mask to preserve those other states, OR in the new value of your bundle, and write the result back into GPIO_OUT_REG.
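The read, mask, OR and write-back steps can be sketched in C like this. It is only a host-side sketch: `merge_bundle` is a hypothetical helper name, and the register value is passed in as a plain `uint32_t` rather than read from the real GPIO_OUT_REG.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: computes the value to write back to the output
 * register so that only the bits selected by bundle_mask change. */
static uint32_t merge_bundle(uint32_t current, uint32_t bundle_mask,
                             uint32_t bundle_value)
{
    /* The new value must not carry bits for pins outside the bundle. */
    assert((bundle_value & ~bundle_mask) == 0);
    /* Clear the bundle bits, keep all other bits, then OR in the new ones. */
    return (current & ~bundle_mask) | bundle_value;
}
```

On the chip, `current` would be the value just read from GPIO_OUT_REG and the return value would be stored back to it.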
Sure, the ESP32-S3 (which uses the Tensilica Xtensa LX7 core) has a fast (dedicated) GPIO module which can "pump" up to 8 bits into the output port without first reading the state of the port. The hardware itself guarantees that the whole operation is atomic, with no need to disable interrupts or stall the other core. New assembler instructions from the extended set, such as EE.WR_MASK_GPIO_OUT or EE.SET_BIT_GPIO_OUT, are responsible for that.
However, on the older ESP32 (not the S3 version), which uses the Tensilica Xtensa LX6 core, a dedicated GPIO module is not implemented. So the only way to "pump" a group of bits into GPIO_OUT_REG (within the same cycle) is to read GPIO_OUT_REG, mask it with the inverted mask of your bundle (to preserve the bits that are not part of the bundle and clear the bits that are), OR in the new values, and finally write the computed value back to GPIO_OUT_REG. But that sequence of instructions is not atomic. The compiled code has several assembler instructions between the moment the register is read and the moment the final result is stored back.
Now suppose a peripheral raises an interrupt right after the instruction that fetches the current value of GPIO_OUT_REG into a CPU register, and an ISR is called. In that ISR some function performs an output operation on some external hardware, for example it turns ON an LED, which changes the value of GPIO_OUT_REG. Once the ISR completes, execution continues after the interrupted instruction. But now your variable (CPU register) holding the sampled value of GPIO_OUT_REG contains the OLD value. When the masking is done and the result is stored back into GPIO_OUT_REG, the new value for the bundle will be correct, but the pin which controls the LED will be set to its OLD state (which means the LED is turned off again).
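The lost update described above can be replayed on a PC with a simulated register. This is only an illustration: `BUNDLE_MASK`, `LED_BIT`, `fake_isr` and `lost_update_demo` are made-up names, and the "ISR" is just a function called by hand at the unlucky moment.

```c
#include <assert.h>
#include <stdint.h>

#define BUNDLE_MASK 0x000000FFu   /* hypothetical 8-pin bundle */
#define LED_BIT     (1u << 16)    /* hypothetical LED pin */

/* Stand-in for the ISR: turns the LED on in the simulated register. */
static void fake_isr(uint32_t *out_reg)
{
    *out_reg |= LED_BIT;
}

/* Replays the interleaving step by step and returns the final value of
 * the simulated register. */
static uint32_t lost_update_demo(uint32_t initial, uint32_t bundle_value)
{
    assert((initial & LED_BIT) == 0);    /* the LED starts switched off */
    uint32_t out_reg = initial;

    uint32_t sampled = out_reg;          /* 1. main code reads the register */
    fake_isr(&out_reg);                  /* 2. "interrupt": LED bit gets set */
    sampled = (sampled & ~BUNDLE_MASK)   /* 3. mask and OR on the stale copy */
            | (bundle_value & BUNDLE_MASK);
    out_reg = sampled;                   /* 4. stale copy is written back */

    return out_reg;                      /* bundle is right, LED bit is lost */
}
```

After `lost_update_demo(0, 0x5A)` the bundle bits hold 0x5A but `LED_BIT` is clear again, which is exactly the "LED turns off" effect.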
The problem can be partially solved by disabling ALL interrupts (and also stalling the other core on systems with two cores) while performing the READ-MODIFY-WRITE operation on GPIO_OUT_REG. That is the only way you can guarantee the integrity of the whole operation. But there is a penalty: while interrupts are disabled, you delay other peripherals which need a precise response from their ISR routines.
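A rough sketch of wrapping the read-modify-write in a critical section is shown below. It runs on a PC against a simulated register; `enter_critical`, `exit_critical`, `fake_gpio_out` and `write_bundle_locked` are hypothetical names. On ESP-IDF the two hooks would typically map to `portENTER_CRITICAL(&lock)` and `portEXIT_CRITICAL(&lock)` on a `portMUX_TYPE` spinlock, which as far as I know holds off the other core only when it tries to take the same lock.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the platform's critical-section calls.
 * Here they only count nesting so the sketch can run on a PC. */
static int critical_depth = 0;
static void enter_critical(void) { critical_depth++; }
static void exit_critical(void)  { critical_depth--; }

static uint32_t fake_gpio_out = 0;   /* simulated GPIO_OUT_REG */

/* Updates only the bits in mask and returns the new register value. */
static uint32_t write_bundle_locked(uint32_t mask, uint32_t value)
{
    enter_critical();                 /* nothing may interleave from here */
    uint32_t v = fake_gpio_out;       /* read */
    v = (v & ~mask) | (value & mask); /* modify */
    fake_gpio_out = v;                /* write */
    exit_critical();                  /* keep the locked region this short */

    assert(critical_depth == 0);      /* enter/exit calls must be balanced */
    return v;
}
```

Keeping only the three register steps inside the locked region limits the extra latency seen by other ISRs.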
Theoretically, on the older ESP32 (LX6 core) it would be possible to affect only 8 bits inside GPIO_OUT_REG:
1. if the bundle is exactly 8 bits wide,
2. and if those 8 bits are exactly BYTE aligned inside GPIO_OUT_REG (only 4 possible cases),
3. and if there is an assembler instruction that can write a single BYTE to an address that is not 4-byte aligned, forcing a cache writeback.
In that case the byte value could be "pumped" into GPIO_OUT_REG without first reading its state.
The Xtensa ISA documentation for LX6 lists the S8I instruction, which writes a BYTE to memory at a base address plus an imm8 (byte) offset given as the third operand. Technically that covers all 4 possible cases (bundle masks 0x000000FF, 0x0000FF00, 0x00FF0000 and 0xFF000000).
So, for a little-endian memory model, the write for each of those 4 cases would look something like:
MOVI A2, 8_bit_value // zero-extended 8-bit value in a 32-bit register
MOVI A3, GPIO_OUT_REG // register address (A0/A1 are avoided: return address and stack pointer)
S8I A2, A3, 0 // for case 0x000000FF
/*
S8I A2, A3, 1 // for case 0x0000FF00
S8I A2, A3, 2 // for case 0x00FF0000
S8I A2, A3, 3 // for case 0xFF000000
*/
// followed by an instruction to force the cache to write back to memory
DHWBI A3, 0
That would work if the cache is implemented to "see" the virtual address space too (where the GPIO registers reside). However, if the internal cache sees only the real memory map, it would be necessary to remap the address of GPIO_OUT_REG so that it is "visible" in the main core address space, and then perform similar operations.
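In C the same byte-lane idea would look roughly like the sketch below. It writes into a simulated register on a little-endian PC, where the lane index equals the byte offset used with S8I. `fake_gpio_out` and `write_byte_lane` are hypothetical names, and whether the real ESP32 peripheral bus accepts byte-wide stores to GPIO_OUT_REG is an assumption that I have not verified on hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated 32-bit output register; NOT the real GPIO_OUT_REG. */
static uint32_t fake_gpio_out = 0;

/* Hypothetical helper: stores one byte into byte lane 0..3 of the
 * simulated register without reading the register first, and returns
 * the resulting 32-bit value. Assumes a little-endian machine. */
static uint32_t write_byte_lane(unsigned lane, uint8_t value)
{
    assert(lane < 4);                 /* only 4 byte-aligned bundles exist */
    volatile uint8_t *bytes = (volatile uint8_t *)&fake_gpio_out;
    bytes[lane] = value;              /* single byte store, like S8I */
    return fake_gpio_out;
}
```

Lane 0 corresponds to the 0x000000FF bundle and lane 3 to 0xFF000000; the other three bytes are never read or rewritten by the store.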
cheers