Changing the state of group of output pins within the same clock cycle
In my hardware design I have an external 4-to-16 address decoder that requires 4 output pins from the ESP32-S3 for address selection; each decoder output targets one of 16 peripherals. If those 4 pins are not set within the same clock cycle, I will get transient, undefined behavior on the decoder outputs, which is not acceptable.
What is the best way to make a group of 4 pins change within the same clock cycle?
Re: Changing the state of group of output pins within the same clock cycle
Thank you for your answer.
I've already checked all the ESP32 registers involved in direct "port" manipulation that bypasses the GPIO matrix and uses only the IO MUX.
However, because managing a "pin bundle" takes more than one assembly instruction, neither method produces safe code. Here is why.
Problem with method 1:
A read-modify-write of the live register GPIO_OUT_REG cannot be done in a single assembly instruction. What does "live" mean here? Suppose you are using one of the integrated modules, say a PWM generator, which has its own timer and drives a sequence of pulses onto some output pads. In addition, you need a bundle of, say, 4 pins to control an external address decoder. Since GPIO_OUT_REG cannot be changed by a single assembly instruction, you have to write something like:
MOV reg1, ~bundle_mask // inverted mask: 0 on the bundle bits, 1 elsewhere
LD reg2, GPIO_OUT_REG // sample the current value of GPIO_OUT_REG
AND reg2, reg2, reg1 // clear the bundle bits, keep all other bits
OR reg2, reg2, new_value // insert the new bundle value
ST reg2, GPIO_OUT_REG // write the result back to GPIO_OUT_REG
As you can see, even if you disable interrupts while this code runs, the state of GPIO_OUT_REG is sampled by the 2nd instruction and the new state is written by the 5th. What happens if the PWM (or any other module that performs output operations) changes GPIO_OUT_REG after the 2nd instruction has executed? The next two instructions apply the mask for your bundle, but the 5th instruction, which stores the result back to GPIO_OUT_REG, overwrites the PWM pins (or any other outputs that changed after the 2nd instruction) with the old sampled value, causing the external peripheral that relies on those PWM signals to fail.
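The lost-update hazard described here can be reproduced on a host machine with a small model. This is only a sketch: `sim_gpio_out` is a plain variable standing in for GPIO_OUT_REG, and `set_bit20` is a hypothetical second writer (an ISR, the other core, and so on). Whether a given on-chip peripheral really writes GPIO_OUT_REG is the poster's premise and is not verified here.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated output register; a stand-in for GPIO_OUT_REG, not the real one. */
static uint32_t sim_gpio_out;

/* Optional hook run between the load and the store, modelling a second
 * writer that touches the register at the worst possible moment. */
typedef void (*intruder_fn)(void);

/* Read-modify-write of a bundle, split into its explicit steps. */
static void rmw_write_bundle(uint32_t mask, uint32_t value, intruder_fn intruder)
{
    assert((value & ~mask) == 0);          /* value must fit inside the mask */
    uint32_t sampled = sim_gpio_out;       /* load the current state         */
    if (intruder) intruder();              /* register changes behind us     */
    sampled = (sampled & ~mask) | value;   /* clear bundle bits, merge value */
    sim_gpio_out = sampled;                /* store the (now stale) result   */
}

/* Hypothetical second writer: sets bit 20. */
static void set_bit20(void) { sim_gpio_out |= (1u << 20); }

/* Returns 1 if the second writer's update survived the bundle write, else 0. */
static int bit20_survives_rmw(void)
{
    sim_gpio_out = 0;
    rmw_write_bundle(0xFu, 0x5u, set_bit20);
    return (int)((sim_gpio_out >> 20) & 1u);
}
```

In this model `bit20_survives_rmw()` returns 0: the bundle ends up holding 0x5, but the update to bit 20 is overwritten by the stale sample.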
Method 2:
With GPIO matrix function 0x100 (256) selected, bits in GPIO_OUT_REG are set or cleared according to two other registers: GPIO_OUT_W1TS (set) and GPIO_OUT_W1TC (clear). Bits written as 1 to GPIO_OUT_W1TS are set (to 1) in GPIO_OUT_REG, and bits written as 1 to GPIO_OUT_W1TC are cleared (to 0). That looks like a better solution, but if the value to be stored in GPIO_OUT_REG is a mix of ones and zeroes, you still need TWO assembly instructions, so the outputs will not change within the same clock cycle.
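A host-side model makes the two-store problem concrete. The sketch below is an assumption-laden illustration: `sim_out` stands in for GPIO_OUT_REG, and `sim_w1ts` / `sim_w1tc` imitate write-1-to-set and write-1-to-clear semantics in software; none of it touches real hardware.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t sim_out;                 /* stand-in for GPIO_OUT_REG */

static void sim_w1ts(uint32_t bits) { sim_out |= bits; }   /* write-1-to-set   */
static void sim_w1tc(uint32_t bits) { sim_out &= ~bits; }  /* write-1-to-clear */

/* Change a bundle (selected by mask) to value using set-then-clear.
 * Returns the register value that is visible between the two stores. */
static uint32_t w1ts_w1tc_write(uint32_t mask, uint32_t value)
{
    assert((value & ~mask) == 0);         /* value must fit inside the mask */
    sim_w1ts(value);                      /* first store: raise the 1 bits  */
    uint32_t intermediate = sim_out;      /* what the pins show right now   */
    sim_w1tc(mask & ~value);              /* second store: drop the 0 bits  */
    return intermediate;
}
```

Starting from a bundle value of 0xA and writing 0x5, the model passes through 0xF between the two stores, which is exactly the kind of transient address a decoder would see. Bits outside the mask are never disturbed, which is the advantage over the read-modify-write of method 1.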
So my question concerns that second method. Is it possible to operate on those two registers while GPIO matrix function 0x100 is disabled? More precisely: will the values written to those two registers be latched and retained while the function is disabled?
If so, then I guess that once the proper values have been set up in both registers with the function disabled, simply enabling GPIO matrix function 0x100 would apply them to GPIO_OUT_REG in one step. Can someone confirm whether this concept works?
Re: Changing the state of group of output pins within the same clock cycle
Please check that dedicated GPIO link. It allows you to take a few GPIOs and route them to a fast GPIO bundle, then write to only those GPIO pins without worrying about disturbing other pins. Also check the examples if this is still unclear.
Re: Changing the state of group of output pins within the same clock cycle
I've already checked both links you provided; however, both methods use function 0x100 in the GPIO matrix, which relies on the GPIO_OUT_W1TS or GPIO_OUT_W1TC registers to set or clear only the bits of interest (the bundle). If the value to be written out is a mix of zeroes and ones, then at the core level two assembly instructions are required (each targeting its own register) to set and clear bits in GPIO_OUT_REG, which means the ones and zeroes won't change in the same clock cycle: you need one instruction to force the zeroes and another to force the ones.
A possible solution could involve the HOLD_ON signal on those pins, but ONLY if the reverse function, HOLD_OFF, can be triggered on the whole group of pins at once without iterating through them one by one. That is not well documented; in general, the functions and macros provided by the C API are not well documented. HOLD_ON is not the problem: once the proper state is reached you can activate the HOLD_ON signals even bit by bit, and the values will be held. When another change is required, you can modify the output bits, again bit by bit, and the changes will not reach the output pads because the values are held in latches. However, if HOLD_OFF cannot be applied to the whole bundle at once and you have to iterate over all the bits involved, then you cannot get a change within the same clock cycle, and in that case HOLD cannot solve the issue.
There is a huge difference between changing the state of a group of output pins in ALMOST the same clock cycle and changing a group of output pins where the operation is GUARANTEED TO BE PERFORMED within the same cycle.
That difference affects the hardware design of the other peripherals controlled by the ESP32, in particular things like a data bus or an address bus.
If the operation cannot be guaranteed to complete within the same cycle, proper "strobe" or "CE" signals have to be added to the rest of the hardware to mask transient states until the whole group of ESP32 output pins has settled to its final state. That can be a huge problem when upgrading an old, already working design where only the MCU is supposed to be replaced by a more advanced ESP32 chip.
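The effect of a strobe can be illustrated with an idealised model of the decoder. This is a sketch under stated assumptions: `decoder_outputs` is a hypothetical one-hot 4-to-16 decoder with an active-high enable, and the address lines are assumed to change one bit at a time in the worst case.

```c
#include <assert.h>
#include <stdint.h>

/* Idealised 4-to-16 decoder: one-hot output for addr when enabled, else 0. */
static uint16_t decoder_outputs(unsigned addr, int enable)
{
    assert(addr < 16);
    return enable ? (uint16_t)(1u << addr) : 0;
}

/* Move the address lines from old_addr to new_addr one bit at a time and
 * return the OR of every decoder output pattern seen along the way.
 * With use_strobe != 0 the decoder is disabled while the lines change. */
static uint16_t selected_during_change(unsigned old_addr, unsigned new_addr,
                                       int use_strobe)
{
    unsigned addr = old_addr;
    uint16_t seen = decoder_outputs(addr, 1);        /* old selection       */
    for (unsigned bit = 0; bit < 4; bit++) {
        addr = (addr & ~(1u << bit)) | (new_addr & (1u << bit));
        seen |= decoder_outputs(addr, !use_strobe);  /* transient pattern   */
    }
    seen |= decoder_outputs(addr, 1);                /* strobe re-asserted  */
    return seen;
}
```

Going from address 0xA to 0x5 without a strobe, the model selects peripherals 0xB, 0x9 and 0xD along the way; with the strobe only the old and the new peripheral are ever selected. That is the extra hardware signal the paragraph above would like to avoid.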
Unfortunately, in all the ESP32 Technical Reference Manuals I have collected, I couldn't find a proper solution to this issue. Dedicated GPIO is a collection of C functions and macros that is supposed to do the task "as close as possible", but that is not the same as a guaranteed operation. I need to see how the API uses the ESP32's hardware capabilities to be sure how the task is performed at the hardware level, because "as close as possible" is not enough for the ultra-stable behavior required of a commercial end product.
So I hope Espressif will update the documentation with examples of the proper use of that hardware capability, because none of the methods described so far can guarantee the operation within a single cycle. On controllers from other manufacturers this is such a trivial task that it is not even worth a mention, whereas the internal design of the ESP32 requires guru-level skills to implement such trivial tasks properly.
Re: Changing the state of group of output pins within the same clock cycle
static inline void dedic_gpio_cpu_ll_write_all(uint32_t value)
{
    asm volatile("wur.gpio_out %0"::"r"(value):);
}
As far as I can see, that instruction forces an unsigned 32-bit value (through a register) into GPIO_OUT_REG. Technically, it writes the value to all pins. But how do you call that function? If you want to target only a specific bundle of bits, you first have to read the value already present on the port so the other bits are preserved. So before calling this asm instruction you have to READ the state of the port, do the proper masking, and then execute this asm instruction with the new value. BUT between the moment of reading and the moment of writing, the state of the "live" GPIO_OUT_REG could change (due to another thread or, even with interrupts disabled, due to an integrated ESP32 module such as PWM that is capable of writing to the output pads on its own). If such a module changes the state of the output pads AFTER you read the port but BEFORE you write the new value, then executing THIS asm instruction will OVERWRITE those changed values with the OLDER sampled values, which can make an external peripheral that depends on PWM (or any other function) fail.
Can you provide a full example of setting, say, GPIO0, GPIO1, GPIO2 and GPIO3 to, say, 0x05? If I understand your approach properly, it would be something like this (written in assembler, but I am guessing the approach is the same in C):
LD reg1, GPIO_OUT_REG
AND reg1,reg1,0xFFFFFFF0
OR reg1,reg1,0x05
ST reg1,GPIO_OUT_REG
The first instruction loads the current state of GPIO_OUT_REG into a register.
The second instruction clears the lower 4 bits of that register, assuming those bits correspond to GPIO0-3.
The third instruction sets GPIO0 and GPIO2, while GPIO1 and GPIO3 stay cleared (0x05 = 0b0101).
The fourth instruction stores the newly composed value back to the GPIO register, with all other bits preserved.
// technically speaking, that last assembler instruction is the one encapsulated in the code you provided
But the problem with this approach is the sampling of GPIO_OUT_REG in the first asm instruction. Suppose an interrupt happens during the second or third instruction and the ISR contains another function that also modifies GPIO_OUT_REG; after returning from the ISR, that modification is lost. Or, even with interrupts disabled, suppose a PWM function is running on its own and outputs its pulses on, say, GPIO20. If such a pulse occurs during the second or third asm instruction, it changes the state of GPIO20, which means the state of GPIO_OUT_REG has changed too. But then your last instruction, which writes the calculated value to the whole port, will OVERWRITE the state of GPIO20 with the old sampled value, because it was sampled before the change occurred.
So if you do not use this approach, I would like to see HOW you isolate specific bits and preserve all the other bits.
Can you provide the full logic? Thank you in advance.
Re: Changing the state of group of output pins within the same clock cycle
Looking at the S3's TRM, the EE.WR_MASK_GPIO_OUT instruction in dedic_gpio_cpu_ll_write_mask() does what you're asking about:
GPIO_OUT[7:0] = (GPIO_OUT[7:0] & ~ax[7:0]) | (as[7:0] & ax[7:0])
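The formula can be checked with a plain software model. This is only a sketch of the quoted semantics: `sim_dedic_out` is a hypothetical variable standing in for the CPU-internal 8-bit dedicated GPIO output register, `mask` plays the role of `ax` and `value` the role of `as`. On the chip the whole update is one instruction; here it is ordinary C.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the CPU-internal dedicated GPIO output register (8 bits). */
static uint8_t sim_dedic_out;

/* Software model of the quoted formula:
 * out = (out & ~mask) | (value & mask), applied to bits 7:0. */
static void sim_wr_mask_gpio_out(uint8_t value, uint8_t mask)
{
    sim_dedic_out = (uint8_t)((sim_dedic_out & ~mask) | (value & mask));
}
```

With the register holding 0xAA, a call such as `sim_wr_mask_gpio_out(0x05, 0x0F)` leaves the upper nibble untouched and replaces the lower nibble, giving 0xA5. No separate read, mask and write steps are exposed to the caller.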
Re: Changing the state of group of output pins within the same clock cycle
Mate, stop the 'it won't work', please. We architected this feature, I'd know if it wouldn't work.
The 'dedicated IO' feature is different from yer normal GPIO registers wrt the IO values it sets. What happens is that a configurable set of IO pins is re-routed from the general-purpose GPIO registers you're referring to, to a CPU core. The CPU core has a dedicated register, with dedicated instructions to write/read it, that controls these re-routed (and only those re-routed) GPIOs. (The wur.gpio_out instruction is one of these instructions; the gpio_out named there is the dedicated GPIO register internal to the CPU, not the general-purpose GPIO register that lives in memory space.) So if you want to control GPIO 1 and 12, you'd create a bundle with those GPIOs, and using wur.gpio_out would only ever affect those two GPIOs, and nothing else. Even if you route a bunch of independent GPIOs here, as microcontroller mentioned, ee.wr_mask_gpio_out can handle masked writes (and iirc that is how the higher-level concept of bundles is implemented.)
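For the 4-bit address bus from the original question, a bundle could be set up roughly as follows with the ESP-IDF dedicated GPIO driver. Treat this as a sketch, not a verified implementation: the pin numbers in `addr_pins` and the helper names `addr_to_bundle_value`, `addr_bus_init` and `addr_bus_select` are hypothetical, the ESP-IDF part is guarded so it only builds inside an ESP-IDF project, and it has not been run on hardware. As far as I know, a bundle is tied to the CPU core that created it, so the writes should come from a task on that same core.

```c
#include <assert.h>
#include <stdint.h>

#define ADDR_BUS_WIDTH 4u
#define ADDR_BUS_MASK  ((1u << ADDR_BUS_WIDTH) - 1u)   /* 0x0F */

/* Pure helper: decoder address (0..15) -> value for the 4-bit bundle. */
static inline uint32_t addr_to_bundle_value(unsigned addr)
{
    assert(addr <= ADDR_BUS_MASK);
    return addr & ADDR_BUS_MASK;
}

#ifdef ESP_PLATFORM   /* the part below only builds inside an ESP-IDF project */
#include "driver/gpio.h"
#include "driver/dedic_gpio.h"
#include "esp_err.h"

/* Hypothetical pin choice; bundle bit 0 maps to the first array entry. */
static const int addr_pins[ADDR_BUS_WIDTH] = {4, 5, 6, 7};

static dedic_gpio_bundle_handle_t addr_bus_init(void)
{
    /* The pads themselves still have to be configured as outputs. */
    gpio_config_t io_conf = { .mode = GPIO_MODE_OUTPUT };
    for (unsigned i = 0; i < ADDR_BUS_WIDTH; i++) {
        io_conf.pin_bit_mask = 1ULL << addr_pins[i];
        ESP_ERROR_CHECK(gpio_config(&io_conf));
    }
    /* Output-only bundle made of the four address pins. */
    dedic_gpio_bundle_config_t cfg = {
        .gpio_array = addr_pins,
        .array_size = ADDR_BUS_WIDTH,
        .flags = { .out_en = 1 },
    };
    dedic_gpio_bundle_handle_t bus = NULL;
    ESP_ERROR_CHECK(dedic_gpio_new_bundle(&cfg, &bus));
    return bus;
}

/* Masked write: only the four bundle pins are affected. */
static void addr_bus_select(dedic_gpio_bundle_handle_t bus, unsigned addr)
{
    dedic_gpio_bundle_write(bus, ADDR_BUS_MASK, addr_to_bundle_value(addr));
}
#endif
```

A call such as `addr_bus_select(bus, 0x5)` would then put 0b0101 on the four pins through the bundle's masked write, without any read-modify-write of the general-purpose GPIO registers.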
Re: Changing the state of group of output pins within the same clock cycle
I am sorry if my question hurt your feelings. That wasn't my intention. I'm using the ESP32-S3 and had consulted only the documents related to the S3.
I found more details in the TRM for the ESP32-S2 (but not in the TRM for the S3).
I do not know why Dedicated IO is omitted from the S3 documentation.
From that document I now see that you are using register addressing mode (usually called indirect addressing mode) to write a BYTE-sized value whose target is the GPIO output. That is all I wanted to know, and it explains how the CPU can write to only a fraction of the port.
I really do not know why that part is omitted from the S3 documentation. Maybe my expectations are too high if I assume that the S3 TRM should contain everything related to the S3.
Anyway, thanks for the answers, and sorry again if I hurt someone's feelings.