Protect counter variable by portENTER_CRITICAL vs <atomic>
Posted: Thu Apr 11, 2024 11:41 pm
Hey everyone,
I have a counter variable (int type). This counter is incremented from within an ISR. Outside of an ISR that variable is read (for an <=int32 that’s atomic), but also decremented an amount of n (so that’s a read modify write).
From what I understand, the proper ESP-IDF FreeRTOS way is to protect the variable access by portENTER_CRITICAL(&spinLock), whenever I’m rmw-ing.
Now, I’ve stumbled upon the c++ <atomic> header. Is this feature properly implemented within the esp32? From what I understand, I don’t need a spinlock, or the critical section in this case, but does it also really protect the variable in the multiprocessing case?
I’ve created this simplified example, implementing both methods. The first one with protection by critical section and spinlock, the second one with protection by <atomic>.
In the example high going pulses on pin 4 and 5 are counted, I’ve coded this out in an object oriented way that I would use, for IC-drivers connected to the ESP…
The program seems to work as intended, but that doesn’t mean anything certain.
Are both methods (multiprocessing) safe and correctly done like that? Is there a (performance?) downside to using <atomic>?
Also, do I need the volatile keyword for the atomic variable? Probably not... And on a side note, should I use ARDUINO_ISR_ATTR or IRAM_ATTR, or is it just the same?
Background:
I’m working with an ESP32-S3 and the Arduino core. I’ve just gotten started with FreeRTOS (I’m somewhat aware of ESP IDF’s FreeRTOS change due to SMP) and trying to figure out how to properly manage concurrency etc.
Thank you in advance!
I have a counter variable (int type). This counter is incremented from within an ISR. Outside of an ISR that variable is read (for an <=int32 that’s atomic), but also decremented an amount of n (so that’s a read modify write).
From what I understand, the proper ESP-IDF FreeRTOS way is to protect the variable access by portENTER_CRITICAL(&spinLock), whenever I’m rmw-ing.
Now, I’ve stumbled upon the c++ <atomic> header. Is this feature properly implemented within the esp32? From what I understand, I don’t need a spinlock, or the critical section in this case, but does it also really protect the variable in the multiprocessing case?
I’ve created this simplified example, implementing both methods. The first one with protection by critical section and spinlock, the second one with protection by <atomic>.
Code: Select all
#include <Arduino.h>
#include "FunctionalInterrupt.h"
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "freertos/semphr.h"
#include "spinlock.h"
#include <atomic>
#define SER USBSerial // ESP32 S3 integrated usb serial
// TODO: use ARDUINO_ISR_ATTR or IRAM_ATTR to place the function in IRAM???
class MyHardware {
public:
// worker binary semaphore
SemaphoreHandle_t _workerSemaphore;
void link();
// Version 1: Spinlock and critical section protected
uint8_t counter1Get();
void counter1Reduce(uint8_t amount);
// Version 2: Using atomic header
uint8_t counter2Get();
void counter2Reduce(uint8_t amount);
private:
unsigned long _debounce_delay;
// Version 1: Spinlock and critical section protected
volatile uint8_t _counter1;
unsigned long _last_debounce1;
portMUX_TYPE _counter1_spinLock;
void IRAM_ATTR _myInterrupt1();
// Version 2: Using atomic header
std::atomic_uint8_t _counter2; //TODO: volatile needed for atomic?
unsigned long _last_debounce2;
void IRAM_ATTR _myInterrupt2();
};
void MyHardware::link() {
// Must be called within or after setup()
_workerSemaphore = xSemaphoreCreateBinary();
_debounce_delay = 30; // in ms, compared to millis()
// Version 1: Spinlock and critical section protected
_counter1 = 0;
_last_debounce1 = 0;
_counter1_spinLock = portMUX_INITIALIZER_UNLOCKED;
pinMode(4, INPUT_PULLDOWN);
attachInterrupt(4,
std::bind(&MyHardware::_myInterrupt1, this),
RISING);
// Version 2: Using atomic header
_counter2 = 0;
_last_debounce2 = 0;
pinMode(5, INPUT_PULLDOWN);
attachInterrupt(5,
std::bind(&MyHardware::_myInterrupt2, this),
RISING);
}
/******** Version 1 member functions: spinlock and critical section ***********/
void IRAM_ATTR MyHardware::_myInterrupt1() {
unsigned long current_time = millis();
if (current_time - _last_debounce1 < _debounce_delay) {
return;
}
_last_debounce1 = current_time;
portENTER_CRITICAL_ISR(&_counter1_spinLock);
_counter1++;
portEXIT_CRITICAL_ISR(&_counter1_spinLock);
if (_counter1 >= 10) {
BaseType_t xHigherPriorityTaskWoken = pdFALSE;
xSemaphoreGiveFromISR(_workerSemaphore, &xHigherPriorityTaskWoken);
if (xHigherPriorityTaskWoken == pdTRUE) {
portYIELD_FROM_ISR();
}
}
}
uint8_t MyHardware::counter1Get() {
return _counter1;
}
void MyHardware::counter1Reduce(uint8_t amount) {
portENTER_CRITICAL(&_counter1_spinLock);
_counter1 -= amount;
portEXIT_CRITICAL(&_counter1_spinLock);
}
/************* Version 2 member functions: Using atomic header ****************/
void IRAM_ATTR MyHardware::_myInterrupt2() {
unsigned long current_time = millis();
if (current_time - _last_debounce2 < _debounce_delay) {
return;
}
_last_debounce2 = current_time;
// _counter2++ would use memory_order_seq_cst by default (slower)
_counter2.fetch_add(1, std::memory_order_relaxed);
if (_counter2.load(std::memory_order_relaxed) >= 10) {
BaseType_t xHigherPriorityTaskWoken = pdFALSE;
xSemaphoreGiveFromISR(_workerSemaphore, &xHigherPriorityTaskWoken);
if (xHigherPriorityTaskWoken == pdTRUE) {
portYIELD_FROM_ISR();
}
}
}
uint8_t MyHardware::counter2Get() {
// return _counter2 would use memory_order_seq_cst by default (slower)
return _counter2.load(std::memory_order_relaxed);
}
void MyHardware::counter2Reduce(uint8_t amount) {
_counter2.fetch_sub(amount, std::memory_order_relaxed);
}
/******************************************************************************/
void genericTask(void *pvParameters) {
SER.println("Task started");
MyHardware* my_hw = static_cast<MyHardware*>(pvParameters);
uint8_t counter1, counter2;
while (1) {
if (xSemaphoreTake(my_hw->_workerSemaphore, portMAX_DELAY) == pdTRUE) {
counter1 = my_hw->counter1Get();
counter2 = my_hw->counter2Get();
SER.print("\nCounter 1: "); SER.print(counter1);
SER.print(", Counter 2: "); SER.println(counter2);
if (counter1 >= 10) {
// some work here
my_hw->counter1Reduce(10);
SER.println("Counter 1 reduced by 10");
}
if (counter2 >= 10) {
// some work here
my_hw->counter2Reduce(10);
SER.println("Counter 2 reduced by 10");
}
SER.print("Counter 1: "); SER.print(my_hw->counter1Get());
SER.print(", Counter 2: "); SER.println(my_hw->counter2Get());
}
}
}
MyHardware g_MY_HW;
void setup() {
SER.begin(115200);
g_MY_HW.link();
// Create task, pin to core 1
xTaskCreatePinnedToCore(
genericTask, // Task function
"genericTask", // Name of the task (for debugging)
2048, // Stack size (bytes)
&g_MY_HW, // Task input parameter
1, // Priority of the task
NULL, // Task handle
1 // Core
);
}
void loop() {}
The program seems to work as intended, but that doesn’t mean anything certain.
Are both methods (multiprocessing) safe and correctly done like that? Is there a (performance?) downside to using <atomic>?
Also, do I need the volatile keyword for the atomic variable? Probably not... And on a side note, should I use ARDUINO_ISR_ATTR or IRAM_ATTR, or is it just the same?
Background:
I’m working with an ESP32-S3 and the Arduino core. I’ve just gotten started with FreeRTOS (I’m somewhat aware of ESP IDF’s FreeRTOS change due to SMP) and trying to figure out how to properly manage concurrency etc.
Thank you in advance!