ESP-IDF and Xtensa instructions S32C1I, L32AI and S32RI

aydosc
Posts: 4
Joined: Tue Feb 07, 2023 10:31 pm


Postby aydosc » Tue Feb 07, 2023 11:07 pm

Hi, I am new to programming with ESP-IDF. I have been reading the documentation and have some questions. Note that I am not concerned with high-level synchronisation that disables interrupts; rather, I am interested in how the framework uses native Xtensa instructions to support synchronisation.

The ESP32-S3 API guide reads:
SMP on an ESP Target
ESP targets (such as the ESP32, ESP32-S3) are dual core SMP SoCs. These targets have the following hardware features that make them SMP capable:
  • Two identical cores known as CPU0 (i.e., Protocol CPU or PRO_CPU) and CPU1 (i.e., Application CPU or APP_CPU). This means that the execution of a piece of code is identical regardless of which core it runs on.
  • Symmetric memory (with some small exceptions).
    (a) If multiple cores access the same memory address, their access will be serialized at the memory bus level.
    (b) True atomic access to the same memory address is achieved via an atomic compare-and-swap instruction provided by the ISA.
  • Cross-core interrupts that allow one CPU to trigger an interrupt on another CPU. This allows cores to signal each other.
My questions are:
  • Are (a) and (b) above achieved automatically at the framework level, or do I need to take care of them as a programmer?
  • Are the answers the same for two processes on different CPUs and for two events on the same CPU?
  • I can see the API offers access to S32C1I/SCOMPARE1 via esp_cpu_compare_and_set(). Is there anything similar for L32AI/S32RI?

ESP_Sprite
Posts: 9746
Joined: Thu Nov 26, 2015 4:08 am


Postby ESP_Sprite » Wed Feb 08, 2023 1:03 am

A is achieved at the hardware level; neither your software nor the framework needs to do anything about it. B is supported by the framework (e.g. via the esp_cpu_compare_and_set function you mention), but since the framework cannot infer that a compare/swap needs to be atomic, you need to call that function (or better: the C language equivalent, see below) whenever you need an atomic compare/swap.
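As a rough illustration of the "C language equivalent", here is a minimal sketch of an atomic compare-and-swap written with C11 <stdatomic.h>. The names slot_owner, try_claim and release_claim are hypothetical and not part of ESP-IDF. The assumption is that on an Xtensa target the compiler lowers atomic_compare_exchange_strong to the hardware compare-and-swap, which this thread does not confirm in detail.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared slot: 0 means "free", any other value is the owner's id.
 * Static storage, so it starts out zero-initialized (free). */
static _Atomic uint32_t slot_owner;

/* Try to claim the slot for `id`. Succeeds only if the slot was still free
 * at the instant of the swap, whichever core or task the caller runs on. */
bool try_claim(uint32_t id)
{
    uint32_t expected = 0;
    /* Atomically: if (slot_owner == expected) slot_owner = id; */
    return atomic_compare_exchange_strong(&slot_owner, &expected, id);
}

/* Release the slot, but only if `id` really is the current owner. */
bool release_claim(uint32_t id)
{
    uint32_t expected = id;
    return atomic_compare_exchange_strong(&slot_owner, &expected, 0);
}
```

A second caller that loses the race simply gets false back and can retry or do something else; no interrupts are disabled anywhere in this sketch.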

Code that takes A and B into account will work both for tasks on separate cores and for tasks running on the same core.

I'm not sure we have functions for L32AI and friends, but in general: if you want to use atomics like this, we support the C compiler's idea of atomics. If you use e.g. the C11 stdatomic facilities, we'll automatically do the right thing regardless of what platform you're on.
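L32AI and S32RI are Xtensa's load-acquire and store-release instructions, and the closest portable counterpart in C11 is an atomic load or store with an explicit memory order. The sketch below assumes a simple one-shot producer/consumer handoff; payload, ready, publish and try_consume are hypothetical names. Whether the compiler actually emits L32AI/S32RI or plain accesses plus barriers is not something this thread confirms.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static uint32_t payload;     /* plain data, written before it is published */
static atomic_bool ready;    /* publication flag, zero-initialized (false) */

/* Producer: write the data, then set the flag with release ordering so the
 * payload write cannot be reordered after the flag write. */
void publish(uint32_t value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer: read the flag with acquire ordering; if it is set, the payload
 * written before the matching release store is guaranteed to be visible. */
bool try_consume(uint32_t *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire)) {
        return false;
    }
    *out = payload;
    return true;
}
```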


Postby aydosc » Wed Feb 08, 2023 1:15 am

Thanks ESP_Sprite.

OK, so just to be clear: by using std::atomic, the compiler will implement it using compare-and-swap rather than by disabling interrupts?


Postby aydosc » Wed Feb 08, 2023 2:27 am

Thanks ESP_Sprite. Just to further clarify with a couple of examples.

Will variable "y" ever receive a scrambled value, or are 32-bit reads and writes always atomic?
- Assuming process1 and process2 are running on separate cores.
- Assuming process1 and process2 are running on the same core, and process2 has higher priority.


uint32_t x = 0;
uint32_t y;
void process1()
{
   while(true)
      x++;
}
void process2()
{
   while(true)
      y = x;
}


Postby ESP_Sprite » Wed Feb 08, 2023 7:19 am

std::atomic, on multicore CPUs, is indeed implemented using dedicated atomic instructions (like compare-and-swap).

Your example actually won't work, but not for the reasons described: as there is no function call in the while() loop in process2, the compiler will assume that x can never change, and it might optimize the loop to 'y = x; while (true);'. (Sorry, I forgot the exact terminology for this behaviour.) But with the extra assumption that you put a function call or something similar in that while loop to break the compiler's assumption, your code would do what you expect. Note that if you were to change x to a uint64_t, or wrote back to x in process2 (or in any process other than process1), your code would subtly break. All of that holds both for tasks on the same CPU and for tasks on different CPUs.
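For comparison, here is a minimal sketch of the same x/y example with x made a C11 atomic. The loop bodies are split out into hypothetical process1_step and process2_step functions so they can be called from whatever task loop is used; those names are not from the original post.

```c
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint32_t x;   /* shared counter, zero-initialized */
static uint32_t y;           /* only ever touched by the reader */

/* Body of process1's loop: an atomic read-modify-write, so it stays correct
 * even if another task also starts writing to x. */
void process1_step(void)
{
    atomic_fetch_add(&x, 1);
}

/* Body of process2's loop: the atomic load forces a real read of x on every
 * call, so the compiler cannot hoist it out of the surrounding loop. */
uint32_t process2_step(void)
{
    y = atomic_load(&x);
    return y;
}
```

Declaring x as _Atomic uint64_t would be the analogous change for the 64-bit case; how that is implemented underneath depends on the target and toolchain.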

As I mentioned before: unless you have really good reasons not to, I'd strongly suggest not surprising anyone who tries to maintain the code, and simply making the variable an atomic (or placing a spinlock around the bunch, if applicable). If not, at least leave lots of comments about what you did and the fact that it's brittle.
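When several variables have to change together, a single atomic is not enough and a lock around the group is the usual answer. The following is a portable sketch of a spinlock built on C11 atomic_flag; total, count, add_sample and average are hypothetical names. It does not mask interrupts, so it is only safe between cores or where the lock holder cannot be preempted by a task that then spins on the same lock. In real ESP-IDF code one would normally use the framework's own critical-section macros instead.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical pair of values that must always be updated together. */
static uint32_t total;
static uint32_t count;
static atomic_flag lock = ATOMIC_FLAG_INIT;

static void spin_lock(void)
{
    /* test_and_set returns the previous state: loop until we saw "clear". */
    while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire)) {
        /* busy-wait */
    }
}

static void spin_unlock(void)
{
    atomic_flag_clear_explicit(&lock, memory_order_release);
}

/* Update both fields as one unit. */
void add_sample(uint32_t value)
{
    spin_lock();
    total += value;
    count++;
    spin_unlock();
}

/* Read both fields as one consistent snapshot. */
uint32_t average(void)
{
    spin_lock();
    uint32_t result = (count != 0) ? total / count : 0;
    spin_unlock();
    return result;
}
```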
