ESP32 Forum

Posted: **Fri Apr 14, 2023 9:43 pm**

Good afternoon community...

I have set myself an ambitious project involving VGA on the ESP32 (using a library, of course; maybe you know it), in which I want to use both ESP32 cores at their true, maximum computing capacity.

Unfortunately, FreeRTOS is a big obstacle to getting there. I can tolerate the fact that it consumes a considerable amount of RAM, but what I don't like is that it is running SOMETHING behind the scenes. At this point I don't know what: it could be the Watchdog, interrupts, or some light sub-task.

While researching on the internet, I managed to use both cores without the Watchdog running (or so I think) and without having to add any delays. I don't know why all the dual-core examples need delays to keep the ESP32 from crashing, but I managed to avoid that.

But there is one detail: for some unknown reason, Core 0 is a little less efficient than Core 1. Did I miss deactivating something in FreeRTOS? Or is Core 0 less efficient at the hardware level?

Here are some benchmarks I ran. They exercise the GPIO I/O (to check whether VGA output would lose any performance) and two different compute-heavy tasks (to check whether computing throughput is affected).
Code: https://gist.github.com/HiperDoo/c87624 ... dd003a73cd

1. Use Core 1 to calculate PI or Prime Numbers and use Core 0 for OUTPUT/INPUT GPIOs. (Values vary ~1ms)
Write timing is slower and Read is slower.
PI timing is perfect and Prime timing is perfect.

Code:

// For the Core 0
>>> BENCHMARK WRITE<<< Core: 0
 * digitalWrite():   6853 ms
 * gpio_set_level(): 5927 ms
 * GPIO.out_w1ts/t:  1009 ms
 * REG_WRITE():      1009 ms
// OR
>>> BENCHMARK READ <<< Core: 0
 * digitalWrite():   2019 ms
 * gpio_get_level(): 1639 ms
 * GPIO.in:          505 ms
 * REG_READ():       504 ms
 
// For the Core 1
 >>> BENCHMARK PI <<< Core: 1
 * PI: 3.141593
 * Time: 38244 ms
// OR 
 >>> BENCHMARK PRIME <<< Core: 1
 * Prime: 25997
 * Time: 94846 ms

2. Use Core 0 to calculate PI or Prime Numbers and use Core 1 for OUTPUT/INPUT GPIOs. (Values vary ~1ms)
Write timing is perfect and Read is perfect.
PI timing is slower and Prime timing is slower.

Code:

// For the Core 1
>>> BENCHMARK WRITE <<< Core: 1
 * digitalWrite():   6837 ms
 * gpio_set_level(): 5914 ms
 * GPIO.out_w1ts/t:  1007 ms
 * REG_WRITE():      1008 ms
// OR
>>> BENCHMARK READ <<< Core: 1
 * digitalWrite():   2013 ms
 * gpio_get_level(): 1636 ms
 * GPIO.in:          504 ms
 * REG_READ():       504 ms
 
// For the Core 0
 >>> BENCHMARK PI <<< Core: 0
 * PI: 3.141593
 * Time: 40368 ms
// OR
>>> BENCHMARK PRIME <<< Core: 0
 * Prime: 25997
 * Time: 98676 ms
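
To put a number on the gap between runs 1 and 2, here is a small host-side sketch (not part of the benchmark gist; the names `slowdown_percent` and `print_core0_slowdown` are made up for illustration). It only does the arithmetic on the compute timings quoted above and says nothing about the cause.

```cpp
#include <cassert>
#include <cstdio>

// Relative slowdown of slow_ms versus fast_ms, in percent.
double slowdown_percent(double fast_ms, double slow_ms) {
    assert(fast_ms > 0.0);  // a zero or negative baseline makes no sense
    return (slow_ms - fast_ms) / fast_ms * 100.0;
}

// Print the Core 0 vs Core 1 slowdown for the timings of runs 1 and 2.
void print_core0_slowdown() {
    // PI: 38244 ms on Core 1 (run 1), 40368 ms on Core 0 (run 2)
    std::printf("PI:    %.2f %%\n", slowdown_percent(38244.0, 40368.0));
    // Prime: 94846 ms on Core 1 (run 1), 98676 ms on Core 0 (run 2)
    std::printf("Prime: %.2f %%\n", slowdown_percent(94846.0, 98676.0));
}
```

Calling `print_core0_slowdown()` from `main()` gives roughly 5.6 % for PI and 4.0 % for the primes, so Core 0 is a few percent slower when the other core is busy with GPIO.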

3. Use Core 1 to calculate PI and use Core 0 to calculate Prime Numbers. (Values vary ~1ms)
PI timing is perfect and Prime timing is slower.

Code:

>>> BENCHMARK PI <<< Core: 1
 * PI: 3.141593
 * Time: 38244 ms
 
 >>> BENCHMARK PRIME <<< Core: 0
 * Prime: 25997
 * Time: 95070 ms

4. Use Core 0 to calculate PI and use Core 1 to calculate Prime Numbers. (Values vary ~1ms)
PI timing is slower and Prime timing is perfect.

Code:

>>> BENCHMARK PI <<< Core: 0
 * PI: 3.141593
 * Time: 38334 ms
 
 >>> BENCHMARK PRIME <<< Core: 1
 * Prime: 25997
 * Time: 94846 ms

5. Using both cores for READ and WRITE at the same time (even though they use different pins) randomly degrades performance for both tasks (i.e. timings for the same test can be up to ~2000 ms apart!).

Code:

>>> BENCHMARK READ <<< Core: 1
 * digitalWrite():   2052 ms
 * gpio_get_level(): 1701 ms
 * GPIO.in:          503 ms
 * REG_READ():       504 ms

>>> BENCHMARK READ <<< Core: 1
 * digitalWrite():   2086 ms
 * gpio_get_level(): 1690 ms
 * GPIO.in:          503 ms
 * REG_READ():       504 ms

>>> BENCHMARK WRITE <<< Core: 0
 * digitalWrite():   7860 ms
 * gpio_set_level(): 7942 ms
 * GPIO.out_w1ts/t:  1009 ms
 * REG_WRITE():      1009 ms

>>> BENCHMARK READ <<< Core: 1
 * digitalWrite():   4076 ms
 * gpio_get_level(): 1708 ms
 * GPIO.in:          504 ms
 * REG_READ():       504 ms

>>> BENCHMARK READ <<< Core: 1
 * digitalWrite():   2051 ms
 * gpio_get_level(): 1708 ms
 * GPIO.in:          504 ms
 * REG_READ():       504 ms

>>> BENCHMARK READ <<< Core: 1
 * digitalWrite():   2074 ms
 * gpio_get_level(): 1689 ms
 * GPIO.in:          504 ms
 * REG_READ():       504 ms

>>> BENCHMARK WRITE <<< Core: 0
 * digitalWrite():   8868 ms
 * gpio_set_level(): 6934 ms
 * GPIO.out_w1ts/t:  1009 ms
 * REG_WRITE():      1009 ms

>>> BENCHMARK READ <<< Core: 1
 * digitalWrite():   2087 ms
 * gpio_get_level(): 3698 ms
 * GPIO.in:          504 ms
 * REG_READ():       504 ms

>>> BENCHMARK READ <<< Core: 1
 * digitalWrite():   2051 ms
 * gpio_get_level(): 1708 ms
 * GPIO.in:          504 ms
 * REG_READ():       504 ms
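
The "~2000 ms apart" figure can be checked against the READ runs above with a small host-side helper. This is only a sketch: `spread_ms` is a hypothetical name and not part of the benchmark gist.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Difference between the slowest and the fastest sample, in milliseconds.
long spread_ms(const std::vector<long>& samples_ms) {
    assert(!samples_ms.empty());  // the spread of no samples is undefined
    const auto mm = std::minmax_element(samples_ms.begin(), samples_ms.end());
    return *mm.second - *mm.first;  // max minus min
}
```

Fed with the first row of the seven READ runs (2052, 2086, 4076, 2051, 2074, 2087, 2051), it returns 2025 ms. The `gpio_get_level()` row spreads over 2009 ms, while the direct register reads (`GPIO.in`, `REG_READ()`) stay within 1 ms of each other.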

I look forward to your answers, and to any recommendations as well.

I know the usual recommendation is not to deactivate the Watchdog (although I haven't seen the reasons why), but a task that never finishes executing is the least of my problems (that would be a bug in my own logic).

Posted: **Sat Apr 15, 2023 3:21 am**

Bluetooth and WiFi are pinned to core 0 in Arduino. You are probably losing some cycles to servicing those tasks.

Posted: **Sat Apr 15, 2023 4:02 am**

Interesting, I didn't think of that possibility.

If so, how can I disable these services?

In case you were wondering, I will not use WiFi or Bluetooth at any point in this project.

Posted: **Sat Apr 15, 2023 9:42 am**

The main task (the one that calls setup and loop) is pinned to core 0 in Arduino. Your loop is

Code:

while(1);

so this will occupy quite a lot of core 0's time. Delete or suspend the task instead. There are FreeRTOS utilities to help debug issues like this: https://www.freertos.org/a00021.html#vT ... nTimeStats

Posted: **Sat Apr 15, 2023 9:01 pm**

In my case, setup() and loop() run on Core 1.

And yes, that is my intention, to use all the power and time of Core 0 to verify that there is nothing running behind the scenes.

Using delays or deleting the task has nothing to do with what I'm trying to do.

Posted: **Sun Apr 16, 2023 7:23 pm**

I don't know if setup() and loop() can run on another Core, but in my case it always runs on Core 0.

And I don't think adding delays to it will make Core 0 run any faster. What I want is to find out why Core 0 is not as fast as Core 1.

Posted: **Mon Apr 24, 2023 10:55 pm**

Any idea what could be going on or is it a really complex issue?

Posted: **Tue Apr 25, 2023 1:56 am**

Change loop to

Code:

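void loop() {
    // Passing NULL deletes the calling task, i.e. the task that runs loop(),
    // so it no longer takes CPU time on the core it is pinned to.
    vTaskDelete(NULL);
}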

Core 0 is Slower
