RTOS running on one core only

HelWeb · Postby **HelWeb** » Fri Apr 05, 2019 12:12 pm

(Please excuse my English, I am German.)

I want to use one core with RTOS and the other core with my own cooperative OS (CoopOS), which I have used since my Arduino days. Yes, there are reasons to do so: it is much faster in special circumstances.

It works and I am nearly happy. For the second core I start a task pinned to core 1 and then disable interrupts (for that core).

Interrupts are managed in core 0 with RTOS and the interrupt-service-routine can send a signal to my CoopOS - and it works fine.

But: polling an input pin with CoopOS is much faster. It does not have the RTOS overhead, and starting the right CoopOS task for this interrupt can be done in less than 2-4 µs (depending on the number of CoopOS tasks).
I measured an RTOS task switch time of 2 µs (at 240 MHz).
An RTOS task doing nothing but testing a pin (polling) could be very fast.
But: it has to have the lowest priority - otherwise no other task would run.
And that means the timing is not really deterministic.
With my solution I can guarantee the response time.

My problem:
I have to disable interrupts on core 1 to get rid of the RTOS ticks, which should not disturb CoopOS, but I want to be able to use interrupts on core 1 with my own interrupt service routines.

Possibilities I tried:
1) make menuconfig: Use RTOS with one core - no success.
2) I tried to change start.c, where the cores are started (should be possible) - no success.

Has anybody tried to use RTOS limited to core 0 and start a Non-RTOS program using core 1 WITH working interrupts on this core?

I think, a solution would be useful for a lot of ESP32 developers.

michprev · Postby **michprev** » Fri Apr 05, 2019 5:00 pm

Running the second core without FreeRTOS is not very easy. Basically you need to follow call_start_cpu0() function and do almost everything that is done in #if !CONFIG_FREERTOS_UNICORE directives.

You should know that:

you will need to take care of the 3.10 silicon bug
you will not be able to use newlib
you will not be able to use any of peripheral drivers

To get interrupts working on core 1 you need to:

set base address of interrupt vectors using
```
__asm__ __volatile__ ("wsr.vecbase %0" :: "r"(vecbase_address));
```
You need to be sure that vecbase_address is in IRAM. Please note that you cannot reuse the vecbase that core 0 uses.

place vectors on that address with following offsets:

0x0   _WindowOverflow4
0x40  _WindowUnderflow4
0x80  _WindowOverflow8
0xC0  _WindowUnderflow8
0x100 _WindowOverflow12
0x140 _WindowUnderflow12
0x180 _Level2InterruptVector
0x1C0 _Level3InterruptVector
0x200 _Level4InterruptVector
0x240 _Level5InterruptVector
0x280 _DebugExceptionVector
0x2C0 _NMIExceptionVector
0x300 _KernelExceptionVector
0x340 _UserExceptionVector
0x3C0 _DoubleExceptionVector

You will need to write these interrupt handlers in assembly. You can look into xtensa_vectors.S and xtensa_intr_asm.S for inspiration. Overflow and underflow handlers can be simply copied. You will also probably need to modify the linker script.
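
The offset list above can be written down as a small C table, which makes it easy to check where each handler has to land relative to the chosen base. This is only a host-side sketch: `vector_slot_t`, `vector_slots` and `vector_address()` are hypothetical names introduced for illustration, and nothing here installs a handler or touches the hardware.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One entry per vector from the list above: name and offset from VECBASE. */
typedef struct {
    const char *name;
    uint32_t    offset;
} vector_slot_t;

static const vector_slot_t vector_slots[] = {
    { "_WindowOverflow4",       0x000 },
    { "_WindowUnderflow4",      0x040 },
    { "_WindowOverflow8",       0x080 },
    { "_WindowUnderflow8",      0x0C0 },
    { "_WindowOverflow12",      0x100 },
    { "_WindowUnderflow12",     0x140 },
    { "_Level2InterruptVector", 0x180 },
    { "_Level3InterruptVector", 0x1C0 },
    { "_Level4InterruptVector", 0x200 },
    { "_Level5InterruptVector", 0x240 },
    { "_DebugExceptionVector",  0x280 },
    { "_NMIExceptionVector",    0x2C0 },
    { "_KernelExceptionVector", 0x300 },
    { "_UserExceptionVector",   0x340 },
    { "_DoubleExceptionVector", 0x3C0 },
};

#define VECTOR_SLOT_COUNT (sizeof(vector_slots) / sizeof(vector_slots[0]))

/* Returns vecbase + offset for the named vector, or 0 if the name is unknown. */
static uint32_t vector_address(uint32_t vecbase, const char *name)
{
    for (size_t i = 0; i < VECTOR_SLOT_COUNT; i++) {
        if (strcmp(vector_slots[i].name, name) == 0) {
            return vecbase + vector_slots[i].offset;
        }
    }
    return 0;
}
```

For example, with an illustrative base value of 0x40080000, `vector_address(0x40080000u, "_Level5InterruptVector")` gives 0x40080240.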

username · Postby **username** » Fri Apr 05, 2019 6:09 pm

But: polling an input-pin with CoopOS is much faster

Just curious why you would poll an input pin rather than set an IRQ for it.

michprev · Postby **michprev** » Fri Apr 05, 2019 6:38 pm

See viewtopic.php?f=12&t=422. Both the interrupt handler and FreeRTOS add some delay. I would suggest using one of the high-priority interrupts (which cannot be serviced in C), but you need to understand the Xtensa architecture very well.

Deouss · Postby **Deouss** » Fri Apr 05, 2019 8:55 pm

I have never heard of such an idea. It is impossible. You are confusing multiple cores with multiple OSes.
There will always be a conflict unless one OS is hosted in a VM within the other. Why do people even talk about such difficult ideas and want them to run on a tiny, humble MCU? It is a bit ridiculous. Better to get two ESP8266 boards, run a different OS on each, and let them communicate.

HelWeb · Postby **HelWeb** » Fri Apr 05, 2019 10:34 pm

Thank you very much.
I am not able to write all the stuff to install fast interrupt handlers in assembler.
But from the tips and links you posted I found a way to detect pin changes very fast.
While core 0 is doing the RTOS tasks I do a brute force loop on core 1 (interrupts disabled) like this:

    int irq = 0, lastIrq = 0;
    while (1) {
        irq = REG_READ(GPIO_IN_REG) & (1 << 12);              // Test pin 12
        if (irq != lastIrq) {                                 // 166 ns
            if (irq) REG_WRITE(GPIO_OUT_W1TS_REG, (1 << 5));  // Set pin 5
            else     REG_WRITE(GPIO_OUT_W1TC_REG, (1 << 5));  // Clear pin 5
            lastIrq = irq;
        }
    }

With this loop it takes 166 ns to react to an edge at pin 12 (measured with a scope), which is much faster than any fast interrupt can be - I think.
And I can test 31 pins at once.
So interrupts on core 1 are no longer needed.
But now I have the problem of how to measure such times within a program - who can help?
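
Testing several pins at once, as mentioned above, comes down to comparing two snapshots of the input register under a mask. A minimal sketch in plain C follows. The `edges_t` type and `detect_edges()` are hypothetical names, and the snapshots are passed in as plain integers instead of being read from `GPIO_IN_REG`, so the logic can be tried on a PC first.

```c
#include <stdint.h>

/* Bitmasks of the pins that changed between two snapshots. */
typedef struct {
    uint32_t rising;   /* pins that went 0 -> 1 */
    uint32_t falling;  /* pins that went 1 -> 0 */
} edges_t;

/* Compare two register snapshots; only pins set in mask are considered. */
static edges_t detect_edges(uint32_t previous, uint32_t current, uint32_t mask)
{
    uint32_t changed = (previous ^ current) & mask;
    edges_t e;
    e.rising  = changed & current;   /* changed and now high */
    e.falling = changed & previous;  /* changed and was high before */
    return e;
}
```

In a polling loop like the one above, `previous` would be the last value read and `current` the new one, so a single XOR covers every watched pin in one pass.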

HelWeb · Postby **HelWeb** » Fri Apr 05, 2019 11:06 pm

@Deouss
That you have never heard of the idea does not mean it is impossible.

I managed to run 8 RTOS tasks (core 0) and 20 CoopOS tasks (core 1) and let them communicate.
CoopOS has a lot of the features of an RTOS: signals, priorities, non-blocking delays etc.
The big difference: task switches are cooperative, not preemptive. That means every task should do a taskSwitch or delay within a short time. The structure of the tasks is like that of RTOS tasks:
    while(1) {
        do something
        taskSwitch()
        do something
        waitForSignal() // non blocking
        do something
        taskDelay(x)
    }
Task switches are very fast because all tasks use the same stack. I get from 100,000 to 500,000 task switches per second. It is NOT a replacement for an RTOS, but a good addition, because CoopOS works without any interrupts.
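
To make the task structure above concrete, here is a minimal sketch of a cooperative scheduler in plain C. It is not how CoopOS is implemented. It only shows the general idea that each task does one short slice of work and returns, so all tasks share the caller's stack. All names (`coop_task_t`, `coop_run_once()`, `coop_delay()`, the two example tasks) are hypothetical, and time is a simulated tick counter.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct coop_task coop_task_t;
typedef void (*coop_fn_t)(coop_task_t *self, uint32_t now);

struct coop_task {
    coop_fn_t fn;       /* one slice of work; returning is the "task switch" */
    uint32_t  wake_at;  /* task is ready once now has reached wake_at */
    int       step;     /* where the task resumes on its next slice */
    uint32_t  runs;     /* number of slices executed so far */
};

/* Non-blocking delay: the task only records when it wants to run again. */
static void coop_delay(coop_task_t *self, uint32_t now, uint32_t ticks)
{
    self->wake_at = now + ticks;
}

/* Give every ready task one slice, in array order. Returns how many ran. */
static size_t coop_run_once(coop_task_t *tasks, size_t count, uint32_t now)
{
    size_t ran = 0;
    for (size_t i = 0; i < count; i++) {
        if ((int32_t)(now - tasks[i].wake_at) >= 0) {  /* wrap-safe compare */
            tasks[i].fn(&tasks[i], now);
            tasks[i].runs++;
            ran++;
        }
    }
    return ran;
}

/* Example task: two steps, then a delay of 3 ticks. */
static void blink_task(coop_task_t *self, uint32_t now)
{
    switch (self->step) {
    case 0:                         /* do something, then yield */
        self->step = 1;
        break;
    default:                        /* do something else, then delay */
        self->step = 0;
        coop_delay(self, now, 3);
        break;
    }
}

/* Example task: runs on every pass and never delays. */
static void poll_task(coop_task_t *self, uint32_t now)
{
    (void)self;
    (void)now;
}

/* Run both tasks for the given number of simulated ticks.
   Returns the slice count of blink_task; poll_task's count goes to *poll_runs. */
static uint32_t coop_demo(uint32_t ticks, uint32_t *poll_runs)
{
    coop_task_t tasks[] = {
        { poll_task,  0, 0, 0 },
        { blink_task, 0, 0, 0 },
    };
    for (uint32_t now = 0; now < ticks; now++) {
        coop_run_once(tasks, 2, now);
    }
    if (poll_runs != NULL) {
        *poll_runs = tasks[0].runs;
    }
    return tasks[1].runs;
}
```

Because a task is just a function that returns quickly, a "task switch" costs little more than a function call, which fits the switch rates quoted above.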
It's tested with Arduino (without an RTOS), bare-metal Raspberry Pi 3 with the Circle library, Raspberry Pi with Linux preempt, a Lenovo laptop with Linux preempt and isolated cores - and ESP32.
Never say impossible again

If you are interested, you may look at HelmutWeber.de
It is in German, but Google may help to translate.

HelWeb · Postby **HelWeb** » Fri Apr 05, 2019 11:57 pm

How is it done?

    void app_main(void)
    {
        // start RTOS tasks and
        vOtherFunction();
        ....
----------------------------
    void vOtherFunction( void )
    {
        TaskHandle_t xHandle = NULL;

        xHandle = xTaskCreateStaticPinnedToCore(
            CoopOS,               // Function that implements the task.
            "CoopOS",             // Text name for the task.
            STACK_SIZE,           // Stack size in bytes, not words.
            ( void * ) 1,         // Parameter passed into the task.
            //tskIDLE_PRIORITY,   // Priority at which the task is created.
            tskIDLE_PRIORITY+2,
            xStack,               // Array to use as the task's stack.
            &xTaskBuffer,         // Variable to hold the task's data structure.
            1);                   // Core 1 <<<<<<<<
    }

---------------------------------------------------
    // Start of CoopOS

    // Function that implements the task being created.
    void CoopOS( void * pvParameters )
    ////void start_cpu1()
    {

        // 240 MHz = about 4 ticks per microsecond
        start = portGET_RUN_TIME_COUNTER_VALUE();
        start2 = start;

        // I do not want an RTOS tick here
        portDISABLE_INTERRUPTS(); // YEAH <<<<<<<<

        // Sorted by priority
        Test3ID  = InitTask((char *)"Test3",  Test3,  99, READY, 0,0, &Param1);
        Test5ID  = InitTask((char *)"Test5",  Test5,  24, READY, 0,0, 0);
        Test7ID  = InitTask((char *)"Test7",  Test7,  23, READY, 0,0, 0);
        Test1ID  = InitTask((char *)"Test1",  Test1,  20, READY, 0,0, 0);
        Test9ID  = InitTask((char *)"Test9",  Test9,  20, READY, 0,0, 0);
        Test2ID  = InitTask((char *)"Test2",  Test2,  19, READY, 0,0, 0);
        Test4ID  = InitTask((char *)"Test4",  Test4,  18, READY, 0,0, 0);
        Test6ID  = InitTask((char *)"Test6",  Test6,  17, READY, 0,0, 0);
        Test10ID = InitTask((char *)"Test10", Test10, 17, READY, 0,0, 0);
        Test11ID = InitTask((char *)"Test11", Test11, 16, READY, 0,0, 0);
        Test12ID = InitTask((char *)"Test12", Test12, 16, READY, 0,0, 0);
        Test13ID = InitTask((char *)"Test13", Test13, 16, READY, 0,0, 0);
        Test14ID = InitTask((char *)"Test14", Test14, 16, READY, 0,0, 0);
        Test15ID = InitTask((char *)"Test15", Test15, 16, READY, 0,0, 0);
        Test8ID  = InitTask((char *)"Test8",  Test8,  10, READY, 0,0, 0);

Try it yourself with any cooperative multitasking library ...
There are a lot of such libraries around!

Deouss · Postby **Deouss** » Sat Apr 06, 2019 1:44 am

I am not sure how you define an OS, but an operating system should manage both cores efficiently and seamlessly.
What you call CoopOS is a thread that tries to be isolated from the tasks and memory and must take care not to overlap with any register or pointer values, and that is quite a shaky environment. So why call it an OS?

HelWeb · Postby **HelWeb** » Mon Apr 08, 2019 4:27 am

Dear Deouss,
in Germany we say "The tone makes the music".
So I will gather all my politeness and try to explain what seems so difficult for you to see.
First let's look at why a core without RTOS can make sense.
Here is the challenge:
1. Part
You get a rising edge at an external pin.
You have to set an output line high in response. The deadline for this response is 200 ns.
Hold the line high for 250 ns, then drop it low.
          _____________________
    _____|                        Input
                 __________
    ____________|          |_____ Answer output

         |<---->|<-------->|
           max     precise
          200 ns   250 ns

    max: deadline
    precise: +/- 10 ns

The timings have to be exact! Do not tell me: oh, there was a task switch, another external interrupt
with higher priority, ...

I am really interested in your solution!

Here is mine:

An RTOS task is created and pinned to core 1. Interrupts are disabled for core 1 (as above).

And this task is able to fulfill the demands:

    uint32_t irq = 0, lastIrq = 0;
    uint32_t st3, noww;

    while (1) {
        irq = REG_READ(GPIO_IN_REG) & (1 << 12);     // 12 is the input line
        if (irq != lastIrq) {                        // 166 ns +/- 10 ns can be guaranteed!

            REG_WRITE(GPIO_OUT_W1TS_REG, (1 << 5));  // pin 5 high: the reaction
            st3 = myGetCycleCount();                 // get current CPU cycle count
            lastIrq = irq;

            noww = st3;
            // do the pulse
            while ((noww - st3) < (60 - 22)) {       // 60 = 244 ns, 120 = 496 ns, 240 = 1.000 µs --- 240 MHz
                __asm__ __volatile__("esync; rsr %0,ccount" : "=a"(noww));
            }

            REG_WRITE(GPIO_OUT_W1TC_REG, (1 << 5));  // end of pulse: pin 5 low
        }
    }

And it works - even if the RTOS on core 0 is running to hell - because it needs neither a tick nor a task switch.
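
The cycle counts in the loop above follow from simple arithmetic: at 240 MHz one microsecond is 240 cycles, so a nominal 250 ns is 60 cycles. The sketch below shows that conversion and a wrap-safe way to take the difference of two readings of a free-running 32-bit cycle counter. The function names are hypothetical, and the CPU frequency is passed in as an assumption rather than read from the chip. The 22 cycles subtracted in the loop are presumably a measured allowance for overhead and are not derived here.

```c
#include <stdint.h>

/* Cycles needed for a duration in ns, rounded to nearest.
   cpu_mhz is the number of cycles per microsecond (e.g. 240). */
static uint32_t ns_to_cycles(uint32_t ns, uint32_t cpu_mhz)
{
    return (uint32_t)(((uint64_t)ns * cpu_mhz + 500u) / 1000u);
}

/* Duration in ns of a cycle count, rounded to nearest. */
static uint32_t cycles_to_ns(uint32_t cycles, uint32_t cpu_mhz)
{
    return (uint32_t)(((uint64_t)cycles * 1000u + cpu_mhz / 2u) / cpu_mhz);
}

/* Elapsed cycles between two counter readings. Unsigned subtraction
   gives the right result even if the counter wrapped once in between. */
static uint32_t elapsed_cycles(uint32_t start, uint32_t now)
{
    return now - start;
}
```

To time a code section in a program, one would read the counter before and after, pass both values to `elapsed_cycles()` and convert with `cycles_to_ns()`. The cost of the two reads themselves has to be measured once and subtracted.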

Please test it with a scope AND tell me how YOU solved the challenge.
Not theoretically, but with source code so I can test it!

By the way: it is an RTOS task and can be treated as such - using the full API (other than interrupts).
The concept of isolated CPUs in Linux preempt is very similar.
Well - your solution?

------
2. Part
You can also use RTOS cooperatively!
CoopOS is much lighter, but a lot of programs / tasks from RTOS can easily be transformed into CoopOS tasks. So if RTOS is an OS, then CoopOS is one as well. The major difference: it is perfect for fast reactions, but not so good at number crunching, and yes: the high frequency of task switches uses a lot of the CPU capacity. It is good for tasks which should be called very often.
For example: NeoPixels have strict timing: 400/800 ns. I built an RTOS task to manage this:
lighting up and dimming a color from zero to 0xff and back in 1 second, then the next color without a delay, and so on. I have to stop interrupts for some µs to get it stable.
And with a lot of timer and external interrupts the timing of RTOS gets more and more nondeterministic.
If I transfer it to CoopOS it doesn't matter - interrupts are disabled there.
RTOS is freed from this task - and CoopOS can do much more.
With 250,000 task switches per second running 10 tasks and more you can do a lot - just in time.
And you can do a lot of things for which you would need a lot of timers and interrupts using RTOS alone.
Running CoopOS as a task (without ticks) on one core means:
most of the RTOS API is usable! It is possible to transfer data in both directions.
Yes, there are sometimes real advantages to using such a cooperative system.
For instance: an RTOS interrupt routine sends a signal to CoopOS and to another task running on core 0 with:

taskSetSignalFIQ(12); // --> CoopOS
xQueueSendFromISR(gpio_evt_queue, &gpio_num, NULL); // --> RTOS

CoopOS running 12 tasks reacts faster and with less jitter.
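
One common way to pass such a signal from an interrupt routine on one core to a polling loop on the other is a shared word of flag bits that is updated atomically. The sketch below uses C11 atomics and is only an illustration of that pattern. It is not the implementation of `taskSetSignalFIQ()`, and the names `pending_signals`, `signal_set()` and `signal_take_all()` are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Bit n set means "signal n is pending". Statics start at zero. */
static atomic_uint_fast32_t pending_signals;

/* Producer side (for example an ISR on the other core): mark signal n
   as pending. n must be in the range 0..31. */
static void signal_set(unsigned n)
{
    atomic_fetch_or(&pending_signals, (uint_fast32_t)1 << n);
}

/* Consumer side (the polling loop): fetch and clear all pending signals
   in one atomic step, so no signal set in between is lost. */
static uint_fast32_t signal_take_all(void)
{
    return atomic_exchange(&pending_signals, 0);
}
```

The consumer never blocks: it calls `signal_take_all()` on each pass and dispatches on whichever bits are set.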

Conclusion:
With ONE task on core 1 doing polling you can react much faster than with RTOS alone
or
you have a cooperative system on core 1 where the tasks can react faster - because of the high task switch frequency. And there is a simple rule:
Tasks which are called very often (but whose work is done very fast) get the highest priority.
No task should work longer than a few microseconds.
An RTOS task with high priority will effectively stop all other tasks.

Another example: rotary dial switches are simple to read. They have a clock pin and a data pin.
If the clock changes level, read the data pin. If data is low it was turned left, if data is high it was turned right.
That is the theory. It's easy: interrupt on clock and then read data.
But: if you turn the dial very fast you get a lot of bounces. Hundreds. And with spacing down to 500 ns.
OK, you can build some electronics to suppress bouncing. But we want to do it in software.
With CoopOS it is simple. Read the switch 5 times (or more) at 1 ms intervals with low priority. With task switches every 5 µs it does not matter. Even if there is a jitter of 1 ms (hard to reach - it means 400 times a task switch occurs while another task with higher priority is READY) there is no problem and you get a reliable result.
With RTOS without interrupts: you will not get all events.
With interrupts: how do you manage the bursts of interrupts from the bouncing rotary dial contacts?
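
The "read the switch 5 times" approach can be sketched as a small debouncer that accepts a new level only after a number of consecutive identical samples. This is one possible reading of the idea, not code from CoopOS. The names `debounce_t`, `debounce_init()` and `debounce_sample()` are hypothetical, and the samples are passed in as arguments so the logic can be tested without hardware.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool    stable;     /* last accepted level */
    bool    candidate;  /* level currently being counted */
    uint8_t count;      /* consecutive samples seen at the candidate level */
    uint8_t needed;     /* samples required before a change is accepted */
} debounce_t;

static void debounce_init(debounce_t *d, bool initial, uint8_t needed)
{
    d->stable    = initial;
    d->candidate = initial;
    d->count     = 0;
    d->needed    = needed;
}

/* Feed one sample (taken, for example, every 1 ms by a low-priority task).
   Returns true exactly when the accepted level has just changed. */
static bool debounce_sample(debounce_t *d, bool level)
{
    if (level == d->stable) {       /* bounced back: discard the candidate */
        d->candidate = level;
        d->count = 0;
        return false;
    }
    if (level != d->candidate) {    /* first sample at the new level */
        d->candidate = level;
        d->count = 1;
    } else {
        d->count++;
    }
    if (d->count >= d->needed) {    /* seen often enough: accept it */
        d->stable = level;
        d->count = 0;
        return true;
    }
    return false;
}
```

With `needed` set to 5 and one sample per millisecond, a burst of bounces shorter than 5 ms never reaches the rest of the program.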

The philosophy behind CoopOS as a cooperative multitasking operating system is different from that of an RTOS. But there is nothing - really nothing - that you can do with an RTOS which you cannot do with a cooperative OS.
Some things are not so efficient, not so deterministic, and the percentage lost to task switching is much higher - but some things are done much faster. With RTOS you can promise to start a task every 10 ms exactly.
With CoopOS you may have jitter - from 5 to 50 µs. But you can promise that this task is called 200 times within 10 ms. And that is sometimes much better.
If you have a lot of (isolated) tasks which are not cooperative you need preemption.
But in some embedded systems the tasks DO a task switch, because they are done for now. Or they delay for some time, or they wait for something. So you may see systems running an RTOS where a timed preempt will never happen - because all tasks behave cooperatively in this sense.

What I mean is: Women can do things men cannot do - and vice versa.
And that is the same with RTOS and a cooperative Multitasker. Together they are better than one alone.
And the statement that cooperative multitasking cannot be an OS is just nonsense. Period.

You will find a lot on this topic on the internet - and the reason is simple:
(An example for you, because you don't like cooperation - a joke.)
If you have 8 cores and 6 are enough to run Linux - is it possible to run something different on the other 2 cores?
It will happen - you can be sure of it.

If you get more cores it sometimes makes sense to isolate a core from the OS to do something else.
And I am sure: within ONE year you will see a solution for the ESP32 that runs the RTOS on core 0 alone to do all the things like WiFi and internet and all the fine things you have drivers for - and leaves you free to run something else on the "App core".

How about a bet?
And last but not least, a piece of advice: if you do not understand or do not have enough experience - the best way is to ask - and learn and test!

Scientists were sure until some years ago: bumblebees are not able to fly, it is impossible - but look at them. Mankind needed thousands of years to understand how they do it.

PS: Don't forget the challenge!
