ESP32 Bare Metal Implementation

Posted: Tue Jun 25, 2019 9:03 pm
by berlinetta
Hello all,

I am attempting to create a project in which the APP core is running "bare metal" code (no RTOS). I have been informed that this should be possible, while the PRO core executes the WiFi / BLE stack code running under FreeRTOS.

To this point, I have not found any examples of a "bare metal" implementation. Is anyone aware of some sample code supporting peripherals or the CPU interrupt handling? When developing for most other devices, I have typically seen the tool suite provide assembly-level modules which initialized the vector table (start-up code). Is there any such code available for the ESP32?

Thanks in advance!
Mark

Re: ESP32 Bare Metal Implementation

Posted: Tue Jun 25, 2019 11:11 pm
by PeterR
Why?
I am told that app_main() runs from the startup task (whatever that is), but if you don't create any more tasks, is the one task the same as no tasks? Or if you only have one task, is that a task?

Re: ESP32 Bare Metal Implementation

Posted: Wed Jun 26, 2019 12:44 am
by WiFive
There have been some threads about it but I have not seen an example that I would use

Re: ESP32 Bare Metal Implementation

Posted: Wed Jun 26, 2019 12:46 pm
by berlinetta
PeterR wrote:
Tue Jun 25, 2019 11:11 pm
Why?
I am told that app_main() runs from the startup task (whatever that is), but if you don't create any more tasks, is the one task the same as no tasks? Or if you only have one task, is that a task?
There are several reasons for operating under bare metal... the two crucial details for me on this project are as follows:

1) The SPI communications I would like to implement with the host controller have extremely high latency (relatively speaking) when executed with the existing driver implementation due to its use of queues and the reliance on the RTOS. In the interest of efficiency, I would like to create a bare metal driver for the peripheral which is capable of handling communications with direct access to the interrupt controller and peripheral control registers. This would alleviate the need for additional delays in our host controller communications stream and speed up data transfers.

2) The sample code relies heavily on the pre-emptive operating system and the use of memory allocation for its operations. We have a proven development strategy in place for embedded code which is hierarchical and modular in nature, statically declaring required memory resources such that we know if we have any memory issues at compile time. Reliance upon memory allocation can create run-time problems with memory leaks and fragmented memory. As a senior developer, I am astonished at the quantity of horsepower and memory resources available on these latest devices, yet coding practices too often nullify the performance gains and waste power.
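
As a rough illustration of the static-allocation strategy described in point 2, here is a minimal C++ sketch in which every buffer is declared statically and a static_assert rejects the build if the module outgrows its RAM budget. The names RingBuffer, g_rx, g_tx and kSpiModuleBudget, and the 1024-byte budget, are hypothetical and do not come from ESP-IDF or from the project discussed here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical RAM budget for this module, in bytes (illustrative value only).
constexpr std::size_t kSpiModuleBudget = 1024;

// Fixed-capacity byte ring buffer. All storage lives inside the object,
// so nothing is ever requested from the heap at run time.
template <std::size_t Capacity>
class RingBuffer {
    static_assert(Capacity > 0, "capacity must be non-zero");
public:
    // Returns false instead of growing when the buffer is full.
    bool push(std::uint8_t byte) {
        assert(count_ <= Capacity);
        if (count_ == Capacity) return false;
        data_[head_] = byte;
        head_ = (head_ + 1) % Capacity;
        ++count_;
        return true;
    }
    // Returns false when there is nothing to read.
    bool pop(std::uint8_t& byte) {
        if (count_ == 0) return false;
        byte = data_[tail_];
        tail_ = (tail_ + 1) % Capacity;
        --count_;
        return true;
    }
    std::size_t size() const { return count_; }
private:
    std::array<std::uint8_t, Capacity> data_{};
    std::size_t head_ = 0;
    std::size_t tail_ = 0;
    std::size_t count_ = 0;
};

// Statically declared resources: their size is known to the compiler and linker.
static RingBuffer<256> g_rx;
static RingBuffer<256> g_tx;

// The build fails here, not in the field, if the module exceeds its budget.
static_assert(sizeof(g_rx) + sizeof(g_tx) <= kSpiModuleBudget,
              "SPI module exceeds its static RAM budget");
```

This sketch is not interrupt-safe as written; sharing such a buffer between an ISR and the main loop would need additional care (for example, a single-producer/single-consumer discipline).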

As far as tasks go, you are correct... If you do not create any tasks for the RTOS, the main thread will be the only one executing. I am not 100% certain how FreeRTOS would handle that main thread if it entered an infinite while loop. I am fairly certain that if you had not declared any tasks, it would have no reason to pre-empt your main loop to execute anything else. The caveat here, however, lies in the sample/driver code you reference within your project. Most of those driver calls establish tasks and RTOS resources of their own, so your main code would get pre-empted at asynchronous intervals.

Best Regards,
Mark

Re: ESP32 Bare Metal Implementation

Posted: Wed Jun 26, 2019 12:53 pm
by berlinetta
WiFive wrote:
Wed Jun 26, 2019 12:44 am
There have been some threads about it but I have not seen an example that I would use
Thanks for the feedback... unfortunately, I am experiencing the same issue finding any worthwhile details through the forum and the documentation.

This is not something unique to Espressif. Too many of the manufacturers today are providing source code repositories offering solutions which keep much of the technical stuff "under the hood" and do not lend themselves well to guys like me who would prefer to crank up the efficiency. What is lacking in sample code is not covered in the technical documentation either, so it makes my job a bit more difficult.

Best Regards,
Mark

Re: ESP32 Bare Metal Implementation

Posted: Fri Jun 28, 2019 7:08 pm
by PeterR
The SPI communications I would like to implement with the host controller have extremely high latency (relatively speaking) when executed with the existing driver implementation due to its use of queues and the reliance on the RTOS.
Understood. I am using an MCP2515 (for cost reasons), which has a maximum SPI bus clock of 10 MHz. The MCP2515 has a chatty protocol: (1) Why are you raising an interrupt? (2) OK, then give me the data, etc. It would be difficult to achieve 500 kbps with the ESP SPI latency. I think I got to around 400 kbps with a dedicated core.
The sample code relies heavily on the pre-emptive operating system and the use of memory allocation for its operations. We have a proven development strategy in place for embedded code which is hierarchical and modular in nature, statically declaring required memory resources such that we know if we have any memory issues at compile time.
Sure, and safety cultures would frown on it too. Resisting the urge to poke at what I have seen in safety culture, let's just say that I have rarely seen CI.
We have dual-core 160 MHz+ parts to achieve what we did 10 years ago with 20 MHz. But we create the application a lot quicker, I think (I would stand by the toolset being a lot cheaper).
I think you are ok with the ESP library and/or can preallocate.
For your own stuff, and obviously it does depend on your other constraints, overloading new with a pool allocator works for me.
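
For readers unfamiliar with the idea, "overloading new with a pool allocator" could look roughly like the sketch below: a class-specific operator new hands out fixed-size blocks from a statically declared array instead of the heap, so the worst-case memory use is fixed at build time and fragmentation cannot occur. StaticPool, SpiMessage, g_msgPool and the pool size of 4 are hypothetical names and values chosen for illustration; this is not code from ESP-IDF or from the poster's project.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Fixed-size block pool carved out of a member array; it never touches the heap.
template <std::size_t BlockSize, std::size_t BlockCount>
class StaticPool {
    struct Node { Node* next; };
    static constexpr std::size_t kAlign = alignof(std::max_align_t);
    static constexpr std::size_t kRaw =
        BlockSize > sizeof(Node) ? BlockSize : sizeof(Node);
    // Round each block up so every block start is suitably aligned.
    static constexpr std::size_t kStride = (kRaw + kAlign - 1) / kAlign * kAlign;
public:
    StaticPool() {
        // Thread every block onto the free list once, at start-up.
        for (std::size_t i = 0; i < BlockCount; ++i) pushFree(storage_ + i * kStride);
    }
    // Returns nullptr when the pool is exhausted.
    void* allocate() {
        if (!free_) return nullptr;
        Node* n = free_;
        free_ = n->next;
        ++used_;
        return n;
    }
    void release(void* p) {
        assert(owns(p));
        pushFree(p);
        --used_;
    }
    std::size_t used() const { return used_; }
    bool owns(const void* p) const {
        const unsigned char* c = static_cast<const unsigned char*>(p);
        return c >= storage_ && c < storage_ + sizeof(storage_) &&
               static_cast<std::size_t>(c - storage_) % kStride == 0;
    }
private:
    void pushFree(void* p) { free_ = new (p) Node{free_}; }
    alignas(kAlign) unsigned char storage_[kStride * BlockCount];
    Node* free_ = nullptr;
    std::size_t used_ = 0;
};

// Hypothetical message type whose instances come from the pool.
struct SpiMessage {
    unsigned char payload[32];
    std::size_t length = 0;
    static void* operator new(std::size_t size);
    static void operator delete(void* p) noexcept;
};

// Room for at most four messages, reserved statically.
static StaticPool<sizeof(SpiMessage), 4> g_msgPool;

void* SpiMessage::operator new(std::size_t size) {
    assert(size == sizeof(SpiMessage));
    (void)size;
    void* p = g_msgPool.allocate();
    if (!p) throw std::bad_alloc();  // pool exhausted; a real system might handle this differently
    return p;
}

void SpiMessage::operator delete(void* p) noexcept {
    if (p) g_msgPool.release(p);
}
```

With this in place, `new SpiMessage` and `delete msg` read like ordinary C++ but only ever recycle the four preallocated blocks. On a build with exceptions disabled, the exhaustion path would have to return an error some other way.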

I don't know the size of your team but I doubt that (if you stick with ESP) you can compete with the flight hours of the mainstream ESP32 library.
So does that not have you reevaluating approach or silicon?
For myself, I wish that ESP would provide a bare metal library (like CMSIS) and then an OS-aware level. Should be easy enough. Call a callback rather than giving a semaphore...
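
The layering suggested above (a bare driver that calls a callback, with any OS awareness added on top) might be sketched as follows. Everything here is hypothetical and simulated on the host: BareSpiDriver, PolledReceiver and FakeSemaphore are illustrative names, no real ESP32 registers or FreeRTOS calls are used, and the "ISR" is simply a method invoked directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Signature of the completion hook. The bare layer knows nothing about any OS.
using SpiDoneCallback = void (*)(void* context, const std::uint8_t* data, std::size_t length);

// Bare layer: stores one callback and invokes it when a transfer completes.
class BareSpiDriver {
public:
    void setDoneCallback(SpiDoneCallback cb, void* context) {
        cb_ = cb;
        context_ = context;
    }
    // On real hardware this would run from the transfer-complete interrupt;
    // in this sketch it is called directly to simulate that event.
    void onTransferCompleteIsr(const std::uint8_t* data, std::size_t length) {
        if (cb_) cb_(context_, data, length);
    }
private:
    SpiDoneCallback cb_ = nullptr;
    void* context_ = nullptr;
};

// Bare-metal consumer: the callback just records the data and raises a flag
// that a main loop could poll.
struct PolledReceiver {
    volatile bool ready = false;
    std::uint8_t last = 0;
    static void onDone(void* ctx, const std::uint8_t* data, std::size_t length) {
        assert(ctx != nullptr);
        PolledReceiver* self = static_cast<PolledReceiver*>(ctx);
        if (length > 0) self->last = data[0];
        self->ready = true;
    }
};

// OS-aware consumer: a counting stand-in for "give a semaphore from the ISR".
struct FakeSemaphore {
    int count = 0;
    void give() { ++count; }
};

void giveSemaphoreOnDone(void* ctx, const std::uint8_t*, std::size_t) {
    static_cast<FakeSemaphore*>(ctx)->give();
}
```

The same driver object serves both styles: a bare-metal application registers PolledReceiver::onDone, while an RTOS wrapper would register a function like giveSemaphoreOnDone that wakes a waiting task.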

Would be very interested in a low latency SPI driver if you create one.....

Re: ESP32 Bare Metal Implementation

Posted: Mon Jul 01, 2019 7:04 pm
by berlinetta
Thanks for the reply, Peter...
We have dual-core 160 MHz+ parts to achieve what we did 10 years ago with 20 MHz. But we create the application a lot quicker, I think (I would stand by the toolset being a lot cheaper).
I would agree in part with your statement... provided the user has a good understanding of the development system and how to properly use it. However, there is a learning curve involved.

I have traditionally dealt with bare-metal systems - and yes, some of those were used within safety-related products - so I am not well versed in the use of RTOS. I am also lacking in experience with multi-core devices such as ESP32, so I would have to rely heavily on the established IDF code to handle most of the important stuff "under the hood". We have paid a considerable sum to an outside consultant to put together a proof of concept using the ESP32 to replace an existing solution. Unfortunately, after the many hours spent to derive the code, it is not functional for many reasons. So I am left to ride up the learning curve on my own - I am a firmware design team of '1' in the R&D area of my company.

I began with a review of the ESP32 datasheet and the task of installing the tools in an attempt to build the code that was provided to us by the consultant. My experience with the installation procedure for use with Eclipse was not stellar, and I am even less impressed with the stability of the debugger operation in that environment. After weeks of poring over documentation, scavenging through the forums and experimenting with the sample code and tool set, my gut was telling me I may be better off operating our application-specific code on the APP core as a bare-metal implementation. I am told that this is possible to do, but my first foray into this experiment is proving to be very tough.
So does that not have you reevaluating approach or silicon?
Yes, it does!

In your reply, you made several references to "preallocate" or use of a "pool allocator" to deal with the SPI communications. I don't understand these references... Could you elaborate on them for me?

Out of curiosity, are you an Espressif employee with intimate knowledge of the architecture, or are you an independent developer? (I'm just wondering if you could answer other questions I have regarding design implementation issues.)

Best Regards,
Mark