FPU Documentation for s3

srjasz
Posts: 46
Joined: Wed Apr 03, 2024 4:29 pm

Postby srjasz » Thu Jun 27, 2024 8:50 pm

I'm looking for documentation on using the FPU on the ESP32-S3. I searched the Programming Guide and the Technical Reference Manual but found nothing.

Thanks

MicroController
Posts: 1702
Joined: Mon Oct 17, 2022 7:38 pm
Location: Europe, Germany

Re: FPU Documentation for s3

Postby MicroController » Fri Jun 28, 2024 8:09 am

What documentation do you want?
Normally you don't explicitly 'use' the FPU, you just let the compiler deal with it.
On the assembly level, the FPU is used via the Xtensa ISA's floating point instructions.

srjasz

Re: FPU Documentation for s3

Postby srjasz » Mon Jul 08, 2024 2:19 pm

Pretty much the "you just let the compiler deal with it" part. How can I verify that the FPU is being used by the compiler? I have a multiplication that takes very long to process, longer than a 16-bit SPI transfer, which is unimaginably long. I need to verify that it is in fact working. I have also heard rumors that it doesn't do division, so I would like to see what other shortcomings it may have.

Thanks

MicroController

Re: FPU Documentation for s3

Postby MicroController » Mon Jul 08, 2024 3:29 pm

You can play around with Compiler Explorer to see what instruction sequences gcc puts out.

According to the Xtensa ISA reference manual, section 4.3.11.5 "Divide and Square Root Sequences", a single FP division indeed takes around 25 FP instructions.
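As a rough sketch of what one might paste into Compiler Explorer to check this: the function names below are made up for illustration, and an Xtensa ESP32-S3 GCC target is assumed to be selectable there. If the FPU is being used, the multiply should show up as a `mul.s` instruction; a call to a soft-float helper such as `__mulsf3` would suggest it is not. The last two functions illustrate a common C pitfall that may or may not apply here: an unsuffixed literal like `1.5` is a `double`, so the multiply is carried out in double precision, which the single-precision FPU does not handle natively.

```c
/* Compile with optimisation enabled (e.g. -O2) and inspect the assembly. */

/* Single-precision multiply: with the FPU in use, expect a mul.s instruction. */
float fmul(float a, float b)
{
    return a * b;
}

/* Single-precision divide: expect a longer instruction sequence or a call. */
float fdiv(float a, float b)
{
    return a / b;
}

/* The unsuffixed literal 1.5 has type double, so `a` is promoted and the
   multiply is done in double precision before converting back to float. */
float scale_double_literal(float a)
{
    return a * 1.5;
}

/* With the f suffix the whole expression stays in single precision. */
float scale_float_literal(float a)
{
    return a * 1.5f;
}
```

Comparing the output for `scale_double_literal` and `scale_float_literal` should make it obvious whether an accidental `double` is hiding in a slow expression.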

Multiplication should be pretty fast, but note that a) ESP-IDF does 'lazy saving/restoring' of the FPU registers, which means that the first FP instruction executed in a task after a context switch can appear to take a hundred or so clock cycles, and b) the FPU cannot be used in an ISR context.
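Where a scaled multiply is needed inside an ISR, one possible workaround is integer fixed-point arithmetic, which involves no FP instructions at all. The following is only a minimal sketch: the `q16_t` type and the `q16_*` helpers are hypothetical names, not part of ESP-IDF.

```c
#include <stdint.h>

/* Hypothetical Q16.16 fixed-point type: 16 integer bits, 16 fraction bits. */
typedef int32_t q16_t;

/* Convert an integer to Q16.16 (caller must keep x within 16 signed bits). */
static inline q16_t q16_from_int(int32_t x)
{
    return (q16_t)(x * 65536);
}

/* Multiply two Q16.16 values using a 64-bit intermediate product. */
static inline q16_t q16_mul(q16_t a, q16_t b)
{
    return (q16_t)(((int64_t)a * b) >> 16);
}

/* Drop the fraction bits (arithmetic shift assumed for negative values,
   which is what gcc does). */
static inline int32_t q16_to_int(q16_t x)
{
    return x >> 16;
}
```

For example, `q16_mul(q16_from_int(3), 0x8000)` multiplies 3 by 0.5 and yields `0x18000`, which is 1.5 in Q16.16.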

srjasz

Re: FPU Documentation for s3

Postby srjasz » Tue Jul 09, 2024 3:42 pm

Thanks for the Xtensa document.

The last sentence in your response seems to be only partially correct. I have a single task running on core 0 which is nothing but a while loop. I have a single task running on core 1 which is doing multiplication. The FPU is never activated on core 1. There are no context switches on either core. I would appreciate any ideas you may have.

Thanks again for all your help.

MicroController

Re: FPU Documentation for s3

Postby MicroController » Tue Jul 09, 2024 4:34 pm

srjasz wrote: "I have a multiplication that is taking very long to process, longer than a 16 bit SPI transfer"

Works for me. On my S3, one single-precision floating point multiplication takes 4 CPU clock cycles. (Or 7 when including the transfer of 2 operands and 1 result between address and FPU registers in the count.)
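A measurement along these lines could be sketched as follows. This is not the code used for the numbers above. `time_one_fmul` and `read_cycles` are hypothetical helpers, and `esp_cpu_get_cycle_count()` from `esp_cpu.h` is assumed to be available (ESP-IDF v5.x). On a host build the fallback counts `clock()` ticks rather than CPU cycles, so only the structure carries over. The warm-up multiply is there to absorb the lazy FPU context restore, and the reported count also includes the volatile loads/stores and the counter reads themselves.

```c
#include <stdint.h>

#ifdef ESP_PLATFORM
#include "esp_cpu.h"   /* assumed: ESP-IDF v5.x, esp_cpu_get_cycle_count() */
static inline uint32_t read_cycles(void)
{
    return (uint32_t)esp_cpu_get_cycle_count();
}
#else
#include <time.h>      /* host fallback: clock ticks, not CPU cycles */
static inline uint32_t read_cycles(void)
{
    return (uint32_t)clock();
}
#endif

/* Hypothetical helper: times one float multiply and returns elapsed ticks. */
uint32_t time_one_fmul(float a, float b, float *result)
{
    volatile float va = a, vb = b;  /* volatile stops constant folding */
    volatile float sink;

    sink = va * vb;                 /* warm-up: first FP use in the task */

    uint32_t start = read_cycles();
    sink = va * vb;                 /* the operation being measured */
    uint32_t end = read_cycles();

    *result = sink;
    return end - start;             /* unsigned subtraction tolerates wrap */
}
```

Timing an empty start/end pair the same way and subtracting it gives a rough idea of the measurement overhead.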

srjasz

Re: FPU Documentation for s3

Postby srjasz » Wed Jul 10, 2024 8:26 pm

Thanks for the feedback and for sticking with me on this issue. I set up more incremental measurements and found that the FPU was in fact working; I saw the same number of cycles as you. I had been leaning toward an FPU problem because the loop doing the multiplication ran slower when a task was running on core 1. It turns out that everything on core 0 ran slower when a task was running on core 1, even though the core 1 task was nothing more than a while() loop. Some things take as much as 50% more cycles; the multiplication loop takes 32% more cycles.

Although I had suggested that the FPU was not being used because of a task running on another core, the latest measurements don't support that.

Thanks again for your help.
