Implementing A Custom ESP32 Runtime Linker

Post by **p-rimes** » Mon Jan 22, 2018 3:00 am

I would like to make a runtime linker which runs on the ESP32, and can receive ELF binaries from the network, perform the necessary relocations to addresses (via a symbol table), and then jump to the relocated code and begin execution.

(I have done this before for ARM Thumb architecture and it worked great!) When I did this before, for development I had a script which exported all functions/globals (I get this from the output of `nm` applied to a previous ELF build). Later, I can strip the table to a list of allowed symbols.
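A minimal sketch of that kind of export script, written in Python for the build host. It assumes the default `nm` text output (`address type name` per line); the helper name `build_export_table`, the sample symbols, and the addresses are made up for illustration.

```python
def build_export_table(nm_output, allowed=None):
    """Map symbol name -> address from `nm` text output.

    Lines without an address column (undefined symbols) are skipped.
    If `allowed` is given, only those names are kept (the allow-list).
    """
    table = {}
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # e.g. "         U printf" has no address
        addr, _sym_type, name = parts
        if allowed is not None and name not in allowed:
            continue
        table[name] = int(addr, 16)
    return table


# Hypothetical sample input; real addresses come from your own build.
SAMPLE_NM = """\
400d1000 T app_main
3ffb0000 D some_global
         U printf
400d2000 t local_helper
"""

exports = build_export_table(SAMPLE_NM, allowed={"app_main", "some_global"})
```

During development `allowed` can be left as `None` to export everything, then tightened to an explicit set later.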

I recall the hard parts being:

- Determining the arithmetic for the possible relocations (R_XTENSA_32, R_XTENSA_NONE, etc)
- Creating a minimal linker script for producing sane binaries (long-calls, single section, etc)
- Figuring out the best compilation options (`-pie`, `-mlongcalls`)
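Before any arithmetic can happen, the relocation entries have to be decoded. The sketch below is a Python prototype that assumes little-endian ELF32 `Elf32_Rela` records (`r_offset`, `r_info`, `r_addend`, 12 bytes each) with the symbol index in the upper 24 bits of `r_info` and the type in the low 8 bits. The `parse_rela` name is hypothetical, and only the two relocation types named above are listed.

```python
import struct

# Relocation type numbers for the two types mentioned above.
R_XTENSA_NONE = 0
R_XTENSA_32 = 1

# r_offset (u32), r_info (u32), r_addend (s32), little-endian.
RELA_ENTRY = struct.Struct("<IIi")


def parse_rela(section_bytes):
    """Yield (offset, symbol_index, reloc_type, addend) per Elf32_Rela entry."""
    for r_offset, r_info, r_addend in RELA_ENTRY.iter_unpack(section_bytes):
        yield r_offset, r_info >> 8, r_info & 0xFF, r_addend


# Synthetic one-entry section: symbol #5, type R_XTENSA_32, addend 4.
raw = RELA_ENTRY.pack(0x10, (5 << 8) | R_XTENSA_32, 4)
entries = list(parse_rela(raw))
```

An on-device loader would do the same decoding in C; prototyping it on the host first makes it easy to compare against `readelf -r`.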

If anyone has experience with this or can recommend any ideas, I'd be very happy to hear it! I'm especially interested in the specifics of the relocations. Is there an architecture document besides this one?
https://0x04.net/~mwk/doc/xtensa.pdf

Post by **ESP_igrr** » Mon Jan 22, 2018 6:48 am

Having gone through this exercise some time ago, I recall there wasn't a good reference on Xtensa relocations.
This comment and ones which follow have some info and pointers: https://github.com/jcmvbkbc/gcc-xtensa/ ... -102174847 But mostly, use binutils source code as reference.

Post by **p-rimes** » Tue Jan 23, 2018 5:41 pm

Wow @ESP_igrr, that code (in binutils-gdb) is handy! It seems to document all the arithmetic needed for the possible Xtensa relocs. I do have one question about the binutils code. When I parse an Xtensa ELF binary, I see a small relocation section for each section, e.g. `.rela.text`. I take `rela` to mean the relocation format that carries an explicit addend in each relocation entry, so the arithmetic should not need to read the existing value from the `.text` section (that would be `rel`, not `rela`). But the calculation for R_XTENSA_32 in binutils looks like it does read the existing value, `rel`-style. See line 1876 here:
https://sourceware.org/git/gitweb.cgi?p ... HEAD#l1876
Am I missing something? Either the ELF section name is misleading, or binutils is wrong, or I haven't grasped it yet.
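To make the question concrete, here is a small Python sketch of the behaviour described above: the word already stored at the relocation target is read, the symbol value and the entry's addend are added, and the result is written back. This is only a model of what the post says binutils appears to do, assuming a little-endian image; `apply_r_xtensa_32` is a hypothetical helper and should be checked against the binutils source before being relied on.

```python
import struct


def apply_r_xtensa_32(image, offset, sym_value, addend):
    """Patch the 32-bit word at `offset`: word = (word + S + A) mod 2**32."""
    (word,) = struct.unpack_from("<I", image, offset)
    word = (word + sym_value + addend) & 0xFFFFFFFF
    struct.pack_into("<I", image, offset, word)


# In-place word is zero, so the result is simply S + A.
image = bytearray(8)
apply_r_xtensa_32(image, 4, 0x400D1000, 8)
patched = struct.unpack_from("<I", image, 4)[0]
```

If the in-place word is always zero in a `rela` object, the two interpretations give the same result, which is something a loader could assert while testing.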

Now I have some questions about the resulting ELF binary, specifically some extra sections I wasn't expecting (especially after running `strip --strip-unneeded -g -x` to produce a stripped binary, the sections still remain):

- `.comment`
- `.xtensa.info`
- `.xt.lit`
- `.xt.prop`

I'm sure I can get rid of the first two; it just seems odd that `strip --strip-unneeded` didn't do that.

I'm especially curious as to the .xt.lit and .xt.prop sections as I haven't seen those before. I can just reloc them as well, but would be nice to know if I can just strip them and/or what the tradeoff would be. Or if there is a doc about what those sections do I'd be happy to read it! The best I could find so far was this thread (which seems to indicate they are useful, but optional?):
http://lists.linux-xtensa.com/pipermail ... 01150.html
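One way a loader could sidestep the question is to copy only sections whose `sh_flags` include `SHF_ALLOC`, and ignore everything else whether or not `strip` removed it (relocation and symbol tables still have to be read, just not copied to the target). The Python sketch below assumes the section headers have already been parsed into `(name, sh_flags)` pairs. The flag values in the sample are illustrative only; whether `.xt.lit` and `.xt.prop` carry `SHF_ALLOC` in a given build should be checked with `readelf -S`.

```python
# Standard ELF section flag bits.
SHF_WRITE = 0x1
SHF_ALLOC = 0x2
SHF_EXECINSTR = 0x4


def sections_to_load(sections):
    """Return the names of sections that occupy memory at run time."""
    return [name for name, flags in sections if flags & SHF_ALLOC]


# Hypothetical section table; real flags come from the parsed ELF.
SAMPLE_SECTIONS = [
    (".text", SHF_ALLOC | SHF_EXECINSTR),
    (".data", SHF_ALLOC | SHF_WRITE),
    (".comment", 0),
]

loadable = sections_to_load(SAMPLE_SECTIONS)
```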

Post by **p-rimes** » Wed Jan 24, 2018 5:45 pm

I was able to address my second question ("producing sane binaries (long-calls, single section, etc)") with the following flags:

-fno-function-sections \
-fno-data-sections \
-fsingle-precision-constant \
-mtext-section-literals \
-mlongcalls \
-O2

Some of these are Xtensa-specific (although similar to ARM), which I found from: https://gcc.gnu.org/onlinedocs/gcc/Xtensa-Options.html

I hope that using `-mtext-section-literals` won't cause any issues; I'm not sure what the tradeoff is there.

Finally, I have a couple more questions hopefully someone can answer:

- How shall I enforce read-only on the RAM before jumping to it? Do I need to mess with the linker scripts, copy my bits into a specific memory segment, re-use some of the flash loading functions, etc? This seems like a security nightmare, anyway, but I'm interested in trying!
- How can I access the GOT/PLT from my loader code? Can I put my own lazy loading code in the stubs? (Basically implementing something like `dlsym`, but the libraries come as ELFs from the network.) Actually, if `dlsym` is already supported on the ESP32, then maybe I am wasting my time re-implementing a loader?

Post by **p-rimes** » Sat Jan 27, 2018 8:22 pm

Alright, I think I have managed to answer (but not yet implement!) my first question above ("How shall I enforce read-only on the RAM before jumping to it?"):

The ESP32 provides an I-bus MMU for "pages" held within SRAM0. There is support for PIDs 2~7 which are unprivileged/read-only for those pages (so I should execute the dynamic code while using one of these PIDs), as well as PIDs 0~1 which are privileged (so I should execute the ELF loader code using these PIDs, to be able to write into IRAM before jumping with a new PID).

Sounds about right?

So, my one remaining question is about whether I can generate a GOT/PLT for certain other symbols within my existing IDF application (e.g. everything in libmain.a), and provide custom PLT resolver stubs to do the initial symbol resolution? (i.e. to create a lazy loader)

Post by **p-rimes** » Sun Jan 28, 2018 2:54 am

After giving it a bit more thought, I believe I specifically want the GOT, and not the PLT. I do not actually want lazy loading for my purposes. Instead, all the relocations should be performed immediately at load time, and the load should abort on any symbol resolution failure (rather than finding out later when the symbol is first used).
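A minimal Python sketch of that eager policy: every name the image needs is looked up in the export table up front, and the load fails as a whole if any are missing. The names `resolve_all` and `UnresolvedSymbols`, and the sample symbols and addresses, are hypothetical.

```python
class UnresolvedSymbols(Exception):
    """Raised when the image references names missing from the export table."""


def resolve_all(needed, exports):
    """Return {name: address} for every needed symbol, or raise before loading."""
    missing = sorted(name for name in needed if name not in exports)
    if missing:
        raise UnresolvedSymbols(", ".join(missing))
    return {name: exports[name] for name in needed}


# Made-up export table and request for illustration.
exports = {"app_log": 0x400D1000, "app_delay": 0x400D1040}
resolved = resolve_all(["app_log"], exports)
```

Reporting all missing names at once, rather than stopping at the first, makes a rejected image easier to diagnose.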

I believe that can be achieved (for systems with a dynamic linker) with the linker flag `-z now`, but I seem to be having success with generating a PIE binary (via the compile and link flag `-pie`). This generates position-independent code without the PLT indirection. Apparently GCC 6+ has the compiler flag `-fno-plt`, which would produce exactly what I want (a shared object without a PLT), but alas it is not supported in GCC 5.

Combined with this approach (GOT-only ELF binary), I believe I should use the MMU to implement the RELRO technique, so that a fully linked GOT is set up by my loader, copied into the MMU pages, and then made read-only, before executing with an unprivileged PID.

I'm so curious @ESP_igrr, what part of this you have already done, and what your results were. It seems very interesting to me, and I actually think with the ESP32's MMUs that the security might be strong enough to run (limited) untrusted code on an ESP32, or even multi-user.

Post by **CyCl0ne** » Fri Jan 11, 2019 8:48 am

Hi,

I was thinking about something similar. How is your progress going?

Cheers
C.
