
ERL-1475: Allow erlang:load_nif in archives

Open OTP-Maintainer opened this issue 3 years ago • 7 comments

Original reporter: lukas Affected version: Not Specified Component: erts Migrated from: https://bugs.erlang.org/browse/ERL-1475


We want to be able to load nif code from archives.

The fallback approach to this would be to create a temp file that we dlopen and then unlink directly afterward.
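A minimal sketch of that fallback, assuming a POSIX system and that the shared object has already been read out of the archive into memory. The helper name `dlopen_via_tempfile` and the `/tmp/nif-XXXXXX` template are hypothetical, not part of ERTS:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: write an in-memory shared object to a temp file,
 * dlopen() it, then unlink the file directly afterward. */
void *dlopen_via_tempfile(const void *buf, size_t len)
{
    assert(buf != NULL);

    char path[] = "/tmp/nif-XXXXXX";   /* assumed location; mkstemp replaces XXXXXX */
    int fd = mkstemp(path);
    if (fd < 0)
        return NULL;

    /* Copy the whole image to the file, handling short writes. */
    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n <= 0) {
            close(fd);
            unlink(path);
            return NULL;
        }
        p += n;
        left -= (size_t)n;
    }
    close(fd);

    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);

    /* The loader has mapped the file by now (or failed), so the directory
     * entry can go away in either case. */
    unlink(path);
    return handle;   /* NULL on failure; dlerror() has the reason */
}
```

One caveat with this approach: it fails if the temp directory is mounted `noexec`, since the loader cannot map the file as executable.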

For most platforms, we should be able to load the memory directly by using some platform specific features/hacks:

Linux: memfd_create + dlopen of /proc/self/fd/%d
Windows: https://github.com/py2exe/py2exe/blob/master/source/MyLoadLibrary.c
OS X: https://stackoverflow.com/questions/11821955/load-dynamic-library-from-memory
FreeBSD/DragonFly: https://www.freebsd.org/cgi/man.cgi?query=fdlopen&sektion=3
Android: https://github.com/google/iree/issues/3845
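For the Linux entry, the idea could look roughly like the sketch below. It assumes Linux >= 3.17 and a glibc that exposes `memfd_create` (2.27 or later); the function name `dlopen_memfd` is hypothetical:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: load a shared object from memory via
 * memfd_create + dlopen of /proc/self/fd/%d. */
void *dlopen_memfd(const char *name, const void *buf, size_t len)
{
    assert(name != NULL && buf != NULL);

    /* Anonymous, memory-backed file; `name` only shows up for debugging. */
    int fd = memfd_create(name, MFD_CLOEXEC);
    if (fd < 0)
        return NULL;

    /* Copy the whole image into the memfd, handling short writes. */
    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n <= 0) {
            close(fd);
            return NULL;
        }
        p += n;
        left -= (size_t)n;
    }

    /* The dynamic loader needs a path, so hand it the procfs alias. */
    char path[64];
    snprintf(path, sizeof path, "/proc/self/fd/%d", fd);

    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);

    /* On success the loader's mappings keep the memory alive. */
    close(fd);
    return handle;   /* NULL on failure; dlerror() has the reason */
}
```

This needs `/proc` to be mounted, which is one reason the temp-file fallback would still be required.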

There is also a bit of discussion here: https://github.com/erlang/otp/pull/3002

OTP-Maintainer avatar Feb 01 '21 08:02 OTP-Maintainer

josevalim said:

To expand on what was mentioned in the GitHub issue, the problem with escripts is not only .so files, but also reading any file that might be in priv. There is also a performance concern about enabling such lookups by default.

Wojtek Mach pointed out to me that the Go community has the same issue with regard to reading static files from inside binaries, and they are addressing this via a FileSystem interface (https://go.googlesource.com/proposal/+/master/design/draft-iofs.md).

Then I realized that Erlang already has something similar via the erl_prim_loader with both efile and inet loaders. I wonder if we could have a third loader, called the "escript" loader. In this case:
 1. The "escript" loader would be set by default when running escripts, making sure the cost of escripts only applies to escripts
 2. Using archives in regular execution mode (i.e. starting the VM as usual and then calling code:prepend_path("foo.ez/ebin")) can either be deprecated or has to be explicitly enabled by setting the loader to "escript" too

This means the features are still there but the price is only paid by whoever needs them.

OTP-Maintainer avatar Feb 03 '21 14:02 OTP-Maintainer

lukas said:

> Then I realized that Erlang already has something similar via the erl_prim_loader with both efile and inet loaders. I wonder if we could have a third loader, called the "escript" loader.

The loader already works as it should with archives. The problem is that the file API does not work there, so there would need to be an "escript" file server of some sort, but that does not work either as a lot of libraries pass raw to the file:open call in order to sidestep the file server.

I have a couple of ideas on how to solve this problem, but they all have their compromises. One idea could be that code:priv_dir/1 returns a relative path that starts with some special token, for instance "@". The path would then look like this: "@lib/stdlib-1.2.3.4/priv/". This would be a cheap path to check. Not sure how such a path would be handled by filename and friends though.
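To illustrate why such a check would be cheap, here is a small sketch of how a file layer might classify a path with a single byte comparison. The names `ARCHIVE_TOKEN`, `is_archive_path` and `archive_member` are hypothetical, and "@" is only the example token from above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ARCHIVE_TOKEN '@'   /* example token only; not an existing convention */

/* True if the path refers to a file inside the archive. */
bool is_archive_path(const char *path)
{
    assert(path != NULL);
    return path[0] == ARCHIVE_TOKEN;
}

/* Returns the archive-relative member name, or NULL for ordinary paths. */
const char *archive_member(const char *path)
{
    return is_archive_path(path) ? path + 1 : NULL;
}
```

Ordinary paths pay for one comparison and nothing else, which matches the goal that only archive users bear the cost.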

OTP-Maintainer avatar Feb 05 '21 12:02 OTP-Maintainer

josevalim said:

> The loader already works as it should with archives.

Right. My thought was to streamline these changes:
 1. Change the loader so reading from archives only happens under a special mode
 2. And later unify the loader and the file server - this means the file:open/2 with the inet loader would be able to read on the parent node too

You have a good point about the raw flag. I assume files coming from archives/escripts would also have other limitations, such as being read-only.

OTP-Maintainer avatar Feb 05 '21 12:02 OTP-Maintainer

This one would allow escripts to embed shared libraries for NIFs, right? This would be a nice step toward fully independent executables.

galdor avatar Jun 24 '21 07:06 galdor

Any news regarding this issue? If I am not mistaken, this is the only thing preventing the creation of escripts with NIFs.

galdor avatar Oct 12 '21 07:10 galdor

No news yet. We are planning to do this at some point, but have not gotten around to it yet.

garazdawi avatar Oct 12 '21 07:10 garazdawi

@garazdawi, please take a look at this example, where I implemented the function dlopen_mem(), which is an equivalent of dlopen() but loads a shared object from memory. Maybe you'll find it helpful for implementing this feature in Erlang for loading NIF shared objects from a binary? If I find some spare time I might submit a PR. Perhaps erlang:load_nif/2 should accept Filename::string() | {Filename::string(), NifSO::binary()} as the 1st argument? In the case of a tuple, Filename would be the name passed to the memfd_create(2) call, and the NifSO binary would be the memory content of the shared object to be loaded. Generally speaking, when loading from a binary, the Filename doesn't have any significance for Linux kernel >= 3.17, but for earlier versions shm_open(3) needs to be used, and that function does create an entry with the filename under /dev/shm. Or maybe it would be permissible not to support kernels below 3.17, in which case erlang:load_nif/2 could be simplified to accept a first argument of type Filename::string() | NifSO::binary()?
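The kernel-version trade-off described above could be sketched as follows. This is not the linked dlopen_mem() implementation, just an illustration of the fallback order; the helper name `open_anon_fd` is hypothetical, and it assumes a glibc that declares `memfd_create`:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: get a memory-backed fd for a shared object image.
 * `name` must start with '/', as shm_open() expects. */
int open_anon_fd(const char *name)
{
    assert(name != NULL && name[0] == '/');

    /* Linux >= 3.17: no filesystem entry is created, the name is cosmetic. */
    int fd = memfd_create(name + 1, MFD_CLOEXEC);
    if (fd >= 0)
        return fd;

    /* Older kernels: this creates an entry under /dev/shm, so the name
     * matters and must be unique (O_EXCL). */
    fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd >= 0)
        shm_unlink(name);   /* remove the entry; the fd stays valid */
    return fd;
}
```

If kernels below 3.17 were not supported, the second branch and the need for a caller-supplied name would both disappear, which is what makes the simpler Filename::string() | NifSO::binary() signature possible.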

saleyn avatar Jul 16 '22 03:07 saleyn