libatomic.so.1: cannot open shared object file: No such file or directory
I don't think this is a bug with caxa (it depends on one's point of view!), yet I think it is an issue that caxa users will come across. For one, I am still looking for a convenient solution (one that doesn't require me to compile Node.js from source).
I have created a caxa-based executable for my app, for ARM v7. Then, when trying to run it in a Docker container for the ARM platform, I got the following error regarding a missing shared library:
$ docker run -it --platform linux/arm/v7 -v ~/Downloads:/mnt/Downloads debian:10 /bin/bash
root@1f0afeb707aa:/# /mnt/Downloads/my-armv7-caxa-app
... node_modules/.bin/node: error while loading shared libraries:
libatomic.so.1: cannot open shared object file: No such file or directory
I believe that the problem is that, when creating the caxa executable, I was using a standard Node.js installation that uses shared / dynamically linked libraries. Then caxa bundled in the base Node.js executable, but not the shared libraries. For the libatomic.so.1 library in particular, the error above can be avoided if the end users of my app install the library before running the caxa-based executable:
apt-get update && apt-get install -y libatomic1
However, at least my use case for caxa is to simplify end users' life by avoiding pre-requisites like installing Node.js (#20), "just download this executable and run it", and if I had to ask end users to install shared libraries before running the caxa executable, it would spoil the experience.
I assume that the solution is to use a fully statically compiled version of Node.js (including libatomic.so.1) when creating the caxa executable. Where to find that though? For all architectures supported by caxa: x64, ARM v6, ARM v7, ARM 64. I gather that the standard Node.js builds offered for download are dynamically linked: https://nodejs.org/en/download/
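A quick way to confirm this for a given Node.js binary, assuming the `file` and `ldd` utilities are available, is to inspect how it is linked, for example inside the ARM container:

```sh
# A dynamically linked build reports "dynamically linked" and lists its
# shared-object dependencies (on ARM v7 builds this includes libatomic):
file "$(command -v node)"
ldd "$(command -v node)"
```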
Hmmm, that's an interesting predicament.
- Where did you get the Node.js you're using to run caxa?
- Did you try the Docker images we're using in our tests (or some other similar images)?
- Do you think caxa could/should embed these dynamic libraries in the binary?
Thanks for looking into this issue, @leafac!
I was able to reproduce the error with the arm32v7/node:16 image you've suggested. Here are the "steps to reproduce", generating a trivial caxa executable that prints "howdy":
Produce a test-caxa executable for ARM:
$ docker run -it -v /tmp:/mnt/host --platform linux/arm/v7 arm32v7/node:16 /bin/bash
root@8406f415210f:/# npm install --global --unsafe-perm caxa
added 193 packages, and audited 194 packages in 46s
root@8406f415210f:/# caxa --version
2.0.0
root@8406f415210f:/# cd /mnt/host
root@8406f415210f:/mnt/host# mkdir t
root@8406f415210f:/mnt/host# caxa --input t --output test-caxa --no-dedupe -- '{{caxa}}/node_modules/.bin/node' -e 'console.log("howdy")'
root@8406f415210f:/mnt/host# ./test-caxa
howdy
root@8406f415210f:/mnt/host# exit
exit
Then execute test-caxa on debian:10:
$ docker run -it -v /tmp:/mnt/host --platform linux/arm/v7 debian:10 /bin/bash
root@559b78703a8e:/# cd /mnt/host
root@559b78703a8e:/mnt/host# ./test-caxa
/tmp/caxa/applications/test-caxa/e9qsk4faqz/0/node_modules/.bin/node:
error while loading shared libraries: libatomic.so.1:
cannot open shared object file: No such file or directory
root@559b78703a8e:/mnt/host# apt-get update && apt-get install -y libatomic1
...
The following NEW packages will be installed:
libatomic1
0 upgraded, 1 newly installed, 0 to remove and 1 not upgraded.
root@559b78703a8e:/mnt/host# ./test-caxa
howdy
Note how executing ./test-caxa succeeds in the arm32v7/node:16 image because it already contains the shared libraries (a fully functional installation of Node.js), while it fails in the debian:10 image until/unless the libatomic1 library is installed.
Do you think caxa could/should embed these dynamic libraries in the binary?
If it worked, perhaps! We would not want to interfere with the shared libraries of the target / destination system (e.g. different versions of the shared library), which means that the bundled shared libraries would have to live in a non-standard folder, with environment variables telling the embedded Node.js executable to look for them in that alternative location.
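For illustration, a minimal sketch of what that could look like (the `libs/` folder and the wrapper below are hypothetical, not something caxa does today):

```sh
# Hypothetical layout inside the extracted caxa payload:
#   node_modules/.bin/node   <- dynamically linked Node.js
#   libs/libatomic.so.1      <- shared libraries bundled by the packager
#
# A small wrapper could point the dynamic linker at the bundled copies,
# without touching the system's own libraries:
LD_LIBRARY_PATH="$(pwd)/libs${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" \
  exec node_modules/.bin/node "$@"
```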
It sounds to me like it would be cleaner to embed a fully statically compiled Node.js binary in the caxa executable (assuming it would work and avoid the libatomic error), but we'd have to find that somewhere, or create it and then keep it somewhat up to date with Node.js releases upstream. This extra work would arguably be a dent in caxa's advantage over pkg -- no recompiling Node.js from source -- although a non-patched recompiled Node.js would still be better than pkg's patched Node.js. I also wonder how much the Node.js binary would increase in size when fully statically compiled.
@maxb2: Did you run into this issue with the Raspberry Pi builds of Dungeon Revealer? How did you fix it?
If we can't find a pre-compiled statically linked Node.js for ARM, then I guess the solution would be to come up with one ourselves. I believe that's outside the scope of caxa, as caxa's job is just to package the Node.js you brought to the party. But we could take on such a project. We could use Docker to emulate ARM, use GitHub Actions to run the tasks, and GitHub Releases to distribute. Pretty much the infrastructure we have to compile the stubs. The only part we'd have to figure out is the incantations necessary to statically compile Node.js. Also, the builds will take forever. But it sounds doable…
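For reference, a rough sketch of the emulation setup such a CI job would likely need (assuming QEMU registration via the tonistiigi/binfmt image; untested here):

```sh
# Register QEMU binfmt handlers so an amd64 runner can execute ARM binaries,
# then build the static-Node image for ARM v7:
docker run --privileged --rm tonistiigi/binfmt --install arm
docker build --platform linux/arm/v7 -t node-static-armv7 .
```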
@maxb2: Did you run into this issue with the Raspberry Pi builds of Dungeon Revealer? How did you fix it?
I did occasionally run into this on the raspi at runtime. I just installed libatomic the same as @pdcastro. It really just depends on the distro and what the user has already installed.
It's not ideal, but we are talking about Linux users. They probably are fine with installing an extra library. I'm guessing that libatomic is left out of "slim" images.
The only part we'd have to figure out is the incantations necessary to statically compile Node.js.
This is actually pretty easy: https://stackoverflow.com/questions/17943595/how-to-compile-nodejs-to-a-single-fully-static-binary-file#55736487
Also, the builds will take forever.
Actions have a time limit unfortunately. Emulating arm will also make it take even longer.
Job execution time - Each job in a workflow can run for up to 6 hours of execution time. If a job reaches this limit, the job is terminated and fails to complete.
I'm going to time how long it takes to statically compile node for armv7 on my desktop. I'll check back in when it is done.
Related issue: nodejs/node#37219
Thanks for the information!
On my laptop it took around 2 hours to compile Node.js. I suppose that we'd stay under the 6-hour time limit if we were to compile using ARM on GitHub Actions. Worst-case scenario, I guess we could run it on one of our machines…
I just hope that it's as simple as that Stack Overflow answer seems to indicate. With these things the devil is always in the details…
Are y'all interested in taking over this project? I probably won't have the opportunity to work on this in the near future…
I tried both emulated and cross-compiling for a static arm build of node. I kept running into new issues. It turned into whack-a-mole. I also don't have time to work on this in the near future.
Yeah, that's how I thought it'd turn out.
I'll keep the issue open for when someone steps up.
I kept running into new issues. It turned into whack-a-mole.
After whacking enough moles and waiting enough hours, :-) I can share some early, encouraging results.
With very similar Dockerfiles as the one from StackOverflow (linked in an earlier comment), I've got static builds of Node.js v12 (a version I wanted to use) and also "accidentally" the very latest Node.js v17.0.0-pre:
Dockerfile for Node.js v12
FROM alpine:3.11.3
RUN apk add git python gcc g++ linux-headers make
WORKDIR /usr/src/app
ENV NODE_VERSION=v12.22.3
RUN git clone https://github.com/nodejs/node && cd node && git checkout ${NODE_VERSION}
RUN cd node && ./configure --fully-static --enable-static
RUN cd node && make
Dockerfile for Node.js' master branch (v17.0.0-pre on 06 July 2021)
FROM alpine:3.11.3
RUN apk add git python3 gcc g++ linux-headers make
WORKDIR /usr/src/app
RUN git clone https://github.com/nodejs/node
RUN cd node && ./configure --fully-static --enable-static
RUN cd node && make
Note that the two Dockerfiles use different versions of Python and checkout different branches of Node.js. They were built with Docker v20.10.7, a command line similar to:
docker build -t node12-armv7-static-alpine --platform linux/arm/v7 .
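After the build finishes, the compiled binary can be copied out of the image with something like the following (the `out/Release/node` path is Node.js' default build output location; the container and output names below are just examples):

```sh
# Create a (stopped) container from the built image and copy the binary out:
docker create --platform linux/arm/v7 --name node12-static node12-armv7-static-alpine
docker cp node12-static:/usr/src/app/node/out/Release/node ./node12-arm-v7-static
docker rm node12-static
```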
In both cases, an ARM node binary was produced that avoided the libatomic.so.1 error:
$ docker run -it -v ~/Downloads:/mnt/Downloads --platform linux/arm/v7 debian:10 /bin/bash
root@fa428ae3d7da:~# ls -l /mnt/Downloads/node*
-rwxr-xr-x 1 root root 75159308 Jul 6 23:30 /mnt/Downloads/node17-arm-v7-built-in-alpine-qemu
-rwxr-xr-x 1 root root 44600464 Jul 6 13:03 /mnt/Downloads/node12-arm-v7-built-in-alpine-qemu
-rwxr-xr-x 1 root root 41230580 Jul 6 14:18 /mnt/Downloads/node12-arm-v7-copied-from-arm32v7-node-12
root@fa428ae3d7da:~# /mnt/Downloads/node12-arm-v7-built-in-alpine-qemu --version
v12.22.3
root@fa428ae3d7da:~# /mnt/Downloads/node17-arm-v7-built-in-alpine-qemu --version
v17.0.0-pre
root@fa428ae3d7da:~# /mnt/Downloads/node12-arm-v7-copied-from-arm32v7-node-12 --version
error while loading shared libraries: libatomic.so.1: cannot open shared object file: No such file or directory
Above, the Node.js binary that produced the libatomic.so.1 error was the dynamically linked Node.js binary copied from the "official" arm32v7/node:12 image.
Note also the file sizes in the ls -l output above. The statically compiled Node.js v12 binary is just over 3 MB larger than the dynamically linked one. I am OK with that. I have not compared the dynamic vs static size difference for other versions of Node.js yet.
I haven't yet tested using the statically compiled Node.js versions with caxa, but the results above are encouraging. :-)
Other notes:
- `make -j4`, instead of just `make`, specifies 4 compilation jobs in parallel, significantly reducing the overall compilation time if you have at least that many CPU cores. I've found a suggestion for `make -j$(nproc)`, where `nproc` returns the number of CPU cores in the machine. Note that it will also cause the machine's cooling fan to run at maximum speed, and if it is anything like mine, passers-by may confuse it with a jet engine.
- Using `debian` (or anything other than `alpine`?) as the Dockerfile base image for compiling Node.js statically appears to be a bad idea, because `debian` uses `glibc` and the compilation will produce warning messages such as:
b_sock.c:(.text+0x271): warning: Using 'gethostbyname' in statically linked applications
requires at runtime the shared libraries from the glibc version used for linking
That sounds like users would not only have to install shared libraries, but also match the version used during Node.js compilation. If so, it would be much worse than a dynamically linked Node.js binary! Googling it, I found this other Node.js issue and comment:
https://github.com/nodejs/help/issues/1863#issuecomment-482852205 [...] static linking to glibc doesn't work in general, google around for reasons. You need to link to a libc like musl if you want to use `--fully-static`.
Well that's good to know! alpine uses musl, so alpine sounds like the way to go.
@pdcastro, I adapted what you've done so far at maxb2/static-node-binaries. I've compiled v12, v14, and v16 on my local machine and created releases with the binaries. I've also pushed the final Docker images to Docker Hub.
I doubt that we'll be able to compile these on Github Actions due to the usage limits.
Github Hosted Runner
Hardware specification for Windows and Linux virtual machines:
- 2-core CPU
- 7 GB of RAM memory
- 14 GB of SSD disk space
Hardware specification for macOS virtual machines:
- 3-core CPU
- 14 GB of RAM memory
- 14 GB of SSD disk space
Usage Limits
Job execution time - Each job in a workflow can run for up to 6 hours of execution time. If a job reaches this limit, the job is terminated and fails to complete.
Workflow run time - Each workflow run is limited to 72 hours. If a workflow run reaches this limit, the workflow run is cancelled.
API requests - You can execute up to 1000 API requests in an hour across all actions within a repository. If exceeded, additional API calls will fail, which might cause jobs to fail.
Concurrent jobs - The number of concurrent jobs you can run in your account depends on your GitHub plan, as indicated in the following table. If exceeded, any additional jobs are queued.
It may be possible to set up a self-hosted runner to do the compilation, however it also has limitations:
Self-hosted Runner Limits
- Workflow run time - Each workflow run is limited to 72 hours. If a workflow run reaches this limit, the workflow run is cancelled.
- Job queue time - Each job for self-hosted runners can be queued for a maximum of 24 hours. If a self-hosted runner does not start executing the job within this limit, the job is terminated and fails to complete.
- API requests - You can execute up to 1000 API requests in an hour across all actions within a repository. If exceeded, additional API calls will fail, which might cause jobs to fail.
- Job matrix - A job matrix can generate a maximum of 256 jobs per workflow run. This limit also applies to self-hosted runners.
- Workflow run queue - No more than 100 workflow runs can be queued in a 10 second interval per repository. If a workflow run reaches this limit, the workflow run is terminated and fails to complete.
Someone would have to volunteer some hardware for that though.
I do have a crazy idea to get around the job time limit. We could use timeout to compile as much as possible in ~5.8 hours and then pass the directory along to the next job and so on until it finishes. It would require a bit of juggling to make it work with docker as well.
We could use `timeout` to compile as much as possible in ~5.8 hours and then pass the directory along to the next job and so on until it finishes.
That's clever! Related to this idea:
- If passing directories along between jobs was not doable, each job could publish a partially built Docker image to DockerHub (I assume it's possible to automate this), say with suffixes `-partial-1`, `-partial-2` and so on, then a final job could use a multistage Dockerfile that pulls the partially built images together and runs a final `make`.
- Possibly as an alternative (not necessarily better!) to the `timeout` command, each job could compile a number of folders (say half or a quarter) from these:
$ ls -d node/deps/*
node/deps/acorn node/deps/http_parser node/deps/openssl
node/deps/acorn-plugins node/deps/icu-small node/deps/uv
node/deps/brotli node/deps/llhttp node/deps/uvwasi
node/deps/cares node/deps/nghttp2 node/deps/v8
node/deps/cjs-module-lexer node/deps/node-inspect node/deps/zlib
node/deps/histogram node/deps/npm
But the `timeout` idea and passing directories along sound simpler.
each job could publish a partially built Docker image
I think I like this better. It would go something like:
## Job 1
### Get through the configuration (note the build context "." at the end)
docker build --target configure -t static-node:partial-build .
### Start compiling (run detached so that $CONTAINER captures the container ID)
CONTAINER=$(docker run -d static-node:partial-build timeout 5.8h make -j $(nproc))
docker wait $CONTAINER
### Commit changes
docker commit $CONTAINER static-node:partial-build
### Pass the image to the next job
# TBD, probably just export the image to an artifact that is passed between jobs

## Job 2
### Repeat until finished
CONTAINER=$(docker run -d static-node:partial-build timeout 5.8h make -j $(nproc))
docker wait $CONTAINER
docker commit $CONTAINER static-node:partial-build
I also need to figure out some stopping logic.
Possibly an alternative (not necessarily better!) to the timeout command, each job could compile a number of folders (say half or a quarter) from these:
We'd have to do that manually or specify the make targets. I'd rather not do that.
Granted, this is assuming that the make process can safely resume from a SIGTERM. It should be able to, but there's no guarantee with these huge, complicated projects.
Y'all are doing some awesome work here. I love the hack to work around GitHub Actions time limits. You're really pushing the envelope of what the tool's supposed to do.
A few questions:
- Did you look into other CI services that may have more generous time limits for open-source?
- May I advertise these static binaries on caxa's README?
May I advertise these static binaries on caxa's README?
Absolutely!
Did you look into other CI services that may have more generous time limits for open-source?
Briefly. It's surprisingly difficult to find the time limits for each service.
- Github Actions is 6 hours per job.
- Azure Pipelines is 6 hours per job.
- Travis CI is 50 minutes per job.
- Gitlab is 3 hours per project on shared runners. ~~Gitlab offers 50k CI minutes monthly for qualifying open source projects.~~
- CircleCI is 5 hours per job. ~~CircleCI offers 400k credits monthly for qualifying open source projects.~~
~~Gitlab and CircleCI might work if there are no individual job timeouts (I can't find it anywhere), but you have to apply for those programs. I may try them in the future when I have time to apply and learn a new CI system.~~
Breaking the compilation into chunks may be the only option without costing money. Might as well keep it on Github then.
I haven't yet tested using the statically compiled Node.js versions with caxa, but the results above are encouraging. :-)
Quoting myself, I often say, "if it's not tested, it's broken," and so it is: https://github.com/maxb2/static-node-binaries/issues/6
(The statically compiled Node.js binaries for armv7, using the Dockerfiles proposed earlier in this issue, fail to execute caxa.)
A new chapter in this saga. @maxb2 managed to fix the openssl issue on ARMv7 (maxb2/static-node-binaries#6), and I went on to test it further. Caxa's code executes all right with the statically compiled Node.js, and my app (experimentally, the balena CLI) gets extracted all right. But when I run certain balena CLI commands, I get:
$ balena logs 192.168.10.10
...
Error: Dynamic loading not supported
at Object.Module._extensions..node (internal/modules/cjs/loader.js:1057:18)
...
This error happens on Intel / amd64 as well, not just on ARM. What happens, I gather, is that native node modules cannot be dynamically loaded when Node.js is compiled statically. Native node modules are files with the .node extension, compiled during npm install and saved under the node_modules folder. Searching the web, I see that this issue also affects pkg when the linuxstatic "platform" is selected: https://github.com/vercel/pkg-fetch/issues/205
Talking of pkg, their asset release v3.2 (current latest) includes Node.js binaries for ARMv7, but only statically compiled ones, which are unable to load native node modules. And just a week ago they wrote this comment:
https://github.com/vercel/pkg-fetch/issues/205#issuecomment-875523641 Be aware that armv7 platform is supported on the best effort basis. There is only linuxstatic executable, and no further action would be taken.
ARMv7 is important for the balena CLI, so pkg's attitude is not encouraging.
I assume, but I don't know for sure, that it is not possible to enable the feature of dynamically loading native node modules when Node.js is compiled statically. In this case, the approach of using a statically compiled Node.js binary is fundamentally flawed for apps that make use of native node modules, like the balena CLI.
To me, it now sounds like going back to square one:
Do you think caxa could/should embed these dynamic libraries in the binary?
This approach could have complications: the libraries are definitely different between Debian and Alpine. And even if we discarded Alpine and considered only glibc-based distros like Debian and Ubuntu, I wonder if we would have to match the version of glibc (?), or some other library, installed in the system. It might be possible; we'd have to investigate.
I've just had a related idea. Instead of bundling the libraries, caxa's Go stub could offer to install them, e.g. automatically executing apt-get install -y libatomic1 with the user's consent (configurable), if it detected that it was missing. If we made a list of shared libraries that are required to run Node.js (like libatomic1) in popular distros -- regardless of the payload app -- then it could be argued that the Go stub has a duty to install them if they are missing, as a bootstrapper for Node.js. And then caxa could also accept a configurable list of extra shared libraries that are needed by the payload app.
Also: If we found that libatomic1 is the only library that needs to be installed, then another solution might be to "fix" that open Node.js issue that Matthew linked, nodejs/node/issues/37219.
Thank y'all for the amazing investigative work. I'm learning so much from you!
It's too bad that statically linked Node can't load native modules…
But I'm sure we'll come up with something that works!
I like the idea of using the dynamically linked Node and just installing the missing dependencies as a courtesy to the user. But I propose that we don't do it in the stub. I believe the stubs should be as simple as possible. First because I'm trying to avoid writing Go. But also we have multiple packaging strategies: the all-popular Go stub, the macOS application bundle (.app) (which probably wouldn't be affected by this change we're discussing here), and the brand-new Shell Stub. The simpler the stubs, the easier it is to keep them all in sync.
Here's what I propose instead: Running Node happens after the extraction, at which point we could run a shell script that prompts the user to install the missing libraries. The beauty of this solution is that it's all in user-land from caxa's perspective -- you can get it working today. The downside is that it relies on whatever shell there is on the user's machine, but for something as simple as what we need, probably the lowest common denominator, sh, will suffice. Think of it as an addendum to the Shell Stub.
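As a starting point, something along these lines (a minimal sketch, Debian/Ubuntu specific; other distros and other libraries would need their own package names, and the packager would adapt it):

```sh
#!/bin/sh
# Post-extraction helper (sketch): check whether the dynamic linker already
# knows about libatomic, and if not, ask before installing the package.
if ! ldconfig -p 2>/dev/null | grep -q 'libatomic\.so\.1'; then
  printf 'This application needs libatomic1. Install it now with apt-get? [y/N] '
  read -r answer
  case "$answer" in
    y|Y) apt-get update && apt-get install -y libatomic1 ;;
    *) echo 'Please install libatomic1 and run the application again.' >&2; exit 1 ;;
  esac
fi
```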
Of course, once we get something working we can include such a script in caxa as a courtesy to the packagers.
I agree that a post-extraction shell script is the right way to go. Would caxa maintain install scripts for the various OSes, or would that be on the packager? These libraries are inconsistently named sometimes.
I'm happy to host and distribute some scripts with caxa for the packager's convenience, but ultimately it's their responsibility to make sure the scripts work for them -- the scripts become part of their application.
Nice ideas. Just adding / emphasising that there are still alternatives to be explored further:
- In `./configure --fully-static --enable-static`, what exactly is "fully-static" doing? Could the configure script, or other code / script, be slightly amended to allow native node modules to be loaded?
- Is `libatomic.so.1` the only library that needs to be installed and, if so, could it be avoided by "fixing" Node.js? As discussed in issue nodejs/node#37219.
Note also that, in a fresh environment like docker run -it debian:10, one needs to run apt-get update before running apt-get install and, from past experience, apt-get update is a point of failure because of server-side / network errors, or when the base image is very old and the distro discontinues the repositories. And even in normal conditions, it can be a bit slow. For these reasons, I still see advantages in static Node.js binaries that were made to work with native node modules, if it was feasible.
Of course we can also have multiple solutions: the convenience shell script that detects missing shared libraries (*) and installs them, which could work "today", and static Node.js binaries if/when we find a solution to make them work with native node modules, or for apps that don't use native node modules.
(*) Detecting whether shared libraries are missing before attempting apt-get update && apt-get install would improve performance and reliability in cases where the libraries are already installed, which might be the most common scenario for caxa users (?).
Of course, I believe that fixing this issue closer to the source would be ideal. Either by addressing https://github.com/nodejs/node/issues/37219 or by finding the right combination of options for compilation.
Meanwhile, the workaround script we've been talking about doesn't necessarily have to run apt-get update && apt-get install. I don't like that idea because what if you put your caxa application into an SD card and then plug it into a Raspberry Pi that is disconnected from the internet? I believe we could just pack our own libatomic in the caxa binary. The script would then copy it to the right location if necessary.
I believe we could just pack our own libatomic in the caxa binary. The script would then copy it to the right location if necessary.
How distro-specific is libatomic, its versions and installation folder? For example, on Debian 10, apt-get install libatomic1 installs these files and soft links:
$ ls -la /usr/lib/x86_64-linux-gnu/libatomic.so.1*
lrwxrwxrwx 1 root root 18 Apr 6 2019 /usr/lib/x86_64-linux-gnu/libatomic.so.1 -> libatomic.so.1.2.0
-rw-r--r-- 1 root root 30800 Apr 6 2019 /usr/lib/x86_64-linux-gnu/libatomic.so.1.2.0
So that's a certain version of libatomic1 (1.2.0), installed in a certain folder (/usr/lib/x86_64-linux-gnu/), as determined by apt-get install for a certain version of Debian. What if other distros, or other versions of Debian, required different versions of libatomic for compatibility with other packages, say gcc? We could find out, and maybe we are lucky and find that "all relevant distros" currently place libatomic v1.2.0 in the same folder, but this knowledge would have to be coded in the helper script and refreshed from time to time. And if we are a bit less lucky and caxa copied some version of libatomic to some folder, and then the user subsequently installed an unrelated package that required a different version of libatomic (say apt-get install gcc), would apt-get then install additional libraries, and could there be a conflict? I suppose caxa could bundle a .deb file instead of the .so file, and then it could be installed with dpkg -i. Maybe this is what you had in mind all along! But then would caxa also package .rpm for Fedora, .pkg for Arch Linux and so on? Either the workaround script gets messy (complex), or it is simple but risks messing with system libraries.
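For what it's worth, on a Debian-based system the standard dpkg/apt tooling can answer the "which package, which version, which folder" questions, roughly like this (output varies per distro and release):

```sh
# Which package provides the library, and where it puts it:
dpkg -S libatomic.so.1
# e.g. on Debian 10: libatomic1:amd64: /usr/lib/x86_64-linux-gnu/libatomic.so.1.2.0

# Which version of the package the distro's repositories would install:
apt-cache policy libatomic1
```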
Not saying it couldn't / shouldn't be done, just pointing out some complications and things to consider.
what if you put your caxa application into an SD card and then plug it into a Raspberry Pi that is disconnected from the internet?
Then the user would have to manually run apt-get install libatomic1 beforehand (before running the caxa application) and at a location where the Pi has access to the internet. It would be part of the application's installation instructions to the user. It is a problem, yes, but we have to choose which of the problems are the least bad ones. :-)
apt-get update is a point of failure because of server-side / network errors, or when the base image is very old and the distro discontinues the repositories
I had mentioned this earlier, but it is not to say that I think we shouldn't do it, just that it is a disadvantage compared to an ideal static Node.js binary that worked with native Node modules. But an even bigger disadvantage is an ideal static Node.js binary that did not exist, :-) or a static Node.js binary that did not work with my application (no support for native Node modules).
Hi y'all,
Thanks for using caxa and for the conversation here.
I've been thinking about the broad strategy employed by caxa and concluded that there is a better way to solve the problem. It doesn't address the issue of statically linked Node.js binaries, but I wonder if that's still a use case that you need to support.
It's a different enough approach that I think it deserves a new name, and it's part of a bigger toolset that I'm building, which I call Radically Straightforward · Package.
I'm deprecating caxa and archiving this repository. I invite you to continue the conversation in Radically Straightforward's issues.
Best.