Amazon Lambda images now using Amazon Linux 2023, and dnf instead of yum
Just a PSA here! As of last month, the Amazon base image for Lambda is now built on Amazon Linux 2023.
If you're like me and have been using FROM without specifying an image version, you might notice that code like this in the Dockerfile no longer works:
RUN yum -y install wget git tar
This is because yum is no longer included (it's been replaced with dnf). You can either change your code to use dnf:
RUN dnf install -y python3
or you can specify Amazon Linux 2 (the previous major version) instead:
Just wanted to post in case anyone else got caught out!
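For a Dockerfile that has to build against either base image, one option is to detect which package manager is present rather than hard-coding it. A minimal sketch, assuming a POSIX shell; the function name pick_pkg_manager is made up for illustration and isn't part of lambdr or the AWS images:

```shell
# Print the first package manager found on PATH, preferring dnf over yum.
# pick_pkg_manager is a hypothetical helper name, not an existing tool.
pick_pkg_manager() {
  for candidate in dnf yum; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "neither dnf nor yum found" >&2
  return 1
}

# Example use inside a RUN step (commented out so the snippet has no side effects):
# "$(pick_pkg_manager)" -y install wget git tar
```

Pinning the image tag is still the more predictable fix; this only helps when the same Dockerfile must tolerate both images.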
Thanks for this! I updated the documentation a few weeks ago to use the al2 tag but let’s leave this issue up for now to help others.
Hey @jimjam-slam and @mdneuzerling, I wonder if you could help me out with something? I'm new to Docker and am trying to learn so that I can be more useful at work when it comes to deploying my R processes.
I appreciate that we could pin to an al2 image, like in the docs - and thanks for updating that.
But if we wanted to be more up-to-date and use e.g. the most recent version of provided (at the time of me posting this), it seems that dnf isn't just a drop-in replacement for yum. And I can't figure out how to use the suggested alternatives because there appear to be some more fundamental issues at play.
For example, take this Dockerfile below
# Install R
RUN dnf -y install \
tar \
RUN dnf -y install \
&& wget${R_VERSION}-1-1.x86_64.rpm \
&& dnf -y install R-${R_VERSION}-1-1.x86_64.rpm \
&& rm R-${R_VERSION}-1-1.x86_64.rpm
If I try to build the image above, which is very similar to the one from the docs, it results in this error
=> ERROR [3/3] RUN dnf -y install && wget && dn 0.3s
> [3/3] RUN dnf -y install && wget && dnf -y install R-4.0.3-1-1.x86_64.rpm && rm R-4.0.3-1-1.x86_64.rpm:
0.333 error: No package matches ''
9 |
10 | >>> RUN dnf -y install \
11 | >>> && wget${R_VERSION}-1-1.x86_64.rpm \
12 | >>> && dnf -y install R-${R_VERSION}-1-1.x86_64.rpm \
13 | >>> && rm R-${R_VERSION}-1-1.x86_64.rpm
14 |
ERROR: failed to solve: process "/bin/sh -c dnf -y install && wget${R_VERSION}-1-1.x86_64.rpm && dnf -y install R-${R_VERSION}-1-1.x86_64.rpm && rm R-${R_VERSION}-1-1.x86_64.rpm" did not complete successfully: exit code: 1
This happens because you can't install a remote (or local) .rpm using the dnf included in provided.al2023:
Amazon Linux 2023 uses dnf as the package manager, replacing yum, which was the default package manager in Amazon Linux 2. AL2023 base image for Lambda uses microdnf as the package manager, which is a standalone implementation of dnf based on libdnf and does not require extra dependencies such as Python. microdnf in provided.al2023 is symlinked as dnf. Note that microdnf does not support all options of dnf. For example, you cannot install a remote rpm using the rpm’s URL or install a local rpm file. Instead, you can use the rpm command directly to install such packages.
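One way to confirm which implementation sits behind the dnf command in a given image is to follow the symlink. A rough sketch, assuming readlink -f is available in the container (not verified against the base image here); resolve_pkg_manager is a hypothetical helper name:

```shell
# Print the name of the binary that a command ultimately resolves to,
# following any symlinks. Fails if the command is not on PATH.
resolve_pkg_manager() {
  cmd_path=$(command -v "$1") || return 1
  basename "$(readlink -f "$cmd_path")"
}

# Going by the documentation quoted above, running `resolve_pkg_manager dnf`
# inside provided.al2023 would be expected to report microdnf.
```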
Ok, so you can use rpm directly. Let's change the affected lines in the Dockerfile to this:
RUN rpm -i \
&& rpm -i${R_VERSION}-1-1.x86_64.rpm
=> ERROR [3/3] RUN rpm -i && rpm -i 0.7s
> [3/3] RUN rpm -i && rpm -i
0.692 warning: /var/tmp/rpm-tmp.dQ9fG0: Header V4 RSA/SHA256 Signature, key ID 352c64e5: NOKEY
0.693 error: Failed dependencies:
0.693 redhat-release >= 7 is needed by epel-release-7-14.noarch
9 |
10 | >>> RUN rpm -i \
11 | >>> && rpm -i${R_VERSION}-1-1.x86_64.rpm
12 |
ERROR: failed to solve: process "/bin/sh -c rpm -i && rpm -i${R_VERSION}-1-1.x86_64.rpm" did not complete successfully: exit code: 1
Googling "redhat-release >= 7 is needed by epel-release-7-14.noarch", I find that
Extra Packages for Enterprise Linux (EPEL) is a project in the Fedora community with the objective of creating a large array of packages for enterprise-level Linux operating systems. The project has primarily produced RHEL and CentOS packages. AL2 features a high level of compatibility with CentOS 7. As a result, many EPEL7 packages work on AL2. However, AL2023 doesn't support EPEL or EPEL-like repositories.
Unfortunately this is where I find myself at a complete loss as to what to do next. I guess we need a different repository, but the R version supplied by Posit runs on CentOS 7...
Any ideas or resources you can think of? Happy to go away and look further but I need some guidance. And you both seem like two of the best placed people to help.
I'm afraid I don't have any experience yet with dnf, @jimgar 😞
AL2023 base image for Lambda uses microdnf as the package manager, which is a standalone implementation of dnf based on libdnf and does not require extra dependencies such as Python. microdnf in provided.al2023 is symlinked as dnf. Note that microdnf does not support all options of dnf. For example, you cannot install a remote rpm using the rpm’s URL or install a local rpm file. Instead, you can use the rpm command directly to install such packages.
I wonder if you could get the full-fat version of dnf on there instead somehow? But I guess that wouldn't fix the RedHat dependency 🤔
I see R-rpm-macros in this list of AL2023 packages... seems weird to include it if R itself isn't available. Do you see it somewhere in the list?
Oh yeah, here it is:
Does RUN dnf -y install R work?
Thanks @jimjam-slam for looking into this! I didn't think to look and see if they had R as a supported package already. The version is a little bit older, but it works!
I think one unfortunate side effect is that this AL2023 image isn't based on an existing single Linux distro, it's a bit of a mutt. Where previously I was installing packages into the container with the repo set to
, we can't be guaranteed that using this will work any more. Using
is a lot slower. But it has solved my issue for now. My working Docker image is below (the lines about piputilities refer to an internal R package that I maintain at work).
# Install R
RUN dnf -y install R
ENV PATH="${PATH}:/opt/R/${R_VERSION}/bin/"
# System requirements for R packages
RUN dnf -y install \
openssl-devel \
COPY . piputilities/
RUN Rscript -e "install.packages(c( \
'DBI', \
'dplyr', \
'glue', \
'httr', \
'jsonlite', \
'lambdr', \
'logger', \
'lubridate', \
'paws.common', \
'', \
'', \
'purrr', \
'readr', \
'RPostgreSQL', \
'stringr', \
'tidyr' \
), repos = '')" && \
Rscript -e "install.packages('piputilities', repos = NULL, type = 'source')" && \
rm -r piputilities
Oof, 4.1.3 is on the old side. But I'm glad it's working for you!
I’m still looking into a working Dockerfile for this because I think for production uses it’s important to be able to control the R version. I might need to reach out to the StackOverflow community because dnf seems reluctant to install local .rpm files and I can’t work out why.
If I had properly read @jimgar's comment I would have understood that this isn't possible due to the implementation of dnf in Amazon Linux 2023! Fortunately, I think I have a workaround. We can use rpm to determine the dependencies, then install those with dnf, and then go back to rpm to install R itself.
RUN dnf -y install wget git tar
RUN wget${R_VERSION}-1-1.x86_64.rpm
# system requirements for R
RUN dnf install -y `rpm -qR R-${R_VERSION}-1-1.x86_64.rpm | tr '\n' ' ' `
RUN rpm -i R-${R_VERSION}-1-1.x86_64.rpm \
&& rm R-${R_VERSION}-1-1.x86_64.rpm
ENV PATH="${PATH}:/opt/R/${R_VERSION}/bin/"
# System requirements for R packages
RUN dnf -y install openssl-devel
RUN Rscript -e "install.packages(c('httr', 'jsonlite', 'logger', 'remotes'), repos = '')"
RUN Rscript -e "remotes::install_github('mdneuzerling/lambdr')"
RUN mkdir /lambda
COPY runtime.R /lambda
RUN chmod 755 -R /lambda
RUN printf '#!/bin/sh\ncd /lambda\nRscript runtime.R' > /var/runtime/bootstrap \
&& chmod +x /var/runtime/bootstrap
CMD ["parity"]
It seems to work locally for me, but I'd appreciate a confirmation.
If you're curious, here's the SO question I asked and answered.
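The tr step in the dnf install line above turns the newline-separated dependency list printed by rpm -qR into a single space-separated line. A small sketch of just that step, using made-up dependency names in place of real rpm -qR output; flatten_lines is a name invented for the example:

```shell
# Join newline-separated names into one space-separated line,
# as the `tr '\n' ' '` in the Dockerfile does.
flatten_lines() {
  tr '\n' ' '
}

# The three names below are placeholders, not the actual requirements of the R rpm.
printf 'bzip2-devel\ngcc\nmake\n' | flatten_lines
# prints: bzip2-devel gcc make   (with a trailing space and no newline)
```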
Hey @mdneuzerling, thanks for picking this up and figuring out the workaround!
I ran a truncated version of your Dockerfile on my personal machine - up to and including the installation of R packages. It all seems to work up to there, which is the main thing, but obviously without going through the entire process of actually deploying it I can't vouch for the end product in the context of lambdr.
One thing that caught my eye during the build was an issue with the locale.
I diverted the build spew into a log. The long and short of it is each time an R package gets installed you see this:
#10 21.47 During startup - Warning messages:
#10 21.47 1: Setting LC_CTYPE failed, using "C"
#10 21.47 2: Setting LC_TIME failed, using "C"
#10 21.47 3: Setting LC_MESSAGES failed, using "C"
#10 21.47 4: Setting LC_MONETARY failed, using "C"
#10 21.47 5: Setting LC_PAPER failed, using "C"
#10 21.47 6: Setting LC_MEASUREMENT failed, using "C"
#10 21.81 * installing *binary* package 'sys' ...
#10 21.88 * DONE (sys)
Packages do install correctly. Running the container interactively and starting R works, and at minimum jsonlite::toJSON(mtcars) works as expected. But we get the same warnings about the locale again:
bash-5.2# R
R version 4.3.2 (2023-10-31) -- "Eye Holes"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
During startup - Warning messages:
1: Setting LC_CTYPE failed, using "C"
2: Setting LC_COLLATE failed, using "C"
3: Setting LC_TIME failed, using "C"
4: Setting LC_MESSAGES failed, using "C"
5: Setting LC_MONETARY failed, using "C"
6: Setting LC_PAPER failed, using "C"
7: Setting LC_MEASUREMENT failed, using "C"
I think given the effect locale can have on sorting in R it is worth finding a solution.
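To make the sorting concern concrete: under the C locale, strings are compared byte by byte, so every uppercase ASCII letter sorts before every lowercase one. The sketch below shows this with the shell's sort rather than R, as an analogy only; sort_bytewise is a name invented for the example:

```shell
# Sort stdin using plain byte order, regardless of the caller's locale settings.
sort_bytewise() {
  LC_ALL=C sort
}

printf 'b\nA\na\nB\n' | sort_bytewise
# prints A, B, a, b on separate lines
```

A locale-aware sort under en_US.UTF-8 would typically interleave the cases instead, which is why falling back to "C" can quietly change the order of results.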
I do NOT see the locale issue when using the Dockerfile from the lambdr readme (provided:al2 with R 4.0.3). In case this was an issue with R 4.3.2, I also tried the above Dockerfile (provided:latest, i.e. al2023) but changed the R version to 4.0.3. I still get the locale issue, so this seems to be arising from something else.
(By the way, side note: If we're using the RHEL 9 version of R we should probably use the equivalent package repo, right?)
Ah! In the container, if you run locale you can see that the current locale is en_US.UTF-8. But in the same breath, 'no such file or directory'...
bash-5.2# locale
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
Here are the available locales. All three of them. None of which are en_US.UTF-8.
bash-5.2# locale -a
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_COLLATE to default locale: No such file or directory
Compare to the al2 image
bash-4.2# locale
bash-4.2# locale -a | wc -l
bash-4.2# locale -a | grep "en_US"
Ok - so en_US.utf8 isn't exactly en_US.UTF-8, but maybe it's close enough that the system and R know what to do? In which case perhaps we need to install locales during the image build?
I wish I knew how to do that, but I've googled a shit load with no success! Feel like I'm missing something obvious, and it's getting late, so time to hand this back to you 😁
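On the en_US.utf8 versus en_US.UTF-8 question: glibc compares codeset names case-insensitively and ignores the hyphen, so the two spellings refer to the same locale. A sketch of checking a locale -a listing with that normalisation applied; both function names are hypothetical, and locale modifiers such as @euro are not handled:

```shell
# Lower-case the codeset part of a locale name and drop hyphens:
# en_US.UTF-8 -> en_US.utf8. Names without a codeset pass through unchanged.
normalize_locale() {
  case "$1" in
    *.*)
      lang=${1%%.*}
      codeset=$(printf '%s' "${1#*.}" | tr '[:upper:]' '[:lower:]' | sed 's/-//g')
      printf '%s.%s\n' "$lang" "$codeset"
      ;;
    *)
      printf '%s\n' "$1"
      ;;
  esac
}

# Read a `locale -a` style listing on stdin; succeed if the wanted locale is in it.
locale_available() {
  wanted=$(normalize_locale "$1")
  while IFS= read -r candidate; do
    if [ "$(normalize_locale "$candidate")" = "$wanted" ]; then
      return 0
    fi
  done
  return 1
}

# Example: locale -a | locale_available en_US.UTF-8 && echo "en_US.UTF-8 is installed"
```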
Sorry for the delay, it's been a hectic week.
I think we can get rid of those locale issues by installing glibc-langpack-en. So in the Dockerfile I would change the first RUN line to
RUN dnf -y install glibc-langpack-en wget git tar
Does this work for you?
No need to apologise. It's been a hectic one for me, too. Thanks for coming back to it :)
Adding glibc-langpack-en works: no more errors about locale, en_US.UTF-8 is set as default, R can see it and is sorting character vectors as expected.
From what I can see, that was the last piece of the puzzle, and the image is now complete. Thanks so much for working on the solution, and the package more generally. It makes having lambda R processes in prod a reality for our workplace. I'm very grateful!
Thank you for the new example using dnf. It worked perfectly.
Now I want to try to create an ARM64 version, as AWS claims it is cheaper and faster compared to the x86 version. Does someone already have a working Docker image?
I can install R for ARM by building it from source, but this takes a very long time (especially on GitHub Actions). Another alternative is to install R using:
# Set the R version and URL for the ARM64 binary
RUN dnf install -y R-core R-core-devel wget git tar
ENV PATH="${PATH}:/usr/local/bin"
# System requirements for R packages
RUN dnf -y install openssl-devel libcurl-devel
but now my docker image is ~2GB instead of 1GB.
I think those distribution packages (like R-core and R-core-devel) come with a lot of extra stuff that is no doubt weighing the Docker image down. It also means not being able to choose the specific R version, which I think is necessary for any production processes.
I can't see any ARM64 images provided by Posit. I think the only option here is to try to compile R from source. Unless you can think of any alternatives, @jimjam-slam @jimgar ?
@mdneuzerling Rocker builds them from source... I think that might be the way to go!
The good news is that R package binaries are available for ARM64 Linux from PPM... for example:
I can't say whether Amazon Linux 2 is different enough from one of the distros that PPM provides binaries for to break them, though!