Dockerfile: use Ubuntu 20.10 as base

Open tmcl-it opened this issue 4 years ago • 4 comments

This PR changes the main Dockerfile to use ubuntu:20.10 as base image instead of python:3.9.2-slim-buster (itself based on debian:buster-slim).

The Dockerfile is essentially the one from https://github.com/simonw/datasette/issues/1249#issuecomment-803698983 with some additional cleanups to slim it down.

This fixes a couple of issues:

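  1. The SQLite version in Debian Buster doesn't support generated columns, which require SQLite 3.31.0 or later. (The 2.6.0 reported below is the version of Python's sqlite3 module, not of the SQLite library.)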
  2. Installing SpatiaLite from the Debian sid repositories has the side effect of also installing updates to libc and libstdc++ from sid.

As a bonus, the Docker image becomes smaller:

$ docker image ls
REPOSITORY                   TAG           IMAGE ID       CREATED       SIZE
datasette                    0.56-ubuntu   f7aca255140a   5 hours ago   212MB
datasetteproject/datasette   0.56          efb3b282f390   13 days ago   258MB

Reproduction of the first issue

$ curl -O https://latest.datasette.io/fixtures.db
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  260k    0  260k    0     0   489k      0 --:--:-- --:--:-- --:--:--  489k

$ docker run -v `pwd`:/mnt datasetteproject/datasette:0.56 datasette /mnt/fixtures.db
Traceback (most recent call last):
  File "/usr/local/bin/datasette", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/datasette/cli.py", line 544, in serve
    asyncio.get_event_loop().run_until_complete(check_databases(ds))
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.9/site-packages/datasette/cli.py", line 584, in check_databases
    await database.execute_fn(check_connection)
  File "/usr/local/lib/python3.9/site-packages/datasette/database.py", line 155, in execute_fn
    return await asyncio.get_event_loop().run_in_executor(
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.9/site-packages/datasette/database.py", line 153, in in_thread
    return fn(conn)
  File "/usr/local/lib/python3.9/site-packages/datasette/utils/__init__.py", line 892, in check_connection
    for r in conn.execute(
sqlite3.DatabaseError: malformed database schema (generated_columns) - near "AS": syntax error
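
The failure above can also be checked without Datasette. The following is a minimal sketch, assuming only Python's standard sqlite3 module: it tries to create a generated column in an in-memory database and reports whether the linked SQLite library accepts the syntax. The function name and the `probe` table are hypothetical and exist only for illustration.

```python
import sqlite3


def supports_generated_columns() -> bool:
    """Return True if the linked SQLite library accepts generated columns."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical probe table; generated columns need SQLite 3.31.0+.
        conn.execute(
            "CREATE TABLE probe ("
            "a INTEGER, "
            "b INTEGER GENERATED ALWAYS AS (a * 2) VIRTUAL)"
        )
        return True
    except sqlite3.OperationalError:
        # Libraries that do not know the syntax fail to parse the statement.
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("generated columns supported:", supports_generated_columns())
```

Running this inside the container should show whether the image's SQLite can open databases such as fixtures.db that use generated columns.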

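Here is the version reported by Python's sqlite3 module (sqlite3.version is the module version; sqlite3.sqlite_version gives the version of the SQLite library itself):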

$ docker run -v `pwd`:/mnt -it datasetteproject/datasette:0.56 /bin/bash
root@d9220d3b95dd:/# python3
Python 3.9.2 (default, Mar 27 2021, 02:50:26) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3
>>> sqlite3.version
'2.6.0'
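
Since `sqlite3.version` describes the Python module and not the SQLite library, a more direct check is to read `sqlite3.sqlite_version`. The sketch below is illustrative only: the helper names and the `MIN_GENERATED_COLUMNS` constant are assumptions introduced here, and the 3.31.0 threshold is the SQLite release that introduced generated columns.

```python
import sqlite3
import warnings

# Assumed threshold: generated columns first appeared in SQLite 3.31.0.
MIN_GENERATED_COLUMNS = (3, 31, 0)


def sqlite_versions() -> dict:
    """Collect the module version and the SQLite library version."""
    with warnings.catch_warnings():
        # sqlite3.version is deprecated in newer Pythons and may be absent.
        warnings.simplefilter("ignore")
        module_version = getattr(sqlite3, "version", None)
    return {
        "module": module_version,              # e.g. '2.6.0', not SQLite itself
        "library": sqlite3.sqlite_version,      # version of the linked SQLite
        "library_info": sqlite3.sqlite_version_info,
    }


def library_is_new_enough(version_info=sqlite3.sqlite_version_info) -> bool:
    """Compare a version tuple against the generated-columns threshold."""
    return tuple(version_info) >= MIN_GENERATED_COLUMNS


if __name__ == "__main__":
    print(sqlite_versions())
    print("new enough for generated columns:", library_is_new_enough())
```

Comparing version tuples avoids string comparison pitfalls such as '3.9' sorting after '3.31'.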

Reproduction of the second issue

$ docker build . -t datasette --build-arg VERSION=0.55
[...snip...]
The following packages will be upgraded:
  libc-bin libc6 libstdc++6
[...snip...]
Unpacking libc6:amd64 (2.31-11) over (2.28-10) ...
[...snip...]
Unpacking libstdc++6:amd64 (10.2.1-6) over (8.3.0-6) ...
[...snip...]

Both libc and libstdc++ are backwards compatible, so the image still works, but the result is a combination of library and Python versions that exists only in the Datasette image and is therefore likely untested. In addition, Debian sid is a rolling release, so the versions of libc, libstdc++, SpatiaLite, and their dependencies change frequently, and the library versions in the Datasette image depend on the day it was built.

tmcl-it • Apr 12 '21 00:04

Codecov Report

Merging #1296 (527a056) into main (c73af5d) will decrease coverage by 0.11%. The diff coverage is n/a.

:exclamation: Current head 527a056 differs from pull request most recent head 8f00c31. Consider uploading reports for the commit 8f00c31 to get more accurate results.

@@            Coverage Diff             @@
##             main    #1296      +/-   ##
==========================================
- Coverage   91.62%   91.51%   -0.12%     
==========================================
  Files          34       34              
  Lines        4371     4255     -116     
==========================================
- Hits         4005     3894     -111     
+ Misses        366      361       -5     
Impacted Files                 Coverage Δ
datasette/tracer.py            81.60% <0.00%> (-1.35%) :arrow_down:
datasette/views/base.py        95.01% <0.00%> (-0.42%) :arrow_down:
datasette/facets.py            89.04% <0.00%> (-0.41%) :arrow_down:
datasette/utils/__init__.py    94.13% <0.00%> (-0.21%) :arrow_down:
datasette/renderer.py          94.02% <0.00%> (-0.18%) :arrow_down:
datasette/views/database.py    97.19% <0.00%> (-0.10%) :arrow_down:
datasette/views/table.py       95.88% <0.00%> (-0.07%) :arrow_down:
datasette/views/index.py       96.36% <0.00%> (-0.07%) :arrow_down:
datasette/hookspecs.py         100.00% <0.00%> (ø)
datasette/utils/testing.py     95.38% <0.00%> (ø)
... and 5 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update c73af5d...8f00c31.

codecov[bot] • Apr 12 '21 00:04

Removing /var/lib/apt and /var/lib/dpkg makes apt and dpkg unusable in images based on this one. Running apt-get clean and removing /var/lib/apt/lists achieves similar size savings.

This PR helps me, as removing the /var/lib/apt and /var/lib/dpkg directories breaks my ability to add packages when using datasetteproject/datasette:0.56 as a base image.

A short-term workaround for me was to use this in my Dockerfile:

FROM datasetteproject/datasette:0.56

RUN mkdir -p /var/lib/apt
RUN mkdir -p /var/lib/dpkg
RUN mkdir -p /var/lib/dpkg/updates
RUN mkdir -p /var/lib/dpkg/info
RUN touch /var/lib/dpkg/status

RUN apt-get update # and install your packages etc

camallen • Apr 14 '21 12:04

I have also found that Ubuntu has fewer vulnerabilities than the Buster-based images.

➜  ~ docker pull python:3-buster
➜  ~ trivy image python:3-buster | head                             
2021-04-28T17:14:29.313-0400    INFO    Detecting Debian vulnerabilities...
2021-04-28T17:14:29.393-0400    INFO    Trivy skips scanning programming language libraries because no supported file was detected
python:3-buster (debian 10.9)
=============================
Total: 1621 (UNKNOWN: 13, LOW: 1106, MEDIUM: 343, HIGH: 145, CRITICAL: 14)
+------------------------------+---------------------+----------+------------------------------+---------------+--------------------------------------------------------------+
|           LIBRARY            |  VULNERABILITY ID   | SEVERITY |      INSTALLED VERSION       | FIXED VERSION |                            TITLE                             |
+------------------------------+---------------------+----------+------------------------------+---------------+--------------------------------------------------------------+

blairdrummond • May 08 '21 19:05

As a bonus, the Docker image becomes smaller

That's a huge surprise to me! And most welcome.

simonw • May 28 '21 18:05