ocrd_all [no ci] [RFC] Rework build process

I am currently reworking the build process to support more modern Linux distributions and Python versions. Ideally, the full test matrix of Ubuntu LTS versions and Python versions from 3.6 to 3.10 should build without errors.

My changes address several issues:

Check early whether the required TensorFlow versions are available and skip those submodules which cannot be built because TensorFlow is missing.
Use different rules for Python 3.6 and 3.10 when building ocrd_kraken.
Disable ocrd_cis unconditionally because it requires an old calamari_ocr, which causes version conflicts.
Address potential version conflicts for Python modules which are required by different OCR-D processors by using constraints.
Reduce the number of sub_venvs by installing all modules which work with a recent TensorFlow in the main venv.
Install the Python module wheel early in all virtual environments, so it no longer needs to be listed as a dependency.
... and other changes.
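
To illustrate the first item, here is a minimal sketch in Python of the "check early, then skip" idea. It is not the Makefile logic of this PR: the module names and the `tf1`/`tf2` labels are hypothetical placeholders, and the set of available TensorFlow series is passed in by hand instead of being detected.

```python
# Hypothetical mapping from submodule to the TensorFlow series it needs
# (None means the module has no TensorFlow dependency).
# These names are placeholders, not the real OCR-D submodule list.
REQUIRES = {
    "module_needing_tf1": "tf1",
    "module_needing_tf2": "tf2",
    "module_without_tf": None,
}


def split_by_availability(requires, available):
    """Return (build, skip): modules whose TensorFlow need is met, and the rest."""
    build, skip = [], []
    for name, series in requires.items():
        if series is None or series in available:
            build.append(name)
        else:
            skip.append(name)
    return build, skip


# Assumption for this example: only a recent TensorFlow can be installed.
build, skip = split_by_availability(REQUIRES, available={"tf2"})
print("build:", build)
print("skip: ", skip)
```

Deciding this once, before any submodule is built, means a missing TensorFlow leads to a short list of skipped modules rather than a failure halfway through the build.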

This PR is not meant to be merged. It is for discussing the different aspects, with the goal of finding a consensus on the right solution.

Mar 10 '22 11:03 stweil

I wonder whether we really need export PIP ?= pip3 and would prefer to replace $(PIP) by a simple pip. @bertsky, you added that macro in commit 7807ae609458b847c609c3a133467370034e3363. Are there use cases where it is helpful? It is also exported. Are there submodules which need it? I found core and cor-asv-ann which use PIP, but it looks like they also don't need it.

Mar 10 '22 11:03 stweil

> I wonder whether we really need export PIP ?= pip3 and would prefer to replace $(PIP) by a simple pip

We don't need it any more. We only needed it when we still had to support 2.7 under certain circumstances. In fact, using anything but pip/pip3 for $(PIP) could cause inconsistencies if people override it with a pip that belongs to a different minor version of Python than the one in the venv.
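
The mismatch described here can be detected from the text that `pip --version` prints, which ends with the Python version that pip belongs to. The following is a small sketch in Python, not something from this repository: the helper names are made up, and the sample line is an invented example of that output format.

```python
import re
import sys


def pip_python_version(pip_version_output):
    """Extract (major, minor) from text like 'pip X.Y from /path (python 3.7)'."""
    match = re.search(r"\(python (\d+)\.(\d+)\)", pip_version_output)
    if match is None:
        raise ValueError("no '(python X.Y)' suffix found in: %r" % pip_version_output)
    return int(match.group(1)), int(match.group(2))


def pip_matches_interpreter(pip_version_output, interpreter=None):
    """True if pip reports the same major.minor as the given interpreter."""
    if interpreter is None:
        # Default to the Python that runs this script.
        interpreter = sys.version_info[:2]
    return pip_python_version(pip_version_output) == tuple(interpreter)


# Invented sample line in the format printed by `pip --version`.
sample = "pip 22.0.4 from /tmp/venv/lib/python3.7/site-packages/pip (python 3.7)"
print(pip_matches_interpreter(sample, interpreter=(3, 7)))
print(pip_matches_interpreter(sample, interpreter=(3, 10)))
```

Calling pip as `python -m pip` avoids the problem altogether, because pip then always runs under the interpreter that was named.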

Mar 16 '22 09:03 kba

Now 3 of 5 Python versions complete the build: https://github.com/stweil/ocrd_all/actions/runs/1993530591.

The remaining failures occur when building ocrd_kraken (see pull request #288, which I included here) and might be fixed in the CI run that is in progress now.

Mar 16 '22 19:03 stweil

Now all builds pass.

Mar 19 '22 09:03 stweil

The updated branch still passes for all builds: https://github.com/stweil/ocrd_all/actions/runs/2027885009.

Mar 23 '22 12:03 stweil

Latest run passes all builds: https://github.com/stweil/ocrd_all/actions/runs/2029562843.

Mar 24 '22 14:03 stweil

@bertsky, @kba, I compared the build rules here with those from git master by installing the complete make all results into two separate virtual environments. My build rules reduce the disk usage significantly (from 11605 MB to 6876 MB), mainly by reducing the total number of required virtual environments from four to two (one main venv, one sub venv). That would also result in faster builds and smaller Docker images, and so fix the time problems in CircleCI, where the image uploads account for a significant part of the run time.

stweil@ocr-02:~/src/github/OCR-D$ du -msc venv3.7-20220331-*
11605	venv3.7-20220331-master
6876	venv3.7-20220331-tensor
18480	total
stweil@ocr-02:~/src/github/OCR-D$ du -msc venv3.7-20220331-*/*
74	venv3.7-20220331-master/bin
594	venv3.7-20220331-master/build
2	venv3.7-20220331-master/build.log
11	venv3.7-20220331-master/include
2985	venv3.7-20220331-master/lib
1	venv3.7-20220331-master/lib64
3	venv3.7-20220331-master/libexec
1	venv3.7-20220331-master/locale
1	venv3.7-20220331-master/pyvenv.cfg
1	venv3.7-20220331-master/requirements.txt
103	venv3.7-20220331-master/share
7837	venv3.7-20220331-master/sub-venv
74	venv3.7-20220331-tensor/bin
594	venv3.7-20220331-tensor/build
1	venv3.7-20220331-tensor/build.log
11	venv3.7-20220331-tensor/include
4006	venv3.7-20220331-tensor/lib
1	venv3.7-20220331-tensor/lib64
3	venv3.7-20220331-tensor/libexec
1	venv3.7-20220331-tensor/locale
1	venv3.7-20220331-tensor/pyvenv.cfg
1	venv3.7-20220331-tensor/requirements.txt
103	venv3.7-20220331-tensor/share
2087	venv3.7-20220331-tensor/sub-venv
18480	total
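
As a rough check of the numbers above, the savings can be computed directly from the du values (in MB). This is only a small Python sketch with the figures copied from the listing; the variable names are arbitrary.

```python
# Sizes in MB, copied from the du output above.
master = {"total": 11605, "lib": 2985, "sub-venv": 7837}
tensor = {"total": 6876, "lib": 4006, "sub-venv": 2087}

# Overall reduction of the virtual environment size.
saved = master["total"] - tensor["total"]
saved_fraction = saved / master["total"]

# Where the difference comes from: sub-venv shrinks, the main lib grows.
sub_venv_saved = master["sub-venv"] - tensor["sub-venv"]
lib_growth = tensor["lib"] - master["lib"]

print(f"total saved:    {saved} MB ({saved_fraction:.1%})")
print(f"sub-venv saved: {sub_venv_saved} MB")
print(f"lib growth:     {lib_growth} MB")
```

The main venv grows by about 1 GB because more modules are installed there, but the sub venvs shrink by much more, so the net result is roughly 40 % less disk usage.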

Mar 31 '22 20:03 stweil

Now I also have timing and size results for a local make docker-maximum-cuda-git (single-threaded):

OCR-D:master
0.60user 0.55system 56:52.93elapsed 0%CPU (0avgtext+0avgdata 64456maxresident)k
3848inputs+3024outputs (39major+10597minor)pagefaults 0swaps
ocrd/all                  maximum-cuda-git   d737c47c1285   2 hours ago     32.8GB

stweil:tensorflow
0.71user 0.58system 45:54.85elapsed 0%CPU (0avgtext+0avgdata 67572maxresident)k
2280inputs+2712outputs (5major+11208minor)pagefaults 0swaps
ocrd/all                  maximum-cuda-git   a8fc92d9f811   2 minutes ago   23.7GB
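
The same kind of comparison for the Docker build, as a short Python sketch. The elapsed times and image sizes are copied from the output above; the helper only handles the minutes:seconds form shown there, and its name is arbitrary.

```python
def elapsed_seconds(text):
    """Convert an elapsed time like '56:52.93' (minutes:seconds) to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


# Values copied from the two runs above.
master_s = elapsed_seconds("56:52.93")
tensor_s = elapsed_seconds("45:54.85")
master_gb = 32.8
tensor_gb = 23.7

time_saved = master_s - tensor_s
size_saved = master_gb - tensor_gb

print(f"build time: {time_saved:.0f} s less ({time_saved / master_s:.1%})")
print(f"image size: {size_saved:.1f} GB less ({size_saved / master_gb:.1%})")
```

So the build is about 11 minutes (roughly 19 %) faster and the image about 9 GB (roughly 28 %) smaller, each measured in a single run.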

Apr 02 '22 21:04 stweil

This proof of concept is no longer up to date, so it can be closed.

Jun 16 '23 20:06 stweil
