ocrd_all
[no ci] [RFC] Rework build process
I am currently reworking the build process to support more modern Linux distributions and Python versions. Ideally, the full test matrix of Ubuntu LTS versions and Python versions from 3.6 to 3.10 should build fine.
My changes address several issues:
- Check early whether the required Tensorflow versions are available and skip those submodules which cannot be built because of missing Tensorflow.
- Use different rules for Python 3.6 and 3.10 when building ocrd_kraken.
- Disable ocrd_cis unconditionally because it requires an old calamari_ocr causing version conflicts.
- Address potential version conflicts for Python modules which are required by different OCR-D processors by using constraints.
- Reduce the number of sub_venvs by installing all modules which work with a recent Tensorflow in the main venv.
- Install Python module wheel early in all virtual environments, so it is no longer needed as a dependency.
- ... and other changes.
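The first item (check early, then skip) could be sketched roughly as follows. This is only an illustration and not the rule set used in this PR: the helper names check_requirement and select_modules, the CHECK hook, and the module names are hypothetical, and the requirement string is just an example. The check relies on pip download failing when no matching binary wheel exists for the current interpreter.

```shell
#!/bin/sh
# Hypothetical helper: succeed if pip finds a binary wheel for requirement $1
# that matches the current Python. Note that this really downloads the wheel
# into a temporary directory, so it needs network access.
check_requirement() {
    check_tmp=$(mktemp -d) || return 1
    pip download --quiet --no-deps --only-binary=:all: \
        --dest "$check_tmp" "$1" >/dev/null 2>&1
    check_status=$?
    rm -rf "$check_tmp"
    return "$check_status"
}

# Print the given modules if requirement $1 is available, else skip them.
# CHECK is a hypothetical hook to replace the pip based check, e.g. in tests.
select_modules() {
    sel_req=$1
    shift
    if "${CHECK:-check_requirement}" "$sel_req"; then
        echo "$@"
    else
        echo "skipping $* (no wheel for $sel_req)" >&2
    fi
}

# Example call (module names are placeholders, not real submodules):
#   TF1_MODULES=$(select_modules 'tensorflow==1.15.*' module_a module_b)
```

The result could then be used to filter the list of submodules before any of them is built, so a missing Tensorflow wheel shows up as a skipped module instead of a late build failure.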
This PR is not for merging, but for discussing the different aspects with the goal of finding a consensus on the right solution.
I wonder whether we really need export PIP ?= pip3 and would prefer to replace $(PIP) by a simple pip. @bertsky, you added that macro in commit 7807ae609458b847c609c3a133467370034e3363. Are there use cases where it is helpful? It is also exported. Are there submodules which need it? I found core and cor-asv-ann which use PIP, but it looks like they also don't need it.
I wonder whether we really need export PIP ?= pip3 and would prefer to replace $(PIP) by a simple pip
We don't need it any more. We needed it when we still had to support Python 2.7 under certain circumstances. In fact, using anything but pip/pip3 for $(PIP) could cause inconsistencies if people override it with a pip that belongs to a different minor version of Python than the venv.
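To illustrate the inconsistency: pip --version reports the Python version it belongs to in a trailing "(python X.Y)", so a mismatch with the venv interpreter can be detected. The following is a minimal sketch with hypothetical helper names (pip_python_version, same_python); nothing like it exists in the Makefile.

```shell
#!/bin/sh
# Extract "X.Y" from a line like
#   pip 22.0.4 from /venv/lib/python3.7/site-packages/pip (python 3.7)
pip_python_version() {
    printf '%s\n' "$1" |
        sed -n 's/.*(python \([0-9][0-9]*\.[0-9][0-9]*\)).*/\1/p'
}

# Succeed if the pip described by $1 belongs to Python version $2 ("X.Y").
same_python() {
    [ "$(pip_python_version "$1")" = "$2" ]
}

# Possible use inside an activated venv:
#   same_python "$("${PIP:-pip3}" --version)" \
#       "$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')" ||
#       echo "warning: PIP does not match the venv's Python" >&2
```

With a plain pip from the activated venv this check is trivially true, which is another argument for dropping the macro.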
Now 3 of 5 Python versions complete the build: https://github.com/stweil/ocrd_all/actions/runs/1993530591.
The remaining failures occur when building ocrd_kraken (see pull request #288 which I included here) and might be fixed in the CI running now.
Now all builds pass.
The updated branch still passes for all builds: https://github.com/stweil/ocrd_all/actions/runs/2027885009.
Latest run passes all builds: https://github.com/stweil/ocrd_all/actions/runs/2029562843.
@bertsky, @kba, I compared the build rules here with those from git master by installing the complete make all results into two separate virtual environments. The result shows that my build rules reduce the disk usage significantly, especially by reducing the total number of required virtual environments from four to two (one main venv, one sub venv). That should also result in faster builds and smaller Docker images, and so fix the time problems in CircleCI, where the image uploads contribute a significant part.
stweil@ocr-02:~/src/github/OCR-D$ du -msc venv3.7-20220331-*
11605 venv3.7-20220331-master
6876 venv3.7-20220331-tensor
18480 total
stweil@ocr-02:~/src/github/OCR-D$ du -msc venv3.7-20220331-*/*
74 venv3.7-20220331-master/bin
594 venv3.7-20220331-master/build
2 venv3.7-20220331-master/build.log
11 venv3.7-20220331-master/include
2985 venv3.7-20220331-master/lib
1 venv3.7-20220331-master/lib64
3 venv3.7-20220331-master/libexec
1 venv3.7-20220331-master/locale
1 venv3.7-20220331-master/pyvenv.cfg
1 venv3.7-20220331-master/requirements.txt
103 venv3.7-20220331-master/share
7837 venv3.7-20220331-master/sub-venv
74 venv3.7-20220331-tensor/bin
594 venv3.7-20220331-tensor/build
1 venv3.7-20220331-tensor/build.log
11 venv3.7-20220331-tensor/include
4006 venv3.7-20220331-tensor/lib
1 venv3.7-20220331-tensor/lib64
3 venv3.7-20220331-tensor/libexec
1 venv3.7-20220331-tensor/locale
1 venv3.7-20220331-tensor/pyvenv.cfg
1 venv3.7-20220331-tensor/requirements.txt
103 venv3.7-20220331-tensor/share
2087 venv3.7-20220331-tensor/sub-venv
18480 total
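For a quick reading of these numbers, the savings can be computed from the du values above (all in MB). The helper name saving is made up for this sketch.

```shell
#!/bin/sh
# Print the absolute and relative reduction from $1 to $2 (values in MB).
saving() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        printf "%d MB (%.1f%%)\n", before - after, (before - after) * 100 / before
    }'
}

saving 11605 6876   # whole venv: master vs. this branch
saving 7837 2087    # sub-venv directory only
```

So the whole environment shrinks by about 4.7 GB (roughly 41 %). The sub-venv directory alone shrinks by 5750 MB, while lib grows by 1021 MB (2985 to 4006) because more modules now live in the main venv, which matches the net saving.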
Now I also have timing and size results for a local make docker-maximum-cuda-git (single-threaded):
OCR-D:master
0.60user 0.55system 56:52.93elapsed 0%CPU (0avgtext+0avgdata 64456maxresident)k
3848inputs+3024outputs (39major+10597minor)pagefaults 0swaps
ocrd/all maximum-cuda-git d737c47c1285 2 hours ago 32.8GB
stweil:tensorflow
0.71user 0.58system 45:54.85elapsed 0%CPU (0avgtext+0avgdata 67572maxresident)k
2280inputs+2712outputs (5major+11208minor)pagefaults 0swaps
ocrd/all maximum-cuda-git a8fc92d9f811 2 minutes ago 23.7GB
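The relative gains can be derived from the elapsed times and image sizes above. This is a small sketch with made-up helper names (to_seconds, reduction); it assumes the m:ss.ss elapsed format shown in the output, i.e. builds shorter than one hour.

```shell
#!/bin/sh
# Convert an elapsed time in m:ss.ss format to seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F: '{ printf "%.2f\n", $1 * 60 + $2 }'
}

# Relative reduction from $1 to $2 in percent, one decimal place.
reduction() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        printf "%.1f\n", (before - after) * 100 / before
    }'
}

master=$(to_seconds 56:52.93)
branch=$(to_seconds 45:54.85)
echo "build time: -$(reduction "$master" "$branch")%"
echo "image size: -$(reduction 32.8 23.7)%"
```

That is roughly 11 minutes (about 19 %) less build time and 9.1 GB (about 28 %) less image size for this single-threaded local build.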
This proof of concept is no longer up to date, so it can be closed.