Optimize DockerfileBase for Improved Efficiency and Reduced Size
This pull request includes three commits that collectively optimize the DockerfileBase. The changes focus on improving the build process, reducing the Docker image size, and keeping the Dockerfile commands consistent.
Optimize DockerfileBase for Image Efficiency
- Add the `--no-cache-dir` flag to `pip install --upgrade pip` to align with other pip installation commands, prevent caching, and reduce Docker image layer size.
- Move Tesseract-OCR files to '/usr/share/tesseract-ocr-original' to prevent redundant operations and improve the efficiency of the backup process.
- Clean the apt cache within the same RUN statement after package installations by including `rm -rf /var/lib/apt/lists/*`, ensuring a reduced Docker image size by avoiding the retention of transient package data.
These changes contribute to a more efficient build process and a minimized Docker image, aligning with best practices for Dockerfile maintenance and deployment.
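As a rough illustration of the three changes described above, here is a minimal sketch of how they might look in a Dockerfile. It is not the actual DockerfileBase: the base image, the package list, and the source path of the Tesseract-OCR files are assumptions made for the example. Only the `--no-cache-dir` flag, the `rm -rf /var/lib/apt/lists/*` cleanup, and the '/usr/share/tesseract-ocr-original' destination come from this PR.

```dockerfile
# Illustrative sketch only. The base image and package list are assumptions,
# not the contents of the real DockerfileBase.
FROM python:3.11-slim

# Install packages and remove the apt package lists in the SAME RUN statement,
# so the transient index files are never committed to an image layer.
RUN apt-get update \
    && apt-get install -y --no-install-recommends tesseract-ocr \
    && rm -rf /var/lib/apt/lists/*

# --no-cache-dir keeps pip's download cache out of the layer,
# matching the other pip install commands.
RUN pip install --no-cache-dir --upgrade pip

# Move the Tesseract-OCR files once at build time.
# The source path /usr/share/tesseract-ocr is an assumption for this sketch.
RUN mv /usr/share/tesseract-ocr /usr/share/tesseract-ocr-original
```

The cleanup only saves space when it runs in the same RUN statement as the installation. If it were placed in a later RUN statement, the package lists would already be stored in the earlier layer and the image would not shrink.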
License Agreement for Contributions
By submitting this pull request, I acknowledge and agree that my contributions will be included in Stirling-PDF and that they can be relicensed in the future under MPL 2.0 (Mozilla Public License Version 2.0) license.
(This does not change the general open-source nature of Stirling-PDF, simply moving from one license to another license)
Docker image size before and after the change. The image is about 0.06 GB (roughly 4%) smaller:
REPOSITORY   TAG      IMAGE ID       CREATED        SIZE
base         after    91bee49112b9   19 hours ago   1.58GB
base         before   767036aed57a   20 hours ago   1.64GB
Plan to go with https://github.com/Stirling-Tools/Stirling-PDF/pull/624 which will deprecate this PR
But will merge for now, appreciate the help!