switch images to alpine

Open Zoey2936 opened this issue 1 year ago • 32 comments

License Agreement for Contributions

By submitting this pull request, I acknowledge and agree that my contributions will be included in Stirling-PDF and that they can be relicensed in the future under MPL 2.0 (Mozilla Public License Version 2.0) license.

(This does not change the general open-source nature of Stirling-PDF, simply moving from one license to another license)

Zoey2936 avatar Dec 31 '23 14:12 Zoey2936

Wow, this is awesome work. I could never get Alpine images working because of my limited Docker experience.

Have you been able to test these, including the Python and LibreOffice functionality?

Frooodle avatar Dec 31 '23 15:12 Frooodle

I just did a quick run of all 3 Dockerfiles.

Ultra-lite seems to be missing Java.

Frooodle avatar Dec 31 '23 16:12 Frooodle

Cool, with that change they all seem to load to the homepage.

I will run some tests on OCR and conversions, in parallel with your testing, to see if anything else is wrong.

Frooodle avatar Dec 31 '23 16:12 Frooodle

From my tests (of the full image) everything works except one thing: when I convert a PDF to Word, the docker logs show "Error: source file could not be loaded" and an empty zip file is downloaded.

Zoey2936 avatar Dec 31 '23 16:12 Zoey2936

With docker compose I can expose the port, but when running in Docker Desktop without the EXPOSE 8080 I get this: [screenshot]

What is the normal expectation for Dockerfiles?

Frooodle avatar Dec 31 '23 16:12 Frooodle

docker run --rm -it -p 8080:8080 works for me on Docker Desktop

Zoey2936 avatar Dec 31 '23 16:12 Zoey2936

(Meaning I am unable to run it and access port 8080.)

Frooodle avatar Dec 31 '23 16:12 Frooodle

Yeah, it works from the command line for me as well. I just find it odd that the Docker Desktop UI doesn't work with it or let you define ports, so I was wondering whether this is something non-standard.

Frooodle avatar Dec 31 '23 16:12 Frooodle

I've never used EXPOSE in a Dockerfile since it does nothing useful: https://docs.docker.com/engine/reference/builder/#expose
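
To illustrate the point: per the linked Docker reference, EXPOSE does not itself publish a port; publishing happens at run time with -p. A minimal sketch of the command discussed above, printed rather than executed so it can be tried without Docker. The image tag is a hypothetical placeholder, not a name from this PR.

```shell
#!/bin/sh
# IMAGE is a hypothetical local tag used only for illustration.
IMAGE="stirling-pdf:alpine-test"

# Build the run command; -p HOST:CONTAINER publishes the port
# whether or not the Dockerfile contains an EXPOSE line.
docker_run_command() {
    host_port="$1"
    container_port="$2"
    echo "docker run --rm -it -p ${host_port}:${container_port} ${IMAGE}"
}

# Print the command instead of running it.
docker_run_command 8080 8080
```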

Zoey2936 avatar Dec 31 '23 16:12 Zoey2936

I've tried again. It now downloads the converted document, but it is unusable (pdf => html): [screenshot]

Zoey2936 avatar Dec 31 '23 16:12 Zoey2936

Something is definitely off with the outputs when compared to the old version (you can test here: https://pdf.adminforge.de/pdf-to-word). PDF to Word on your version gives me [screenshot] vs [screenshot].

It seems it has just loaded the PDF file bytes into .doc format without actually converting.

Frooodle avatar Dec 31 '23 17:12 Frooodle

I see libreoffice-core is not available. I tried just 'libreoffice' and get a slightly better conversion now: all 3 pages are overlaid onto a single page and rotation is not kept. So something is weird, but clearly something is different about the LibreOffice package.

[screenshot]

Frooodle avatar Dec 31 '23 17:12 Frooodle

I've found the error

Zoey2936 avatar Dec 31 '23 17:12 Zoey2936

So right now the conversions are not happening in a way I am happy with. It seems to throw everything onto a single page, and I'm not sure why. Until that is resolved we won't be able to merge this. However, I really appreciate your work and will try to see what's up with LibreOffice.

Frooodle avatar Dec 31 '23 17:12 Frooodle

It seems to be a bug in LibreOffice, so it will also affect Debian/Ubuntu when they update in the future.

Zoey2936 avatar Dec 31 '23 18:12 Zoey2936

If it's the version and not the package in general, can we pull the older version?

Frooodle avatar Dec 31 '23 18:12 Frooodle

LibreOffice v7.3.7.2-r0 and earlier work; everything after v7.5.5.2-r0 does not, so the bug seems to have been introduced with v7.4 or v7.5.
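
A rough sketch of how a build script could check a LibreOffice package version against the last version reported working here. It assumes `sort -V` (version sort) is available, and the cutoff reflects only the observations in this thread, not an upstream statement; the function names are made up for illustration.

```shell
#!/bin/sh
# Strip the Alpine package revision suffix ("-r0") so only the
# upstream version number is compared.
strip_rev() {
    printf '%s\n' "${1%-r*}"
}

# Return 0 when version $1 is less than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_le() {
    a=$(strip_rev "$1")
    b=$(strip_rev "$2")
    [ "$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)" = "$a" ]
}

# Assumption: 7.3.7.2 is the last version reported as working in this thread.
LAST_KNOWN_GOOD="7.3.7.2"

libreoffice_reported_working() {
    version_le "$1" "$LAST_KNOWN_GOOD"
}

# Classify the two versions mentioned above.
for v in 7.3.7.2-r0 7.5.5.2-r0; do
    if libreoffice_reported_working "$v"; then
        echo "$v: reported working"
    else
        echo "$v: not confirmed working"
    fi
done
```

Versions between the two data points (7.4.x) are untested in this thread, so the sketch treats them as unconfirmed rather than broken.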

Zoey2936 avatar Dec 31 '23 18:12 Zoey2936

https://packages.ubuntu.com/jammy/libreoffice Currently the Docker image uses v7.3.

Zoey2936 avatar Dec 31 '23 19:12 Zoey2936

Using alpine:3.17 instead of :latest makes it work for lite and normal. Ultra-lite can stay at latest, but 3.17 is two versions old...

Zoey2936 avatar Dec 31 '23 19:12 Zoey2936

https://mirror1.hs-esslingen.de/pub/Mirrors/tdf/libreoffice/src/bugs-changelog-tag-libreoffice-7.6.4.1-release-7.6.4.1.log "PDF: Conversion of pdf to docx or doc collapses all content onto one page (tdf#157589) [Kevin Suo]"

Zoey2936 avatar Jan 01 '24 01:01 Zoey2936

https://gitlab.alpinelinux.org/alpine/aports/-/issues/15628

Zoey2936 avatar Jan 01 '24 01:01 Zoey2936

> using alpine:3.17 instead of :latest makes it work for lite and normal, ultra lite can stay at latest, but 3.17 is two versions old...

Based on this, and since you are using 3.19, is this PR now blocked until Alpine updates its LibreOffice package?

Frooodle avatar Jan 01 '24 16:01 Frooodle

> using alpine:3.17 instead of :latest makes it work for lite and normal, ultra lite can stay at latest, but 3.17 is two versions old...
>
> Based on this and you using 3.19 is this PR now blocked on resolution of alpine updating its package manager?

Yes, blocked until they fix LibreOffice, but from what I know they are fast at fixing this.

Zoey2936 avatar Jan 01 '24 16:01 Zoey2936

It should now work. I will create a PR once a fixed LibreOffice version is available outside edge.

Zoey2936 avatar Jan 10 '24 21:01 Zoey2936

By the way, I'm wondering if I can get your advice: I am trying to get calibre installed on Alpine.

I get:

apk add --no-cache calibre --repository=http://dl-cdn.alpinelinux.org/alpine/edge/testing
fetch http://dl-cdn.alpinelinux.org/alpine/edge/testing/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/community/x86_64/APKINDEX.tar.gz
ERROR: unable to select packages:
  py3-pyqt6-webengine (no such package):
    required by: calibre-7.3.0-r0[py3-pyqt6-webengine]
  qt6-qtwebengine (no such package):
    required by: calibre-7.3.0-r0[qt6-qtwebengine]
  so:libicui18n.so.74 (no such package):
    required by: calibre-7.3.0-r0[so:libicui18n.so.74]
  so:libicuuc.so.74 (no such package):
    required by: calibre-7.3.0-r0[so:libicuuc.so.74]
  so:libpodofo.so.2 (no such package):
    required by: calibre-7.3.0-r0[so:libpodofo.so.2]

As a solution I added

http://dl-cdn.alpinelinux.org/alpine/edge/main
http://dl-cdn.alpinelinux.org/alpine/edge/community

to /etc/apk/repositories

Is this okay?

Frooodle avatar Jan 13 '24 13:01 Frooodle

I would do this in /etc/apk/repositories:

https://dl-cdn.alpinelinux.org/alpine/v3.19/main
https://dl-cdn.alpinelinux.org/alpine/v3.19/community

@testing https://dl-cdn.alpinelinux.org/alpine/edge/main
@testing https://dl-cdn.alpinelinux.org/alpine/edge/community
@testing https://dl-cdn.alpinelinux.org/alpine/edge/testing

apk add --no-cache calibre@testing
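
The same setup as a script, as a rough sketch. With tagged repositories, apk should only pull from the edge repos for packages requested with the @testing tag (and their dependencies where needed), so the rest of the image stays on v3.19. The function name and the use of a temporary file are illustrative choices so this can be tried outside a container.

```shell
#!/bin/sh
# Write a repositories file with the stable v3.19 repos plus the edge
# repos tagged "@testing". The target path is a parameter so the sketch
# can be tried without touching /etc/apk/repositories.
write_repositories() {
    target="$1"
    cat > "$target" <<'EOF'
https://dl-cdn.alpinelinux.org/alpine/v3.19/main
https://dl-cdn.alpinelinux.org/alpine/v3.19/community
@testing https://dl-cdn.alpinelinux.org/alpine/edge/main
@testing https://dl-cdn.alpinelinux.org/alpine/edge/community
@testing https://dl-cdn.alpinelinux.org/alpine/edge/testing
EOF
}

# Try it against a temporary file and show the result.
repo_file=$(mktemp)
write_repositories "$repo_file"
cat "$repo_file"

# Inside an Alpine container one would write to /etc/apk/repositories
# and then run (not executed here):
#   apk add --no-cache calibre@testing
```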

Zoey2936 avatar Jan 13 '24 13:01 Zoey2936

Can I ask when this PR will be merged?

Zoey2936 avatar Jan 13 '24 13:01 Zoey2936

> can I ask when this PR will be merged?

Once I figure out calibre and wkhtmltopdf. I added them as custom installs which users can run after installing the Docker image: https://github.com/Stirling-Tools/Stirling-PDF/pull/682/files#diff-8e6e46e251822ef45509ec061b133c5788a5719a0670ac3158b49dbb58e3dab6R69

Frooodle avatar Jan 13 '24 15:01 Frooodle

Should I add them to this PR? calibre and wkhtmltopdf?

Zoey2936 avatar Jan 13 '24 15:01 Zoey2936

If you can make it conditional in the Dockerfile via some arg in the init scripts, or change it to match the Java file I linked, then sure!
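
Something along these lines, as a dry-run sketch only. INSTALL_OPTIONAL_TOOLS is a made-up variable name, not an existing Stirling-PDF setting, and the package names (including the @testing tag and whether wkhtmltopdf is packaged for the chosen Alpine release) are assumptions that would need checking.

```shell
#!/bin/sh
# Hypothetical switch: INSTALL_OPTIONAL_TOOLS is an illustrative variable
# name, not an existing Stirling-PDF setting.
# Print the install command for the optional tools, or nothing when the
# switch is off. Package names are assumptions.
optional_install_command() {
    if [ "${INSTALL_OPTIONAL_TOOLS:-false}" = "true" ]; then
        echo "apk add --no-cache calibre@testing wkhtmltopdf"
    fi
}

# Dry run: show what an init script would execute instead of running it.
cmd=$(optional_install_command)
if [ -n "$cmd" ]; then
    echo "would run: $cmd"
else
    echo "optional tools skipped"
fi
```

Keeping the default off means the image behaves as before unless the user opts in at container start.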

Frooodle avatar Jan 13 '24 15:01 Frooodle