docspell icon indicating copy to clipboard operation
docspell copied to clipboard

File import task loops infinitely

Open emhl opened this issue 2 years ago • 5 comments

I tried to upload a file via curl, but i made a small syntax error and now it has been trying to import something for the last five hours. cancelling the job through the web gui doesn't work. additionally the joex container spits out weird logs and seems to be restarting constantly. is ther a way to force shutdown jobs from the cli?

the incorrect curl command was of the form:

curl -XPOST -F file=file.pdf https://docspell.example.com/api/v1/open/upload/item/APIKEY
Logs from joex container
	at cats.effect.IOFiber.runLoop(IOFiber.scala:409)
	at cats.effect.IOFiber.asyncContinueSuccessfulR(IOFiber.scala:1338)
	at cats.effect.IOFiber.run(IOFiber.scala:140)
	at cats.effect.unsafe.WorkerThread.run(WorkerThread.scala:523)
Starting unoconv listener
Sep 14, 2022 5:50:12 PM org.jline.utils.Log logr
WARNING: Unable to create a system terminal, creating a dumb terminal (enable debug logging for more information)
2022.09.14 17:50:11:943 io-compute-blocker-4 INFO docspell.config.ConfigFactory.default:44
    Using config from environment variables!
2022.09.14 17:50:13:0000 [io-comp...] [INFO ] docspell.joex.Main.run:37 - 
***>  ______                          _ _
***>  |  _  \                        | | |
***>  | | | |___   ___ ___ _ __   ___| | |
***>  | | | / _ \ / __/ __| '_ \ / _ \ | |
***>  | |/ / (_) | (__\__ \ |_) |  __/ | |
***>  |___/ \___/ \___|___/ .__/ \___|_|_|
***>                      | |
***>                      |_| v0.38.0 (#516cb688)
***>  >
***>  Id:       joex1
***>  Base-Url: http://docspell-joex:7878
***>  Database: jdbc:postgresql://db:5432/dbname
***>  Fts:      Solr(http://docspell-solr:8983/solr/docspell)
***>  Config:   
***>  FileRepo: DefaultDatabase(true)
***>  
2022.09.14 17:50:14:0000 [io-comp...] [INFO ] docspell.store.migrate.FlywayMigrate.createFlyway:34 - Creating Flyway for: List(classpath:db/fixups/postgresql, classpath:db/fixups/common)
2022.09.14 17:50:14:0001 [io-comp...] [INFO ] docspell.store.migrate.FlywayMigrate.runFixups:79 - !!! Running fixup migrations (repair=false)
2022.09.14 17:50:14:0002 [io-comp...] [INFO ] org.flywaydb.core.internal.license.VersionPrinter - Flyway Community Edition 8.5.13 by Redgate
2022.09.14 17:50:14:0003 [io-comp...] [INFO ] org.flywaydb.core.internal.license.VersionPrinter - See what's new here: https://flywaydb.org/documentation/learnmore/releaseNotes#8.5.13
2022.09.14 17:50:14:0004 [io-comp...] [INFO ] org.flywaydb.core.internal.license.VersionPrinter - 
2022.09.14 17:50:14:0005 [io-comp...] [INFO ] org.flywaydb.core.internal.database.base.BaseDatabaseType - Database: jdbc:postgresql://db:5432/dbname (PostgreSQL 14.5)
2022.09.14 17:50:14:0006 [io-comp...] [INFO ] org.flywaydb.core.internal.command.DbValidate - Successfully validated 2 migrations (execution time 00:00.027s)
2022.09.14 17:50:14:0007 [io-comp...] [INFO ] org.flywaydb.core.internal.command.DbMigrate - Current version of schema "public": 1.33.0
2022.09.14 17:50:14:0008 [io-comp...] [INFO ] org.flywaydb.core.internal.command.DbMigrate - Schema "public" is up to date. No migration necessary.
2022.09.14 17:50:14:0009 [io-comp...] [INFO ] docspell.store.migrate.FlywayMigrate.createFlyway:34 - Creating Flyway for: List(classpath:db/migration/postgresql, classpath:db/migration/common)
2022.09.14 17:50:14:0010 [io-comp...] [INFO ] docspell.store.migrate.FlywayMigrate.runMain:59 - !!! Running main migrations (repair=false)
2022.09.14 17:50:14:0011 [io-comp...] [INFO ] org.flywaydb.core.internal.license.VersionPrinter - Flyway Community Edition 8.5.13 by Redgate
2022.09.14 17:50:14:0012 [io-comp...] [INFO ] org.flywaydb.core.internal.license.VersionPrinter - See what's new here: https://flywaydb.org/documentation/learnmore/releaseNotes#8.5.13
2022.09.14 17:50:14:0013 [io-comp...] [INFO ] org.flywaydb.core.internal.license.VersionPrinter - 
2022.09.14 17:50:14:0014 [io-comp...] [INFO ] org.flywaydb.core.internal.database.base.BaseDatabaseType - Database: jdbc:postgresql://db:5432/dbname (PostgreSQL 14.5)
2022.09.14 17:50:14:0015 [io-comp...] [INFO ] org.flywaydb.core.internal.command.DbValidate - Successfully validated 64 migrations (execution time 00:00.042s)
2022.09.14 17:50:14:0016 [io-comp...] [INFO ] org.flywaydb.core.internal.command.DbMigrate - Current version of schema "public": 1.38.2
2022.09.14 17:50:14:0017 [io-comp...] [INFO ] org.flywaydb.core.internal.command.DbMigrate - Schema "public" is up to date. No migration necessary.
2022.09.14 17:50:15:0000 [io-comp...] [INFO ] docspell.analysis.nlp.PipelineCache.create:128 - Clearing StanfordNLP cache after Duration(900000ms) idle time
2022.09.14 17:50:15:0001 [io-comp...] [INFO ] docspell.analysis.nlp.PipelineCache.apply:43 - Creating nlp pipeline cache
2022.09.14 17:50:15:0000 [io-comp...] [INFO ] docspell.backend.ops.ONode.register:52 - Registering node joex1
2022.09.14 17:50:16:0000 [io-comp...] [INFO ] docspell.scheduler.impl.SchedulerImpl.start:133 - Starting scheduler
2022.09.14 17:50:16:0000 [io-comp...] [INFO ] docspell.scheduler.impl.PeriodicSchedulerImpl.start:33 - Starting periodic scheduler
2022.09.14 17:50:15:0001 [io-comp...] [INFO ] docspell.notification.api.EventExchange.consume:48 - Starting up 2 notification event consumers
2022.09.14 17:50:16:0001 [io-comp...] [INFO ] docspell.pubsub.naive.NaivePubSub.subscribe:89 - Adding subscriber for topics: NonEmptyList(Topic(jobs-notify))
2022.09.14 17:50:16:0000 [io-comp...] [INFO ] docspell.pubsub.naive.NaivePubSub.subscribe:89 - Adding subscriber for topics: NonEmptyList(Topic(job-cancel-request))
2022.09.14 17:50:16:0001 [io-comp...] [INFO ] docspell.pubsub.naive.NaivePubSub.subscribe:89 - Adding subscriber for topics: NonEmptyList(Topic(periodic-task-notify))
2022.09.14 17:50:16:0000 [io-comp...] [INFO ] org.http4s.blaze.channel.nio1.NIO1SocketServerGroup - Service bound to address /0.0.0.0:7878
2022.09.14 17:50:16:0001 [io-comp...] [INFO ] org.http4s.blaze.server.BlazeServerBuilder - http4s v0.23.13 on blaze v0.23.12 started at http://0.0.0.0:7878/
2022.09.14 17:50:16:0000 [io-comp...] [INFO ] docspell.scheduler.impl.LogSink.logInternal:41 - >>> 2022-09-14T15:50:16.531Z Info Fidpvm1Hc.../docspell-system/full-text-index/Low: Running full-text-index migrations now (jobId: "Fidpvm1HcML-dFkKBFHhunW-y7spMf5y95V-mnzpyV4EQup", task: "full-text-index", group: "docspell-system", jobInfo: "Fidpvm1Hc.../docspell-system/full-text-index/Low")
2022.09.14 17:50:17:0000 [io-comp...] [INFO ] docspell.scheduler.impl.SchedulerImpl.wrapTask:270 - Job execution successful: Fidpvm1Hc.../docspell-system/full-text-index/Low
2022.09.14 17:50:17:0000 [io-comp...] [INFO ] docspell.scheduler.impl.LogSink.logInternal:41 - >>> 2022-09-14T15:50:17.044Z Info Fidpvm1Hc.../docspell-system/full-text-index/Low: No fulltext search migrations to run. (jobId: "Fidpvm1HcML-dFkKBFHhunW-y7spMf5y95V-mnzpyV4EQup", task: "full-text-index", group: "docspell-system", jobInfo: "Fidpvm1Hc.../docspell-system/full-text-index/Low")
2022.09.14 17:50:17:0001 [io-comp...] [INFO ] docspell.scheduler.impl.LogSink.logInternal:41 - >>> 2022-09-14T15:50:17.046Z Info Fidpvm1Hc.../docspell-system/full-text-index/Low: Job execution successful (jobId: "Fidpvm1HcML-dFkKBFHhunW-y7spMf5y95V-mnzpyV4EQup", task: "full-text-index", group: "docspell-system", jobInfo: "Fidpvm1Hc.../docspell-system/full-text-index/Low")
2022.09.14 17:50:17:0000 [io-comp...] [INFO ] docspell.scheduler.impl.LogSink.logInternal:41 - >>> 2022-09-14T15:50:17.138Z Info r5PT4WeeA.../steiner/process-item/Low: ============ Start processing  ============ (jobId: "r5PT4WeeALS-dK15gLymobc-nTikEt8HBHM-F1f2A6FU4g", task: "process-item", group: "steiner", jobInfo: "r5PT4WeeA.../steiner/process-item/Low")
2022.09.14 17:50:16:0001 [io-comp...] [WARN ] docspell.scheduler.impl.LogSink.logInternal:45 - >>> 2022-09-14T15:50:17.243Z Warn r5PT4WeeA.../steiner/process-item/Low: Found 1 existing item with these files. (jobId: "r5PT4WeeALS-dK15gLymobc-nTikEt8HBHM-F1f2A6FU4g", task: "process-item", group: "steiner", jobInfo: "r5PT4WeeA.../steiner/process-item/Low")
2022.09.14 17:50:17:0001 [io-comp...] [INFO ] docspell.scheduler.impl.LogSink.logInternal:41 - >>> 2022-09-14T15:50:17.689Z Info r5PT4WeeA.../steiner/process-item/Low: Converting file None (text/plain; charset="ISO-8859-1") into a PDF (jobId: "r5PT4WeeALS-dK15gLymobc-nTikEt8HBHM-F1f2A6FU4g", task: "process-item", group: "steiner", jobInfo: "r5PT4WeeA.../steiner/process-item/Low")
Exception in thread "io-compute-blocker-3" java.lang.UnsupportedClassVersionError: com/vladsch/flexmark/util/data/DataHolder has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:757)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
	at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
	at docspell.convert.Conversion$$anon$1.$anonfun$toPDF$1(Conversion.scala:68)
	at cats.effect.IOFiber.next$2(IOFiber.scala:398)
	at cats.effect.IOFiber.runLoop(IOFiber.scala:409)
	at cats.effect.IOFiber.asyncContinueSuccessfulR(IOFiber.scala:1338)
	at cats.effect.IOFiber.run(IOFiber.scala:140)
	at cats.effect.unsafe.WorkerThread.run(WorkerThread.scala:523)
java.lang.UnsupportedClassVersionError: com/vladsch/flexmark/util/data/DataHolder has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:757)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
	at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
	at docspell.convert.Conversion$$anon$1.$anonfun$toPDF$1(Conversion.scala:68)
	at cats.effect.IOFiber.next$2(IOFiber.scala:398)
	at cats.effect.IOFiber.runLoop(IOFiber.scala:409)
	at cats.effect.IOFiber.asyncContinueSuccessfulR(IOFiber.scala:1338)
	at cats.effect.IOFiber.run(IOFiber.scala:140)
	at cats.effect.unsafe.WorkerThread.run(WorkerThread.scala:523)
Logs from web gui
Mi.. 14. September 2022, 12:03: ============ Start processing ============
Mi.. 14. September 2022, 12:03: Not checking for duplicates
Mi.. 14. September 2022, 12:03: Creating new item with 1 attachment(s)
Mi.. 14. September 2022, 12:03: Creating item finished in 14 ms
Mi.. 14. September 2022, 12:03: Not an archive: text/plain; charset="ISO-8859-1"
Mi.. 14. September 2022, 12:03: Converting file None (text/plain; charset="ISO-8859-1") into a PDF
Mi.. 14. September 2022, 12:03: ============ Start processing ============
Mi.. 14. September 2022, 12:03: Not checking for duplicates
Mi.. 14. September 2022, 12:03: Found 1 existing item with these files.
Mi.. 14. September 2022, 12:03: Found 1 attachments. Use only those from task args: Set(steiner/attachmentsource/5cXu83cybwxznzCaPYjVb4Ev6EpaDwAPinkM5pQd2GAg)
Mi.. 14. September 2022, 12:03: Not an archive: text/plain; charset="ISO-8859-1"
Mi.. 14. September 2022, 12:03: Converting file None (text/plain; charset="ISO-8859-1") into a PDF
Mi.. 14. September 2022, 12:03: ============ Start processing ============
Mi.. 14. September 2022, 12:03: Not checking for duplicates
Mi.. 14. September 2022, 12:03: Found 1 existing item with these files.
Mi.. 14. September 2022, 12:03: Found 1 attachments. Use only those from task args: Set(steiner/attachmentsource/5cXu83cybwxznzCaPYjVb4Ev6EpaDwAPinkM5pQd2GAg)
Mi.. 14. September 2022, 12:03: Not an archive: text/plain; charset="ISO-8859-1"
Mi.. 14. September 2022, 12:03: Converting file None (text/plain; charset="ISO-8859-1") into a PDF
Mi.. 14. September 2022, 12:03: ============ Start processing ============
Mi.. 14. September 2022, 12:03: Not checking for duplicates
Mi.. 14. September 2022, 12:03: Found 1 existing item with these files.
Mi.. 14. September 2022, 12:03: Found 1 attachments. Use only those from task args: Set(steiner/attachmentsource/5cXu83cybwxznzCaPYjVb4Ev6EpaDwAPinkM5pQd2GAg)
Mi.. 14. September 2022, 12:03: Not an archive: text/plain; charset="ISO-8859-1"
Mi.. 14. September 2022, 12:03: Converting file None (text/plain; charset="ISO-8859-1") into a PDF
Mi.. 14. September 2022, 12:04: ============ Start processing ============
Mi.. 14. September 2022, 12:04: Not checking for duplicates
Mi.. 14. September 2022, 12:04: Found 1 existing item with these files.
Mi.. 14. September 2022, 12:04: Found 1 attachments. Use only those from task args: Set(steiner/attachmentsource/5cXu83cybwxznzCaPYjVb4Ev6EpaDwAPinkM5pQd2GAg)
Mi.. 14. September 2022, 12:04: Not an archive: text/plain; charset="ISO-8859-1"
Mi.. 14. September 2022, 12:04: Converting file None (text/plain; charset="ISO-8859-1") into a PDF
Mi.. 14. September 2022, 12:04: ============ Start processing ============
Mi.. 14. September 2022, 12:04: Not checking for duplicates
Mi.. 14. September 2022, 12:04: Found 1 existing item with these files.
Mi.. 14. September 2022, 12:04: Found 1 attachments. Use only those from task args: Set(steiner/attachmentsource/5cXu83cybwxznzCaPYjVb4Ev6EpaDwAPinkM5pQd2GAg)
Mi.. 14. September 2022, 12:04: Not an archive: text/plain; charset="ISO-8859-1"
Mi.. 14. September 2022, 12:04: Converting file None (text/plain; charset="ISO-8859-1") into a PDF
Mi.. 14. September 2022, 12:04: ============ Start processing ============
Mi.. 14. September 2022, 12:04: Not checking for duplicates
Mi.. 14. September 2022, 12:04: Found 1 existing item with these files.
Mi.. 14. September 2022, 12:04: Found 1 attachments. Use only those from task args: Set(steiner/attachmentsource/5cXu83cybwxznzCaPYjVb4Ev6EpaDwAPinkM5pQd2GAg)
Mi.. 14. September 2022, 12:04: Not an archive: text/plain; charset="ISO-8859-1"
Mi.. 14. September 2022, 12:04: Converting file None (text/plain; charset="ISO-8859-1") into a PDF
Mi.. 14. September 2022, 12:04: ============ Start processing ============
Mi.. 14. September 2022, 12:04: Not checking for duplicates
Mi.. 14. September 2022, 12:04: Found 1 existing item with these files.
Mi.. 14. September 2022, 12:04: Found 1 attachments. Use only those from task args: Set(steiner/attachmentsource/5cXu83cybwxznzCaPYjVb4Ev6EpaDwAPinkM5pQd2GAg)
Mi.. 14. September 2022, 12:04: Not an archive: text/plain; charset="ISO-8859-1"
Mi.. 14. September 2022, 12:04: Converting file None (text/plain; charset="ISO-8859-1") into a PDF
Mi.. 14. September 2022, 12:04: ============ Start processing ============
Mi.. 14. September 2022, 12:04: Not checking for duplicates
Mi.. 14. September 2022, 12:04: Found 1 existing item with these files.
Mi.. 14. September 2022, 12:04: Found 1 attachments. Use only those from task args: Set(steiner/attachmentsource/5cXu83cybwxznzCaPYjVb4Ev6EpaDwAPinkM5pQd2GAg)
Mi.. 14. September 2022, 12:04: Not an archive: text/plain; charset="ISO-8859-1"
Mi.. 14. September 2022, 12:04: Converting file None (text/plain; charset="ISO-8859-1") into a PDF

emhl avatar Sep 14 '22 15:09 emhl

It seems you are runnig java 8, but you need at least java 11. The infinite loop is because docspell crashes and something (docker compose or systemd?) is restarting the service - but it can never succeed. The jvm docs have a table to match the class version to a jdk version here: https://docs.oracle.com/javase/specs/jvms/se18/html/jvms-4.html#jvms-4.1

eikek avatar Sep 14 '22 17:09 eikek

i used the docker compose script from the official documentation. but this is running on arm64 hardware. are the arm64 dockerfiles outdated?

emhl avatar Sep 14 '22 18:09 emhl

RUN JDKPKG="openjdk11-jre"; \
    if [[ $TARGETPLATFORM = linux/arm* ]]; then JDKPKG="openjdk8-jre"; fi; \

a yes in joex.dockerfile java 8 is imported not java 11 (if running on arm). what's the reason for that?

emhl avatar Sep 14 '22 18:09 emhl

Ah right, the arm64 images are broken in that respect and have been fixed lately (see #1712 and #1713). Back in the day there was no working jdk11 package for alpine on arm64 - that was the reason. Now there is openjdk17 and docspell moved forward to rely on some jdk11+ APIs (only very few). But in your case it is a dependency that has been compiled with jdk11 causing the crash (it is a runtime error that only occurs if a text file is processed and the library is trying to be loaded). I have too full days lately, but I want to create the next release soon. Until then you can either create your own image (you can use the docker files on current master branch), use the SNAPSHOT version or avoid text files :).

To remove the job from the queue, use the job id from the logs (r5PT4WeeALS-dK15gLymobc-nTikEt8HBHM-F1f2A6FU4g) and run this sql:

delete from joblog where jid = '<job-id>';
delete from job where jid = '<job-id>';

eikek avatar Sep 14 '22 19:09 eikek

Oh sorry - I just saw that what I said above applies only to the restserver image 😱 You are right - the joex images must be changed in the same way - sorry for that. So you can't just use the existing dockerfiles, but teh joex one requires a little modification.

eikek avatar Sep 14 '22 19:09 eikek

Hello, I use now the SNAPSHOT version of the dockerfile v0.39.0-SNAPSHOT, but Joex has the same failure,

"com/vladsch/flexmark/util/ast/Node has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0"

Is there a version for ARM64 with jdk11+?

gameshacker avatar Oct 24 '22 12:10 gameshacker

@gameshacker sadly not yet, the bug is still existing in the current joex docker images for arm64.

eikek avatar Oct 24 '22 17:10 eikek