
Teedy-import (docker & linux) adds a file once. Subsequent files added fail to add the attachment.

Open oz-glenn opened this issue 3 years ago • 9 comments

Start condition: a pdf file is waiting in the import folder and teedy is running. Start teedy-import. The file is successfully imported. Add another pdf file to the import folder. Successive documents are repeatedly created with the name of the pdf, but the pdf is not attached, nor is it deleted. Restarting teedy-import continues creating attachment-free repeats of the same document(s). Restarting teedy-server and teedy-import imports the document correctly. Once. Go back to "add another pdf" and repeat each time a document is added to the import folder.

I've checked within the teedy-import container and it can see, move and delete the files in the import folder while teedy-server is refusing to attach the file. No errors shown in the teedy-import log, or the teedy-server container logs. Teedy server monitor shows no errors. This behaviour makes adding files to the watched folder rather pointless.

oz-glenn avatar Feb 10 '22 13:02 oz-glenn

Further information:

If left to its own devices, around 7-15 attachment-free documents are created, then 5-7 minutes later the attachment is created, leaving quite a mess behind to be cleaned up.

Teedy-server version: 1.10 Teedy-import: latest. All running on Docker.

oz-glenn avatar Feb 11 '22 10:02 oz-glenn

Reproducer and help needed.

jendib avatar Apr 17 '22 11:04 jendib

Hi, I have the same behavior on a new Debian 11 installation with Docker and Portainer (latest version, can't tell which one at the moment). The pdf documents are stored in a Samba share on an ext4 filesystem, processed there, and then moved to a backup folder. I've already tried changing permissions and similar things, but nothing helped. I would be very grateful for a solution or a temporary fix, as it is very annoying and I process a lot of files through the importer.

ElektroCoder avatar Jul 05 '22 11:07 ElektroCoder

Hello jendib, I've managed to resolve the problem that was causing empty documents to accumulate in the database. However, due to the issue, my database now contains a number of these empty documents. I'm reaching out to kindly ask if anyone could assist me with the PostgreSQL SQL code to identify and remove these empty documents without any attached files. It's crucial for me to clean up the database while ensuring the integrity of valid data remains unaffected. Your continued support and expertise would be immensely valuable to me. Thank you all for your time and consideration.

Edit: Additional Information

I've managed to retrieve the list of documents using the following SQL query:

SELECT * FROM t_document
WHERE doc_idfile_c IS NULL;

This query gives me the documents that I believe are "empty," as they don't have associated files. However, before I proceed with any removal, I'm a bit cautious about potential inconsistencies that might arise. I would greatly appreciate your insights on whether it's safe to proceed with removing these entries directly, or if there are any recommended best practices to follow to ensure the integrity of the database.

Once again, thank you for your assistance and expertise!

ElektroCoder avatar Aug 29 '23 12:08 ElektroCoder

The query is valid. One thing to note is that the field can be updated asynchronously when a file is added to a document, so the query should be executed when no events are waiting; otherwise you could purge documents to which a file has been added but whose related event is still waiting in the internal queue.
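To illustrate that caution, here is a minimal sketch in Python that runs a more conservative selection against a throwaway in-memory SQLite database standing in for PostgreSQL. Only t_document and doc_idfile_c come from the query above; t_file, fil_iddoc_c, fil_deletedate_d, doc_id_c and doc_deletedate_d are assumed names that should be checked against the real schema before anything is deleted. The idea is to treat a document as empty only if no file row points at it at all, which is a heuristic and not a guarantee that the internal queue is drained.

```python
import sqlite3

# Candidate "empty" documents: no main file set (the filter from the thread),
# not already soft-deleted, and no live file row referencing the document.
# ASSUMED names: t_file, fil_iddoc_c, fil_deletedate_d, doc_id_c, doc_deletedate_d.
FIND_EMPTY = """
SELECT d.doc_id_c
FROM t_document d
WHERE d.doc_idfile_c IS NULL
  AND d.doc_deletedate_d IS NULL
  AND NOT EXISTS (
        SELECT 1
        FROM t_file f
        WHERE f.fil_iddoc_c = d.doc_id_c
          AND f.fil_deletedate_d IS NULL)
ORDER BY d.doc_id_c
"""


def find_empty_documents(conn):
    """Return the ids of documents that have no file at all."""
    return [row[0] for row in conn.execute(FIND_EMPTY)]


def build_demo_db():
    """Create a tiny stand-in schema with three hypothetical documents."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE t_document (
            doc_id_c TEXT PRIMARY KEY,
            doc_title_c TEXT,
            doc_idfile_c TEXT,
            doc_deletedate_d TEXT);
        CREATE TABLE t_file (
            fil_id_c TEXT PRIMARY KEY,
            fil_iddoc_c TEXT,
            fil_deletedate_d TEXT);
        INSERT INTO t_document VALUES
            ('d1', 'scan.pdf', 'f1', NULL),  -- complete document
            ('d2', 'scan.pdf', NULL, NULL),  -- empty duplicate
            ('d3', 'scan.pdf', NULL, NULL);  -- file stored, main file not set yet
        INSERT INTO t_file VALUES
            ('f1', 'd1', NULL),
            ('f3', 'd3', NULL);
    """)
    return conn


if __name__ == "__main__":
    print(find_empty_documents(build_demo_db()))
```

In this demo the plain doc_idfile_c IS NULL filter would match both d2 and d3, while the stricter selection returns only d2. Against a real database it is safer to take a backup first, run the SELECT on its own, and review the result before turning it into a DELETE or a soft delete.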

archiloque avatar Aug 29 '23 14:08 archiloque

This sounds like what I experienced and logged in #707

madduck avatar Sep 07 '23 14:09 madduck

It is similar, but in this case it does eventually process the file. If it was a permissions issue it would never have access.

oz-glenn avatar Sep 07 '23 22:09 oz-glenn

I have the same behavior with my new installation. I run

  • teedy 1.11 on docker
  • postgres:13.1-alpine on docker
  • importer on my sambaserver (no docker)

Before, I had an installation with an H2 database on a non-Docker server and the same importer on my Samba server (no Docker). That one works flawlessly... The importer has not changed since then.

I did some debugging, but without much success. The importer usually says "Error loading tags" REQUEST onRequestResponse http://192.168.13.7/api/tag/list 403, which I interpret as a Forbidden error, even though the login was successful before. In this case there is no upload at all.

Sometimes the error is Upload failed for <<Filename>>: null. In this case it produces empty documents as described by @oz-glenn and @ElektroCoder.

Is there any solution in the meantime? Or can someone help me debugging this?

dhuber1 avatar Oct 04 '23 09:10 dhuber1

Maybe interesting: I did some further testing:

  • Docker sismics/docs:latest, Docker postgres:13.1-alpine => Not working. Problem like described above
  • Docker sismics/V1.11, Docker postgres:13.1-alpine => Not working. Problem like described above
  • Docker sismics/V1.10, Docker postgres:13.1-alpine => Not working. Problem like described above
  • Bare metal. docs-web-1.10.war H2 Database => Works perfectly
  • Bare metal. docs-web-1.10.war Postgres12 (installed directly by apt on Ubuntu) => Not working. Problem like described above

It seems to have something to do with Postgres, but strangely the rest of the application is working properly. Only the import fails. Any ideas?

dhuber1 avatar Oct 04 '23 12:10 dhuber1