uwazi
Wrong file name parsing due to missing extension
We are creating entries of this form:
"_id" : ObjectId("668c6173afac83a434cd9beb"),
"entity" : "rnu3ljgi0df",
"type" : "document",
"filename" : "1720476018696wx3osn5bm4.com discussing the case",
"originalname" : "News article from LiveLaw.com discussing the case",
"mimetype" : "application/pdf",
Here the "filename" field is generated incorrectly, apparently because 1) the original name has no file extension, and 2) it contains a period in the middle ("LiveLaw.com"), so everything after that period (".com discussing the case") is wrongly treated as the file extension.
This is causing file-not-found errors.
Possible fixes:
- Ensure this kind of entry is not created anymore
- Find out why this is causing file-not-found exceptions, and either fix the existing data or adapt the retrieval approach.
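
As a starting point for discussion, here is a minimal sketch of both ideas. It is not Uwazi's actual upload code: `trailingExtension`, `storedFilename`, `looksMisparsed`, the `MIME_EXTENSIONS` map and the 8-character limit are all hypothetical names and choices made for illustration. The assumption is that the stored name is a generated unique ID plus an extension, that the MIME type is a more reliable source for the extension than the original name, and that text after the last period only counts as an extension if it is short and alphanumeric.

```typescript
// Hypothetical helpers for illustration only; not taken from the Uwazi codebase.

// Text after the last period counts as an extension only if it is
// short and purely alphanumeric (so no spaces). The limit of 8 is an assumption.
const EXTENSION_PATTERN = /^\.[A-Za-z0-9]{1,8}$/;

// Illustrative, non-exhaustive map from MIME type to extension.
const MIME_EXTENSIONS: Record<string, string> = {
  'application/pdf': '.pdf',
  'image/png': '.png',
  'image/jpeg': '.jpg',
};

// Returns everything from the last period onwards, or '' if there is
// no period or the name starts with its only period.
function trailingExtension(name: string): string {
  const lastDot = name.lastIndexOf('.');
  return lastDot > 0 ? name.slice(lastDot) : '';
}

// Builds the stored file name: prefer the MIME type, fall back to the
// original name only when its trailing part looks like a real extension.
function storedFilename(uniqueId: string, originalname: string, mimetype: string): string {
  const fromMime = MIME_EXTENSIONS[mimetype];
  if (fromMime !== undefined) {
    return `${uniqueId}${fromMime}`;
  }
  const candidate = trailingExtension(originalname);
  return EXTENSION_PATTERN.test(candidate) ? `${uniqueId}${candidate.toLowerCase()}` : uniqueId;
}

// Flags existing entries whose stored name has a period followed by
// something that does not look like an extension.
function looksMisparsed(filename: string): boolean {
  const candidate = trailingExtension(filename);
  return candidate !== '' && !EXTENSION_PATTERN.test(candidate);
}
```

With the entry from this report, `storedFilename('1720476018696wx3osn5bm4', 'News article from LiveLaw.com discussing the case', 'application/pdf')` gives `1720476018696wx3osn5bm4.pdf`, and `looksMisparsed` returns `true` for the current stored name, so a check like this could be used to list affected entries before deciding how to repair them. Note the limitation: an original name that ends in something like "LiveLaw.com" with an unmapped MIME type would still get `.com` as its extension.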