paperless-ngx-postprocessor

[Discussion] Custom Fields

Open hasechris opened this issue 1 year ago • 8 comments

Hi,

I wanted to ask about custom fields, because I could not find any mention of them in the postprocessor (whether or not they are supported in any way).

My vision: I write two dates on my documents: when the document reached me and, if it is an invoice, when I paid it (if I have to pay manually).

The OCR can already recognize my handwriting today. In my mind, I would write a postprocessor rule that detects this text and copies it into custom fields. Sadly, that last part is not documented, so I can't tell whether it is supported or how to do it.

hasechris • Feb 06 '24 12:02

Hi there! Custom fields aren't currently supported, although I think it's a brilliant idea. (This project was started long before Paperless-ngx had support for custom fields, which is why they're not supported.)

I'm a little too busy to code this up myself right now, but I'd be happy to accept pull requests if anyone wanted to try for themselves.

jgillula • Feb 06 '24 18:02

I found this new discussion in the paperless-ngx repository: https://github.com/paperless-ngx/paperless-ngx/discussions/5482

Tomb01 • Mar 24 '24 15:03

Hi @Tomb01,

yeah, that's not what I'm looking for, sorry :laughing: The PR only adds the ability to define custom fields while uploading a new document. I want this postprocessor to set custom fields based on matching content in the document. I'm also already working on it; sadly I had to pause work on this FR because of life and other things. But now I'm back on track and will upload a PR in the next 1-2 weeks.

hasechris • Apr 07 '24 23:04

Hey @jgillula (or anyone who has the time to help me),

I now have a kind-of-working version; see my repo https://github.com/hasechris/paperless-ngx-postprocessor on the main branch.

Sadly I can't continue, because I have a problem where I just can't find my mistake; I'm still taking baby steps in Python. When I debug my code, everything works for custom fields until the value change in the file https://github.com/hasechris/paperless-ngx-postprocessor/blob/main/paperlessngx_postprocessor/postprocessor.py#L313.

The old metadata is in the variable metadata_in_filename_format, and on line 313 the new variable new_metadata_in_filename_format should be filled with the changed metadata. Sadly, in my branch the old metadata variable is changed as well, and I can't figure out why.

State after Line 309 was executed: image

State after line 313 was executed: image

Note that the date has also changed in the upper variable. It seems there is a link/pointer to the custom_fields object somewhere, but I can't find it.

Diving further in, at line 232 in the same file, my code: image This screenshot shows the state after line 232 was executed. This code is just copied from your code, and sadly I don't understand why it works for the other metadata parameters but not for the custom fields.

Could you take a look at this?

My example filter is the last one in the file rulesets.d/example.yml. I also have a document on my Paperless server that contains the string "eingegangen 30.01.2024", which matches the regex.
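As a rough illustration of the kind of match described here, the following standalone Python sketch pulls a date out of OCR text such as "eingegangen 30.01.2024". The pattern, the group name `received`, and the function `extract_received_date` are assumptions made up for this example; the actual regex lives in rulesets.d/example.yml and is not shown in this thread.

```python
import re
from datetime import date, datetime
from typing import Optional

# Hypothetical pattern: the real regex from rulesets.d/example.yml is not
# quoted in the thread, so this is only an assumed equivalent.
RECEIVED_PATTERN = re.compile(r"eingegangen\s+(?P<received>\d{2}\.\d{2}\.\d{4})")


def extract_received_date(ocr_text: str) -> Optional[date]:
    """Return the date written after 'eingegangen', or None if absent."""
    match = RECEIVED_PATTERN.search(ocr_text)
    if match is None:
        return None
    # The handwritten note uses the German day.month.year order.
    return datetime.strptime(match.group("received"), "%d.%m.%Y").date()


print(extract_received_date("Rechnung Nr. 17, eingegangen 30.01.2024"))
```

The parsed `date` object is what a rule would then need to write into a custom field; how the postprocessor does that is outside the scope of this sketch.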

hasechris • Apr 26 '24 02:04

Apologies for the delay in responding. I have a hunch that the issue is the copy on line 285.

copy just does a shallow copy, and since all the other objects in metadata_in_filename_format are simple immutable objects (like strings or dates), they only ever get replaced in the new dict, so the old dict doesn't change. But since the custom fields are dicts themselves, both copies point at the same inner dict, and changing it shows up in both. (More info at https://docs.python.org/3/library/copy.html)

I think the solution is to change that copy to a deepcopy. (And TBH, I think this was a bug you found and even if you weren't making your changes, it should probably have been a deepcopy from the beginning.)
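The difference between the two kinds of copy can be reproduced in isolation. This is a minimal sketch with made-up metadata; the dict keys and values are assumptions for illustration and are not taken from postprocessor.py.

```python
import copy
from datetime import date


def make_metadata() -> dict:
    # Hypothetical metadata: flat immutable values plus one nested dict.
    return {
        "title": "Invoice",
        "custom_fields": {"received": date(2024, 1, 20)},
    }


# Shallow copy: the outer dict is new, but the nested dict is shared.
old = make_metadata()
new = copy.copy(old)
new["title"] = "Changed"                              # rebinds a key in the copy only
new["custom_fields"]["received"] = date(2024, 1, 30)  # mutates the shared inner dict
shallow_title = old["title"]
shallow_received = old["custom_fields"]["received"]

# Deep copy: the nested dict is duplicated too, so the original is untouched.
old2 = make_metadata()
new2 = copy.deepcopy(old2)
new2["custom_fields"]["received"] = date(2024, 1, 30)
deep_received = old2["custom_fields"]["received"]

print(shallow_title)     # Invoice
print(shallow_received)  # 2024-01-30
print(deep_received)     # 2024-01-20
```

With the shallow copy the title stays intact but the nested date leaks back into the original, which matches the symptom described above.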

I don't have a test setup handy to check myself; could you try making that change and see if it fixes your problem? (Also you'll probably need to change line 143)

jgillula • May 10 '24 22:05

Nice one. I may implement this feature myself in the next few weeks. I'll come back to you if it gets some traction...

apachler • Aug 07 '24 18:08

Ugh, yeah, sorry... Life got in the way again and took up a lot of my time. @jgillula I tried the deepcopy, but for some reason it behaved exactly the same as a normal copy, and a few more hours of debugging did not get me anywhere.

I wanted to invest more time but slowly forgot about it after 1-2 weeks. Thanks for reminding me again.
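The thread does not say why deepcopy made no difference here. As a generic debugging aid, not a diagnosis of this code, one can check whether two metadata dicts still share a mutable value by identity. The helper name `shared_mutable_keys` and the sample data are assumptions for this sketch.

```python
import copy


def shared_mutable_keys(a: dict, b: dict) -> list:
    """Return the keys whose values are the very same mutable object in both dicts."""
    return sorted(
        key
        for key in a.keys() & b.keys()
        if a[key] is b[key] and isinstance(a[key], (dict, list, set))
    )


# Hypothetical metadata, only for demonstrating the check.
original = {"title": "Invoice", "custom_fields": {"received": "2024-01-30"}}

print(shared_mutable_keys(original, copy.copy(original)))      # ['custom_fields']
print(shared_mutable_keys(original, copy.deepcopy(original)))  # []
```

If such a check reports no shared keys right after the copy but the old value still changes later, the sharing is probably introduced somewhere after the copy, for example by assigning the same object into both dicts.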

hasechris • Aug 08 '24 23:08

Merry Christmas everyone,

I finally finished the work on custom_field support. The PR is open and now waiting for release checks (code review etc.).

Hoping we can get this released soon.

hasechris • Dec 24 '24 00:12