Match rule by content of document

Open svenwanzenried opened this issue 11 months ago • 4 comments

I'd like to create a rule that is only executed if the string "foo bar" is found in the document's content. (In my case I want to distinguish letters from the same correspondent by keywords in the document.)

As I understand it, I would have to define that in the match expression (probably with match_regex() somehow), but I cannot figure out how to do this with a Jinja expression. I'm sorry if this question is too trivial, but it seems like a very common use case, and I wanted to ask it here so that other Jinja beginners can find the answer.

svenwanzenried avatar Jan 09 '25 07:01 svenwanzenried

You could do this with the new workflow feature of paperless: have a workflow add a special tag when the content is found, match on that tag in a rule, and remove the tag again afterwards.

MephistoJB avatar Jan 12 '25 16:01 MephistoJB

Thanks @MephistoJB for the tip. That would certainly work, and I'm probably going to do that. However, I still think matching on content would be a really nice feature, because then I would have my postprocessing in one place and not split between workflows and the YAML file.

svenwanzenried avatar Jan 17 '25 14:01 svenwanzenried

I want to second this request. In my case I was trying to set up a simple rule matching by document_type, only to find out from the logs that paperless-ngx apparently assigns the document_type after calling post_consume_script.sh.

If paperless-ngx-postprocessor supported matching by document content, I could move everything into rulesets.

buschtoens avatar Jul 13 '25 12:07 buschtoens

I submitted a PR exposing content in match and validation_rule: https://github.com/jgillula/paperless-ngx-postprocessor/pull/38
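
For anyone landing here later: assuming the PR exposes the document text under the name content (that name is taken from the PR description; I have not checked the exact variable the merged code uses), a match expression along the lines of {{ "foo bar" in content }} would presumably cover the original question. Below is a minimal Python sketch of that kind of check, not the postprocessor's actual code. The helper name content_matches and the sample letter are made up for illustration. It also tolerates case differences and OCR line breaks between the words, which a plain substring test would miss.

```python
import re


def content_matches(content: str, phrase: str) -> bool:
    """Return True if `phrase` occurs in `content`.

    Hypothetical helper for illustration only: matching is case-insensitive,
    and any run of whitespace (including OCR line breaks) between the words
    of the phrase is accepted.
    """
    words = phrase.split()
    if not words:
        # An empty phrase should not match every document.
        return False
    # Escape each word so it is matched literally, then allow arbitrary
    # whitespace between consecutive words.
    pattern = r"\s+".join(re.escape(word) for word in words)
    return re.search(pattern, content, flags=re.IGNORECASE) is not None


if __name__ == "__main__":
    # Made-up OCR text with a line break in the middle of the keyword.
    letter = "Dear customer,\nyour Foo\nBar contract has been renewed."
    print(content_matches(letter, "foo bar"))
    print(content_matches(letter, "invoice"))
```

The same idea can be written as a regular expression (foo\s+bar, case-insensitive) if the rule engine offers a regex test instead of a plain substring test.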

buschtoens avatar Jul 13 '25 14:07 buschtoens