pdfgrep Handle output with no highlighted matches

Open imarko opened this issue 3 years ago • 2 comments

Make pdfgrep-current-page-and-match return nil for the match when there is no highlighted match in the output line. pdfgrep-goto-locus already handles this.

Oct 07 '22 22:10 imarko

Normal pdfgrep invocations always include a highlighted match. However, if the user specifies --color=never, or if another tool is used to produce similar output without highlights, pdfgrep.el would throw an error on any line that has no highlight. This fix prevents that.
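To make the intended behavior concrete, here is a rough Python sketch of the same idea. It is not the Emacs Lisp code in pdfgrep.el. The function name `page_and_match` is hypothetical, the `file:page:text` layout with an uncolored prefix is an assumption, and the highlight escape sequences are the ones used by the recoll script later in this thread.

```python
import re

# Assumed highlight markers (grep-style ANSI sequences), as in the recoll script in this thread.
START = "\33[01;31m\33[K"
END = "\33[m\33[K"

# Assumed line layout: file:page:text. File names containing ':' are not handled here.
LINE_RE = re.compile(r"^(?P<file>[^:\n]+):(?P<page>\d+):(?P<text>.*)$")
MATCH_RE = re.compile(re.escape(START) + r"(.*?)" + re.escape(END))


def page_and_match(line):
    """Return (page, match) for one output line.

    match is None when the line carries no highlight, instead of raising an error.
    Returns None if the line does not look like file:page:text at all.
    """
    m = LINE_RE.match(line)
    if m is None:
        return None
    hl = MATCH_RE.search(m.group("text"))
    return int(m.group("page")), (hl.group(1) if hl else None)
```

A caller can then jump to the page in both cases and only search for the match text when the second element is not None.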

I am working on a script that produces pdfgrep-like output from recoll queries, and in some cases not all lines will have highlights.

Oct 07 '22 22:10 imarko

Below is my pdfgrep-recoll script, which can sometimes produce output lines with no highlighted matches. This seems to happen for phrase searches in particular:

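#! /usr/bin/python
# Print recoll query results in pdfgrep-like "file:page:snippet" form.
from urllib.parse import urlparse
import sys

from recoll import recoll


class HL:
    """Highlight callbacks passed to getsnippets(); wrap each match in ANSI color codes."""

    def startMatch(self, i):
        # Bold red, then erase to end of line
        return "\33[01;31m\33[K"

    def endMatch(self):
        # Reset attributes, then erase to end of line
        return "\33[m\33[K"


hl = HL()
# Join all command-line arguments into a single query string
query = " ".join(sys.argv[1:])
q = recoll.connect().query()
count = q.execute(query)
for doc in q:
    for snip in q.getsnippets(doc, maxoccs=99999, ctxwords=8, sortbypage=True, methods=hl):
        # snip[0] is used as the page number and snip[2] as the snippet text
        print("%s:%d:%s" % (urlparse(doc.url).path, snip[0], snip[2]))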

Oct 07 '22 22:10 imarko
