
Option to return whole sentence as context

Open jan-niestadt opened this issue 6 years ago • 3 comments

Right now, each match is returned with a fixed number of words as context (wordsaroundhit parameter). We would like to have an option to return matches with the whole sentence as context, or possibly even the previous and next sentence too. This would require significant changes in BlackLab, though. We should think about how to go about this.

jan-niestadt avatar Apr 05 '18 08:04 jan-niestadt

Dear @jan-niestadt, has this feature been implemented in BlackLab yet?

puerdon avatar Mar 11 '21 16:03 puerdon

Unfortunately not. You might be able to get close to what you want by altering your query to explicitly search for sentences:

<s/> containing Q

yields the same matches as query Q by itself, but returns the whole sentence containing each match. The downside is that you can't tell where in the sentence Q matched.

If you want the previous and next sentence as well, you could try:

<s/> (<s/> containing Q) <s/>

It would be better if you could get the hits for Q and specify the context you want to see around the hits as 1 or more sentences. Unfortunately this functionality is not a priority for us right now. We would welcome others contributing a solution though, and would be happy to brainstorm and answer any questions.
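If you build queries programmatically, the two workarounds above could be wrapped in a small helper. This is only a sketch: the function names are made up for illustration, the parentheses around Q are added defensively so that a multi-token query stays grouped, and patt is assumed to be the request parameter that carries the query pattern.

```python
from urllib.parse import urlencode


def sentence_query(q: str, with_neighbours: bool = False) -> str:
    """Wrap query q so that each hit is the whole enclosing sentence.

    Hypothetical helper, not part of BlackLab. Q is parenthesised so
    that a sequence such as "the" "house" stays grouped as one operand.
    """
    wrapped = f"<s/> containing ({q})"
    if with_neighbours:
        # Also match the sentence before and the sentence after.
        wrapped = f"<s/> ({wrapped}) <s/>"
    return wrapped


def hits_query_string(q: str, with_neighbours: bool = False) -> str:
    """URL-encode the wrapped query.

    Assumption: the server reads the pattern from a parameter
    named 'patt'; adjust if your setup differs.
    """
    return urlencode({"patt": sentence_query(q, with_neighbours)})


if __name__ == "__main__":
    print(sentence_query('"the" "house"'))
    print(sentence_query('"the" "house"', with_neighbours=True))
    print(hits_query_string('"the" "house"'))
```

Note that with the previous and next sentence included, a match in the first or last sentence of a document would presumably not be found, because there is no neighbouring sentence on one side.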

jan-niestadt avatar Mar 12 '21 08:03 jan-niestadt

A way to get full sentences and also know where in the sentence the hit occurs:

<s/> containing A:("the" "house")

The token positions of the capture group A should be returned along with the full sentence hits.
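On the client side, the position of the hit inside the sentence then follows by subtracting the sentence start from the capture group positions. A minimal sketch, assuming both spans are given as document-wide token positions with an exclusive end; the Span type and both function names are hypothetical, and how you read the positions out of the response depends on your BlackLab version.

```python
from typing import List, NamedTuple


class Span(NamedTuple):
    start: int  # first token position (inclusive)
    end: int    # position just past the last token (exclusive, assumed)


def locate_in_sentence(sentence: Span, group: Span) -> Span:
    """Convert capture group positions to offsets within the sentence hit."""
    if not (sentence.start <= group.start and group.end <= sentence.end):
        raise ValueError("capture group lies outside the sentence hit")
    return Span(group.start - sentence.start, group.end - sentence.start)


def mark_hit(tokens: List[str], rel: Span) -> str:
    """Join the sentence tokens, putting brackets around the captured words."""
    before = tokens[:rel.start]
    hit = tokens[rel.start:rel.end]
    after = tokens[rel.end:]
    return " ".join(before + ["[" + " ".join(hit) + "]"] + after)


if __name__ == "__main__":
    # Made-up positions: the sentence covers tokens 120..126, group A covers 122..123.
    sentence = Span(120, 127)
    group_a = Span(122, 124)
    tokens = ["We", "bought", "the", "house", "last", "year", "."]
    rel = locate_in_sentence(sentence, group_a)
    print(rel)
    print(mark_hit(tokens, rel))
```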

jan-niestadt avatar Apr 14 '22 13:04 jan-niestadt