
model.filter/filter is ignored with yql while recall works as expected

Open jobergum opened this issue 1 year ago • 2 comments

Using recall in combination with YQL

Using recall

This is identical to filter, except that recall terms are not exposed to the ranking framework and thus are not ranked. As such, one cannot use unprefixed terms; each term must be either positive (+) or negative (-).
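To illustrate the prefix rule quoted above, here is a minimal sketch that flags terms in a recall string lacking a leading `+` or `-`. The helper name `unprefixed_terms` is hypothetical, and the tokenization via `shlex` is an assumption made for illustration; it is not how Vespa itself parses the parameter.

```python
import shlex


def unprefixed_terms(recall: str) -> list[str]:
    """Return the terms in a recall string that lack a leading '+' or '-'.

    Illustration only: this is not Vespa's parser, just a whitespace/quote
    tokenizer applied to the recall string.
    """
    # shlex keeps a quoted phrase such as text:"a b" together as one token.
    tokens = shlex.split(recall)
    return [t for t in tokens if not t.startswith(("+", "-"))]


# The recall string from this issue: every term is prefixed.
print(unprefixed_terms('+text:ARIMA +text:"novel coronavirus illness"'))

# A string with one unprefixed term, which the rule above disallows.
print(unprefixed_terms('text:ARIMA -text:weather'))
```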

{'yql': 'select id,title,text from sources * where {targetHits:10}nearestNeighbor(embedding, q)', 'query': 'how does the coronavirus respond to changes in the weather', 'ranking.profile': 'dense', 'presentation.format.tensors': 'short-value', 'hits': 3, 'language': 'en', 'timeout': '15s', 'presentation.timing': 'true', 'input.query(q)': 'embed(bge, "Represent this sentence for searching relevant passages: how does the coronavirus respond to changes in the weather")', 'recall': '+text:ARIMA +text:"novel coronavirus illness"', 'tracelevel': 3}

This gives the following correct query tree:

{'message': 'sc0.num0 search to dispatch: query=[AND NEAREST_NEIGHBOR {field=embedding,queryTensorName=q,hnsw.exploreAdditionalHits=0,distanceThreshold=Infinity,approximate=true,targetHits=10} |text:arima |text:"novel coronaviru illness"] timeout=14977ms offset=0 hits=3 rankprofile[dense]

This is the expected query tree: the recall terms are ANDed with the nearestNeighbor operator as filter terms (not highlighted), and ranking is disabled for them.

Using model.filter in combination with YQL

{'yql': 'select id,title,text from sources * where {targetHits:10}nearestNeighbor(embedding, q)', 'query': 'how does the coronavirus respond to changes in the weather', 'ranking.profile': 'dense', 'presentation.format.tensors': 'short-value', 'hits': 3, 'language': 'en', 'timeout': '15s', 'presentation.timing': 'true', 'input.query(q)': 'embed(bge, "Represent this sentence for searching relevant passages: how does the coronavirus respond to changes in the weather")', 'filter': '+text:ARIMA +text:"novel coronavirus illness"', 'tracelevel': 3}

This gives the following incorrect query tree (the filter is silently dropped):

sc0.num0 search to dispatch: query=[NEAREST_NEIGHBOR {field=embedding,queryTensorName=q,hnsw.exploreAdditionalHits=0,distanceThreshold=Infinity,approximate=true,targetHits=10}] timeout=14960ms offset=0 hits=3 rankprofile[dense]
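The difference between the two traces can be checked mechanically. The sketch below is a hedged illustration, not part of Vespa: the helper names `dispatched_query` and `filter_applied` are hypothetical, and the regular expression assumes the trace message has the `query=[...] timeout=` shape shown in this issue.

```python
import re

# Trace messages copied from the two requests in this issue.
RECALL_TRACE = (
    "sc0.num0 search to dispatch: query=[AND NEAREST_NEIGHBOR "
    "{field=embedding,queryTensorName=q,hnsw.exploreAdditionalHits=0,"
    "distanceThreshold=Infinity,approximate=true,targetHits=10} "
    '|text:arima |text:"novel coronaviru illness"] '
    "timeout=14977ms offset=0 hits=3 rankprofile[dense]"
)
FILTER_TRACE = (
    "sc0.num0 search to dispatch: query=[NEAREST_NEIGHBOR "
    "{field=embedding,queryTensorName=q,hnsw.exploreAdditionalHits=0,"
    "distanceThreshold=Infinity,approximate=true,targetHits=10}] "
    "timeout=14960ms offset=0 hits=3 rankprofile[dense]"
)
# Terms as they appear after linguistic processing in the recall trace.
EXPECTED_TERMS = ["text:arima", 'text:"novel coronaviru illness"']


def dispatched_query(trace_message: str) -> str:
    """Extract the query tree between 'query=[' and '] timeout='."""
    match = re.search(r"query=\[(.*)\] timeout=", trace_message)
    if match is None:
        raise ValueError("no dispatched query found in trace message")
    return match.group(1)


def filter_applied(trace_message: str, expected_terms: list[str]) -> bool:
    """Return True if every expected term appears in the dispatched query."""
    query = dispatched_query(trace_message)
    return all(term in query for term in expected_terms)


print("recall:", filter_applied(RECALL_TRACE, EXPECTED_TERMS))
print("filter:", filter_applied(FILTER_TRACE, EXPECTED_TERMS))
```

With the traces above, the recall request keeps both terms in the dispatched query, while the filter request does not.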

jobergum, Oct 03 '23 07:10

Any updates on this @bjorncs ?

jobergum, Dec 06 '23 08:12

No.

bjorncs, Dec 06 '23 08:12