
Consider adding `max` parameter for Autocomplete request

Open zimeon opened this issue 2 years ago • 6 comments

We have a `min` parameter (https://iiif.io/api/search/1.0/#request); there might be utility in a `max` parameter to omit very common terms in the index.

(From editors' meeting 2022-06-10)

zimeon avatar Jun 10 '22 17:06 zimeon

Should this be included in Search 3.0? (thumbs up/down on the comment please)

azaroth42 avatar Jun 28 '22 16:06 azaroth42

I would like to look at examples of other search services that leverage a feature like this to better understand the use case

zimeon avatar Jun 29 '22 13:06 zimeon

Assuming that the terms are ordered by frequency, rather than strictly alphanumeric sort, and stopwords are not excluded, then the top words will likely be useless for autocomplete - 'th' --> the, then, there, these, this, ... and so on.

max could exclude those ... but, to be less optimistic, (a) it would be better to have the server exclude these automatically, and (b) the max number would be server/index dependent and hard for the client to predict.
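To make point (b) concrete, here is a minimal sketch of an autocomplete filter with a count-based cutoff. It assumes a toy in-memory index with made-up counts; `autocomplete` and `max_count` are hypothetical names for illustration and are not part of the Search API.

```python
def autocomplete(index_counts, prefix, min_count=1, max_count=None):
    """Return (term, count) pairs matching prefix, most frequent first.

    min_count mirrors the existing `min` request parameter.
    max_count is the hypothetical `max` discussed in this issue.
    """
    matches = [
        (term, count)
        for term, count in index_counts.items()
        if term.startswith(prefix)
        and count >= min_count
        and (max_count is None or count <= max_count)
    ]
    # Descending count, then alphabetical so ties have a stable order.
    return sorted(matches, key=lambda pair: (-pair[1], pair[0]))


# Toy index with invented counts (assumption, not real data).
small_index = {"the": 900, "then": 120, "there": 80, "theatre": 12, "thesis": 7}
# The same content at ten times the size.
large_index = {term: count * 10 for term, count in small_index.items()}

small_result = autocomplete(small_index, "th", max_count=50)
large_result = autocomplete(large_index, "th", max_count=50)
```

With these toy numbers, `max_count=50` keeps `theatre` and `thesis` on the small index but returns nothing on the larger one, which is the sense in which a fixed number is hard for a client to choose.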

azaroth42 avatar Jun 29 '22 13:06 azaroth42

I think if we wanted to exclude common words for autocomplete as you describe, then max would not want to be a number but instead a frequency, so that it scales with the size of the resource (i.e. nothing that occurs more than 5% of the time). But even then, I think knowing what this threshold should be would be very difficult in the general case. I agree that this is probably best left up to the server though (which might be configured with knowledge of the content) and not put into the spec.
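A relative cutoff of this kind could be sketched as below. This is only an illustration under assumptions: the counts are invented, `filter_by_share` and `max_share` are hypothetical names, and "frequency" is taken to mean a term's share of all term occurrences in the index.

```python
def filter_by_share(index_counts, max_share=0.05):
    """Keep terms whose share of all occurrences is at most max_share."""
    total = sum(index_counts.values())
    if total == 0:
        return {}
    return {
        term: count
        for term, count in index_counts.items()
        if count / total <= max_share
    }


# Toy index with invented counts (assumption, not real data).
small_index = {"the": 900, "then": 120, "there": 80, "theatre": 12, "thesis": 7}
# The same content at ten times the size.
large_index = {term: count * 10 for term, count in small_index.items()}

kept_small = sorted(filter_by_share(small_index))
kept_large = sorted(filter_by_share(large_index))
```

Because the cutoff is a proportion, both toy indexes keep the same terms (`theatre` and `thesis`). Choosing a sensible value for `max_share` remains the hard part, as noted above.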

zimeon avatar Jun 29 '22 13:06 zimeon

Where a search service does not exclude stopwords, does it make sense to have a parameter that tells the client to use a specific or custom stopword list? For example, if project A requires a German stopwords list and project B requires a stopwords list of terms that are too common to be meaningful in that context, then the client could use the Search API to load the appropriate stopword list for that project context. Or something like that...
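As a rough sketch of that idea, the snippet below filters suggestions against a named stopword list. Everything here is an assumption for illustration: the list names, the words in them, and the `suggest` function with its `stopwords_key` argument are hypothetical and do not correspond to anything in the Search API.

```python
# Hypothetical named stopword lists; contents are illustrative only.
STOPWORD_LISTS = {
    "de": {"der", "die", "das", "und"},
    "project-b": {"manuscript", "folio"},
}


def suggest(terms, prefix, stopwords_key=None):
    """Return terms matching prefix, minus the selected stopword list."""
    stopwords = STOPWORD_LISTS.get(stopwords_key, set())
    return [
        term
        for term in terms
        if term.startswith(prefix) and term not in stopwords
    ]


terms = ["der", "deutsch", "die", "dom"]
with_german_list = suggest(terms, "d", stopwords_key="de")
without_list = suggest(terms, "d")
```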

In any event, I don't think I understand how a max parameter would be able to effectively limit terms for Autocomplete and agree with @zimeon's comment.

kirschbombe avatar Jun 29 '22 19:06 kirschbombe

Changing my opinion from :eyes: to :-1: given the discussion

azaroth42 avatar Jun 30 '22 20:06 azaroth42