
autocomplete: avoid trimming trailing spaces in sanitizer

Open missinglink opened this issue 4 years ago • 0 comments

for the autocomplete endpoint specifically, the presence of a trailing whitespace character in the input text has semantic value.

it indicates a word boundary: the final word that was typed is complete and does not represent a prefix of a longer word.

looking at the sanitiser code, we trim whitespace from both sides of the input before passing it down to the tokenizer and parser functions. those functions could use the trailing-whitespace information to make better decisions about what sort of queries to generate.
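as a rough sketch of the idea (not the actual sanitiser code), the input could still be trimmed while the trailing-whitespace signal is kept as a separate flag. the names `SanitizedText`, `final_token_complete` and `sanitize_autocomplete_text` are hypothetical and only used for illustration:

```python
from typing import NamedTuple


class SanitizedText(NamedTuple):
    """Hypothetical result type: trimmed text plus the word-boundary signal."""
    text: str                    # input with surrounding whitespace removed
    final_token_complete: bool   # True when the raw input ended in whitespace


def sanitize_autocomplete_text(raw: str) -> SanitizedText:
    """Trim the input, but remember whether it ended with whitespace."""
    trimmed = raw.strip()
    # A trailing space only carries meaning if something was typed before it.
    ends_with_space = bool(trimmed) and raw != raw.rstrip()
    return SanitizedText(trimmed, ends_with_space)
```

downstream code would then receive clean text as before, plus one boolean it can choose to use or ignore.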

for example, the following queries should return different results, but currently return the same thing:

/v1/autocomplete?layers=country&text=uk

... should return

Ukraine
United Kingdom

/v1/autocomplete?layers=country&text=uk%20

... should *only* return

United Kingdom
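the expected behaviour above can be illustrated with a toy matcher. this is only a sketch under stated assumptions: the `COUNTRIES` table, its `uk` alias and the `autocomplete_countries` function are made up for the example and do not reflect how the real index or query generation works:

```python
# Hypothetical in-memory index: display name -> lowercase searchable names.
COUNTRIES = {
    "Ukraine": ["ukraine"],
    "United Kingdom": ["united kingdom", "uk"],
}


def autocomplete_countries(raw_text: str) -> list[str]:
    """Return matching countries, treating a trailing space as a word boundary."""
    query = raw_text.strip().lower()
    if not query:
        return []
    # Trailing whitespace means the last word typed is complete, not a prefix.
    final_token_complete = raw_text != raw_text.rstrip()

    results = []
    for country, names in COUNTRIES.items():
        for name in names:
            if final_token_complete:
                # Query must be the whole name, or end exactly on a word boundary.
                matched = name == query or name.startswith(query + " ")
            else:
                # Last word may still be a prefix of a longer word.
                matched = name.startswith(query)
            if matched:
                results.append(country)
                break
    return results


print(autocomplete_countries("uk"))   # prefix match
print(autocomplete_countries("uk "))  # completed word
```

with this toy data, `uk` matches both entries while `uk ` (trailing space) matches only United Kingdom, which is the distinction the two requests above are meant to show.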

missinglink · Jul 20 '21 10:07