autocomplete: avoid trimming trailing spaces in sanitizer
for the autocomplete endpoint specifically, the presence of a trailing whitespace character in the input text has semantic value.
it indicates a word boundary: the final word typed is complete and is not a prefix of a longer word.
looking at the sanitiser code, we trim whitespace from both sides of the input before passing it down to the tokenizer and parser functions. those functions could use the trailing whitespace to make better decisions about what sort of queries to generate.
for example, the following queries should return different results, but currently return the same thing:
/v1/autocomplete?layers=country&text=uk
... should return
Ukraine
United Kingdom
/v1/autocomplete?layers=country&text=uk%20
... should *only* return
United Kingdom
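
a minimal sketch of the proposed behaviour, written in Python purely for illustration. it is not the project's sanitiser code. the function names (`sanitize_text`, `ends_with_word_boundary`, `autocomplete`) and the tiny in-memory `INDEX` with its `uk` alias are all hypothetical. the idea is to strip only leading whitespace, keep a single trailing space as a word-boundary marker, and switch the final token from prefix matching to exact matching when that marker is present.

```python
# Hypothetical in-memory index: display name -> lowercase names/aliases.
# The "uk" alias is an assumption made so the example matches the issue.
INDEX = {
    "Ukraine": ["ukraine"],
    "United Kingdom": ["united kingdom", "uk"],
}


def sanitize_text(raw: str) -> str:
    """Strip leading whitespace; collapse any trailing run to one space."""
    text = raw.lstrip()
    stripped = text.rstrip()
    if stripped != text:
        # Trailing whitespace was present: keep one space as a marker.
        return stripped + " "
    return stripped


def ends_with_word_boundary(text: str) -> bool:
    """True when the sanitized text ends in whitespace (final word complete)."""
    return text != "" and text[-1].isspace()


def autocomplete(raw: str) -> list[str]:
    text = sanitize_text(raw)
    query = text.strip().lower()
    if not query:
        return []
    complete = ends_with_word_boundary(text)
    results = []
    for name, aliases in INDEX.items():
        if complete:
            # Final word is complete: require an exact match.
            matched = query in aliases
        else:
            # Final word may still be a prefix: allow prefix matches.
            matched = any(alias.startswith(query) for alias in aliases)
        if matched:
            results.append(name)
    return results


print(autocomplete("uk"))   # text=uk
print(autocomplete("uk "))  # text=uk%20 after URL decoding
```

with this toy index, the first call matches both countries and the second matches only United Kingdom, which is the distinction described above. a real implementation would pass the word-boundary flag through to the tokenizer and parser rather than matching against a dictionary.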