
Make OLS API exactMatch=false accent insensitive

Open Angatar opened this issue 1 month ago • 1 comment

New feature description

With exactMatch=false, the V2 Search API matches case-insensitively but remains sensitive to accents. For example, searching for:

https://www.ebi.ac.uk/ols4/api/v2/ontologies/ictv/classes?page=0&size=20&search=sabi%C3%A1%20virus&exactMatch=false&includeObsoleteEntities=true

returns the expected result for “Sabiá virus”.

However, searching without the accent:

https://www.ebi.ac.uk/ols4/api/v2/ontologies/ictv/classes?page=0&size=20&search=sabia%20virus&exactMatch=false&includeObsoleteEntities=true

returns no match.
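To make the two requests easy to reproduce, here is a minimal Python sketch that builds both URLs from the search term. The helper name `build_search_url` is hypothetical and only for illustration; the endpoint and query parameters are copied from the URLs above. The sketch builds the URLs without sending any request.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the URLs in this issue.
BASE_URL = "https://www.ebi.ac.uk/ols4/api/v2/ontologies/ictv/classes"


def build_search_url(term: str) -> str:
    """Build the V2 search URL used in this issue (hypothetical helper)."""
    params = {
        "page": 0,
        "size": 20,
        "search": term,
        "exactMatch": "false",
        "includeObsoleteEntities": "true",
    }
    # quote_via=quote encodes spaces as %20 (not '+'), as in the URLs above.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


# "\u00e1" is the precomposed character "á".
accented_url = build_search_url("sabi\u00e1 virus")
unaccented_url = build_search_url("sabia virus")
print(accented_url)
print(unaccented_url)
```

The only difference between the two URLs is `sabi%C3%A1` versus `sabia` in the `search` parameter, which is the UTF-8 percent-encoding of "á" versus a plain "a".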

This limitation can affect users who do not type or paste exact diacritics in their search terms, particularly for taxonomic names with accented characters.

Use cases

  • Users searching virus taxonomic names often type queries without accents (e.g., “Sabia virus” instead of “Sabiá virus”), especially when data originate from sources that do not preserve diacritics.

  • Web tools, portals, or pipelines integrating the OLS API may rely on simplified input without accent normalization.

  • Accent-insensitive search would reduce false negatives in user queries and improve usability for users whose keyboards do not offer easy access to accented characters.

User communities

Broader communities using OLS for scientific ontologies in multilingual environments where accent-insensitive search improves accessibility.

Describe the solution you'd like

Extend the behavior of exactMatch=false to also normalize accents (e.g., “á” → “a”) during search processing.
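As an illustration of the normalization being requested, here is a minimal Python sketch of accent folding. It is not the OLS implementation (which I have not looked at); the function names `fold_accents` and `loose_equal` are hypothetical. It assumes that "accent-insensitive" means decomposing characters with Unicode NFD and dropping the combining marks, combined with the case-insensitive comparison that exactMatch=false already provides.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip diacritics by decomposing to NFD and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose whatever is left so the result is in a stable form.
    return unicodedata.normalize("NFC", stripped)


def loose_equal(query: str, label: str) -> bool:
    """Compare two strings ignoring both case and accents (assumed semantics)."""
    return fold_accents(query).casefold() == fold_accents(label).casefold()


folded = fold_accents("Sabi\u00e1 virus")                   # "Sabiá virus"
same = loose_equal("sabia virus", "Sabi\u00e1 virus")
different = loose_equal("sabia virus", "Sabi\u00e1 mammarenavirus")
print(folded, same, different)
```

Note that this approach only removes marks that decompose into a base letter plus a combining character; letters such as "ø" or "ß" have no such decomposition and would need an explicit mapping if they should also be folded.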

Describe alternatives you've considered

Alternatively, introduce a new parameter (e.g., accentInsensitive=true) to explicitly enable this normalization.

Angatar • Oct 20 '25 13:10