Add Support for "track_total_hits" to Accurately Track Total Hits Above 10K
Issue:
Django Haystack currently does not support the track_total_hits parameter, so hit counts are inaccurate when the total exceeds 10,000: by default, Elasticsearch 7 and later only tracks the total accurately up to 10,000 hits.
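To make the symptom concrete, here is a minimal sketch of how the capped count shows up in a response. It assumes the Elasticsearch 7+ response shape, where hits.total is an object with "value" and "relation" keys. The helper name describe_total and the sample responses are made up for illustration and are not part of Haystack.

```python
def describe_total(response):
    """Return (count, is_exact) for a search response dict.

    Assumes the Elasticsearch 7+ shape, where hits.total is an object.
    A relation of "gte" means the count is only a lower bound.
    """
    total = response["hits"]["total"]
    if isinstance(total, dict):
        return total["value"], total.get("relation", "eq") == "eq"
    # Older Elasticsearch versions returned a bare integer here.
    return total, True


# Hand-written sample responses (illustrative only, not real query output).
capped = {"hits": {"total": {"value": 10000, "relation": "gte"}, "hits": []}}
exact = {"hits": {"total": {"value": 48213, "relation": "eq"}, "hits": []}}

print(describe_total(capped))  # (10000, False)
print(describe_total(exact))   # (48213, True)
```

With track_total_hits enabled, the second shape is what a query matching more than 10,000 documents should return.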
Suggestion:
We can add an option in the Elasticsearch backend to enable track_total_hits, allowing users to get accurate total hit counts above 10,000.
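    # Only request exact counts when the project opts in; a missing setting means off.
    if getattr(settings, "TRACK_TOTAL_HITS", False):
        search_kwargs["track_total_hits"] = True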
This could be added within the search method of elasticsearch_backend.py, allowing users to enable accurate hit tracking.
Reference: Elasticsearch: Track Total Hits
That would be a good first pull-request. I might also want to think about how it could be made optional per query: track_total_hits is not enabled by default for performance reasons, and some users might want to be selective about when they use it.
If I were to add this feature, I would include a setting in settings.py:
HAYSTACK_TRACK_TOTAL_HITS = False
This would default to False for performance reasons. If set to True, track_total_hits would be enabled in the search method. I'll draft the pull request with these changes and update the documentation accordingly as soon as possible.
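As a rough sketch of how the setting could combine with the per-query opt-in suggested earlier in the thread: the helper names resolve_track_total_hits and build_search_kwargs and the track_total_hits keyword argument are hypothetical, and SimpleNamespace stands in for django.conf.settings so the example runs on its own. This is not how the Haystack backend is currently structured.

```python
from types import SimpleNamespace


def resolve_track_total_hits(settings, per_query=None):
    """Decide whether to request exact totals.

    per_query=None falls back to the project-wide setting (default False);
    an explicit True or False overrides it for a single query.
    """
    site_default = getattr(settings, "HAYSTACK_TRACK_TOTAL_HITS", False)
    return site_default if per_query is None else per_query


def build_search_kwargs(query_body, settings, track_total_hits=None):
    """Return a copy of the search body, adding track_total_hits when enabled."""
    search_kwargs = dict(query_body)
    if resolve_track_total_hits(settings, track_total_hits):
        search_kwargs["track_total_hits"] = True
    return search_kwargs


# Stand-in for django.conf.settings with the setting left undefined.
settings = SimpleNamespace()
body = {"query": {"match_all": {}}}

print(build_search_kwargs(body, settings))
print(build_search_kwargs(body, settings, track_total_hits=True))
```

Keeping the decision in one small function would let the default stay off while individual queries opt in where an exact count matters.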
Thanks again for the suggestion!