django-elasticsearch-dsl-drf
How to create a boolean query with both "should" and "must" clauses?
Questions
Hi @barseghyanartur. First, thanks for this great package. It has been extremely useful.
I was unable to find an answer to my question in the docs or by examining the source code, so I figured I'd ask here.
Basically, what I need to do is generate a boolean query where one part is in a must clause and the other is in a should clause. More specifically, the query I would like to generate is:
"query": {
    "bool": {
        "must": [
            {
                "multi_match": {
                    "fields": [ SOME_FIELDS ],
                    "operator": "and",
                    "query": "SOME QUERY TERMS"
                }
            }
        ],
        "should": [
            {
                "term": {
                    "SPECIFIC_FIELD": "SOME QUERY TERMS"
                }
            }
        ]
    }
},
The reason for the above is to boost a phrase match.
With that being said, whenever I try mixing the following backends:
filter_backends = [
    PhraseSearchFilterBackend,  # Custom
    MultiMatchSearchFilterBackend,
]
What ends up happening is that my term query lands in a must clause even though I specify matching="should" pretty much everywhere.
I even debugged this all the way down to base.py, where I confirmed matching="should", yet somehow the final query ends up entirely in "must".
Any ideas what I'm doing wrong?
For reference, here is my configuration:
class PaperDocumentView(DocumentViewSet):
    document = PaperDocument
    permission_classes = [ReadOnly]
    serializer_class = PaperDocumentSerializer
    pagination_class = LimitOffsetPagination
    lookup_field = 'id'
    filter_backends = [
        PhraseSearchFilterBackend,
        MultiMatchSearchFilterBackend,
        CompoundSearchFilterBackend,
        FacetedSearchFilterBackend,
        FilteringFilterBackend,
        PostFilterFilteringFilterBackend,
        DefaultOrderingFilterBackend,
        OrderingFilterBackend,
        HighlightBackend,
    ]
    search_fields = {
        'doi': {'boost': 3, 'fuzziness': 1},
        'title': {'boost': 2, 'fuzziness': 1},
        'raw_authors.full_name': {'boost': 1, 'fuzziness': 1},
        'abstract': {'boost': 1, 'fuzziness': 1},
        'hubs_flat': {'boost': 1, 'fuzziness': 1},
    }
    multi_match_search_fields = {
        'doi': {'boost': 3, 'fuzziness': 1},
        'title': {'boost': 2, 'fuzziness': 1},
        'raw_authors.full_name': {'boost': 1, 'fuzziness': 1},
        'abstract': {'boost': 1, 'fuzziness': 1},
        'hubs_flat': {'boost': 1, 'fuzziness': 1},
    }
    multi_match_options = {
        'operator': 'and'
    }
    post_filter_fields = {
        'hubs': 'hubs.name',
    }
    faceted_search_fields = {
        'hubs': 'hubs.name'
    }
    filter_fields = {
        'publish_date': 'paper_publish_date'
    }
    ordering = ('_score', '-hot_score', '-discussion_count', '-paper_publish_date')
    ordering_fields = {
        'publish_date': 'paper_publish_date',
        'discussion_count': 'discussion_count',
        'score': 'score',
        'hot_score': 'hot_score',
    }
    highlight_fields = {
        'raw_authors.full_name': {
            'field': 'raw_authors',
            'enabled': True,
            'options': {
                'pre_tags': ["<mark>"],
                'post_tags': ["</mark>"],
                'fragment_size': 1000,
                'number_of_fragments': 10,
            },
        },
        'title': {
            'enabled': True,
            'options': {
                'pre_tags': ["<mark>"],
                'post_tags': ["</mark>"],
                'fragment_size': 2000,
                'number_of_fragments': 1,
            },
        },
        'abstract': {
            'enabled': True,
            'options': {
                'pre_tags': ["<mark>"],
                'post_tags': ["</mark>"],
                'fragment_size': 5000,
                'number_of_fragments': 1,
            },
        }
    }
Wondering if someone here can help 🙏
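Override the base class get_queryset() and define your own queries there, rather than using the search filter backends:
from elasticsearch_dsl import Search
from elasticsearch_dsl.query import Bool

def get_queryset(self):
    # Get the search param from the request
    request = self.request
    text_raw = request.GET.get("search")
    # Placeholders: build each sub-query from text_raw
    query0 = ...  # multi_match query
    query1 = ...  # match query
    query2 = ...  # match_phrase query
    # etc... (tquery1, dquery1, tquery3, dquery3, item_url_query are defined likewise)
    # Combine the sub-queries as optional (should) clauses of one bool query
    q1 = Bool(should=[query0, query1, tquery1, dquery1, tquery3, dquery3, item_url_query])
    queryset = Search(using=self.client, index=self.index, doc_type=self.document._doc_type.name).query(q1)
    return queryset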
You will have finer control over your queries with this
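For the must/should shape from the original question, here is a minimal sketch of how the query body could be assembled as a plain dict. The helper name build_boosted_query and the field names in the usage lines (title, abstract, title.raw) are hypothetical and not part of the package. A dict like this should be loadable into an elasticsearch_dsl Search object via update_from_dict(), though I have not checked that against this exact setup.

```python
import json


def build_boosted_query(terms, fields, exact_field, operator="and"):
    """Return a bool query body: multi_match in must, term in should."""
    return {
        "query": {
            "bool": {
                # Required clause: the terms must match across the given fields.
                "must": [
                    {
                        "multi_match": {
                            "fields": list(fields),
                            "operator": operator,
                            "query": terms,
                        }
                    }
                ],
                # Optional clause: documents that also match it score higher.
                "should": [{"term": {exact_field: terms}}],
            }
        }
    }


# Hypothetical field names, for illustration only.
body = build_boosted_query("SOME QUERY TERMS", ["title", "abstract"], "title.raw")
print(json.dumps(body, indent=2))
```

Because the should clause sits next to a must clause, it does not restrict the result set; it only contributes to the score of documents that already satisfy must.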
This question comes up regularly. I'll add it to the FAQ, but TL;DR:
If you need a combination of ANDs and ORs, use SimpleQueryStringSearchFilterBackend. Check for examples here and in docs.
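To illustrate how ANDs and ORs compose in Elasticsearch's simple_query_string syntax (+ for AND, | for OR, double quotes for a phrase, parentheses for grouping), here is a rough sketch that builds such a query as a plain dict. The helper name build_simple_query_string and the field names are hypothetical, and it assumes the search text is plain whitespace-separated words with no quotes or operators of its own. It shows the query syntax only, not how the backend is configured on the view.

```python
def build_simple_query_string(terms, fields):
    """Match documents containing every word (AND) or the exact phrase (OR)."""
    words = terms.split()
    all_terms = " + ".join(words)            # word1 + word2 -> both required
    phrase = '"' + " ".join(words) + '"'     # "word1 word2" -> exact phrase
    return {
        "simple_query_string": {
            "query": f"({all_terms}) | {phrase}",
            "fields": list(fields),
        }
    }


# Hypothetical field names, for illustration only.
query = build_simple_query_string("deep learning", ["title", "abstract"])
print(query["simple_query_string"]["query"])  # (deep + learning) | "deep learning"
```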