Feature Request - Total hits for each group

Open felipehertzer opened this issue 2 years ago • 5 comments

Description

Would it be possible to add a count of hits for each group when grouping by a column? Currently there is a 'found' field, but it only counts the total for the search. Would it be possible to get the total number of hits per group without fetching all the hits?

Steps to reproduce

{
 "q": "test",
 "query_by": "title",
 "group_by": "author"
}

Expected Behavior

"grouped_hits": [{
  "group_key": [
    [
      "Test"
    ]
  ],
  "hits": [...],
  "found": 10  //add this line
}],
"out_of": 1727558,
"page": 1,
"found": 120

Actual Behavior

"grouped_hits": [{
  "group_key": [
    [
      "Test"
    ]
  ],
  "hits": [...]
}],
"out_of": 1727558,
"page": 1,
"found": 120

Metadata

Typesense Version: 0.23.0.rc55

OS: MacOS

felipehertzer avatar Apr 20 '22 01:04 felipehertzer

The current found field already counts the number of groups found. For example, if there are 6 records from 2 brands (3 each), then grouping by brand would produce a found value of 2.
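
To illustrate the distinction with the six-record example, here is a minimal Python sketch. It does not call Typesense: the `records` list and its `brand` values are made up for illustration, and the grouping is done locally only to contrast what the top-level found reports (number of groups) with a per-group hit count.

```python
from collections import Counter

# Hypothetical data: 6 records from 2 brands (3 each).
records = [
    {"id": "1", "brand": "A"},
    {"id": "2", "brand": "A"},
    {"id": "3", "brand": "A"},
    {"id": "4", "brand": "B"},
    {"id": "5", "brand": "B"},
    {"id": "6", "brand": "B"},
]

# Hits per group: a count of records for each distinct brand.
hits_per_group = Counter(r["brand"] for r in records)

# Number of distinct groups: what the top-level found reports when grouping.
groups_found = len(hits_per_group)

print(groups_found)          # 2
print(dict(hits_per_group))  # {'A': 3, 'B': 3}
```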

kishorenc avatar Apr 20 '22 02:04 kishorenc

I mean, the number of hits for each brand. I want to know how many hits each group has.

group_key brand 1 - 10 hits
group_key brand 2 - 30 hits
group_key brand 3 - 25 hits

felipehertzer avatar Apr 20 '22 02:04 felipehertzer

Got it, will add to our roadmap.

kishorenc avatar Apr 21 '22 08:04 kishorenc

Hey, I have two questions here:

  1. Is it possible to get the count per group (as this issue asks for) yet?
  2. The found column counts the number of groups. However, the out_of field counts the total number of records, rather than the number of groups. Is there a way to get the total number of found records when using group_by?

nandorojo avatar Aug 17 '22 21:08 nandorojo

Neither of those is implemented yet. These are useful asks; it's just a matter of getting around to implementing them.

kishorenc avatar Aug 19 '22 11:08 kishorenc

Two questions:

  1. Has anything been implemented in this regard?
  2. Once it's possible to get the requested hits per group, it would be great to be able to sort the results by this hit count. It would be handy to show the group with the most hits first in the results, descending from there ...

thanks for this wonderful search engine <3

dorjeduck avatar Nov 14 '22 23:11 dorjeduck

This is ready to be tested on recent 0.25 RC build (e.g. typesense/typesense:0.25.0.rc20).

  1. The JSON response for grouped hits now has a found field that shows the total number of hits found for each group.
  2. Sorting can be done via the _group_found:desc clause in sort_by, e.g. "sort_by": "_group_found:desc"
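
As a rough sketch of how a client might use this, the Python below builds search parameters with the `_group_found:desc` sort clause and reads the per-group `found` field from a response. The `sample_response` dictionary is hand-written for illustration and is not output from a real server; `group_counts` is a hypothetical helper name, and the author values are made up.

```python
from urllib.parse import urlencode

# Search parameters from the original request, plus the new sort clause.
params = {
    "q": "test",
    "query_by": "title",
    "group_by": "author",
    "sort_by": "_group_found:desc",
}
query_string = urlencode(params)


def group_counts(response):
    """Return (group_key, found) pairs, one per group, in response order."""
    return [
        (group["group_key"], group.get("found"))
        for group in response.get("grouped_hits", [])
    ]


# Hand-written sample, assumed to resemble a grouped search response.
sample_response = {
    "found": 2,
    "grouped_hits": [
        {"group_key": ["Author B"], "found": 30, "hits": []},
        {"group_key": ["Author A"], "found": 10, "hits": []},
    ],
}

print(query_string)
print(group_counts(sample_response))
```

Using `.get("found")` means the helper returns `None` for a group on builds that do not include the per-group field, rather than raising an error.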

kishorenc avatar Apr 03 '23 12:04 kishorenc

It appears to be returning an incorrect count for "found" when using an "||" (or) query.

For example: https://rqo7efp4k1m3xujwp-1.a1.typesense.net/collections/products/documents/search?query_by=NAME,+DIETS,+SEARCH_KEYWORDS,+DEPARTMENT,+CLASS,+SUBCLASS,+PRODUCT_TYPE,+FAMILY&facet_by=DEPARTMENT,+CLASS,+SUBCLASS,+PRODUCT_TYPE&group_by=BRAND&group_limit=3&max_facet_values=30&per_page=100&sort_by=_text_match(buckets:4):desc,+SALES:desc&q=condom&filter_by=IS_DISCONTINUED:FALSE+%26%26+STATUS:confirmed+%26%26+LOCATION_IDS:[330]%26%26+DEPARTMENT:[Smoke+Shop]%7C%7C+CLASS:[Sexual+Health]%7C%7C+SUBCLASS:[Contraceptive+Tablets]

returns 13 for the first group (Durex)

whereas https://rqo7efp4k1m3xujwp-1.a1.typesense.net/collections/products/documents/search?query_by=NAME,+DIETS,+SEARCH_KEYWORDS,+DEPARTMENT,+CLASS,+SUBCLASS,+PRODUCT_TYPE,+FAMILY&facet_by=DEPARTMENT,+CLASS,+SUBCLASS,+PRODUCT_TYPE&group_by=BRAND&group_limit=3&max_facet_values=30&per_page=100&sort_by=_text_match(buckets:4):desc,+SALES:desc&q=condom&filter_by=IS_DISCONTINUED:FALSE+%26%26+STATUS:confirmed+%26%26+LOCATION_IDS:[330]

returns the correct number (4) for the first group (Durex)

They are identical groups
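
For readability, the two filter_by values can be URL-decoded and compared side by side. This is only a sketch using Python's standard library; the encoded strings are copied from the two URLs in this comment, and the variable names are illustrative.

```python
from urllib.parse import unquote_plus

# filter_by values copied from the two URLs (still URL-encoded).
filter_with_or = (
    "IS_DISCONTINUED:FALSE+%26%26+STATUS:confirmed+%26%26+LOCATION_IDS:[330]"
    "%26%26+DEPARTMENT:[Smoke+Shop]%7C%7C+CLASS:[Sexual+Health]"
    "%7C%7C+SUBCLASS:[Contraceptive+Tablets]"
)
filter_without_or = (
    "IS_DISCONTINUED:FALSE+%26%26+STATUS:confirmed+%26%26+LOCATION_IDS:[330]"
)

# unquote_plus turns '+' into spaces and %26 / %7C into '&' / '|'.
decoded_with_or = unquote_plus(filter_with_or)
decoded_without_or = unquote_plus(filter_without_or)

print(decoded_with_or)
print(decoded_without_or)
```

The decoded forms show that the first filter is the second one followed by additional `&&` and `||` clauses, which is the only difference between the two requests.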

sreekotay avatar Sep 08 '23 18:09 sreekotay

@sreekotay

Thank you for reporting this issue. Can you please create a small reproducible example on a smaller test dataset (maybe with a handful of records)? That would help us in identifying the underlying issue.

kishorenc avatar Sep 09 '23 09:09 kishorenc