[new feature] Highlighting

Open eostis opened this issue 2 years ago • 17 comments

It would be nice to also retrieve keyword highlighting in the _additional field, as Solr does: https://solr.apache.org/guide/solr/latest/query-guide/highlighting.html.

This is a very important feature for any WooCommerce application. It can be used for all operators, including "Ask".

eostis avatar Jan 24 '23 08:01 eostis

It would also be useful for people who are considering using Weaviate instead of Elasticsearch.

ahmetugurmetus avatar Jan 26 '23 12:01 ahmetugurmetus

Such a feature would be great. Vespa.ai also supports this in the form of dynamic snippets and bolding.

medihack avatar Oct 02 '23 11:10 medihack

Indeed @medihack, I integrated vespa.ai with WordPress in wpsolr.com with dynamic snippets and bolding.

eostis avatar Oct 02 '23 11:10 eostis

this would be great

thearchitector avatar Jan 16 '24 20:01 thearchitector

Hi community! We are putting a bounty on this issue! Join us! /bounty $200

dudanogueira avatar May 13 '24 11:05 dudanogueira

💎 $200 bounty • Weaviate

Steps to solve:

  1. Start working: Comment /attempt #2567 with your implementation plan
  2. Submit work: Create a pull request including /claim #2567 in the PR body to claim the bounty
  3. Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to weaviate/weaviate!

Attempt           Started (GMT+0)              Solution
🔴 @abhishek818   May 14, 2024, 7:30:30 AM     WIP
🔴 @Subh231004    Jun 20, 2024, 12:00:32 PM    WIP
🔴 @JugalIwPtl    Jul 5, 2024, 4:55:22 AM      WIP

algora-pbc[bot] avatar May 13 '24 11:05 algora-pbc[bot]

@dudanogueira @eostis Could you briefly describe how you want this implemented, or provide a sample request and response? Should highlighting support multiple fields, be case-insensitive, and which other parameters (min/max length, etc.) are required?

Are we looking for something similar to the following?


{
  Get {
    Article (
      nearText: {
        concepts: ["fashion"],
      }
    ) {
      title
      _additional {
        highlight {       <----------------------------
          title
          product_description
        }
      }
    }
  }
}

Response:
{
  "data": {
    "Get": {
      "Article": [
        {
          "_additional": {
            "distance": 0.15422738,
            "id": "e76ec9ae-1b84-3995-939a-1365b2215312"
          },
          "title": "2020's biggest fashion trends reflect a world in crisis",
          "product_description": "random text"
        },
        {
          "_additional": {
            "distance": 0.15683109,
            "id": "a2d51619-dd22-337a-8950-e1a407dab3d2"
          },
          "title": "How to Dress Up For a Fashion Holiday Season",
          "product_description": "random text fashion"
        }
      ]
    }
  },
  "highlighting": {             <-----------------------------------
    "e76ec9ae-1b84-3995-939a-1365b2215312": {
      "title": "2020's biggest <em>fashion</em> trends reflect a world in crisis"
    },
    "a2d51619-dd22-337a-8950-e1a407dab3d2": {
      "product_description": "random text <em>fashion</em>"
    }
  }
}
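
To make the proposed response shape concrete, here is a minimal Python sketch of how a "highlighting" map keyed by object id could be derived from the returned objects. It is only an illustration, not Weaviate code: `build_highlighting` and its parameters are hypothetical names, matching is assumed to be a plain case-insensitive substring match, and the `<em>` markers are assumed defaults.

```python
import re


def build_highlighting(objects, terms, fields, prefix="<em>", postfix="</em>"):
    """Return {object id: {field: highlighted text}} for fields that contain a term.

    Hypothetical helper: assumes each object carries its id under
    obj["_additional"]["id"], as in the sample response above.
    """
    if not terms:
        return {}
    # One case-insensitive pattern that matches any of the query terms literally.
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    highlighting = {}
    for obj in objects:
        hits = {}
        for field in fields:
            text = obj.get(field)
            if isinstance(text, str) and pattern.search(text):
                # Wrap every occurrence, preserving the original casing.
                hits[field] = pattern.sub(
                    lambda m: f"{prefix}{m.group(0)}{postfix}", text
                )
        if hits:
            highlighting[obj["_additional"]["id"]] = hits
    return highlighting


articles = [
    {
        "_additional": {"id": "e76ec9ae-1b84-3995-939a-1365b2215312"},
        "title": "2020's biggest fashion trends reflect a world in crisis",
        "product_description": "random text",
    },
    {
        "_additional": {"id": "a2d51619-dd22-337a-8950-e1a407dab3d2"},
        "title": "How to Dress Up For a Fashion Holiday Season",
        "product_description": "random text fashion",
    },
]
highlighting = build_highlighting(articles, ["fashion"], ["title", "product_description"])
```

Because the match is case-insensitive here, the second article's title ("Fashion") would be highlighted as well as its description; whether that is the desired behavior is one of the open questions above.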

abhishek818 avatar May 14 '24 07:05 abhishek818

/attempt #2567

Algora profile    Completed bounties            Tech
@abhishek818      3 bounties from 3 projects    JavaScript, TypeScript

abhishek818 avatar May 14 '24 07:05 abhishek818

Looks good to me.

I usually use additional parameters:

  • number of fragments
  • fragment size (in characters).
  • prefix ("<em>")
  • postfix ("</em>")

Each field would yield a list of fragments (at most the maximum number of fragments): "product_description": ["surrounding text <em>fashion</em> surrounding text", "surrounding text <em>fashionable</em> surrounding text"]

In my client code, I then concatenate the fragments from each field for display.
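
The four parameters listed above could fit together roughly as in the following Python sketch. It is a hedged illustration only: `highlight_fragments`, `join_fragments`, and their argument names are hypothetical, the fragment size is assumed to count characters of source text (markers excluded), and matching is a literal case-insensitive substring match, so "fashionable" would come back as `<em>fashion</em>able` rather than fully highlighted as in the example above (that would need stemming or tokenization).

```python
import re


def highlight_fragments(text, term, max_fragments=2, fragment_size=40,
                        prefix="<em>", postfix="</em>"):
    """Return up to max_fragments snippets of about fragment_size characters,
    each centered on a match of term, with matches wrapped in prefix/postfix."""
    if not term:
        return []
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    fragments = []
    window_end = 0  # end of the last emitted window, to avoid overlapping snippets
    for match in pattern.finditer(text):
        if len(fragments) >= max_fragments:
            break
        if match.start() < window_end:
            continue  # this match is already inside the previous fragment
        # Split the remaining character budget evenly around the match.
        margin = max(fragment_size - len(match.group(0)), 0) // 2
        start = max(match.start() - margin, 0)
        end = min(match.end() + margin, len(text))
        window = text[start:end]
        fragments.append(pattern.sub(lambda m: prefix + m.group(0) + postfix, window))
        window_end = end
    return fragments


def join_fragments(fragments_by_field, separator=" ... "):
    """Concatenate the fragments of every field into one display string."""
    return separator.join(
        fragment
        for fragments in fragments_by_field.values()
        for fragment in fragments
    )
```

A client could call `highlight_fragments` once per requested field and pass the resulting `{field: [fragments]}` mapping to `join_fragments` for display.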

eostis avatar May 14 '24 08:05 eostis

/attempt #2567

Subh231004 avatar Jun 20 '24 12:06 Subh231004

@Subh231004 I didn't proceed with this because I am not sure whether the repo maintainers or bounty creators are active on bounty issues; you can see other bounty issues still open and PRs left inactive.

Up to you.

abhishek818 avatar Jun 20 '24 13:06 abhishek818

@abhishek818 hello :)

The bounty program is still active, we are just a bit behind on reviews. The existing PRs will be addressed shortly.

parkerduckworth avatar Jun 20 '24 21:06 parkerduckworth

@parkerduckworth thanks for the heads up. Could you, or other repo owners you can tag, confirm the requirements listed in the comments here and here2?

abhishek818 avatar Jun 21 '24 06:06 abhishek818

@parkerduckworth Is this open? Can I attempt it?

Subh231004 avatar Jun 21 '24 12:06 Subh231004

@parkerduckworth thanks for the heads up. Could you, or other repo owners you can tag, confirm the requirements listed in the comments here and here2?

@parkerduckworth just bumping this @antas-marcin @dirkkul (tagging some active maintainers, sorry for the mentions)

abhishek818 avatar Jul 21 '24 20:07 abhishek818