weaviate
[new feature] Highlighting
It would be nice to also retrieve keyword highlighting in the _additional field, as Solr does: https://solr.apache.org/guide/solr/latest/query-guide/highlighting.html.
This is a very important feature for any WooCommerce application. It can be used for all operators, including "Ask".
It will also be useful for people who are considering using Weaviate instead of Elasticsearch.
Such a feature would be great. Vespa.ai also supports this in the form of dynamic snippets and bolding.
Indeed @medihack, I integrated vespa.ai with WordPress in wpsolr.com with dynamic snippets and bolding.
this would be great
Hi community! We are putting a bounty on this issue! Join us! /bounty $200
💎 $200 bounty • Weaviate
Steps to solve:
- Start working: Comment /attempt #2567 with your implementation plan
- Submit work: Create a pull request including /claim #2567 in the PR body to claim the bounty
- Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts
Thank you for contributing to weaviate/weaviate!
Attempt | Started (GMT+0) | Solution |
---|---|---|
🔴 @abhishek818 | May 14, 2024, 7:30:30 AM | WIP |
🔴 @Subh231004 | Jun 20, 2024, 12:00:32 PM | WIP |
🔴 @JugalIwPtl | Jul 5, 2024, 4:55:22 AM | WIP |
@dudanogueira @eostis Can you briefly describe how we want to implement this, or provide a sample request with its response? Should highlighting support multiple fields, be case insensitive, and which other parameters (such as min/max length) are required?
Are we looking for something similar to the request below?
{
  Get {
    Article(
      nearText: {
        concepts: ["fashion"]
      }
    ) {
      title
      _additional {
        highlight {  # <-- proposed field
          title
          product_description
        }
      }
    }
  }
}
Response (the proposed addition is the top-level "highlighting" object):
{
  "data": {
    "Get": {
      "Article": [
        {
          "_additional": {
            "distance": 0.15422738,
            "id": "e76ec9ae-1b84-3995-939a-1365b2215312"
          },
          "title": "2020's biggest fashion trends reflect a world in crisis",
          "product_description": "random text"
        },
        {
          "_additional": {
            "distance": 0.15683109,
            "id": "a2d51619-dd22-337a-8950-e1a407dab3d2"
          },
          "title": "How to Dress Up For a Fashion Holiday Season",
          "product_description": "random text fashion"
        }
      ]
    }
  },
  "highlighting": {
    "e76ec9ae-1b84-3995-939a-1365b2215312": {
      "title": "2020's biggest <em>fashion</em> trends reflect a world in crisis"
    },
    "a2d51619-dd22-337a-8950-e1a407dab3d2": {
      "product_description": "random text <em>fashion</em>"
    }
  }
}
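To make the proposed response shape concrete, here is a minimal client-side sketch in Python of how the separate "highlighting" map, keyed by object id, could be joined back onto the result objects. The response layout is only the proposal from this comment, not an existing Weaviate API, and the helper name attach_highlights and the output key "highlight" are hypothetical.

```python
def attach_highlights(response: dict, class_name: str) -> list:
    """Return the result objects of class_name, each with a 'highlight' dict
    looked up by the object's _additional.id (empty when nothing matched)."""
    # Assumption: highlights live in a top-level map keyed by object id,
    # as in the proposed response, not in any released Weaviate API.
    highlighting = response.get("highlighting", {})
    merged = []
    for obj in response["data"]["Get"][class_name]:
        object_id = obj["_additional"]["id"]
        merged.append({**obj, "highlight": highlighting.get(object_id, {})})
    return merged


# Trimmed-down copy of the proposed response, for illustration only.
sample_response = {
    "data": {"Get": {"Article": [
        {"_additional": {"id": "e76ec9ae-1b84-3995-939a-1365b2215312"},
         "title": "2020's biggest fashion trends reflect a world in crisis",
         "product_description": "random text"},
        {"_additional": {"id": "a2d51619-dd22-337a-8950-e1a407dab3d2"},
         "title": "How to Dress Up For a Fashion Holiday Season",
         "product_description": "random text fashion"},
    ]}},
    "highlighting": {
        "a2d51619-dd22-337a-8950-e1a407dab3d2": {
            "product_description": "random text <em>fashion</em>"
        }
    },
}

articles = attach_highlights(sample_response, "Article")
```

A map keyed by id follows the Solr layout. Nesting the highlight under each object's _additional, as the request suggests, would make this join step unnecessary.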
/attempt #2567
Algora profile | Completed bounties | Tech | Active attempts | Options |
---|---|---|---|---|
@abhishek818 | 3 bounties from 3 projects | JavaScript, TypeScript | | Cancel attempt |
Looks good to me.
I usually use additional parameters:
- number of fragments
- fragment size (in characters)
- prefix ("<em>")
- postfix ("</em>")
Each field would yield a list of up to (max number of fragments) fragments:
"product_description": ["surrounding text <em>fashion</em> surrounding text", "surrounding text <em>fashionable</em> surrounding text"]
In my client code, I finally concatenate the fragments from each field for display.
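The four parameters above can be sketched in Python as follows. This is only an illustration of the behaviour described in this comment, not Weaviate's or Solr's implementation. The function name highlight_fragments and its parameter names are hypothetical, and matching is assumed to be case-insensitive and prefix-based, so that "fashion" also hits "fashionable".

```python
import re


def highlight_fragments(text, terms, max_fragments=2, fragment_size=40,
                        prefix="<em>", postfix="</em>"):
    """Return up to max_fragments snippets of roughly fragment_size characters,
    each built around a matching word, with matches wrapped in prefix/postfix."""
    if not terms:
        return []
    # Assumption: a term matches any word that starts with it, ignoring case.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\w*",
        re.IGNORECASE,
    )
    matches = list(pattern.finditer(text))
    fragments = []
    covered_until = 0  # end offset of the most recently emitted window
    for hit in matches:
        if len(fragments) >= max_fragments:
            break
        if hit.end() <= covered_until:
            continue  # this hit is fully inside the previous fragment
        # Center a window of about fragment_size characters on the hit.
        padding = max(fragment_size - len(hit.group()), 0) // 2
        start = max(hit.start() - padding, 0)
        end = min(hit.end() + padding, len(text))
        covered_until = end
        # Rebuild the window, wrapping every match that lies fully inside it.
        pieces, cursor = [], start
        for m in matches:
            if m.start() >= start and m.end() <= end:
                pieces.append(text[cursor:m.start()])
                pieces.append(prefix + m.group() + postfix)
                cursor = m.end()
        pieces.append(text[cursor:end])
        fragments.append("".join(pieces))
    return fragments


snippets = highlight_fragments("random text fashion", ["fashion"])
display_text = " ... ".join(snippets)  # client-side concatenation for display
```

The windows are cut at character offsets, so a fragment may start or end in the middle of a word. A real implementation would probably snap to word or sentence boundaries.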
@Subh231004 I didn't proceed with this, as I am not sure whether the repo maintainers or bounty creators are active on bounty issues; you can see other bounty issues still open and PRs left inactive.
Up to you.
@abhishek818 hello :)
The bounty program is still active, we are just a bit behind on reviews. The existing PRs will be addressed shortly.
@parkerduckworth thanks for the heads up. Can you, or other repo owners you can tag, confirm the requirements listed in the comments here and here2?
@parkerduckworth Is this still open? Can I attempt it?