Add a task that creates common association from a group of chunks

Open Vasilije1990 opened this issue 7 months ago • 21 comments

Chunk Associations Task

NOTE: This issue is part of Contribute-to-Win. Please comment first to get assigned. Read the details here

A task for creating semantic associations between document chunks and organizing them into common node sets based on their similarity.

Overview

The Chunk Associations Task analyzes document chunks for semantic similarity and creates weighted edges, labeled Association, between highly associated chunks. An LLM classifier, driven by a prompt, should output for each candidate whether we should create an association or not.
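As a rough illustration of the classifier step, here is a minimal sketch of a prompt builder and a reply parser. Everything in it is an assumption made for illustration: the prompt wording, the JSON reply shape, and the names `build_association_prompt` and `parse_decision` are hypothetical and are not part of cognee's API. The actual LLM call is left out on purpose.

```python
import json
from typing import Tuple

# Hypothetical prompt; the real prompt for this task is not specified in the issue.
PROMPT_TEMPLATE = (
    "Decide whether the two text chunks below discuss the same topic closely "
    "enough to be linked.\n"
    'Answer with JSON only: {{"associate": true or false, '
    '"weight": number between 0 and 1}}\n\n'
    "Chunk A:\n{a}\n\nChunk B:\n{b}\n"
)


def build_association_prompt(chunk_a: str, chunk_b: str) -> str:
    """Fill the template with the two chunk texts."""
    return PROMPT_TEMPLATE.format(a=chunk_a, b=chunk_b)


def parse_decision(raw: str) -> Tuple[bool, float]:
    """Parse the classifier reply; anything malformed means 'no association'."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False, 0.0
    if not isinstance(data, dict) or data.get("associate") is not True:
        return False, 0.0
    try:
        weight = float(data.get("weight", 0.0))
    except (TypeError, ValueError):
        return False, 0.0
    # Clamp the weight into [0, 1] so downstream edges stay comparable.
    return True, min(max(weight, 0.0), 1.0)
```

Treating unparseable replies as "no association" is one possible design choice; it keeps a misbehaving model from creating spurious edges.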

Example

Book 1: a chunk talks about river dolphins. Book 2: a chapter about river dolphins.

Book 1 - Chunk 1 -> weighted Association edge to the Book 2 chapter about river dolphins

Input

The task should receive a batch of N chunks retrieved from an existing graph

Output

Datapoints with weighted edges that can be stored in the graph. They can belong to a special Association nodeset, or we can update and delete existing data.
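The input and output described above could be sketched roughly as follows. This is not cognee's implementation: `Chunk`, `AssociationEdge`, `associate_chunks`, and `word_overlap_decider` are hypothetical names, and the word-overlap decider is only a stand-in for the LLM classifier so that the sketch runs without a model.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Chunk:  # hypothetical stand-in for a chunk retrieved from the graph
    id: str
    text: str


@dataclass(frozen=True)
class AssociationEdge:  # hypothetical edge datapoint
    source_id: str
    target_id: str
    relationship: str
    weight: float


# A decider takes two chunk texts and returns (create_edge, weight).
Decider = Callable[[str, str], Tuple[bool, float]]


def associate_chunks(chunks: List[Chunk], decide: Decider) -> List[AssociationEdge]:
    """Compare every pair in the batch and keep the pairs the decider accepts."""
    edges = []
    for a, b in combinations(chunks, 2):
        should_link, weight = decide(a.text, b.text)
        if should_link:
            edges.append(AssociationEdge(a.id, b.id, "Association", weight))
    return edges


def word_overlap_decider(text_a: str, text_b: str, threshold: float = 0.3) -> Tuple[bool, float]:
    """Placeholder for the LLM classifier: Jaccard overlap of lowercase words."""
    words_a, words_b = set(text_a.lower().split()), set(text_b.lower().split())
    union = words_a | words_b
    score = len(words_a & words_b) / len(union) if union else 0.0
    return score >= threshold, score


batch = [
    Chunk("book1-chunk1", "river dolphins live in the amazon"),
    Chunk("book2-chapter", "river dolphins of the amazon basin"),
    Chunk("book3-chunk1", "compiler optimization passes"),
]
edges = associate_chunks(batch, word_overlap_decider)
```

Note that comparing all pairs costs one classifier call per pair, so a batch of N chunks needs N(N-1)/2 calls; a real task would probably pre-filter candidates before asking the LLM.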

Vasilije1990 avatar Sep 05 '25 15:09 Vasilije1990

I would like to solve this issue. Please assign it to me.

Sumeet2005 avatar Sep 14 '25 17:09 Sumeet2005

thanks for your interest @Sumeet2005 , the issue is assigned to you!

hande-k avatar Sep 15 '25 08:09 hande-k

hey @Sumeet2005 , how is the progress? Do you have a question? As this issue is a part of the challenge, we want to have quick iterations :) please update us! the issue will be un-assigned if no PR is opened in the next 24 hrs

hande-k avatar Sep 17 '25 11:09 hande-k

hi @Sumeet2005 , sorry to inform you but since we haven't heard back from you, we'll make this issue available for other contributors to pick up.

If you submit your PR before another contributor requests to contribute, you may be reassigned to the issue and we can review your PR.

hande-k avatar Sep 19 '25 07:09 hande-k

Hi, I'd love to work on this! Can I get some more context on the task, like the project structure and where I can integrate the logic? Thanks!

AniLeo-01 avatar Sep 26 '25 15:09 AniLeo-01

Hi, is this task completed, or can I be assigned to this task?

DDiyash avatar Sep 29 '25 03:09 DDiyash

hey @AniLeo-01, the issue is assigned to you, thanks for picking it up :)

This task can be implemented as part of memify. We look forward to your PR!

hande-k avatar Oct 01 '25 13:10 hande-k

Thanks @hande-k!

AniLeo-01 avatar Oct 01 '25 14:10 AniLeo-01

hey @AniLeo-01, wanted to check if you have any questions :) please let us know!

hande-k avatar Oct 06 '25 12:10 hande-k

This is actually a good idea. Would it work with the current nodeset implementation? As far as I remember, nodesets have some flaws, for example with deduplication: a person who is already in one nodeset gets deduplicated instead of belonging to more nodesets (off the top of my head; I was looking into using nodesets the other day).

Varming73 avatar Nov 05 '25 07:11 Varming73

I wanna try it, could you assign it to me? @hande-k @Vasilije1990

EricXiao95 avatar Nov 13 '25 15:11 EricXiao95

hey @EricXiao95 good to see you :) done!

hande-k avatar Nov 17 '25 14:11 hande-k

@EricXiao95 @hande-k @Vasilije1990 Hello! I am curious if this is being worked on? Otherwise, I would like to get assigned! Thank you.

kckoh avatar Nov 24 '25 13:11 kckoh

hi @kckoh thanks for your interest! let's give @EricXiao95 a day to see it then feel free to start cracking :)

hande-k avatar Nov 24 '25 14:11 hande-k

@hande-k Would you mind assigning this to me? I’d love to get started on it.

kckoh avatar Nov 25 '25 14:11 kckoh

Hi @hande-k @kckoh, sorry for the late reply! I missed the notification. I am actually working on this. I'll have a PR ready by the end of this week. Thanks for checking in!

EricXiao95 avatar Nov 26 '25 03:11 EricXiao95

Hi @EricXiao95 @hande-k! No worries about the delay. I actually started working on this yesterday and I'm nearly finished. @EricXiao95 - would you be open to collaborating? Or we could compare approaches when we're both done? Happy to coordinate to avoid duplicate effort. @hande-k - what do you think is the best path forward here?

kckoh avatar Nov 26 '25 03:11 kckoh

@Vasilije1990 @EricXiao95 @hande-k I've created a PR. Could you please take a look and review?

kckoh avatar Nov 26 '25 05:11 kckoh

Hi @kckoh, thanks for the update. Since the PR is already up, please go ahead with yours. Thanks @hande-k!

EricXiao95 avatar Nov 26 '25 08:11 EricXiao95

Thanks both @EricXiao95 @kckoh! the pr will be reviewed soon :)

hande-k avatar Nov 27 '25 15:11 hande-k

Looking forward to contributing more to cognee :)

kckoh avatar Nov 28 '25 02:11 kckoh