
Import emails from an inbox properly

Open nileshtrivedi opened this issue 1 year ago • 1 comments

I would like to try creating a knowledge graph out of my email inbox. All emails are stored in AWS WorkMail and also, in raw form, in an S3 bucket. The problem is that these emails are not plain text files that are ready to ingest into a knowledge graph. Many of them have attached documents, which should be processed as well. I could split the email body from the attachments and store them in a separate bucket, but that would lose the information about which file was attached to which email.

It would be great if this tool could import emails directly from an inbox: either over POP/IMAP, or from an S3 bucket that is expected to contain raw email messages.
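To make the request concrete, here is a minimal sketch of the parsing step, using only Python's standard `email` package. It is not part of llm-graph-builder; the function name `split_email` and the shape of the returned dictionary are made up for illustration. It assumes each S3 object holds one raw RFC 5322 message that has already been downloaded as bytes.

```python
from email import message_from_bytes, policy


def split_email(raw: bytes) -> dict:
    """Split one raw email into body and attachments (hypothetical helper).

    The Message-ID is kept on the record so that every attachment stays
    linked to the email it arrived with.
    """
    # policy.default yields an EmailMessage, which offers get_body()
    # and iter_attachments().
    msg = message_from_bytes(raw, policy=policy.default)

    # Prefer the plain-text body and fall back to HTML if that is all there is.
    body_part = msg.get_body(preferencelist=("plain", "html"))
    body = body_part.get_content() if body_part is not None else ""

    attachments = [
        {
            "filename": part.get_filename(),
            "content_type": part.get_content_type(),
            # Decoded bytes; None for nested multipart attachments
            # such as forwarded messages.
            "payload": part.get_payload(decode=True),
        }
        for part in msg.iter_attachments()
    ]

    return {
        "message_id": str(msg["Message-ID"] or ""),
        "subject": str(msg["Subject"] or ""),
        "from": str(msg["From"] or ""),
        "body": body,
        "attachments": attachments,
    }
```

Each attachment could then be ingested as its own document, with a relationship back to the email identified by `message_id`, instead of being dropped into a separate bucket with no link to its source.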

nileshtrivedi avatar Jul 09 '24 11:07 nileshtrivedi

I think this problem is conceptually similar to ingesting Zip files that contain multiple related documents. Naively unzipping before ingestion would lose the relations between those documents, which is what we want to avoid.
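As a rough illustration of the Zip case, the sketch below records which archive each member came from before anything is extracted, using the standard `zipfile` module. The function name `archive_relations` is hypothetical, and how these pairs would be stored in the graph is left open.

```python
import zipfile


def archive_relations(path: str) -> list[tuple[str, str]]:
    """Return (archive_path, member_name) pairs for every file in a Zip archive."""
    with zipfile.ZipFile(path) as zf:
        # Skip directory entries; only real files become documents.
        return [(path, info.filename) for info in zf.infolist() if not info.is_dir()]
```

Keeping these pairs alongside the extracted files means the "came from the same archive" relation survives ingestion.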

Does anyone have pointers to reading material about building a knowledge graph out of an email inbox?

nileshtrivedi avatar Jul 09 '24 12:07 nileshtrivedi

@jexp

kartikpersistent avatar Nov 24 '24 10:11 kartikpersistent