docs: add Weaviate RAG recipe notebook

Open m-newhauser opened this issue 1 year ago • 2 comments

Description

Added a new recipe for using Weaviate with Docling for RAG workflows. The notebook demonstrates how to:

  • Parse machine learning papers from arXiv using Docling
  • Perform hierarchical chunking of the documents using Docling
  • Generate text embeddings with OpenAI
  • Perform RAG over the articles using Weaviate
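
The flow of the four steps above can be sketched in plain Python. This is a toy illustration only, not the notebook's code: it does not call Docling, OpenAI, or Weaviate. The `PAPER` data, the bag-of-words `embed` function, and the in-memory `retrieve` function are hypothetical stand-ins for the parsed document, the embedding call, and the vector search.

```python
import math
from collections import Counter

# Hypothetical stand-in for a parsed paper: (heading, paragraphs) sections.
# In the notebook this structure would come from Docling, not be hand-written.
PAPER = [
    ("Introduction", ["Transformers rely on attention.", "Attention weighs tokens."]),
    ("Method", ["We chunk documents by section.", "Each chunk keeps its heading."]),
]

def hierarchical_chunks(sections):
    """Emit one chunk per paragraph, prefixed with its section heading."""
    return [f"{heading}: {para}" for heading, paras in sections for para in paras]

def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().replace(".", "").replace(":", "").split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query; a stand-in for a vector search."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = hierarchical_chunks(PAPER)
top = retrieve("how are documents chunked", chunks)
```

In a RAG setup the retrieved chunks would then be passed to a language model as context for answering the query.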

The original notebook can be found here.

Checklist:

  • [x] Examples have been added, if necessary.

m-newhauser avatar Nov 27 '24 11:11 m-newhauser

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • [X] title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
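
To try a title against this rule locally, here is a minimal sketch using Python's `re` module. The pattern is copied from the rule above; the helper name `is_conventional` is hypothetical, and the bot's own matching semantics may differ in detail.

```python
import re

# Pattern copied verbatim from the merge-protection rule.
CONVENTIONAL_TITLE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)

def is_conventional(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix."""
    return CONVENTIONAL_TITLE.match(title) is not None

# The title of this PR passes; a title without a type prefix does not.
ok = is_conventional("docs: add Weaviate RAG recipe notebook")
bad = is_conventional("Add Weaviate RAG recipe notebook")
```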

mergify[bot] avatar Nov 27 '24 11:11 mergify[bot]

Hi @m-newhauser 👋 Just a question: is it possible to run this notebook without the OpenAI API?

thukabjj avatar Dec 11 '24 17:12 thukabjj

@m-newhauser, thanks for the update. The one thing still missing is satisfying our DCO check, for which I see two options:

  1. either you make sure all the commits in the PR are signed off with your GitHub email (this involves going back in git history and potentially risky operations like force-pushing your branch)
  2. or you just let us know your GitHub email address and we reshape the commits to satisfy DCO (attributing authorship accordingly).

If you are unsure how to go about 1, we are fine with 2 too.
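
For reference, a commit counts as signed off when its message ends with a `Signed-off-by:` trailer, which `git commit -s` adds. The sketch below checks a message for such a trailer. It is a simplified assumption of what a DCO check looks for, not the actual bot's logic; the helper name and the `jane@example.com` address are hypothetical.

```python
import re

# Matches trailer lines of the form "Signed-off-by: Name <email>".
SIGNOFF = re.compile(
    r"^Signed-off-by: (?P<name>.+) <(?P<email>[^<>]+)>$", re.MULTILINE
)

def is_signed_off(message: str, author_email: str) -> bool:
    """Return True if the message has a Signed-off-by trailer for author_email."""
    return any(
        m.group("email").lower() == author_email.lower()
        for m in SIGNOFF.finditer(message)
    )

# Hypothetical commit messages for illustration.
signed = "docs: add notebook\n\nSigned-off-by: Jane Doe <jane@example.com>\n"
unsigned = "docs: add notebook\n"

signed_ok = is_signed_off(signed, "jane@example.com")
unsigned_ok = is_signed_off(unsigned, "jane@example.com")
```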

vagenas avatar Dec 19 '24 09:12 vagenas

@vagenas Option #2 would be great 🙏. My GitHub email address is [email protected]

m-newhauser avatar Dec 19 '24 17:12 m-newhauser