The pipeline does not include any posts or repositories

Open elcapo opened this issue 4 months ago • 0 comments

Although Chapter 4 covers the implementation of dispatchers and handlers for cleaning, chunking and embedding posts, articles and repositories, the current version of the pipeline only includes articles (from several sources).

I guess this is because LinkedIn and GitHub make it difficult to build crawlers, since they may protect their endpoints behind a username and password. But I'd expect the "import data warehouse from JSON" step to work without an internet connection. In particular, this command:

poetry poe run-import-data-warehouse-from-json

To achieve that, can the corresponding files...

be populated?
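
For anyone who wants to check this locally, here is a minimal sketch that counts the top-level records in each JSON file of a directory. The directory path and the assumption that each file holds a JSON array of documents are guesses on my part, not something I have confirmed against the repository, and `count_records` is a helper name I made up for illustration.

```python
import json
from pathlib import Path


def count_records(data_dir):
    """Return {file name: number of top-level records} for each *.json file."""
    counts = {}
    for path in sorted(Path(data_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        # A JSON array counts one record per element; any other value counts as one.
        counts[path.name] = len(data) if isinstance(data, list) else 1
    return counts


if __name__ == "__main__":
    # Hypothetical path: adjust to wherever the exported JSON files actually live.
    for name, n in count_records("data/data_warehouse_raw_data").items():
        print(f"{name}: {n} record(s)")
```

If the files for posts and repositories show zero records (or are missing altogether), that would match what I am seeing after the import.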

elcapo, Aug 17 '25 21:08