langchainjs

Create BaseDocumentLoader, BaseCheerioLoader, and loaders

Open samheutmaker opened this issue 1 year ago • 0 comments

This PR is an attempt to add the foundational code needed for document loaders. It also adds a BaseCheerioLoader that uses cheerio in place of BeautifulSoup for DOM traversal. Building on the cheerio loader, it adds loaders for:

  • Hacker News
  • College Confidential
  • AZLyrics
  • IMSDB

It also adds loaders for:

  • Text Files
  • SRT Files

These loaders behave identically to the corresponding loaders in Python LangChain (PyLangChain?). This PR depends on previous work in PR #50.
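For readers who want a feel for the shape of these loaders, here is a minimal sketch of a base loader plus an SRT loader. It is not the code from this PR: `LoadedDocument`, `parseSrt`, and `SrtStringLoader` are hypothetical names chosen for illustration, the loader takes the SRT content as a string rather than reading a file, and joining all subtitle text into a single document is an assumption about how the Python loader behaves.

```typescript
// Hypothetical document shape, for illustration only; the PR's types may differ.
interface LoadedDocument {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Sketch of a base class: every loader resolves to a list of documents.
abstract class BaseDocumentLoader {
  abstract load(): Promise<LoadedDocument[]>;
}

// Extracts subtitle text from SRT content, dropping cue indices and timestamps.
function parseSrt(raw: string): string {
  // Normalize Windows line endings, then split into cues on blank lines.
  const cues = raw.replace(/\r\n/g, "\n").trim().split(/\n{2,}/);
  const textLines: string[] = [];
  for (const cue of cues) {
    // Line 1 is the cue index, line 2 the timing; the rest is subtitle text.
    const [, , ...text] = cue.split("\n");
    for (const line of text) {
      const trimmed = line.trim();
      if (trimmed.length > 0) {
        textLines.push(trimmed);
      }
    }
  }
  return textLines.join(" ");
}

// Hypothetical loader that wraps already-read SRT content.
class SrtStringLoader extends BaseDocumentLoader {
  private raw: string;
  private source: string;

  constructor(raw: string, source: string) {
    super();
    this.raw = raw;
    this.source = source;
  }

  async load(): Promise<LoadedDocument[]> {
    return [
      { pageContent: parseSrt(this.raw), metadata: { source: this.source } },
    ];
  }
}
```

Calling `new SrtStringLoader(content, "movie.srt").load()` would resolve to a one-element array whose `pageContent` holds the subtitle text without indices or timings. A file-based loader could read the file first and delegate to the same parsing function.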

samheutmaker, Feb 19 '23 15:02