Add Xorbits Dataframe as a Document Loader

Open yifeis7 opened this issue 2 years ago • 3 comments

  • Xorbits is an open-source computing framework that makes it easy to scale data science and machine learning workloads in parallel. Xorbits can use multiple cores or GPUs to accelerate computation on a single machine, or scale out to thousands of machines to process terabytes of data.

  • This PR adds support for the Xorbits document loader, which lets langchain use Xorbits to parallelize and distribute data loading.

  • Dependencies: Using this loader requires the Xorbits library, which can be installed with pip install xorbits

  • Request for review: @rlancemartin, @eyurtsev

  • Twitter handle: https://twitter.com/Xorbitsio
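
For readers unfamiliar with dataframe-backed loaders, here is a minimal sketch of the general idea: each row becomes one document, with one column supplying the text and the remaining columns kept as metadata. It is not the code in this PR. The Document class, the load_rows helper, and the page_content_column parameter name are all assumptions made for illustration, and plain dictionaries stand in for dataframe rows so the sketch runs without Xorbits installed.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List


@dataclass
class Document:
    """Hypothetical stand-in for a document: text plus metadata."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def load_rows(
    rows: List[Dict[str, Any]], page_content_column: str = "text"
) -> Iterator[Document]:
    """Yield one Document per row (illustrative helper, not the PR's code)."""
    for row in rows:
        if page_content_column not in row:
            raise KeyError(f"row has no column {page_content_column!r}")
        # Every column except the text column is carried along as metadata.
        metadata = {k: v for k, v in row.items() if k != page_content_column}
        yield Document(page_content=str(row[page_content_column]), metadata=metadata)


# Plain dicts stand in for dataframe rows in this sketch.
rows = [
    {"text": "Xorbits scales data science workloads.", "source": "intro"},
    {"text": "Loaders turn rows into documents.", "source": "notes"},
]
docs = list(load_rows(rows))
```

In a distributed setting the rows would come from partitions of a dataframe rather than an in-memory list, but the row-to-document mapping would presumably look similar.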

yifeis7 avatar Jul 07 '23 04:07 yifeis7

See the lint error:

langchain/document_loaders/xorbits.py:22:89: E501 Line too long (95 > 88 characters)

Run poetry run ruff . to check locally before pushing.

You can also run make format to fix other formatting errors.
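
As a rough illustration of what the E501 message above reports, the sketch below flags lines longer than the 88-character limit quoted in that message. It only approximates the idea by counting characters per line; the find_long_lines helper is a hypothetical name and is not how Ruff implements the rule.

```python
from typing import List, Tuple

MAX_LINE_LENGTH = 88  # limit quoted in the lint message above


def find_long_lines(
    source: str, limit: int = MAX_LINE_LENGTH
) -> List[Tuple[int, int]]:
    """Return (line_number, length) for each line longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


# A two-line sample: the second line is deliberately too long.
sample = "short = 1\n" + "x = '" + "a" * 90 + "'\n"
violations = find_long_lines(sample)
```

A typical fix for a flagged line is to wrap the expression in parentheses and split it across several lines.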

rlancemartin avatar Jul 07 '23 16:07 rlancemartin

@rlancemartin Thanks for your help! Bug fixed.

yifeis7 avatar Jul 07 '23 18:07 yifeis7

This PR is ready to merge. @rlancemartin, could you please merge it?

qinxuye avatar Jul 10 '23 06:07 qinxuye