langchain icon indicating copy to clipboard operation
langchain copied to clipboard

community: Add SQLAlchemy document loader

Open amotl opened this issue 1 year ago • 2 comments

  • Description: This patch adds a generic document loader adapter for SQLAlchemy.
  • Issue: NA
  • Dependencies: NA
  • References: https://github.com/crate-workbench/langchain/pull/1
  • Maintainer: @baskaryan

Hi from CrateDB again,

in the same spirit like GH-16243 and GH-16244, this patch breaks out another commit from https://github.com/crate-workbench/langchain/pull/1, in order to reduce the size of this patch before submitting it, and to separate concerns.

To accompany the SQLAlchemy adapter implementation, the patch includes integration tests for both SQLite and PostgreSQL. Let me know if corresponding utility resources should be added at different spots.

With kind regards, Andreas.

Software Tests

docker compose --file libs/community/tests/integration_tests/document_loaders/docker-compose/postgresql.yml up
cd libs/community
pip install psycopg2-binary
pytest -vvv tests/integration_tests -k sqlalchemy
14 passed

image

amotl avatar Jan 19 '24 01:01 amotl

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
langchain ✅ Ready (Inspect) Visit Preview 💬 Add feedback Feb 9, 2024 0:05am

vercel[bot] avatar Jan 19 '24 01:01 vercel[bot]

Hi @baskaryan,

thank you for merging our other patch so quickly. On this matter, may I humbly ask you to educate us about what to do with this LangSmithUserError: API key must be provided error on CI?

FAILED tests/unit_tests/callbacks/test_callback_manager.py::test_callback_manager_configure_context_vars - langsmith.utils.LangSmithUserError: API key must be provided when using hosted LangSmith API

-- https://github.com/langchain-ai/langchain/actions/runs/7578834938/job/20642075314?pr=16246#step:6:895

Most probably, CI can not pick it up the API key for PRs coming from external contributors?

With kind regards, Andreas.

amotl avatar Jan 19 '24 02:01 amotl

@baskaryan: Sounds good.

To make it happen, we would need to extend SQLDatabase to also accept query parameters on _execute, like @cbornet suggested at https://github.com/langchain-ai/langchain/pull/16246#discussion_r1466119894, right?

Also, the return capabilities would need to be extended, to actually support lazy_load, correct? https://github.com/langchain-ai/langchain/blob/70ff54eace10b50d4f01a4f5a2d8636190ccca1b/libs/community/langchain_community/utilities/sql_database.py#L414-L422

Will you accept corresponding adjustments?

amotl avatar Jan 26 '24 22:01 amotl

@amotl Let me know if you'd like me to comnandeer to incorporate some of the nits!

eyurtsev avatar Feb 22 '24 15:02 eyurtsev

Hi @eyurtsev. While I feel bad about it, if you have the cycles, it would be so nice, indeed! I am currently a bit swamped with documentation improvement matters on our ends, where others are dearly waiting for, so I will appreciate your efforts very much.

amotl avatar Feb 22 '24 17:02 amotl

Comandeering

eyurtsev avatar Feb 28 '24 20:02 eyurtsev

Closing in favor of: https://github.com/langchain-ai/langchain/pull/18281

eyurtsev avatar Feb 28 '24 20:02 eyurtsev