community: update AzureSearch class to work with azure-search-documents=11.4.0

Open lz-chen opened this issue 1 year ago • 7 comments

  • Description: Updates libs/community/langchain_community/vectorstores/azuresearch.py to support the stable version azure-search-documents=11.4.0
  • Issue: https://github.com/langchain-ai/langchain/issues/14534, https://github.com/langchain-ai/langchain/issues/15039, https://github.com/langchain-ai/langchain/issues/15355
  • Dependencies: azure-search-documents>=11.4.0

lz-chen avatar Jan 07 '24 19:01 lz-chen

Hello, there is a previous PR #13472 that provided a partial fix to the AzureSearch class but did not fully update the dependency. Using the class as a vector store still doesn't work in my case, with the same problems encountered by others in the issues mentioned in the description. I'd like to contribute by updating the dependency to azure-search-documents=11.4.0. However, I am not sure whether that means updating all the pyproject.toml and poetry.lock files that reference azure-search-documents. What is the best way to do that? @efriis @hwchase17 Thanks in advance 😊

lz-chen avatar Jan 07 '24 19:01 lz-chen

Hey @lz-chen ! You actually can't do this with a community integration, as it's not a concrete dependency. Unfortunately, people need to manage their other python package versions themselves.

A good addition would be a note in the documentation to upgrade that package!

efriis avatar Jan 10 '24 00:01 efriis
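
An illustrative note on the point above about users managing the package version themselves: a small runtime check can surface a missing package with an actionable message. This is only a sketch under assumptions. The helper names `installed_version` and `require_distribution` are hypothetical and are not part of langchain; only the standard library's `importlib.metadata` is used.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def require_distribution(dist_name: str, hint: str) -> str:
    """Hypothetical guard: fail early and tell the user what to install."""
    found = installed_version(dist_name)
    if found is None:
        raise ImportError(
            f"Could not find {dist_name}. Install it with: pip install '{hint}'"
        )
    return found
```

A caller might use it as `require_distribution("azure-search-documents", "azure-search-documents>=11.4.0")` before touching the SDK.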

@efriis Thanks for the reply 😊 What does that mean for this PR? Should I keep the changes I made in libs/community/langchain_community/vectorstores/azuresearch.py without updating the dependencies in libs/langchain/pyproject.toml and libs/langchain/poetry.lock? Or follow what's done in PR #13472 to keep support for both azure-search-documents>=11.4.0 and azure-search-documents<11.4.0?

lz-chen avatar Jan 10 '24 08:01 lz-chen
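
An illustrative note on the "support both versions" option raised above: one way to do it is to branch on the installed version string. The sketch below is an assumption-laden example, not what PR #13472 or this PR actually does. The two keyword names are the ones mentioned for SearchField later in this thread; the cutoff at stable 11.4.0 and the helper names are hypothetical, and the version parsing is deliberately simplified (real code would more likely use `packaging.version.Version`).

```python
import re
from typing import Tuple

# Assumed cutoff: the first stable release named in this thread.
STABLE_CUTOFF: Tuple[int, int, int] = (11, 4, 0)


def is_stable_at_least(version: str, minimum: Tuple[int, int, int]) -> bool:
    """Simplified check; a pre-release such as '11.4.0b8' sorts before '11.4.0'."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(.*)", version)
    if match is None:
        return False  # unknown format: conservatively treat as old
    release = (int(match.group(1)), int(match.group(2)), int(match.group(3)))
    has_suffix = match.group(4) != ""
    if release == minimum:
        # Same numbers: only a suffix-free version counts as the stable release.
        return not has_suffix
    return release > minimum


def vector_field_kwarg(version: str) -> str:
    """Pick the SearchField keyword name to use for a given SDK version."""
    if is_stable_at_least(version, STABLE_CUTOFF):
        return "vector_search_profile_name"
    return "vector_search_configuration"
```

The thread does not state which beta introduced the rename, so treating every pre-release of 11.4.0 as "old" is a guess that would need checking against the SDK changelog.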

@efriis Could you give me some pointers on what I should do with this? As of now, the AzureSearch class in libs/community/langchain_community/vectorstores/azuresearch.py doesn't work with azure-search-documents version 11.4.0, which was released 2 months ago. The older version 11.4.0b8 that langchain is pinned to has some bugs, and I can't keep using it. If I can't make changes in libs/langchain/pyproject.toml, should I make changes somewhere else? For example, by contributing to langchain-community?

lz-chen avatar Jan 22 '24 21:01 lz-chen

Tagging @mattgotteiner for subject matter expertise on the Azure Search SDK migration to the 11.4 stable library. I agree with @lz-chen that the existing vector store integration module doesn't work with 11.4.0.

HeidiSteen avatar Jan 25 '24 16:01 HeidiSteen

@lz-chen thank you very much for doing this work.

@efriis Please let us know if there's any additional work required to get the PR merged.

mattgotteiner avatar Jan 25 '24 16:01 mattgotteiner

Hi @lz-chen,

Thanks for merging my PR into your update! I reviewed the original changes and tested them with azure-search-documents==11.4.0, and everything seems to work just fine :) My PR aimed to address a specific bug related to the removal of query_language in azure-search-documents==11.4.0. I'm happy to discuss any further details or assist with additional improvements if needed.

Skar0 avatar Jan 26 '24 18:01 Skar0

Looking forward to getting this merged. Due to this issue I cannot use Azure Search as a vector store

konradbjk avatar Feb 08 '24 00:02 konradbjk

Thank you for making this update - the changes look good

mattgotteiner avatar Feb 08 '24 19:02 mattgotteiner

Just looking at the discussion here about pyproject.toml:

It looks like the main langchain pyproject.toml doesn't reference azure-search-documents, which makes sense as this is in community. However, I notice that the langchain-community pyproject.toml also doesn't reference azure-search-documents. I believe that it probably should, based on similar recent integration updates, like this PR for Azure Document Intelligence: https://github.com/langchain-ai/langchain/pull/14389

The file to change: https://github.com/langchain-ai/langchain/blob/ac970c9497e2aca1f6396c3f6954b4f6cd0ac879/libs/community/pyproject.toml#L86

Notably, it'd be marked as optional=True, like the other community dependencies.

That's my suggestion after looking through how the packages are structured. Perhaps @efriis could confirm. Thanks!

pamelafox avatar Feb 12 '24 17:02 pamelafox

Optional dependencies only need to be included in pyproject if they're required for unit tests, so no need to add in this case!

baskaryan avatar Feb 12 '24 23:02 baskaryan

@baskaryan Ah, thanks for clarifying, that explains why azure-document-intelligence was added in that PR, as it had a short unit test.

pamelafox avatar Feb 13 '24 00:02 pamelafox

Apologies, but I think we need to follow up here. SearchField is still using vector_search_configuration when it should be using vector_search_profile_name.

mattgotteiner avatar Feb 15 '24 22:02 mattgotteiner

We are also missing this import: from azure.search.documents.indexes.models import VectorSearch

mattgotteiner avatar Feb 15 '24 22:02 mattgotteiner

The import is only performed when typing.TYPE_CHECKING is true.

mattgotteiner avatar Feb 15 '24 22:02 mattgotteiner
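
An illustrative note on the TYPE_CHECKING point above: a name imported only under `typing.TYPE_CHECKING` is visible to static type checkers but does not exist at runtime, so any code that actually uses it fails with a NameError. The sketch below reproduces that failure mode with a standard-library class standing in for VectorSearch; the function names are hypothetical and this is not the code from azuresearch.py.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by static type checkers; this branch never runs at runtime.
    from fractions import Fraction  # stand-in for VectorSearch


def build_broken() -> "Fraction":
    # Fails at runtime with NameError: Fraction was never actually imported.
    return Fraction(1, 2)


def build_fixed() -> "Fraction":
    # A real import at the point of use makes the name available at runtime.
    from fractions import Fraction

    return Fraction(1, 2)
```

Importing inside the function is one common way to keep an optional dependency lazy while still being able to construct its objects; a plain module-level import would also fix the NameError, at the cost of requiring the package at import time.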

@mattgotteiner I noticed these issues today as well and opened a PR to fix them here: https://github.com/langchain-ai/langchain/pull/17599

kristapratico avatar Feb 15 '24 22:02 kristapratico