azure-search-openai-demo
When will the code support azure-ai-documentintelligence==1.0.0b1 instead of the current azure-ai-formrecognizer==3.3.2?
Please provide us with the following information:
azure-ai-formrecognizer==3.3.2 has problems parsing our PDF, while azure-ai-documentintelligence==1.0.0b1 handles our PDF correctly. When can this open source code support azure-ai-documentintelligence==1.0.0b1?
Some reference:
https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/documentintelligence/azure-ai-documentintelligence/MIGRATION_GUIDE.md
https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/changelog-release-history?view=doc-intel-3.1.0&tabs=python#november-2023-preview-release
This issue is for a: (mark with an x)
- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)
Minimal steps to reproduce
Any log messages given by the failure
Expected/desired behavior
OS and Version?
Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)
azd version?
Run azd version and copy and paste the output here.
Versions
Mention any other details that might be useful
Thanks! We'll be in touch soon.
Thanks for filing, @rebeccaleeasml. I've chatted with the Document Intelligence team about whether it's appropriate to move to azure-ai-documentintelligence. That SDK is still in pre-release, so it will go through some changes before its first stable release (in a few months). The Document Intelligence team suggests that we try it out in a branch, but not necessarily move everyone to the new SDK until its first stable release.
I'm curious, could you share what it handles better in your PDF? I'm wondering how to replicate that improvement on our side, so we can advise other customers when to upgrade.
CC @srbalakr in case he's thought about this as well.
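For readers following along, the "try it out in a branch" idea could look roughly like an opt-in switch between the two packages. This is only a minimal sketch, not the code in this repo: the environment variable USE_DOCUMENT_INTELLIGENCE_SDK and the function names use_new_sdk and analyze_pdf_text are hypothetical, and the azure-ai-documentintelligence call follows the 1.0.0b1 pre-release, which may differ in later versions.

```python
import os


def use_new_sdk(env=None) -> bool:
    """Return True when the (hypothetical) opt-in flag is set.

    USE_DOCUMENT_INTELLIGENCE_SDK is an assumed name for illustration only.
    """
    env = os.environ if env is None else env
    value = env.get("USE_DOCUMENT_INTELLIGENCE_SDK", "")
    return value.strip().lower() in {"1", "true", "yes"}


def analyze_pdf_text(path: str, endpoint: str, key: str, new_sdk=None) -> str:
    """Extract the text content of a PDF with whichever SDK is selected."""
    # Imports are deferred so that only the selected package must be installed.
    from azure.core.credentials import AzureKeyCredential

    credential = AzureKeyCredential(key)
    if new_sdk is None:
        new_sdk = use_new_sdk()

    with open(path, "rb") as f:
        if new_sdk:
            # Pre-release package: azure-ai-documentintelligence==1.0.0b1
            from azure.ai.documentintelligence import DocumentIntelligenceClient

            client = DocumentIntelligenceClient(endpoint=endpoint, credential=credential)
            poller = client.begin_analyze_document(
                "prebuilt-layout",
                analyze_request=f,  # parameter name as of 1.0.0b1
                content_type="application/octet-stream",
            )
        else:
            # Stable package: azure-ai-formrecognizer==3.3.2
            from azure.ai.formrecognizer import DocumentAnalysisClient

            client = DocumentAnalysisClient(endpoint=endpoint, credential=credential)
            poller = client.begin_analyze_document("prebuilt-layout", document=f)

        # Both SDKs return an AnalyzeResult with a `content` string.
        return poller.result().content
```

With the flag unset, the stable azure-ai-formrecognizer path stays the default, so nobody is moved to the pre-release SDK unless they opt in.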
My suggestion is to wait for a stable release.
@rebeccaleeasml just a heads-up: Azure AI Search has a new feature that adds a vectorizer to the index. It takes care of parsing PDFs and chunking data for indexing, so it should protect against SDK issues.
Hi, Pamela:
I cannot share this through the open source channel. But if you can ping me through Teams and through Microsoft, I can share more information with you, since Microsoft and my company have signed an NDA.
Rebecca
From: Pamela Fox
Sent: Wednesday, December 6, 2023 10:03 AM
To: Azure-Samples/azure-search-openai-demo
Cc: Rebecca Lee; Mention
Subject: Re: [Azure-Samples/azure-search-openai-demo] When will code support azure-ai-documentintelligence ==1.0.0b1 instead of current azure-ai-formrecognizer==3.3.2 (Issue #1036)
@srbalakr That sounds like a great idea. How would you approach adding the skill of parsing in the indexers?
@rebeccaleeasml just a heads-up: Azure AI Search has a new feature that adds a vectorizer to the index. It takes care of parsing PDFs and chunking data for indexing, so it should protect against SDK issues.
Will this be implemented in this open source code?
Branch for the new azure-ai-documentintelligence package: https://github.com/Azure-Samples/azure-search-openai-demo/pull/1224
Branch for integrated vectorization: https://github.com/Azure-Samples/azure-search-openai-demo/pull/1159
Closing this, since we merged both those branches.