azure-search-openai-demo

When will the code support azure-ai-documentintelligence==1.0.0b1 instead of the current azure-ai-formrecognizer==3.3.2?

Open rebeccaleeasml opened this issue 1 year ago • 7 comments

Please provide us with the following information:

azure-ai-formrecognizer==3.3.2 has problems parsing our PDF, while azure-ai-documentintelligence==1.0.0b1 handles our PDF correctly. When can this open-source code support azure-ai-documentintelligence==1.0.0b1?

Some reference:

https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/documentintelligence/azure-ai-documentintelligence/MIGRATION_GUIDE.md

https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/changelog-release-history?view=doc-intel-3.1.0&tabs=python#november-2023-preview-release

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)


rebeccaleeasml avatar Dec 06 '23 17:12 rebeccaleeasml

Thanks for filing, @rebeccaleeasml . I've chatted with the Document Intelligence team about whether it's appropriate to move to azure-ai-documentintelligence. That SDK is still in pre-release mode, so it's going to be going through some changes before it's got its first stable release (in a few months). The Document Intelligence team suggests that we can try it out in a branch, but not necessarily move everyone to using the SDK until it's got its first stable release.

I'm curious, could you share what it handles better in your PDF? I'm wondering how to replicate that improvement on our side, so we can advise other customers when to upgrade.

CC @srbalakr in case he's thought about this as well.

pamelafox avatar Dec 06 '23 18:12 pamelafox

My suggestion is to wait for a stable release.

srbalakr avatar Dec 06 '23 19:12 srbalakr

@rebeccaleeasml just a heads-up: Azure AI Search has a new feature that adds a vectorizer to the index. It takes care of parsing PDFs and chunking the data for indexing. It should protect you from SDK issues.

srbalakr avatar Dec 06 '23 19:12 srbalakr

Hi, Pamela:

I cannot share it through the open-source channel. But if you can ping me through Teams and through Microsoft, I can share more information with you, since Microsoft and my company have signed an NDA.

Rebecca


rebeccaleeasml avatar Dec 06 '23 21:12 rebeccaleeasml

> @rebeccaleeasml just a heads-up: Azure AI Search has a new feature that adds a vectorizer to the index. It takes care of parsing PDFs and chunking the data for indexing. It should protect you from SDK issues.

@srbalakr That sounds like a great idea. How would you approach adding the parsing skill to the indexers?

david-an-tran-pham avatar Dec 07 '23 09:12 david-an-tran-pham

> @rebeccaleeasml just a heads-up: Azure AI Search has a new feature that adds a vectorizer to the index. It takes care of parsing PDFs and chunking the data for indexing. It should protect you from SDK issues.

Will this be implemented in this open-source code?

rebeccaleeasml avatar Dec 08 '23 15:12 rebeccaleeasml

Branch for the new azure-ai-documentintelligence package: https://github.com/Azure-Samples/azure-search-openai-demo/pull/1224

Branch for integrated vectorization: https://github.com/Azure-Samples/azure-search-openai-demo/pull/1159

pamelafox avatar Feb 02 '24 15:02 pamelafox

Closing this, since we merged both those branches.

pamelafox avatar Mar 12 '24 20:03 pamelafox