Pamela Fox comments

Results: 695 comments of Pamela Fox


Draft PR: Private endpoint support for container apps

Blocking: https://github.com/Azure/bicep-registry-modules/issues/4387

Update searchmanager.py

Thanks for the PR, I've discussed with @mattgotteiner. He says that the title should not be strictly required for integrated vectorization, but that some developers may want it. If we...

BUG: Getting quota errors in webApp but not in the playground

What TPM do you currently have for your deployment? Each question takes an average of 1000 tokens, so it is easy to exceed the rate limits if your deployments have low...
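To make the arithmetic in that comment concrete, here is a rough sketch of how many questions fit under a tokens-per-minute (TPM) limit. The 1000-token average comes from the comment; the function name and the example TPM values are made up for illustration, and real usage varies with prompt and answer length.

```python
def max_questions_per_minute(tpm_limit: int, tokens_per_question: int = 1000) -> int:
    """Estimate how many questions fit into one minute of token quota.

    tokens_per_question defaults to the ~1000-token average mentioned above;
    it is an estimate, not a guarantee.
    """
    if tpm_limit <= 0 or tokens_per_question <= 0:
        raise ValueError("tpm_limit and tokens_per_question must be positive")
    # Integer division: a partially funded question would still hit the limit.
    return tpm_limit // tokens_per_question


if __name__ == "__main__":
    # Hypothetical deployment capacities, purely for illustration.
    for tpm in (8_000, 30_000):
        print(tpm, "TPM ->", max_questions_per_minute(tpm), "questions/minute")
```

With a low capacity, a handful of concurrent users can be enough to trigger quota errors in the web app even when single requests in the playground succeed.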

Custom ingestion. Html tables generated too wordy and inefficient.

We did look into this a bit. Here's some relevant research on HTML vs. plaintext vs. Markdown: https://arxiv.org/abs/2411.02959 and https://arxiv.org/abs/2406.08100. The reason that we're currently picking HTML for tables is that...

Custom ingestion. Html tables generated too wordy and inefficient.

Yeah, makes sense. This is the method that would need changing: DocumentAnalysisParser.table_to_html() in pdfparser.py. You could put a table_to_csv() in there and try that instead. If your tables are still...
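A minimal sketch of what such a table_to_csv() could look like. The Cell and Table classes below are hypothetical stand-ins so the example runs on its own; their field names (row_count, column_count, row_index, column_index, content) are assumed to resemble the table objects the parser receives, and merged cells (row or column spans) are ignored here.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Cell:  # hypothetical stand-in for a parsed table cell
    row_index: int
    column_index: int
    content: str


@dataclass
class Table:  # hypothetical stand-in for a parsed table
    row_count: int
    column_count: int
    cells: list[Cell]


def table_to_csv(table: Table) -> str:
    """Render a table as CSV text, one line per row."""
    # Start with an empty grid so missing cells become empty fields.
    grid = [["" for _ in range(table.column_count)] for _ in range(table.row_count)]
    for cell in table.cells:
        grid[cell.row_index][cell.column_index] = cell.content.strip()
    buffer = io.StringIO()
    # csv.writer quotes fields containing commas, quotes, or newlines.
    csv.writer(buffer, lineterminator="\n").writerows(grid)
    return buffer.getvalue()


if __name__ == "__main__":
    table = Table(2, 2, [
        Cell(0, 0, "Plan"), Cell(0, 1, "Price"),
        Cell(1, 0, "Basic"), Cell(1, 1, "$10, monthly"),
    ])
    print(table_to_csv(table))
```

CSV drops the tag overhead of HTML, at the cost of losing header and span information, so it is worth comparing answer quality on your own documents before switching.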

Wrong sourcepage, when section include text from two pages

Thanks for surfacing this. The current splitter attributes a section to the page where it starts, so when a chunk spans pages it can cite the wrong page. I have...
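The behaviour described there can be illustrated with a small sketch. This is not the repo's splitter; page_starts, page_of_offset, and pages_for_chunk are hypothetical names, and the example assumes each page's starting character offset in the concatenated text is known.

```python
from bisect import bisect_right


def page_of_offset(page_starts: list[int], offset: int) -> int:
    """Return the zero-based page index containing a character offset.

    page_starts holds the sorted starting offset of each page, beginning with 0.
    """
    return bisect_right(page_starts, offset) - 1


def pages_for_chunk(page_starts: list[int], start: int, end: int) -> list[int]:
    """Return every page a chunk [start, end) touches, not only the first."""
    first = page_of_offset(page_starts, start)
    # end is exclusive, so the last character of the chunk is at end - 1.
    last = page_of_offset(page_starts, max(start, end - 1))
    return list(range(first, last + 1))


if __name__ == "__main__":
    page_starts = [0, 100, 250]  # hypothetical offsets for three pages
    # A chunk starting near the end of page 0 and running into page 1:
    print(page_of_offset(page_starts, 80))        # page where the chunk starts
    print(pages_for_chunk(page_starts, 80, 130))  # all pages the chunk spans
```

Attributing the chunk only to page_of_offset(start) reproduces the reported issue: text that actually sits on the second page is cited with the first page's number.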

Wrong sourcepage, when section include text from two pages

@elhele Can you specify what you mean by agentic chunking? The adjective "agentic" can have multiple interpretations these days.

Wrong sourcepage, when section include text from two pages

Hm, I think that Document Intelligence already does a bit of that, and with our current splitting logic, it tries not to split things like tables and figures. It may...

Embedding models v3 for integrated vectorization

I'm checking in with the Azure AI Search team about this. It's possible that an Azure AI Search SDK update would be needed.

Embedding models v3 for integrated vectorization

Response from AI Search team: This is supported using the latest SDK versions (preview and GA). Here's how to use them with Python: [azure-ai-search-e2e-build-demo.ipynb in Azure/azure-search-vector-samples](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/e2e-demos/azure-ai-search-e2e-build-demo.ipynb) ....