Emerson Gomes

Results: 20 issues by Emerson Gomes

Replace the `web_html_cleanup` function from `html_utils.py` (or potentially the whole web connector) with [trafilatura](https://trafilatura.readthedocs.io/en/latest/). It handles "noisy" elements much better and can output markdown text, keeping...

This issue sometimes occurs with large embedding requests: the request times out before the API is able to respond. A timeout parameter should be added to allow more time for...

The chunker will sometimes generate chunks that are too large to be processed for embedding: ``` 500: Error embedding text with EmbeddingProvider.OPENAI: Error embedding text with OpenAI: Error code: 400...

We're seeing some indexing attempt failures with this exception. Apparently, in some cases we're trying to embed an empty string, which makes the whole indexing process fail. Maybe we should...
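One possible guard for the empty-string case, as a minimal sketch rather than the project's actual fix: drop blank texts before calling the embedding backend and keep a `None` placeholder in their slots. The names `embed_non_empty` and `embed_fn` are hypothetical and stand in for whatever the indexing pipeline really calls.

```python
from typing import Callable, List, Optional, Sequence

Vector = List[float]


def embed_non_empty(
    texts: Sequence[str],
    embed_fn: Callable[[List[str]], List[Vector]],  # hypothetical backend call
) -> List[Optional[Vector]]:
    """Embed only non-blank texts; blank inputs get None in their slot."""
    # Remember the original position of every text worth embedding.
    kept = [(i, t) for i, t in enumerate(texts) if t and t.strip()]
    results: List[Optional[Vector]] = [None] * len(texts)
    if not kept:
        # Nothing to embed, so the backend is never called.
        return results
    vectors = embed_fn([t for _, t in kept])
    for (i, _), vec in zip(kept, vectors):
        results[i] = vec
    return results
```

Returning placeholders instead of raising lets the caller decide whether to skip the affected chunks or log them, so one empty chunk does not fail the whole indexing attempt.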

### The Feature Today for image generation models, only the `/images/generation` method is supported. However, there are other methods such as `/images/edits` and `/images/variations` as described in https://platform.openai.com/docs/guides/images/?lang=curl&context=python Google image...

enhancement
mlops user request
feb 2025
openai

## Title New Azure models: * DeepSeek V3 0324 * Llama 4 Scout * Llama 4 Maverick ## Type 🆕 New Feature

## Description * Improve ScrapeSessionContext management and cleanup: Enhanced resource management within ScrapeSessionContext for better stability. * Cache protected_url_check results using lru_cache: Added caching to DNS lookups for performance. *...
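The caching idea can be illustrated with a small sketch. This is not the PR's code: `is_public_host` is a hypothetical stand-in for `protected_url_check`, whose real signature is not shown here, and it assumes the check amounts to resolving a hostname and rejecting non-public addresses.

```python
import ipaddress
import socket
from functools import lru_cache


@lru_cache(maxsize=1024)
def is_public_host(hostname: str) -> bool:
    """Hypothetical check: True only if every resolved address is globally routable."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # Unresolvable hosts are treated as not allowed.
        return False
    # info[4][0] is the address string from the sockaddr tuple.
    return bool(infos) and all(
        ipaddress.ip_address(info[4][0]).is_global for info in infos
    )
```

Repeated calls with the same hostname are served from the cache, which can be confirmed with `is_public_host.cache_info()`. One trade-off to keep in mind: `lru_cache` ignores DNS TTLs, so a cached answer stays in place until it is evicted or `is_public_host.cache_clear()` is called.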

## Description This patch enables use of the LiteLLM Customer feature (via an OpenAI provider) by sending the user ID as part of the LLM call. ## How Has...

## Description This PR introduces a comprehensive upgrade to the LLM model selection interface, adding advanced controls for temperature and reasoning levels, along with significant UX improvements to the model...

Stale

### What problem is this feature trying to solve? While it currently detects and uses `OPENAI_API_KEY`, it always assumes that the base URL is the official OpenAI API. It...

enhancement