[BUG] External Vector Store Issue
### Issues Policy acknowledgement
- [X] I have read and agree to submit bug reports in accordance with the issues policy
### Where did you encounter this bug?
Local machine
### Willingness to contribute
Yes. I can contribute a fix for this bug independently.
### MLflow version
- Client: 2.16.0
- Tracking server: 2.16.0
### System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS
- Python version: 3.12.4
- yarn version, if running the dev UI:
### Describe the problem
I am trying to track a RAG application (llama-index with an external Qdrant vector store) with MLflow. The logged index cannot be retrieved from the `model_uri` of the returned `model_info`: `mlflow.llama_index.load_model` fails while deserializing the saved settings. The full error is attached in [error.log](https://github.com/user-attachments/files/16931084/error.log).
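For quick triage, here is a minimal sketch of the same flow. The endpoint URLs, embedding model name, and collection name are placeholders, and the `log_model` arguments mirror the full repro below.

```python
import mlflow
import qdrant_client
from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.vector_stores.qdrant import QdrantVectorStore

# Global settings as in the full repro; chunk_size/chunk_overlap implicitly
# configure the default SentenceSplitter node parser.
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text",  # placeholder
                                       base_url="http://localhost:11434")
Settings.chunk_size = 128
Settings.chunk_overlap = 100

# Index backed by an external, already-populated Qdrant collection.
client = qdrant_client.QdrantClient(url="http://localhost:6333")
vector_store = QdrantVectorStore(client=client, collection_name="my-collection")
index = VectorStoreIndex.from_vector_store(vector_store=vector_store)

mlflow.set_tracking_uri("http://127.0.0.1:3000")
with mlflow.start_run():
    model_info = mlflow.llama_index.log_model(
        index, artifact_path="llama_index", engine_type="query"
    )

# Raises the pydantic ValidationError for SentenceSplitter.id_func shown below.
index = mlflow.llama_index.load_model(model_info.model_uri)
```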
### Tracking information
```
System information: Darwin Darwin Kernel Version 23.6.0: Mon Jul 29 21:14:30 PDT 2024; root:xnu-10063.141.2~1/RELEASE_ARM64_T6000
Python version: 3.12.4
MLflow version: 2.16.0
MLflow module location: /Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/venv/lib/python3.12/site-packages/mlflow/__init__.py
Tracking URI: file:///Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/mlruns
Registry URI: file:///Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/mlruns
Active experiment ID: 0
Active run ID: 5fa900f38f89487a95919c59e5dc3018
Active run artifact URI: file:///Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/mlruns/0/5fa900f38f89487a95919c59e5dc3018/artifacts
MLflow dependencies:
  Flask: 3.0.3
  Jinja2: 3.1.4
  aiohttp: 3.10.5
  alembic: 1.13.2
  docker: 7.1.0
  fastapi: 0.112.1
  graphene: 3.3
  gunicorn: 23.0.0
  markdown: 3.7
  matplotlib: 3.9.2
  mlflow-skinny: 2.16.0
  numpy: 1.26.4
  pandas: 2.2.2
  pyarrow: 17.0.0
  pydantic: 2.9.0
  scikit-learn: 1.5.1
  scipy: 1.14.1
  sqlalchemy: 2.0.34
  tiktoken: 0.7.0
  uvicorn: 0.30.6
```
### Code to reproduce issue
```python
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    Settings,
    StorageContext
)
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.vector_stores.qdrant import QdrantVectorStore
from llama_index.core.base.response.schema import Response, StreamingResponse, AsyncStreamingResponse, PydanticResponse
from llama_index.core.query_engine import RetryQueryEngine, RetrySourceQueryEngine, RetryGuidelineQueryEngine
from llama_index.core.evaluation import RelevancyEvaluator, GuidelineEvaluator
from llama_index.core.evaluation.guideline import DEFAULT_GUIDELINES
from dotenv import load_dotenv, find_dotenv
from typing import Union
import qdrant_client
import logging
import os
import mlflow

_ = load_dotenv(find_dotenv())
logging.basicConfig(level=int(os.environ['INFO']))
logger = logging.getLogger(__name__)

mlflow.llama_index.autolog()  # This is for enabling tracing


class SelfCorrectingRAG:
    RESPONSE_TYPE = Union[
        Response, StreamingResponse, AsyncStreamingResponse, PydanticResponse
    ]

    def __init__(self, input_dir: str, similarity_top_k: int = 3, chunk_size: int = 128,
                 chunk_overlap: int = 100, show_progress: bool = False, no_of_retries: int = 5,
                 required_exts: list[str] = ['.pdf', '.txt']):
        self.input_dir = input_dir
        self.similarity_top_k = similarity_top_k
        self.show_progress = show_progress
        self.no_of_retries = no_of_retries
        self.index_loaded = False
        self.required_exts = required_exts

        # use your preferred vector embeddings model
        logger.info("initializing the OllamaEmbedding")
        embed_model = OllamaEmbedding(model_name=os.environ['OLLAMA_EMBED_MODEL'],
                                      base_url=os.environ['OLLAMA_BASE_URL'])
        # openai embeddings, embedding_model_name="text-embedding-3-large"
        # embed_model = OpenAIEmbedding(embed_batch_size=10, model=embedding_model_name)

        # use your preferred llm
        llm = Ollama(model=os.environ['OLLAMA_LLM_MODEL'], base_url=os.environ['OLLAMA_BASE_URL'],
                     request_timeout=600)
        # llm = OpenAI(model="gpt-4o")

        logger.info("initializing the global settings")
        Settings.embed_model = embed_model
        Settings.llm = llm
        Settings.chunk_size = chunk_size
        Settings.chunk_overlap = chunk_overlap

        # Connect to the external Qdrant vector store
        logger.info("initializing the vector store related objects")
        self.client: qdrant_client.QdrantClient = qdrant_client.QdrantClient(url=os.environ['DB_URL'],
                                                                             api_key=os.environ['DB_API_KEY'])
        self.vector_store = QdrantVectorStore(client=self.client, collection_name=os.environ['COLLECTION_NAME'])
        self.query_response_evaluator = RelevancyEvaluator()
        self.base_query_engine = None
        self.index = None
        self.model_uri = None

        mlflow.set_tracking_uri("http://127.0.0.1:3000")
        mlflow.set_experiment("experiment-1")
        self._load_data_and_create_engine()

    def _load_data_and_create_engine(self):
        if self.client.collection_exists(collection_name=os.environ['COLLECTION_NAME']):
            try:
                self.index = VectorStoreIndex.from_vector_store(vector_store=self.vector_store)
                # self.base_query_engine = self.index.as_query_engine()
                self.index_loaded = True
            except Exception as e:
                self.index_loaded = False

        if not self.index_loaded:
            # load data
            _docs = (SimpleDirectoryReader(input_dir=self.input_dir, required_exts=self.required_exts)
                     .load_data(show_progress=self.show_progress))
            # build and persist index
            storage_context = StorageContext.from_defaults(vector_store=self.vector_store)
            logger.info("indexing the docs in VectorStoreIndex")
            self.index = VectorStoreIndex.from_documents(documents=_docs, storage_context=storage_context,
                                                         show_progress=self.show_progress)
            # self.base_query_engine = self.index.as_query_engine()

        # save the vector database model to mlflow tracking server
        mlflow.models.set_model(self.index)
        with mlflow.start_run() as run:
            model_info = mlflow.llama_index.log_model(
                "self_correction_core.py",
                artifact_path="llama_index",
                engine_type="query",  # Defines the pyfunc and spark_udf inference type
                input_example="what is mlops?",  # Infers signature
                registered_model_name="llama_index_vector_store",  # Stores an instance in the model registry
            )
            # run_id = run.info.run_id
            # f"runs:/{run_id}/llama_index"
            self.model_uri = model_info.model_uri
            logger.info(f"Unique identifier for the model location for loading: {self.model_uri}")

        self._create_base_query_engine_from_mlflow()

    def _create_base_query_engine_from_mlflow(self):
        self.index = mlflow.llama_index.load_model(self.model_uri)
        print(self.index.vector_store)
        # self.base_query_engine = self.index.as_query_engine()
```
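A note on where the failing object comes from (my reading of llama_index's documented `Settings` behavior, not verified against its internals): assigning `Settings.chunk_size` / `Settings.chunk_overlap` configures the global node parser, a `SentenceSplitter`, and MLflow serializes that parser together with the rest of the global settings when the model is logged. A quick check:

```python
from llama_index.core import Settings

Settings.chunk_size = 128
Settings.chunk_overlap = 100

# The default global node parser is a SentenceSplitter; its id_func field
# is the value that later fails to deserialize.
print(type(Settings.node_parser).__name__)  # SentenceSplitter
```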
### Stack trace
```
/Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/venv/bin/python /Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/main.py
INFO:self_correction_core:initializing the OllamaEmbedding
INFO:self_correction_core:initializing the global settings
INFO:self_correction_core:initializing the vector store related objects
/Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/venv/lib/python3.12/site-packages/qdrant_client/qdrant_remote.py:130: UserWarning: Api key is used with an insecure connection.
warnings.warn("Api key is used with an insecure connection.")
INFO:httpx:HTTP Request: GET http://localhost:6333/collections/YOUR_COLLECTION/exists "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://localhost:6333/collections/YOUR_COLLECTION/exists "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:6333/collections/YOUR_COLLECTION/points/search "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 200 OK"
2024/09/09 18:51:39 INFO mlflow.llama_index.serialize_objects: API key(s) will be removed from the global Settings object during serialization to protect against key leakage. At inference time, the key(s) must be passed as environment variables.
2024/09/09 18:51:41 WARNING mlflow.utils.environment: Encountered an unexpected error while inferring pip requirements (model URI: /var/folders/1x/36nz44_569501n4xs0px2_8m0000gn/T/tmptri5clwn/model, flavor: llama_index). Fall back to return ['llama-index==0.11.7']. Set logging level to DEBUG to see the full traceback.
Registered model 'llama_index_vector_store' already exists. Creating a new version of this model...
2024/09/09 18:51:41 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: llama_index_vector_store, version 4
Created version '4' of model 'llama_index_vector_store'.
Downloading artifacts: 100%|██████████| 8/8 [00:00<00:00, 11459.85it/s]
2024/09/09 18:51:41 WARNING mlflow.models.model: Failed to validate serving input example {
"inputs": "what is mlops?"
}. Alternatively, you can avoid passing input example and pass model signature instead when logging the model. To ensure the input example is valid prior to serving, please try calling `mlflow.models.validate_serving_input` on the model uri and serving input example. A serving input example can be generated from model input example using `mlflow.models.convert_input_example_to_serving_input` function.
Got error: 1 validation error for SentenceSplitter
id_func
Input should be callable [type=callable_type, input_value={'id_func_name': 'default...nc', 'title': 'id_func'}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.9/v/callable_type
INFO:self_correction_core:Unique identifier for the model location for loading: runs:/179a5e04c46c48a9975b570784ed3c74/llama_index
2024/09/09 18:51:42 INFO mlflow.tracking._tracking_service.client: 🏃 View run mysterious-eel-21 at: http://127.0.0.1:3000/#/experiments/4/runs/179a5e04c46c48a9975b570784ed3c74.
2024/09/09 18:51:42 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: http://127.0.0.1:3000/#/experiments/4.
Traceback (most recent call last):
File "/Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/main.py", line 5, in <module>
self_correcting_rag = SelfCorrectingRAG(input_dir='data', show_progress=True, no_of_retries=3)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/self_correction_core.py", line 75, in __init__
self._load_data_and_create_engine()
File "/Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/self_correction_core.py", line 115, in _load_data_and_create_engine
self._create_base_query_engine_from_mlflow()
File "/Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/self_correction_core.py", line 118, in _create_base_query_engine_from_mlflow
self.index = mlflow.llama_index.load_model(self.model_uri)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/venv/lib/python3.12/site-packages/mlflow/tracing/provider.py", line 253, in wrapper
is_func_called, result = True, f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/venv/lib/python3.12/site-packages/mlflow/llama_index/__init__.py", line 429, in load_model
deserialize_settings(settings_path)
File "/Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/venv/lib/python3.12/site-packages/mlflow/llama_index/serialize_objects.py", line 171, in deserialize_settings
settings_dict = _deserialize_dict_of_objects(path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/venv/lib/python3.12/site-packages/mlflow/llama_index/serialize_objects.py", line 117, in _deserialize_dict_of_objects
output.update({k: dict_to_object(v)})
^^^^^^^^^^^^^^^^^
File "/Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/venv/lib/python3.12/site-packages/mlflow/llama_index/serialize_objects.py", line 105, in dict_to_object
return object_class.from_dict(kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/venv/lib/python3.12/site-packages/llama_index/core/schema.py", line 145, in from_dict
return cls(**data)
^^^^^^^^^^^
File "/Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/venv/lib/python3.12/site-packages/llama_index/core/node_parser/text/sentence.py", line 89, in __init__
super().__init__(
File "/Users/pavanmantha/Pavans/PracticeExamples/DataScience_Practice/LLMs/llama_index_tutorials/qdrant-mlflow-llama-index/venv/lib/python3.12/site-packages/pydantic/main.py", line 211, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for SentenceSplitter
id_func
Input should be callable [type=callable_type, input_value={'id_func_name': 'default...nc', 'title': 'id_func'}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.9/v/callable_type
Process finished with exit code 1
```
### Other info / logs
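The failure appears to reproduce without MLflow by round-tripping the `SentenceSplitter` through llama_index's own `to_dict`/`from_dict`, which is what `mlflow.llama_index.serialize_objects` does according to the stack trace. A minimal sketch, assuming llama-index 0.11.x and pydantic 2.x:

```python
from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=128, chunk_overlap=100)
data = splitter.to_dict()

# id_func is serialized as a metadata dict ({'id_func_name': ..., 'title': 'id_func'}),
# not as a callable.
print(data.get("id_func"))

# Raises the same pydantic callable_type ValidationError as the stack trace above.
SentenceSplitter.from_dict(data)
```

If that reading is right, a fix could either have MLflow's settings deserializer drop the non-callable `id_func` entry (letting the class default apply) or resolve `id_func_name` back to a callable before validation; I'm happy to work on either.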
### What component(s) does this bug affect?
- [ ] area/artifacts: Artifact stores and artifact logging
- [ ] area/build: Build and test infrastructure for MLflow
- [X] area/deployments: MLflow Deployments client APIs, server, and third-party Deployments integrations
- [ ] area/docs: MLflow documentation pages
- [X] area/examples: Example code
- [ ] area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- [ ] area/models: MLmodel format, model serialization/deserialization, flavors
- [ ] area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
- [ ] area/projects: MLproject format, project running backends
- [ ] area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- [X] area/server-infra: MLflow Tracking server backend
- [X] area/tracking: Tracking Service, tracking client APIs, autologging
### What interface(s) does this bug affect?
- [ ] area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- [ ] area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
- [ ] area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
- [ ] area/windows: Windows support
### What language(s) does this bug affect?
- [ ] language/r: R APIs and clients
- [ ] language/java: Java APIs and clients
- [X] language/new: Proposals for new client languages
### What integration(s) does this bug affect?
- [ ] integrations/azure: Azure and Azure ML integrations
- [ ] integrations/sagemaker: SageMaker integrations
- [ ] integrations/databricks: Databricks integrations