giskard.scan() defaults back to OpenAI, when I have been using LM Studio (Solution suggested)

osok opened this issue 1 week ago

Issue Type




Giskard Library Version


Giskard Hub Version

not using

OS Platform and Distribution

Ubuntu 22.04.4 LTS

Python version

Python 3.9.19

Installed python packages

Current Behaviour?

NOTE I added a second comment below which gets to the root of the problem and suggests a fix.

When I call

scan_results = giskard.scan(model=giskard_model)

It uses OpenAI even though the model is configured to call LM Studio

2024-06-20 17:56:49,828 pid:3414141 MainThread httpx        INFO     HTTP Request: POST "HTTP/1.1 429 Too Many Requests"
2024-06-20 17:56:49,829 pid:3414141 MainThread openai._base_client INFO     Retrying request to /chat/completions in 0.823091 seconds

[cut here for space]

RateLimitError: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs:', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

I give the full response below, but this sticks out:

I have not configured anything to call OpenAI; rather, I configured LM Studio.

Is there a way to globally configure Giskard to use LM Studio?
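One possible workaround, as a hedged sketch rather than a confirmed Giskard setting: the OpenAI Python SDK (v1 and later) falls back to the `OPENAI_BASE_URL` and `OPENAI_API_KEY` environment variables when a client is created without explicit arguments. If the scan builds its own client that way (an assumption not verified here), setting both variables before calling `giskard.scan()` might redirect its requests to LM Studio. The helper name `point_openai_sdk_at_lm_studio` is hypothetical; the URL is the LM Studio address used in the code in this report.

```python
import os

# LM Studio's OpenAI-compatible endpoint, as used elsewhere in this report.
LM_STUDIO_BASE_URL = "http://localhost:5000/v1"


def point_openai_sdk_at_lm_studio(base_url: str = LM_STUDIO_BASE_URL,
                                  api_key: str = "lm-studio") -> dict:
    """Set the env vars that a default-constructed OpenAI client reads.

    Hypothetical helper: it only changes this process's environment, so it
    must run before any library creates its OpenAI client.
    """
    os.environ["OPENAI_BASE_URL"] = base_url
    os.environ["OPENAI_API_KEY"] = api_key  # placeholder key, as in this report
    return {name: os.environ[name] for name in ("OPENAI_BASE_URL", "OPENAI_API_KEY")}


settings = point_openai_sdk_at_lm_studio()
print(settings["OPENAI_BASE_URL"])
```

In a notebook this would go in the first cell, before `giskard` or `openai` create any client.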

Standalone code OR list down the steps to reproduce the issue

I'm using LM Studio.
Model: TheBloke/Llama 2 13B Q 8.0 GGUF
Embeddings: nomic-embed-text

Here is the code I use to get to this point. I'm using Jupyter Notebook, so I'll break it out into code / response.

import time
import requests
import numpy as np
import faiss
from openai import OpenAI
from langchain import PromptTemplate
from langchain.chains import RetrievalQA
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.docstore.in_memory import InMemoryDocstore
from langchain.schema import Document
from langchain_openai import ChatOpenAI
import giskard
import pandas as pd

# Initialize OpenAI client for LM Studio embeddings
embedding_client = OpenAI(base_url="http://localhost:5000/v1", api_key="lm-studio")

# Function to get embeddings from LM Studio
def get_embedding(text, model="model-identifier", retries=3, timeout=120):
    text = text.replace("\n", " ")
    data = {
        "input": [text],
        "model": model
    }
    for attempt in range(retries):
        try:
            response = requests.post("http://localhost:5000/v1/embeddings", json=data, timeout=timeout)
            response.raise_for_status()  # Raise an error for bad status codes
            response_data = response.json()
            if 'data' in response_data and len(response_data['data']) > 0:
                return response_data['data'][0]['embedding']
            return None
        except requests.exceptions.RequestException as e:
            if attempt < retries - 1:
                time.sleep(2 ** attempt)  # Exponential backoff
        except Exception as e:
            if attempt < retries - 1:
                time.sleep(2 ** attempt)  # Exponential backoff
    return None

# Function to load and split PDF using PyMuPDF
def load_pdf(file_path):
    import fitz  # PyMuPDF
    doc = fitz.open(file_path)
    text = ""
    for page_num in range(len(doc)):
        page = doc.load_page(page_num)
        text += page.get_text("text")
    return text

# Load the PDF
pdf_text = load_pdf("IPCC_AR6_SYR_LongerReport.pdf")

# Split the text into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100, add_start_index=True)
texts = text_splitter.split_text(pdf_text)

# Get embeddings for each text chunk
embeddings = [get_embedding(text) for text in texts if get_embedding(text)]

# Convert embeddings to a NumPy array
embeddings = np.array(embeddings, dtype=np.float32)

# Ensure the embeddings have the desired dimension
desired_dimension = 768
if embeddings.shape[1] != desired_dimension:
    print(f"Warning: The embedding dimension is {embeddings.shape[1]}, not {desired_dimension}")

# Create a FAISS index
index = faiss.IndexFlatL2(embeddings.shape[1])

# Prepare documents and docstore
documents = [Document(page_content=text) for text in texts]
docstore = InMemoryDocstore({str(i): doc for i, doc in enumerate(documents)})

# Create an index-to-docstore-id mapping
index_to_docstore_id = {i: str(i) for i in range(len(documents))}

# Use Langchain's FAISS retriever
vectorstore = FAISS(embedding_function=get_embedding, index=index, docstore=docstore, index_to_docstore_id=index_to_docstore_id)

# Prepare QA chain
PROMPT_TEMPLATE = """You are the Climate Assistant, a helpful AI assistant made by Giskard.
Your task is to answer common questions on climate change.
You will be given a question and relevant excerpts from the IPCC Climate Change Synthesis Report (2023).
Please provide short and clear answers based on the provided context. Be polite and helpful.

Context:
{context}

Question:
{question}

Your answer:
"""

llm = ChatOpenAI(base_url="http://localhost:5000/v1", temperature=0.85, api_key="not_needed")

prompt = PromptTemplate(template=PROMPT_TEMPLATE, input_variables=["question", "context"])
climate_qa_chain = RetrievalQA.from_llm(llm=llm, retriever=vectorstore.as_retriever(), prompt=prompt)


2024-06-20 17:56:33,100 pid:3414141 MainThread langchain_community.vectorstores.faiss WARNING  `embedding_function` is expected to be an Embeddings object, support for passing in a function will soon be removed.
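The warning above asks for an Embeddings object instead of a bare function. A minimal sketch of such a wrapper, assuming the two methods that LangChain's `Embeddings` interface defines (`embed_documents` and `embed_query`). The class name `FunctionEmbeddings` and the stand-in `fake_embedding` are hypothetical; in the notebook the class would subclass LangChain's `Embeddings` base class and receive `get_embedding` instead.

```python
from typing import Callable, List


class FunctionEmbeddings:
    """Adapts a text -> vector function to an Embeddings-style object."""

    def __init__(self, embed_fn: Callable[[str], List[float]]):
        self._embed_fn = embed_fn

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # One vector per input text, in the same order.
        return [self._embed_fn(text) for text in texts]

    def embed_query(self, text: str) -> List[float]:
        return self._embed_fn(text)


def fake_embedding(text: str) -> List[float]:
    # Stand-in for get_embedding so the sketch runs without LM Studio.
    return [float(len(text)), float(text.count(" "))]


embedder = FunctionEmbeddings(fake_embedding)
print(embedder.embed_query("climate change"))
print(embedder.embed_documents(["a b", "abc"]))
```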

# Example question
question = "What are the main impacts of climate change?"
answer = climate_qa_chain({"query": question})


Hi there! As the Climate Assistant, I can tell you that the main impacts of climate change include:

* Increased frequency and severity of extreme weather events such as heatwaves, droughts, floods, and storms (high confidence)
* Rising sea levels and coastal erosion, with associated impacts on coastal ecosystems and human settlements (high confidence)
* Loss of biodiversity, including changes in the distribution and abundance of species, as well as the loss of ecosystem functions and services (very high confidence)
* Negative impacts on human health, including an increased risk of heat-related illness and the spread of disease vectors (high confidence)
* Economic losses and damages to infrastructure, including impacts on agriculture, forestry, fisheries, energy, and tourism (high confidence)
* Increased risk of water scarcity and competition for resources, with potential impacts on food security and human settlements (medium to high confidence)

Overall, the impacts of climate change are far-reaching and can have significant consequences for both the environment and human societies. It is important to take action to mitigate and adapt to these impacts in order to minimize their effects.

def model_predict(df: pd.DataFrame):
    """Wraps the LLM call in a simple Python function.

    The function takes a pandas.DataFrame containing the input variables needed
    by your model, and must return a list of the outputs (one for each row).
    """
    return [climate_qa_chain.run({"query": question}) for question in df["question"]]

# Don’t forget to fill the `name` and `description`: they are used by Giskard
# to generate domain-specific tests.
giskard_model = giskard.Model(
    model=model_predict,
    model_type="text_generation",
    name="Climate Change Question Answering",
    description="This model answers any question about climate change based on IPCC reports",
    feature_names=["question"],
)


2024-06-20 17:56:46,379 pid:3414141 MainThread giskard.models.automodel INFO     Your 'prediction_function' is successfully wrapped by Giskard's 'PredictionFunctionModel' wrapper class.


scan_results = giskard.scan(model=giskard_model)

full results

2024-06-20 17:56:49,828 pid:3414141 MainThread httpx        INFO     HTTP Request: POST "HTTP/1.1 429 Too Many Requests"
2024-06-20 17:56:49,829 pid:3414141 MainThread openai._base_client INFO     Retrying request to /chat/completions in 0.823091 seconds
2024-06-20 17:56:50,729 pid:3414141 MainThread httpx        INFO     HTTP Request: POST "HTTP/1.1 429 Too Many Requests"
2024-06-20 17:56:50,729 pid:3414141 MainThread openai._base_client INFO     Retrying request to /chat/completions in 1.627491 seconds
2024-06-20 17:56:52,434 pid:3414141 MainThread httpx        INFO     HTTP Request: POST "HTTP/1.1 429 Too Many Requests"
RateLimitError                            Traceback (most recent call last)
Cell In[7], line 1
----> 1 scan_results = giskard.scan(model=giskard_model)

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/giskard/scanner/, in scan(model, dataset, features, params, only, verbose, raise_exceptions)
     35 """Automatically detects model vulnerabilities.
     37 See :class:`Scanner` for more details.
     61     A scan report object containing the results of the scan.
     62 """
     63 scanner = Scanner(params, only=only)
---> 64 return scanner.analyze(
     65     model, dataset=dataset, features=features, verbose=verbose, raise_exceptions=raise_exceptions
     66 )

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/giskard/scanner/, in Scanner.analyze(self, model, dataset, features, verbose, raise_exceptions)
     77 """Runs the analysis of a model and dataset, detecting issues.
     79 Parameters
     96     A report object containing the detected issues and other information.
     97 """
     98 with TemporaryRootLogLevel(logging.INFO if verbose else logging.NOTSET):
     99     # Check that the model and dataset were appropriately wrapped with Giskard
--> 100     model, dataset, model_validation_time = self._validate_model_and_dataset(model, dataset)
    102     # Check that provided features are valid
    103     features = self._validate_features(features, model, dataset)

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/giskard/scanner/, in Scanner._validate_model_and_dataset(self, model, dataset)
    204 if dataset is not None and not isinstance(dataset, Dataset):
    205     raise ValueError(
    206         "The dataset object you provided is not valid. Please wrap your dataframe with `giskard.Dataset`. "
    207         "You can follow the docs here:"
    208     )
--> 210 model, dataset = self._prepare_model_dataset(model, dataset)
    212 if not model.is_text_generation:
    213     time_start = perf_counter()

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/giskard/scanner/, in Scanner._prepare_model_dataset(self, model, dataset)
    267 logger.debug("Automatically generating test dataset.")
    268 try:
--> 269     return model, generate_test_dataset(model)
    270 except LLMGenerationError:
    271     warning(
    272         "Failed to generate test dataset. Trying to run the scan with an empty dataset. For improved results, please provide a test dataset."
    273     )

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/giskard/llm/, in generate_test_dataset(model, num_samples, temperature, column_types, llm_seed)
     14 """Generates a synthetic test dataset using an LLM.
     16 Parameters
     41 :class:`giskard.llm.generators.BaseDataGenerator`
     42 """
     43 generator = SimpleDataGenerator(llm_temperature=temperature, llm_seed=llm_seed)
---> 45 return generator.generate_dataset(model, num_samples, column_types)

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/giskard/llm/generators/, in _BaseLLMGenerator.generate_dataset(self, model, num_samples, column_types)
     41 """Generates a test dataset for the model.
     43 Parameters
     60     If the generation fails.
     61 """
     62 messages = self._format_messages(model, num_samples, column_types)
---> 64 out = self.llm_client.complete(
     65     messages=messages,
     66     temperature=self.llm_temperature,
     67     caller_id=self.__class__.__name__,
     68     seed=self.llm_seed,
     69     format="json",
     70 )
     72 generated = self._parse_output(out)
     74 dataset = Dataset(
     75     df=pd.DataFrame(generated),
     76     name=self._make_dataset_name(model),
     77     validation=False,
     78     column_types=column_types,
     79 )

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/giskard/llm/client/, in OpenAIClient.complete(self, messages, temperature, max_tokens, caller_id, seed, format)
     60         extra_params["response_format"] = {"type": "json_object"}
     62 try:
---> 63     completion =
     64         model=self.model,
     65         messages=[asdict(m) for m in messages],
     66         temperature=temperature,
     67         max_tokens=max_tokens,
     68         **extra_params,
     69     )
     70 except openai.AuthenticationError as err:
     71     raise LLMConfigurationError(AUTH_ERROR_MESSAGE) from err

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/sentry_sdk/, in ensure_integration_enabled.<locals>.patcher.<locals>.runner(*args, **kwargs)
   1708 if sentry_sdk.get_client().get_integration(integration) is None:
   1709     return original_function(*args, **kwargs)
-> 1711 return sentry_patched_function(*args, **kwargs)

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/sentry_sdk/integrations/, in _wrap_chat_completion_create.<locals>.new_chat_completion(*args, **kwargs)
    151     _capture_exception(e)
    152     span.__exit__(None, None, None)
--> 153     raise e from None
    155 integration = sentry_sdk.get_client().get_integration(OpenAIIntegration)
    157 with capture_internal_exceptions():

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/sentry_sdk/integrations/, in _wrap_chat_completion_create.<locals>.new_chat_completion(*args, **kwargs)
    147 span.__enter__()
    148 try:
--> 149     res = f(*args, **kwargs)
    150 except Exception as e:
    151     _capture_exception(e)

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/openai/_utils/, in required_args.<locals>.inner.<locals>.wrapper(*args, **kwargs)
    275             msg = f"Missing required argument: {quote(missing[0])}"
    276     raise TypeError(msg)
--> 277 return func(*args, **kwargs)

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/openai/resources/chat/, in Completions.create(self, messages, model, frequency_penalty, function_call, functions, logit_bias, logprobs, max_tokens, n, parallel_tool_calls, presence_penalty, response_format, seed, service_tier, stop, stream, stream_options, temperature, tool_choice, tools, top_logprobs, top_p, user, extra_headers, extra_query, extra_body, timeout)
    606 @required_args(["messages", "model"], ["messages", "model", "stream"])
    607 def create(
    608     self,
    638     timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
    639 ) -> ChatCompletion | Stream[ChatCompletionChunk]:
--> 640     return self._post(
    641         "/chat/completions",
    642         body=maybe_transform(
    643             {
    644                 "messages": messages,
    645                 "model": model,
    646                 "frequency_penalty": frequency_penalty,
    647                 "function_call": function_call,
    648                 "functions": functions,
    649                 "logit_bias": logit_bias,
    650                 "logprobs": logprobs,
    651                 "max_tokens": max_tokens,
    652                 "n": n,
    653                 "parallel_tool_calls": parallel_tool_calls,
    654                 "presence_penalty": presence_penalty,
    655                 "response_format": response_format,
    656                 "seed": seed,
    657                 "service_tier": service_tier,
    658                 "stop": stop,
    659                 "stream": stream,
    660                 "stream_options": stream_options,
    661                 "temperature": temperature,
    662                 "tool_choice": tool_choice,
    663                 "tools": tools,
    664                 "top_logprobs": top_logprobs,
    665                 "top_p": top_p,
    666                 "user": user,
    667             },
    668             completion_create_params.CompletionCreateParams,
    669         ),
    670         options=make_request_options(
    671             extra_headers=extra_headers, extra_query=extra_query, extra_body=extra_body, timeout=timeout
    672         ),
    673         cast_to=ChatCompletion,
    674         stream=stream or False,
    675         stream_cls=Stream[ChatCompletionChunk],
    676     )

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/openai/, in, path, cast_to, body, options, files, stream, stream_cls)
   1236 def post(
   1237     self,
   1238     path: str,
   1245     stream_cls: type[_StreamT] | None = None,
   1246 ) -> ResponseT | _StreamT:
   1247     opts = FinalRequestOptions.construct(
   1248         method="post", url=path, json_data=body, files=to_httpx_files(files), **options
   1249     )
-> 1250     return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/openai/, in SyncAPIClient.request(self, cast_to, options, remaining_retries, stream, stream_cls)
    922 def request(
    923     self,
    924     cast_to: Type[ResponseT],
    929     stream_cls: type[_StreamT] | None = None,
    930 ) -> ResponseT | _StreamT:
--> 931     return self._request(
    932         cast_to=cast_to,
    933         options=options,
    934         stream=stream,
    935         stream_cls=stream_cls,
    936         remaining_retries=remaining_retries,
    937     )

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/openai/, in SyncAPIClient._request(self, cast_to, options, remaining_retries, stream, stream_cls)
   1013 if retries > 0 and self._should_retry(err.response):
   1014     err.response.close()
-> 1015     return self._retry_request(
   1016         options,
   1017         cast_to,
   1018         retries,
   1019         err.response.headers,
   1020         stream=stream,
   1021         stream_cls=stream_cls,
   1022     )
   1024 # If the response is streamed then we need to explicitly read the response
   1025 # to completion before attempting to access the response text.
   1026 if not err.response.is_closed:

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/openai/, in SyncAPIClient._retry_request(self, options, cast_to, remaining_retries, response_headers, stream, stream_cls)
   1059 # In a synchronous context we are blocking the entire thread. Up to the library user to run the client in a
   1060 # different thread if necessary.
   1061 time.sleep(timeout)
-> 1063 return self._request(
   1064     options=options,
   1065     cast_to=cast_to,
   1066     remaining_retries=remaining,
   1067     stream=stream,
   1068     stream_cls=stream_cls,
   1069 )

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/openai/, in SyncAPIClient._request(self, cast_to, options, remaining_retries, stream, stream_cls)
   1013 if retries > 0 and self._should_retry(err.response):
   1014     err.response.close()
-> 1015     return self._retry_request(
   1016         options,
   1017         cast_to,
   1018         retries,
   1019         err.response.headers,
   1020         stream=stream,
   1021         stream_cls=stream_cls,
   1022     )
   1024 # If the response is streamed then we need to explicitly read the response
   1025 # to completion before attempting to access the response text.
   1026 if not err.response.is_closed:

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/openai/, in SyncAPIClient._retry_request(self, options, cast_to, remaining_retries, response_headers, stream, stream_cls)
   1059 # In a synchronous context we are blocking the entire thread. Up to the library user to run the client in a
   1060 # different thread if necessary.
   1061 time.sleep(timeout)
-> 1063 return self._request(
   1064     options=options,
   1065     cast_to=cast_to,
   1066     remaining_retries=remaining,
   1067     stream=stream,
   1068     stream_cls=stream_cls,
   1069 )

File /home/michael/anaconda3/envs/giskard/lib/python3.9/site-packages/openai/, in SyncAPIClient._request(self, cast_to, options, remaining_retries, stream, stream_cls)
   1029     log.debug("Re-raising status error")
-> 1030     raise self._make_status_error_from_response(err.response) from None
   1032 return self._process_response(
   1033     cast_to=cast_to,
   1034     options=options,
   1037     stream_cls=stream_cls,
   1038 )

RateLimitError: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs:', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}
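The payload in this error can also be checked programmatically. A small sketch, assuming the dictionary shape shown in the traceback above; the helper name `is_quota_error` is hypothetical. An `insufficient_quota` type would be consistent with the request having reached an account-billed OpenAI endpoint rather than a local server.

```python
from typing import Any, Dict

# Error body copied from the traceback in this report (message shortened).
payload: Dict[str, Any] = {
    "error": {
        "message": "You exceeded your current quota, please check your plan and billing details.",
        "type": "insufficient_quota",
        "param": None,
        "code": "insufficient_quota",
    }
}


def is_quota_error(body: Dict[str, Any]) -> bool:
    """Return True when the error body reports an exhausted quota."""
    error = body.get("error") or {}
    return error.get("type") == "insufficient_quota"


print(is_quota_error(payload))
```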

osok, Jun 20 '24 22:06