MyChatGPT
MyChatGPT copied to clipboard
Detected ML Service Misuses
Hello, I hope you are doing well.
As part of our analysis, we detected several ML service misuses in your repository. The refactored code snippets provided below were generated with the assistance of an LLM after we identified the misuse and manually validated the refactoring to ensure correctness.
For further details on each misuse definition, please consult our ML Service Misuse Catalog (the hyperlink labelled "here" in the original message).
📝 Notes
We did not submit a Pull Request, as we did not want to risk breaking or affecting your project’s workflow or CI/CD pipeline. Instead, we are providing the information below so you can review and integrate the changes at your convenience.
🔍 Detected ML Service Misuses
Below is the list of misuses detected, each with:
- A short explanation
- Refactored code examples
1️⃣ Ignoring Monitoring Data Drift
- Description: No mechanism to detect input/output drift across time.
- Refactored version:
from openai import AzureOpenAI, OpenAI
import argparse
import readline
import base64
# Azure OpenAI client, constructed once at import time and shared module-wide.
# NOTE(review): credentials are hard-coded placeholders — load the real endpoint
# and key from environment variables or a secret store rather than committing them.
client = AzureOpenAI(
    azure_endpoint="INPUT_YOUR_ENDPOINT_URL",
    api_version="2024-06-01",
    api_key="INPUT_YOUR_API_KEY",
    timeout=120
)
# Name of the Azure deployment to send requests to.
deployment_model = "INPUT_YOUR_MODEL_NAME"
# Disabled alternative: plain (non-Azure) OpenAI configuration, kept as a
# bare string literal in the original so it is evaluated and discarded.
'''
client = OpenAI(
api_key="INPUT_YOUR_API_KEY",
timeout=120
)
deployment_model = "gpt-4"
'''
# Default system prompt used to seed a new conversation.
prompt = "You are ChatGPT, a large language model trained by OpenAI."

# Default sampling parameters for chat-completion requests.
creation_params = dict(
    temperature=1.0,
    frequency_penalty=0.0,
    presence_penalty=0.0,
)

# ANSI escape sequences for colouring terminal output (SGR codes 91-96).
RED, GREEN, YELLOW, BLUE, PURPLE, CYAN = (f"\033[9{n}m" for n in range(1, 7))
END = "\033[0m"  # reset all attributes

# Semantic aliases: which colour each message source is printed in.
SYSTEM_COLOR = BLUE
ASSIS_COLOR = YELLOW
USER_COLOR = GREEN
ERROR_COLOR = RED
INFO_COLOR = PURPLE
class MyChatGPT:
    """Interactive terminal REPL around an (Azure) OpenAI chat-completion model.

    Maintains the running message history in ``self.conversation`` and sends
    the whole history on every turn.
    """

    def __init__(self, args):
        # Full message history sent to the API each turn; conversation_init
        # keeps a copy of the initial state (system prompt or loaded file).
        self.conversation = []
        self.conversation_init = []
        self.save_on_the_fly = args.save
        self.temperature = args.temperature
        self.frequency_penalty = args.frequency_penalty
        self.presence_penalty = args.presence_penalty
        print(INFO_COLOR + "NOTE: Type 'help' to view help information and some commands." + END)
        # FIX(review): this line was truncated (unterminated string literal) in
        # the original source; message text reconstructed — confirm upstream.
        print(INFO_COLOR + "      Your input must end with a newline." + END)
        if args.load is not None:
            self.load_from_file(args.load)
        else:
            self.conversation.append({"role": "system", "content": args.prompt})
            print(SYSTEM_COLOR + "system: \n" + args.prompt + END + "\n")
        self.conversation_init = self.conversation.copy()
        # ...

    def run(self):
        """Main REPL loop: read user input, call the model, print the reply."""
        while True:
            print(USER_COLOR + "user: " + END)
            user_input = self.multiline_input()
            if user_input is None:
                continue
            self.conversation.append({"role": "user", "content": user_input})
            # FIX: the original only issued the API request inside the
            # drift-check branch (len(conversation) > 1000), so `response`
            # was unbound — a NameError — on every normal turn. The request
            # must happen on every turn.
            try:
                response = client.chat.completions.create(
                    messages=self.conversation,
                    temperature=self.temperature,
                    frequency_penalty=self.frequency_penalty,
                    presence_penalty=self.presence_penalty,
                    model=deployment_model,
                )
            except Exception as err:
                print(ERROR_COLOR + "Error: " + str(err) + END)
                # Drop the user message that failed so history stays consistent.
                self.conversation = self.conversation[:-1]
                continue
            # Crude data-drift heuristic from the original snippet: once the
            # conversation grows very long, damp the temperature.
            # NOTE(review): the original also compared the new reply to the
            # *user* message just appended, which is always unequal; only the
            # length-based trigger is kept.
            if len(self.conversation) > 1000:
                self.temperature = 0.8
            self.conversation.append({"role": "assistant", "content": response.choices[0].message.content})
            print("\n" + ASSIS_COLOR + 'assistant: \n' + response.choices[0].message.content + END + "\n")
            if self.save_on_the_fly is not None:
                self.save_to_file(self.save_on_the_fly, on_the_fly=True)
        # ...
# Script entry point: build the CLI parser, construct the chat client, and
# start the REPL.
if __name__ == "__main__":
    arg_parser = argparse.ArgumentParser()
    # ...
    # NOTE(review): the snippet elides the argument definitions and the
    # `args = arg_parser.parse_args()` call — `args` is undefined as shown.
    mychatgpt = MyChatGPT(args)
    mychatgpt.run()
2️⃣ Improper Handling of ML API Limits
- Description: API rate limits or quota usage are not checked or handled.
- Refactored version:
class MyChatGPT:
    # ... (rest of the class remains unchanged)

    def run(self):
        """REPL loop with exponential backoff on API rate-limit errors."""
        import time  # FIX: `time.sleep` was used below but `time` was never imported

        while True:
            print(USER_COLOR + "user: " + END)
            user_input = self.multiline_input()
            if user_input is None:
                continue
            self.conversation.append({"role": "user", "content": user_input})
            max_retries = 3
            retries = 0
            # Sentinel so a non-retryable failure can be told apart below;
            # the original referenced `response` after a failure break, which
            # raised NameError.
            response = None
            while retries < max_retries:
                try:
                    response = client.chat.completions.create(
                        messages=self.conversation,
                        temperature=self.temperature,
                        frequency_penalty=self.frequency_penalty,
                        presence_penalty=self.presence_penalty,
                        model=deployment_model,
                    )
                    break
                except Exception as err:
                    if "API rate limit exceeded" in str(err):
                        print(ERROR_COLOR + "Error: API rate limit exceeded. Retrying..." + END)
                        # Exponential backoff: wait 1, 2, 4 seconds.
                        time.sleep(2 ** retries)
                        # FIX: the original never incremented `retries`, so a
                        # persistent rate limit looped forever and the
                        # while/else termination path was unreachable.
                        retries += 1
                    else:
                        print(ERROR_COLOR + "Error: " + str(err) + END)
                        # Drop the failed user message from the history.
                        self.conversation = self.conversation[:-1]
                        break
            else:
                # while/else: reached only when all retries were exhausted
                # without a successful break.
                print(ERROR_COLOR + "Error: Max retries exceeded. Conversation terminated." + END)
                return
            if response is None:
                # Non-retryable error already reported; prompt the user again.
                continue
            self.conversation.append({"role": "assistant", "content": response.choices[0].message.content})
            print("\n" + ASSIS_COLOR + 'assistant: \n' + response.choices[0].message.content + END + "\n")
            if self.save_on_the_fly is not None:
                self.save_to_file(self.save_on_the_fly, on_the_fly=True)
3️⃣ Ignoring Testing Schema Mismatch
- Description: Model is invoked without validating input schema or data type compatibility.
- Refactored version:
from openai import AzureOpenAI, OpenAI
import argparse
import readline
import base64
# ... (rest of the code remains unchanged)
class MyChatGPT:
    # ... (rest of the class definition remains unchanged)

    def run(self):
        """REPL loop that validates the request's message schema before each API call."""
        while True:
            print(USER_COLOR + "user: " + END)
            user_input = self.multiline_input()
            if user_input is None:
                continue
            self.conversation.append({"role": "user", "content": user_input})
            # FIX(review): the original passed `**{"schema": AzureOpenAI.Schema("test-schema")}`,
            # which is not part of the openai SDK — `chat.completions.create` has no
            # `schema` parameter and `AzureOpenAI` exposes no `Schema` class. Validate
            # the payload explicitly before sending it instead.
            problem = self._invalid_messages(self.conversation)
            if problem:
                print(ERROR_COLOR + "Error: invalid message schema: " + problem + END)
                print(ERROR_COLOR + "Please re-try.\n" + END)
                self.conversation = self.conversation[:-1]
                continue
            try:
                response = client.chat.completions.create(
                    messages=self.conversation,
                    temperature=self.temperature,
                    frequency_penalty=self.frequency_penalty,
                    presence_penalty=self.presence_penalty,
                    model=deployment_model,
                )
            except Exception as err:
                print(ERROR_COLOR + "Error: " + str(err) + END)
                print(ERROR_COLOR + "Please re-try.\n" + END)
                self.conversation = self.conversation[:-1]
                continue
            # ... (rest of the run method remains unchanged)

    @staticmethod
    def _invalid_messages(messages):
        """Return a description of the first schema violation in *messages*, or '' if valid.

        Each message must be a dict with a recognised "role" and a string "content".
        """
        valid_roles = {"system", "user", "assistant", "tool"}
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict):
                return f"message {i} is not a dict"
            if msg.get("role") not in valid_roles:
                return f"message {i} has invalid role {msg.get('role')!r}"
            if not isinstance(msg.get("content"), str):
                return f"message {i} content is not a string"
        return ""
# Script entry point: build the CLI parser, construct the chat client, and
# start the REPL.
if __name__ == "__main__":
    arg_parser = argparse.ArgumentParser()
    # ... (rest of the argument parsing code remains unchanged)
    # NOTE(review): `args = arg_parser.parse_args()` is elided above — `args`
    # is undefined as shown.
    mychatgpt = MyChatGPT(args)
    mychatgpt.run()