
Memory leak

Open • a383615194 opened this issue 4 months ago • 6 comments

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • [X] This is an issue with the Python library

Describe the bug

I am using the AsyncAzureOpenAI class to instantiate a client and calling client.chat.completions.create with streaming. Even though I call close() on both the client and the response in a try/finally block, memory keeps growing and eventually crashes the server. I tried the fix outlined in https://github.com/openai/openai-python/issues/1181 (upgrading pydantic to 2.6.3), but it did not resolve the issue. Using the gc module, I can see memory usage increase after every call to the service. Our service centrally manages AzureOpenAI accounts, so a client is instantiated for every incoming request. Given the concurrent nature of the service, can client.with_options be used safely from concurrent requests instead? Is there a good way to address this memory leak?
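
For reference, the pattern I am asking about would look roughly like this: one long-lived client shared by all requests, with per-request credentials supplied through with_options(). This is only a sketch (shared_client and handle_request are illustrative names, and I have not confirmed that with_options() is safe under heavy concurrency):

import httpx
import openai

# One client for the whole process; api_version etc. come from our config.
shared_client = openai.AsyncAzureOpenAI(
    api_version=api_version,
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    http_client=httpx.AsyncClient(proxies=config.api_proxy),
    max_retries=0,
)

async def handle_request(prompt, request_api_key):
    # with_options() returns a copy that shares the underlying httpx
    # connection pool, so the copy itself should not be close()d here;
    # only the stream is closed.
    client = shared_client.with_options(api_key=request_api_key)
    stream = await client.chat.completions.create(**prompt)
    try:
        async for chunk in stream:
            ...
    finally:
        await stream.close()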

To Reproduce

Make several calls in a row, for example to the embeddings endpoint, from async code.
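
A minimal loop in the same spirit, with the growth made visible via tracemalloc (a sketch: the per-call client creation mirrors our service, while the model name and everything else are illustrative):

import asyncio
import gc
import tracemalloc

import httpx
import openai

async def one_call(i):
    # As in our service, a fresh client (and httpx pool) per call.
    client = openai.AsyncAzureOpenAI(
        api_version=api_version,
        api_key=api_key,
        azure_endpoint=azure_endpoint,
        http_client=httpx.AsyncClient(proxies=config.api_proxy),
        max_retries=0,
    )
    try:
        await client.embeddings.create(
            model="text-embedding-ada-002",  # illustrative deployment name
            input=f"probe {i}",
        )
    finally:
        await client.close()

async def main():
    tracemalloc.start()
    for i in range(50):
        await one_call(i)
        gc.collect()
        current, _peak = tracemalloc.get_traced_memory()
        # Memory climbs across iterations even after gc and client.close().
        print(f"iteration {i}: current={current / 1024:.0f} KiB")

asyncio.run(main())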

Code snippets

import json
from concurrent.futures import ThreadPoolExecutor

import httpx
import openai
import tornado.web

# config, api_version, api_key, azure_endpoint, prompt, and json_data come
# from the surrounding application and are elided here.


class LlmStreamApiHandler(tornado.web.RequestHandler):
    executor = ThreadPoolExecutor(200)

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.set_header('Content-Type', 'text/event-stream')
        self.set_header('Access-Control-Allow-Origin', '*')
        self.set_header('Access-Control-Allow-Headers', '*')
        self.set_header('Access-Control-Allow-Methods', '*')

    def on_finish(self):
        # No extra cleanup; defers to the base class.
        return super().on_finish()

    async def post(self):
        try:
            result = await self.process(...)
        except Exception as e:
            ...  # error handling elided
            return  # avoid using an unbound `result` below

        self.write(json.dumps(result) + "\n")
        await self.flush()

    async def process(self, ...):
        # A new OpenAI client (and a new httpx.AsyncClient) is created for
        # every incoming request.
        client = openai.AsyncAzureOpenAI(
            api_version=api_version,
            api_key=api_key,
            azure_endpoint=azure_endpoint,
            http_client=httpx.AsyncClient(
                proxies=config.api_proxy,
            ),
            max_retries=0,
        )
        response_text = None
        try:
            response_text = await client.chat.completions.create(**prompt)
            async for chunk in response_text:
                chunk = chunk.model_dump()
                # Skip keep-alive chunks that carry no data.
                if chunk['choices'] == [] and chunk['id'] == "" and chunk['model'] == "" and chunk['object'] == "":
                    continue
                chunk_message = chunk['choices'][0]['delta']
                current_text = chunk_message.get('content', '')
                if chunk_message and current_text:
                    ...
                elif chunk['choices'][0]['finish_reason'] == "stop":
                    break
                elif current_text == '' and chunk_message.get('role', '') == "assistant":
                    ...
                elif chunk['choices'][0]['finish_reason'] == "content_filter":
                    ...
                else:
                    continue
                # json_data is built in the elided branches above.
                self.write(json.dumps(json_data) + "\n")
                await self.flush()
        except Exception as e:
            ...
            raise
        finally:
            # Both the stream and the client are closed, yet memory still grows.
            if response_text is not None:
                await response_text.close()
            await client.close()
        return ...
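
For completeness, this is a variant of process() I am considering, using context managers so the httpx pool is released even if client construction or the request raises before the finally block runs. A sketch only: I assume closing the client and then the already-closed httpx client is harmless, and I have not confirmed this avoids the leak.

async def process(self, prompt):
    async with httpx.AsyncClient(proxies=config.api_proxy) as http_client:
        client = openai.AsyncAzureOpenAI(
            api_version=api_version,
            api_key=api_key,
            azure_endpoint=azure_endpoint,
            http_client=http_client,
            max_retries=0,
        )
        try:
            stream = await client.chat.completions.create(**prompt)
            try:
                async for chunk in stream:
                    ...  # same chunk handling as above
            finally:
                await stream.close()
        finally:
            await client.close()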

OS

CentOS

Python version

Python 3.8

Library version

openai v1.12.0

a383615194 • Mar 19 '24 09:03