[Feature Request]: Add a close method to HttpClient
Describe the problem
from fastapi import FastAPI
from chromadb import HttpClient

app = FastAPI()

@app.get("/get_collections")
async def get_collections():
    try:
        # A new client (and a new connection pool) is created on every request
        client = HttpClient(host="127.0.0.1", port=8800)
        response = client.list_collections()
        result = []
        for i in response:
            temp = {
                "id": i.id,
                "name": i.name,
                "count": i.count(),  # sync client: count() is not awaitable
            }
            result.append(temp)
        return result
    except Exception:
        raise
When I encapsulate ChromaDB-related services as APIs using FastAPI, each API call generates a new TCP connection that isn't released after completion. During frequent 'add' operations, this leads to a massive accumulation of temporary ports being occupied, eventually causing the process to freeze.
While AsyncHttpClient doesn't have this issue, I'd like HttpClient to gain an explicit close() method for actively releasing its connections.
Describe the proposed solution
I want to add a close method to the client to improve request management.
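A minimal sketch of what such a close() could look like. All names here (MiniHttpClient, _session, _StubSession) are hypothetical, not Chroma's actual internals, and a stub stands in for the httpx session so the sketch runs without a server:

```python
# Hypothetical sketch of the requested API; names are illustrative only.
class _StubSession:
    """Stands in for httpx.Client, which holds a pool of open TCP sockets."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class MiniHttpClient:
    def __init__(self):
        # In the real client this would be an httpx.Client instance.
        self._session = _StubSession()

    def close(self):
        # Release the pooled TCP connections held by the underlying session.
        self._session.close()


client = MiniHttpClient()
client.close()
```

A natural extension would be `__enter__`/`__exit__` support so the client can be used as a context manager and closed automatically.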
Alternatives considered
No response
Importance
would make my life easier
Additional Information
No response
Hey @llmadd, we've had some prior work on that (#2581), and you're right: there are situations where TCP sockets are left open by httpx (which, by the way, uses connection pooling).
Thanks for your work. I'd also like to ask - if connections remain unclosed, could this prevent memory from being released? I'm frequently encountering memory issues now where the service becomes unresponsive, but I'm not sure why.
httpx defaults to 10 connections. Just 10 connections on their own should not cause memory leaks. Assuming you have a Python backend that uses Chroma (server) via HttpClient, is your Python backend the one crashing, or is Chroma crashing?
Hello, I'd like to ask: after a VACUUM operation was interrupted, it seems some data was corrupted. Is this repairable?
Additionally, I have about 1 million records with 3584 dimensions each. Why are queries and COUNT operations running so slowly? My server configuration is 8 cores with 64GB RAM.
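For scale, a back-of-envelope estimate of the raw vector footprint at these numbers (assuming float32 embeddings; HNSW graph links and SQLite metadata come on top of this):

```python
# Rough size of 1M vectors at 3584 dimensions, float32 (4 bytes per value).
n_vectors = 1_000_000
dimensions = 3584
bytes_per_value = 4

total_bytes = n_vectors * dimensions * bytes_per_value
total_gib = total_bytes / 2**30
print(f"{total_gib:.2f} GiB")  # roughly 13.35 GiB of raw vectors alone
```

If the index plus overhead doesn't fully fit in RAM, cold queries can fall back to disk, which could explain the "cached queries fast, new queries slow" symptom described below.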
I tried setting up a config file for startup that looks something like this:
port: 8001
listen_address: "0.0.0.0"
max_payload_size_bytes: 104857600
cors_allow_origins: []
persist_path: "/soft/chromaDB/data_bak/20250415/"
allow_reset: false
vector_index:
  type: "hnsw"
  hnsw:
    m: 32
    ef_construction: 100
    ef_search: 200
    persist_interval_seconds: 1200
# Resource limits configuration
resource_limits:
  max_memory_ratio: 0.8
  max_threads: 12
sqlitedb:
  cache_size: -4000000
  synchronous: 0
  journal_mode: "WAL"
  page_size: 4096
  temp_store: 2
  mmap_size: 8589934592
Cached data queries are very fast, but new queries remain extremely slow. However, the server's memory isn't being fully utilized. Could you help me resolve this? Thank you.
Hi @llmadd, going back to the original problem: the issue is that you are creating a new client on every API request. You should instead have one global client and reuse it across your APIs.
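The suggested fix can be sketched as a process-wide shared instance; `functools.lru_cache` gives a simple lazy singleton. A placeholder object stands in for the real `chromadb.HttpClient` so the sketch runs without a Chroma server:

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_client():
    # In the real app this line would be:
    #     return chromadb.HttpClient(host="127.0.0.1", port=8800)
    # A placeholder keeps the sketch runnable standalone.
    return object()


# Every endpoint calls get_client() and receives the same instance,
# so the underlying connection pool is reused instead of growing
# with each request.
assert get_client() is get_client()
```

With FastAPI specifically, the client could also be created once at app startup (e.g. in a lifespan handler) and shared the same way.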
Okay, thank you. That's indeed the right way to do it.