weave icon indicating copy to clipboard operation
weave copied to clipboard

Multiple architectural issues preventing clean imports and gql 4.0 compatibility

Open andrewmonostate opened this issue 4 months ago • 5 comments

Problem 1: gql 4.0+ Incompatibility

Description: Weave fails when users have gql 4.0+ installed due to a breaking change in the execute() method signature.

Error:

TypeError: SyncClientSession.execute() takes 2 positional arguments but 3 were given

Root Cause:

  • gql 3.x: session.execute(query, variables) - variables as positional argument
  • gql 4.x: session.execute(query, variable_values=variables) - keyword-only argument
  • Weave's pyproject.toml doesn't pin gql version, allowing 4.0+ to be installed
  • Code in weave/wandb_interface/wandb_api.py only supports gql 3.x syntax

Impact:

  • Anyone with gql 4.0+ installed cannot use Weave
  • Fresh installations may get gql 4.0 by default, breaking immediately

Problem 2: Client code requires server dependencies (clickhouse_connect)

Description: Client-side code in weave/trace/ unnecessarily requires clickhouse_connect even though it's only needed for the trace server functionality.

Error when importing without clickhouse_connect:

>>> from weave.trace import refs
ModuleNotFoundError: No module named 'clickhouse_connect'

Root Cause:

  • weave/trace/refs.py and weave/trace/vals.py import ObjectDeletedError from weave/trace_server/errors.py
  • trace_server/errors.py unconditionally imports clickhouse_connect at module level
  • This creates a dependency chain: client → server → clickhouse

Impact:

  • Users who only want client functionality must install server dependencies
  • Testing is complicated - requires server dependencies even for client tests
  • Violates separation of concerns principle
  • Makes the package heavier than necessary for client-only use

Steps to Reproduce

Problem 1 (gql):

pip install weave
pip install gql==4.0.0
python -c "from weave.wandb_interface.wandb_api import WandbApi"
# TypeError occurs

Problem 2 (clickhouse):

# Fresh environment without clickhouse_connect
pip install weave[core]  # hypothetically client-only
python -c "from weave.trace import refs"
# ModuleNotFoundError: No module named 'clickhouse_connect'

Expected Behavior

  1. Weave should work with both gql 3.x and 4.x versions
  2. Client code should be importable without server dependencies

Environment

  • Python: 3.9+
  • Weave: latest
  • gql: 4.0.0
  • OS: Any

Suggested Solutions

  1. For gql compatibility: Add version detection and use appropriate method signature
  2. For dependency separation: Move shared exceptions to a common module that doesn't require server dependencies

Both issues indicate architectural improvements needed for better modularity and compatibility.

andrewmonostate avatar Aug 18 '25 06:08 andrewmonostate

I hit this too, thanks for investigating and providing a workaround!

nicsuzor avatar Aug 20 '25 01:08 nicsuzor

@tssweeney @andrewtruong can anyone review this to close it?

andrewmonostate avatar Sep 04 '25 18:09 andrewmonostate

Hey @andrewmonostate, thanks for the PR!

I think we resolved this here: https://github.com/wandb/weave/pull/5290

andrewtruong avatar Sep 08 '25 16:09 andrewtruong

Hey @andrewmonostate, thanks for the PR!

I think we resolved this here: https://github.com/wandb/weave/pull/5290

@andrewtruong Thanks for the quick response and for #5290.

A couple clarifications on scope & durability:

What my PR (#5288) fixes vs #5290 • gql 4.x signature break: • #5290 go on gql’s temporary compat shim so session.execute(query, kwargs) still works. • #5288 updates the call sites to the stable API (variable_values=kwargs) for both sync/async, i.e., it’s future-proof even if gql removes the shim in a minor or major. • Client/server dependency leak (clickhouse_connect): • #5290 doesn’t address this. • #5288 separates client imports from server-only deps so from weave.trace import refs doesn’t force installing ClickHouse client libraries just to use client-side features.

So: #5290 is a good short-term unblock, but #5288 is the long-term fix + modularity/maintainability improvement.

On mixing fixes in one PR, sorry I bundled things. The reason I did is practical: right now the repo’s own README acknowledges ongoing re-org, and these two issues surface together in fresh installs (users hit gql breakage and then immediately hit the server-dep import chain). Fixing them together made the environment actually usable and keeps us from regressions.

andrewmonostate avatar Sep 08 '25 16:09 andrewmonostate

@andrewtruong any comments?

andrewmonostate avatar Oct 01 '25 21:10 andrewmonostate