
Add --no-stream flag for reasoning models

gwpl opened this issue 3 months ago · 1 comment

AI Assistant:

Summary

This PR adds a minimal --no-stream CLI flag to disable response streaming, resolving failures with reasoning models (o1, o3, o3-pro) that either require organization verification before streaming is allowed or don't support streaming at all on certain platforms (Azure OpenAI).

Fixes #430

Problem

Currently, mods hardcodes streaming for all API calls, which causes 400 errors for:

  • o3/o3-mini/o3-pro models without org verification: the API rejects requests with "param": "stream", "code": "unsupported_value"
  • o1 models on Azure OpenAI, where streaming is not supported
  • Any other reasoning model with streaming restrictions

Solution

Add a --no-stream flag that:

  • Disables streaming when set
  • Uses the regular Chat Completions API instead of the streaming API
  • Keeps backward compatibility (streaming remains enabled by default)
  • Implements a NonStreamingWrapper that exposes the same stream.Stream interface to callers

Changes

1. config.go (2 changes)

  • Add NoStream bool field to Config struct (line 148)
  • Add help text for --no-stream flag (line 47)

2. main.go (1 change)

  • Register --no-stream CLI flag (line 270)
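
A rough Go sketch of what changes 1 and 2 might look like together. It is illustrative only: the exact Config struct tags, any help-text plumbing, and the flags variable are assumptions about how mods wires up its other boolean flags, not a copy of the diff.

// config.go (sketch): a boolean field on Config, defaulting to false so
// streaming stays on unless the user opts out.
type Config struct {
    // ...existing fields...
    NoStream bool
}

// main.go (sketch): register the flag next to the existing ones.
// `flags` stands in for whatever cobra/pflag flag set mods already uses.
flags.BoolVar(&config.NoStream, "no-stream", false,
    "Disable streaming of responses (useful for reasoning models without streaming support)")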

3. internal/proto/proto.go (1 change)

  • Add Stream *bool field to Request struct (line 78)
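
Using a pointer keeps the field three-valued, which is presumably the reason for *bool over bool: nil means "no preference, keep the default (streaming)", while a non-nil false turns streaming off. A minimal sketch, with the surrounding fields omitted:

// internal/proto/proto.go (sketch, not the actual struct layout)
type Request struct {
    // ...existing fields (model, messages, ...)...

    // Stream is nil when the caller has no preference (streaming stays on
    // by default) and points to false when --no-stream was given.
    Stream *bool
}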

4. mods.go (1 change)

  • Pass NoStream config to request (lines 440-443)
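
The hand-off from config to request is then a few lines; the variable names below are placeholders, not the real identifiers from mods.go:

// mods.go (sketch): translate the CLI flag into the request field.
if config.NoStream {
    stream := false // explicitly ask the provider for a non-streamed response
    request.Stream = &stream
}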

5. internal/openai/openai.go (2 changes)

  • Implement conditional streaming logic (lines 96-106)
  • Add NonStreamingWrapper struct implementing stream.Stream (lines 197-254)
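
For readers who want the shape of this change without opening the diff, here is a rough Go sketch of both pieces. It is a sketch under assumptions, not the actual implementation: the stream.Stream interface is assumed to expose something like Recv/Close, toChatRequest is a hypothetical helper, and the client is assumed to wrap the sashabaranov/go-openai types the project has historically used.

// internal/openai/openai.go (sketch, not the actual diff)
// Assumed imports: context, io, github.com/sashabaranov/go-openai,
// plus the project's internal proto and stream packages.

// Conditional streaming: fall back to a blocking call only when the
// request explicitly carries Stream == false.
func (c *Client) request(ctx context.Context, req proto.Request) stream.Stream {
    if req.Stream != nil && !*req.Stream {
        resp, err := c.api.CreateChatCompletion(ctx, toChatRequest(req)) // one-shot call
        return &NonStreamingWrapper{resp: resp, err: err}
    }
    // ...existing streaming path unchanged...
}

// NonStreamingWrapper adapts a single chat-completion response to the
// stream.Stream interface: it yields the whole message as one chunk,
// then reports end-of-stream.
type NonStreamingWrapper struct {
    resp openai.ChatCompletionResponse
    err  error
    done bool
}

func (w *NonStreamingWrapper) Recv() (string, error) {
    if w.err != nil {
        return "", w.err // surface the API error to the caller
    }
    if w.done {
        return "", io.EOF // second call: nothing left to deliver
    }
    w.done = true
    return w.resp.Choices[0].Message.Content, nil
}

func (w *NonStreamingWrapper) Close() error { return nil }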

Testing

Basic functionality tests

# Non-streaming mode works
echo "what is 2+2?" | ./mods -m gpt-4o-mini --no-stream
# Output: 2 + 2 equals 4.

# Streaming mode still works (default)
echo "what is 3+3?" | ./mods -m gpt-4o-mini
# Output: 3 + 3 equals 6.

# Help text includes new flag
./mods --help | grep no-stream
# Output: --no-stream  Disable streaming of responses (useful for reasoning models without streaming support)

Real-world o3 model test (proves the fix works)

# WITHOUT --no-stream: FAILS with streaming error
echo "test" | ./mods -m o3
# ERROR: 400 Bad Request
# "message": "Your organization must be verified to stream this model..."
# "param": "stream"
# "code": "unsupported_value"

# WITH --no-stream: WORKS!
echo "test" | ./mods -m o3 --no-stream
# Output: Hello! It looks like you're testing the chat. How can I help you today?
# Conversation saved: 6f469bf test

This proves the --no-stream flag successfully resolves issue #430 for o3/o3-pro reasoning models.

Scope & Future Work

This PR implements ONLY the CLI flag. Per-model configuration in config_template.yml is intentionally left for future work to keep this change minimal and focused on resolving the immediate issue.

Future enhancements could include:

  • Per-model stream: false configuration in config files
  • Auto-detection of models that don't support streaming
  • Azure OpenAI specific handling

Backward Compatibility

✅ Fully backward compatible:

  • Default behavior unchanged (streaming enabled)
  • Existing scripts/workflows continue to work
  • New flag is opt-in only

🤖 Generated with Claude Code

gwpl · Oct 01 '25, 21:10

AI Assistant:

CI Lint Status Update

The macOS lint failure is due to a pre-existing issue in the upstream codebase, not introduced by this PR.

Error: os/exec.Command must not be called. use os/exec.CommandContext (noctx) at mods.go:485

Analysis:

  • This error is at line 485 of mods.go
  • The --no-stream implementation only modifies lines 440-443 of mods.go
  • Line 485 is unrelated to the streaming changes
  • This noctx violation exists in the base branch
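
For context, the usual noctx fix is a one-line change at the offending call site, assuming a context is already in scope there (the names below are placeholders):

// Flagged by noctx: the spawned process cannot be canceled.
cmd := exec.Command(name, args...)

// Accepted by noctx: the process is tied to a context and is killed
// when that context is canceled.
cmd := exec.CommandContext(ctx, name, args...)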

Files actually modified by this PR:

  1. config.go (lines 47, 148)
  2. main.go (line 270)
  3. internal/proto/proto.go (line 78)
  4. mods.go (lines 440-443)
  5. internal/openai/openai.go (lines 96-254)

Build Status: ✅ All build checks pass on Ubuntu, macOS, and Windows

Testing: Successfully tested with o3 model:

  • WITHOUT --no-stream: 400 error (streaming not allowed)
  • WITH --no-stream: Works successfully ✅

Request: The maintainers may want to address the pre-existing noctx issue separately, or accept this PR despite the lint warning since it's unrelated to the changes introduced here.

gwpl · Oct 02 '25, 16:10