Add --no-stream flag for reasoning models
AI Assistant:
Summary
This PR adds a minimal `--no-stream` CLI flag to disable response streaming, resolving failures with reasoning models (o1, o3, o3-pro) that either require organization verification to stream or don't support streaming on certain platforms (e.g. Azure OpenAI).
Fixes #430
Problem
Currently, mods hardcodes streaming for all API calls, which causes 400 errors for:
- o3/o3-mini/o3-pro models without org verification: `"param": "stream", "code": "unsupported_value"`
- o1 models on Azure OpenAI (streaming not supported)
- Any reasoning model with streaming restrictions
Solution
Add a `--no-stream` flag that:
- Disables streaming when set
- Uses regular Chat Completion API instead of streaming API
- Maintains backward compatibility (streaming enabled by default)
- Implements a `NonStreamingWrapper` to provide a consistent `stream.Stream` interface (see the sketches below)
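As a rough, self-contained sketch of the flow (hypothetical names throughout: the `Stream` interface stands in for mods' internal `stream.Stream`, and `openStream`/`fetchCompletion` stand in for the streaming and non-streaming Chat Completions calls; none of these identifiers are the actual ones in this PR):

```go
import "context"

// Stream is a stand-in for the stream.Stream interface in mods;
// the real interface may differ in shape.
type Stream interface {
	Next() bool      // advance to the next chunk
	Current() string // text of the current chunk
	Err() error      // terminal error, if any
	Close() error    // release the underlying connection
}

// Request mirrors the new proto.Request field: a nil Stream means
// "default" (streaming on); an explicit false disables streaming.
type Request struct {
	Prompt string
	Stream *bool
}

// send dispatches on the Stream field.
func send(ctx context.Context, req Request,
	openStream func(context.Context, Request) Stream,
	fetchCompletion func(context.Context, Request) (string, error),
) Stream {
	if req.Stream != nil && !*req.Stream {
		// One blocking call; wrap the full response so callers can
		// still consume it through the Stream interface.
		text, err := fetchCompletion(ctx, req)
		return &NonStreamingWrapper{content: text, err: err}
	}
	return openStream(ctx, req) // default: stream tokens as they arrive
}
```

The `NonStreamingWrapper` it returns is sketched after the Changes list below.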
Changes
1. config.go (2 changes)
   - Add `NoStream bool` field to Config struct (line 148)
   - Add help text for `--no-stream` flag (line 47)
2. main.go (1 change)
   - Register `--no-stream` CLI flag (line 270)
3. internal/proto/proto.go (1 change)
   - Add `Stream *bool` field to Request struct (line 78)
4. mods.go (1 change)
   - Pass `NoStream` config to request (lines 440-443)
5. internal/openai/openai.go (2 changes)
   - Implement conditional streaming logic (lines 96-106; sketched above)
   - Add `NonStreamingWrapper` struct implementing `stream.Stream` (lines 197-254; sketched below)
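A sketch of the wrapper itself, continuing the hypothetical example above (the actual struct at lines 197-254 of internal/openai/openai.go will differ in detail): it satisfies the `Stream` stand-in by yielding the complete response exactly once.

```go
// NonStreamingWrapper adapts a single, complete chat response to the
// Stream interface: Next reports one chunk, after which the stream is done.
type NonStreamingWrapper struct {
	content string // the full completion text
	err     error  // error from the underlying request, if any
	done    bool   // set once the single chunk has been consumed
}

func (w *NonStreamingWrapper) Next() bool {
	if w.done || w.err != nil {
		return false
	}
	w.done = true
	return true
}

func (w *NonStreamingWrapper) Current() string { return w.content }
func (w *NonStreamingWrapper) Err() error      { return w.err }
func (w *NonStreamingWrapper) Close() error    { return nil } // no connection to tear down
```

Because the wrapper looks like any other stream to the rest of mods, downstream rendering code doesn't need to know whether streaming was actually used.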
Testing
Basic functionality tests
```bash
# Non-streaming mode works
echo "what is 2+2?" | ./mods -m gpt-4o-mini --no-stream
# Output: 2 + 2 equals 4.

# Streaming mode still works (default)
echo "what is 3+3?" | ./mods -m gpt-4o-mini
# Output: 3 + 3 equals 6.

# Help text includes new flag
./mods --help | grep no-stream
# Output: --no-stream  Disable streaming of responses (useful for reasoning models without streaming support)
```
Real-world o3 model test (proves the fix works)
```bash
# WITHOUT --no-stream: FAILS with streaming error
echo "test" | ./mods -m o3
# ERROR: 400 Bad Request
#   "message": "Your organization must be verified to stream this model..."
#   "param": "stream"
#   "code": "unsupported_value"

# WITH --no-stream: WORKS!
echo "test" | ./mods -m o3 --no-stream
# Output: Hello! It looks like you're testing the chat. How can I help you today?
# Conversation saved: 6f469bf test
```
This proves the `--no-stream` flag successfully resolves issue #430 for o3/o3-pro reasoning models.
Scope & Future Work
This PR implements ONLY the CLI flag. Per-model configuration in `config_template.yml` is intentionally left for future work to keep this change minimal and focused on resolving the immediate issue.
Future enhancements could include:
- Per-model `stream: false` configuration in config files
- Auto-detection of models that don't support streaming (see the sketch below)
- Azure OpenAI-specific handling
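If auto-detection were added, it could start as a simple prefix check on the model name (purely illustrative; neither this helper nor the prefix list exists in this PR):

```go
import "strings"

// requiresNoStream is a hypothetical future helper: it guesses whether
// a model should default to non-streaming. The prefix list is
// illustrative, not exhaustive or authoritative.
func requiresNoStream(model string) bool {
	return strings.HasPrefix(model, "o1") || strings.HasPrefix(model, "o3")
}
```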
Backward Compatibility
✅ Fully backward compatible:
- Default behavior unchanged (streaming enabled)
- Existing scripts/workflows continue to work
- New flag is opt-in only
🤖 Generated with Claude Code
AI Assistant:
CI Lint Status Update
The macOS lint failure is due to a pre-existing issue in the upstream codebase, not introduced by this PR.
Error (at mods.go:485): `os/exec.Command must not be called. use os/exec.CommandContext (noctx)`
Analysis:
- The flagged call is at line 485 of mods.go, while the `--no-stream` implementation only modifies lines 440-443, so the error is unrelated to the streaming changes
- This noctx violation already exists in the base branch (the standard fix is sketched below)
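For reference, the standard noctx remediation is to switch to the context-aware constructor from the standard library (shown generically here; the real call site at mods.go:485 would need its own context plumbing):

```go
import (
	"context"
	"os/exec"
)

// Before (flagged by noctx): exec.Command(name, args...) creates a
// subprocess that cannot be cancelled with its caller.
//
// After: exec.CommandContext ties the subprocess lifetime to ctx, so it
// is killed when the surrounding operation is cancelled.
func run(ctx context.Context, name string, args ...string) error {
	cmd := exec.CommandContext(ctx, name, args...)
	return cmd.Run()
}
```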
Files actually modified by this PR:
- config.go (lines 47, 148)
- main.go (line 270)
- internal/proto/proto.go (line 78)
- mods.go (lines 440-443)
- internal/openai/openai.go (lines 96-254)
Build Status: ✅ All build checks pass on Ubuntu, macOS, and Windows
Testing: Successfully tested with the o3 model:
- WITHOUT `--no-stream`: 400 error (streaming not allowed)
- WITH `--no-stream`: works successfully ✅
Request: The maintainers may want to address the pre-existing noctx issue separately, or accept this PR despite the lint warning since it's unrelated to the changes introduced here.