docs(cli): Add --audio, --audio-language, and --audio-provider flags documentation
Pull Request
Description
Adds documentation for audio CLI flags (--audio, --audio-language, --audio-provider) to enable speech-to-speech interactions and audio analysis via the CLI.
Type of Change
- [ ] π Bug fix (non-breaking change which fixes an issue)
- [ ] β¨ New feature (non-breaking change which adds functionality)
- [ ] π₯ Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [x] π Documentation update
- [ ] π§Ή Code refactoring (no functional changes)
- [ ] β‘ Performance improvement
- [ ] π§ͺ Test coverage improvement
- [ ] π§ Build/CI configuration change
Related Issues
- Fixes #AUDIO-036
- Related to #AUDIO-026
Changes Made
- Added audio flags to
generatecommand's Key Flags section - Updated Command Map with audio example for
streamcommand - Added new Audio Input section with:
- Usage examples (basic, with evaluation, with analytics)
- Audio flags reference table
- Supported formats (WAV, MP3, PCM 16-bit LE)
- Provider/model compatibility matrix
- Added Multimodal Flags reference table consolidating all media input flags
- Added audio troubleshooting entries
- Added Real-Time Speech Agents to Related Features
Example usage documented:
npx @juspay/neurolink stream "Respond to the following audio" \
--audio ./recordings/question.wav \
--audio-language en \
--provider google-ai
AI Provider Impact
- [ ] OpenAI
- [ ] Anthropic
- [x] Google AI/Vertex
- [ ] AWS Bedrock
- [ ] Azure OpenAI
- [ ] Hugging Face
- [ ] Ollama
- [ ] Mistral
- [ ] All providers
- [ ] No provider-specific changes
Component Impact
- [x] CLI
- [ ] SDK
- [ ] MCP Integration
- [ ] Streaming
- [ ] Tool Calling
- [ ] Configuration
- [x] Documentation
- [ ] Tests
Testing
- [ ] Unit tests added/updated
- [ ] Integration tests added/updated
- [ ] E2E tests added/updated
- [x] Manual testing performed
- [x] All existing tests pass
Test Environment
- OS: N/A (documentation only)
- Node.js version: N/A
- Package manager: N/A
Performance Impact
- [x] No performance impact
- [ ] Performance improvement
- [ ] Minor performance impact (acceptable)
- [ ] Significant performance impact (needs discussion)
Breaking Changes
None.
Screenshots/Demo
N/A
Checklist
- [x] My code follows the project's style guidelines
- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published
Additional Notes
Audio model names ending in -preview are noted as subject to change per provider documentation.
Original prompt
This section details on the original issue you should resolve
<issue_title>AUDIO-036: Update CLI Commands Documentation</issue_title> <issue_description>## Summary
Update the CLI commands documentation (
docs/cli/commands.md) to include --audio, --audio-language, and --audio-provider flags with descriptions and examples.Technical Details
- File(s):
docs/cli/commands.md- Effort: 0.5 hours
Acceptance Criteria
- [ ] --audio flag documented with examples
- [ ] --audio-language flag documented
- [ ] --audio-provider flag documented
- [ ] Usage examples provided
- [ ] Flag reference table updated
Dependencies
- Depends on: AUDIO-026
- Blocks: none
Priority: medium
Effort: 0.5h
Complexity: simple</issue_description><agent_instructions>Please create a Pull Request using the issue description and comments. Follow these rules:
- Write a clear, concise PR title.
- Summarize the problem and the solution in the PR description.
- List all major changes in bullet points.
- Add testing steps so reviewers can validate the changes.
- Mention any breaking changes, if applicable.
- Keep the PR easy to read and professional.</agent_instructions>
Comments on the Issue (you are @copilot in this section)
- Fixes juspay/neurolink#509
β¨ Let Copilot coding agent set things up for you β coding agent works faster and does higher quality work when set up for your repo.
Walkthrough
This pull request updates CLI commands documentation to comprehensively cover audio and multimodal capabilities. It adds detailed documentation for --audio, --audio-language, and --audio-provider flags with usage examples, best practices, supported formats, provider references, and audio-specific troubleshooting guidance.
Changes
| Cohort / File(s) | Summary |
|---|---|
Audio/Multimodal Documentation docs/cli/commands.md |
Introduces Audio Input section with flag documentation (--audio, --audio-language, --audio-provider), supported formats, and providers. Expands command metadata to reflect audio capabilities. Adds new Multimodal Flags reference table for generate/stream commands. Includes extensive audio usage examples, best practices, provider/model mappings, Real-Time Speech Agents reference link, and audio-specific troubleshooting notes. |
Estimated code review effort
π― 2 (Simple) | β±οΈ ~10 minutes
Possibly related PRs
- juspay/neurolink#141 β Adds REAL-TIME-SPEECH-AGENTS.md documentation, which is directly referenced in this PR's CLI commands guide.
Poem
π° Hop along, dear audio friends,
With flags and docs that never end,
The rabbit's typed with care so bright,
Making sound flows clear and right! β¨π§
Pre-merge checks and finishing touches
β Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | β Passed | Check skipped - CodeRabbitβs high-level summary is enabled. |
| Title check | β Passed | The title accurately and specifically summarizes the main change: adding documentation for three CLI audio flags, which aligns with the PR's core objective. |
| Linked Issues check | β Passed | All five acceptance criteria from issue #509 are met: --audio flag documented with examples, --audio-language flag documented, --audio-provider flag documented with defaults, usage examples provided, and flag reference tables updated. |
| Out of Scope Changes check | β Passed | Additional items beyond issue #509 (provider/model matrix, supported formats, audio troubleshooting, Real-Time Speech Agents link) are closely related enhancements that support the core audio documentation objective and are not out of scope. |
| Docstring Coverage | β Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
β¨ Finishing touches
π§ͺ Generate unit tests (beta)
- [ ] Create PR with unit tests
- [ ] Post copyable unit tests in a comment
- [ ] Commit unit tests in branch
copilot/update-cli-commands-docs
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Comment @coderabbitai help to get the list of available commands and usage tips.
@coderabbitai Can you check if the attached issue has been resolved by the pull request changes? also do full review
π€ AI Review & Build Compliance β
Status: AI analysis complete β’ Build rules validated β’ Ready for review
π View detailed analysis results
π‘οΈ Analysis Complete
- β Security scan (vulnerabilities, API keys)
- β TypeScript safety & code quality
- β Error handling & best practices
- β Build rule enforcement validated
- β Commit format & compliance checks
π Ready for Merge When
- [ ] All CI checks passing
- [ ] Manual review approved
- [ ] Any AI-flagged issues resolved
π€ AI analysis complete - check individual code comments for specific feedback
π§© Analysis chain
π Script executed:
#!/bin/bash
# Fetch issue AUDIO-036 details
echo "=== Fetching Issue AUDIO-036 ==="
gh issue view 509 --repo juspay/neurolink --json title,body,labels,state
echo -e "\n=== Fetching PR #537 Files Changed ==="
gh pr view 537 --repo juspay/neurolink --json files --jq '.files[] | "\(.path): +\(.additions) -\(.deletions)"'
echo -e "\n=== Fetching PR #537 Diff ==="
gh pr diff 537 --repo juspay/neurolink
Length of output: 13115
@murdore, I've verified that issue #509 (AUDIO-036) has been fully resolved by this PR. Here's the verification:
β Acceptance Criteria Status
All 5 acceptance criteria from the issue have been successfully addressed:
-
β --audio flag documented with examples
- Added to Key Flags section in
generatecommand - Included in Audio Flags reference table
- Listed in new Multimodal Flags table
- Multiple usage examples provided
- Added to Key Flags section in
-
β --audio-language flag documented
- Documented in Key Flags with clear description and examples (
en,es,fr) - Included in both Audio Flags and Multimodal Flags tables
- Usage explained in examples
- Documented in Key Flags with clear description and examples (
-
β --audio-provider flag documented
- Documented in Key Flags with default behavior explanation
- Included in reference tables
- Shows it defaults to main
--provider
-
β Usage examples provided
- Basic audio usage with 3 examples
- Examples with evaluation and analytics
- Best practices section with 5 recommendations
-
β Flag reference table updated
- New Audio Flags table with 3 flags
- New Multimodal Flags table consolidating all media inputs
π Additional Value
The PR goes beyond the minimum requirements with:
- Provider/model compatibility matrix (Google AI, OpenAI, Vertex)
- Supported audio formats documentation (WAV, MP3, PCM 16-bit LE)
- Audio troubleshooting entries
- Link to Real-Time Speech Agents documentation
- Updated Command Map with audio example
I'll now perform a full review of all the changes.
β Actions performed
Full review triggered.