go-livepeer
(feat) configurable timestamp options for audio-to-text
What does this pull request do? Explain your changes. (required)
This change adds the `return_timestamps` parameter to the audio-to-text pipeline, allowing end users to configure the inference job to return word-level timestamps, sentence-level timestamps, or no timestamps at all.
Supported values for `return_timestamps` are `false` and `word`. When the parameter is omitted, the pipeline keeps its existing behavior of sentence-level timestamp transcription, which avoids breaking existing applications.
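To make the value handling described above concrete, here is a minimal sketch in Go. It is not the ai-worker implementation: `timestampMode`, its constants, and `resolveTimestampMode` are hypothetical names introduced only for illustration, and rejecting unknown values with an error is an assumption about how a caller might validate input.

```go
package main

import "fmt"

// timestampMode is a hypothetical type for this sketch; it does not exist
// in go-livepeer or ai-worker.
type timestampMode string

const (
	sentenceLevel timestampMode = "sentence" // default when the parameter is omitted
	wordLevel     timestampMode = "word"
	noTimestamps  timestampMode = "none"
)

// resolveTimestampMode maps the raw return_timestamps form value to a mode.
// An empty value stands for "parameter not sent" and keeps the existing
// sentence-level behavior. Treating other values as errors is an assumption.
func resolveTimestampMode(raw string) (timestampMode, error) {
	switch raw {
	case "":
		return sentenceLevel, nil
	case "word":
		return wordLevel, nil
	case "false":
		return noTimestamps, nil
	default:
		return "", fmt.Errorf("unsupported return_timestamps value %q", raw)
	}
}

func main() {
	for _, raw := range []string{"", "word", "false", "paragraph"} {
		mode, err := resolveTimestampMode(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %s\n", raw, mode)
	}
}
```

Keeping the empty value mapped to sentence-level mirrors the backward-compatibility goal stated above: callers that never send the parameter see no change.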
Specific updates (required)
- This change only updates the go.mod references for ai-worker. See https://github.com/livepeer/ai-worker/pull/228
How did you test each of these updates (required)
sentence-level timestamps
- Sent request without the `return_timestamps` parameter to verify the inference job still defaults to sentence-level timestamps (sentence-timestamps.json)
curl -X POST "https://<GATEWAY_IP>/audio-to-text" \
-F model_id=openai/whisper-large-v3 \
-F audio=@<PATH_TO_FILE>
word-level timestamps
- Sent request with `return_timestamps=word` to validate timestamps are returned at word-level (word-timestamps.json)
curl -X POST "https://<GATEWAY_IP>/audio-to-text" \
-F model_id=openai/whisper-large-v3 \
-F audio=@<PATH_TO_FILE> \
-F return_timestamps="word"
no timestamps
- Sent request with `return_timestamps=false` to validate timestamps are excluded (no-timestamps.json)
curl -X POST "https://<GATEWAY_IP>/audio-to-text" \
-F model_id=openai/whisper-large-v3 \
-F audio=@<PATH_TO_FILE> \
-F return_timestamps="false"
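For clients that call the gateway from Go instead of curl, the same multipart form can be assembled with the standard library. This is a sketch under assumptions: `buildAudioToTextRequest` is a hypothetical helper, `https://gateway.example` is a placeholder host, and the form field names are taken from the curl commands above. The request is only built, not sent.

```go
package main

import (
	"bytes"
	"fmt"
	"mime/multipart"
	"net/http"
)

// buildAudioToTextRequest is a hypothetical helper that assembles the same
// multipart form as the curl commands. Passing an empty returnTimestamps
// omits the field, which corresponds to the sentence-level test case.
func buildAudioToTextRequest(gatewayURL, modelID, filename string, audio []byte, returnTimestamps string) (*http.Request, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)

	if err := w.WriteField("model_id", modelID); err != nil {
		return nil, err
	}
	// Attach the audio bytes as a file part named "audio".
	part, err := w.CreateFormFile("audio", filename)
	if err != nil {
		return nil, err
	}
	if _, err := part.Write(audio); err != nil {
		return nil, err
	}
	if returnTimestamps != "" {
		if err := w.WriteField("return_timestamps", returnTimestamps); err != nil {
			return nil, err
		}
	}
	// Close writes the trailing boundary; the body is incomplete without it.
	if err := w.Close(); err != nil {
		return nil, err
	}

	req, err := http.NewRequest(http.MethodPost, gatewayURL+"/audio-to-text", &body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", w.FormDataContentType())
	return req, nil
}

func main() {
	// Placeholder host and audio bytes; substitute real values before sending.
	req, err := buildAudioToTextRequest("https://gateway.example", "openai/whisper-large-v3",
		"sample.wav", []byte("placeholder audio bytes"), "word")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```

The built request can be passed to `http.DefaultClient.Do` or any `*http.Client`; switching the last argument between `""`, `"word"`, and `"false"` covers the three cases tested above.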
Does this pull request close any open issues?
AI-630
Checklist:
- [x] Read the contribution guide
- [x] `make` runs successfully
- [x] All tests in `./test.sh` pass
- [x] README and other documentation updated
- [x] Pending changelog updated
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 35.92244%. Comparing base (c41f3c4) to head (9d5130c). Report is 8 commits behind head on ai-video.
Additional details and impacted files
@@               Coverage Diff               @@
##            ai-video     #3207       +/-   ##
===================================================
- Coverage   36.07820%   35.92244%   -0.15576%
===================================================
  Files            124         124
  Lines          34525       34658      +133
===================================================
- Hits           12456       12450        -6
- Misses         21381       21520      +139
  Partials         688         688
see 1 file with indirect coverage changes