f1-dash
feat: implement automatic driver radio transcription
## Abstract
This patch adds a new feature that displays a transcript of every driver radio message.
## What's changed
### live-backend
- New `/api/audio` API added

  As F1TV's live timing CDN (https://livetiming.formula1.com/static) does not permit cross-origin requests, every call that fetches a speech file has to be proxied through the backend. Since routing every request for these files can marginally increase the traffic burden on the `live-backend` (and also risks an IP ban from the F1TV CDN), I have decided to make this API an optional feature, which can be opted into by defining the `ENABLE_AUDIO_FETCH` environment variable when starting the server process.
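  As a minimal sketch of how the opt-in gate and the CDN rewrite could look (the helper name `resolveAudioUrl` and the exact route wiring are my assumptions, not part of this patch):

  ```typescript
  // Hypothetical helper behind the /api/audio proxy: maps a requested audio
  // path onto the F1TV CDN, and refuses to resolve when the opt-in flag is
  // absent so the endpoint stays disabled by default.
  const F1TV_CDN = "https://livetiming.formula1.com/static";

  export function resolveAudioUrl(
    path: string,
    env: Record<string, string | undefined>,
  ): string | null {
    // The proxy is opt-in: without ENABLE_AUDIO_FETCH it serves nothing.
    if (env.ENABLE_AUDIO_FETCH === undefined) return null;
    // Strip leading slashes so the path can only point inside the CDN prefix.
    const clean = path.replace(/^\/+/, "");
    return `${F1TV_CDN}/${clean}`;
  }
  ```

  The actual route handler would then `fetch` the resolved URL and stream the body back to the client, sidestepping the CDN's CORS restriction.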
### dash
- Automatic Speech Recognition pipeline

  This pipeline accepts sampled audio data and infers the transcription with the help of Transformers.js and OpenAI's Whisper model. There are plenty of Whisper-based models, but based on my experience I have made three of them available as options in this project (check `dash/src/app/(nav)/settings/page.tsx`). Those options are labeled `More Quality`, `Balanced`, and `Low Latency`.

  That said, only the computational resources of the client browser are consumed when executing the pipeline; the API backend takes no part in the process.
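  A rough sketch of what the client-side pipeline could look like with Transformers.js. The quality labels mirror the settings page, but the concrete model checkpoints here are illustrative assumptions, not necessarily the ones shipped in `dash/src/app/(nav)/settings/page.tsx`:

  ```typescript
  // Hypothetical mapping from the three settings labels to Whisper
  // checkpoints; the real model ids are defined in the settings page.
  type TranscriptionQuality = "more-quality" | "balanced" | "low-latency";

  export function modelForQuality(quality: TranscriptionQuality): string {
    switch (quality) {
      case "more-quality":
        return "onnx-community/whisper-base"; // larger, slower, more accurate
      case "balanced":
        return "onnx-community/whisper-tiny";
      case "low-latency":
        return "onnx-community/whisper-tiny.en"; // smallest, English-only
    }
  }

  // Runs entirely in the client browser; the backend never sees the audio.
  export async function transcribe(
    audio: Float32Array,
    quality: TranscriptionQuality,
  ): Promise<string> {
    const { pipeline } = await import("@huggingface/transformers");
    const transcriber = await pipeline(
      "automatic-speech-recognition",
      modelForQuality(quality),
    );
    const result: any = await transcriber(audio);
    return Array.isArray(result) ? result[0].text : result.text;
  }
  ```

  The model is downloaded and cached by the browser on first use, which is why the heavier `More Quality` option mainly costs the user bandwidth and CPU, not the server.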