transcript.fish
Unofficial No Such Thing As A Fish episode transcripts.
Running webapp locally
- Run: npm install
- Run: npm run dev
Download episodes from the RSS feed, transcribe them, and add them to the database
TODO: Add instructions for creating database
- Install deps
  - Run: pip install -r requirements.txt
- Download most recent episodes and transcribe them
  - Change line 11 of whisper.py to local_files_only=False
  - (Optional) Change model_size = 'large-v2' on line 5 of whisper.py to your preferred model; see the note below for details and the list of available models.
  - Run: npm run convert (this is idempotent and will go through all episodes)

    NOTE: By default this uses the medium.en Whisper model. On an M1 Mac with 64GB of RAM this transcribes at about 1.4x speed. This means an hour-long episode gets transcribed in about 42 minutes.

    So, as of 25 July 2023:

        select sum(duration) from episodes -- 1292175

          1,292,175.0 seconds
        ÷        60.0 seconds
        ÷        60.0 minutes
        ÷        24.0 hours
        -----------------------
        =        15.0 days
        ÷         1.4 speed
        -----------------------
        =        10.7 days

    The good news is that changing to the small.en or the tiny.en model increases this speed dramatically, but the accuracy goes down slightly. small.en transcribes at about 3x speed, for example.

    The other good news is you can kill the script (Ctrl + C) and restart it at any time and it will pick back up after the last fully transcribed episode.

    NOTE: This script also downloads all the audio files for the episodes as well as each episode's album art. As of 25 July 2023 this amounts to 487 episodes, ~20GB audio, ~130MB images.
- Split database into chunks
  - Run: npm run split:db
- (Optional) Sync database, audio, images, and fonts to (Cloudflare) R2. Needs rclone and jq installed.
  - Run: npm run sync
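
The transcription-time estimate in the note above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name transcription_days is made up for illustration and is not part of this repo, and the speeds are the approximate figures quoted in the note.

```python
# Rough estimate of total transcription time, using the figures quoted in the note.
SECONDS_PER_DAY = 60 * 60 * 24


def transcription_days(total_audio_seconds: float, speed: float) -> float:
    """Return wall-clock days needed to transcribe audio at `speed` times real time."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return total_audio_seconds / SECONDS_PER_DAY / speed


if __name__ == "__main__":
    total_seconds = 1_292_175  # sum(duration) from episodes, as of 25 July 2023
    # Approximate speeds from the note; actual speed depends on hardware.
    for model, speed in [("medium.en", 1.4), ("small.en", 3.0)]:
        days = transcription_days(total_seconds, speed)
        print(f"{model}: about {days:.1f} days")
```

With the numbers above this gives about 10.7 days for medium.en and about 5.0 days for small.en.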
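
The restart behaviour described for npm run convert (kill it with Ctrl + C, rerun, and it continues after the last fully transcribed episode) is a common idempotent-job pattern. The sketch below shows one way such a loop can be written; it is not the actual code in this repo, and transcribe_pending, already_done, and the transcribe callback are hypothetical names used only for illustration.

```python
from typing import Callable, Iterable


def transcribe_pending(
    episode_ids: Iterable[str],
    already_done: set[str],
    transcribe: Callable[[str], None],
) -> list[str]:
    """Transcribe every episode not yet in `already_done`; return the newly done IDs."""
    newly_done = []
    for episode_id in episode_ids:
        if episode_id in already_done:
            continue  # finished in an earlier run, so skip it
        transcribe(episode_id)  # an interrupt here leaves this episode unmarked
        already_done.add(episode_id)  # mark only after the transcript is complete
        newly_done.append(episode_id)
    return newly_done
```

Because an episode is recorded as done only after its transcription call returns, an interrupted episode is simply retried on the next run, and running the function twice over the same list does no extra work the second time. In a real pipeline the already_done set would presumably be derived from the database rather than held in memory.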