omi
Speech Profiles [Part 2]
I want a free-form recording of about 30 seconds to 1 minute, where the user can speak freely while we show the transcript as it happens, and we tell them when to stop once they have produced a certain amount of words/content.
We could also provide certain phrases for the user to read, so the speech profile is much more accurate.
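A minimal sketch of the "tell them when to stop" check, assuming the live transcript is available as a plain string. The function names and the thresholds (`min_words=70`, `max_seconds=60.0`) are hypothetical placeholders, not values from the codebase.

```python
def words_spoken(transcript: str) -> int:
    """Count whitespace-separated words in the transcript so far."""
    return len(transcript.split())


def should_stop(
    transcript: str,
    elapsed_seconds: float,
    min_words: int = 70,         # hypothetical threshold
    max_seconds: float = 60.0,   # hypothetical upper bound (1 minute)
) -> bool:
    """Stop once enough words were captured or the time limit is reached."""
    enough_content = words_spoken(transcript) >= min_words
    out_of_time = elapsed_seconds >= max_seconds
    return enough_content or out_of_time
```

The app could call `should_stop` each time a new transcript segment arrives and show the "you can stop now" prompt as soon as it returns `True`.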
- Speech profile should be required in onboarding.
- If they already have this set up, they shouldn't be able to access this page again from settings, so it is less confusing.
There should be a single sample instead of several, unlike the current upload endpoint:
@router.post('/samples/upload')
def upload_sample(file: UploadFile, uid: str):
    print('upload_sample')
    path = f"_temp/{uid}"
    os.makedirs(path, exist_ok=True)
    file_path = f"{path}/{file.filename}"
    with open(file_path, 'wb') as f:
        f.write(file.file.read())
    aseg = AudioSegment.from_wav(file_path)
    print(f'Uploading sample audio {aseg.duration_seconds} secs and {aseg.frame_rate / 1000} khz')
    uploaded_url, count = upload_sample_storage(file_path, uid)
    print('upload_sample ~ file uploaded')
    if count >= 5:
        threading.Thread(target=_create_profile, args=(uid,)).start()
    # os.remove(file_path)
    return {"url": uploaded_url}
Ideally we store the sample in Redis, base64-encoded or similar, so we no longer use cloud storage; once it is read back, we simply load the bytes into pydub.
Done already.
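For reference, the Redis idea above could look roughly like the sketch below. It is not the code that was actually merged: the key format `speech_profile:{uid}` and all function names are hypothetical, and `client` is assumed to be any object exposing Redis-style `set`/`get` (for example a redis-py `redis.Redis` instance).

```python
import base64
import io


def sample_key(uid: str) -> str:
    """Hypothetical key layout: one speech-profile sample per user."""
    return f"speech_profile:{uid}"


def store_sample(client, uid: str, wav_bytes: bytes) -> None:
    """Base64-encode the WAV bytes and store them under the user's key."""
    client.set(sample_key(uid), base64.b64encode(wav_bytes))


def load_sample_bytes(client, uid: str):
    """Return the decoded WAV bytes, or None if the user has no sample."""
    encoded = client.get(sample_key(uid))
    if encoded is None:
        return None
    return base64.b64decode(encoded)


def load_sample_segment(client, uid: str):
    """Load the stored sample into pydub without touching the filesystem."""
    # Imported lazily so the storage helpers work without pydub installed.
    from pydub import AudioSegment

    raw = load_sample_bytes(client, uid)
    if raw is None:
        return None
    return AudioSegment.from_file(io.BytesIO(raw), format="wav")
```

Because the key depends only on `uid`, a new upload overwrites the previous one, which matches the single-sample requirement. Redis string values are binary-safe, so storing the raw bytes without base64 would also work.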