Speech Profiles [Part 2]

Open josancamon19 opened this issue 1 year ago • 2 comments

josancamon19 avatar Jul 30 '24 04:07 josancamon19

I want a free recording of about 30 seconds to 1 minute, where the user can speak freely. We show the transcript while it is happening and tell them when to stop, once they have a certain amount of words/content.

We could also give the user certain phrases to read, so the speech profile is much more accurate.

  • Speech profile should be required in onboarding.
  • If they have already set this up, they shouldn't be able to access this page again from settings, so it is less confusing.
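
A minimal sketch of the "tell them when to stop" check described above, assuming the live transcript is available as a plain string. The function name and the `min_words` / `max_seconds` thresholds are hypothetical; the issue only mentions "about 30 seconds or 1 minute" and "a certain amount of words".

```python
def should_stop_recording(
    transcript: str,
    elapsed_seconds: float,
    min_words: int = 60,         # hypothetical threshold, not from the issue
    max_seconds: float = 60.0,   # upper bound taken from "about ... 1 minute"
) -> bool:
    """Return True once enough words were captured or the time limit is hit."""
    word_count = len(transcript.split())
    return word_count >= min_words or elapsed_seconds >= max_seconds
```

The client could call this each time the transcript updates and prompt the user to stop as soon as it returns True.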

josancamon19 avatar Jul 30 '24 04:07 josancamon19

Samples should be a single sample, instead of:


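import os
import threading

from fastapi import UploadFile
from pydub import AudioSegment

# router, upload_sample_storage and _create_profile are not shown in this snippet.


@router.post('/samples/upload')
def upload_sample(file: UploadFile, uid: str):
    print('upload_sample')
    path = f"_temp/{uid}"
    os.makedirs(path, exist_ok=True)
    file_path = f"{path}/{file.filename}"
    with open(file_path, 'wb') as f:
        f.write(file.file.read())
    # Read the file back only after the `with` block has closed (and flushed) it.
    aseg = AudioSegment.from_wav(file_path)
    print(f'Uploading sample audio {aseg.duration_seconds} secs and {aseg.frame_rate / 1000} khz')
    uploaded_url, count = upload_sample_storage(file_path, uid)
    print('upload_sample ~ file uploaded')
    # Once five samples have been uploaded, build the speech profile in the background.
    if count >= 5:
        threading.Thread(target=_create_profile, args=(uid,)).start()
    # os.remove(file_path)
    return {"url": uploaded_url}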

Ideally we store the sample in Redis, base64-encoded or similar, so we don't use cloud storage anymore, and once it is read, simply load the bytes into pydub.
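
A rough sketch of that idea, not the actual implementation. `InMemoryStore` is a hypothetical stand-in for a Redis client (only `set`/`get` are used), and the key format and helper names are assumptions. The standard-library `wave` module is used here to check the decoded bytes; with pydub, the same bytes could be wrapped in `io.BytesIO` and passed to `AudioSegment.from_wav`.

```python
import base64
import io
import wave


class InMemoryStore:
    """Hypothetical stand-in for a Redis client exposing set/get."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def sample_key(uid: str) -> str:
    # Hypothetical key naming; one sample per user.
    return f"speech_profile:{uid}"


def store_sample(store, uid: str, wav_bytes: bytes) -> None:
    """Base64-encode the WAV bytes and store them under the user's key."""
    store.set(sample_key(uid), base64.b64encode(wav_bytes).decode("ascii"))


def load_sample(store, uid: str):
    """Return the decoded WAV bytes, or None if no sample is stored."""
    encoded = store.get(sample_key(uid))
    if encoded is None:
        return None
    # b64decode accepts both str and bytes, so a client returning bytes works too.
    return base64.b64decode(encoded)


def make_silent_wav(seconds: float = 1.0, frame_rate: int = 8000) -> bytes:
    """Build a small mono 16-bit WAV in memory, used only for the demo."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(frame_rate)
        w.writeframes(b"\x00\x00" * int(seconds * frame_rate))
    return buf.getvalue()


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Read the duration straight from bytes, without touching disk."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.getnframes() / w.getframerate()
```

Storing a single sample per key also matches the "single sample" goal: a new upload simply overwrites the previous value, and nothing is written to `_temp/` or cloud storage.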

josancamon19 avatar Jul 30 '24 04:07 josancamon19

Done already.

josancamon19 avatar Sep 02 '24 20:09 josancamon19