omi Speech Profiles [Part 2]

Open josancamon19 opened this issue 1 year ago • 2 comments

Jul 30 '24 04:07 josancamon19

I want a free recording of about 30 seconds to 1 minute where the user can speak freely. We show the transcript while it is happening and tell them when to stop once they have a certain amount of words/content.
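A rough sketch of the stop condition described above, just to make the idea concrete. The class name, the 60-word threshold, and the 60-second cap are all assumptions for illustration, not values decided in this issue; the transcript segments are assumed to arrive as plain strings from whatever live transcription is already in place.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RecordingProgress:
    """Tracks the live transcript and decides when the user can stop.

    min_words and max_seconds are hypothetical defaults, not agreed values.
    """
    min_words: int = 60
    max_seconds: float = 60.0
    words: List[str] = field(default_factory=list)

    def add_segment(self, text: str) -> None:
        # Split on whitespace so each incoming segment adds its words.
        self.words.extend(text.split())

    @property
    def transcript(self) -> str:
        # Text to show on screen while the user is speaking.
        return " ".join(self.words)

    def should_stop(self, elapsed_seconds: float) -> bool:
        # Stop once there is enough content, or once the time cap is reached.
        enough_words = len(self.words) >= self.min_words
        timed_out = elapsed_seconds >= self.max_seconds
        return enough_words or timed_out
```

The app would call `add_segment` for each new piece of transcript, render `transcript`, and end the recording as soon as `should_stop` returns true.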

We could put certain phrases for the user to read, so the speech profile is much more accurate.

The speech profile should be required during onboarding.
Once the profile is set up, users shouldn't be able to access this page again from settings, so it is less confusing.

Jul 30 '24 04:07 josancamon19

Samples should be a single sample, instead of:


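import os
import threading

from fastapi import UploadFile
from pydub import AudioSegment

# router, upload_sample_storage and _create_profile are assumed to be defined elsewhere in the project.


@router.post('/samples/upload')
def upload_sample(file: UploadFile, uid: str):
    print('upload_sample')
    path = f"_temp/{uid}"
    os.makedirs(path, exist_ok=True)
    file_path = f"{path}/{file.filename}"
    # Write the upload to disk and close the file before pydub reads it back.
    with open(file_path, 'wb') as f:
        f.write(file.file.read())
    aseg = AudioSegment.from_wav(file_path)
    print(f'Uploading sample audio {aseg.duration_seconds} secs and {aseg.frame_rate / 1000} khz')
    uploaded_url, count = upload_sample_storage(file_path, uid)
    print('upload_sample ~ file uploaded')
    # Once at least five samples are stored, build the profile in a background thread.
    if count >= 5:
        threading.Thread(target=_create_profile, args=(uid,)).start()
    # os.remove(file_path)
    return {"url": uploaded_url}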

Ideally we store the sample in Redis, base64-encoded or similar, so we don't use cloud storage anymore, and once it is read, simply load the bytes into pydub.
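A minimal sketch of that idea, under a few assumptions: the key format `speech_profile:{uid}` and the helper names are made up here, and `InMemoryStore` is only a stand-in so the snippet runs without a server. A real Redis client exposes the same `set`/`get` calls. The stdlib `wave` module is used to read the bytes back so the example has no dependencies; with pydub the equivalent step would be `AudioSegment.from_wav(io.BytesIO(raw))`.

```python
import base64
import io
import wave
from typing import Optional


class InMemoryStore:
    """Hypothetical stand-in with the same set/get calls a Redis client offers."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def sample_key(uid: str) -> str:
    # Assumed key layout, one sample per user.
    return f"speech_profile:{uid}"


def store_sample(store, uid: str, wav_bytes: bytes) -> None:
    # Base64-encode the raw WAV bytes and keep them under the user's key.
    store.set(sample_key(uid), base64.b64encode(wav_bytes))


def load_sample(store, uid: str) -> Optional[bytes]:
    # Return the decoded WAV bytes, or None if the user has no sample.
    encoded = store.get(sample_key(uid))
    if encoded is None:
        return None
    return base64.b64decode(encoded)


def wav_duration_seconds(wav_bytes: bytes) -> float:
    # Read the WAV straight from memory, no temp file needed.
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.getnframes() / w.getframerate()


def make_silent_wav(seconds: float, frame_rate: int = 8000) -> bytes:
    # Helper for the demo only: builds a mono 16-bit silent WAV in memory.
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(frame_rate)
        w.writeframes(b"\x00\x00" * int(seconds * frame_rate))
    return buf.getvalue()
```

With this shape the upload endpoint would not need the `_temp/{uid}` directory at all: it could pass the request bytes to `store_sample` and the profile step could call `load_sample` when it runs.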

Jul 30 '24 04:07 josancamon19

Done already.

Sep 02 '24 20:09 josancamon19
