Talking_Face_Avatar
Avatar Generation For Characters and Game Assets Using Deep Fakes
Talking Face Avatar: a single portrait image from the Leonardo.ai API 🙎♂️ + audio from the ElevenLabs TTS API 🎤 = a talking-head video 🎞.
Leonardo.ai
Go to Leonardo.ai and enter your prompt and negative prompt to generate artistic images.
Here are some resources: Leonardo.ai YouTube video tutorial.
Or you can use the API: Leonardo.Ai API Guide.
*(Leonardo.ai image generation samples)*
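For programmatic use, here is a minimal sketch of calling the Leonardo.ai REST API from Python. The endpoint, payload fields, and response shape follow the public API guide linked above, but treat them as assumptions and double-check against the current docs; the API key and prompt are placeholders.

```python
# Minimal sketch: generate a portrait via the Leonardo.ai REST API.
# Endpoint and response keys are assumptions based on the public docs.
import time
import requests

API_KEY = "YOUR_LEONARDO_API_KEY"  # placeholder
BASE = "https://cloud.leonardo.ai/api/rest/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}

# 1. Start a generation job with a prompt and a negative prompt.
job = requests.post(
    f"{BASE}/generations",
    headers=HEADERS,
    json={
        "prompt": "studio portrait of a fantasy game character, detailed face",
        "negative_prompt": "blurry, deformed, extra limbs",
        "num_images": 1,
        "width": 512,
        "height": 512,
    },
).json()
generation_id = job["sdGenerationJob"]["generationId"]

# 2. Poll until the image is ready, then download it.
while True:
    result = requests.get(f"{BASE}/generations/{generation_id}", headers=HEADERS).json()
    images = result["generations_by_pk"]["generated_images"]
    if images:
        break
    time.sleep(5)

with open("portrait.png", "wb") as f:
    f.write(requests.get(images[0]["url"]).content)
```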
ElevenLabs
Go to ElevenLabs and enter your text to generate high-quality audio with different pitches and speakers. ElevenLabs is also multilingual.
Here are some resources: ElevenLabs YouTube video.
Or you can use the API: ElevenLabs API Guide.
| Eleven Labs TTS | Eleven Labs TTS | Eleven Labs TTS |
|---|---|---|
| https://github.com/saba99/Talking_Face_Avatar/assets/33378412/bd68137d-2e67-41df-a1df-4162db170ff8 | https://github.com/saba99/Talking_Face_Avatar/assets/33378412/f622369b-9e69-492d-975b-685671c663c1 | https://github.com/saba99/Talking_Face_Avatar/assets/33378412/1f78eb67-cc76-4c9a-8664-a28f3f795bee |
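For programmatic use, a minimal sketch of calling the ElevenLabs TTS API from Python. The endpoint and fields follow the public API docs, but verify against the ElevenLabs API Guide linked above; the API key is a placeholder and the voice ID is just one example.

```python
# Minimal sketch: generate speech via the ElevenLabs TTS API.
import requests

API_KEY = "YOUR_ELEVENLABS_API_KEY"  # placeholder
VOICE_ID = "21m00Tcm4TlvDq8ikWAM"    # example voice ID; list voices via GET /v1/voices

response = requests.post(
    f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
    headers={"xi-api-key": API_KEY, "Content-Type": "application/json"},
    json={
        "text": "Hello! I am your talking avatar.",
        "model_id": "eleven_multilingual_v2",  # multilingual model
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
    },
)
response.raise_for_status()

# The response body is the raw audio (MP3 by default).
with open("speech.mp3", "wb") as f:
    f.write(response.content)
```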
🔥 Highlight
- 🔥 Scroll left and right to see all videos.
| video 1 + enhancer (GFPGAN) | video 2 | video 3 |
|---|---|---|
| video 4 | video 5 | video 6 |
- 🔥 Several new modes, e.g. still mode, reference mode, and resize mode, are available for better and more customizable applications.
Diagram of Our Approach
Linux:
- Install Anaconda, Python, and Git.
- Create the environment and install the requirements:
```bash
git clone https://github.com/saba99/Talking_Face_Avatar.git
cd Talking_Face_Avatar
conda create -n sadtalker python=3.8
conda activate sadtalker
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
conda install ffmpeg
pip install -r requirements.txt

### TTS is optional, for the Gradio demo only:
### pip install TTS
```
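To sanity-check the environment, an optional quick test that PyTorch installed with CUDA support:

```python
# Optional sanity check for the environment created above.
import torch

print(torch.__version__)          # expect 1.12.1+cu113
print(torch.cuda.is_available())  # expect True on a machine with a CUDA GPU
```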
UI + API:
See index.html.
📥 2. Download Trained Models.
You can run the following script to put all the models in the right place.
```bash
bash scripts/download_models.sh
```
Model Details
After downloading, the checkpoints folder will contain the files listed below.

Model descriptions:
| Model | Description |
|---|---|
| checkpoints/auido2exp_00300-model.pth | Pre-trained ExpNet in SadTalker. |
| checkpoints/auido2pose_00140-model.pth | Pre-trained PoseVAE in SadTalker. |
| checkpoints/mapping_00229-model.pth.tar | Pre-trained MappingNet in SadTalker. |
| checkpoints/mapping_00109-model.pth.tar | Pre-trained MappingNet in SadTalker. |
| checkpoints/facevid2vid_00189-model.pth.tar | Pre-trained face-vid2vid model from an unofficial reproduction of face-vid2vid. |
| checkpoints/epoch_20.pth | Pre-trained 3DMM extractor in Deep3DFaceReconstruction. |
| checkpoints/wav2lip.pth | Highly accurate lip-sync model from Wav2Lip. |
| checkpoints/shape_predictor_68_face_landmarks.dat | Face landmark model used in dlib. |
| checkpoints/BFM | 3DMM library files. |
| checkpoints/hub | Face detection models used in face alignment. |
| gfpgan/weights | Face detection and enhancement models used in facexlib and GFPGAN. |
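As a quick sanity check, a small helper (not part of the repo) that verifies the checkpoint files from the table above ended up in the right place:

```python
# Verify the downloaded checkpoints listed in the table above.
from pathlib import Path

EXPECTED = [
    "checkpoints/auido2exp_00300-model.pth",
    "checkpoints/auido2pose_00140-model.pth",
    "checkpoints/mapping_00229-model.pth.tar",
    "checkpoints/mapping_00109-model.pth.tar",
    "checkpoints/facevid2vid_00189-model.pth.tar",
    "checkpoints/epoch_20.pth",
    "checkpoints/wav2lip.pth",
    "checkpoints/shape_predictor_68_face_landmarks.dat",
]

missing = [p for p in EXPECTED if not Path(p).exists()]
print("All models in place." if not missing else f"Missing: {missing}")
```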
🔮 3. Quick Start (Best Practice).
WebUI Demo:

```bash
## You need to manually install TTS (https://github.com/coqui-ai/TTS) via `pip install TTS` in advance.
python app.py
```
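For orientation, here is a stripped-down sketch of the kind of Gradio front end `app.py` provides. The real app exposes more options; the subprocess call and result lookup below are illustrative assumptions, not the repo's actual code.

```python
# Simplified sketch of a Gradio front end around inference.py (illustrative only).
import glob
import os
import subprocess

import gradio as gr

def animate(image_path, audio_path):
    subprocess.run(
        ["python", "inference.py",
         "--driven_audio", audio_path,
         "--source_image", image_path,
         "--result_dir", "results",
         "--enhancer", "gfpgan"],
        check=True,
    )
    # inference.py writes the video into a timestamped folder under results/;
    # return the most recently written video.
    videos = glob.glob("results/**/*.mp4", recursive=True)
    return max(videos, key=os.path.getmtime)

demo = gr.Interface(
    fn=animate,
    inputs=[gr.Image(type="filepath"), gr.Audio(type="filepath")],
    outputs=gr.Video(),
)
demo.launch()
```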
Manual usage:

Animating a portrait image with the default config:

```bash
python inference.py --driven_audio <audio.wav> \
                    --source_image <video.mp4 or picture.png> \
                    --enhancer gfpgan
```

The results will be saved in `results/$SOME_TIMESTAMP/*.mp4`.
Full body/image generation:

Use `--still` to generate a natural full-body video. You can add `--enhancer` to improve the quality of the generated video.

```bash
python inference.py --driven_audio <audio.wav> \
                    --source_image <video.mp4 or picture.png> \
                    --result_dir <a folder to store results> \
                    --still \
                    --preprocess full \
                    --enhancer gfpgan
```
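Putting the pieces together, a hypothetical glue script (not part of the repo) that drives a Leonardo.ai portrait with ElevenLabs speech. The file names carry over from the earlier sketches; since the examples above use a `.wav` input, convert the MP3 first (e.g. `ffmpeg -i speech.mp3 speech.wav`).

```python
# Hypothetical end-to-end glue script:
# portrait.png (Leonardo.ai sketch) + speech.wav (ElevenLabs sketch, converted
# from MP3) -> full-body talking-head video.
import subprocess

subprocess.run(
    ["python", "inference.py",
     "--driven_audio", "speech.wav",
     "--source_image", "portrait.png",
     "--result_dir", "results",
     "--still",
     "--preprocess", "full",
     "--enhancer", "gfpgan"],
    check=True,
)
print("Done -- check results/ for the generated video.")
```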