# alts
100% free, local & offline voice assistant with speech recognition
( 🎙️ listens | 💭 thinks | 🔊 speaks )
## 💬 about
100% free, local and offline assistant with speech recognition and talk-back functionalities.
## 🤖 default usage
ALTS runs in the background and waits for you to press `cmd+esc` (or `win+esc`).
- 🎙️ While holding the hotkey, your voice is recorded (the audio file is saved in the project root).
- 💭 On release, the recording stops and a transcript is sent to the LLM (the recording is then deleted).
- 🔊 The LLM response gets synthesized and played back to you (it's also shown as a desktop notification).

You can modify the hotkey combination and other settings in your `config.yaml`.

ALL processes are local and NONE of your recordings or queries leave your environment; recordings are deleted as soon as they are used. It's ALL PRIVATE by default.
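The hold-to-talk flow above can be sketched as a small pipeline. The function bodies below are hypothetical stand-ins for what ALTS actually wires to whisper, LiteLLM and coqui-TTS; only the shape of the loop is taken from the description:

```python
import os

def transcribe(wav_path: str) -> str:
    # stand-in for whisper transcribing the recorded audio file
    return "what's on my calendar today?"

def ask_llm(transcript: str) -> str:
    # stand-in for a LiteLLM completion call to the configured model
    return f"You asked: {transcript}"

def speak(text: str) -> None:
    # stand-in for coqui-TTS synthesis + playback
    print(text)

def on_hotkey_release(wav_path: str) -> str:
    """Runs once the hotkey is released and the recording has stopped."""
    transcript = transcribe(wav_path)
    os.remove(wav_path)  # the recording is deleted as soon as it's used
    reply = ask_llm(transcript)
    speak(reply)
    return reply
```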
## ⚙️ pre-requisites
- **python** – tested on version >=3.11 on macOS and version >=3.8 on windows
- **llm**

  By default, the project is configured to work with Ollama, running the `stablelm2` model (a very tiny and quick model). This setup makes the whole system completely free to run locally and great for low-resource machines. However, we use LiteLLM in order to be provider agnostic, so you have full freedom to pick and choose your own combinations. Take a look at the supported Models/Providers for more details on LLM configuration.

  See `.env.template` and `config-template.yaml` for customizing your setup.
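As a sketch of what that provider-agnostic setup looks like in code, the snippet below queries the default local model through LiteLLM's `completion` API. The helper name `ask` is made up for illustration; the `"ollama/stablelm2"` model string follows LiteLLM's `provider/model` convention and matches the README's default:

```python
def ask(prompt: str, model: str = "ollama/stablelm2") -> str:
    # Lazy import keeps the dependency optional until a query is made.
    # `completion` is LiteLLM's unified entry point across providers,
    # so swapping Ollama for another backend is just a model-string change.
    from litellm import completion

    response = completion(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

Calling `ask("hello")` requires a running Ollama server with the `stablelm2` model pulled (`ollama pull stablelm2`).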
- **stt**

  We use OpenAI's `whisper` to transcribe your voice queries. It's a general-purpose speech recognition model.

  You will need to have `ffmpeg` installed in your environment; you can download it from the official site. Make sure to check out their setup docs for any other requirements.

  If you stumble into errors, one reason could be the model not downloading automatically. If that's the case, you can run a `whisper` example transcription in your terminal (see examples) or manually download the model file and place it in the correct folder.
- **tts**

  We use `coqui-TTS` for ALTS to talk back to you. It's a library for advanced Text-to-Speech generation.

  You will need to install `eSpeak-ng` in your environment:
  - macOS – `brew install espeak`
  - linux – `sudo apt-get install espeak -y`
  - windows – download the executable from their repo

  On windows you'll also need `Desktop development with C++` and `.NET desktop build tools`. Download the Microsoft C++ Build Tools and install these dependencies.

  Make sure to check out their setup docs for any other requirements.

  If you don't have the configured model already downloaded, it should download automatically during startup. However, if you encounter any problems, the default model can be pre-downloaded by running the following:

  ```shell
  tts --text "this is a setup test" --out_path test_output.wav --model_name tts_models/en/vctk/vits --speaker_idx p364
  ```

  The default model has several "speakers" to choose from; running the following command will serve a demo site where you can test the different voices available:

  ```shell
  tts-server --model_name tts_models/en/vctk/vits
  ```
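If you'd rather drive the same model from Python, coqui-TTS exposes it through its `TTS.api` class; the helper below is a sketch, with the model name, speaker and output path mirroring the CLI example above:

```python
def synthesize(text: str, out_path: str = "test_output.wav") -> str:
    # Lazy import: coqui-TTS is a heavy dependency, only needed at synthesis time.
    from TTS.api import TTS

    # Downloads tts_models/en/vctk/vits on first use, same as the CLI.
    tts = TTS(model_name="tts_models/en/vctk/vits")
    tts.tts_to_file(text=text, speaker="p364", file_path=out_path)
    return out_path
```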
## ✅ get it running
clone the repo

```shell
git clone https://github.com/alxpez/alts.git
```

go to the main folder

```shell
cd alts/
```

install the project dependencies

```shell
pip install -r requirements.txt
```

(see the pre-requisites section to make sure your machine is ready to start ALTS)

duplicate and rename the needed config files

```shell
cp config-template.yaml config.yaml
cp .env.template .env
```

modify the default configuration to your needs

start up ALTS

```shell
sudo python alts.py
```
the `keyboard` package needs to be run as admin (on macOS and Linux); that's not the case on Windows