# torchchat
Run PyTorch LLMs locally on servers, desktop and mobile
### 🐛 Describe the bug

`python3 torchchat.py export llama3.1 --output-dso-path exportedModels/llama3.1.so`

```
Using device=cuda
Setting max_seq_length to 300 for DSO export.
Loading model...
Time to load model: 2.74 seconds...
```
Currently, we download models to a local directory (`~/.torchchat` by default). For Hugging Face models, we should download to the Hugging Face cache instead. As per Hugging Face:

```
By default,...
```
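A minimal sketch of what downloading into the shared Hugging Face cache could look like, using `huggingface_hub`; the `repo_id` below is illustrative, and torchchat's actual download plumbing may differ:

```python
from huggingface_hub import snapshot_download

# Downloads into the Hugging Face cache (~/.cache/huggingface/hub by default,
# overridable via HF_HOME) instead of a torchchat-specific directory, and
# returns the local cache path for the downloaded snapshot.
local_path = snapshot_download(repo_id="meta-llama/Meta-Llama-3-8B-Instruct")
print(local_path)
```

Reusing this cache would also mean a model already fetched by another Hugging Face tool is not downloaded a second time.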
## Dependencies

This PR is part of a sequence in support of adding Granite Code. It depends on merging the following PRs:

- [x] Safetensors: #1255
- [x] Bias tensors:...
### 🐛 Describe the bug

The install shell script errors out when run:

```
Building wheels for collected packages: zstd
  Building wheel for zstd (pyproject.toml) ... error
  error: subprocess-exited-with-error
  ×...
```
### 🐛 Describe the bug

I wanted to try the new Llama 3.2 1B parameter model on mobile. I downloaded the model and generated the `pte` like so:

```
python...
```
### 🚀 The feature, motivation and pitch

- It would be nice to have an automatically **generated changelog** from commit history, perhaps using GitHub's [generated release notes](https://docs.github.com/en/repositories/releasing-projects-on-github/automatically-generated-release-notes) (sketched below) or [git-cliff](https://github.com/orhun/git-cliff) ([example](https://github.com/stanfordnlp/dspy/issues/1455#issuecomment-2338339308)). Then it's...
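For the GitHub route, the same notes the web UI produces can be fetched programmatically through the REST endpoint `POST /repos/{owner}/{repo}/releases/generate-notes`. A hedged sketch; the tag name and token are placeholders:

```python
import requests

# Ask GitHub to generate release notes between the previous release and the
# given tag. Both the tag_name and the token below are placeholders.
resp = requests.post(
    "https://api.github.com/repos/pytorch/torchchat/releases/generate-notes",
    headers={
        "Authorization": "Bearer <GITHUB_TOKEN>",
        "Accept": "application/vnd.github+json",
    },
    json={"tag_name": "v1.0.0"},
)
resp.raise_for_status()
print(resp.json()["body"])  # markdown changelog body
```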
### 🐛 Describe the bug

Hi all, I ran into some confusion when trying to export llama3 on my system. I have a small graphics card (8GB VRAM on an...
### 🐛 Describe the bug

Currently, **Llama 3.2 11B** only supports a single optional image prompt in torchchat. The base torchtune model backing Llama 3.2 11B should be capable of supporting...
When composing distributed with quantization, one potential case is that the model has already been quantized and saved, so a second run does not need to quantize it again. This is...
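One possible shape for this quantize-once/reuse-later flow, sketched under the assumption that the quantized module can simply be serialized whole; `build_model` and `quantize` are stand-ins, not torchchat APIs:

```python
import os
import torch

CHECKPOINT = "model_quantized.pt"  # hypothetical artifact path

def load_or_quantize(build_model, quantize):
    """Quantize on the first run; reuse the saved artifact on later runs."""
    if os.path.exists(CHECKPOINT):
        # Second run: deserialize the already-quantized module directly,
        # skipping the (potentially slow) quantization step.
        model = torch.load(CHECKPOINT, weights_only=False)
    else:
        # First run: build and quantize, then persist the whole module so
        # its quantized structure survives a reload.
        model = quantize(build_model())
        torch.save(model, CHECKPOINT)
    return model
```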
### 🚀 The feature, motivation and pitch

The request is to extend the [tokenizer](https://github.com/pytorch/torchchat/tree/main/tokenizer) module in `torchchat` to support tokenizers that use the Hugging Face [tokenizers](https://github.com/huggingface/tokenizers) library. There are many models...
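A minimal sketch of what such a wrapper could look like; the `encode`/`decode` surface mirrors what a torchchat tokenizer is assumed to expose, and `tokenizer.json` is the self-contained serialization format the `tokenizers` library loads:

```python
from tokenizers import Tokenizer

class HFTokenizer:
    """Assumed-interface wrapper around a Hugging Face `tokenizers` tokenizer."""

    def __init__(self, tokenizer_json_path: str):
        # tokenizer.json is the serialized tokenizer file shipped with
        # many Hugging Face models.
        self.tokenizer = Tokenizer.from_file(tokenizer_json_path)

    def encode(self, text: str) -> list[int]:
        return self.tokenizer.encode(text).ids

    def decode(self, ids: list[int]) -> str:
        return self.tokenizer.decode(ids)
```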