
Run PyTorch LLMs locally on servers, desktop and mobile

Results: 143 torchchat issues

### 🐛 Describe the bug I am using ExecuTorch (ET) and generating the quantized version of the model as shown in the README. ``` python torchchat.py export llama3.1 --quantize config/data/mobile.json --output-pte-path...

Summary: This improves best tokens/sec from 73 to 85.

CLA Signed

### 🐛 Describe the bug Running `python3 torchchat.py generate stories110M` on a system with a bad network connection hangs for 90+ seconds before it starts generating anything (see the fail-fast sketch below). ### Versions...

bug
actionable
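A minimal sketch of a fail-fast workaround for the hang described above. The host probe and the two-second timeout are assumptions, not torchchat behavior; `HF_HUB_OFFLINE` is the standard huggingface_hub switch for serving cached files only.

```python
import os
import socket

# Sketch only: probe the hub with a short timeout instead of letting the
# download stall for 90+ seconds on a bad connection.
def hub_reachable(host: str = "huggingface.co", timeout: float = 2.0) -> bool:
    """Return True if the model hub answers a TCP connect within `timeout` seconds."""
    try:
        with socket.create_connection((host, 443), timeout=timeout):
            return True
    except OSError:
        return False

if not hub_reachable():
    # huggingface_hub honors this variable and falls back to cached files.
    os.environ["HF_HUB_OFFLINE"] = "1"
```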

Clarify iOS requirements

CLA Signed

### 🚀 The feature, motivation and pitch I believe this is one of Ollama's biggest advantages. It can also encourage devs to go test LLMs which they can run on...

need-user-input
Quantization

As titled, simple changes: moves the model-params JSONs from `build/known_model_params` to `torchchat/model_params`.

CLA Signed

**Issue:** Inputs aren't set up correctly for .pte files. The input tensors must be static and cannot be reshaped (see the padding sketch below). Currently, running eval will result in this error: ``` python3 torchchat.py...

CLA Signed
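A minimal sketch of the static-shape constraint the issue describes: a .pte program is exported with fixed input shapes, so eval inputs must be padded or truncated to the export-time length rather than reshaped. `MAX_SEQ_LEN` and `PAD_ID` are assumed values for illustration, not torchchat constants.

```python
import torch

MAX_SEQ_LEN = 128  # assumption: sequence length the .pte was exported with
PAD_ID = 0         # assumption: pad token id

def to_static_input(token_ids: list[int]) -> torch.Tensor:
    """Pad/truncate token ids so the input tensor shape never changes."""
    ids = token_ids[:MAX_SEQ_LEN]
    ids = ids + [PAD_ID] * (MAX_SEQ_LEN - len(ids))
    return torch.tensor([ids], dtype=torch.long)  # always (1, MAX_SEQ_LEN)

print(to_static_input([1, 2, 3]).shape)  # torch.Size([1, 128])
```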

**Goal:** Users should be able to select the model from the chat interface and receive a response from that model (see the request sketch below). **Currently:** we just send the request and take the...

CLA Signed
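A client-side sketch of what honoring per-request model selection could look like, assuming an OpenAI-compatible endpoint at `http://localhost:5000/v1/chat/completions`; the port, path, and `model` value are assumptions, not confirmed by this PR.

```python
import json
import urllib.request

# Hypothetical client call: the "model" field carries the user's choice
# from the chat interface; the server would route the request accordingly.
payload = {
    "model": "llama3.1",
    "messages": [{"role": "user", "content": "Hello!"}],
}
req = urllib.request.Request(
    "http://localhost:5000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])
```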

Moves the top-level distributed folder into a separate distributed folder under the torchchat umbrella. There are intentionally no code changes outside of the README and script path updates.

CLA Signed

Added files:
- model_dist.py: a mirror of model.py with Tensor Parallelism baked in.
- dist_run.py: a toy example of how to run the model in a distributed way (see the sketch below).

Test: ``` torchrun --nproc-per-node...

CLA Signed
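A toy sketch of the Tensor Parallelism idea behind model_dist.py (not the actual code): each rank holds a column shard of a linear weight, computes its output slice, and the slices are reassembled with all_gather. The file name, layer sizes, and gloo backend are made up for illustration; launch with `torchrun --nproc-per-node 2 tp_sketch.py`.

```python
import torch
import torch.distributed as dist

def column_parallel_linear(x: torch.Tensor, out_features: int = 8) -> torch.Tensor:
    rank, world = dist.get_rank(), dist.get_world_size()
    shard = out_features // world
    torch.manual_seed(0)                        # identical weight init on every rank
    w = torch.randn(out_features, x.shape[-1])  # full weight; each rank keeps its shard
    w_local = w[rank * shard:(rank + 1) * shard]
    y_local = x @ w_local.t()                   # (batch, shard)
    parts = [torch.empty_like(y_local) for _ in range(world)]
    dist.all_gather(parts, y_local)             # collect every rank's slice
    return torch.cat(parts, dim=-1)             # (batch, out_features)

if __name__ == "__main__":
    dist.init_process_group("gloo")             # torchrun supplies rank/world env vars
    y = column_parallel_linear(torch.ones(2, 4))
    if dist.get_rank() == 0:
        print(y.shape)                          # torch.Size([2, 8])
    dist.destroy_process_group()
```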