torchchat
Run PyTorch LLMs locally on servers, desktop and mobile
### 🐛 Describe the bug

I am using ET (ExecuTorch) and generating the quantized version of the model as shown in the README.

```
python torchchat.py export llama3.1 --quantize config/data/mobile.json --output-pte-path...
```
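For context, a complete invocation of this export step would look roughly like the following; the report is truncated, so the output file name here is an assumption:

```
python3 torchchat.py export llama3.1 --quantize config/data/mobile.json --output-pte-path llama3_1.pte
```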
Summary: This improves best tokens/sec from 73 to 85.
### 🐛 Describe the bug

Running `python3 torchchat.py generate stories110M` on a system with a bad network connection hangs for 90+ seconds before it starts generating anything.

### Versions...
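One way to bound that wait is to probe the network with a short timeout and fall back to local weights. A minimal sketch, assuming the hang comes from an unbounded availability check before cached weights are used; the cache path and probe URL are hypothetical:

```python
# Hypothetical sketch: bounded network probe with a cache fallback.
# Not torchchat's actual code.
import os
import urllib.request

CACHE_DIR = os.path.expanduser("~/.torchchat/model-cache")  # hypothetical path

def weights_are_cached(model: str) -> bool:
    return os.path.isdir(os.path.join(CACHE_DIR, model))

def network_reachable(url: str = "https://huggingface.co", timeout: float = 2.0) -> bool:
    try:
        urllib.request.urlopen(url, timeout=timeout)  # wait at most 2 s, not 90+
        return True
    except OSError:
        return False

def resolve_model(model: str) -> str:
    path = os.path.join(CACHE_DIR, model)
    if weights_are_cached(model):
        return path  # never touch the network when a local copy exists
    if not network_reachable():
        raise RuntimeError(f"{model} is not cached and the network is unreachable")
    # ... download to `path` here (omitted in this sketch) ...
    return path
```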
### 🚀 The feature, motivation and pitch

I believe this is one of Ollama's huge advantages. This can also encourage devs to go test LLMs which they can run on...
As titled, a simple change: moves the model-parameter JSON files from `build/known_model_params` to `torchchat/model_params`.
**Issue**

Inputs aren't set up correctly for .pte files. The input tensors must be static and cannot be reshaped. Currently, running eval will result in this error:

```
python3 torchchat.py...
```
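A minimal sketch of the static-input constraint, assuming a .pte exported with a fixed sequence length; the length and padding id are assumptions:

```python
# Hypothetical sketch of preparing static-shaped inputs for a .pte model.
# A graph exported with a fixed sequence length rejects reshaped tensors, so
# prompts must be padded/truncated to that exact shape before every call.
import torch

MAX_SEQ_LEN = 128  # assumption: the sequence length the model was exported with
PAD_ID = 0         # assumption: the tokenizer's pad token id

def to_static_input(token_ids: list[int]) -> torch.Tensor:
    ids = token_ids[:MAX_SEQ_LEN]                    # truncate long prompts
    ids = ids + [PAD_ID] * (MAX_SEQ_LEN - len(ids))  # right-pad short ones
    return torch.tensor([ids], dtype=torch.long)     # always shape (1, MAX_SEQ_LEN)

tokens = to_static_input([42, 7, 99])
assert tokens.shape == (1, MAX_SEQ_LEN)  # the exported graph requires this exact shape
```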
**Goal:** Users should be able to select the model from the chat interface and receive a response from that model.

**Currently:** we just send the request and take the...
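A minimal sketch of wiring the selection through, assuming an OpenAI-style chat completions endpoint; the base URL, port, and route are assumptions, not torchchat's confirmed API:

```python
# Hypothetical sketch: forward the model chosen in the UI instead of a hard-coded one.
import json
import urllib.request

def send_chat(selected_model: str, prompt: str, base_url: str = "http://localhost:5000"):
    payload = {
        "model": selected_model,  # the value picked in the chat interface
        "messages": [{"role": "user", "content": prompt}],
    }
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",  # assumption: OpenAI-style route
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```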
Moves the top-level distributed folder into a separate distributed folder within the torchchat umbrella. There are intentionally no code changes outside of the README and script path updates.
Added files:
- model_dist.py: a mirror of model.py with Tensor Parallelism baked in.
- dist_run.py: a toy example of how to run the model in a distributed way.

Test:

```
torchrun --nproc-per-node...
```
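For readers unfamiliar with the technique, here is a minimal Tensor Parallelism sketch in the spirit of model_dist.py, assuming torch >= 2.3; the layer names and sizes are invented for illustration:

```python
# Hypothetical sketch of Tensor Parallelism with torch.distributed;
# not the PR's actual model_dist.py. Launch with:
#   torchrun --nproc-per-node 2 tp_sketch.py
import os

import torch
import torch.nn as nn
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    RowwiseParallel,
    parallelize_module,
)

class FeedForward(nn.Module):
    def __init__(self, dim: int = 256, hidden: int = 1024):
        super().__init__()
        self.w1 = nn.Linear(dim, hidden)
        self.w2 = nn.Linear(hidden, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w2(torch.relu(self.w1(x)))

torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
# One mesh dimension covering all ranks started by torchrun.
mesh = init_device_mesh("cuda", (int(os.environ["WORLD_SIZE"]),))

model = FeedForward().to("cuda")
# Shard w1 column-wise and w2 row-wise so each forward pass needs one all-reduce.
parallelize_module(model, mesh, {"w1": ColwiseParallel(), "w2": RowwiseParallel()})

out = model(torch.randn(8, 256, device="cuda"))  # output is replicated on every rank
```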