torchchat

Run PyTorch LLMs locally on servers, desktop and mobile

Results 143 torchchat issues

Supposedly we're not calling into SDPA when running on CUDA. Verify that SDPA is used, and fix if a problem does in fact exist. @malfet and @larryliu0820 have been talking...

Ensure that we can pass dtype tests for fp16, bf16 (?) and fp32 for ExecuTorch with runner et. bf16 may not yet be a thing, but @malfet's tests suggest...

https://github.com/pytorch/torchchat/actions/runs/9163587067/job/25193029015?pr=838 ``` The methods are: {'forward'} + python3 torchchat.py eval stories15M --pte-path stories15M.pte Note: NumExpr detected 16 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8. NumExpr defaulting...

### 🚀 The feature, motivation and pitch Currently, model evaluation is a WIP and mostly focused on pure PyTorch and compile. This is planned work to improve PT support and...

Known Gaps

Added an `aoti_package` path, dependent on https://github.com/pytorch/pytorch/pull/129895. The follow-up will be to delete the `--output-dso-path`. To export, use the `--output-aoti-package-path` to specify a file with a `.pt2` extension. This will...

CLA Signed

Please add a test for #491 that builds the model, and use the ability to launch Android tests from OSS to confirm they work.

./install_requirements.sh runs but emits this warning: ~~~ WARNING: Skipping triton as it is not installed. ~~~ This then results in a failure when it attempts to locate Triton: ~~~ Successfully...

Asymmetric int4 weight variant for a8w4 w/ dynamic quantization? May be a good test run for 80/20 approach?

I was trying to compare TorchChat's generate tokens/second with llama-cpp's generate on my Mac. I am using GGML FP16 with llama-cpp and I believe the default in TorchChat is FP16?...

`chat` and `generate` for the same model should yield the same number of tokens, shouldn't they? But right now there is more than a 3x difference, at least as observed on...
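
The two issues above both depend on counting generated tokens and timing them in the same way. Below is a minimal sketch of such a measurement, not torchchat's or llama-cpp's own benchmarking code. `measure_throughput` and `fake_generate` are hypothetical names introduced for illustration. It assumes the model can be wrapped in a callable that yields one item per generated token.

```python
import time
from typing import Callable, Iterable, Tuple


def measure_throughput(
    generate: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[int, float]:
    """Consume a token stream and return (token_count, tokens_per_second)."""
    start = clock()
    count = 0
    for _ in generate():
        count += 1  # one yielded item is counted as one generated token
    elapsed = clock() - start
    # Guard against a zero-length interval on very short streams.
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, rate


def fake_generate():
    # Hypothetical stand-in for a real model's token stream.
    for tok in ["Once", " upon", " a", " time"]:
        yield tok


if __name__ == "__main__":
    count, rate = measure_throughput(fake_generate)
    print(f"{count} tokens, {rate:.1f} tokens/s")
```

Running both code paths through the same wrapper, with the same prompt and the same token limit, makes it clear whether a gap comes from the token count or from the elapsed time.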