torchchat
Run PyTorch LLMs locally on servers, desktop and mobile
Today users have to set PYTORCH_ENABLE_MPS_FALLBACK=1 before calling torchchat if they want to use _weight_int4pack_mm. Can we set it automatically, from inside the program? Requiring the environment variable is a crude workaround,...
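A minimal sketch of what setting the flag programmatically could look like. It assumes the variable is read by PyTorch at import time, so torchchat's entry point would need to run this before `import torch` executes anywhere:

```python
import os

# Hedged sketch: set the MPS fallback flag from inside the program instead
# of asking the user to export it. `setdefault` keeps any value the user
# already exported, so an explicit environment setting still wins.
os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")

# `import torch` must only happen after this point for the flag to apply.
```

The main design question is ordering: if any module imports torch before this line runs, the flag has no effect, which is why this would belong at the very top of the CLI entry point.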
```
(py311) mikekg@mikekg-mbp torchchat % python export.py --checkpoint-path ${MODEL_PATH} --temperature 0 --quantize '{"linear:int4": {"groupsize": 128}}' --output-pte mode.pte
[...]
Traceback (most recent call last):
  File "/Users/mikekg/qops/torchchat/export.py", line 111, in main(args)
  File ...
```
see https://github.com/pytorch/executorch/issues/3515 documents excessive non-actionable output
### 🚀 The feature, motivation and pitch The OpenAI API support in torchchat is under active development and will be one of the main entry points for interacting with torchchat...
### 🚀 The feature, motivation and pitch Currently, the CLI arg parsing in torchchat is too general and overzealous; subcommands indirectly access CLI args that they don't actually use (e.g....
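One way to scope args per subcommand is with argparse subparsers, where each subparser declares only the flags it consumes. This is a hypothetical sketch, not torchchat's actual parser; the subcommand names mirror the CLI, but the specific flags shown are illustrative:

```python
import argparse

# Sketch of per-subcommand argument scoping: `export` never even sees
# generation-only options such as --temperature, so it cannot indirectly
# depend on them.
parser = argparse.ArgumentParser(prog="torchchat")
sub = parser.add_subparsers(dest="command", required=True)

generate = sub.add_parser("generate", help="run text generation")
generate.add_argument("model")
generate.add_argument("--temperature", type=float, default=0.8)

export = sub.add_parser("export", help="export a model")
export.add_argument("model")
export.add_argument("--output-pte", help="path for the exported .pte file")

args = parser.parse_args(["export", "stories15M", "--output-pte", "model.pte"])
```

With this layout, `args` parsed for `export` simply has no `temperature` attribute, so any accidental cross-subcommand access fails loudly instead of silently picking up a default.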
### 🐛 Describe the bug Running `python torchchat.py export stories15M` neither errors out nor generates any export files, though it should do one or the other.
```shell
% python torchchat.py export stories15M; echo...
```
### 🐛 Describe the bug I was trying to find the `model.py` definition under https://github.com/pytorch/torchchat/tree/main/build, but it wasn't showing up. generate.py, which is not in the builder, works fine. Can we...
Add support for: 1 - asymmetric a8w4dq, which basically requires subtracting the zero point from each value before multiplying, so it should add a single multiply. This will help accelerate and better handle...
Using `torch==2.4.0.dev20240502` on an Apple M2 Pro I get the following numbers for stories110M with float16 dtype:

| application | speed (eager) | speed (compile) |
| ---- | ---- | ----...