32 comments of Sam Havens

@casperbh96 @abhinavkulkarni We are working on a PR that adds support for `output_attentions` when using `torch` attention (#210). For supporting `device_map="auto"`, I believe the only change we need is to...
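
For context, here is a minimal sketch of the usage being discussed, assuming an HF-style MPT checkpoint (the `mosaicml/mpt-7b` name is illustrative). `device_map="auto"` requires `accelerate`, and `output_attentions=True` would only work once the PR above lands:

```python
# Sketch only: assumes transformers + accelerate are installed, and that the
# checkpoint's custom code supports output_attentions (per the PR above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mosaicml/mpt-7b"  # illustrative checkpoint name
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name,
    torch_dtype=torch.bfloat16,
    device_map="auto",       # shards layers across available GPUs via accelerate
    trust_remote_code=True,  # MPT uses custom modeling code
)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
out = model(**inputs, output_attentions=True)  # torch attention path, per the PR
print(len(out.attentions), out.attentions[0].shape)  # one tensor per layer
```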

Just chiming in that I am already using this branch for testing the chat model!

Tests are failing with

```
___________________ ERROR collecting tests/test_training.py ____________________
ImportError while importing test module '/llm-foundry/tests/test_training.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
llmfoundry/__init__.py:8: in <module>
from...
```

I believe that AWS instance has 4x T4s ~= 64GB of VRAM. You want at least twice that. Also, this stack is mostly tested on A100s and there have been reports...
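
As a rough back-of-the-envelope check (my numbers, not from the thread): a T4 has 16GB, so four of them give 64GB, while full finetuning a 7B model with Adam in mixed precision wants on the order of 16 bytes per parameter:

```python
# Rough rule of thumb, not an exact measurement: mixed-precision training with
# Adam keeps fp16 weights + grads (2 + 2 bytes/param) plus fp32 master weights
# and two optimizer moments (4 + 4 + 4 bytes/param), ignoring activations.
n_params = 7e9
bytes_per_param = 2 + 2 + 4 + 4 + 4  # = 16
print(f"{n_params * bytes_per_param / 1e9:.0f} GB")  # ~112 GB, well above 64 GB
```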

Do you get an OOM with the A10s as well, or a different error?

To echo what @pcavanaugh said: to get WOW working with Browserify, I had to download the AMD branch and change a few `this`s to `window`.

I think having this option is good; some users almost certainly want it. However, I think this should be optional, as I am not convinced it shouldn't learn to predict...

@vchiley for models which have both EOS and BOS, are you saying we shouldn't learn that BOS comes after EOS? It isn't worth learning, true, but also... we'll always stop generating...

As discussed on Slack, I think that:

* EOS is effectively a BOS token, and so we want P(t|EOS) to be different than P(t), so we don't want to mask...
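
To make the masking question concrete, here is a hypothetical sketch (the helper name and shapes are mine, not from the codebase) of what masking P(t|EOS) would look like, using the usual `-100` ignore index for the cross-entropy loss:

```python
import torch

IGNORE_INDEX = -100  # labels with this value are excluded from the loss

def mask_after_eos(input_ids: torch.Tensor, labels: torch.Tensor, eos_id: int) -> torch.Tensor:
    """Hypothetical helper: mask the label of each token that immediately
    follows EOS, so the model is NOT trained on P(t | EOS).
    Assumes HF-style labels aligned with input_ids (shift happens in the model).
    """
    labels = labels.clone()
    follows_eos = torch.zeros_like(input_ids, dtype=torch.bool)
    follows_eos[:, 1:] = input_ids[:, :-1] == eos_id  # previous token is EOS
    labels[follows_eos] = IGNORE_INDEX
    return labels
```

The position above argues against calling such a helper: leaving those labels unmasked keeps the P(t|EOS) term in the loss, so the model can learn it differs from P(t).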