Alyssa Vance
@ArthurZucker is this still outstanding?
Adding these lines seems to fix it, not sure if this is the best/most general solution though:

```
tokenizer = AutoTokenizer.from_pretrained(
    args.model_name_or_path,
    use_fast=not args.use_slow_tokenizer,
    trust_remote_code=args.trust_remote_code
)
tokenizer.pad_token = tokenizer.eos_token
config.pad_token_id...
```
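The snippet above is cut off at `config.pad_token_id`, so for context, here is a minimal sketch of the usual pad-token pattern in transformers. The model id is a placeholder and the last line is only a guess at the truncated continuation, not the comment's exact code:

```
from transformers import AutoConfig, AutoTokenizer

# Placeholder checkpoint; substitute whatever model is actually being loaded.
model_name_or_path = "meta-llama/Llama-2-7b-hf"

config = AutoConfig.from_pretrained(model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

# Llama-style tokenizers ship without a pad token, so reuse EOS for padding.
tokenizer.pad_token = tokenizer.eos_token
# Presumed continuation of the truncated line: keep the config's pad_token_id
# in sync with the tokenizer so padded positions are handled consistently.
config.pad_token_id = tokenizer.pad_token_id
```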
@rib-2 Thanks for this! Unfortunately it doesn't work on my machine (8xA100), presumably because it's designed for only one GPU?

```
alyssavance@7e72bd4e-02:/scratch/brr$ python3 marlin/conversion/convert.py --model-id "TheBloke/Llama-2-7B-Chat-GPTQ" --save-path "./marlin-chat" --do-generation
Loading...
```
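If the single-GPU guess is right, one thing to try (a sketch under that assumption, not something the marlin script documents) is hiding all but one device before torch initializes CUDA; the equivalent shell form would be prefixing the command with `CUDA_VISIBLE_DEVICES=0`.

```
import os

# Must run before torch initializes CUDA: expose only the first A100.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch

# The conversion code should now see exactly one device.
print(torch.cuda.device_count())  # expected: 1
```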