orellavie1212
Any updates on that bug? I can't run it even with a batch_size of 2 or 8 (tried on SageMaker with ml.g5.12xlarge and ml.g4dn.12xlarge). I am out of ideas, even tried...
Yes, that is the problem! I found out the dict is empty when I check the adapter_weights. How did you fix it specifically? I'll mock it. At the checkpoints there...
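For anyone else hitting this, a minimal sketch of the emptiness check (the helper name is mine, just for illustration; `torch` is only needed for the actual load):

```python
def adapter_is_empty(state_dict):
    """True if the loaded checkpoint contains no LoRA tensors at all."""
    lora_keys = [k for k in state_dict if "lora" in k.lower()]
    return len(lora_keys) == 0

# Actual usage (needs torch and the adapter file on disk):
#   import torch
#   sd = torch.load("adapter_model.bin", map_location="cpu")
#   print("empty adapter!" if adapter_is_empty(sd) else f"{len(sd)} tensors")
```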
In the main directory I have:

```
adapter_config.json
adapter_model.bin
checkpoint-69
checkpoint-72
checkpoint-75
runs
```

In the specific checkpoint directory (checkpoint-75):

```
optimizer.pt
pytorch_model.bin
rng_state.pth
scaler.pt
scheduler.pt
special_tokens_map.json
tokenizer_config.json
tokenizer.json
trainer_state.json
training_args.bin
```

you only need...
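A small stdlib-only sketch of the sanity check implied here: PEFT-style adapters load from a directory that contains both `adapter_config.json` and `adapter_model.bin` (the function name is my own, for illustration):

```python
import os

REQUIRED = ("adapter_config.json", "adapter_model.bin")

def missing_adapter_files(adapter_dir):
    """Return the required adapter files absent from adapter_dir."""
    return [f for f in REQUIRED
            if not os.path.isfile(os.path.join(adapter_dir, f))]
```

An empty return means the directory at least has the right files; it says nothing about whether `adapter_model.bin` actually holds weights.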
I have adapter_model.bin but it is empty, as I said. I tried to understand what you suggested, but the only possibility I see is to take pytorch_model.bin from the checkpoint...
Yes, the weights actually loaded successfully now. Any idea if the last checkpoint is exactly the end of training? Or did I miss part of an epoch, so that adapter_model.bin is more advanced in...
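One way to answer this without guessing is to read `trainer_state.json` inside the checkpoint directory and compare its recorded step against the full run. A minimal stdlib sketch, assuming the file has the usual `global_step` and `epoch` fields that the HF Trainer writes:

```python
import json

def checkpoint_progress(trainer_state_path):
    """Return (global_step, epoch) recorded in a HF Trainer checkpoint."""
    with open(trainer_state_path) as f:
        state = json.load(f)
    return state["global_step"], state["epoch"]

# If global_step here is lower than the total steps of the run,
# the root-level adapter_model.bin is ahead of this checkpoint.
```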
> I have an example script that works with Mixtral:
>
> https://github.com/casper-hansen/AutoAWQ/blob/main/examples/basic_vllm.py

Checking it right now. https://github.com/casper-hansen/AutoAWQ/blob/main/examples/mixtral_quant.py, I hope this is the configuration you added to your model at...
I thought the solution for general Mixtral (not GPTQ- or AWQ-quantized, just the regular one) was via .pt: https://huggingface.co/IbuNai/Mixtral-8x7B-v0.1-gptq-4bit-pth/tree/main, though even that one is .bin (no .pt found on HF)...
> I just used the following Docker image and ran `pip install vllm`
>
> `runpod/pytorch:2.1.1-py3.10-cuda12.1.1-devel-ubuntu22.04`

I am using the DJL container v25 with the same setup (py3.10, torch 2.1.1, CUDA 12.1).
> Could you try the Docker image I referenced to see if it's an environment issue?

tp=1 works, but tp=2 errors; I found different named_parameters...
> Not sure if this relates to #2203. Does it work in FP16 with TP > 1?

I also tried fp16 besides auto.
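For reference, forcing fp16 instead of `auto` at tp=2 would look roughly like this (a config sketch only, not run here; the model id is a placeholder for whatever you are loading):

```python
from vllm import LLM

# Assumes vLLM is installed and two GPUs are visible.
llm = LLM(
    model="mistralai/Mixtral-8x7B-v0.1",  # placeholder: use your model path
    dtype="float16",                      # explicit fp16 instead of "auto"
    tensor_parallel_size=2,               # the tp=2 case that errored above
)
```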