Daniel Han
@patrickjchen I re-did the Kaggle notebook and fixed the issue! Use the latest one here: https://www.kaggle.com/code/danielhanchen/kaggle-mistral-7b-unsloth-notebook
@patrickjchen I re-ran my notebook - it seems to work fine. If you're using your own notebook, you need to copy the notebook I provided exactly, then add your code...
@patrickjchen Hmm, I think it's torch 2.2.1 or 2.2.2 - tbh I'm unsure. Yes, xformers itself installs torch.
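A quick way to check which torch build the xformers install pulled in (a minimal sketch; it just assumes both packages are importable):

```python
# Print the versions that actually got installed, since xformers
# pulls in its own pinned torch as a dependency.
import torch
import xformers

print(torch.__version__)     # e.g. 2.2.1 or 2.2.2
print(xformers.__version__)
```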
@hvico Oh no - https://github.com/kuleshov-group/caduceus/issues/2 - it seems like 1080s are no longer supported? :( They were before, hmmm.
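If you want to check whether your card is affected, here is a small sketch: a GTX 1080 reports compute capability sm_61, and some libraries' newer kernels assume sm_70 (Volta) or above.

```python
import torch

# GTX 10-series cards report compute capability 6.x (a 1080 is sm_61);
# kernels built for Volta or newer require sm_70+.
major, minor = torch.cuda.get_device_capability(0)
print(f"sm_{major}{minor}", "(pre-Volta)" if major < 7 else "(Volta or newer)")
```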
There are some tags on our GitHub repo - you could try our first-ever version to see if it works.
Hmmm, `temp_QA` is supposed to be for prefilling - I wonder why it doesn't work. Do you know if other models have this issue?
@JhonDan1999 The padding side of "right" is for training only - this is not a bug. For inference, you must use padding side "left", or else you'll get wrong results.
@JhonDan1999 Yes, so left is only used for generation. If you use right, you'll get gibberish. This is only important for batched decoding, not single-sequence decoding.
I auto-change the padding side to fix this issue.
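A minimal sketch of the padding-side behavior described above, using the HF transformers API (the checkpoint name and prompts are placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mistralai/Mistral-7B-v0.1"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Mistral has no pad token by default

# Training: pad on the right so labels line up with the real tokens.
tokenizer.padding_side = "right"

# Batched generation: pad on the LEFT - otherwise the model continues
# from trailing pad tokens and you get gibberish.
tokenizer.padding_side = "left"
inputs = tokenizer(
    ["Hello, my name is", "The capital of France is"],
    return_tensors="pt", padding=True,
)
outputs = model.generate(**inputs, max_new_tokens=20,
                         pad_token_id=tokenizer.pad_token_id)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```

With right padding, the pad tokens sit between the prompt and the generated continuation, which is why batched decoding breaks; a single unpadded sequence is unaffected.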
@pratikkotian04 https://github.com/huggingface/transformers/issues/17117 - maybe this can help?