Daniel Han

Results 781 comments of Daniel Han

@patrickjchen I re-did the Kaggle notebook and fixed the issue! Use the latest one here: https://www.kaggle.com/code/danielhanchen/kaggle-mistral-7b-unsloth-notebook

@patrickjchen I re-ran my notebook - it seems to work fine. If you're using your own notebook, you need to copy the notebook I provided exactly, then add your code...

@patrickjchen Hmm, I think it's torch 2.2.1 or 2.2.2, but tbh I'm unsure. Yes, xformers itself installs torch.

@hvico Oh no - https://github.com/kuleshov-group/caduceus/issues/2 it seems like 1080s are no longer supported? :( They were before, hmmm

There are some tags on our GitHub repo - you could try our first-ever version to see if it works

Hmmm `temp_QA` is supposed to be for prefilling - I wonder why it doesn't work - do you know if other models have this issue?

@JhonDan1999 The padding side of "right" is for training only - this is not a bug. For inference, one must use padding "left" or else you'll get wrong results.

@JhonDan1999 Yes, so left is only used for generation. If you use right, you'll get gibberish. This is only important for batched decoding, not single decoding.
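
To make the padding point concrete, here's a tiny pure-Python sketch (no model or tokenizer involved; `PAD` and `pad_batch` are made-up names just for illustration). It assumes a decoder-only model that continues from the last position of each row, and shows what sits in that position for each padding side:

```python
PAD = "<pad>"  # hypothetical pad token, for illustration only


def pad_batch(sequences, side):
    """Pad lists of tokens to a common length on the given side."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    width = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        fill = [PAD] * (width - len(seq))
        padded.append(fill + seq if side == "left" else seq + fill)
    return padded


batch = [["The", "cat", "sat"], ["Hi"]]

right = pad_batch(batch, "right")
left = pad_batch(batch, "left")

# Generation appends new tokens after the last column of every row,
# so look at what each row ends with.
last_right = [row[-1] for row in right]
last_left = [row[-1] for row in left]

print(last_right)  # ['sat', '<pad>'] -> the short prompt would continue from padding
print(last_left)   # ['sat', 'Hi']    -> every prompt ends on a real token
```

With a single sequence there's nothing to pad, so the side doesn't matter. With a Hugging Face tokenizer the equivalent switch is `tokenizer.padding_side = "left"` before batched generation.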

I automatically change the padding side to fix this issue

@pratikkotian04 https://github.com/huggingface/transformers/issues/17117 maybe this can help?