Kolesh jr
Is `format: json` on by default? Because using LangChain with ChatOllama, it also hangs even without the `format: json` option.
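A minimal sketch of the call I mean, with the model name as a placeholder and `langchain_community` assumed as the import path:

```python
# Minimal sketch: explicitly setting format="json" on ChatOllama.
# Model name is a placeholder; langchain_community is assumed as the integration package.
from langchain_community.chat_models import ChatOllama

llm = ChatOllama(model="mistral", format="json")  # omit format=... to use the default (plain text)
print(llm.invoke("Return a JSON object with a single key 'answer'.").content)
```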
I am also getting the same error. Did you find a fix?
I am having the same issue even on the new version, 0.1.28. It happens after about 200 iterations on a custom fine-tuned 4-bit Mistral on Colab's free-tier T4.
Hey @danielhanchen, I am facing this issue during inference: NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs: query: shape=(1, 2327, 8, 4, 128) (torch.float16), key: shape=(1, 2327, 8,...
Yes, I did. It fails on the free-tier T4 when you call model.generate, but it passes on a V100.
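Roughly the pattern I am running, as a sketch; the model name, prompt, and sequence length are placeholders, not the exact values from my notebook:

```python
# Sketch of the inference path that hangs on the free-tier T4 (names/values are placeholders).
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",  # assumed 4-bit checkpoint
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable inference mode

inputs = tokenizer("Hello, world", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=64)  # the call that hangs on the T4
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```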
@danielhanchen These are the new imports that you suggested in this thread:
import torch
major_version, minor_version = torch.cuda.get_device_capability()
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
if major_version >= 8: !pip install...
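For reference, a sketch of what that Colab install cell typically expands to; the package lists inside the branches are an assumption based on the public Unsloth Colab notebooks, not something confirmed in this thread:

```python
# Colab/IPython cell sketch. Branch package lists are assumed, not confirmed here.
import torch

major_version, minor_version = torch.cuda.get_device_capability()

# Install Unsloth itself from GitHub.
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"

if major_version >= 8:
    # Newer GPUs (Ampere/Ada, e.g. A100, RTX 30/40 series): include flash-attn.
    !pip install --no-deps packaging ninja einops flash-attn xformers trl peft accelerate bitsandbytes
else:
    # Older GPUs (e.g. T4, V100): skip flash-attn.
    !pip install --no-deps xformers trl peft accelerate bitsandbytes
```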
@danielhanchen Apparently, for some reason, it is now fixed. Sorry for the noise. I appreciate your feedback though. Thanks!
This bug is so frustrating and it doesn't seem to be fixed even in the newer versions
Could someone help us? This issue still persists: I have updated to the latest release, v0.1.28, and it still gets stuck after around 200 iterations on Google Colab...