Shrivu Shankar
Hey, that sounds super cool! I don't have an example on hand, but this is very doable. You'd essentially just have "preprocess rows" return the raw voxel data, then "forward"...
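For anyone finding this later, here's a rough sketch of what that could look like for voxel data. The class and method names below just mirror the pattern described above (a preprocess step that returns raw voxel tensors and a forward that projects them into token embeddings); they are illustrative assumptions, not the exact multi_token interface, so check the existing modality implementations in the repo for the real signatures.
```
import torch
import torch.nn as nn

# Illustrative sketch only: names, shapes, and the encoder architecture are
# made up for the example and do not match multi_token's actual API.
class VoxelModality(nn.Module):
    def __init__(self, grid_size=32, hidden_dim=1024, num_output_tokens=8):
        super().__init__()
        self.grid_size = grid_size
        self.num_output_tokens = num_output_tokens
        self.hidden_dim = hidden_dim
        # Small 3D CNN encoder followed by a projector into the LLM's
        # embedding space (hypothetical architecture, just to be concrete).
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv3d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Flatten(),
        )
        enc_dim = 32 * (grid_size // 4) ** 3
        self.projector = nn.Linear(enc_dim, num_output_tokens * hidden_dim)

    def preprocess_rows(self, rows):
        # Return the raw voxel grids as tensors, one per dataset row.
        # Assumes each row is a dict with a "voxels" array of shape (G, G, G).
        return [torch.tensor(row["voxels"], dtype=torch.float32) for row in rows]

    def forward(self, voxel_batch):
        # voxel_batch: (batch, G, G, G) -> (batch, num_output_tokens, hidden_dim)
        x = voxel_batch.unsqueeze(1)  # add a channel dimension for Conv3d
        feats = self.encoder(x)
        tokens = self.projector(feats)
        return tokens.view(-1, self.num_output_tokens, self.hidden_dim)
```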
More data? In theory, going from 249 to 8 tokens will actually overfit easily (so low training loss but high test loss). You can also try pre-training the projector on some proxy task...
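To make the proxy-task idea concrete, here's one possible minimal sketch: train just the projector with an autoencoder-style reconstruction objective on top of frozen encoder features before any LoRA fine-tuning, then reuse those projector weights. The dimensions, the throwaway decoder head, and the choice of objective are all illustrative assumptions, not something built into the library.
```
import torch
import torch.nn as nn

# Hypothetical proxy task: reconstruct the frozen encoder features from the
# projector's own output, so the projector learns a reasonable mapping before
# the LLM is involved. Shapes/names are placeholders.
enc_dim, hidden_dim, num_tokens = 512, 1024, 8
projector = nn.Linear(enc_dim, num_tokens * hidden_dim)
decoder = nn.Linear(num_tokens * hidden_dim, enc_dim)  # throwaway head, discarded after pre-training
optimizer = torch.optim.AdamW(
    list(projector.parameters()) + list(decoder.parameters()), lr=1e-4
)

def proxy_step(encoder_features):
    # encoder_features: (batch, enc_dim) from your frozen modality encoder
    tokens = projector(encoder_features)
    recon = decoder(tokens)
    loss = nn.functional.mse_loss(recon, encoder_features)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```
After this, you'd initialize the projector in the actual fine-tuning run from these weights rather than from scratch.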
Will also note that loss, especially in the context of LoRA fine-tuning like this, can be misleading / not an accurate representation of efficacy. It's worth just sampling/testing your weights...
This library was mainly to proof-of-concept these different modalities, so I didn't mess with decoding params too much. No reason it's not included (they'd work the same as any...
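For reference, decoding params here are just the standard Hugging Face `generate()` kwargs; the model name below is a placeholder for wherever your fine-tuned weights live, and multi_token's own inference wrapper may differ slightly in how you load them.
```
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model path; substitute your own fine-tuned checkpoint.
tokenizer = AutoTokenizer.from_pretrained("your-finetuned-model")
model = AutoModelForCausalLM.from_pretrained("your-finetuned-model")

inputs = tokenizer("Describe this scene:", return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,   # sample instead of greedy decoding
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```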
Hey! Probably not. If there's strong enough interest I can take a look, but right now it seems like it would take a bit of time and budget.
What's the command and dataset you are using?
@linchen111 I'm curious if you are able to run the CLIP demo code with one of your example images: https://huggingface.co/docs/transformers/model_doc/clip
```
from PIL import Image
import requests
from transformers import ...
```
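For convenience, the full demo snippet from that page looks roughly like this (swap the COCO URL for one of your own images):
```
from PIL import Image
import requests
from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Replace this URL with one of your example images, e.g. Image.open("my_image.png")
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(
    text=["a photo of a cat", "a photo of a dog"],
    images=image,
    return_tensors="pt",
    padding=True,
)

outputs = model(**inputs)
logits_per_image = outputs.logits_per_image  # image-text similarity scores
probs = logits_per_image.softmax(dim=1)      # label probabilities
print(probs)
```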
Let me look into this -- tbh not super sure. I ran all the commands from the README at some point, so it seems like it just disappeared.
Still looking -- `adapter_model.bin` is in my upload script https://github.com/sshh12/multi_token/blob/main/scripts/upload_model.py, but I don't see any edit logs on Hugging Face since uploading.
Yeah, still not sure. Might just update the README to show that the pre-trained models are no longer available ):