Marianna
Hi! Can I do the image registration? :)
> This is more efficient since it doesn't bring the data in memory: > > ```python > for i in range(len(dset) // batch_size) > start = i * batch_size >...
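A minimal sketch of the batched-slicing idea in the quoted comment, not the quoted code itself (which is cut off above). `batch_bounds` is a hypothetical helper and the plain list stands in for `dset`; the only assumption about `dset` is that it supports `len()` and slicing. Note that `len(dset) // batch_size` skips a final partial batch, so this version steps through the range instead.

```python
def batch_bounds(n_items, batch_size):
    """Yield (start, end) index pairs that cover range(n_items) in order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, n_items, batch_size):
        # Clamp the end so the last, possibly shorter, batch is included.
        yield start, min(start + batch_size, n_items)


# Stand-in for a real dataset object (hypothetical data).
dset = list(range(10))
batch_size = 4

# Only one slice is materialized per step.
batches = [dset[start:end] for start, end in batch_bounds(len(dset), batch_size)]
print(batches)
```

Whether slicing actually avoids loading everything into memory depends on how the dataset object implements it.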
Hi @lucidrains! Can you use a Riffusion spectrogram as input to the `encode_image` function?
@lucidrains no, unfortunately I get this error: `RuntimeError: Given groups=1, weight of size [768, 3, 16, 16], expected input[2, 1, 32, 1024] to have 3 channels, but got 1 channels...
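For context, the weight shape `[768, 3, 16, 16]` in this error means the first convolution expects 3 input channels, while the input `[2, 1, 32, 1024]` has only 1. One possible workaround, assuming the model is left at 3 channels, is to repeat the single spectrogram channel. This is a sketch only: the tensor below is random stand-in data and `spec_rgb` is a hypothetical name.

```python
import torch

# Stand-in batch of 2 single-channel spectrograms, same shape as in the error message.
spec = torch.randn(2, 1, 32, 1024)

# Broadcast the single channel to 3 channels along dim 1.
# expand() returns a view, so no data is copied.
spec_rgb = spec.expand(-1, 3, -1, -1)

print(spec_rgb.shape)
```

The alternative is to configure the model's patch embedding for 1 input channel, so the spectrogram can be passed unchanged.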
@lucidrains I checked again and now it works! I just forgot that I'd made changes to the code. Sorry, that's my bad!
@lucidrains yes, I changed back to 1 channel and it worked. I also tried to run it over a batch of images, but that didn't work :(
> @marianna13 i'll add the `MulanCoCa` version tomorrow too, so we can possibly leap frog the state of the art going on within google

That's great! Thank you :)
Hi @lucidrains ! Sorry for the late reply. Here's the code I'm using: ```python import torch import cv2 from src.open_clip import AudioCLIP, CLIPAudioCfg, CLIPTextCfg import webdataset as wds import sys...
@lucidrains it works! Thank you! :)
Hey @lucidrains, I tried to train a model with a small fraction of the dataset but it gets stuck at the first epoch and then gets killed. I can post...