Results 14 issues of ethan cohen

Hi, first, thanks for all the shared work! I have a question concerning the sampling function in the Vanilla VAE. Why do you sample from a normal distribution (0,1)...
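
For context on the question above, sampling from a standard normal in a VAE is commonly explained by the reparameterization trick: noise is drawn from N(0, 1) and then shifted and scaled by the encoder outputs. The block below is only a minimal pure-Python sketch of that idea. The function name `reparameterize` and its arguments `mu` and `log_var` are illustrative assumptions, not code from the repository in question, which may implement this differently.

```python
import math
import random


def reparameterize(mu, log_var, rng=random):
    """Return z = mu + sigma * eps with eps ~ N(0, 1), element-wise.

    mu and log_var are hypothetical encoder outputs (plain lists of floats).
    """
    z = []
    for m, lv in zip(mu, log_var):
        sigma = math.exp(0.5 * lv)   # log-variance -> standard deviation
        eps = rng.gauss(0.0, 1.0)    # noise drawn from the standard normal
        z.append(m + sigma * eps)    # shift and scale: z ~ N(m, sigma^2)
    return z
```

Because the randomness is isolated in `eps`, `z` is a deterministic function of `mu` and `log_var` given the noise, which is what lets gradients reach the encoder in a framework with automatic differentiation.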

Hi again :) Is there any way to easily change the transformer architecture, as in x-clip? I would like to use my own (which is pretrained) :)...

Hi, is it possible to use the models with images that have more than 3 channels (5 in my case)? Thanks a lot.

Hi, I am trying to pass image size=300 as an argument, but I'm getting this error (my images are indeed of size 300): assert h == img_size and w == img_size,...

Hi, is there a way to make the latent diffusion model able to get the context from any user-defined modality, with a training script associated to...

Hi, I would like to know (if it is possible) how to get the graph from a list of SMILES and then feed it into a GNN (GIN for...

Hi, is there a way to get MegaMolBART embeddings from SMILES, using it as a pretrained encoder, with the associated tokenizer if needed? Thanks a lot.

Hi, is there a way to use this model to train conditional image generation (from text or other) with a custom dataset? Thanks.

Is it possible to use and train dalle with an external (frozen) text encoder (such as those available on Hugging Face)?

Do you have any idea how the performance of stable-diffusion/latent-diffusion differs depending on the way the conditioning is incorporated (CLIP-based vs. scratch-based vs. pretrained-text-based, for example)...