Giovanni Puccetti
@Pclanglais Hi, I will get to work on 1 as soon as I can, though I can't do it right away. For 2, did you try setting `generation_type="top_p"` inside `.generate`?...
Hi @alexcode4u, the size difference between the checkpoint and the model is because the checkpoint stores more than the state_dict alone (I think the gradients are the largest part), while...
@alexcode4u if I understand correctly, you are running this on Kaggle. I am not too familiar with it, so please bear with me a little and I will try to help. The...
Hey @alexcode4u, no worries at all, I'm happy to help. There isn't an option to do it automatically; a simple way could be to replicate the input several times. I will...
@alexcode4u I will try to help tomorrow. What do you mean when you say that the main problem is generating results?
@alexcode4u sorry, I am really busy. Which generation_type are you using?
@alexcode4u unfortunately, speeding up beam search is a lot of work, and I don't think it will be done anytime soon. You could try top_p as generation type,...
@alexcode4u if I understand your problem correctly, making beam_search faster needs a large rewrite and will take a lot of time. However, can I ask you...
@alexcode4u so if the issue is just doing it in a few lines of code, you can call .repeat or .repeat_interleave on the batch; however, this will of course take longer....
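
To illustrate the difference between the two calls mentioned above, here is a minimal sketch using PyTorch's `Tensor.repeat` and `Tensor.repeat_interleave`. The toy 2x2 tensor and the names `batch`, `tiled`, and `interleaved` are assumptions for illustration only; a real image batch would have more dimensions, and `.repeat` needs one repeat count per dimension.

```python
import torch

# Hypothetical toy "batch" with two samples; a stand-in for real inputs.
batch = torch.tensor([[1, 2],
                      [3, 4]])

# .repeat tiles the whole batch: sample order becomes 0, 1, 0, 1.
tiled = batch.repeat(2, 1)

# .repeat_interleave repeats each sample in place: order becomes 0, 0, 1, 1.
interleaved = batch.repeat_interleave(2, dim=0)

print(tiled.tolist())
print(interleaved.tolist())
```

Either way the batch grows by the repeat factor, so generation takes correspondingly longer; the choice mainly affects how the outputs are grouped per input afterwards.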
Hi @vturrisi, so there are two things. The output from _encode_image in CoCa has already been passed through attentional pooling, so in principle it might differ from the same in...