Varuna Jayasiri
What model are you referring to?
Sorry for the very late reply. I'm not sure what exactly you are referring to. Could you please point to a line or a section of code?
Very sorry about the late reply. Finetune uses pipeline parallelism, but this error doesn't seem to be caused by that. Did it train for some or did it crash...
Yes, you are correct, we missed the activation layer.
Can you give link(s) to other implementations? Thanks
Closing because of no reply
Which notebook are you running?
The data path is `[project_folder]/data/celebA`
In cached sin and cos the first dimension is the sequence length. So it should be `cos_cached[:x.shape[0]]`
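To illustrate the slicing, here is a minimal dependency-free sketch. It assumes the cache is laid out as one row per position and only covers the cosine half of the rotation; `build_cos_cache` and `scale_by_cos` are hypothetical helper names, not functions from the repository, and nested lists stand in for tensors (so `len(x)` plays the role of `x.shape[0]`).

```python
import math


def build_cos_cache(max_len, d, base=10000.0):
    """Return max_len rows; row p holds cos(p * theta_i) for each of d features."""
    # Assumed layout: theta_i = base^(-2i/d) for i in [0, d/2), repeated to cover d features
    thetas = [base ** (-2.0 * i / d) for i in range(d // 2)]
    return [[math.cos(p * t) for t in thetas + thetas] for p in range(max_len)]


def scale_by_cos(x, cos_cached):
    """Multiply x (seq_len rows of d features) by the matching cached cosines."""
    seq_len = len(x)            # stands in for x.shape[0]
    cos = cos_cached[:seq_len]  # slice the first (sequence-length) dimension
    return [[v * c for v, c in zip(row, cos_row)] for row, cos_row in zip(x, cos)]


cos_cached = build_cos_cache(max_len=8, d=4)  # cache built for up to 8 positions
x = [[1.0] * 4 for _ in range(3)]             # a shorter sequence of length 3
out = scale_by_cos(x, cos_cached)
```

Slicing on the first dimension keeps one row of cosines per position actually present in `x`, so a cache built for a longer sequence can be reused for shorter inputs.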
Although a lot of people showed interest in annotating, almost no one used lit.labml.ai; so we took it down because of server costs. We will consider open sourcing the annotation...