Rohan
Hey, doesn't the original TF implementation have only four convolution layers and two fully connected layers? This one has six and three... why the difference? How could the embeddings be identical then?
Hello, nice work! I was just wondering why you convert every label
Where exactly is equation (3) from the main paper implemented in the SAGA algorithm?
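For reference, here is a hedged sketch of the standard SAGA update (Defazio et al., 2014) to make the question concrete; the function names, step size, and structure below are assumptions and may not match equation (3) or the repo's actual implementation.

```python
import numpy as np

def saga(grad_fn, w0, n, lr=0.1, steps=1000, seed=0):
    """grad_fn(i, w) -> gradient of the i-th component function at w (same shape as w0)."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float).copy()
    table = np.stack([grad_fn(i, w) for i in range(n)])  # stored per-sample gradients
    avg = table.mean(axis=0)                             # running average of the table
    for _ in range(steps):
        j = rng.integers(n)
        g = grad_fn(j, w)
        # variance-reduced, unbiased gradient estimate: g_j(w) - old_g_j + mean(table)
        w -= lr * (g - table[j] + avg)
        avg += (g - table[j]) / n                        # keep the average consistent
        table[j] = g
    return w

# Hypothetical usage: least squares with f_i(w) = 0.5 * (x_i @ w - y_i) ** 2
X = np.random.default_rng(1).normal(size=(50, 3))
y = np.zeros(50)
w_hat = saga(lambda i, w: X[i] * (X[i] @ w - y[i]), np.ones(3), n=50, lr=0.01)
```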
Hello, thanks for the great work! I was wondering about the reason behind using self.LARGE_NUMBER. I understand that it serves to suppress the logits due to self...
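A minimal sketch, assuming a SimCLR-style contrastive loss, of the pattern such a constant usually supports: subtracting a large number from the diagonal of the similarity matrix pushes the self-similarity logits toward minus infinity, so they contribute essentially zero probability after the softmax. The names and values below (LARGE_NUMBER, the temperature) are illustrative, not the repo's actual code.

```python
import torch
import torch.nn.functional as F

LARGE_NUMBER = 1e9  # illustrative value; the repo may use a different constant

def contrastive_logits(z: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """z: (N, D) L2-normalized embeddings; returns self-masked similarity logits."""
    logits = z @ z.t() / temperature                    # (N, N) pairwise similarities
    eye = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    logits = logits.masked_fill(eye, -LARGE_NUMBER)     # suppress self-similarity
    return logits

# After masking, the softmax assigns ~0 probability to each embedding's own entry.
z = F.normalize(torch.randn(8, 128), dim=1)
print(contrastive_logits(z).softmax(dim=1)[0, 0])       # ≈ 0
```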
Hi, why are the multilevel attentions being used during encoding? According to the paper on Multimodal attention, they are used only during decoding.
Hi, the dataset isn't available at the links you mentioned earlier in a different issue. Could you kindly guide me?
Is there a chance that you might train a model with a larger context capacity, like Llama-2? Thanks!