Trevor Morris

Results: 27 comments by Trevor Morris

> I initially ran SRResNet with VGG54 for 10^5 iterations.

How do the results of this training look? If I recall correctly, in the paper they train SRResNet...

Sorry for the delay.

1. Yes, that is a good way to change the scaling to 4x.
2. That looks correct. How does the output look?
3. That is strange,...

srgan.SRGanGenerator and srgan.SRGanDiscriminator can be created with num_upsamples=3 for an 8x factor. There will also be some other parts in the code that need to be adjusted from 4x to...
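As a rough illustration of why `num_upsamples=3` gives an 8x factor: assuming each upsampling stage doubles the height and width (an assumption consistent with the numbers above, not something checked against the repository code), the overall factor is 2 to the power of `num_upsamples`. The helper names below are hypothetical and are not part of the `srgan` module.

```python
def scale_factor(num_upsamples: int) -> int:
    """Overall upscaling factor, assuming every upsample stage doubles H and W."""
    if num_upsamples < 0:
        raise ValueError("num_upsamples must be non-negative")
    return 2 ** num_upsamples


def num_upsamples_for_scale(scale: int) -> int:
    """Number of 2x stages needed for a given power-of-two scale (hypothetical helper)."""
    if scale < 1 or scale & (scale - 1) != 0:
        raise ValueError("scale must be a positive power of two, e.g. 2, 4, 8")
    return scale.bit_length() - 1


def output_size(height: int, width: int, num_upsamples: int) -> tuple[int, int]:
    """Expected high-resolution size for a low-resolution input of (height, width)."""
    factor = scale_factor(num_upsamples)
    return height * factor, width * factor


if __name__ == "__main__":
    # 8x needs three doubling stages; 4x needs two.
    print(num_upsamples_for_scale(8))
    print(output_size(24, 24, num_upsamples=3))
```

A quick check like this can help when hunting for the other places in the code that still assume 4x, such as the crop sizes used to build low-resolution and high-resolution training pairs.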

Thanks for filing this issue @AlessioNetti, I was able to reproduce the bug. Taking a look now.

I think I found the issue. We should be able to get the fix in soon.

Hi @AlessioNetti, it looks like we intentionally changed this so that the logits length will match the tokens. We will consider whether it should be added back.

@ezhulenev Could you please take a look when you have a chance?

> Is it possible to test those?

Thanks for reviewing, added a test to `xla_client_test`.

I am also encountering this issue when using dynamic rope scaling. Here is what's happening:

1. During `LlamaAttention.__init__()`, the `LlamaRotaryEmbedding` module is initialized. No `device` arg is provided: https://github.com/huggingface/transformers/blob/d6ba1ac041ac0b07bc589dd82a67cfb76f75d0f9/src/transformers/models/llama/modeling_llama.py#L304...