Nicolas Patry

Results 978 comments of Nicolas Patry

What do those `input_ids` correspond to? If they are a `LongTensor` like regular input ids, where did the image go? Does it need to be combined with them or not?

Shouldn't we implement `GitForVision2Seq` in the first case? It's a classic `encoder-decoder` case, correct?

It seems `input_ids` is **not** necessary: https://colab.research.google.com/drive/1sLiSY2ixv52yqlw9EUW6BPwCq9XPLc9R#scrollTo=b3XKBvEcU_PR&line=2&uniqifier=1 No? If it's a regular decoder for the text, then the `decoder_input_ids` should automatically be set by `generate`, making `GitForVision2Seq` possible. No...

Oh, this code is already not looking pretty; there could be a way to make it better. But we could always add
```python
class GitForVision2Seq(GitForCausalLM):
    def forward(self, pixel_values, ***):
        return super().forward(self,...
```
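
To make the wrapper idea above concrete, here is a minimal, self-contained sketch using stub classes rather than the real `transformers` models. Everything in it is hypothetical: `CausalLMStub` merely stands in for a causal LM such as `GitForCausalLM`, `Vision2SeqStub` for the proposed `GitForVision2Seq`, and the default start token id `101` is an assumption chosen for illustration. It only shows the delegation pattern, where the image is the required input and the text prompt becomes optional.

```python
from typing import List, Optional


class CausalLMStub:
    """Hypothetical stand-in for a causal LM such as `GitForCausalLM`."""

    def forward(self, input_ids: Optional[List[int]] = None, pixel_values=None) -> dict:
        # A real model would encode the image and decode text here;
        # this stub only reports which inputs it received.
        return {
            "input_ids": list(input_ids) if input_ids is not None else None,
            "has_image": pixel_values is not None,
        }


class Vision2SeqStub(CausalLMStub):
    """Hypothetical wrapper: `pixel_values` is required, the text prompt is optional."""

    def __init__(self, bos_token_id: int = 101):
        # Assumed start-of-sequence token id, for illustration only.
        self.bos_token_id = bos_token_id

    def forward(self, pixel_values, input_ids: Optional[List[int]] = None) -> dict:
        # When no prompt is given, fall back to a single start token so the
        # underlying causal LM still receives text input.
        if input_ids is None:
            input_ids = [self.bos_token_id]
        # `super().forward(...)` already binds `self`; it is not passed again.
        return super().forward(input_ids=input_ids, pixel_values=pixel_values)


model = Vision2SeqStub()
image_only = model.forward(pixel_values=[[0.0]])
with_prompt = model.forward(pixel_values=[[0.0]], input_ids=[101, 7, 8])
```

Whether the real model should default to a start token in `forward`, or leave that to `generate`, is a design question this sketch does not settle.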

Thank you for this PR. It looks promising. > is now able to process sequences longer than 512. Do you have a specific model in mind? `512` seems oddly specific....

> @Narsil, all tests passed except the code quality. I used black but it doesn't pass. I also update the schema above to explain the algorithm for update/aggregate scores try...

> I'll start creating a validation set to have results of what we've done. Sounds great! Again, don't hesitate to ask for resources for larger runs.

I haven't forgotten this PR; it seems to have some external interest. I wanted to set aside enough time for a proper review and haven't had much. I'm looking...

And here are more complete logs so you can inspect the edge cases a bit more if you want: wikiann, all languages x top 5 token classification models. Overall it...

Oops: https://gist.github.com/Narsil/e8609805e8e52c7e4114586eede8a481