fyang064

Results 7 issues of fyang064

Hello, I'm wondering how to fix the state_dict loading issue when trying to use a pre-trained model. Once the num_classes of the custom dataset is changed from 91, there are size mismatches for...
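One common workaround for this kind of mismatch is to drop the checkpoint entries whose shapes no longer fit and load the rest. The sketch below is only an illustration: `drop_mismatched` is a hypothetical helper, the key names and shapes are made up, and stand-in objects with a `.shape` attribute are used in place of real tensors so the example has no dependencies.

```python
from types import SimpleNamespace


def drop_mismatched(checkpoint_state, model_state):
    """Keep only checkpoint entries whose name and shape match the model."""
    kept, skipped = {}, []
    for name, tensor in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(tensor.shape):
            kept[name] = tensor
        else:
            skipped.append(name)
    return kept, skipped


# Hypothetical key names and shapes; real code would pass actual tensors.
checkpoint = {
    "backbone.weight": SimpleNamespace(shape=(64, 3)),
    "class_head.weight": SimpleNamespace(shape=(91, 256)),  # 91 classes
}
model = {
    "backbone.weight": SimpleNamespace(shape=(64, 3)),
    "class_head.weight": SimpleNamespace(shape=(5, 256)),  # custom num_classes
}

kept, skipped = drop_mismatched(checkpoint, model)
print(sorted(kept), skipped)
```

With PyTorch, the same helper could be given the checkpoint's state dict and `model.state_dict()`, followed by `model.load_state_dict(kept, strict=False)`. The skipped layers keep their fresh initialization and would need to be fine-tuned on the custom dataset.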

Hi Dr. Dai, I have seen your wonderful work on infrared small target detection. I'm wondering if your model could be applied to other infrared images like thermal frames for...

Just wondering whether there is anything special about training Gemini. For example, are there any differences when training on multimodal data?

Just wondering if there are any limitations of Infini-attention, such as inference speed or model performance. There is not much discussion of this in the paper.

I'm wondering about the cost of the features mentioned in the TUTEL paper. It looks like the dynamic features, including top-anything as well as the dynamic capacity factor, will introduce additional...

In Table 4 of the paper, the total GPU count of **16384** does not match the parallelism group [8, 16, 16, 4]. Is this a mistake in the paper?
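The mismatch can be checked with a quick calculation, assuming the total GPU count is meant to equal the product of the parallelism group sizes (the variable names here are illustrative):

```python
import math

parallelism = [8, 16, 16, 4]  # group sizes as quoted from Table 4
reported_gpus = 16384         # total GPU count as quoted from Table 4

# Under the assumption above, the groups multiply to the total GPU count.
implied_gpus = math.prod(parallelism)
print(implied_gpus, reported_gpus / implied_gpus)
```

Under that assumption the groups account for 8192 GPUs, half of the reported total.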

I saw the compositional approach to adding multimodal capabilities to Llama3 in the report, and I am curious about the details of the image encoder and adaptor. Can you please provide any...