fyang064 issues

Results: 7 issues of fyang064

Transfer learning issue

Hello, I'm wondering how to fix the state_dict loading issue when trying to use a pre-trained model. Once the num_classes of the custom dataset is changed from 91, size mismatches for...
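A common way to handle this kind of size mismatch is to load only the pre-trained tensors whose names and shapes still match the new model, and leave the resized layers at their fresh initialization. The sketch below assumes PyTorch; the two small `nn.Sequential` models are stand-ins (the issue does not say which model is involved), and `load_matching_weights` is a hypothetical helper name, not part of any library.

```python
import torch
from torch import nn


def load_matching_weights(model: nn.Module, pretrained_state: dict) -> list:
    """Copy only tensors whose key and shape match; return the skipped keys."""
    own_state = model.state_dict()
    compatible = {
        key: tensor
        for key, tensor in pretrained_state.items()
        if key in own_state and tensor.shape == own_state[key].shape
    }
    skipped = sorted(key for key in pretrained_state if key not in compatible)
    # strict=False tolerates the keys left out above (the resized head).
    model.load_state_dict(compatible, strict=False)
    return skipped


# Stand-in models: same backbone layer, different number of output classes.
pretrained = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 91))  # 91 classes
custom = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 5))       # assumed 5 classes

skipped_keys = load_matching_weights(custom, pretrained.state_dict())
print(skipped_keys)  # the head layer's weight and bias are skipped
```

The skipped layers keep their random initialization and would need to be trained on the custom dataset.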

Eval and demo function

Hi Dr. Dai, I have seen your wonderful work on infrared small target detection. I'm wondering if your model could be applied to other infrared images, like thermal frames for...

Just wondering whether there is anything special about training Gemini. For example, are there any differences when training on multimodal data?

Limitations of the method

Just wondering whether Infini-attention has any limitations, such as in inference speed or model performance. The paper does not discuss this much.

How about the cost of TUTEL features?

I'm wondering about the cost of the features mentioned in the TUTEL paper. It looks like the dynamic features, including top-anything as well as the dynamic capacity factor, will introduce the additional...

Scaling configurations (Table 4) in the paper "The Llama 3 Herd of Models"

In Table 4 of the paper, the total GPU count of **16384** does not match the parallelism group [8, 16, 16, 4]. Is this a mistake in the paper?
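The mismatch can be checked with a quick calculation. This sketch assumes the total GPU count should equal the product of the four parallelism degrees, which is how the question reads the table; the variable names are illustrative only.

```python
from math import prod

# Values quoted in the issue.
parallelism_group = [8, 16, 16, 4]
reported_gpus = 16384

# Assumption: total GPUs = product of all parallelism degrees.
implied_gpus = prod(parallelism_group)

print(implied_gpus)                   # 8192
print(reported_gpus // implied_gpus)  # 2
```

Under that assumption, the listed group accounts for 8192 GPUs, half of the reported total.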

Multimodal capabilities of Llama3

I saw the compositional approach to adding multimodal capabilities to Llama3 in the report, and am curious about the details of the image encoder and adapter. Can you please provide any...