Quan Sun
Hi @pandaupc, we are very happy to hear that; it's a really big improvement. We are wondering whether you plan to release your method.
@1338199 https://github.com/LAION-AI/CLIP_benchmark
Hi @yihong1120, Thank you for sharing your insights and demonstrating interest in Emu2's capabilities. Emu2, being a multimodal foundational model, indeed possesses the flexibility for fine-tuning with domain-specific knowledge. Your...
In my case, I have observed that a memory leak can occur when using ZeRO-3 and zero.Init in a distillation scenario.
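For context, a minimal sketch of the setup where I see this (the `TeacherModel`/`StudentModel` classes and the config path are illustrative stand-ins, not my actual code):

```python
import torch
import deepspeed

# Hypothetical toy modules standing in for the real teacher/student.
class TeacherModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(1024, 1024)

    def forward(self, x):
        return self.proj(x)

class StudentModel(TeacherModel):
    pass

ds_config = "ds_zero3_config.json"  # hypothetical ZeRO-3 config file

# zero.Init partitions parameters across ranks as modules are constructed,
# so even the frozen teacher's weights are sharded under ZeRO-3.
with deepspeed.zero.Init(config_dict_or_path=ds_config):
    teacher = TeacherModel()
    student = StudentModel()

teacher.eval()
for p in teacher.parameters():
    p.requires_grad_(False)  # distillation: the teacher is frozen

# Only the student is wrapped in an engine; optimizer settings come
# from the DeepSpeed config.
engine, _, _, _ = deepspeed.initialize(model=student, config=ds_config)

# Repeatedly gathering the frozen teacher's sharded parameters for
# forward passes is where I observe memory growing over time.
```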
Hi @tjruwase, I have opened an issue: [#3286](https://github.com/microsoft/DeepSpeed/issues/3286)
> new features I see here:
>
> * deepspeed support
> * eva vit model
> * load visual and text tower independently
> * report text and image...
@nahidalam It's not merged. Maybe you can check https://github.com/baaivision/EVA/tree/master/EVA-CLIP, which supports loading the visual and text towers independently.
@nahidalam In https://github.com/mlfoundations/open_clip/pull/255, you can specify `--pretrained-image` and `--pretrained-text` to load custom image and text encoders simultaneously.
Just a follow-up. Is anyone taking a look?
Hello Gabriel, thanks for your reply! This PR adds layer decay and separate learning rates for the text/visual encoders. Learning-rate layer decay is a common trick when we train a...
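For anyone unfamiliar with the trick, here is a minimal sketch of layer-wise learning-rate decay (the values and the `layer_id_of` naming scheme are illustrative, not the PR's actual code): each parameter group's learning rate is scaled down by a constant factor for every layer it sits below the top of the network.

```python
import torch

def layer_id_of(name: str, num_layers: int) -> int:
    # Hypothetical mapping: embeddings get id 0, transformer block i
    # gets i + 1, everything else (final norm, head) gets the top id.
    if name.startswith("embeddings"):
        return 0
    if name.startswith("blocks."):
        return int(name.split(".")[1]) + 1
    return num_layers + 1

def param_groups_with_layer_decay(model, base_lr=1e-4, decay=0.75, num_layers=12):
    groups = {}
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        lid = layer_id_of(name, num_layers)
        # Layers near the output keep (close to) the full lr; earlier
        # layers are scaled by decay^(distance from the top).
        scale = decay ** (num_layers + 1 - lid)
        if lid not in groups:
            groups[lid] = {"params": [], "lr": base_lr * scale}
        groups[lid]["params"].append(param)
    return list(groups.values())

# Usage:
# optimizer = torch.optim.AdamW(param_groups_with_layer_decay(model))
```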