Yuxuan Zhang
Hi! Thanks for your reply! Useful solutions! But while following your "single common prompt" suggestion, two issues came up: 1. `concept_list` should be None, otherwise `class_data_dir` won't work. And this...
Use an older transformers version, for example 4.31.0.
Hi, you may try prompting the model with "[semantic] human face", which matches the annotation used in our training datasets. That said, if you need highly accurate segmentation masks, our model may not...
Furthermore, if you want to include human hair, try prompting with "[semantic] head".
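Concretely, the prompt format above can be wrapped in a tiny helper. This is only a sketch; `build_prompt` is a name I made up, not part of the EVF-SAM codebase:

```python
def build_prompt(target: str) -> str:
    # EVF-SAM's semantic-level prompts use the "[semantic]" prefix,
    # matching the training annotation described above.
    return f"[semantic] {target}"

# "[semantic] human face" excludes hair; "[semantic] head" includes it.
print(build_prompt("human face"))  # -> [semantic] human face
print(build_prompt("head"))        # -> [semantic] head
```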
1. The 224x224 image is responsible for providing a coarse indication of the grounded object, so I guess it has little to do with fine-grained segmentation quality. 2. Our...
1. I overlooked multi-frame conditioning. I will take a look at that once I'm done with my current work. Thanks! 2. I suggest you simply prompt the model...
EVF-SAM doesn't support multi-class segmentation within one inference pass. You may consider batch inference instead.
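One way to emulate multi-class output is to run one prompt per class and collect the masks. A sketch, where `predict_mask` stands in for whatever single-prompt inference call you already use (it is not the real EVF-SAM API):

```python
def segment_classes(image, class_names, predict_mask):
    # predict_mask(image, prompt) -> mask: any single-prompt inference function.
    # Runs one inference per class; batch the prompts if throughput matters.
    return {name: predict_mask(image, f"[semantic] {name}") for name in class_names}

# Usage with a stand-in predictor (replace with your real inference call):
masks = segment_classes("img.jpg", ["cat", "dog"], lambda img, p: f"mask for {p}")
print(masks["cat"])  # -> mask for [semantic] cat
```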
We select from the original o365 annotations by excluding categories with more than one instance per image. Then we apply SAM-2 to convert the bounding boxes into segmentation masks. Easy code...
Here is our pipeline to produce the o365 RES data:
1. git clone sam2
2. run this py script:
```
from collections import Counter
import json
import torch
import cv2
from tqdm import tqdm
...
```
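The instance-filtering step (keep only categories that appear exactly once per image) can be sketched as below. The field names follow the COCO-style annotation layout that Objects365 uses, but treat them as assumptions:

```python
from collections import Counter

def single_instance_anns(annotations):
    """Keep annotations whose (image_id, category_id) pair occurs exactly once."""
    counts = Counter((a["image_id"], a["category_id"]) for a in annotations)
    return [a for a in annotations
            if counts[(a["image_id"], a["category_id"])] == 1]

anns = [
    {"image_id": 1, "category_id": 5, "bbox": [0, 0, 10, 10]},
    {"image_id": 1, "category_id": 5, "bbox": [20, 20, 10, 10]},  # duplicate category -> dropped
    {"image_id": 1, "category_id": 7, "bbox": [5, 5, 10, 10]},    # unique in its image -> kept
]
print(single_instance_anns(anns))
```

The kept boxes would then be passed to SAM-2 for box-to-mask conversion, which is omitted here.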
Hi, thank you for reproducing our work! Our BEiT experiment is meant to prove the effectiveness of "early fusion", where "late" means using beit3 to extract separate single-modal features and concatenating them....
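To make the early/late distinction concrete, here is a minimal numpy sketch with stand-in encoders (not BEiT-3 itself): late fusion encodes each modality on its own and only concatenates at the end, while early fusion concatenates the token streams first so one joint encoder can attend across modalities.

```python
import numpy as np

def late_fusion(img_tokens, txt_tokens, img_enc, txt_enc):
    # Each modality is encoded separately; features only meet at the final concat.
    return np.concatenate([img_enc(img_tokens), txt_enc(txt_tokens)], axis=0)

def early_fusion(img_tokens, txt_tokens, joint_enc):
    # Token streams are merged first, so the encoder can mix modalities internally.
    return joint_enc(np.concatenate([img_tokens, txt_tokens], axis=0))

# Identity "encoders" just to show the shapes:
img = np.zeros((4, 8))   # 4 image tokens, dim 8
txt = np.zeros((3, 8))   # 3 text tokens, dim 8
identity = lambda x: x
assert late_fusion(img, txt, identity, identity).shape == (7, 8)
assert early_fusion(img, txt, identity).shape == (7, 8)
```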