Can you share some implementation details for the 'K-Means Clustering of Frozen Diffusion Features' result?

Open TyroneLi opened this issue 1 year ago • 10 comments

Regarding 'K-Means Clustering of Frozen Diffusion Features': how do you run this on the dataset? The LDM model accepts a text input to generate new image samples, so what do you feed it as input, which layers' latent feature maps do you extract, and how do you perform the k-means clustering? Many thanks.

TyroneLi avatar Mar 31 '23 06:03 TyroneLi

I guess this idea derives from the paper "F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models".

KinGeorge avatar Apr 03 '23 00:04 KinGeorge

This paper is so similar to F-VLM that the moment I saw "K-Means Clustering of Frozen Diffusion Features," it just kept beeping in my head, lol. Nonetheless, it's still excellent work.

BingliangLi avatar Apr 17 '23 11:04 BingliangLi

Just wondering, are there any updates on this? Has anyone been able to reproduce the middle figure below? Any help would be appreciated! [image]

Tsingularity avatar May 11 '23 23:05 Tsingularity

Did someone reproduce this?

JayKarhade avatar Jun 05 '23 03:06 JayKarhade

Same question here.

zgzxy001 avatar Jun 13 '23 17:06 zgzxy001

Any updates?

yxchng avatar Jul 13 '23 04:07 yxchng

I have the same question

Neyleer avatar Sep 12 '23 08:09 Neyleer

+1

nhw649 avatar Feb 20 '24 09:02 nhw649

+1

jakub-prokop avatar Mar 08 '24 12:03 jakub-prokop

I would like to discuss with everyone how this part of the ODISE paper is implemented: [image]

Currently, my approach in code is as follows: [image]

In the code, 'cfeatures' is the feature pyramid extracted by Stable Diffusion. I fuse the features from the different levels of this pyramid, roughly following the approach in "Panoptic Feature Pyramid Networks."
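For readers who cannot see the screenshot, here is a minimal sketch of what such a fusion step might look like. It is not the ODISE authors' code, nor the code from the image above. It assumes every pyramid level already has the same channel count and that resolutions differ by integer factors; the names `upsample_nearest` and `fuse_pyramid` are hypothetical. Panoptic FPN also applies learned convolutions and bilinear upsampling per level, which are omitted here in favour of plain nearest-neighbour upsampling and summation.

```python
import numpy as np


def upsample_nearest(feat, factor):
    """Nearest-neighbour upsampling of a (C, H, W) array by an integer factor."""
    return feat.repeat(factor, axis=1).repeat(factor, axis=2)


def fuse_pyramid(features):
    """Sum a list of (C, H_i, W_i) maps at the resolution of the first (finest) one.

    Assumption: all levels share the same channel count C, and the finest
    resolution is an integer multiple of every coarser resolution.
    """
    target_h, target_w = features[0].shape[1:]
    fused = np.zeros_like(features[0], dtype=np.float64)
    for feat in features:
        factor = target_h // feat.shape[1]
        if feat.shape[1] * factor != target_h or feat.shape[2] * factor != target_w:
            raise ValueError("levels must differ by an integer scale factor")
        fused += upsample_nearest(feat.astype(np.float64), factor)
    return fused


# Toy pyramid standing in for 'cfeatures': three levels, 4 channels each.
toy_levels = [np.ones((4, 8, 8)), np.ones((4, 4, 4)), np.ones((4, 2, 2))]
toy_fused = fuse_pyramid(toy_levels)
print(toy_fused.shape)
```

Summation keeps the channel count fixed; concatenating the upsampled levels along the channel axis is another option, at the cost of a larger feature dimension for the clustering step.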

Currently, the results of KMeans clustering on the fused feature pyramid extracted by Stable Diffusion are (please ignore the text labels): [image]

The clustering results obtained using KMeans alone are: [image]
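For reference, the clustering step itself can be sketched as below. This is a plain Lloyd's k-means over per-pixel feature vectors written in NumPy, under the assumption that the input is a single (C, H, W) feature map. The function name `kmeans_label_map` and the L2 normalisation of each pixel's feature vector are illustrative choices, not something confirmed by the ODISE paper or this thread.

```python
import numpy as np


def kmeans_label_map(feature_map, k, n_iters=20, seed=0):
    """Cluster the pixels of a (C, H, W) feature map into k groups.

    Returns an (H, W) integer array of cluster indices.
    """
    c, h, w = feature_map.shape
    # One row per pixel, one column per channel.
    pixels = feature_map.reshape(c, h * w).T.astype(np.float64)
    # Assumption: L2-normalise each pixel so distance reflects direction only.
    pixels /= np.linalg.norm(pixels, axis=1, keepdims=True) + 1e-8

    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)]

    def assign(current_centers):
        # Squared Euclidean distance from every pixel to every center.
        dists = ((pixels[:, None, :] - current_centers[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iters):
        labels = assign(centers)
        for j in range(k):
            members = pixels[labels == j]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)

    return assign(centers).reshape(h, w)


# Toy map: left half has feature (1, 0), right half has feature (0, 1).
toy_map = np.zeros((2, 4, 8))
toy_map[0, :, :4] = 1.0
toy_map[1, :, 4:] = 1.0
toy_labels = kmeans_label_map(toy_map, k=2)
print(toy_labels)
```

The dense distance matrix here uses memory proportional to H*W*k*C, so for real feature maps a library implementation such as scikit-learn's `KMeans` is likely the more practical choice.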

How can I improve the result?

jianghongjie328 avatar Sep 10 '24 12:09 jianghongjie328