
About the dataset of Imagenet

Open wyf0912 opened this issue 5 years ago • 7 comments

Thanks for sharing the code.

I have the imagenet2012 dataset, which is more than 100 GB. How can I process it into the 6.1 GB preprocessed dataset?

Looking forward to your reply~

wyf0912 avatar Oct 12 '19 06:10 wyf0912

There is no need to process it yourself. The ImageNet images have already been preprocessed down to 6.1 GB for the Visual Domain Decathlon; you can download them directly online, for example from the Visual Domain Decathlon website. If you really do want to process the images yourself, the main difference is the image resolution, so at minimum you need to rescale the images.
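If you do rescale the images yourself, a minimal sketch of the size computation is below. The 72-pixel shorter side is an assumption based on the Decathlon's low-resolution images; check the official preprocessing for the exact target, and pass the result to e.g. PIL's `Image.resize`.

```python
def decathlon_size(w, h, target=72):
    # Scale so the shorter side equals `target`, preserving the aspect ratio.
    scale = target / min(w, h)
    return round(w * scale), round(h * scale)

print(decathlon_size(500, 375))  # (96, 72): shorter side becomes 72
```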

liyiying avatar Oct 12 '19 16:10 liyiying

Thank you~ The download application I submitted before was not approved, but now I have found it.


wyf0912 avatar Oct 14 '19 05:10 wyf0912

Dear Li Yiying,

Thanks for sharing your code, but I have some problems when using KNN.

I changed the classifier from SVM to KNN, also using sklearn for the implementation.

I used the following parameter grid for the search:

```python
tuned_parameters = [
    {'weights': ['uniform'], 'n_neighbors': list(range(1, 15))},
    {'weights': ['distance'], 'n_neighbors': list(range(1, 11)), 'p': list(range(1, 6))},
]
clf = GridSearchCV(KNeighborsClassifier(), tuned_parameters,
                   scoring='precision_macro', n_jobs=10)
```

But I can't reproduce the results reported in the paper. I'd like to know whether my code has a problem or whether I didn't use the correct parameter set. I hope you can share your implementation or settings for KNN.

The following are my test results using the pretrained ADD model:

- aircraft: `{'n_neighbors': 6, 'weights': 'uniform'}`, score 0.11071107110711072
- dtd: `{'n_neighbors': 7, 'weights': 'uniform'}`, score 0.300531914893617
- vgg-flowers: `{'n_neighbors': 1, 'weights': 'uniform'}`, score 0.4343137254901961
- ucf101: `{'n_neighbors': 6, 'weights': 'uniform'}`, score 0.35758196721311475

Best regards, Yufei


wyf0912 avatar Oct 15 '19 10:10 wyf0912

We use cosine_similarity (from sklearn.metrics.pairwise) for the KNN instead of the grid search's default Manhattan or Euclidean distance, and then apply the same grid-search idea (searching n in range(1, 21)) to get the results.
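For readers following along, a minimal pure-Python sketch of that idea (cosine-similarity KNN with a majority vote, searching k over a range). The data layout and function names here are illustrative, not the authors' code:

```python
import math
from collections import Counter

def cosine(u, v):
    # Cosine similarity between two feature vectors (plain lists of floats)
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def knn_predict(x, train_x, train_y, k):
    # Rank training samples by similarity to x; majority vote among the top k
    order = sorted(range(len(train_x)), key=lambda i: cosine(x, train_x[i]), reverse=True)
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def search_k(val_x, val_y, train_x, train_y, ks=range(1, 21)):
    # Return the k with the best validation accuracy (first one on ties)
    def acc(k):
        preds = [knn_predict(x, train_x, train_y, k) for x in val_x]
        return sum(p == y for p, y in zip(preds, val_y)) / len(val_y)
    return max(ks, key=acc)
```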

liyiying avatar Oct 15 '19 12:10 liyiying

Thanks a lot. I tried using cosine_similarity, but I can't see any obvious improvement. Could you share your KNN code?

Looking forward to your reply.

Best regards, Yufei

wyf0912 avatar Oct 16 '19 07:10 wyf0912

```python
import heapq
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def cos_knn(self, k, test_data, test_target, stored_data, stored_target):
    # Cosine similarity between every test sample and every stored sample
    cosim = cosine_similarity(test_data, stored_data)
    # Indices of the k most similar stored samples for each test sample
    top = [heapq.nlargest(k, range(len(row)), row.take) for row in cosim]
    # Map neighbour indices to their labels
    top = [[stored_target[j] for j in row] for row in top]
    # Majority vote among the k neighbours
    pred = np.array([max(set(row), key=row.count) for row in top])
    precision = np.mean(pred == test_target)
    return precision
```
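For a quick standalone sanity check, the same logic with `self` dropped and tiny illustrative arrays (not the paper's evaluation data):

```python
import heapq
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def cos_knn(k, test_data, test_target, stored_data, stored_target):
    # Cosine-similarity KNN with majority vote, as in the snippet above
    cosim = cosine_similarity(test_data, stored_data)
    top = [heapq.nlargest(k, range(len(row)), key=row.take) for row in cosim]
    top = [[stored_target[j] for j in row] for row in top]
    pred = np.array([max(set(row), key=row.count) for row in top])
    return np.mean(pred == test_target)

stored = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
stored_y = [0, 0, 1, 1]
test = np.array([[1.0, 0.05], [0.05, 1.0]])
print(cos_knn(3, test, np.array([0, 1]), stored, stored_y))  # 1.0
```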

liyiying avatar Oct 16 '19 11:10 liyiying

Thanks for your help, and sorry for the late reply~ Best regards.


wyf0912 avatar Oct 27 '19 13:10 wyf0912