Domain-Consensus-Clustering
Negative DCC loss
Hello,
- When I run the PDA code on my own dataset, the DCC loss turns out to be negative. I've read Appendix A, which describes the DCC loss as modeling the inter-class and intra-class domain discrepancy. Does this mean the inter-class domain discrepancy is large, which drives the DCC loss negative? How can I fix it? Is it possible to fix it by setting hyper-parameters?
- In your answer to issue #6, you replied that num_pclass (A) and num_samples (B) are hyper-parameters for class-wise sampling in CDD, so that each batch is composed of num_pclass * num_samples samples, i.e., it contains samples from B classes, and each class contributes A samples (see the sketch below). I'm wondering whether num_pclass and num_samples are set according to the source data or the target data?
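For context, here is a minimal sketch of what such class-wise sampling might look like. `classes_per_batch` and `samples_per_class` are generic stand-ins, since I'm not certain which of `num_pclass`/`num_samples` maps to which role in the repo:

```python
import random
from collections import defaultdict

def class_wise_batches(labels, classes_per_batch, samples_per_class, seed=0):
    """Yield batches of classes_per_batch * samples_per_class indices:
    a fixed number of classes per batch, a fixed number of samples per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    classes = list(by_class)
    while True:
        picked = rng.sample(classes, classes_per_batch)
        batch = []
        for c in picked:
            # sample with replacement so small classes never run dry
            batch += rng.choices(by_class[c], k=samples_per_class)
        yield batch

# Example: each batch holds 3 classes x 4 samples = 12 indices
labels = [0] * 10 + [1] * 10 + [2] * 10 + [3] * 10
print(next(class_wise_batches(labels, classes_per_batch=3, samples_per_class=4)))
```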
Thanks for reading. I'm looking forward to your reply.
Hello,
Thanks for your interest. 1. I think the loss you mentioned is the CDD loss, which models the inter-/intra-class discrepancy with the kernel distance. The kernel distance itself can be negative (see the small demo below). For more details, you could refer to Contrastive Domain Discrepancy.
2. We did not perform much tuning of the hyper-parameters you mentioned, but you could certainly tune them according to your dataset.
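To illustrate point 1, here is a small self-contained demo (not the repo's actual implementation). The unbiased kernel estimate of squared MMD, which CDD-style losses build on, can dip below zero even when the two samples come from the same distribution; and since CDD, as described in the Contrastive Domain Discrepancy paper, subtracts the inter-class term from the intra-class one, a negative overall loss is plausible:

```python
import torch

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel matrix between two sets of features."""
    return torch.exp(-torch.cdist(x, y).pow(2) / (2 * sigma ** 2))

def mmd2_unbiased(x, y, sigma=1.0):
    """Unbiased estimate of squared MMD; unlike the biased version,
    it is not guaranteed to be non-negative."""
    m, n = x.size(0), y.size(0)
    kxx = gaussian_kernel(x, x, sigma)
    kyy = gaussian_kernel(y, y, sigma)
    kxy = gaussian_kernel(x, y, sigma)
    # drop the diagonal self-similarity terms for unbiasedness
    exx = (kxx.sum() - kxx.diagonal().sum()) / (m * (m - 1))
    eyy = (kyy.sum() - kyy.diagonal().sum()) / (n * (n - 1))
    return exx + eyy - 2 * kxy.mean()

torch.manual_seed(0)
x, y = torch.randn(32, 8), torch.randn(32, 8)  # drawn from the same distribution
print(mmd2_unbiased(x, y))  # hovers around 0 and can come out slightly negative
```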
Thanks for your reply. Yes, the loss I mentioned is the CDD loss. Actually, I ran into some problems during the experiments. I have two datasets, each of which I constructed into heterogeneous sub-class datasets. The method achieved good performance on dataset 1, where the CDD loss was positive, but performed poorly on specific sub-classes of dataset 2, where the loss was negative. According to your reply, that can happen; now I would like to know how to fix it. So I checked the distributions of the two datasets: compared with dataset 1, dataset 2 has a larger inter-/intra-class shift. Tuning the hyper-parameters brought no improvement in the results. How can I improve the performance of the DCC method under a larger inter-/intra-class shift?
Hello,
As the domain gap is larger, I think you could use a larger weight (lambda) for the CDD loss. Besides, you could introduce a warm-up stage before the clustering, which is already provided in the code.
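A hypothetical sketch of these two suggestions, with placeholder names throughout (`model`, `cdd_loss`, and the random batches stand in for the repo's actual network, CDD implementation, and data loaders, and the knobs in the repo may be named differently):

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 3)        # placeholder feature extractor/classifier
ce = nn.CrossEntropyLoss()

def cdd_loss(feat_s, feat_t):
    # stand-in for the kernel-based CDD term
    return (feat_s.mean(0) - feat_t.mean(0)).pow(2).sum()

lambda_cdd = 3.0    # larger weight for a larger domain gap (tune per dataset)
warmup_steps = 500  # source-only warm-up before the clustering/CDD term kicks in
opt = torch.optim.SGD(model.parameters(), lr=1e-3)

for step in range(1000):
    xs, ys = torch.randn(16, 8), torch.randint(0, 3, (16,))  # fake source batch
    xt = torch.randn(16, 8)                                  # fake target batch
    loss = ce(model(xs), ys)
    if step >= warmup_steps:  # after warm-up, add the weighted CDD term
        loss = loss + lambda_cdd * cdd_loss(model(xs), model(xt))
    opt.zero_grad()
    loss.backward()
    opt.step()
```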
Thanks for your advice and time. I'll follow your suggestion and try it out.
Hi, I've tried the advice you suggested, but it made no big difference. It may be related to the distribution of the data. Both datasets have three classes. For dataset 1, the results were good because the source and target data are distributed similarly. But for dataset 2, classes 1 and 2 of the source domain partially overlap, while class 3 is distributed separately from classes 1 and 2. In the target domain, classes 1 and 2 are distributed separately, while classes 2 and 3 partially overlap. So when I performed open-set domain adaptation, e.g., 1,2,3 -> 1,2, class 2 of the target domain could be mismatched to class 2 of the source domain. Considering these distributions, do you think unsupervised DCC is still appropriate for the PDA and OSDA scenarios? If so, what do you think I could do to solve the problem?
For dataset 2, is it possible to create a more distinct separation between the classes in both the source and target domains, i.e., between class 1 and class 2 in the source, and between class 2 and class 3 in the target? I think such confusion may be the main cause of the deteriorated adaptation performance.
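If it helps to quantify that confusion, here is a rough diagnostic (plain NumPy, not part of DCC, and the threshold interpretation is only a heuristic) that compares between-class centroid distances to within-class spread; pairs with a ratio near or below 1 are likely to be confused during adaptation:

```python
import numpy as np

def pairwise_class_separation(features, labels):
    """Print, for every class pair, the ratio of the distance between class
    centroids to the summed within-class spreads; small ratios mean overlap."""
    classes = np.unique(labels)
    cents = {c: features[labels == c].mean(0) for c in classes}
    spread = {c: np.linalg.norm(features[labels == c] - cents[c], axis=1).mean()
              for c in classes}
    for i, a in enumerate(classes):
        for b in classes[i + 1:]:
            ratio = np.linalg.norm(cents[a] - cents[b]) / (spread[a] + spread[b] + 1e-8)
            print(f"classes {a} vs {b}: separation ratio {ratio:.2f}")

# Example with fake features where classes 0 and 1 overlap
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0.0, 1, (50, 8)),
                   rng.normal(0.5, 1, (50, 8)),   # overlaps with class 0
                   rng.normal(5.0, 1, (50, 8))])
labels = np.repeat([0, 1, 2], 50)
pairwise_class_separation(feats, labels)
```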
Thanks for your reply. For dataset 2, I did separate class 1 and class 2 in the source and class 2 and class 3 in the target as you suggested, and it could mostly achieve good results. What confuses me is that general domain adaptation should not, in my view, be constrained to specific source and target classes. But I also noticed that for public datasets, most studies run their experiments with fixed splits of known and target classes.