self-adaptive-training
tabular data/ noisy instances
Hi, thanks for sharing your implementation. I have two questions about it:
- Does it also work on tabular data?
- Is it possible to identify the noisy instances (return the noisy IDs or the clean set)?
Thanks!
Hi,
regarding your questions:
- I am not familiar with tabular data, but I think it is worth a try if your goal is classification and you are using the cross-entropy (CE) loss.
- The simplest way, as we did in the paper, is to find the mismatches between the maximal indices (argmax) of the training targets and the original labels. Generally, a mismatch indicates a noisy instance with high probability.
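The mismatch check from the second point can be sketched in a few lines of NumPy. This is only an illustration, not code from the repository: the function name `find_suspected_noisy` and the array names are hypothetical, and it assumes you have stored the per-sample training targets as an `(N, C)` array next to the original integer labels.

```python
import numpy as np

def find_suspected_noisy(soft_targets, original_labels):
    """Split sample indices into suspected-noisy and presumed-clean.

    soft_targets: (N, C) array of training targets after SAT training
                  (assumed to be kept per sample).
    original_labels: (N,) array of the given, possibly noisy, class indices.
    """
    soft_targets = np.asarray(soft_targets)
    original_labels = np.asarray(original_labels)
    # Class with the largest mass in each refined training target.
    refined = soft_targets.argmax(axis=1)
    # A sample is flagged when the refined class disagrees with its label.
    noisy_mask = refined != original_labels
    return np.flatnonzero(noisy_mask), np.flatnonzero(~noisy_mask)

# Toy example: sample 1 is labeled class 2, but its target favors class 1.
soft_targets = np.array([[0.9, 0.1, 0.0],
                         [0.2, 0.7, 0.1],
                         [0.1, 0.2, 0.7]])
original_labels = np.array([0, 2, 2])
noisy_ids, clean_ids = find_suspected_noisy(soft_targets, original_labels)
print(noisy_ids, clean_ids)
```

A flagged index is only a likely noisy instance, so it may be worth inspecting a few of them by hand before dropping anything.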
Thank you for the clarification! By tabular data I mean non-image data, e.g. the iris dataset.
In my opinion, the data modality is not a crucial problem for SAT.
Say you have an input $x$ of any modality and a deep model $f(\cdot)$ that produces a prediction $p = f(x)$. SAT uses only the prediction $p$ to update the training target, independent of the modality of $x$. As long as your model $f(\cdot)$ is able to overfit the training labels, SAT should help.
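To make the modality point concrete, here is a minimal sketch of a target update that never looks at $x$. It assumes an exponential-moving-average form, $t \leftarrow \alpha t + (1 - \alpha) p$; that form, the name `update_targets`, and the momentum value `0.9` are assumptions for illustration and are not stated in this thread, so please check the paper and the repository for the exact rule.

```python
import numpy as np

def update_targets(targets, probs, momentum=0.9):
    """One assumed SAT-style update: move the targets toward the predictions.

    targets: (N, C) current training targets (start from one-hot labels).
    probs:   (N, C) model predictions p = f(x) (softmax outputs).
    Only the outputs are used, so x can be images, text, or tabular rows.
    """
    targets = np.asarray(targets, dtype=float)
    probs = np.asarray(probs, dtype=float)
    return momentum * targets + (1.0 - momentum) * probs

num_classes = 3
labels = np.array([0, 2])               # second label is assumed to be wrong
targets = np.eye(num_classes)[labels]   # one-hot initialization
# Stand-in for model outputs; in practice these change every epoch.
probs = np.array([[0.8, 0.1, 0.1],
                  [0.1, 0.8, 0.1]])
for _ in range(20):                     # 20 hypothetical epochs
    targets = update_targets(targets, probs)
print(targets.argmax(axis=1))
```

In this toy run the predictions are held fixed, which a real model would not do; the point is only that the update needs `targets` and `probs` and nothing about the input format.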