
Temperature scaled sampling for multi dataset

Open vedanuj opened this issue 4 years ago • 7 comments

🚀 Feature

Temperature scaled sampling for multi dataset

Motivation

In multitask training involving multiple datasets, it is often desirable to control the sampling ratios for the different datasets. Currently mmf provides either equal or size-proportional sampling. To have more control over the sampling, we should add temperature scaled sampling.

Pitch

Currently multi_dataset_loader.py implements two sampling strategies for multi-dataset training: equal and proportional. To have more control over the sampling ratios for multiple datasets, we can add a temperature (T) scaling capability when deciding the proportions of the different datasets to be used. At one extreme, when T=1, it will be the same as the current 'proportion' sampling. As T increases, the sampling becomes more and more equal. Reference: Google T5
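To make the pitch concrete, here is a minimal sketch of the computation, assuming the T5-style rule of raising each dataset's size to the power 1/T and renormalizing. The function name `temperature_scaled_probs` and its arguments are hypothetical and not part of mmf.

```python
def temperature_scaled_probs(sizes, temperature=1.0):
    """Return sampling probabilities proportional to size ** (1 / temperature).

    Hypothetical helper for illustration only; not an mmf API.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not sizes or any(size <= 0 for size in sizes):
        raise ValueError("sizes must be a non-empty list of positive numbers")

    # T = 1 leaves the sizes unchanged (proportional sampling);
    # larger T flattens the weights toward equal sampling.
    weights = [size ** (1.0 / temperature) for size in sizes]
    total = sum(weights)
    return [weight / total for weight in weights]


sizes = [100, 900]
print(temperature_scaled_probs(sizes, temperature=1.0))    # roughly [0.1, 0.9]
print(temperature_scaled_probs(sizes, temperature=2.0))    # roughly [0.25, 0.75]
print(temperature_scaled_probs(sizes, temperature=100.0))  # close to [0.5, 0.5]
```

With two datasets of 100 and 900 samples, T=1 reproduces the proportional split, T=2 uses square roots of the sizes (10 and 30), and a large T approaches the equal split.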

Additional context

Temperature scaled sampling is often required during multi-task training; see Google's T5 paper for reference. The task will involve adding a temperature parameter that can be configured for sampling datasets during multi-dataset training.

vedanuj avatar Aug 10 '20 04:08 vedanuj

@vedanuj @apsdehal any update on this? I'd like to work on this if no one is already on it and if it is still needed since we now have the option of specifying the sampling ratios for each dataset.

Suhruth9 avatar Jun 11 '21 14:06 Suhruth9

Hi, @vedanuj @apsdehal any update on this? I'd like to work on this and this could be my first issue

parthduggal avatar Aug 19 '21 12:08 parthduggal

@parthduggal Thanks for your interest. This can be added as a form of iteration strategy. Take a look at https://github.com/facebookresearch/mmf/blob/master/mmf/datasets/iteration_strategies.py
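As a rough illustration of what such a strategy might look like, here is a standalone sketch of a callable that returns the index of the dataset to draw the next batch from. The class name, constructor arguments, and the "call returns an index" shape are assumptions made for this example; they are not taken from iteration_strategies.py, so check the linked file for the actual base class and config format.

```python
import random


class TemperatureScaledSampler:
    """Hypothetical standalone sampler; not mmf's IterationStrategy API."""

    def __init__(self, dataset_sizes, temperature=1.0, seed=None):
        if temperature <= 0:
            raise ValueError("temperature must be positive")
        # Weight each dataset by size ** (1 / T), then normalize.
        weights = [size ** (1.0 / temperature) for size in dataset_sizes]
        total = sum(weights)
        self.probabilities = [weight / total for weight in weights]
        self._indices = list(range(len(dataset_sizes)))
        self._rng = random.Random(seed)

    def __call__(self):
        # Pick the index of the dataset for the next batch.
        return self._rng.choices(self._indices, weights=self.probabilities, k=1)[0]


sampler = TemperatureScaledSampler([100, 900], temperature=2.0, seed=0)
next_dataset_index = sampler()
```

Keeping the probabilities fixed at construction time means each call is a single weighted draw; a real integration would presumably read the sizes from the dataloaders and the temperature from the config.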

apsdehal avatar Aug 23 '21 06:08 apsdehal

@apsdehal , if I'm not wrong, I have to add temperature scaled sampling to the current iteration strategies in the format of the other iteration strategies that you showed in the link?

parthduggal avatar Aug 24 '21 13:08 parthduggal

May I work on this?

istakhar1 avatar Aug 30 '21 15:08 istakhar1

@shinobi-AI Since @parthduggal is already working on this, can you check any other issue?

vedanuj avatar Aug 30 '21 20:08 vedanuj

take

vaish-muk avatar Jan 27 '22 07:01 vaish-muk