alexixu

Results 2 issues of alexixu

**Is your feature request related to a problem? Please describe.** I need to discard examples when I read the data, but the data loader will stop when the data reader...

Feature request

The sentiment PPO training starts normally: the loss is decreasing and the mean reward is increasing slowly. At the 40th step, the loss suddenly increases to 1e+10, which causes the reward to decrease...