Achyudh Ram
Achyudh Ram
For now this is something specific to CharCNN due to the large size of the character quantized matrices. But in general I feel it's better to have a streaming approach...
Take a look at datasets/reuters.py. Removing the special characters from the regular expression should do what you want. ``` def clean_string(string): """ Performs tokenization and string cleaning for the Reuters...
For the second point, no we wouldn't need two models. I did something similar for regularizing KimCNN and it's fine if we just have one model with optional regularization parameters.