Achyudh Ram comments





Fix insane memory usage when loading datasets

For now, this is specific to CharCNN because of the large size of the character-quantized matrices. But in general I feel it's better to have a streaming approach...
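As a rough illustration of what a streaming approach could look like, here is a minimal sketch that reads examples lazily and yields them in small batches, so that any expensive per-example work (such as character quantization) would only be done for one batch at a time. The function name `stream_batches` and the one-example-per-line format are assumptions for illustration, not taken from the repository.

```python
import io
from itertools import islice


def stream_batches(lines, batch_size):
    """Yield lists of at most `batch_size` stripped lines from an iterable.

    Hypothetical helper: only one batch is held in memory at a time,
    instead of materializing the whole dataset up front.
    """
    iterator = iter(lines)
    while True:
        batch = [line.rstrip("\n") for line in islice(iterator, batch_size)]
        if not batch:
            return
        yield batch


# Usage: any file-like object works; StringIO stands in for a dataset file.
fake_file = io.StringIO("first doc\nsecond doc\nthird doc\n")
for batch in stream_batches(fake_file, batch_size=2):
    print(batch)
```

Because the generator pulls from the file object on demand, peak memory depends on `batch_size` rather than on the size of the dataset.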

Fix tokenizer for reuters dataset

Take a look at datasets/reuters.py. Removing the special characters from the regular expression should do what you want.

```
def clean_string(string):
    """
    Performs tokenization and string cleaning for the Reuters...
```
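The excerpt above is cut off, so the actual regular expression in datasets/reuters.py is not shown here. The following is only a sketch of the general idea, assuming a cleaner that whitelists characters with a regex character class: dropping the punctuation from that class makes those characters get replaced by spaces. The name `clean_string_sketch`, the `keep_special` flag, and the specific character set are all hypothetical.

```python
import re


def clean_string_sketch(text, keep_special=False):
    """Hypothetical cleaner, NOT the code from datasets/reuters.py.

    Characters outside the allowed set are replaced by spaces, then the
    text is lowercased and split on whitespace.
    """
    # Assumed character sets: with keep_special=False the punctuation is
    # left out of the whitelist, so it is stripped from the output.
    allowed = r"A-Za-z0-9(),!?'`" if keep_special else r"A-Za-z0-9"
    text = re.sub(rf"[^{allowed}]", " ", text)
    text = re.sub(r"\s{2,}", " ", text)
    return text.lower().strip().split()


print(clean_string_sketch("Oil prices (up) 3%!"))
print(clean_string_sketch("Oil prices (up) 3%!", keep_special=True))
```

With the punctuation removed from the whitelist, tokens such as `(up)` come out as plain `up`.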

Reduce code redundancy

For the second point: no, we wouldn't need two models. I did something similar for regularizing KimCNN, and it's fine if we just have one model with optional regularization parameters.
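To make the "one model with optional regularization parameters" idea concrete, here is a small stand-in sketch. It is not KimCNN or any code from the repository: it uses a toy linear model in plain Python, and the class name `LinearModel` and the `l2_penalty` parameter are assumptions chosen for illustration. The point is only that a default of `0.0` turns the regularizer off, so a separate unregularized class is unnecessary.

```python
class LinearModel:
    """Toy stand-in model with an optional L2 regularization term."""

    def __init__(self, weights, l2_penalty=0.0):
        self.weights = list(weights)
        # l2_penalty=0.0 (the default) disables regularization entirely.
        self.l2_penalty = l2_penalty

    def predict(self, features):
        return sum(w * x for w, x in zip(self.weights, features))

    def loss(self, features, target):
        # Squared error plus the (possibly zero) L2 penalty on the weights.
        error = (self.predict(features) - target) ** 2
        penalty = self.l2_penalty * sum(w * w for w in self.weights)
        return error + penalty


# Same class, two configurations: plain and regularized.
plain = LinearModel([1.0, 2.0])
regularized = LinearModel([1.0, 2.0], l2_penalty=0.5)
print(plain.loss([1.0, 1.0], 3.0), regularized.loss([1.0, 1.0], 3.0))
```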