DeepSpell
Not reaching quoted accuracy
Hello!
I have been playing around with this model for a few days now and I am unable to reach the accuracy quoted in the original blog post [1]. To run it, I used the 2013 dataset with the preprocessing step preprocesses_split_lines2().
My issue is that after ~24 hours on the same AWS instance (g2.2xlarge), I'm not seeing accuracy levels close to what you quoted (e.g. 90% after 12 hours). I was wondering whether you did some different preprocessing, or whether, after you switched to batch learning, you didn't update the expected numbers. Any comments will likely save others some time in the future.
Below I'm also attaching the accuracy and loss figures.
[1] https://medium.com/@majortal/deep-spelling-9ffef96a24f6
Hi there,
I made many changes to the code while experimenting. Most of my experiments were with private data which I cannot share, but I tried to copy the code changes to the public repo to reflect our internal work.

Note: the epoch number is deceiving, because the code reads data from an infinite generator and I didn't want to wait for a full epoch before measuring to get a sense of quality, so I set `CONFIG.steps_per_epoch = 1000` (a mini-epoch). Using News 2013, a full epoch would need ~60K steps.

I'm sorry, but I can't spend more time on this code at the moment; big things are happening in the background. I hope to be able to update the blog post and the code (and add the much-needed "attention" mechanism!) soon.
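To make the mini-epoch idea concrete, here is a minimal sketch of the arithmetic behind it. The dataset size and batch size below are illustrative assumptions, not the repo's actual configuration; the only figures taken from the thread are `steps_per_epoch = 1000` and the ~60K-step true epoch.

```python
# Sketch of the mini-epoch described above: with an infinite generator,
# Keras ends an "epoch" after steps_per_epoch batches, so shrinking that
# value yields frequent metric reports without waiting for a full pass.

def full_epoch_steps(num_samples, batch_size):
    """Batches needed to see every sample once."""
    return num_samples // batch_size

# Hypothetical figures: ~6M usable lines from News 2013 at batch size 100
# would make a true epoch ~60K steps, matching the number quoted above.
true_epoch = full_epoch_steps(6_000_000, 100)

MINI_EPOCH_STEPS = 1000  # the value set in CONFIG.steps_per_epoch

# Metrics are then reported ~60x more often than with a true epoch.
reports_per_true_epoch = true_epoch // MINI_EPOCH_STEPS
```

The trade-off is that the logged "epoch" numbers no longer correspond to passes over the data, which is why accuracy-vs-epoch plots from different runs are hard to compare directly.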
Thanks for the reply and the clarifications!
I really appreciate that you are providing your code in a self-contained format that runs out of the box. I think it would be good (whenever you have time) to post your training accuracy and loss on the 2013 dataset with your default parameters, for clarity to others. This always helps people new to deep learning debug problems when they build on code like this (i.e. they have a baseline to start from).
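One lightweight way to publish such a baseline, assuming a Keras-style `history` dict of per-epoch metrics, is to dump it to CSV so others can overlay their own curves. The function name and the numbers in the demo are made up for illustration, not real DeepSpell results.

```python
import csv

def save_history(history, path):
    """Write a Keras-style history dict (metric name -> per-epoch values)
    to CSV so others can compare their curves against a known baseline."""
    metrics = sorted(history)
    epochs = len(next(iter(history.values())))
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["epoch"] + metrics)
        for i in range(epochs):
            writer.writerow([i] + [history[m][i] for m in metrics])

# Example with made-up numbers (not real DeepSpell results):
demo = {"loss": [2.1, 1.4, 0.9], "val_accuracy": [0.31, 0.52, 0.66]}
save_history(demo, "training_log.csv")
```

Keras's built-in `CSVLogger` callback produces a similar file automatically during training, which would be the least-effort way to capture these numbers for the repo.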
I look forward to seeing the potential future updates with attention 😄
Cheers!