deep-rules Perform sanity checks and follow good coding practices

Have you checked the list of proposed rules to see if the rule has already been proposed?

[x] Yes

Did you add yourself as a contributor by making a pull request if this is your first contribution?

[x] Yes, I added myself or am already a contributor

Feel free to elaborate, rant, and/or ramble.

When coding DL models, it is important to maintain good software engineering practices. All code should be documented and include rigorous tests. Sanity checks are also useful. For instance, something is probably wrong (e.g. a bug in the code, an ill-posed problem, or bad hyperparameters) if the training loss does not decrease (i.e. the model cannot overfit) when training on a very small subset of the training data.
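
A minimal sketch of the small-subset check, assuming a toy logistic regression trained with plain gradient descent in place of a real DL model. The data, the helper names (`overfit_tiny_subset`, `mean_log_loss`), and the "loss should at least halve" threshold are all made up for illustration; the same idea applies to any framework's training loop.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(weights, bias, x):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def mean_log_loss(weights, bias, xs, ys):
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for x, y in zip(xs, ys):
        p = predict(weights, bias, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(xs)


def overfit_tiny_subset(xs, ys, steps=500, lr=0.5):
    """Train on a handful of examples and return the training loss after every step."""
    n = len(xs)
    weights = [0.0] * len(xs[0])
    bias = 0.0
    losses = [mean_log_loss(weights, bias, xs, ys)]
    for _ in range(steps):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = predict(weights, bias, x) - y  # gradient of log loss w.r.t. the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi / n
            grad_b += err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
        losses.append(mean_log_loss(weights, bias, xs, ys))
    return losses


# Hypothetical tiny subset: 8 examples, 2 features, label set by the sign of the first feature.
tiny_x = [[1.0, 0.5], [0.8, -0.3], [1.2, 0.1], [0.9, -0.6],
          [-1.0, 0.4], [-0.7, -0.2], [-1.1, 0.3], [-0.9, -0.5]]
tiny_y = [1, 1, 1, 1, 0, 0, 0, 0]

losses = overfit_tiny_subset(tiny_x, tiny_y)

# Arbitrary threshold for this sketch: the loss should at least halve on data this easy.
sanity_check_passed = losses[-1] < 0.5 * losses[0]
print("initial loss:", losses[0], "final loss:", losses[-1], "passed:", sanity_check_passed)
```

If a check like this fails on a few easy examples, the bug, problem formulation, or hyperparameters are worth investigating before training on the full dataset.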

Any citations for the rule? (peer-reviewed literature preferred but not required)

https://arxiv.org/pdf/1206.5533v2.pdf
http://cs231n.github.io/neural-networks-3/#sanitycheck

Nov 21 '18 01:11 evancofer

This is slightly similar to #49, #35, and #21.

Nov 21 '18 01:11 evancofer

I think it would be good to have your suggestion as a separate rule, but it is also somewhat connected to #42: we usually need a larger, more extensive model selection step when using deep learning as opposed to "traditional" machine learning. In addition, we need to "babysit" the different model fitting procedures more, evaluating the internal procedure ("does it converge?") and not just the external metrics ("what is the prediction accuracy?").
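
One rough way to watch the internal procedure ("does it converge?") is to inspect the recorded training loss directly. The sketch below assumes losses are logged once per step or epoch; the function name `loss_has_plateaued` and the `window` and `tol` defaults are hypothetical choices for illustration, not a standard criterion.

```python
def loss_has_plateaued(losses, window=5, tol=1e-3):
    """Return True if the mean loss over the last `window` entries differs
    from the mean over the preceding `window` entries by less than `tol`."""
    if len(losses) < 2 * window:
        return False  # not enough history to judge
    recent = sum(losses[-window:]) / window
    previous = sum(losses[-2 * window:-window]) / window
    return abs(previous - recent) < tol


# Made-up loss histories for illustration.
plateaued = [1.0, 0.5, 0.3] + [0.2] * 10
still_improving = [1.0 / (i + 1) for i in range(10)]

print(loss_has_plateaued(plateaued))
print(loss_has_plateaued(still_improving))
```

A plateau alone does not say whether the model converged to something useful, so a check like this complements the external metrics and does not replace them.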

Nov 21 '18 01:11 rasbt

Indeed. In many cases, however, sanity checks need to occur before hyperparameter optimization.

Nov 21 '18 02:11 evancofer

"All code should be documented and include rigorous tests."

With respect to this, I just published a paper on the exact topic that might be worth citing. I'm happy to expand further on documentation best practices for DL.

Dec 23 '18 10:12 Benjamin-Lee

I also recommend "Top considerations for creating bioinformatics software documentation" for software documentation.

Dec 23 '18 20:12 agitter

@Benjamin-Lee Congrats! I think this and the paper linked by @agitter would be great to cite here.

Dec 23 '18 21:12 pstew

Those all look like pretty relevant citations, and we should definitely keep them in mind as we draft. IIRC this discussion of testing & other software engineering best practices was going to go into Tip #1 (deep learning is still machine learning), but maybe it should go somewhere else if we intend to discuss certain aspects (e.g. testing) in more detail?

Dec 24 '18 22:12 evancofer

Sort of mentioned in tip 3, need to add this reference to it.

Feb 21 '19 22:02 fmaguire