dmol-book New Chapter: hyperparameter selection

Open mehradans92 opened this issue 2 years ago • 6 comments

Sep 08 '21 02:09 mehradans92

https://docs.ray.io/en/latest/tune/index.html

Sep 08 '21 02:09 whitead

A bit more far-fetched would be experiment tracking.

I'm thinking of tools like Weights & Biases, which has 1) tools for hyperparameter sweeps and 2) tools to visualize some chemistry.

Oct 05 '21 15:10 kjappelbaum

As a framework, I really enjoy using Optuna.

Oct 05 '21 15:10 kjappelbaum

@whitead Do you want to use this package (https://docs.ray.io/en/latest/tune/index.html) in this chapter?

Dec 06 '21 14:12 mehradans92

@mehradans92 I think that comment was me sharing some existing methods. It would be better to be as package-agnostic as possible, though.

Dec 06 '21 20:12 whitead

@mehradans92 I read through it briefly. Looks great, a lot of work went into it! I can also tell it will be very helpful. A few proposed changes:

[ ] Try to look at the layers chapter once more; there is some overlapping material (e.g., dropout, regularization, hyperparameters).
[ ] Cite some papers on learning rate schedulers and maybe add some information on momentum, since it's related. Some have also mentioned warm-start, which I'm not familiar with. Maybe mention it.
[ ] Fig 8.2 - does it need to be a movie? It can be distracting while reading. I can certainly see the benefit for 8.1.
[ ] Batch size - would love to get 1-2 citations here on batch size and its connection to randomness in estimating the gradient.
[ ] Dropout - can you cite the paper and maybe add a bit more on where it should be added (all layers?), whether it should be combined with other regularization, etc.
[ ] It is really critical to use validation data for a hyperparameter search - otherwise you're implicitly fitting to the testing data. See here. You need to strongly emphasize this point early and make sure the code/examples use the word validation, instead of test, for the search.
[ ] On Keras, can you reduce the output level of the logging (verbose=0) so the text isn't rendered in the chapter?
[ ] You've split up the code nicely, but it'd be great to have some discussion, maybe showing snippets of how the methods work too, before going right into training.
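
To make the validation point concrete, here is a rough, package-agnostic sketch of what I mean. It is not code from the chapter: the synthetic data, the ridge-regression model, the split sizes, and names like `fit_ridge`, `val_errors`, and `best_alpha` are all assumptions chosen for illustration. The only thing it is meant to show is that every candidate is scored on the validation split and the test split is touched once, after the search is finished.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data (hypothetical stand-in for a real dataset).
N, D = 300, 20
X = rng.normal(size=(N, D))
w_true = rng.normal(size=D)
y = X @ w_true + rng.normal(scale=2.0, size=N)

# Three-way split: train / validation / test (sizes are arbitrary here).
idx = rng.permutation(N)
train_idx, val_idx, test_idx = idx[:180], idx[180:240], idx[240:]


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression weights for regularization strength alpha."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)


def mse(X, y, w):
    """Mean squared error of a linear model with weights w."""
    return float(np.mean((X @ w - y) ** 2))


# Hyperparameter search: fit on train, score on VALIDATION only.
alphas = 10.0 ** np.arange(-3, 4)
val_errors = {}
for alpha in alphas:
    w = fit_ridge(X[train_idx], y[train_idx], alpha)
    val_errors[float(alpha)] = mse(X[val_idx], y[val_idx], w)

best_alpha = min(val_errors, key=val_errors.get)

# The test split is used exactly once, after the search is done.
w_best = fit_ridge(X[train_idx], y[train_idx], best_alpha)
test_error = mse(X[test_idx], y[test_idx], w_best)

print(f"best alpha (chosen on validation): {best_alpha}")
print(f"validation MSE: {val_errors[best_alpha]:.3f}")
print(f"test MSE (reported once): {test_error:.3f}")
```

The same pattern should carry over to whichever search tool the chapter ends up using: the objective handed to the search returns a validation metric, and the test metric is computed separately at the very end.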

Mar 30 '22 17:03 whitead