flow-forecast
Workflow is poorly explained in the docs and unintuitive to grasp when reading notebooks
I found it very difficult to understand what was going on with the experiments when reading through the FF docs and accompanying notebooks.
It would be beneficial for this to be included on a page in the docs.
I've broken down the workflow slightly to look more like how wandb suggests it should be done. I think it would be useful to have a play-by-play walkthrough of the actual actions you take to run experiments. It doesn't have to be this way but this seems clear and easy-to-understand to me.
- import wandb and from flood_forecast.trainer import train_function
- Define dict default_config in func make_final_config(sweep_config) that returns the default_config. Use the Training Models page and notebooks to help.
- Define dict sweep_config and the params you want to search, with help from wandb
- (this is the part that took me ages to figure out yesterday!) Go back to make_final_config(sweep_config) and overwrite the params you want to search with sweep_config['param'].
- Define the train() function
def train():
    """Train function to perform sweeps via wandb.agent(sweep_id, train).

    This function should only be called in wandb.agent(). For it to work, you must
    have already created a valid sweep_id with
    wandb.sweep(sweep_config, project='proj_name', entity='your_wandb_name').
    """
    wandb.init(project='proj_name', entity='your_wandb_name')
    sweep_config = wandb.config
    final_config = make_final_config(sweep_config)
    train_function('PyTorch', final_config)
- Run sweep_id = wandb.sweep(sweep_config, project='proj_name', entity='your_wandb_name')
- Finally, run wandb.agent(sweep_id, train)
This will run a sweep of the hyperparams you set in sweep_config and build models with FF.
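To make the second to fourth steps concrete, here is a minimal sketch of what sweep_config and make_final_config could look like. It is not taken from the FF docs: the search values are made up, and only the config keys mentioned in this thread (training_params, epochs, optim_params, lr) are shown, so a real flow-forecast config needs more sections than this. Note that the sweep_config dict defines the search space, while the argument passed to make_final_config inside a run is wandb.config, which holds one chosen value per parameter.

```python
# Search space in the W&B sweep format. With method "grid", every
# combination of the listed values is tried. The values are placeholders.
sweep_config = {
    "method": "grid",
    "parameters": {
        "lr": {"values": [0.001, 0.01]},
        "epochs": {"values": [5, 10]},
    },
}


def make_final_config(sweep_config):
    """Build a config dict with the swept values substituted in.

    Inside a sweep run, `sweep_config` is wandb.config, which holds a single
    chosen value per parameter (not the lists defined above).
    """
    default_config = {
        "training_params": {
            # Overwritten with the value chosen for this run.
            "epochs": sweep_config["epochs"],
            "optim_params": {"lr": sweep_config["lr"]},
        },
        # Assumption: the other sections a real flow-forecast config
        # requires are omitted from this sketch.
    }
    return default_config


# Outside of a sweep, a plain dict can stand in for wandb.config.
example_config = make_final_config({"lr": 0.01, "epochs": 5})
print(example_config)
```

Calling make_final_config with a plain dict like this is a quick way to check the substitution before starting a real sweep.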
Hey @theadammurphy, thanks for addressing this issue and providing some clarification. I also find the documentation extremely confusing, and have spent several hours trying to reproduce the experiment from the Colab notebook on my own local machine.
I follow most of what you suggest, but I am still unsure what the make_final_config config builder function should look like. I am particularly curious as to how it should make use of the sweep_config parameter you suggest. In the Colab notebook, they create the config object (inside this builder function) using wandb.config instead, even when referring to hyperparameters defined in sweep_config.
It sounds like you already went through these problems, so would you mind uploading a complete example that uses your code above? Thanks!
@joelrorseth Could you describe your problems in more detail?
As for your question above, make_final_config solves the problem of creating a valid flow-forecast configuration file from a Weights & Biases sweep by substituting the sweep's parameters into the config. W&B essentially creates a sweep object with a specific set of parameters (e.g. wandb.config). In the notebook we then take those params and substitute them into our configuration. This allows us to alter our configuration files on the fly and make them compatible with the library.
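One way to picture the substitution described above is a small helper that copies a base configuration and writes each sweep value into its nested location. This is only an illustrative sketch and not how the notebook does it: apply_sweep_params, BASE_CONFIG and PARAM_PATHS are hypothetical names, the default values are placeholders, and sweep_params is assumed to be a plain dict of the values chosen for one run.

```python
import copy

# Hypothetical base configuration; only keys mentioned in this thread are shown.
BASE_CONFIG = {
    "training_params": {
        "epochs": 10,
        "optim_params": {"lr": 0.001},
    },
}

# Hypothetical mapping: sweep parameter name -> path of keys inside the config.
PARAM_PATHS = {
    "epochs": ("training_params", "epochs"),
    "lr": ("training_params", "optim_params", "lr"),
}


def apply_sweep_params(base_config, sweep_params, param_paths=PARAM_PATHS):
    """Return a copy of base_config with the sweep values substituted in."""
    config = copy.deepcopy(base_config)  # leave the base config untouched
    for name, path in param_paths.items():
        if name not in sweep_params:
            continue  # parameter not swept: keep the default
        target = config
        for key in path[:-1]:
            target = target[key]  # walk down to the parent dict
        target[path[-1]] = sweep_params[name]
    return config


final_config = apply_sweep_params(BASE_CONFIG, {"lr": 0.05, "epochs": 3})
print(final_config)
```

Because the base config is deep-copied, each run starts from the same defaults regardless of what earlier runs substituted.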
Hey, @joelrorseth here is the link to my file flow_forecast_utils.py that I've tried to use to manage experiments.
The most important functions are:
- make_final_config
- get_sweep_config
- train
- run_sweep
You can ignore load_data and load_default_config. The former is for the specific problem I am working on now and I keep the latter to have a reference to turn back to should my working config explode in complexity.
Notice how in make_final_config I have set the value of config['training_params']['epochs'] to sweep_config.epochs and config['training_params']['optim_params']['lr'] to sweep_config.lr.
If you look at get_sweep_config, you will see that the 'parameters' I am searching over are lr and epochs.
WHAT I ACTUALLY DID
Here are the steps I took when actually coding it.
- I filled out all values in make_final_config with default values,
- Decided I wanted to sweep over lr and epochs, so
- Went back to make_final_config and replaced the appropriate parameters with sweep_config.lr and sweep_config.epochs.
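The three steps above, plus the wiring into a sweep, might look roughly like the sketch below. It is not a copy of flow_forecast_utils.py: the default values, the search values and the run_sweep signature are assumptions made for illustration, and only the two config paths named in this thread are filled in. A types.SimpleNamespace stands in for wandb.config so the substitution can be tried without starting a sweep.

```python
from types import SimpleNamespace


def get_sweep_config():
    """Search space: the two parameters swept in this thread (values are placeholders)."""
    return {
        "method": "grid",
        "parameters": {
            "lr": {"values": [0.001, 0.01]},
            "epochs": {"values": [5, 10]},
        },
    }


def make_final_config(sweep_config):
    # Step 1: fill everything out with default values (placeholders here).
    config = {
        "training_params": {
            "epochs": 10,
            "optim_params": {"lr": 0.001},
        },
    }
    # Steps 2 and 3: replace the parameters being swept.
    config["training_params"]["epochs"] = sweep_config.epochs
    config["training_params"]["optim_params"]["lr"] = sweep_config.lr
    return config


def run_sweep(project, entity):
    """Create the sweep and start an agent (hypothetical signature).

    Imports are local so the rest of this sketch runs without wandb or
    flow-forecast installed.
    """
    import wandb
    from flood_forecast.trainer import train_function

    def train():
        wandb.init(project=project, entity=entity)
        train_function("PyTorch", make_final_config(wandb.config))

    sweep_id = wandb.sweep(get_sweep_config(), project=project, entity=entity)
    wandb.agent(sweep_id, train)


# Stand-in for wandb.config: one chosen value per swept parameter.
fake_run_config = SimpleNamespace(lr=0.01, epochs=5)
final_config = make_final_config(fake_run_config)
print(final_config)
```

Keeping the defaults and the overrides as separate steps inside make_final_config makes it easy to see at a glance which parameters a sweep actually controls.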
Note: I have been trying to use flow-forecast for a multi-class classification problem which it isn't really designed to handle. The 'inference_params' section of the config is noticeably lacking. I haven't been able to train any models using pure flow-forecast and have taken to modifying the models myself. So, there are no guarantees my config works as is. But hopefully, it gives you a much clearer idea of how to make your own!
Thanks for this detailed explanation @theadammurphy, I followed your advice and my script works just fine now! I found your file to be very helpful, especially the notes and documentation. Cheers!
Wonderful! I'm very happy to hear that @joelrorseth! Glad to be of assistance :)