wheatley
Issue while Testing Jssp Problem
Basically, I followed your documentation to set up the environment. I trained a JSSP model with reduced arguments because my PC doesn't have a powerful enough GPU. Here is my command:
python -m jssp.train --batch_size 32 --clip_range 0.20 --custom_heuristic_names SPT MWKR MOPNR FDD/MWKR --device cuda:0 --duration_type deterministic --ent_coef 0.05 --exp_name_appendix QUICKSTART_RUN --fe_type dgl --fixed_validation --gae_lambda 0.99 --gamma 1.00 --graph_has_relu --graph_pooling max --hidden_dim_actor 8 --hidden_dim_critic 8 --hidden_dim_features_extractor 16 --layer_pooling last --lr 1e-4 --max_n_j 100 --max_n_m 30 --mlp_act gelu --n_epochs 3 --n_j 10 --n_layers_features_extractor 5 --n_m 10 --n_mlp_layers_actor 1 --n_mlp_layers_critic 1 --n_mlp_layers_features_extractor 1 --n_steps_episode 5000 --n_validation_env 100 --n_workers 1 --optimizer adamw --ortools_strategy realistic --residual_gnn --seed 0 --target_kl 0.04 --total_timesteps 5000 --validation_freq 3 --vf_coef 2.0
It trained as shown below
But I didn't understand the test command in your documentation: what are "Path to experiment" and "instance"?
Also, I wasn't able to get good enough results from the training itself.
Hi @JohnnSnow218 and thank you for using wheatley (at least for trying to use it despite its poor documentation...)
The problem I can see is that --total_timesteps is set to 5000, so wheatley does only 5000 steps in the environment and then stops training. As you also set --n_steps_episode to 5000, it does only one episode and stops (and does not even optimize, due to the way the number of optimization steps is rounded).
Short: you should use a very large number for --total_timesteps (and interrupt your training when it no longer improves).
Long: wheatley (based on PPO) first collects rollouts (by taking actions in the env), then does n_epochs optimization steps on these rollouts, then collects rollouts again, and so on for a given number of iterations. The length of the rollout buffer (i.e. the number of actions taken) is n_steps_episode * n_workers. The number of iterations of the outer loop is total_timesteps / (n_steps_episode * n_workers).
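To make the arithmetic concrete, here is a small sketch that only evaluates the two formulas above. The function name is made up for illustration and is not part of wheatley; how wheatley rounds the result internally is not shown here.

```python
def outer_iterations(total_timesteps, n_steps_episode, n_workers):
    """Evaluate the two formulas from the explanation above.

    Hypothetical helper, not wheatley code. The ratio is returned
    unrounded because wheatley's exact rounding is not described here.
    """
    rollout_length = n_steps_episode * n_workers
    return rollout_length, total_timesteps / rollout_length


# Values from the command in the question: a single outer iteration.
rollout_length, iterations = outer_iterations(5000, 5000, 1)
print(rollout_length, iterations)

# An arbitrary, much larger total_timesteps gives many iterations.
_, many_iterations = outer_iterations(10_000_000, 5000, 1)
print(many_iterations)
```

With the settings from the question the ratio is exactly one, which matches the single episode observed.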
Hope this helps!
Hi @fantes, can you explain the test command in your documentation: what are "Path to experiment" and "instance"?
--path is the path where the trained net is stored, which is ./saved_network/ by default (the precise name is generated from the options); instance is the precise problem you want to solve.
Hi @fantes, I didn't understand the last argument for Taillard files… Also, please check these commands and errors.
Hi @JohnnSnow218
The last argument, --first_machine_id_is_one, is optional: this is the meaning of the regexp-like [] syntax. If present, it states that in test files the first machine id is 1 and not zero (we internally use zero, but most Taillard examples do not have a machine 0, so the first machine id is 1).
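As an illustration of what that flag means, the following sketch shifts machine ids from a file to zero-based ids. The function and the sample data are hypothetical and only mirror the description above; this is not wheatley's actual parsing code.

```python
def to_zero_based(machine_ids, first_machine_id_is_one):
    """Shift machine ids so that the first machine is numbered 0.

    Hypothetical helper: machine_ids is a list of per-job machine
    orders, as they might be read from an instance file.
    """
    offset = 1 if first_machine_id_is_one else 0
    return [[m - offset for m in job] for job in machine_ids]


# Made-up 2-job, 3-machine instance whose machines are numbered from 1.
machines_from_file = [[1, 2, 3], [2, 1, 3]]
internal = to_zero_based(machines_from_file, first_machine_id_is_one=True)
print(internal)
```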
In your second example, you trained with a max number of jobs of 10 and a max number of machines of 10, and tested on an instance of size 20x15. To be able to do so, you should use options like this :
--n_j 10 \
--n_m 10 \
--max_n_j 20 \
--max_n_m 20 \
This will allow the model to solve problems up to size 20x20 while training on size 10x10 only. This limitation (having to define the max inference size at train time) is purely due to implementation details and is not theoretical.
The RCPSP code does not have such a limitation.
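The size constraint can be stated as a simple check. This is only a sketch of the rule described above, with a made-up function name; it is not how wheatley validates instances.

```python
def instance_fits(n_jobs, n_machines, max_n_j, max_n_m):
    """Return True if an instance is within the maxima set at train time.

    Hypothetical helper illustrating the --max_n_j / --max_n_m rule.
    """
    return n_jobs <= max_n_j and n_machines <= max_n_m


# A 20x15 instance against a model trained with maxima of 10x10, then 20x20.
too_small = instance_fits(20, 15, max_n_j=10, max_n_m=10)
large_enough = instance_fits(20, 15, max_n_j=20, max_n_m=20)
print(too_small, large_enough)
```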
Hi @fantes, I just need to know where we can get a Gantt chart for the test solution, as I only got the saved solution schedule.
Hi @JohnnSnow218. I am afraid that you will have to do the Gantt chart, or whatever formatting of the schedule you need, yourself.
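For anyone who needs a starting point, here is a minimal text-based Gantt sketch using only the Python standard library. It assumes the schedule has already been converted into (job, machine, start, duration) tuples with integer times; that tuple layout is an assumption made for this example, not the format wheatley saves, so the saved schedule would need to be adapted first.

```python
def ascii_gantt(tasks):
    """Render one text row per machine, one character per time unit.

    tasks: list of (job, machine, start, duration) tuples with integer
    times. This layout is assumed for illustration only.
    """
    horizon = max(start + duration for _, _, start, duration in tasks)
    machines = sorted({machine for _, machine, _, _ in tasks})
    lines = []
    for m in machines:
        row = ["."] * horizon  # "." marks idle time
        for job, machine, start, duration in tasks:
            if machine == m:
                for t in range(start, start + duration):
                    row[t] = str(job % 10)  # last digit of the job id
        lines.append(f"M{m} |" + "".join(row))
    return "\n".join(lines)


# Made-up 2-job, 2-machine schedule (not actual wheatley output).
tasks = [(0, 0, 0, 3), (0, 1, 3, 2), (1, 1, 0, 2), (1, 0, 3, 4)]
chart = ascii_gantt(tasks)
print(chart)
```

The same tuples could be passed to a plotting library to draw a graphical chart instead.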