openpom
Problems in the training and inference
I have two problems. Thank you!
1. Training problem
I am training the model with ensemble_benchmark.ipynb, using curated_GS_LF_merged_4983.csv as the training dataset.
I modified the line 'train_dataset, test_dataset = splitter.train_test_split(dataset, frac_train=0.8, train_dir='./splits/train_data', test_dir='./splits/test_data')'
to use frac_train=0.9 (instead of frac_train=0.8).
ERROR
'Traceback (most recent call last):
File "/home/ubuntu/openpom/examples/benchmark2.py", line 132, in
Why does frac_train=0.9 cause this error?
2. Inference problem
I trained the model with n_models = 10 and nb_epoch = 62. (By the way, what value of nb_epoch is best?)
I then restored two different checkpoints:
model.restore(f"./ensemble_models/experiments_1/checkpoint2.pt")
model.restore(f"./ensemble_models/experiments_10/checkpoint2.pt")
There is a significant difference between the inference results from experiments_1/checkpoint2.pt and experiments_10/checkpoint2.pt. For example, here are the 5 odors with the highest inference values (out of 138 values) for the same molecule:
OC12C3CC3C4CC(CCC41C)C2(C)C ['woody', 'green', 'amber', 'camphoreous', 'dry']
OC12C3CC3C4CC(CCC41C)C2(C)C ['spicy', 'earthy', 'herbal', 'woody', 'green']
Which model is better and how to get the best one? Is it an overfitting problem caused by a small training dataset?
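To put a number on how much the two top-5 lists above disagree, here is a minimal sketch using only NumPy. The helper names top_k_labels and jaccard are hypothetical (they are not part of openpom), and the sketch assumes each checkpoint yields one score per odor label for a molecule.

```python
import numpy as np


def top_k_labels(scores, labels, k=5):
    """Return the k labels with the highest scores, highest first."""
    scores = np.asarray(scores, dtype=float)
    # Negate so that argsort (ascending) yields descending score order.
    order = np.argsort(-scores, kind="stable")[:k]
    return [labels[i] for i in order]


def jaccard(a, b):
    """Jaccard overlap between two label lists (1.0 means identical sets)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


# The two top-5 lists reported above for OC12C3CC3C4CC(CCC41C)C2(C)C.
top_a = ['woody', 'green', 'amber', 'camphoreous', 'dry']
top_b = ['spicy', 'earthy', 'herbal', 'woody', 'green']

shared = sorted(set(top_a) & set(top_b))
overlap = jaccard(top_a, top_b)
print(shared, overlap)
```

Only 'green' and 'woody' appear in both lists. With the full 138-value prediction vectors, top_k_labels could be applied to each checkpoint's output (or to their average) before making the same comparison.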