Unable to access experiments that have previously finished running
Describe the issue:
After I stopped one run of experiments and started a new one, all previous experiments that had finished running became unavailable. None of nnictl view/resume, nnictl experiment list --all, or the web UI shows them.
Currently only running experiments are available; none of the previously concluded experiments can be accessed.
However, the results of the previously run experiments are still present in the ~/nni-experiments directory.
What can I do to enable previously closed experiments to be viewed again?
Environment:
- NNI version: 2.7
- Training service (local|remote|pai|aml|etc):local
- Client OS:ubuntu
- Server OS (for remote mode only):
- Python version:3.8
- PyTorch/TensorFlow version:pytorch
- Is conda/virtualenv/venv used?:yes
- Is running in Docker?:no
Configuration:
- Experiment config (remember to remove secrets!):
- Search space:
Log message:
- nnimanager.log:
- dispatcher.log:
- nnictl stdout and stderr:
How to reproduce it?:
I think your experiment metadata file might be corrupted. The file can be found at ~/nni-experiments/.experiment. You can check whether some experiments are missing from it. Since it's a plain JSON file, it's not very reliable. If you want to restore some experiments, you can use nnictl experiment load/save.
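To check whether experiments are missing from the metadata file, something like the following might help. It is only a rough sketch: it assumes that .experiment is a JSON object keyed by experiment ID and that each experiment has a sub-directory named after its ID, neither of which is confirmed in this thread. The function name find_unlisted_experiments is made up for illustration and is not part of NNI.

```python
import json
from pathlib import Path


def find_unlisted_experiments(root="~/nni-experiments"):
    """Return IDs that have a directory under `root` but no entry in `.experiment`.

    Assumption: `.experiment` is a JSON object whose keys are experiment IDs,
    and each experiment directory is named after its ID.
    """
    root = Path(root).expanduser()
    meta_path = root / ".experiment"

    try:
        data = json.loads(meta_path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        # A missing, empty, or corrupted metadata file lists nothing.
        data = {}

    # Anything other than a JSON object is treated as "no entries".
    listed = set(data) if isinstance(data, dict) else set()

    # Visible sub-directories are taken to be experiment directories.
    on_disk = {
        p.name for p in root.iterdir()
        if p.is_dir() and not p.name.startswith(".")
    }
    return sorted(on_disk - listed)


if __name__ == "__main__":
    if Path("~/nni-experiments").expanduser().is_dir():
        for exp_id in find_unlisted_experiments():
            print("on disk but not in .experiment:", exp_id)
```

Any ID it prints has results on disk but no metadata entry, which would explain why nnictl and the web UI cannot see it. It is probably worth backing up .experiment before changing anything.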
@nnnnnzy any updates on this issue after trying these suggestions? Thanks.