blueoil
blueoil copied to clipboard
fix loading of dataset class from yaml config
What this patch does to fix the issue.
In some cases, the dataset class saved to the config.yaml
is 'abc.DATASET_CLASS'
. In these cases, trying to load config.yaml
fails, causing error. However, in these cases, the base class is a sensible class, which it can load. So this fix is to load the base class in these cases.
The default situation is to load the config.py
, not the config.yaml
. For this reason, this bug has not been previously caught by the tests. However, I think the point of the config.yaml
is to provide a separate config file that could also be used. So, I think it's good to fix it.
p.s. the error seems to occur when init was used before training. It also occurs during the auto-tests. Maybe the error doesn't happen for dataset classes which are hand-crafted (not generated).
Link to any relevant issues or pull requests.
This PR also fixes the error that is blocking PR https://github.com/blue-oil/blueoil/pull/1096 due to loading from config.yaml
.
A simple example of the new behaviour after this fix
case 1:
$ print(config_dict['DATASET_CLASS'])
<class 'abc.DATASET_CLASS'>
$ print(config_dict['DATASET_CLASS'].__bases__[0])
<class 'blueoil.datasets.delta_mark.ObjectDetectionBase'>
(in this case, load <class 'blueoil.datasets.delta_mark.ObjectDetectionBase'>
)
case 2:
$ print(config_dict['DATASET_CLASS'])
<class 'blueoil.datasets.cifar100.Cifar100'>
$ print(config_dict['DATASET_CLASS'].__bases__[0])
<class 'blueoil.datasets.base.Base'>
(in this case, load as usual <class 'blueoil.datasets.cifar100.Cifar100'>
)
How to reproduce the bug
$ docker run -it --runtime=nvidia -e CUDA_VISIBLE_DEVICES="0" -e PYTHONPATH=.:lmnet:dlk/python/dlk --rm -v `pwd`:/home/blueoil -w /home/blueoil --user=$(id -u):$(id -g) <docker-container> bash
$ python blueoil/cmd/main.py init
to create a config my_model.py
. then train by:
$ python blueoil/cmd/main.py train -c my_model.py -e testing123 --recreate
Converting at this point will work fine because it will load from the config.py
in the saved
directory. But instead, if you do
$ rm saved/testing123/config.py
Then now, running convert
$ python blueoil/cmd/main.py convert -e testing123
will load from the config.yaml
instead, and will cause an error:
The main point is: yaml.constructor.ConstructorError: while constructing a Python object cannot find 'DATASET_CLASS' in the module 'abc'
.
After making the change in this PR, the error doesn't happen any more. So, it is able to load the config from config.yaml
.
Other possible ways to fix it
In terms of readability, the fix in this PR is may be not ideal, but I'm not sure how else to fix the bug in a simple way. Some other ideas for ways to fix it:
- don't use this yaml file anymore (maybe delete it? I'm not sure if it is used for anything other than loading the config).
- a similar solution as in this PR, but written in a more readable way (I don't know it, but maybe there is a way).
- make a larger change to the behaviour of loading and saving
DATASET_CLASS
. In particular, line 41 of classification.tpl.py and other template files, when this is loaded, it becomes a class, then for theconfig.yaml
, it is saved as a class. The further loading of this class is what causes the difficulty, so changing how this is done (to some completely different way) could be a solution.
ah right, I should first request for owner approval, then after that, for readability approval. I'll remember for next time!
@joelN123
Please check my understand.
In your reproduce step, Why do you remove config.py
file?
Yes, so I'm trying to show there is an error when loading the config.yaml
. But if the config.py
exists, then the config.yaml
is not loaded, so the error doesn't happen. So, it's necessary to delete the config.py
, to show that there is an error when loading config.yaml
. (in export, it calls utils/config.py, it loads only one file, not both).
This PR needs Approvals as follows.
- Ownership Approval for
/
from iizukak, tkng, ruimashita - Readability Approval for
Python
from tkng, tsawada, tfujiwar
Please choose reviewers and requet reviews!
Click to see how to approve each reviews
You can approve this PR by triggered comments as follows.
-
Approve all reviews requested to you (readability and ownership) and LGTM review
Approval, LGTM
-
Approve all ownership reviews
Ownership Approval
orOA
-
Approve all readability reviews
Readability Approval
orRA
-
Approve specified review targets
- Example of Ownership Reviewer of
/
:Ownership Approval for /
orOA for /
- Example of Readability Reviewer of
Python
:Readability Approval for Python
orRA for Python
- Example of Ownership Reviewer of
-
Approve LGTM review
LGTM
See all trigger comments
Please replace [Target] to review target
- Ownership Approval
- Ownership Approval for [Target]
- OA for [Target]
- Ownership Approval
- OA
- Approval
- Readability Approval
- Readability Approval for [Target]
- RA for [Target]
- [Target] Readability Approval
- [Target] RA
- Readability Approval
- RA
- Approval
- LGTM
- LGTM
- lgtm
Maybe "fix saving of dataset class for yaml config" would be a better name for this PR. Well, the loading was causing the error, but the way of fixing is to change the way it is saved.
I put a comment into the code because I think that someone reading the code later will not know why these two lines of code are here. (unless they look at the blame of the line and find this PR, which would use up their time). So, better to put a comment to explain.
@a-hanamoto, please could you do a readability review when you have time