open-solution-mapping-challenge
predicting/evaluating issue
When predicting or evaluating with python main.py -- predict --pipeline_name unet --chunk_size 5000 (or evaluate instead of predict), the following error occurs:
neptune: Executing in Offline Mode.
neptune: Executing in Offline Mode.
2018-05-30 21-01-50 mapping-challenge >>> predicting
/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/click/core.py:895: DtypeWarning: Columns (6,7) have mixed types. Specify dtype option on import or set low_memory=False.
  return ctx.invoke(self.callback, **ctx.params)
neptune: Executing in Offline Mode.
  0%|          | 0/13 [00:00<?, ?it/s]
2018-05-30 21-01-56 steps >>> step xy_inference adapting inputs
2018-05-30 21-01-56 steps >>> step xy_inference loading transformer...
2018-05-30 21-01-56 steps >>> step xy_inference transforming...
2018-05-30 21-01-56 steps >>> step xy_inference adapting inputs
2018-05-30 21-01-56 steps >>> step xy_inference loading transformer...
2018-05-30 21-01-56 steps >>> step xy_inference transforming...
2018-05-30 21-01-56 steps >>> step loader adapting inputs
2018-05-30 21-01-56 steps >>> step loader loading transformer...
2018-05-30 21-01-56 steps >>> step loader transforming...
2018-05-30 21-01-56 steps >>> step unet unpacking inputs
2018-05-30 21-01-56 steps >>> step unet loading transformer...
2018-05-30 21-01-58 steps >>> step unet transforming...
2018-05-30 21-01-58 steps >>> step mask_resize adapting inputs
2018-05-30 21-01-58 steps >>> step mask_resize loading transformer...
2018-05-30 21-01-58 steps >>> step mask_resize transforming...
2018-05-30 21-01-58 steps >>> step category_mapper adapting inputs
2018-05-30 21-01-58 steps >>> step category_mapper loading transformer...
2018-05-30 21-01-58 steps >>> step category_mapper transforming...
2018-05-30 21-01-58 steps >>> step mask_erosion adapting inputs
2018-05-30 21-01-58 steps >>> step mask_erosion loading transformer...
2018-05-30 21-01-58 steps >>> step mask_erosion transforming...
2018-05-30 21-01-58 steps >>> step labeler adapting inputs
2018-05-30 21-01-58 steps >>> step labeler loading transformer...
2018-05-30 21-01-58 steps >>> step labeler transforming...
2018-05-30 21-01-58 steps >>> step mask_dilation adapting inputs
2018-05-30 21-01-58 steps >>> step mask_dilation loading transformer...
2018-05-30 21-01-58 steps >>> step mask_dilation transforming...
2018-05-30 21-01-59 steps >>> step xy_inference adapting inputs
2018-05-30 21-01-59 steps >>> step xy_inference loading transformer...
2018-05-30 21-01-59 steps >>> step xy_inference transforming...
2018-05-30 21-01-59 steps >>> step xy_inference adapting inputs
2018-05-30 21-01-59 steps >>> step xy_inference loading transformer...
2018-05-30 21-01-59 steps >>> step xy_inference transforming...
2018-05-30 21-01-59 steps >>> step loader adapting inputs
2018-05-30 21-01-59 steps >>> step loader loading transformer...
2018-05-30 21-01-59 steps >>> step loader transforming...
2018-05-30 21-01-59 steps >>> step unet unpacking inputs
2018-05-30 21-01-59 steps >>> step unet loading transformer...
2018-05-30 21-01-59 steps >>> step unet transforming...
2018-05-30 21-01-59 steps >>> step mask_resize adapting inputs
2018-05-30 21-01-59 steps >>> step mask_resize loading transformer...
2018-05-30 21-01-59 steps >>> step mask_resize transforming...
2018-05-30 21-01-59 steps >>> step score_builder adapting inputs
2018-05-30 21-01-59 steps >>> step score_builder fitting and transforming...
2018-05-30 21-08-33 steps >>> step score_builder saving transformer...
2018-05-30 21-08-33 steps >>> step output adapting inputs
Traceback (most recent call last):
  File "main.py", line 282, in <module>
    action()
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "main.py", line 158, in predict
    _predict(pipeline_name, dev_mode, submit_predictions, chunk_size)
  File "main.py", line 169, in _predict
    prediction = generate_prediction(meta_test, pipeline, logger, CATEGORY_IDS, chunk_size)
  File "main.py", line 238, in generate_prediction
    return _generate_prediction_in_chunks(meta_data, pipeline, logger, category_ids, chunk_size)
  File "main.py", line 271, in _generate_prediction_in_chunks
    output = pipeline.transform(data)
  File "/media/rs/3EBAC1C7BAC17BC1/Xavier/crowdAI/open-solution-mapping-challenge/steps/base.py", line 155, in transform
    step_inputs = self.adapt(step_inputs)
  File "/media/rs/3EBAC1C7BAC17BC1/Xavier/crowdAI/open-solution-mapping-challenge/steps/base.py", line 192, in adapt
    raw_inputs = [step_inputs[step_name][step_var] for step_name, step_var in step_mapping]
  File "/media/rs/3EBAC1C7BAC17BC1/Xavier/crowdAI/open-solution-mapping-challenge/steps/base.py", line 192, in <listcomp>
    raw_inputs = [step_inputs[step_name][step_var] for step_name, step_var in step_mapping]
KeyError: 'images'
The error above may be caused by using --chunk_size 5000, since the program crashes exactly after 5000 iterations(?). But even if I don't specify chunk_size and just run python main.py -- predict --pipeline_name unet, another error occurs, which is the same error I get when I simply run python main.py -- train_evaluate_predict --pipeline_name unet --chunk_size 5000 as the README suggests.
neptune: Executing in Offline Mode.
neptune: Executing in Offline Mode.
2018-05-30 21-45-10 mapping-challenge >>> predicting
/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/click/core.py:895: DtypeWarning: Columns (6,7) have mixed types. Specify dtype option on import or set low_memory=False.
  return ctx.invoke(self.callback, **ctx.params)
neptune: Executing in Offline Mode.
2018-05-30 21-45-14 steps >>> step xy_inference adapting inputs
2018-05-30 21-45-14 steps >>> step xy_inference loading transformer...
2018-05-30 21-45-14 steps >>> step xy_inference transforming...
2018-05-30 21-45-14 steps >>> step xy_inference adapting inputs
2018-05-30 21-45-14 steps >>> step xy_inference loading transformer...
2018-05-30 21-45-14 steps >>> step xy_inference transforming...
2018-05-30 21-45-14 steps >>> step loader adapting inputs
2018-05-30 21-45-14 steps >>> step loader loading transformer...
2018-05-30 21-45-14 steps >>> step loader transforming...
2018-05-30 21-45-14 steps >>> step unet unpacking inputs
2018-05-30 21-45-14 steps >>> step unet loading transformer...
2018-05-30 21-45-17 steps >>> step unet transforming...
2018-05-30 21-45-17 steps >>> step mask_resize adapting inputs
2018-05-30 21-45-17 steps >>> step mask_resize loading transformer...
2018-05-30 21-45-17 steps >>> step mask_resize transforming...
2018-05-30 21-45-17 steps >>> step category_mapper adapting inputs
2018-05-30 21-45-17 steps >>> step category_mapper loading transformer...
2018-05-30 21-45-17 steps >>> step category_mapper transforming...
2018-05-30 21-45-17 steps >>> step mask_erosion adapting inputs
2018-05-30 21-45-17 steps >>> step mask_erosion loading transformer...
2018-05-30 21-45-17 steps >>> step mask_erosion transforming...
2018-05-30 21-45-17 steps >>> step labeler adapting inputs
2018-05-30 21-45-17 steps >>> step labeler loading transformer...
2018-05-30 21-45-17 steps >>> step labeler transforming...
2018-05-30 21-45-17 steps >>> step mask_dilation adapting inputs
2018-05-30 21-45-17 steps >>> step mask_dilation loading transformer...
2018-05-30 21-45-17 steps >>> step mask_dilation transforming...
2018-05-30 21-45-17 steps >>> step xy_inference adapting inputs
2018-05-30 21-45-17 steps >>> step xy_inference loading transformer...
2018-05-30 21-45-17 steps >>> step xy_inference transforming...
2018-05-30 21-45-17 steps >>> step xy_inference adapting inputs
2018-05-30 21-45-17 steps >>> step xy_inference loading transformer...
2018-05-30 21-45-17 steps >>> step xy_inference transforming...
2018-05-30 21-45-17 steps >>> step loader adapting inputs
2018-05-30 21-45-17 steps >>> step loader loading transformer...
2018-05-30 21-45-17 steps >>> step loader transforming...
2018-05-30 21-45-17 steps >>> step unet unpacking inputs
2018-05-30 21-45-17 steps >>> step unet loading transformer...
2018-05-30 21-45-17 steps >>> step unet transforming...
2018-05-30 21-45-17 steps >>> step mask_resize adapting inputs
2018-05-30 21-45-17 steps >>> step mask_resize loading transformer...
2018-05-30 21-45-17 steps >>> step mask_resize transforming...
2018-05-30 21-45-17 steps >>> step score_builder adapting inputs
2018-05-30 21-45-17 steps >>> step score_builder loading transformer...
2018-05-30 21-45-17 steps >>> step score_builder transforming...
Traceback (most recent call last):
  File "main.py", line 282, in <module>
    action()
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "main.py", line 158, in predict
    _predict(pipeline_name, dev_mode, submit_predictions, chunk_size)
  File "main.py", line 169, in _predict
    prediction = generate_prediction(meta_test, pipeline, logger, CATEGORY_IDS, chunk_size)
  File "main.py", line 240, in generate_prediction
    return _generate_prediction(meta_data, pipeline, logger, category_ids)
  File "main.py", line 252, in _generate_prediction
    output = pipeline.transform(data)
  File "/media/rs/3EBAC1C7BAC17BC1/Xavier/crowdAI/open-solution-mapping-challenge/steps/base.py", line 152, in transform
    step_inputs[input_step.name] = input_step.fit_transform(data)
  File "/media/rs/3EBAC1C7BAC17BC1/Xavier/crowdAI/open-solution-mapping-challenge/steps/base.py", line 109, in fit_transform
    step_output_data = self._cached_fit_transform(step_inputs)
  File "/media/rs/3EBAC1C7BAC17BC1/Xavier/crowdAI/open-solution-mapping-challenge/steps/base.py", line 117, in _cached_fit_transform
    step_output_data = self.transformer.transform(**step_inputs)
  File "/media/rs/3EBAC1C7BAC17BC1/Xavier/crowdAI/open-solution-mapping-challenge/postprocessing.py", line 127, in transform
    for image, image_probabilities in tqdm(zip(images, probabilities)):
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 941, in __iter__
    for obj in iterable:
  File "/media/rs/3EBAC1C7BAC17BC1/Xavier/crowdAI/open-solution-mapping-challenge/postprocessing.py", line 200, in _transform
    for image in tqdm(images):
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 941, in __iter__
    for obj in iterable:
  File "/media/rs/3EBAC1C7BAC17BC1/Xavier/crowdAI/open-solution-mapping-challenge/postprocessing.py", line 137, in _transform
    for i, image in enumerate(images):
  File "/media/rs/3EBAC1C7BAC17BC1/Xavier/crowdAI/open-solution-mapping-challenge/postprocessing.py", line 174, in _transform
    yield erode_image(image, self.selem_size)
  File "/media/rs/3EBAC1C7BAC17BC1/Xavier/crowdAI/open-solution-mapping-challenge/postprocessing.py", line 267, in erode_image
    eroded_image = binary_erosion(mask, selem=selem)
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/skimage/morphology/misc.py", line 37, in func_out
    return func(image, selem=selem, *args, **kwargs)
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/skimage/morphology/binary.py", line 42, in binary_erosion
    ndi.binary_erosion(image, structure=selem, output=out)
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/scipy/ndimage/morphology.py", line 370, in binary_erosion
    output, border_value, origin, 0, brute_force)
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/scipy/ndimage/morphology.py", line 227, in _binary_erosion
    if numpy.product(structure.shape,axis=0) < 1:
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 1897, in product
    return um.multiply.reduce(a, axis=axis, dtype=dtype, out=out, **kwargs)
  File "/home/rs/anaconda3/envs/pytorch0.3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 175, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 6930) is killed by signal: Killed.
@XYAskWhy the error above is caused by a mistake on our part in the inference mode of unet. We always run unet_padded and unet_padded_tta in inference mode and didn't catch that typo. I would suggest that you run evaluate again with unet_padded at --chunk_size 5000, or with unet_padded_tta at a smaller chunk size so that the combined TTA predictions fit in memory. My advice is to go with --chunk_size 200 and unet_padded_tta, as it gives the best results.
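For example, following the command pattern used earlier in this thread, the suggested run would look something like the line below (a sketch only; adjust flags and paths to your setup):

```bash
python main.py -- evaluate --pipeline_name unet_padded_tta --chunk_size 200
```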
@jakubczakon Many thanks, but the default training configuration might not be practical, since most mainstream GPUs now have about 10 GB of memory while a batch of 20 images only uses about 2 GB. As a result, training is very slow. What's your suggestion for a larger batch_size and the corresponding learning rate?
Very simple: just change batch_size_train in the neptune.yaml. You can change everything else there too, including the encoder network (from resnet34 to resnet101 or resnet152), learning rates, the training schedule, and other settings.
You can also train on multiple GPUs. Remember to set num_workers to a higher number, because data loading is usually the bottleneck.
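For illustration, the kind of edit being suggested would look roughly like the sketch below. Only batch_size_train and num_workers are named in this thread; the section layout and the example values are assumptions, so check the neptune.yaml shipped with the repo for the exact keys.

```yaml
# Illustrative sketch of neptune.yaml edits, not the repo's exact file.
parameters:
  batch_size_train: 64   # raise this until you approach your GPU's memory limit
  num_workers: 8         # data loading is usually the bottleneck, so keep this high
```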
How can you run Neptune in Offline Mode?
Hi @hs0531
You can do something like this:
import neptune  # needed for the neptune.init call below
from neptune import OfflineBackend

neptune.init(backend=OfflineBackend())
...
as [explained here](https://docs.neptune.ai/neptune-client/docs/neptune.html?highlight=offline).
In that case, nothing will be logged to Neptune; I usually use it for debugging purposes.
thank you
Which version of neptune did you install? I followed your suggestion but get "cannot import name OfflineBackend".