Is it possible to run on a normal docker not NVIDIA docker?

Open ClaudeCoulombe opened this issue 7 years ago • 6 comments

Nice tutorial, but I've got a problem with the Docker image.

The run.sh script starts with nvidia-docker, so I suspect that your Docker image depends on NVIDIA libraries and GPUs. Is it possible to run your Docker image on a CPU? More precisely, I would like to run the Docker image on a MacBook Pro. Did you build a compatible Docker image?

ClaudeCoulombe avatar Nov 04 '17 05:11 ClaudeCoulombe

I've tried to run the notebook on my MacBook Pro directly (without Docker). I got an error that seems related to the data generator (TensorFlow version: 1.3.0, Keras version: 2.0.9, Python 3.6.0a2). I am not sure whether the error would also occur on a GPU. What do you think?


Layer (type)                    Output Shape           Param #     Connected to
==================================================================================================
the_input (InputLayer)          (None, 128, 64, 1)     0
conv1 (Conv2D)                  (None, 128, 64, 16)    160         the_input[0][0]
max1 (MaxPooling2D)             (None, 64, 32, 16)     0           conv1[0][0]
conv2 (Conv2D)                  (None, 64, 32, 16)     2320        max1[0][0]
max2 (MaxPooling2D)             (None, 32, 16, 16)     0           conv2[0][0]
reshape (Reshape)               (None, 32, 256)        0           max2[0][0]
dense1 (Dense)                  (None, 32, 32)         8224        reshape[0][0]
gru1 (GRU)                      (None, 32, 512)        837120      dense1[0][0]
gru1_b (GRU)                    (None, 32, 512)        837120      dense1[0][0]
add_41 (Add)                    (None, 32, 512)        0           gru1[0][0]
                                                                   gru1_b[0][0]
gru2 (GRU)                      (None, 32, 512)        1574400     add_41[0][0]
gru2_b (GRU)                    (None, 32, 512)        1574400     add_41[0][0]
concatenate_41 (Concatenate)    (None, 32, 1024)       0           gru2[0][0]
                                                                   gru2_b[0][0]
dense2 (Dense)                  (None, 32, 23)         23575       concatenate_41[0][0]
softmax (Activation)            (None, 32, 23)         0           dense2[0][0]
==================================================================================================
Total params: 4,857,319
Trainable params: 4,857,319
Non-trainable params: 0


Epoch 1/1
Exception in thread Thread-37:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/utils/data_utils.py", line 579, in data_generator_task
    generator_output = next(self._generator)
  File "", line 84, in next_batch
    img, text = self.next_sample()
  File "", line 68, in next_sample
    return self.imgs[self.indexes[self.cur_index]], self.texts[self.indexes[self.cur_index]]
IndexError: list index out of range


StopIteration                             Traceback (most recent call last)
in ()
----> 1 model = train(128, load=False)

in train(img_w, load)
     96 epochs=1,
     97 validation_data=tiger_val.next_batch(),
---> 98 validation_steps=tiger_val.n)
     99
    100 return model

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/legacy/interfaces.py in wrapper(*args, **kwargs)
     85 warnings.warn('Update your `' + object_name +
     86 '` call to the Keras 2 API: ' + signature, stacklevel=2)
---> 87 return func(*args, **kwargs)
     88 wrapper._original_function = func
     89 return wrapper

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/engine/training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
   2044 batch_index = 0
   2045 while steps_done < steps_per_epoch:
-> 2046 generator_output = next(output_generator)
   2047
   2048 if not hasattr(generator_output, '__len__'):

StopIteration:
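
The two tracebacks above look related: the IndexError is raised in the worker thread that feeds batches, and the StopIteration shows up afterwards in the main thread. Here is a minimal sketch of that pattern, assuming a simplified queue-and-thread model rather than Keras' actual enqueuer code. `sample_batches` and `threaded_batches` are hypothetical names, and the mismatched list lengths are only one possible way to trigger the IndexError.

```python
import queue
import threading


def sample_batches(images, texts):
    """Yield (image, text) pairs forever, like a simple data generator."""
    index = 0
    while True:
        # Raises IndexError once `index` runs past the shorter list.
        yield images[index], texts[index]
        index += 1


def threaded_batches(generator, max_queue_size=10):
    """Run `generator` on a worker thread and re-yield what it produces."""
    batches = queue.Queue(maxsize=max_queue_size)
    stop = threading.Event()

    def worker():
        try:
            while not stop.is_set():
                batches.put(next(generator))
        except Exception as error:
            # The root cause is only reported here, in the worker thread.
            print(f"worker died: {error!r}")
            stop.set()

    threading.Thread(target=worker, daemon=True).start()
    # Once the worker has stopped and the queue is drained, this generator
    # simply ends, so the consumer sees a bare StopIteration.
    while not stop.is_set() or not batches.empty():
        try:
            yield batches.get(timeout=0.1)
        except queue.Empty:
            continue


# Two images but only one label: the second sample cannot be built.
batches = threaded_batches(sample_batches(["img0", "img1"], ["text0"]))
first = next(batches)
try:
    next(batches)
    outcome = "no error"
except StopIteration:
    outcome = "StopIteration"
```

If this model matches what happens in the notebook, the StopIteration is only a symptom, and the IndexError in next_sample is the error worth debugging.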

ClaudeCoulombe avatar Nov 05 '17 06:11 ClaudeCoulombe

The current Docker image works only with a GPU. You have to change the Dockerfile and build a new image. After that it will be possible to run it on a Mac. We have had some success doing this. Find the command "pip install tensorflow-gpu" and change it to "pip install tensorflow". As far as I remember, that's all :)
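
The package swap described above can be scripted before rebuilding the image. This is a rough sketch only: the tutorial's real Dockerfile is not shown in this thread, so the sample text below is hypothetical, and `use_cpu_tensorflow` is an illustrative helper name.

```python
import re


def use_cpu_tensorflow(dockerfile_text):
    """Swap the tensorflow-gpu pip package for the CPU-only tensorflow package."""
    # A version pin such as ==1.3.0 after the package name is left untouched.
    return re.sub(r"\btensorflow-gpu\b", "tensorflow", dockerfile_text)


# Hypothetical Dockerfile fragment, used only to illustrate the replacement.
sample_dockerfile = (
    "FROM some-base-image\n"
    "RUN pip install tensorflow-gpu==1.3.0\n"
    "RUN pip install keras\n"
)
patched = use_cpu_tensorflow(sample_dockerfile)
print(patched)
```

The patched text would then be written back to the Dockerfile and the image rebuilt.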

mkolomeychenko avatar Nov 06 '17 05:11 mkolomeychenko

I'll try it as you suggested. Spasibo! Claude

ClaudeCoulombe avatar Nov 06 '17 08:11 ClaudeCoulombe

Your advice enabled me to run the Docker image on my Mac, but I still get the same StopIteration error during training.

The StopIteration problem probably comes from the generator. Any ideas?

My configuration: TensorFlow version: 1.4.0, Keras version: 2.0.9, Python: 3.6.0a2

Spasibo!

Claude


Exception in thread Thread-5:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/opt/conda/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/src/keras/utils/data_utils.py", line 630, in data_generator_task
    generator_output = next(self._generator)
  File "", line 84, in next_batch
    img, text = self.next_sample()
  File "", line 68, in next_sample
    return self.imgs[self.indexes[self.cur_index]], self.texts[self.indexes[self.cur_index]]
IndexError: list index out of range


StopIteration                             Traceback (most recent call last)
in ()
----> 1 model = train(128, load=False)

in train(img_w, load)
     96 epochs=1,
     97 validation_data=tiger_val.next_batch(),
---> 98 validation_steps=tiger_val.n)
     99
    100 return model

/src/keras/legacy/interfaces.py in wrapper(*args, **kwargs)
     85 warnings.warn('Update your `' + object_name +
     86 '` call to the Keras 2 API: ' + signature, stacklevel=2)
---> 87 return func(*args, **kwargs)
     88 wrapper._original_function = func
     89 return wrapper

/src/keras/engine/training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
   2045 batch_index = 0
   2046 while steps_done < steps_per_epoch:
-> 2047 generator_output = next(output_generator)
   2048
   2049 if not hasattr(generator_output, '__len__'):

StopIteration:
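
Since the IndexError comes from next_sample indexing self.imgs and self.texts, one way to narrow it down is to check the loaded data up front and wrap the index explicitly. The class below is a hypothetical stand-in, not the tutorial's generator; it assumes the images and labels are plain sequences of equal length, and the thread does not establish that a length mismatch is the actual cause.

```python
import random


class SampleCycler:
    """Cycle through (image, text) pairs, reshuffling after each full pass."""

    def __init__(self, images, texts, seed=None):
        # Fail early with a readable message instead of an IndexError later.
        if len(images) != len(texts):
            raise ValueError(
                f"{len(images)} images but {len(texts)} texts were loaded"
            )
        if len(images) == 0:
            raise ValueError("no samples loaded; check the dataset path")
        self.images = list(images)
        self.texts = list(texts)
        self.indexes = list(range(len(self.images)))
        self.cur_index = -1
        self._rng = random.Random(seed)

    def next_sample(self):
        self.cur_index += 1
        if self.cur_index >= len(self.indexes):
            # Start a new pass over the data in a fresh random order.
            self.cur_index = 0
            self._rng.shuffle(self.indexes)
        i = self.indexes[self.cur_index]
        return self.images[i], self.texts[i]


cycler = SampleCycler(["img0", "img1", "img2"], ["t0", "t1", "t2"], seed=0)
# Drawing more samples than the dataset holds must not raise.
samples = [cycler.next_sample() for _ in range(7)]
```

If the constructor check fails on the real data, the problem lies in how the dataset was loaded rather than in Keras or the GPU/CPU difference.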

ClaudeCoulombe avatar Nov 07 '17 08:11 ClaudeCoulombe

Has anyone been able to find a solution for this issue? I ran into the same error as @ClaudeCoulombe when trying to run this tutorial on MacOS, and changing pip install tensorflow-gpu to pip install tensorflow doesn't seem to fix the error:

./run.sh: line 3: nvidia-docker: command not found
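
This message says that the nvidia-docker launcher itself is missing, which is a separate problem from which TensorFlow package is installed inside the image. A small sketch of choosing the launcher from whatever is on the PATH follows; `pick_docker_command` is a hypothetical helper, and falling back to plain docker only makes sense with an image rebuilt for CPU as suggested earlier in the thread.

```python
import shutil


def pick_docker_command(candidates=("nvidia-docker", "docker"), which=shutil.which):
    """Return the first launcher found on PATH, or None if none is installed."""
    for name in candidates:
        if which(name) is not None:
            return name
    return None


launcher = pick_docker_command()
if launcher is None:
    print("neither nvidia-docker nor docker was found on PATH")
else:
    print(f"would start the container with: {launcher}")
```

The same choice can be made by hand by editing run.sh so that it calls docker instead of nvidia-docker.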

adsmaniotto avatar Jun 16 '18 15:06 adsmaniotto

I used an Anaconda environment: I created one with Python 3.6, tensorflow-gpu 1.8.0, and Keras 2.1.5, and it worked for me.

xjohnxjohn avatar Dec 29 '18 09:12 xjohnxjohn