Gregory Johnson comments

Results: 10 comments of Gregory Johnson

OS level crashes when using V3 or V4 and Titan X

I'm having this same problem, using a Docker image ([here](https://github.com/gregjohnso/dl-docker/blob/master/Dockerfile.gpu)) with various large networks distributed in series across 2 or 3 Pascal Titan Xs. My observations: **Without cudnn:** Works fine,...

OS level crashes when using V3 or V4 and Titan X

Attached is a screenshot of "watch nvidia-smi" at the time of a crash. The temps are all within normal range. ![screen shot 2017-02-15 at 12 03 16 pm](https://cloud.githubusercontent.com/assets/17319655/22993148/ccbb48be-f376-11e6-81e6-847f1aac115d.png)

OS level crashes when using V3 or V4 and Titan X

OS level crashes when using V3 or V4 and Titan X

Add `batch_size` parameter to `Task.map`

training will randomly freeze for training AlexNet from scratch.

GPU memory allocation

GPU memory allocation

Pg executor

Training example issues with csv creation

Hashing a Task to include its dependencies

Implement apply_on_single_zstack in fnet_model