Jonathan Schmidt

Results: 24 comments by Jonathan Schmidt

With distributed:true, the training hangs at "building line graphs" even when using only one node. However, there seem to be quite a lot of issues with hanging processes in accelerate....

@knc6 just a quick check-in concerning data parallel, as the distributed option was removed. I am getting some device errors with dataparallel, and I am also not sure whether dataparallel...

Great, will try it out.

Thanks for sharing the branch. I tested it with cached datasets and 2 GPUs, and it was reproducible and consistent with what I would expect from 1 GPU. However I...

That's a good idea; lmdb datasets definitely work for this. If you would like to use them, there are a few examples of how to set up lmdb datasets in...

Thank you very much. Will give it a try this week.

@utf Just a reminder to take a look if you find the time, so I know whether this is going in the intended direction.

Thank you for taking a look, @utf. I will take care of it next week.

The error should be unrelated to this PR.

@utf, if you are happy with the corrections, it should be ready to merge.