
Model runs on all GPUs

Open · bachandr opened this issue 6 years ago · 1 comment

I have user-partitioned data (3 users for now), and I want to train a separate model for each user's partition.

I used dist-keras with Spark in local[*] mode with 3 executors (8 GB each), each with 1 core, i.e. one executor per user. When the script is triggered, I see the model running on all GPUs. Has anyone experienced a similar issue? I can provide more information if asked.

Versions: Keras 2.1.3, TensorFlow 1.4.0-rc0, Spark 2.2.1

```
[1] Tesla K80 | 53'C, 0 % | 11439 / 11439 MB | br(10856M) br(208M) br(285M) br(60M)
[2] Tesla K80 | 49'C, 0 % | 11439 / 11439 MB | br(10856M) br(208M) br(285M) br(60M)
[3] Tesla K80 | 55'C, 0 % | 11439 / 11439 MB | br(10856M) br(208M) br(285M) br(60M)
[4] Tesla K80 | 42'C, 0 % | 11439 / 11439 MB | br(10854M) br(210M) br(285M) br(60M)
[5] Tesla K80 | 49'C, 0 % | 11439 / 11439 MB | br(10854M) br(210M) br(285M) br(60M)
[6] Tesla K80 | 37'C, 0 % | 11439 / 11439 MB | br(10854M) br(210M) br(285M) br(60M)
[7] Tesla K80 | 45'C, 0 % | 11439 / 11439 MB | br(10852M) br(212M) br(285M) br(60M)
```
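(Editor's note: a likely explanation, not confirmed in this thread, is that TensorFlow 1.x maps and reserves memory on every visible GPU by default. Below is a minimal sketch of one possible workaround that pins each worker process to a single device; the device index `"0"` is hypothetical and would need to be derived from the executor or partition ID in a real job.)

```python
import os

# Hide all GPUs except the chosen one from this process. This must run
# before TensorFlow is imported, because TF maps visible devices when the
# session is created. The index "0" here is a placeholder (assumption):
# derive it per executor in your own setup.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import tensorflow as tf
from keras import backend as K

# Also stop TF from pre-allocating all memory on the visible device.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))
```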

bachandr commented on Jan 22, 2018

How many partitions does your DataFrame or RDD consist of?
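(Editor's note: for reference, a quick way to check this in PySpark; the toy DataFrame below is a stand-in for the user-partitioned data from the question.)

```python
from pyspark.sql import SparkSession

# Toy stand-in for the user-partitioned data (assumption: one row per user).
spark = SparkSession.builder.master("local[*]").getOrCreate()
df = spark.createDataFrame([(1, "a"), (2, "b"), (3, "c")], ["user", "value"])

# Inspect how many partitions the DataFrame actually has.
print(df.rdd.getNumPartitions())

# If it does not match the number of executors (one per user here),
# repartition explicitly:
df = df.repartition(3)
print(df.rdd.getNumPartitions())
```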

JoeriHermans commented on Feb 14, 2018