CaffeOnSpark
CaffeProcessor in multiple GPU
The CaffeProcessor kicks off training on each GPU one by one. Wouldn't the better strategy be to start them in parallel? Starting them sequentially seems to slow the GPUs down.
private def startThreads(): Unit = {
  // start threads only once per JVM
  if (threadsStarted) return
  results.clear
  solvers.clear
  transformers.clear
  for (g <- 0 until numLocalGPUs) {
    val queuePair = new QueuePair[(Array[String], Array[FloatBlob], FloatBlob)]()
    if (source.isTrain) {
      // start solvers; only rank 0 will save the model
      solvers.add(Future {
        doTrain(caffeNetList(0), g, queuePair)
      })
      // start transformers
      for (t <- 0 until conf.transform_thread_per_device)
        transformers.add(Future {
          doTransform(caffeNetList(0), g, queuePair, g)
        })
    } else {
      // start solvers for test
      solvers.add(Future {
        doFeatures(caffeNetList(g), 0, queuePair)
      })
      // start transformers
      for (t <- 0 until conf.transform_thread_per_device)
        transformers.add(Future {
          doTransform(caffeNetList(g), 0, queuePair, g)
        })
    }
  }
  threadsStarted = true
}
The threads are started sequentially, but they run in parallel: each `Future { ... }` returns immediately and its body executes on the execution context's thread pool. I don't know whether there is any loss in speed, but I am happy to take a look at your proposal.
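To illustrate the point, here is a minimal, self-contained sketch (not CaffeOnSpark code; the sleep stands in for `doTrain`/`doTransform` work): the loop that launches the Futures is sequential, but the bodies run concurrently, so total wall time is close to one task's duration rather than the sum.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ParallelLaunchSketch {
  def main(args: Array[String]): Unit = {
    // pool large enough for all simulated "GPUs" to run at once
    implicit val ec: ExecutionContext =
      ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(4))

    val numLocalGPUs = 4
    val start = System.nanoTime()

    // Sequential loop, but each Future returns immediately;
    // the bodies execute in parallel on the pool.
    val tasks = for (g <- 0 until numLocalGPUs) yield Future {
      Thread.sleep(500) // stand-in for per-GPU work
      g
    }

    val results = Await.result(Future.sequence(tasks), 5.seconds)
    val elapsedMs = (System.nanoTime() - start) / 1000000

    // With true parallelism this is ~500 ms, not 4 * 500 ms
    println(s"results=$results elapsedMs=$elapsedMs")
    assert(elapsedMs < 1500, "futures appear to have run sequentially")
    sys.exit(0)
  }
}
```

If the Futures were being awaited inside the loop, the elapsed time would instead approach `numLocalGPUs * 500` ms, which is the symptom the question describes.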