
CaffeProcessor on multiple GPUs

zerocurve opened this issue 9 years ago · 1 comment

The CaffeProcessor kicks off training on each GPU one by one. Wouldn't a better strategy be to start them in parallel? Starting them sequentially seems to slow the GPUs down.

private def startThreads(): Unit = {
  // start threads only once per JVM
  if (threadsStarted) return
  results.clear
  solvers.clear
  transformers.clear

  for (g <- 0 until numLocalGPUs) {
    val queuePair = new QueuePair[(Array[String], Array[FloatBlob], FloatBlob)]()

    if (source.isTrain) {
      // start solvers; only rank 0 will save the model
      solvers.add(Future {
        doTrain(caffeNetList(0), g, queuePair)
      })
      // start transformers
      for (t <- 0 until conf.transform_thread_per_device)
        transformers.add(Future {
          doTransform(caffeNetList(0), g, queuePair, g)
        })
    } else {
      // start solvers for test
      solvers.add(Future {
        doFeatures(caffeNetList(g), 0, queuePair)
      })
      // start transformers
      for (t <- 0 until conf.transform_thread_per_device)
        transformers.add(Future {
          doTransform(caffeNetList(g), 0, queuePair, g)
        })
    }
  }

  threadsStarted = true
}

zerocurve · Nov 11 '16 20:11

The threads are started sequentially, but they run in parallel. I don't know that there is any loss of speed, but I am happy to take a look at your proposal.
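For illustration, here is a minimal standalone sketch (plain Scala, not CaffeOnSpark code; the pool size, sleep duration, and loop bound are stand-ins) showing that Future { ... } returns immediately, so bodies submitted one by one in a loop, like the solvers and transformers above, still execute concurrently:

import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object FutureDemo extends App {
  // one thread per simulated GPU so the sleeping bodies can overlap
  implicit val ec: ExecutionContext =
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(4))

  val start = System.nanoTime()

  // submitted sequentially, as in startThreads, but each Future body
  // is handed to the pool and runs in parallel with the others
  val tasks = (0 until 4).map { g =>
    Future {
      Thread.sleep(1000) // stand-in for doTrain / doTransform on GPU g
      g
    }
  }

  Await.result(Future.sequence(tasks), 10.seconds)
  println(f"elapsed: ${(System.nanoTime() - start) / 1e9}%.1f s") // ~1.0 s, not ~4.0 s
}

Only the loop that submits the futures is sequential, and submission itself is cheap; any real slowdown would have to come from inside doTrain/doTransform rather than from startThreads.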

junshi15 · Nov 11 '16 23:11