
Multi-GPU support

Open arassadin opened this issue 7 years ago • 5 comments

Hi,

I was just wondering how FMs can be parallelized effectively across multiple GPUs. I'm somewhat familiar with TF but not really with FMs. If you could provide me with ideas or any pointers, I would make the necessary modifications and subsequently submit a PR, since I'm currently interested in a parallel multi-GPU FM implementation and your code seems to be a good base for this.

Best Regards, Alexandr

arassadin avatar Jun 08 '17 11:06 arassadin

Hi, the simplest way is to do data parallelism, I mean just split the batch over several GPUs.
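
For illustration, here is a minimal TF 1.x sketch of that data-parallel pattern: the batch is split sample-wise, each GPU computes gradients on its shard, and the averaged gradients update one shared copy of the weights. This is not part of tffm; the linear "model", shapes, and constants are placeholders for whatever the FM graph would compute.

```python
import tensorflow as tf

NUM_GPUS = 2
BATCH_SIZE = 256      # must be divisible by NUM_GPUS
N_FEATURES = 100

# Shared parameters live on the CPU so every GPU tower reuses the same copy.
with tf.device('/cpu:0'):
    w = tf.get_variable('w', shape=[N_FEATURES, 1])
    x = tf.placeholder(tf.float32, shape=[BATCH_SIZE, N_FEATURES])
    y = tf.placeholder(tf.float32, shape=[BATCH_SIZE, 1])
    optimizer = tf.train.AdamOptimizer(0.01)

# Split the batch sample-wise; each GPU gets one shard.
x_shards = tf.split(x, NUM_GPUS, axis=0)
y_shards = tf.split(y, NUM_GPUS, axis=0)

tower_grads = []
for i in range(NUM_GPUS):
    with tf.device('/gpu:%d' % i):
        # Hypothetical linear model; in tffm this would be the FM output.
        pred = tf.matmul(x_shards[i], w)
        loss = tf.reduce_mean(tf.square(pred - y_shards[i]))
        tower_grads.append(optimizer.compute_gradients(loss, var_list=[w]))

# Average per-GPU gradients and apply a single update to the shared weights.
avg_grads = []
for grads_and_vars in zip(*tower_grads):
    grads = [g for g, _ in grads_and_vars]
    var = grads_and_vars[0][1]
    avg_grads.append((tf.reduce_mean(tf.stack(grads), axis=0), var))
train_op = optimizer.apply_gradients(avg_grads)
```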

geffy avatar Jun 12 '17 18:06 geffy

Hi,

Thanks for the response. If I'm right, in FMs there are no explicit batches as in NNs. With sample-wise splitting, at least two questions come to mind:

  • should it be somehow balanced by feature presence?
  • how to merge independently-learned weights then, since they are feature-wise and abstracted from samples?

Best Regards, Alexandr

arassadin avatar Jun 12 '17 18:06 arassadin

in FMs there are no explicit batches as in NNs

You need to solve an optimization task. While it's common to use sample-wise updates in such settings (for example, in libFFM), mini-batch learning also works. This implementation uses batches exactly as in NNs. You can see it in the batch dimension of the placeholders (https://github.com/geffy/tffm/blob/master/tffm/core.py#L129) and in the batch_size param of TFFMBaseModel (https://github.com/geffy/tffm/blob/master/tffm/base.py#L160)
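
For reference, a minimal sketch of how batch_size enters the picture when fitting a tffm model. The constructor arguments follow the usage shown in the project README; the data, label encoding, and parameter values here are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf
from tffm import TFFMClassifier

# Toy dense data: 1000 samples, 50 features, binary labels (0/1 assumed here).
X = np.random.randn(1000, 50).astype(np.float32)
y = np.random.randint(0, 2, size=1000)

model = TFFMClassifier(
    order=2,                                  # FM interaction order
    rank=8,                                   # latent factor dimension
    optimizer=tf.train.AdamOptimizer(0.01),
    n_epochs=10,
    batch_size=128,   # mini-batch size: each optimization step consumes 128
                      # samples, i.e. the batch dimension of the placeholders
    input_type='dense'
)
model.fit(X, y, show_progress=True)
```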

geffy avatar Jun 12 '17 19:06 geffy

Hi, any news there?

geffy avatar Aug 31 '17 16:08 geffy

Hi,

Unfortunately, my priorities changed rapidly, and I've hardly had a chance to work on this issue.

arassadin avatar Sep 01 '17 06:09 arassadin