RecBole
Question about scaling
Hi, I have been working on a recommendation systems project, and we have found this library very useful. I have tried it on a subset of our data, and most things have gone smoothly. This is not an issue with the codebase as such, but a general query for experienced users of the library: do you have any particular advice if I go ahead and deploy training/retraining and serving modules?
Data size details:
-- ~10M users
-- ~500K items
-- 99% sparsity (only positive samples)
-- 4-5 sparse item features
-- No user features
We are considering the DeepFM, Caser, or SHAN models, and we will retrain the model frequently (once every week or two). I would like to know from the community:
-- Can any additional optimisation be done, either in general or for the selected model (DeepFM)?
-- Is any particular system configuration required or recommended?
-- What GPU configurations are recommended?
-- If I want to host this setup on an AWS instance, which setup/AMI should I choose?
I want to know more about the scalability of the algorithm implementations.
Thanks in advance.
@samruddhag1 Hello! You can refer to our docs for fine-tuning parameters. For GPUs, there is no specific requirement.
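For a rough sense of scale, the figures quoted in the question can be turned into back-of-the-envelope numbers before picking hardware. The sketch below is only an estimate under stated assumptions: the embedding dimension of 64, float32 storage, and the helper names `interaction_count` and `embedding_bytes` are hypothetical choices for illustration, not RecBole defaults or APIs. It also takes "99% sparsity" literally, so the real interaction count may be much lower if the data is sparser than that.

```python
def interaction_count(n_users: int, n_items: int, sparsity: float) -> int:
    """Observed (user, item) pairs implied by a given sparsity level."""
    return round(n_users * n_items * (1.0 - sparsity))


def embedding_bytes(n_rows: int, dim: int, bytes_per_value: int = 4) -> int:
    """Memory for one dense embedding table (float32 by default)."""
    return n_rows * dim * bytes_per_value


# Figures taken from the question.
n_users = 10_000_000
n_items = 500_000
sparsity = 0.99

# Assumption for illustration only: a 64-dimensional embedding per ID.
assumed_dim = 64

observed = interaction_count(n_users, n_items, sparsity)
table_bytes = embedding_bytes(n_users + n_items, assumed_dim)

print(f"implied interactions: {observed:,}")
print(f"user+item embedding table: {table_bytes / 2**30:.2f} GiB")
```

If an optimizer such as Adam is used, its per-parameter state adds roughly two more copies of the table, so the training-time footprint of the embeddings is larger than the table alone.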