basenji
What was the hardware setup used to train the best models for Basenji?
I'm trying to reproduce the Basenji results. The code seems to have support for distributed training, but params.json doesn't specify the number of GPUs, and I couldn't find that information anywhere else. Could you please share what your setup was and how long it took to train these models?
For the Basenji manuscripts, I used a single P100 GPU to train the models. They took several weeks to train.
Multi-GPU training hasn't worked well for us locally because data transfer between the GPUs isn't fast enough. For the more recent Enformer model, we used TPUs. For some newer internal work, we've used multiple A100 GPUs on Google Cloud, which do have fast inter-GPU transfer.
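For reference (this is not taken from the Basenji codebase), a minimal sketch of how one might attempt local multi-GPU training in TensorFlow with `tf.distribute.MirroredStrategy`, which synchronizes gradients across the GPUs on a single machine. The toy model, sequence length, and loss below are placeholders, and the inter-GPU transfer bottleneck described above can still limit throughput on hardware without fast links such as NVLink.

```python
import tensorflow as tf

# Mirror the model across all visible local GPUs; gradients are
# all-reduced across replicas at each step.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Placeholder model: one-hot DNA input (length x 4 channels),
    # not the actual Basenji architecture.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(131072, 4)),
        tf.keras.layers.Conv1D(64, 15, activation="relu"),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(10),
    ])
    model.compile(optimizer="adam", loss="poisson")

# model.fit(train_dataset, epochs=...)  # the dataset is automatically
# sharded across replicas; use a global batch size that is a multiple
# of strategy.num_replicas_in_sync
```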