
pretrained model serving

Open wconnell opened this issue 3 years ago • 5 comments

I'm wondering if there is an intention to serve pretrained models directly through the API? It seems to me that readily available pretrained models (e.g., large-scale molecular representation models) would be of great utility to many users and would generally reduce wasted compute on redundant training.

See the Hugging Face Transformers library as an example. There is vast demand for this type of interface...
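Something like the Transformers-style interface could look like the sketch below. To be clear, this is a hypothetical illustration, not actual TorchDrug API: the `from_pretrained` method, the model name `gin-zinc2m`, and the checkpoint registry are all assumptions about what such a feature might look like.

```python
# Hypothetical sketch of a Hugging Face-style loading interface.
# Nothing here is real TorchDrug API; names and URLs are placeholders.

PRETRAINED_REGISTRY = {
    # model name -> (checkpoint URL placeholder, hidden dimension)
    "gin-zinc2m": ("https://example.org/checkpoints/gin_zinc2m.pth", 300),
}

class PretrainedModel:
    """Minimal stand-in for a model class exposing a from_pretrained hook."""

    def __init__(self, name, hidden_dim):
        self.name = name
        self.hidden_dim = hidden_dim

    @classmethod
    def from_pretrained(cls, name):
        if name not in PRETRAINED_REGISTRY:
            raise KeyError(f"unknown pretrained model: {name!r}")
        url, hidden_dim = PRETRAINED_REGISTRY[name]
        # A real implementation would download the checkpoint from `url`
        # (e.g. via torch.hub.load_state_dict_from_url) and load the
        # weights into the architecture before returning it.
        return cls(name, hidden_dim)

model = PretrainedModel.from_pretrained("gin-zinc2m")
print(model.name, model.hidden_dim)
```

The key design point borrowed from Transformers is that a single string identifier resolves to both the architecture configuration and the weights, so users never have to reconstruct the model definition by hand.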

wconnell avatar Sep 30 '21 20:09 wconnell

Hi! Yes, we are considering this as a part of our future release. May I know what pretrained models you would prefer to have?

KiddoZhu avatar Oct 01 '21 04:10 KiddoZhu

I'm not an expert in this field, but I work adjacent to it, so I don't have a great grasp on which SOTA models would serve folks best. Papers With Code has a drug discovery task leaderboard.

I think any reasonably well-benchmarked models trained on the largest available datasets would provide the most utility. Then again, there are other non-graph-based representation learning models that work quite well. I'd be happy to continue the conversation.

wconnell avatar Oct 01 '21 16:10 wconnell

Thanks! The leaderboard seems to be a good place to find SOTA models.

We would probably start from the existing pretraining models in this library, pretrain them on large datasets, and then move on to newer models.

KiddoZhu avatar Oct 01 '21 16:10 KiddoZhu

I think that is a good place to start. It would be interesting and useful to build a benchmark with the models already available here. I would also like to see the effect of dataset and model scaling across the available methods. I'm sure some of the original papers have run these experiments, but in this new era of pretrained models that information is still valuable to have in one place.

wconnell avatar Oct 01 '21 16:10 wconnell

Hi! Any updates in this regard?

jasperhyp avatar May 11 '23 15:05 jasperhyp