kaggle_clrp_1st_place_solution
Applying a Shared Pretrained Model
Reproducibility of the training process is great, but for someone like me who is just looking for a way to use the model, is the pretrained model hosted anywhere? And if not, how would you recommend doing this?
The training process looks quite involved, with many models, which makes me doubt whether it is worth the increase in accuracy. A single training script that does everything might also help.
Hey Joshua,
if you just want to use the models, you could start with this notebook: https://github.com/mathislucka/kaggle_clrp_1st_place_solution/blob/main/notebooks/05_clrp_inference.ipynb. You'd need to download the models from Kaggle (links in the README). However, I don't think it's very practical to use these models directly, because they form an ensemble of more than 30 large transformer models. It will be way too slow for any real-world application.
I'd use the same process but distill the knowledge into a single smaller model. You don't need to have as many teacher models as I used. See this paper for a more recent and streamlined version of the approach: https://arxiv.org/pdf/2208.09243.pdf
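To make the distillation idea concrete, here is a minimal sketch, not the exact pipeline from the repo or the paper: it assumes you have already run the teacher ensemble over a pool of texts to get pseudo-label scores, and it fine-tunes a single small student model (the model name and hyperparameters are placeholders) to match those scores with an MSE loss.

```python
# Hedged sketch: distill teacher-ensemble scores into one small regression model.
# Assumes `texts` (list of str) and `teacher_scores` (list of float) already exist.
import torch
from torch.utils.data import Dataset, DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

STUDENT_NAME = "distilroberta-base"  # assumption: any small encoder works here

class DistillDataset(Dataset):
    """Pairs of (text, teacher_score), where the score is the ensemble's prediction."""
    def __init__(self, texts, teacher_scores, tokenizer, max_len=256):
        self.enc = tokenizer(texts, truncation=True, max_length=max_len,
                             padding="max_length", return_tensors="pt")
        self.scores = torch.tensor(teacher_scores, dtype=torch.float)

    def __len__(self):
        return len(self.scores)

    def __getitem__(self, idx):
        item = {k: v[idx] for k, v in self.enc.items()}
        item["labels"] = self.scores[idx]
        return item

def distill(texts, teacher_scores, epochs=3, lr=2e-5, batch_size=16):
    tokenizer = AutoTokenizer.from_pretrained(STUDENT_NAME)
    # num_labels=1 gives a single regression output; loss is computed below.
    student = AutoModelForSequenceClassification.from_pretrained(STUDENT_NAME, num_labels=1)
    loader = DataLoader(DistillDataset(texts, teacher_scores, tokenizer),
                        batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.AdamW(student.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    student.train()
    for _ in range(epochs):
        for batch in loader:
            labels = batch.pop("labels")
            preds = student(**batch).logits.squeeze(-1)
            loss = loss_fn(preds, labels)  # match the teacher ensemble's scores
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    return student, tokenizer
```

At inference time you only load the single student, so serving cost drops to one forward pass per text instead of 30+.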
Thanks for the response and the interesting paper! I've been looking around for a practical solution. Most of the top solutions seem to share the same problem of being hard to use, which seems unfortunate given the goals of the competition, especially since my project is for students. Your success in bringing in more data was inspiring nonetheless. Fine-tuning a pretrained model on the type of data I'm working with seems like a practical solution, especially if I can leverage the same embedding across multiple similar tasks, as sketched below.
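One way to share an embedding across similar tasks is a single pretrained encoder with one small head per task; the sketch below is only an illustration of that idea, and the task names "readability" and "difficulty", the encoder choice, and the mean-pooling step are all assumptions, not anything from the original solution.

```python
# Hedged sketch: one shared encoder, one regression head per (hypothetical) task.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class SharedEncoderMultiHead(nn.Module):
    def __init__(self, encoder_name="distilroberta-base",
                 task_names=("readability", "difficulty")):  # hypothetical tasks
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # one linear head per task, all reading the same shared embedding
        self.heads = nn.ModuleDict({name: nn.Linear(hidden, 1) for name in task_names})

    def forward(self, task, **inputs):
        # mean-pool token embeddings as a simple sentence representation
        out = self.encoder(**inputs).last_hidden_state
        mask = inputs["attention_mask"].unsqueeze(-1)
        pooled = (out * mask).sum(1) / mask.sum(1)
        return self.heads[task](pooled).squeeze(-1)

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
model = SharedEncoderMultiHead()
batch = tokenizer(["An example passage for students."], return_tensors="pt")
score = model("readability", **batch)  # reuse the same encoder for other heads
```

The encoder can be fine-tuned on one task and then reused (frozen or further tuned) for the others, which keeps the serving footprint to a single model plus a few tiny heads.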