mamba Small datasets

Small datasets

Open Anri-Lombard opened this issue 7 months ago • 7 comments

I've trained a 20M-parameter Mamba model for molecular generation, and it seems to fare quite badly when trained on small datasets. I added a dropout layer because it seems to overfit otherwise, but would Mamba perhaps need a lot of intricate optimisation and regularisation to work well with smaller datasets?

I know earlier LSTM and RNN models needed this (https://arxiv.org/pdf/1708.02182v1), and I'm curious about your intuition.
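For context, here is a minimal sketch of what "adding a dropout layer" can look like, assuming dropout is applied to the layer output just before the residual add. It is not the exact training code: `mixer` is a hypothetical stand-in for the Mamba layer, and both the placement and the rate `p` are assumptions. In PyTorch the same behaviour would come from `torch.nn.Dropout`.

```python
import random


def dropout(values, p=0.1, training=True, rng=None):
    """Inverted dropout over a list of floats.

    During training each element is zeroed with probability p and the
    survivors are scaled by 1 / (1 - p), so the expected value is unchanged.
    At evaluation time the input is returned as-is.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(values)
    rng = rng or random.Random()
    keep_scale = 1.0 / (1.0 - p)
    return [v * keep_scale if rng.random() >= p else 0.0 for v in values]


def residual_block(x, mixer, p=0.1, training=True, rng=None):
    """Compute x + dropout(mixer(x)).

    `mixer` is a hypothetical placeholder for the sequence-mixing layer
    (a Mamba layer in the real model); here it is any function mapping a
    list of floats to a list of the same length.
    """
    y = dropout(mixer(x), p=p, training=training, rng=rng)
    return [a + b for a, b in zip(x, y)]
```

With this placement the residual path is never dropped, so the regularisation only perturbs what each layer adds to the stream. Whether that is the best spot for a model of this size is exactly the open question.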

Jul 08 '24 16:07 Anri-Lombard
