Wikitext pipeline

Open • elephantmipt opened this issue 7 months ago • 9 comments

Hi, could you please share the pipeline for the WikiText dataset? I found reported perplexities of 16.3 for Mamba and 18 for the transformer baseline (vs. 18.6 everywhere else), and I cannot reproduce them. Maybe something differs in the preprocessing. Could you provide any details on the preprocessing steps or hyperparameters that differ from the defaults? Understanding those differences would help me reproduce the results.
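One thing worth ruling out when perplexity numbers disagree like this is the unit used for normalization. This is only an assumption about a possible cause, not something confirmed for these results. The sketch below shows how the same total negative log-likelihood gives different perplexities when averaged per subword token versus per word. The `perplexity` helper and all numbers are hypothetical and chosen purely for illustration.

```python
import math


def perplexity(total_nll_nats: float, num_units: int) -> float:
    """Return exp(average negative log-likelihood per unit).

    `total_nll_nats` is the summed NLL over the evaluation set in nats;
    `num_units` is whatever the score is normalized by (subword tokens or words).
    """
    if num_units <= 0:
        raise ValueError("num_units must be positive")
    return math.exp(total_nll_nats / num_units)


# Hypothetical values for illustration only; these are not measured results.
total_nll = 300_000.0         # summed NLL over the evaluation set, in nats
num_subword_tokens = 120_000  # count after subword tokenization (assumed)
num_words = 100_000           # count of whitespace-separated words (assumed)

ppl_subword = perplexity(total_nll, num_subword_tokens)
ppl_word = perplexity(total_nll, num_words)

print(f"per-subword-token perplexity: {ppl_subword:.2f}")
print(f"per-word perplexity:          {ppl_word:.2f}")
```

Because the subword count is at least as large as the word count, the per-word figure is never lower than the per-subword figure for the same model output, so two reports can differ without any change in model quality.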

elephantmipt • Dec 05 '23 09:12