Results 8 issues of jmercat

This allows changing the rotary positional embedding frequency parameter. This is useful given more recent approaches: LLaMA 1 and 2 used 10000, which is the default value here. LLaMA 3...
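As a rough illustration of what this parameter controls, here is a minimal sketch of how a rotary embedding base maps to per-dimension rotation frequencies. It is not OpenLM's code: the function names `rope_inverse_frequencies` and `rope_angles` are hypothetical, and the larger base used below is an arbitrary value chosen only to show the effect.

```python
def rope_inverse_frequencies(head_dim: int, base: float = 10000.0) -> list[float]:
    """Return base**(-2i/head_dim) for each pair of dimensions i.

    `base` is the frequency parameter; 10000 is the default mentioned above.
    """
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    return [base ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


def rope_angles(position: int, head_dim: int, base: float = 10000.0) -> list[float]:
    """Rotation angle applied to each dimension pair at a given token position."""
    return [position * freq for freq in rope_inverse_frequencies(head_dim, base)]


# Hypothetical comparison: the default base versus an arbitrary larger one.
default_freqs = rope_inverse_frequencies(8, base=10000.0)
larger_freqs = rope_inverse_frequencies(8, base=500000.0)

# The first pair always rotates at frequency 1; a larger base slows the others.
print(default_freqs)
print(larger_freqs)
```

A larger base lowers the slowest frequencies, so the angles at a given position grow more slowly. This is why the parameter is usually discussed in the context of longer sequences.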

Mamba now uses a class for its input; this updates OpenLM accordingly.

It seems that the Params class belongs in the params.py file rather than in model.py. This allows importing params without importing the rest (which in some cases might result in...

See PR #298. This is the same thing but with the hack in llm_foundry_wrapper instead of hf_model. It might be a better place to do it here but is probably...

We might not want to merge this because it is hacky and could affect usages that I don't foresee. Problem: Somewhere in llm-foundry or...

Allow building a wheel following https://packaging.python.org/en/latest/tutorials/packaging-projects/

It seems that new versions of webdataset cause issues (see https://github.com/mlfoundations/dclm/issues/62). This avoids failing on them but skips some data.
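The description does not say how the failure is avoided, so the following is only a generic sketch of the "skip instead of fail" trade-off, not the actual change. The helper `skip_failures` is hypothetical and uses only the standard library.

```python
import warnings
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def skip_failures(items: Iterable[T], decode: Callable[[T], R]) -> Iterator[R]:
    """Yield decode(item) for each item, dropping items whose decoding raises.

    Hypothetical helper: the loader keeps running, at the cost of silently
    losing the samples that failed.
    """
    for item in items:
        try:
            yield decode(item)
        except Exception as exc:  # broad on purpose: any failure skips the sample
            warnings.warn(f"skipping sample {item!r}: {exc}")


# Usage: the malformed middle sample is dropped rather than aborting the run.
decoded = list(skip_failures(["1", "not-a-number", "3"], int))
print(decoded)  # [1, 3]
```

The cost of this design is that skipped samples change the effective dataset, so it helps to count or log them.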