benchmarks
Fast and flexible reference benchmarks
Hi, I'm currently working with the [resnet50 training recipe](https://github.com/mosaicml/examples/tree/main/examples/benchmarks/resnet_imagenet#using-mosaic-recipes). However, I'm aiming to adapt Mosaic to my custom MobileNetV2 model and need to incorporate a custom parameter into the model....
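For anyone attempting something similar: a minimal sketch of wrapping a custom MobileNetV2 in Composer, assuming `composer.models.ComposerClassifier` and torchvision's builder; the extra `width_mult` argument is just a hypothetical stand-in for the custom parameter mentioned above:

```python
from torchvision.models import mobilenet_v2
from composer.models import ComposerClassifier


def build_mobilenetv2(num_classes: int = 1000, width_mult: float = 1.0) -> ComposerClassifier:
    # width_mult is a hypothetical example of a custom constructor parameter;
    # torchvision's mobilenet_v2 forwards extra kwargs to the MobileNetV2 class.
    module = mobilenet_v2(num_classes=num_classes, width_mult=width_mult)
    return ComposerClassifier(module=module, num_classes=num_classes)


model = build_mobilenetv2(width_mult=0.75)
# `model` can now be passed to composer.Trainer(model=model, ...)
```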
Fixes a bug where the betas could be improperly converted from an omegaconf `ListConfig`.
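For context, a minimal sketch of the conversion this PR guards against, assuming the betas arrive from a YAML file as an omegaconf `ListConfig` rather than a plain tuple:

```python
from omegaconf import OmegaConf

cfg = OmegaConf.create({"betas": [0.9, 0.98]})
betas = cfg.betas                             # omegaconf ListConfig, not a tuple
betas = tuple(OmegaConf.to_container(betas))  # plain (0.9, 0.98), safe to hand to the optimizer
print(betas)
```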
I am using the Docker image from the README and have installed the dependencies from requirements.txt. One difference is that I'm using Singularity instead of Docker. I face the following error only when...
Hi, we were able to successfully pretrain various MosaicBERT models, and evaluations with composer-based fine-tuning look really good :) However, when using the conversion script `llm-foundry/scripts/inference/convert_composer_to_hf.py`, the converted HF model seems to...
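A minimal sketch for sanity-checking a converted checkpoint (the output directory name is hypothetical; `trust_remote_code=True` is needed because MosaicBERT ships custom modeling code):

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Load the directory written by convert_composer_to_hf.py
tokenizer = AutoTokenizer.from_pretrained("./mosaicbert-hf")
model = AutoModelForMaskedLM.from_pretrained("./mosaicbert-hf", trust_remote_code=True)
```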
This PR modernizes the MosaicBERT codebase with Flash Attention 2, PyTorch 2 (`torch==2.1.1`), and an updated version of composer (`mosaicml>=0.17`). In particular, this updates MosaicBERT to be compatible with [Flash...
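A minimal sketch for verifying that a local environment matches the stack this PR targets; the version pins mirror the PR summary and may need adjusting:

```python
import composer
import flash_attn
import torch

# Pins below follow the PR summary (torch==2.1.1, mosaicml>=0.17, Flash Attention 2)
assert torch.__version__.startswith("2.1"), torch.__version__
assert flash_attn.__version__.startswith("2."), flash_attn.__version__
print("composer:", composer.__version__)
```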
Hi MosaicML team, many thanks for releasing the code and models for MosaicBERT! I highly appreciate the effort you put into modernizing the BERT architecture. I am interested...
Hi, I tried replicating the BERT pretraining script, and when I ran it with the YAML config I got the following error: `Value bf16 is not available in Precision`. I...
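A minimal sketch for inspecting which precision strings an installed composer version actually accepts; depending on the version, the bf16 option may be spelled `amp_bf16` rather than `bf16`:

```python
from composer.core import Precision

# Enumerate the precision values this composer install supports
print([p.value for p in Precision])
```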
This fixes #322. When I was working through #440, I got bitten by this bug.
Hey, I am trying to pull the model from the Hugging Face repo using `AutoModelForMaskedLM.from_pretrained('mosaicml/mosaic-bert-base-seqlen-2048', trust_remote_code=True, revision='b7a0389')` (with and without the revision param). I am getting the same error, which goes like...
What I want to do:

```python
model = MosaicGPT.from_pretrained(
    "mosaicml/mpt-1b-redpajama-200b",
    trust_remote_code=True,
    attn_impl='torch'
)
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=tokenized_train_data["train"],
    eval_dataset=tokenized_val_data["validation"],
    dataset_text_field="text",
    args=training_args,
    neftune_noise_alpha=5  # the only one important thing for me...
```
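For reference, the MPT checkpoints on the Hub are usually loaded through transformers' auto classes with `trust_remote_code=True` rather than by importing `MosaicGPT` directly; a minimal sketch follows (the `attn_impl` override mimics the pattern on the model card, but verify it against your revision):

```python
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained(
    "mosaicml/mpt-1b-redpajama-200b", trust_remote_code=True
)
config.attn_impl = "torch"  # assumption: override pattern from the model card

model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-1b-redpajama-200b", config=config, trust_remote_code=True
)
```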