benchmarks

Fast and flexible reference benchmarks

Results: 26 issues (sorted by recently updated)

Hi, I'm currently working with the [resnet50 training recipe](https://github.com/mosaicml/examples/tree/main/examples/benchmarks/resnet_imagenet#using-mosaic-recipes). However, I'm aiming to adapt the Mosaic recipe to my custom MobileNetV2 model and need to pass a custom parameter into the model....
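For reference, a minimal sketch of this kind of adaptation, assuming a torchvision MobileNetV2 wrapped in Composer's `ComposerClassifier`; the `width_mult` argument here is a hypothetical stand-in for the custom parameter:

```python
# Minimal sketch, assuming torchvision's MobileNetV2 and Composer's
# ComposerClassifier wrapper; `width_mult` is a hypothetical stand-in
# for the custom model parameter, not the issue author's actual code.
from composer.models import ComposerClassifier
from torchvision.models import mobilenet_v2

def build_composer_mobilenet_v2(num_classes: int = 1000,
                                width_mult: float = 1.0):
    # Thread the custom parameter through to the underlying module,
    # then wrap it so the Composer Trainer and recipes can consume it.
    module = mobilenet_v2(num_classes=num_classes, width_mult=width_mult)
    return ComposerClassifier(module=module, num_classes=num_classes)

model = build_composer_mobilenet_v2(num_classes=100, width_mult=0.75)
```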

Fixes a bug where the betas could be improperly converted from an OmegaConf `ListConfig`.
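An illustrative sketch of the failure mode and the kind of fix involved, not the PR's actual diff: a `ListConfig` is not a plain list or tuple, so `betas` should be converted before reaching the optimizer.

```python
# Illustrative sketch only, not the PR's actual code: an OmegaConf
# ListConfig must be converted to a plain tuple before being handed
# to the optimizer, or downstream code may mishandle it.
import torch
from omegaconf import OmegaConf

cfg = OmegaConf.create({"lr": 1e-4, "betas": [0.9, 0.95]})

betas = cfg.betas
if OmegaConf.is_config(betas):  # True for a ListConfig
    betas = tuple(OmegaConf.to_container(betas))

params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.AdamW(params, lr=cfg.lr, betas=betas)
```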

I have pulled the Docker image referenced in the README and installed the dependencies from requirements.txt. One difference is that I'm using Singularity instead of Docker. I face the following error only when...

Hi, we could successfully pretrain various MosaicBERT models, and evaluations with Composer-based fine-tuning look really good :) However, when using the conversion script `llm-foundry/scripts/inference/convert_composer_to_hf.py`, the converted HF model seems to...
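For reference, a minimal sketch of how a converted checkpoint would be loaded back for a sanity check, assuming the conversion script wrote the model to a hypothetical local directory `./hf_output`:

```python
# Sanity-check sketch; `./hf_output` is a hypothetical path where
# convert_composer_to_hf.py is assumed to have written the model.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./hf_output")
model = AutoModelForMaskedLM.from_pretrained("./hf_output",
                                             trust_remote_code=True)

inputs = tokenizer("MosaicBERT is a [MASK] encoder.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # inspect predictions for [MASK]
```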

This PR modernizes the MosaicBERT codebase with Flash Attention 2, PyTorch 2 (`torch==2.1.1`), and an updated version of composer (`mosaicml>=0.17`). In particular, this updates MosaicBERT to be compatible with [Flash...
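As a quick environment check against the pins named in this PR (the flash-attn floor of 2.0 is inferred from "Flash Attention 2" rather than stated as an exact pin):

```python
# Quick compatibility check against the versions this PR targets.
# torch==2.1.1 and mosaicml>=0.17 come from the PR text; flash-attn>=2.0
# is an inference from "Flash Attention 2", not an exact pin.
import composer
import flash_attn
import torch

print("torch:", torch.__version__)            # expect 2.1.1
print("composer:", composer.__version__)      # expect >= 0.17
print("flash-attn:", flash_attn.__version__)  # expect >= 2.0
```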

Hi MosaicML team, many thanks for releasing the code and models for MosaicBERT! I highly appreciate the effort you put into modernizing the BERT architecture. I am interested...

Hi, I tried replicating the BERT pretraining script, and when I ran it with the YAML config I got the following error: `Value bf16 is not available in Precision`. I...
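A small sketch that helps debug this, assuming Composer's `Precision` enum: recent Composer releases name the bf16 mode `amp_bf16`, so a YAML that still says `precision: bf16` can fail the enum lookup with exactly this message.

```python
# Debugging sketch, assuming composer.core.Precision: list the precision
# values the installed composer actually accepts. Newer releases use
# `amp_bf16`, so `precision: bf16` in the YAML fails the enum lookup.
from composer.core import Precision

print([p.value for p in Precision])
# e.g. ['fp32', 'amp_fp16', 'amp_bf16'] on recent composer versions
```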

This fixes #322. When I was working through #440, I got bitten by this bug.

Hey, I am trying to pull the model from the Hugging Face repo using `AutoModelForMaskedLM.from_pretrained('mosaicml/mosaic-bert-base-seqlen-2048', trust_remote_code=True, revision='b7a0389')` (with and without the revision param), and I am getting the same error, which goes like...

What I want to do:

```python
model = MosaicGPT.from_pretrained(
    "mosaicml/mpt-1b-redpajama-200b",
    trust_remote_code=True,
    attn_impl='torch',
)
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=tokenized_train_data["train"],
    eval_dataset=tokenized_val_data["validation"],
    dataset_text_field="text",
    args=training_args,
    neftune_noise_alpha=5,  # the only one important thing for me...
```