BERT-of-Theseus
⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020).
Hi, I cannot reproduce the CoLA score reported in the paper. I followed the HuggingFace repo to train a predecessor model, reaching a Matthews correlation coefficient of 55.76....
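For reference, a minimal sketch of how the CoLA metric is computed; it assumes scikit-learn is available (the HuggingFace GLUE evaluation also relies on it) and uses made-up predictions and labels:

```python
from sklearn.metrics import matthews_corrcoef

# CoLA is scored with the Matthews correlation coefficient (MCC).
# preds/labels below are hypothetical 0/1 acceptability judgments
# from a fine-tuned predecessor on the CoLA dev set.
preds = [1, 0, 1, 1, 0, 1]
labels = [1, 0, 0, 1, 0, 1]

# MCC lies in [-1, 1]; GLUE reports it scaled by 100 (e.g. 0.5576 -> 55.76).
mcc = matthews_corrcoef(labels, preds)
print(f"Matthews corr: {100 * mcc:.2f}")
```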
After training like that:

```bash
# For compression with a replacement scheduler
export GLUE_DIR=glue_script/glue_data
export TASK_NAME=MRPC

python ./run_glue.py \
  --model_name_or_path /home/bert-base \
  --task_name $TASK_NAME \
  --do_train \
  --do_eval \
  --do_lower_case...
```
What does "max_length" mean in the successor's config.json? I set max_seq_length=128 when I run compression, but the "max_length" in the successor's config.json is 20.
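For context, a minimal sketch (assuming a recent Hugging Face transformers release) illustrating that `max_length` in config.json is a text-generation default of 20, separate from the `max_seq_length` used to encode examples during fine-tuning/compression:

```python
from transformers import BertConfig, BertTokenizer

# config.max_length is a generation default (20 for BERT configs); it is
# written into config.json but is not the fine-tuning sequence length.
config = BertConfig.from_pretrained("bert-base-uncased")
print(config.max_length)  # 20

# The sequence length used for GLUE examples comes from the tokenizer call,
# driven by the --max_seq_length flag of run_glue.py.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoded = tokenizer("An example sentence.", max_length=128,
                    padding="max_length", truncation=True)
print(len(encoded["input_ids"]))  # 128
```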
Bumps [transformers](https://github.com/huggingface/transformers) from 2.4.0 to 4.30.0. Release notes (sourced from transformers's releases): v4.30.0: 100k, Agents improvements, Safetensors core dependency, Swiftformer, Autoformer, MobileViTv2, timm-as-a-backbone. 100k: Transformers has just reached 100k stars...