JFDuan
I have a similar problem...
Resolve conflicts for reference.
@zhuohan123 I think it's possible, and I've updated the `hf_model_weights_iterator` function. Maybe you can review it.
@ShawnXuan After that, I ran into other errors. It seems the version of oneflow-benchmark is not consistent with the version of oneflow, which causes many errors.
I can train BERT on a single node. But for two nodes, I use this script:
```
NUM_NODES=$1
NODE_IPS=$2
DATA_DIR=/home/duanjiangfei/OneFlow-Benchmark/LanguageModeling/BERT/wiki_ofrecord_seq_len_128_example
python run_pretraining.py \
  --gpu_num_per_node=8 \
  --num_nodes=$NUM_NODES \
  --node_ips=$NODE_IPS \
  --learning_rate=1e-4...
```
Thanks a lot. The BERT benchmark now runs successfully. But I cannot run the CNN benchmark because of https://github.com/Oneflow-Inc/OneFlow-Benchmark/issues/130#issuecomment-692714197
@ShawnXuan Thanks. @yuanms2 I think you should add some git tags to distinguish the different versions of the benchmark. I have one more question. You only release the speed of bert base...