Junlei Zhang

Results: 38 comments by Junlei Zhang

Currently, it looks like the tool does not support models exceeding 2 GB.

@Taka152 Hello, thank you for your reply. I am trying to accelerate the MBart Model. But the model size is too large. Could the main branch solve the issue as...

initializing bart tokenizer...
creating lightseq model...
Parsing hdf5: /home/sysadmin/downlaod/lightseq_models/lightseq_mbart_base.hdf5
loading 976 MB of embedding weight.
Finish loading src_emb_wei from host to device
loading 1073 MB of embedding weight.
Finish loading...

@byshiue Thank you for your reply. If I just use the python interface, could I get the logits output of the decoder?

The accuracy for network1 is almost the same.

Thank you for your reply. I changed two things: 1. I set the same batch_size for train_loader and valid_loader; 2. I set the random seed. Otherwise, it is hard...
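The second change mentioned in the comment above (fixing the random seed) can be sketched roughly as follows. This is only an illustration under assumptions, not the code from the original thread: the helper name `seed_everything`, the seed value, and `BATCH_SIZE` are hypothetical, and the NumPy and PyTorch generators are seeded only if those libraries happen to be installed.

```python
import random


def seed_everything(seed: int) -> None:
    """Seed every random number generator available in this environment."""
    random.seed(seed)  # Python's built-in generator
    try:
        import numpy as np  # optional dependency
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch  # optional dependency
        torch.manual_seed(seed)
    except ImportError:
        pass


# Hypothetical value: one batch size shared by the training and validation loaders.
BATCH_SIZE = 32

# Re-seeding with the same value reproduces the same sequence of draws.
seed_everything(1234)
first = [random.random() for _ in range(3)]
seed_everything(1234)
second = [random.random() for _ in range(3)]
print(first == second)
```

Calling the helper once at the start of a run, before the data loaders and the model are created, is the usual way to make two runs comparable.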

> Hello, if you simply fix this error by setting global_bank=0, 7/8 of your dataset will not be trained in an epoch. And if all workers work on the same...

I found that there is no ShardedDDPOption in transformers 4.2.1. Is this a problem with the version?

> Thanks for the attention. Here we reduce the number of nodes by `reduce_ratio` so as to reduce the computational cost of distance calculation. Thank you for your rapid reply....