
Post DeferredCompute Verification

Open · sxjscience opened this issue 4 years ago · 8 comments

Description

After https://github.com/dmlc/gluon-nlp/pull/1356 (thanks @szha and @leezu!), GluonNLP has fully embraced the new Gluon 2.0 API. We no longer need to worry about hybrid_forward and can write the logic directly in forward. This will help accelerate model development.
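For reference, a minimal sketch of the API difference (the toy block below is illustrative only and assumes an MXNet 2.0 nightly build; it is not taken from the GluonNLP code base):

```python
from mxnet import np, npx
from mxnet.gluon import nn

npx.set_np()

# Gluon 1.x style: the symbolic/imperative switch is threaded through `F`.
class OldStyleNet(nn.HybridBlock):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.dense = nn.Dense(16)

    def hybrid_forward(self, F, x):
        return F.npx.relu(self.dense(x))

# Gluon 2.0 style: with deferred compute, a plain `forward` is enough
# and hybridization still works.
class NewStyleNet(nn.HybridBlock):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.dense = nn.Dense(16)

    def forward(self, x):
        return npx.relu(self.dense(x))

net = NewStyleNet()
net.initialize()
net.hybridize()
out = net(np.ones((2, 8)))
```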

After this merge, we will need to re-verify our existing scripts:

@szha @szhengac @barry-jin @leezu

sxjscience · Oct 29 '20

I am verifying the machine translation scripts. It will take about 7 days.

xinyual · Nov 19 '20

Which instance are you using? Would you try out the p3.8 instance?






sxjscience · Nov 19 '20


No, I am using a g4 instance now. Do I need to use p3?

xinyual · Nov 19 '20

Yes, try using p3.8 to verify the NMT scripts.

sxjscience · Nov 19 '20

I verified the conversion scripts with https://github.com/dmlc/gluon-nlp/blob/master/tools/batch/run_batch_conversion.sh on AWS Batch.

Success: mobilebert, electra, albert

Failure:
- bart ('BARTHubInterface' object has no attribute 'args')
- xlmr and roberta ('RobertaHubInterface' object has no attribute 'args')
- bert (TensorFlow can't load because it cannot find cudart 10.1)

For ELECTRA pretraining, what's the recommended setting?

szha · Dec 08 '20

It turns out fairseq has changed .args to .cfg: https://github.com/pytorch/fairseq/blob/f3d5045a71ae463bd3f05254d7c4216801a04bc2/fairseq/hub_utils.py#L93
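A hedged sketch of the kind of compatibility shim a conversion script could use (this fallback logic is an illustration, not the actual GluonNLP fix; the exact field names in the new config are assumptions):

```python
import torch

# Load a fairseq hub model. Newer fairseq exposes the config as `.cfg`
# (an omegaconf DictConfig with a `model` sub-config); older versions
# exposed a flat argparse.Namespace as `.args`.
roberta = torch.hub.load('pytorch/fairseq', 'roberta.base')

if hasattr(roberta, 'cfg'):
    model_args = roberta.cfg.model   # newer fairseq
else:
    model_args = roberta.args        # older fairseq

print(model_args.encoder_embed_dim)
```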

szha · Dec 16 '20

The ELECTRA pretraining script was broken. Fixes are in https://github.com/dmlc/gluon-nlp/pull/1491

I'm also trying to see if the ELECTRA-base results can be reproduced with the script. To my understanding, this has never been done before.

szha · Jan 18 '21

@szha Also, you may try turning on AMP support for ELECTRA pretraining if you have time.
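For reference, a minimal sketch of what enabling AMP could look like in a generic MXNet training loop (the module path and toy network are assumptions; on MXNet 1.x the module lives under mxnet.contrib.amp, and this is not the ELECTRA script itself):

```python
import mxnet as mx
from mxnet import amp, gluon, np, npx

npx.set_np()
amp.init()  # patch operators to run in float16 where it is safe

net = gluon.nn.Dense(2)
net.initialize(ctx=mx.gpu(0))
trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': 1e-4})
amp.init_trainer(trainer)  # enable dynamic loss scaling on the trainer

loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
data = np.ones((8, 4), ctx=mx.gpu(0))
label = np.zeros((8,), ctx=mx.gpu(0))

with mx.autograd.record():
    loss = loss_fn(net(data), label)
    # scale the loss so small float16 gradients do not underflow
    with amp.scale_loss(loss, trainer) as scaled_loss:
        scaled_loss.backward()
trainer.step(data.shape[0])
```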

sxjscience · Jan 18 '21