Mu Li
Job d2l-en/PR-2030/1 is complete. Check the results at http://preview.d2l.ai/d2l-en/PR-2030/
I think we need to find the root cause; it may be due to bad weight initialization.
Job d2l-en/PR-1998/1 is complete. Check the results at http://preview.d2l.ai/d2l-en/PR-1998/
Job d2l-en/PR-1998/2 is complete. Check the results at http://preview.d2l.ai/d2l-en/PR-1998/
It looks like the training is not very stable; maybe the learning rate is too large?
@astonzhang @smolix @AnirudhDagar @archersama
We can have multiple versions of `Trainer`: for example, the basic CPU trainer `BasicTrainer`, the multi-GPU trainer `Trainer`, and others. The idea is to reuse code. We don't need to...
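A rough sketch of how that reuse could look, in plain Python. Only the names `BasicTrainer` and `Trainer` come from the comment above; the methods `fit` and `prepare_batch`, the `step_fn` callback, and the `num_devices` parameter are all hypothetical and are not the actual d2l API. The "multi-GPU" part is simulated by splitting a list into shards, so this only illustrates the class layout, not real device handling.

```python
class BasicTrainer:
    """Hypothetical CPU-only trainer that owns the epoch/batch loop."""

    def __init__(self, max_epochs):
        self.max_epochs = max_epochs
        self.log = []  # (epoch, loss) pairs recorded during fit()

    def prepare_batch(self, batch):
        # CPU trainer: pass the batch through unchanged.
        return batch

    def fit(self, step_fn, batches):
        # Shared loop, reused as-is by every subclass.
        for epoch in range(self.max_epochs):
            for batch in batches:
                loss = step_fn(self.prepare_batch(batch))
                self.log.append((epoch, loss))
        return self.log


class Trainer(BasicTrainer):
    """Hypothetical multi-device trainer: overrides only batch preparation."""

    def __init__(self, max_epochs, num_devices=2):
        super().__init__(max_epochs)
        self.num_devices = num_devices

    def prepare_batch(self, batch):
        # Stand-in for sharding across GPUs (assumption): split the batch
        # into num_devices strided shards instead of moving tensors.
        return [batch[i::self.num_devices] for i in range(self.num_devices)]


# Usage: the same fit() loop drives both trainers.
basic_log = BasicTrainer(max_epochs=2).fit(sum, [[1, 2], [3, 4]])
multi_log = Trainer(max_epochs=1, num_devices=2).fit(
    lambda shards: sum(sum(s) for s in shards), [[1, 2, 3, 4]]
)
```

With this layout, a new variant would only override the hook that differs, and the training loop itself stays in one place.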
Job PR-798/1 is complete. Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-798/1/index.html
Job PR-798/3 is complete. Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-798/3/index.html
Job PR-798/4 is complete. Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-798/4/index.html