Xingjian Shi
I will leave the issue open and we can change the name / add alias later.
Would you try the alpha version?
This might be difficult to fix. You may try to see if running GPT-2 alone will still cause the error.
@leezu What's your opinion on yacs? This is simpler and easier to hack.
If we check `fvcore/common/config` (https://github.com/facebookresearch/fvcore/blob/master/fvcore/common/config.py), we can see that it's based on yacs and adds some extra functionality. Currently, I'm doing it here: https://github.com/dmlc/gluon-nlp/blob/a646c34304c4bde9423468714bd2ff6357cd2091/src/gluonnlp/utils/config.py#L4-L26
Thanks to the efforts of @zhreshold, who created the https://github.com/zhreshold/autocfg project (see https://github.com/zhreshold/autocfg/blob/master/examples/basics.ipynb for a demo), GluonCV and GluonNLP may try to use autocfg as the main configuration...
We could add it first and figure out later whether we can host a snapshot of BookCorpus ourselves. What do you think?
@shawwn Really appreciate the information! I've tried out huggingface/datasets and found it quite good. In fact, we can add it even if the tarball changes. It's the same as...
I think the reason to use BlockSparse rather than general sparse is that it is usually faster, since we are dealing with a block of elements at one...
Also, the DGL team has been profiling the performance of sparse kernels for some time. So I also ping @yzh119 and @zheng-da here.
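To make the BlockSparse point above concrete, here is a minimal pure-Python sketch. It assumes a simple dictionary-of-tiles layout; the function names `to_block_sparse` and `block_sparse_matvec` are hypothetical, and this is not how any GluonNLP or DGL kernel is actually implemented. It only shows that index bookkeeping happens once per block, while the inner work runs over a small dense tile.

```python
from typing import Dict, List, Tuple

Tile = List[List[float]]


def to_block_sparse(dense: List[List[float]], block: int) -> Dict[Tuple[int, int], Tile]:
    """Keep only the nonzero (block x block) tiles, keyed by (block_row, block_col)."""
    n_rows, n_cols = len(dense), len(dense[0])
    assert n_rows % block == 0 and n_cols % block == 0
    blocks: Dict[Tuple[int, int], Tile] = {}
    for bi in range(n_rows // block):
        for bj in range(n_cols // block):
            tile = [row[bj * block:(bj + 1) * block]
                    for row in dense[bi * block:(bi + 1) * block]]
            # A tile is stored if any of its entries is nonzero.
            if any(v != 0 for r in tile for v in r):
                blocks[(bi, bj)] = tile
    return blocks


def block_sparse_matvec(blocks: Dict[Tuple[int, int], Tile],
                        x: List[float], n_rows: int, block: int) -> List[float]:
    """Compute y = A @ x, touching only the stored tiles."""
    y = [0.0] * n_rows
    for (bi, bj), tile in blocks.items():
        # One index lookup per block; the tile itself is processed densely.
        x_seg = x[bj * block:(bj + 1) * block]
        for r, row in enumerate(tile):
            y[bi * block + r] += sum(a * b for a, b in zip(row, x_seg))
    return y


dense = [
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 5.0, 6.0],
]
x = [1.0, 1.0, 1.0, 1.0]
blocks = to_block_sparse(dense, block=2)
y = block_sparse_matvec(blocks, x, n_rows=4, block=2)
print(sorted(blocks))  # which tiles were kept
print(y)
```

A general sparse format would store a coordinate for every nonzero element instead of one coordinate per tile. The trade-off in this sketch is that zeros inside a kept tile are still stored and multiplied.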