Zheyu Ye

Search results: 8 issues by Zheyu Ye

It seems that there exists a conflict in `num_attention_heads`, which is set to 32 in the `albert_config.json` included in the model tar file downloaded from [TF Hub](https://tfhub.dev/google/albert_xlarge/2) instead of...
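For reference, a minimal sketch of how one might check the bundled config, assuming the tar file has been downloaded locally (the archive path is hypothetical, and the config's member name may differ per release):

```python
import json
import tarfile

# Hypothetical local path to the tar file downloaded from TF Hub.
ARCHIVE = "albert_xlarge_v2.tar.gz"

with tarfile.open(ARCHIVE, "r:gz") as tar:
    # Locate the bundled config inside the archive.
    member = next(m for m in tar.getmembers() if m.name.endswith("albert_config.json"))
    config = json.load(tar.extractfile(member))

# Compare this against the value documented for ALBERT xlarge.
print(config["num_attention_heads"])
```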

I noticed that these two tiny versions differ slightly: the configs show the same model size, but the checkpoint files (`ckpt.data`) differ in size (17.2MB vs 49.7MB). Could you explain what optimizations the brightmart version applies, and at which specific stage of pretraining?
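One way to narrow down where the size difference comes from is to count the parameters stored in each checkpoint; a minimal sketch, assuming both checkpoints are available locally (the prefixes below are hypothetical):

```python
import numpy as np
import tensorflow as tf

def param_count(ckpt_prefix):
    """Sum the number of elements across all variables in a checkpoint."""
    return sum(np.prod(shape) for _, shape in tf.train.list_variables(ckpt_prefix))

# Hypothetical checkpoint prefixes for the two tiny variants being compared.
for prefix in ["google_tiny/albert_model.ckpt", "brightmart_tiny/albert_model.ckpt"]:
    print(prefix, param_count(prefix))
```

If the parameter counts match, the size gap likely comes from extra saved state (e.g. optimizer slots) rather than the model itself.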

There are a lot of irrelevant or unused parameters in the configuration file, such as `pooler_fc_size`, `pooler_num_attention_heads`, `pooler_size_per_head`, and `pooler_type`. It is even more remarkable that `type_vocab_size` is set to `2`...
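A quick sketch of how one could flag such dead keys, assuming a whitelist of fields the ALBERT modeling code actually reads (the whitelist below is an assumption based on the reference implementation; adjust it to the exact version in use):

```python
import json

# Fields the ALBERT modeling code is assumed to read (may vary by version).
USED_FIELDS = {
    "attention_probs_dropout_prob", "hidden_act", "hidden_dropout_prob",
    "embedding_size", "hidden_size", "initializer_range", "intermediate_size",
    "max_position_embeddings", "num_attention_heads", "num_hidden_layers",
    "num_hidden_groups", "inner_group_num", "type_vocab_size", "vocab_size",
}

with open("albert_config.json") as f:  # hypothetical local path
    config = json.load(f)

# Keys present in the shipped config but never consumed by the model,
# e.g. the pooler_* entries mentioned above.
print(sorted(set(config) - USED_FIELDS))
```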

As the title describes, the following URL is broken (https://github.com/dmlc/gluon-nlp/blob/master/README.md#14). The correct one would be ``` https://github.com/dmlc/gluon-nlp/actions ```


## Description ## Add a character-level tokenizer with tests. ## Comments ## Although this is a work-in-progress PR, it is still open for review; please leave comments on code quality...
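To illustrate the idea, here is a minimal sketch of what a character-level tokenizer and its test could look like (the class and method names are illustrative, not the PR's actual API):

```python
class CharTokenizer:
    """Toy character-level tokenizer: one token per Unicode character."""

    def __init__(self, vocab):
        self.vocab = {ch: i for i, ch in enumerate(vocab)}
        self.inv = {i: ch for ch, i in self.vocab.items()}
        self.unk = len(self.vocab)  # id reserved for unknown characters

    def encode(self, text):
        return [self.vocab.get(ch, self.unk) for ch in text]

    def decode(self, ids):
        return "".join(self.inv.get(i, "?") for i in ids)


def test_roundtrip():
    tok = CharTokenizer("abc ")
    assert tok.decode(tok.encode("a b c")) == "a b c"

test_roundtrip()
```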

I am wondering whether it would be possible to add the NEZHA Chinese model to the huggingface transformers repo, both the code and the parameters. NEZHA is widely used in the Chinese NLP community, and would...
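If NEZHA were merged, usage would presumably follow the standard transformers pattern; a hypothetical sketch (the hub id below is an assumption, not an existing checkpoint name):

```python
from transformers import AutoModel, AutoTokenizer

# Hypothetical hub id for an integrated NEZHA checkpoint.
MODEL_ID = "nezha-cn-base"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
outputs = model(**tokenizer("这是一个测试", return_tensors="pt"))
print(outputs.last_hidden_state.shape)
```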

Loading the module with `bert4keras==0.11.3` raises `RecursionError: maximum recursion depth exceeded while calling a Python object`; it looks like it is caused by tensorflow and multiple keras backends calling into each other. Error summary: ``` Traceback (most recent call last): File "", line 1, in File "/usr/local/lib/python3.6/dist-packages/bert4keras/tokenizers.py", line 5, in from bert4keras.snippets...
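bert4keras selects its Keras backend via the `TF_KERAS` environment variable, so one plausible workaround (a sketch, not a confirmed fix for this exact traceback) is to pin it to tf.keras before the first import:

```python
import os

# Setting TF_KERAS before importing bert4keras makes it use tf.keras,
# which may avoid standalone keras and tf.keras calling into each other.
os.environ["TF_KERAS"] = "1"

from bert4keras.tokenizers import Tokenizer  # import after setting the flag
```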

I used the same hyper-parameters as the paper, but with the generator sized 1:1 with the hidden size of 256, as you claimed in #39, to pretrain an ELECTRA small model on...
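For concreteness, the setup described above might be expressed roughly as follows; the parameter names loosely follow google-research/electra's `configure_pretraining.py` and are assumptions, not verified flags:

```python
# Illustrative hyper-parameter overrides for the 1:1 generator experiment.
hparams = {
    "model_size": "small",
    "generator_hidden_size": 1.0,  # generator sized 1:1 with the discriminator
}
# With ELECTRA-small, the discriminator hidden size is 256, so a 1:1
# generator also has hidden size 256.
```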