Results 10 issues of cyk

Hi, when running the scripts with the `--html` option, I get a `KeyError` while trying to convert the [XML dump](https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2) to HTML, the same as in [#40](https://github.com/attardi/wikiextractor/issues/40). This is also reported in [#247](https://github.com/attardi/wikiextractor/issues/247). Any solution to...

### Feature request Is there any documentation on using `load_dataset(streaming=True)` for (multi-node, multi-GPU) DDP training? ### Motivation Given a set of data files, they are expected to be split across different...

### Feature request Is there any documentation on using `load_dataset(streaming=True)` for (multi-node, multi-GPU) DDP training? ### Motivation Given a set of data files, they are expected to be split across different...
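As a rough illustration of the splitting this request describes, here is a minimal sketch in plain Python that assigns data files to DDP processes in round-robin order. The helper name `shard_files`, the file names, and the `rank` / `world_size` values are hypothetical and chosen only for illustration; this is not the `datasets` library's own implementation.

```python
def shard_files(files, rank, world_size):
    """Return the files that the process with the given rank should read.

    Hypothetical helper: files are sorted first so that every process
    derives the same ordering, then taken in round-robin steps.
    """
    if world_size < 1 or not 0 <= rank < world_size:
        raise ValueError("rank must satisfy 0 <= rank < world_size")
    return sorted(files)[rank::world_size]


# Illustrative file names and world size (assumptions, not from the issue).
files = [f"data-{i:05d}.jsonl" for i in range(8)]
world_size = 4
shards = [shard_files(files, rank, world_size) for rank in range(world_size)]
```

Each process then streams only its own shard, so no file is read twice. The `datasets` library also ships `datasets.distributed.split_dataset_by_node(dataset, rank, world_size)`, which may cover this use case directly; check its documentation for the exact behavior with streaming datasets.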

enhancement

To handle multiple datasets and tasks, we use more than 200 dataloaders and prepare them all with `dataloader1, dataloader2, ..., dataloader200 = accelerate.prepare(dataloader1, dataloader2, ..., dataloader200)`. It causes the memory...

bug

I would like to add special tokens to an existing (pre-trained) tokenizer, where the added tokens are not separated from neighboring tokens by whitespace. Therefore, the decoded string contains additional whitespace ahead...

When running `python main.py -g rankgan`, I got an `IndexError`: ```bash Traceback (most recent call last): File "/home/xxx/Texygen/main.py", line 78, in parse_cmd gan.train_oracle() File "/home/xxx/Texygen/models/rankgan/Rankgan.py", line 121, in train_oracle self.evaluate() File "/home/chaiyekun/GAN.tf/Texygen/models/rankgan/Rankgan.py",...

Is the generator loss of MaliGAN correct? It should be: ![image](https://user-images.githubusercontent.com/13767887/91018741-8867c500-e622-11ea-8988-c6bd96660982.png) [https://github.com/geek-ai/Texygen/blob/3104e22ac75f3cc2070da2bf5e2da6d2bef149ad/models/maligan_basic/MaliganGenerator.py#L112](https://github.com/geek-ai/Texygen/blob/3104e22ac75f3cc2070da2bf5e2da6d2bef149ad/models/maligan_basic/MaliganGenerator.py#L112) ```python self.g_loss = -tf.reduce_sum( tf.reduce_sum( tf.one_hot(tf.to_int32(tf.reshape(self.x, [-1])), self.num_vocabulary, 1.0, 0.0) * tf.log( tf.clip_by_value(tf.reshape(self.g_predictions, [-1, self.num_vocabulary]), 1e-20, 1.0) ),...

### PR types ### PR changes ### Description

contributor
stale

Hi there, I am confused about the part of applying prior to the computed variances. Would you by any chance explain it? Thanks ;) [Link](https://github.com/aakhundov/tf-example-models/blob/40b32991a76cb8d7201f9a5851789847db310b79/models/tf_gmm.py#L100) ```python # applying prior to...
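For context on the question, one common way to apply a prior to GMM variances is a MAP update under an inverse-gamma prior, which keeps a variance from collapsing toward zero when a component owns very few points. The sketch below assumes that setup; the function name `map_variance` and the parameters `alpha` and `beta` are illustrative assumptions, and the truncated excerpt does not confirm that the linked code does exactly this.

```python
def map_variance(xs, gammas, mean, alpha, beta):
    """MAP estimate of a 1-D Gaussian variance under an InvGamma(alpha, beta) prior.

    xs     : data points
    gammas : responsibilities (soft counts) of this component for each point
    mean   : current mean of the component
    """
    n_eff = sum(gammas)  # effective number of points in the component
    scatter = sum(g * (x - mean) ** 2 for x, g in zip(xs, gammas))
    # Posterior mode: (scatter + 2*beta) / (n_eff + 2*(alpha + 1)).
    return (scatter + 2.0 * beta) / (n_eff + 2.0 * (alpha + 1.0))


# Two points fully owned by the component: the prior pulls the
# maximum-likelihood variance (1.0) toward beta / (alpha + 1) = 0.5.
well_supported = map_variance([0.0, 2.0], [1.0, 1.0], 1.0, alpha=1.0, beta=1.0)

# Almost no responsibility: the estimate falls back to roughly
# beta / (alpha + 1) instead of collapsing toward zero.
nearly_empty = map_variance([0.0, 2.0], [1e-12, 1e-12], 1.0, alpha=1.0, beta=1.0)
```

With `alpha = -1` and `beta = 0` the formula reduces to the ordinary weighted maximum-likelihood variance, which makes the effect of the prior easy to isolate.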

Supplementary solutions to exercises 1-12, compatible with the new version of the Ray library