Jonathan Chang

Results 12 issues of Jonathan Chang

# Fine-tune a BPE tokenizer by only adding merge rules

Only add, no remove.

## Motivation

I want to update a GPT2 tokenizer on my corpus, without manually adding special...
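The idea of extending a BPE tokenizer by appending merge rules, without touching the existing ones, might look roughly like the toy sketch below. It is not the GPT2 tokenizer or any library's API: `apply_merges` and `extend_merges` are hypothetical helpers, words are split into single characters, and merges are applied one after another in rank order, which is a simplification.

```python
from collections import Counter


def apply_merges(word, merges):
    """Segment `word` by applying each merge rule in order (toy BPE encoder)."""
    symbols = list(word)
    for left, right in merges:
        merged, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and symbols[i] == left and symbols[i + 1] == right:
                merged.append(left + right)
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


def extend_merges(merges, corpus_words, num_new):
    """Return `merges` plus up to `num_new` rules learned on `corpus_words`.

    Existing rules are kept as-is and keep their priority; new rules are
    only appended (hypothetical helper, assumption for illustration).
    """
    merges = list(merges)
    for _ in range(num_new):
        pair_counts = Counter()
        for word in corpus_words:
            symbols = apply_merges(word, merges)
            pair_counts.update(zip(symbols, symbols[1:]))
        if not pair_counts:
            break  # every word is already a single symbol
        best_pair, _ = pair_counts.most_common(1)[0]
        merges.append(best_pair)
    return merges


old_merges = [("l", "o")]
new_merges = extend_merges(old_merges, ["low", "low", "lower"], num_new=1)
print(new_merges)
```

Because new rules are only appended, any text that the old rules already segmented keeps the same prefix of merges, and only the later, lower-priority merges differ.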

Here is an example of `Dropout` in a model definition: https://github.com/google/flax/blob/d068512a932da3e05b822790a591bac391aeab36/examples/nlp_seq/models.py#L211

Here is the `apply()` call, where `rngs` is passed in: https://github.com/google/flax/blob/d068512a932da3e05b822790a591bac391aeab36/examples/nlp_seq/train.py#L206-L207

However, the `rng` is not very clearly explained in...

Priority: P3 - no schedule

```python
import huggingface_hub
huggingface_hub.__version__
```

```
'0.0.13'
```

```python
from huggingface_hub import snapshot_download, HfApi

api = HfApi()

# success
api.model_info("flax-community/ft5-cnn-dm")

# success
api.model_info(
    "flax-community/ft5-cnn-dm", revision="859350e337148108b32b6f9eef45d0d4c6b668a9"
)

# fail
api.model_info("flax-community/ft5-cnn-dm", revision='859350e')
```

...

discussion

**Is your feature request related to a problem? Please describe.**

```
from pathlib import Path
from datasets import Dataset
ds = Dataset.read_json(Path('data.json'))
```

causes an error

```
AttributeError: 'PosixPath' object...
```

enhancement
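One possible workaround for this kind of `PosixPath` error is to convert path-like objects to plain strings with `os.fspath` before they reach code that expects `str`. The sketch below uses only the standard library; `load_json_file` is a hypothetical helper for illustration, not part of `datasets`.

```python
import json
import os
from pathlib import Path


def load_json_file(path):
    """Load a JSON document from a `str` or `pathlib.Path` location.

    `os.fspath` returns a `str` unchanged and converts a `Path` to `str`,
    so both kinds of argument follow the same code path afterwards.
    """
    path_str = os.fspath(path)
    with open(path_str, encoding="utf-8") as f:
        return json.load(f)
```

A caller could then pass either `load_json_file("data.json")` or `load_json_file(Path("data.json"))`.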

The current download progress bar doesn't show the names of the files being downloaded, and doesn't show how many files will be downloaded. With `allow_patterns` and `ignore_patterns`, having a dry run option...

enhancement
good first issue
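A dry run of this kind could be sketched as a pure filtering step that lists which files would be fetched, and how many, before anything is downloaded. The example below uses only the standard library; `plan_download` is a hypothetical helper, the file names are made up, and the glob semantics (via `fnmatchcase`) are an assumption rather than the library's actual matching rules.

```python
from fnmatch import fnmatchcase


def plan_download(filenames, allow_patterns=None, ignore_patterns=None):
    """Return the files that would be downloaded, without downloading them.

    A file is kept if it matches at least one allow pattern (when given)
    and matches no ignore pattern (when given).
    """
    selected = []
    for name in filenames:
        if allow_patterns is not None and not any(
            fnmatchcase(name, pattern) for pattern in allow_patterns
        ):
            continue
        if ignore_patterns is not None and any(
            fnmatchcase(name, pattern) for pattern in ignore_patterns
        ):
            continue
        selected.append(name)
    return selected


# Hypothetical repository listing, for illustration only.
repo_files = ["config.json", "flax_model.msgpack", "pytorch_model.bin", "README.md"]
planned = plan_download(
    repo_files,
    allow_patterns=["*.json", "*.msgpack", "*.bin"],
    ignore_patterns=["*.bin"],
)
print(f"Would download {len(planned)} file(s): {planned}")
```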

WIP for #154

- [x] Code to install FasterTransformer
- [x] Code to optimize model using FasterTransformer

# test this PR locally

...

**Is your feature request related to a problem? Please describe.**

The current `.map` does not support multiprocessing, so the CPU can become a bottleneck if the pre-processing is complex (e.g. T5 span masking)....

enhancement

This PR is for visibility only; it is not intended to be merged.