Jonathan Chang issues

Results: 12 issues of Jonathan Chang

Fine-tune a BPE tokenizer by only adding merge rules

# Fine-tune a BPE tokenizer by only adding merge rules

only add, no remove

## Motivation

I want to update a GPT2 tokenizer on my corpus, without manually adding special...
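The "only add, no remove" idea can be sketched with a toy character-level BPE. This is a minimal illustration under stated assumptions, not the way any tokenizer library implements it: `apply_merges` and `extend_merges` are hypothetical helpers, the corpus is a plain list of words, and a real GPT2 tokenizer works on bytes and would also need its vocabulary extended to match the new merges.

```python
from collections import Counter


def apply_merges(symbols, merges):
    """Apply each merge rule once, in list order, to a sequence of symbols."""
    symbols = list(symbols)
    for left, right in merges:
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and symbols[i] == left and symbols[i + 1] == right:
                out.append(left + right)  # fuse the matching pair
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        symbols = out
    return tuple(symbols)


def extend_merges(base_merges, corpus_words, num_new):
    """Return base_merges followed by up to num_new merges learned from corpus_words.

    Existing rules are never removed or reordered (hypothetical helper).
    """
    merges = list(base_merges)
    seen = set(merges)

    # Segment the new corpus with the existing rules first.
    words = Counter()
    for word, freq in Counter(corpus_words).items():
        words[apply_merges(tuple(word), merges)] += freq

    for _ in range(num_new):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in words.items():
            for pair in zip(symbols, symbols[1:]):
                if pair not in seen:
                    pairs[pair] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken by the pair itself for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        seen.add(best)

        # Re-segment the corpus with the newly added rule.
        updated = Counter()
        for symbols, freq in words.items():
            updated[apply_merges(symbols, [best])] += freq
        words = updated
    return merges


base = [("l", "o"), ("lo", "w")]
corpus = ["lower", "lower", "lowest", "low"]
new_merges = extend_merges(base, corpus, num_new=2)
```

Because the original rules stay at the front of the list, text that the base tokenizer already handled still sees those merges applied first; only the appended rules are new.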

Improve documentation for `Dropout` and `rngs` argument in `linen.Module.apply()`

Here is an example of `Dropout` in a model definition: https://github.com/google/flax/blob/d068512a932da3e05b822790a591bac391aeab36/examples/nlp_seq/models.py#L211 Here is the `apply()` call, where `rngs` is passed in: https://github.com/google/flax/blob/d068512a932da3e05b822790a591bac391aeab36/examples/nlp_seq/train.py#L206-L207 However, the `rng` is not very clearly explained in...

Priority: P3 - no schedule

`HfApi.model_info(revision=)` does not resolve hash prefix

```python
import huggingface_hub
huggingface_hub.__version__
```

```
'0.0.13'
```

```python
from huggingface_hub import snapshot_download, HfApi

api = HfApi()

# success
api.model_info("flax-community/ft5-cnn-dm")

# success
api.model_info(
    "flax-community/ft5-cnn-dm", revision="859350e337148108b32b6f9eef45d0d4c6b668a9"
)

# fail
api.model_info("flax-community/ft5-cnn-dm", revision='859350e')
```

...

discussion
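The behaviour requested in this issue, resolving a short hash prefix to a full commit hash, can be sketched without any network calls. This is only an illustration: `resolve_revision` is a hypothetical helper, not part of `huggingface_hub`, and it assumes the caller already has a list of full commit hashes for the repository. The second hash in the example list is made up.

```python
def resolve_revision(prefix, known_commits):
    """Return the single full hash in known_commits that starts with prefix.

    Hypothetical helper: raises LookupError when nothing matches or when
    the prefix is ambiguous.
    """
    prefix = prefix.lower()
    matches = sorted({c for c in known_commits if c.lower().startswith(prefix)})
    if not matches:
        raise LookupError(f"no commit starts with {prefix!r}")
    if len(matches) > 1:
        raise LookupError(f"prefix {prefix!r} is ambiguous: {matches}")
    return matches[0]


commits = [
    "859350e337148108b32b6f9eef45d0d4c6b668a9",  # full hash quoted in the issue
    "0123456789abcdef0123456789abcdef01234567",  # made-up hash for illustration
]
full_hash = resolve_revision("859350e", commits)
```

Rejecting ambiguous prefixes, rather than picking the first match, mirrors how git treats short hashes.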

Allow pathlib PosixPath in Dataset.read_json

**Is your feature request related to a problem? Please describe.**

```
from pathlib import Path
from datasets import Dataset
ds = Dataset.read_json(Path('data.json'))
```

causes an error

```
AttributeError: 'PosixPath' object...
```

enhancement
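One common way for a reader function to accept both `str` and `pathlib.Path` is to normalise the argument with `os.fspath` before using it. The sketch below shows that pattern with the standard library only; `read_json_records` is a hypothetical stand-in, not the library function named in the issue.

```python
import json
import os
from pathlib import Path


def read_json_records(path):
    """Load a JSON file given a str or any os.PathLike (hypothetical helper)."""
    # os.fspath returns str/bytes unchanged and converts PathLike objects;
    # anything else raises TypeError early with a clear message.
    path = os.fspath(path)
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

With this normalisation, both `read_json_records("data.json")` and `read_json_records(Path("data.json"))` go through the same code path.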

Allow dry_run for snapshot_download

The current download progress bar doesn't show the names of the files being downloaded, and doesn't show how many files will be downloaded. With allow_patterns, ignore_patterns, having a dry run option...

enhancement

good first issue
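A dry run of the kind requested here could amount to filtering the repository's file list and reporting the result instead of downloading. The sketch below is an assumption about how that might look, not the `snapshot_download` implementation: `plan_download` is a hypothetical helper, the file list is invented, and patterns are treated as shell-style globs via `fnmatch`.

```python
from fnmatch import fnmatchcase


def plan_download(filenames, allow_patterns=None, ignore_patterns=None):
    """Return the files that would be downloaded, without downloading anything.

    Hypothetical helper: a file is kept if it matches at least one allow
    pattern (when given) and matches no ignore pattern (when given).
    """
    selected = []
    for name in filenames:
        if allow_patterns is not None and not any(
            fnmatchcase(name, p) for p in allow_patterns
        ):
            continue
        if ignore_patterns is not None and any(
            fnmatchcase(name, p) for p in ignore_patterns
        ):
            continue
        selected.append(name)
    return selected


# Invented file list for illustration.
files = ["config.json", "flax_model.msgpack", "pytorch_model.bin", "README.md"]
plan = plan_download(files, allow_patterns=["*.json", "*.msgpack"])
print(f"{len(plan)} file(s) would be downloaded: {plan}")
```

Printing the plan covers both gaps mentioned in the issue: the file names and the total count are visible before any transfer starts.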

Add Faster Transformer compiler for Bert

Make stable diffusion compiled model portable

Add diffusers to dependency in setup.py

Support multi-worker with streaming dataset (IterableDataset).

debug code (WIP)