Chen Qian

Results: 42 issues by Chen Qian

🛠 DevTools 🛠

[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/chenmoneygithub/mlflow/pull/11172?quickstart=1)

#### Install mlflow from this PR

```
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/11172/merge
```

#### Checkout with GitHub CLI

```
gh pr checkout 11172
```

...

🛠 DevTools 🛠

[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/chenmoneygithub/mlflow/pull/11094?quickstart=1)

#### Install mlflow from this PR

```
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/11094/merge
```

#### Checkout with GitHub CLI

```
gh pr checkout 11094
```

...

rn/feature

Hi team, I am a bit confused about how FSDP works with Accelerator. Basically, if I run the code below from the TRL SFT example on my 4-GPU instance:...

We added a dropout layer to `keras_nlp.models.BertClassifier`, so we need to update the presets accordingly.
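For context, the dropout in question sits between the pooled backbone output and the final dense projection. Here is a minimal NumPy sketch of that head shape; the function and argument names are illustrative, not the actual `keras_nlp.models.BertClassifier` internals:

```python
import numpy as np

def classifier_head(pooled, weights, bias, dropout_rate=0.1, training=True, rng=None):
    # Inverted dropout between the pooled output and the dense projection
    # (illustrative sketch, not the library implementation).
    rng = rng or np.random.default_rng(0)
    if training and dropout_rate > 0.0:
        keep = (rng.random(pooled.shape) >= dropout_rate).astype(pooled.dtype)
        pooled = pooled * keep / (1.0 - dropout_rate)
    return pooled @ weights + bias

pooled = np.ones((2, 4), dtype=np.float32)
weights = np.ones((4, 3), dtype=np.float32)
bias = np.zeros(3, dtype=np.float32)
logits = classifier_head(pooled, weights, bias, training=False)
```

Because the new layer changes the model's layer count and config, any saved presets built from the old architecture would need to be regenerated to match.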

type:Bug
stat:contributions welcome
stale

Hi team, I was checking the implementation of the `RotaryEmbedding` layer and was a bit confused by the following computation:

```
def _apply_rotary_pos_emb(self, tensor, cos_emb, sin_emb):
    x1, x2 = ops.split(tensor, 2,...
```
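For reference, the split-halves formulation of rotary position embedding can be sketched in NumPy as follows; this mirrors the shape of the computation above but is not the exact KerasNLP implementation:

```python
import numpy as np

def apply_rotary_pos_emb(tensor, cos_emb, sin_emb):
    # Split the feature dimension into two halves and rotate each
    # (x1, x2) pair by the position-dependent angle:
    # [x1, x2] -> [x1*cos - x2*sin, x2*cos + x1*sin]
    x1, x2 = np.split(tensor, 2, axis=-1)
    half_dim = tensor.shape[-1] // 2
    cos, sin = cos_emb[..., :half_dim], sin_emb[..., :half_dim]
    return np.concatenate([x1 * cos - x2 * sin, x2 * cos + x1 * sin], axis=-1)

# Rotating by a zero angle (cos=1, sin=0) should leave the tensor unchanged.
x = np.arange(8.0).reshape(1, 8)
out = apply_rotary_pos_emb(x, np.ones((1, 8)), np.zeros((1, 8)))
```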

type:Bug

Relative position is useful for text of arbitrary length. Our DeBERTa model now has a relative positional encoding, but it currently only returns the repeated embedding matrix: [code link](https://github.com/keras-team/keras-nlp/blob/340a5cc7370d0f91bd1acff5b25bf60a73aa6e38/keras_nlp/models/deberta_v3/relative_embedding.py#L73) I...
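To illustrate the intended behavior, here is a NumPy sketch of gathering a per-pair relative position embedding from a table rather than repeating the whole table; the distance clipping follows the usual DeBERTa-style idea of capping distances at a maximum span, and is illustrative rather than the library code:

```python
import numpy as np

def relative_position_embeddings(seq_len, table):
    # table has one row per clipped distance in [-max_rel, max_rel].
    max_rel = (table.shape[0] - 1) // 2
    pos = np.arange(seq_len)
    # rel[i, j] = clipped distance (j - i), shifted to a non-negative index.
    rel = np.clip(pos[None, :] - pos[:, None], -max_rel, max_rel) + max_rel
    return table[rel]  # shape: (seq_len, seq_len, hidden_dim)

table = np.random.default_rng(0).normal(size=(2 * 4 + 1, 8))  # distances -4..4
emb = relative_position_embeddings(5, table)
```

Because the lookup depends only on clipped distances, the same small table covers sequences of arbitrary length.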

type:feature

The current [unit tests](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/tokenizers/byte_pair_tokenizer_test.py) depend on the real vocab/merges used in the RoBERTa model. We should figure out how to test with a local vocab/merges. Note that it is a bit complex to...
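The idea can be sketched with a toy vocab/merges pair small enough to live inside the test file, together with a minimal greedy BPE merge loop; this is an illustration of testing against local assets, not the library's tokenizer:

```python
def bpe_encode(word, merges, vocab):
    # Greedy BPE: repeatedly merge the adjacent pair with the best
    # (lowest) rank in the merges list until no mergeable pair remains.
    tokens = list(word)
    ranks = {pair: i for i, pair in enumerate(merges)}
    while len(tokens) > 1:
        pairs = [(ranks.get((a, b), float("inf")), i)
                 for i, (a, b) in enumerate(zip(tokens, tokens[1:]))]
        rank, i = min(pairs)
        if rank == float("inf"):
            break
        tokens[i:i + 2] = [tokens[i] + tokens[i + 1]]
    return [vocab[t] for t in tokens]

# Tiny local assets: no network download needed for the test.
merges = [("l", "o"), ("lo", "w")]
vocab = {"l": 0, "o": 1, "w": 2, "lo": 3, "low": 4, "e": 5, "r": 6}
ids = bpe_encode("lower", merges, vocab)
```

With assets this small, the expected token splits can be checked by hand, which is exactly what a download-free unit test needs.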

type:feature

We need to ensure our models work well on TPU. We will look into adding automatic TPU testing; hopefully we can get this done before the 0.4 release.

scoping required

Currently the method `_bpe_merge_one_step` is annotated with `@tf.function`. Ideally we should not require this annotation, but removing it causes several errors. We should make the corresponding fixes and remove the annotation.

type:Bug

One interesting question is how we handle long context, since our models have a limit on input length due to positional embeddings. Ideally we should ship a task model...
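One common way a task model can work around the positional embedding limit is to split long inputs into overlapping windows, run the model per window, and merge the results. A minimal sketch of the chunking step (window and overlap sizes are illustrative, not a library default):

```python
def chunk_tokens(tokens, max_len, stride):
    # Slide a window of max_len tokens, overlapping each chunk with the
    # next by `stride` tokens so no span loses all its context.
    chunks = []
    step = max_len - stride
    for start in range(0, max(len(tokens) - stride, 1), step):
        chunks.append(tokens[start:start + max_len])
    return chunks

chunks = chunk_tokens(list(range(10)), max_len=4, stride=2)
```

Every token appears in at least one chunk, and interior tokens appear in two, so per-chunk predictions can be merged (e.g., by averaging the overlap).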

type:feature