One issues

Results: 24 issues of One

[BUG] Failed to checkpoint with deepspeed 0.12.4

**Describe the bug** I encountered an issue when using DeepSpeed 0.12.4 with the [OpenChat trainer](https://github.com/imoneoi/openchat), where checkpointing failed and raised an NCCL error. However, the checkpoints work fine when using...

bug

training

When will the Llama 2 34B model be released?

question

research-paper

[CLI]: wandb.finish() stuck when uploaded all data

### Describe the bug When running a training loop multiple times and calling `wandb.finish()` after each run, although it shows that all data is uploaded, the program is still stuck...

c:sync

a:cli

c:service

HTML visualizer follow target and freeze angle features not working in Brax v2

pytinyrenderer is incompatible with Python 3.11

Brax cannot be installed with pip for Python 3.11. The dependency pytinyrenderer is not compatible with the latest Python. pytinyrenderer seems like a course project and hasn't been updated for...

bug

Original bfloat16 weights

Can we download the original (supposedly bfloat16) weights for fine-tuning? The checkpoint is int8 quantized.

6x slower compared to IsaacGymEnvs

Why is the OmniIsaacGymEnvs (Omniverse Isaac Sim) version about 6x slower than the IsaacGymEnvs (IsaacGym Preview) version on the same hardware? Tested on the Humanoid environment, headless, with 8192 environments (RTX 3080 Laptop)...

AESLC Checksum error

When I try to create "flan2021_submix", the process fails with a wrong checksum on the AESLC dataset. ``` Downloading and preparing dataset 11.10 MiB (download: 11.10 MiB, generated: Unknown size,...

No train split in `adversarial_qa_dbert_answer_the_following_q_template_0to10_no_opt_x_shot`

I can't create "t0_submix"; it reports that no training split was found in "adversarial_qa_dbert_answer_the_following_q_template_0to10_no_opt_x_shot" ``` ERROR:absl:Failed to load task 't0_task_adaptation:adversarial_qa_dbert_answer_the_following_q_template_0to10_no_opt_x_shot' as part of mixture 't0_submix' Traceback (most recent call last): File "/home/one/anaconda3/envs/flan/lib/python3.10/runpy.py", line...

Fused Linear and Cross-Entropy Loss `torch.nn.functional.linear_cross_entropy`

### 🚀 The feature, motivation and pitch It'd be great to have a fused linear and cross-entropy function in PyTorch, for example, `torch.nn.functional.linear_cross_entropy`. This function acts as a fused linear...

module: performance

module: nn

module: loss

triaged