Dataset Download Utils requires Gitlab integration

Open fsiino-nvidia opened this issue 3 weeks ago • 0 comments

Describe the bug

External users cannot train using NeMo Gym because GitLab integration is hardcoded as a requirement. The DatasetConfig validator enforces that all train/validation datasets must have a gitlab_identifier, triggering automatic GitLab downloads that crash without credentials.

Steps/Code to reproduce bug

1. Create a dataset config without a gitlab_identifier.
2. Run ng_prepare_data "+config_paths=[config.yaml]".
3. The command fails with the assertion error: "A Gitlab path is required for train"

Expected behavior

External users should be able to use local files or HuggingFace without GitLab credentials. Internal NVIDIA users should retain full GitLab functionality.
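
One way the expected behavior could look is sketched below. This is not the actual NeMo Gym code: `DatasetConfigSketch`, `resolve_source`, and every field name other than `gitlab_identifier` are hypothetical stand-ins, and the order of preference (local file, then HuggingFace, then GitLab) is an assumption. The point is only that GitLab becomes one optional source among several rather than a hard requirement.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class DatasetConfigSketch:
    """Hypothetical stand-in for DatasetConfig; not the real NeMo Gym class."""

    name: str
    type: str  # e.g. "train" or "validation"
    jsonl_fpath: str  # assumed field: where the dataset lives (or will live) on disk
    gitlab_identifier: Optional[dict] = None  # optional instead of required
    huggingface_identifier: Optional[dict] = None  # assumed field name

    def resolve_source(self) -> str:
        """Decide where the data comes from without assuming GitLab access."""
        # A file that is already on disk needs no download and no credentials.
        if Path(self.jsonl_fpath).is_file():
            return "local"
        if self.huggingface_identifier is not None:
            return "huggingface"
        # Internal users keep the GitLab path when they configure it.
        if self.gitlab_identifier is not None:
            return "gitlab"
        raise ValueError(
            f"Dataset '{self.name}' ({self.type}): no local file at "
            f"{self.jsonl_fpath} and no remote identifier configured"
        )
```

With a check like this, a config that points only at a local file resolves to "local", and the GitLab download is attempted only when a gitlab_identifier is actually present.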

Proposed Fixes

- Remove gitlab requirement
- Update docs

fsiino-nvidia, Dec 01 '25 21:12