Aayush Sharma
Is there a way to use sequence parallelism for the NemotronHForCausalLM model? I am getting:
```
File "/data/TTS/aayush/nemo-rl/nemo_rl/models/policy/dtensor_policy_worker.py", line 352, in __init__
    self.model = _parallelize_model(
                 ^^^^^^^^^^^^^^^^^^^
File "/data/TTS/aayush/nemo-rl/nemo_rl/models/dtensor/parallelize.py", line 529, in _parallelize_model
...
```
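For context, nemo-rl's `_parallelize_model` applies a per-model sharding plan built on PyTorch's public DTensor APIs, and the traceback suggests it has no plan registered for this architecture. Below is a minimal sketch of what such a sequence-parallel plan usually looks like; the submodule names (`input_layernorm`, `mlp.up_proj`, ...) are assumptions about a generic transformer block, not the actual NemotronH (hybrid Mamba/attention) layout, and `apply_sequence_parallel` is a hypothetical helper, not nemo-rl's API:

```python
import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import Shard
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    RowwiseParallel,
    SequenceParallel,
    parallelize_module,
)

def apply_sequence_parallel(block: torch.nn.Module, tp_mesh) -> torch.nn.Module:
    """Shard one transformer block so activations stay sharded on the
    sequence dim (Shard(1)) between layers. Module names are assumed."""
    plan = {
        # Norms run sequence-parallel: they compute on seq-sharded activations.
        "input_layernorm": SequenceParallel(),
        "post_attention_layernorm": SequenceParallel(),
        # Colwise all-gathers the seq-sharded input before the matmul;
        # rowwise reduce-scatters its output back to a seq-sharded layout.
        "mlp.up_proj": ColwiseParallel(input_layouts=Shard(1)),
        "mlp.down_proj": RowwiseParallel(output_layouts=Shard(1)),
    }
    return parallelize_module(block, tp_mesh, plan)

# Usage, inside an initialized torch.distributed job (tp_size is assumed):
# tp_mesh = init_device_mesh("cuda", (tp_size,), mesh_dim_names=("tp",))
# for layer in model.model.layers:
#     apply_sequence_parallel(layer, tp_mesh)
```

The practical implication is that supporting NemotronHForCausalLM would likely mean extending the plan in `nemo_rl/models/dtensor/parallelize.py` with entries matching its actual layer names, since the Mamba blocks do not follow the attention/MLP naming a generic plan expects.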
I am able to train a text LLM in NeMo by just passing an mtp_heads entry in the model config; how do I do that here?
This is what my NeMo model looks like:
```
/path/to/runs/tts/model
│
├── lightning_logs/
├── model/
├── model--reduced_train_loss=4.0176-epoch=1-consumed_samples=91406336.0-last/
│   ├── context/
│   │   ├── /
│   │   ├── /
│   │   ...
```