atfujita

Results 3 issues of atfujita

Hi, when `code_location` is set on the estimator passed to `TrainingStep()`, the timestamp in the uploaded S3 path and the one in `sagemaker_submit_directory` do not match (they differ by about 400 ms). This causes the execution to fail. In SageMaker...

bug

Why are cross-entropy and accuracy logged when the column being imputed is a numeric variable? The imputed value I got is continuous, so I am wondering why these metrics appear. ```2021-02-19 19:20:06,853 [INFO] Note:...

`examples/nlp/language_modeling/tuning/megatron_gpt_finetuning.py` ignores `trainer.max_epochs` and always uses only `trainer.max_steps`. Tested on: `nvcr.io/nvidia/nemo:24.03.framework`. Test case: `trainer.max_steps=200` and `trainer.max_epochs=1` (187 steps) - 200 step job finished; `trainer.max_steps=200` and `trainer.max_epochs=5` (935 steps) - 200...
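
To make the expectation in this report concrete, here is a minimal sketch of the arithmetic. It assumes the usual PyTorch Lightning convention that training stops at whichever of `max_steps` and `max_epochs` is reached first. The helper `expected_stop_step` is hypothetical and not part of NeMo; the 187 steps per epoch figure is taken from the test case above.

```python
def expected_stop_step(max_steps: int, max_epochs: int, steps_per_epoch: int) -> int:
    """Hypothetical helper: the step at which training should stop
    if both limits are honored (whichever is reached first)."""
    return min(max_steps, max_epochs * steps_per_epoch)


# 187 steps per epoch, as reported in the test case.
STEPS_PER_EPOCH = 187

for max_epochs in (1, 5):
    stop = expected_stop_step(
        max_steps=200, max_epochs=max_epochs, steps_per_epoch=STEPS_PER_EPOCH
    )
    print(f"max_steps=200, max_epochs={max_epochs}: expected stop at step {stop}")
```

Under that assumption, the `max_epochs=1` run would be expected to stop at step 187, while the report says the job ran for 200 steps.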

bug