amazon-sagemaker-examples

[Bug Report] Wrong function call in model_parallel_v2

Open florianbodr opened this issue 1 year ago • 0 comments

Hi all,

There is a bug in the following py file:
training/distributed_training/pytorch/model_parallel_v2/shared-scripts/logging_utils.py

Line 151 states the following:
avg_tflops = compute_tflops(avg_throughput, num_params, world_size, batch_seqlen)

But the function definition in training/distributed_training/pytorch/model_parallel_v2/shared-scripts/train_utils.py at line 36 is the following:
def compute_tflops(args, global_batch_size, step_time, world_size):

The arguments of the function call should be adapted: at the very least, args must be passed as the first argument (or a new function has to be defined).
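To make the mismatch concrete, here is a minimal sketch of how Python would bind the four positional arguments from the call in logging_utils.py to the parameters of the definition quoted above. The function body is a placeholder I made up for illustration (it is not the repository's FLOPs computation), and `call_site_args` and `keyword_call_fails` are hypothetical names used only in this sketch.

```python
import inspect


def compute_tflops(args, global_batch_size, step_time, world_size):
    """Stand-in: signature copied from the issue, body is a placeholder."""
    return None


# Argument names used at the call site in logging_utils.py, in order.
call_site_args = ["avg_throughput", "num_params", "world_size", "batch_seqlen"]

# Positional binding pairs each parameter with the argument in the same slot.
parameters = list(inspect.signature(compute_tflops).parameters)
binding = dict(zip(parameters, call_site_args))

for parameter, argument in binding.items():
    print(f"{parameter:<18} <- {argument}")


def keyword_call_fails():
    """Return True if calling with the call site's names as keywords raises."""
    try:
        compute_tflops(avg_throughput=1.0, num_params=1, world_size=1, batch_seqlen=1)
    except TypeError:
        return True
    return False
```

With positional arguments the call does not fail at the call boundary: avg_throughput lands in args, num_params in global_batch_size, world_size in step_time, and batch_seqlen in world_size. Calling with keyword arguments instead would likely have surfaced the problem immediately, since Python rejects unknown keyword names with a TypeError.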

florianbodr, Nov 29 '24 10:11