mmengine
[Bug] Using Deepspeed: the meaning of inputs_to_half=[0]
Prerequisite
- [X] I have searched Issues and Discussions but cannot get the expected help.
- [X] The bug has not been fixed in the latest version (https://github.com/open-mmlab/mmengine).
Environment
latest version of mmengine
Reproduces the problem - code sample
Here is a common config:
strategy = dict(
    type='DeepSpeedStrategy',
    fp16=dict(
        enabled=True,
        auto_cast=False,
        fp16_master_weights_and_grads=False,
        loss_scale=0,
        loss_scale_window=1000,
        hysteresis=1,
        min_loss_scale=1,
        initial_scale_power=16,
    ),
    # bf16=dict(
    #     enabled='auto',
    # ),
    # inputs_to_half=['images'],
    inputs_to_half=[0],
    zero_optimization=dict(
        stage=3,
        allgather_partitions=True,
        allgather_bucket_size=2e8,
        reduce_scatter=True,
        reduce_bucket_size='auto',
        overlap_comm=True,
        contiguous_gradients=True,
    ),
)
Here `inputs_to_half` is set to `[0]`. However, in https://github.com/open-mmlab/mmengine/blob/e43bbb5e03a412ea07f9ef6d0f118586082c2845/mmengine/_strategy/deepspeed.py#L179C1-L206C55
the cast to half precision is never triggered, because the inputs are always of type dict, so the integer index `0` does not match any key.
Reproduces the problem - command or script
n/a
Reproduces the problem - error message
Dtypes of the inputs and the model weights are not the same
Additional information
No response
If `inputs` is a dict, `inputs_to_half` should be a list of str.
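
To make the difference concrete, here is a minimal, standalone sketch of index-based versus key-based selection. It is not mmengine's actual code: `cast_selected`, `to_half`, the `labels` key, and the string stand-ins for tensors are all hypothetical, and the only assumption is that dict inputs are matched by key and list/tuple inputs by position.

```python
def cast_selected(inputs, selectors, cast):
    """Apply `cast` only to the entries of `inputs` named in `selectors`.

    Hypothetical helper for illustration: positional inputs are matched
    by integer index, dict inputs are matched by key.
    """
    if isinstance(inputs, (list, tuple)):
        # Positional inputs: selectors are integer indices.
        return type(inputs)(
            cast(v) if i in selectors else v for i, v in enumerate(inputs)
        )
    if isinstance(inputs, dict):
        # Dict inputs: selectors must be keys, so an index like 0 never matches.
        return {k: cast(v) if k in selectors else v for k, v in inputs.items()}
    raise TypeError(f"unsupported inputs type: {type(inputs)!r}")


# Stand-in for a real half-precision cast such as tensor.half().
def to_half(x):
    return f"half({x})"


batch = {"images": "fp32_tensor", "labels": "int_tensor"}

by_index = cast_selected(batch, [0], to_half)        # nothing is cast
by_key = cast_selected(batch, ["images"], to_half)   # only "images" is cast
positional = cast_selected(("fp32_tensor", "int_tensor"), [0], to_half)
```

With a dict batch, `[0]` silently leaves every entry in full precision, which would be consistent with the dtype mismatch reported above, while `['images']` selects the intended entry. The key to list depends on what the data pipeline actually puts in the dict.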