
[Bug] Using Deepspeed: the meaning of inputs_to_half=[0]

Open KyanChen opened this issue 1 year ago • 1 comment

Prerequisite

  • [X] I have searched Issues and Discussions but cannot get the expected help.
  • [X] The bug has not been fixed in the latest version (https://github.com/open-mmlab/mmengine).

Environment

latest version of mmengine

Reproduces the problem - code sample

Here is a common config.

strategy = dict(
    type='DeepSpeedStrategy',
    fp16=dict(
        enabled=True,
        auto_cast=False,
        fp16_master_weights_and_grads=False,
        loss_scale=0,
        loss_scale_window=1000,
        hysteresis=1,
        min_loss_scale=1,
        initial_scale_power=16,
    ),
    # bf16=dict(
    #     enabled='auto',
    # ),
    # inputs_to_half=['images'],
    inputs_to_half=[0],
    zero_optimization=dict(
        stage=3,
        allgather_partitions=True,
        allgather_bucket_size=2e8,
        reduce_scatter=True,
        reduce_bucket_size='auto',
        overlap_comm=True,
        contiguous_gradients=True,
    ),
)

Here inputs_to_half is set to [0]. However, the cast in https://github.com/open-mmlab/mmengine/blob/e43bbb5e03a412ea07f9ef6d0f118586082c2845/mmengine/_strategy/deepspeed.py#L179C1-L206C55 will never be triggered, because inputs is always a dict and the index 0 does not match any of its keys.
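To illustrate the mismatch, here is a minimal, torch-free sketch of index-based versus key-based selection. It is not mmengine's actual code: `cast_selected`, `to_half`, and the string stand-ins for tensors are hypothetical names chosen for the example, and the `'images'` key is taken from the commented-out line in the config above.

```python
from typing import Any, Callable, Sequence, Union


def cast_selected(inputs: Union[list, tuple, dict],
                  selectors: Sequence[Any],
                  cast: Callable[[Any], Any]) -> Union[list, tuple, dict]:
    """Apply `cast` only to the entries of `inputs` named by `selectors`.

    Hypothetical helper: positions select entries of a list/tuple,
    keys select entries of a dict.
    """
    if isinstance(inputs, (list, tuple)):
        # Positional inputs: selectors are compared against integer indices.
        return type(inputs)(
            cast(v) if i in selectors else v for i, v in enumerate(inputs))
    if isinstance(inputs, dict):
        # Dict inputs: selectors are compared against the keys.
        return {k: (cast(v) if k in selectors else v)
                for k, v in inputs.items()}
    raise TypeError(f'unsupported inputs type: {type(inputs)}')


def to_half(value: str) -> str:
    """Stand-in for a real dtype cast; it only marks the value as cast."""
    return f'half({value})'


batch = {'images': 'fp32_tensor', 'labels': 'int_tensor'}

# An integer selector matches no dict key, so nothing is cast.
by_index = cast_selected(batch, [0], to_half)

# A string selector matches the 'images' key, so that entry is cast.
by_key = cast_selected(batch, ['images'], to_half)

# The same integer selector does work when the inputs are positional.
positional = cast_selected(('fp32_tensor', 'int_tensor'), [0], to_half)
```

Under these assumptions, `by_index` is identical to `batch`, while `by_key` and `positional` each have their first entry cast. That would be consistent with the dtype mismatch reported below when `[0]` is used with dict inputs.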

Reproduces the problem - command or script

n/a

Reproduces the problem - error message

Dtypes of the inputs and the model weights are not the same

Additional information

No response

KyanChen, Oct 27 '23 13:10

If inputs is a dict, inputs_to_half should be a list of str (the keys to cast) rather than a list of integer indices.

HAOCHENYE, Oct 30 '23 11:10