[REQUEST] How to set Ulysses in DeepSpeed config JSON?

Open xs1997zju opened this issue 1 year ago • 1 comments

xs1997zju avatar Jul 22 '24 09:07 xs1997zju

Hi @xs1997zju, are you asking how to enable DeepSpeed Ulysses?

loadams avatar Jul 29 '24 19:07 loadams

> Hi @xs1997zju are you asking how to enable using Deepspeed Ulysses?

@loadams Yes. I'm currently training with the DeepSpeed plugin from Hugging Face Accelerate, and I'd like to know whether there is a way to enable Ulysses in the DeepSpeed config JSON, like this:

ds_config.json

```json
{
    "bf16": {
        "enabled": true
    },

    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {
            "device": "cpu",
            "pin_memory": true,
            "buffer_count": 10
        },
        "offload_param": {
            "device": "cpu",
            "pin_memory": true
        },
        "overlap_comm": true,
        "contiguous_gradients": true,
        "sub_group_size": 1e9,
        "reduce_bucket_size": "auto",
        "stage3_prefetch_bucket_size": 1e9,
        "stage3_param_persistence_threshold": "auto",
        "stage3_gather_16bit_weights_on_model_save": true
    },

    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
    "steps_per_print": 1,
    "train_batch_size": "auto",
    "train_micro_batch_size_per_gpu": 1,
    "wall_clock_breakdown": false
}
```

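Code:

```python
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin
from liger_kernel.transformers import AutoLigerKernelForCausalLM

# Build the plugin from the JSON config above
deepPlugin = DeepSpeedPlugin(hf_ds_config='ds_config.json', zero3_init_flag=True)
accelerator = Accelerator(deepspeed_plugin=deepPlugin)
# pretrained_model_path is defined elsewhere in the training script
model = AutoLigerKernelForCausalLM.from_pretrained(pretrained_model_path)
model = accelerator.prepare(model)
```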

xs1997zju avatar Oct 30 '24 08:10 xs1997zju
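
For anyone reusing the `ds_config.json` from this thread, a small sanity check before handing the file to Accelerate can help: several fields are set to `"auto"`, which the Hugging Face integration is expected to fill in at launch time. The sketch below uses only the standard library and a trimmed copy of the config. The helper name `find_auto_fields` is hypothetical and not part of DeepSpeed or Accelerate, and the snippet does not enable or configure Ulysses in any way.

```python
import json


def find_auto_fields(config, prefix=""):
    """Return dotted paths of every field whose value is the string "auto".

    Hypothetical helper for illustration; not a DeepSpeed or Accelerate API.
    """
    paths = []
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested sections such as "zero_optimization"
            paths.extend(find_auto_fields(value, path))
        elif value == "auto":
            paths.append(path)
    return paths


# Trimmed copy of the config posted in the thread
DS_CONFIG_TEXT = """
{
  "bf16": {"enabled": true},
  "zero_optimization": {
    "stage": 3,
    "sub_group_size": 1e9,
    "reduce_bucket_size": "auto",
    "stage3_param_persistence_threshold": "auto"
  },
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": 1
}
"""

config = json.loads(DS_CONFIG_TEXT)
auto_fields = find_auto_fields(config)
print(auto_fields)
```

Note that `json.loads` parses `1e9` as a Python float, so `sub_group_size` arrives as `1000000000.0` rather than an integer when the file is read this way.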