Geun, Lim

Results: 18 comments by Geun, Lim

The first picture above shows an error that occurs when using the merge method after DPO training with QLoRA. When running SFT on the 7.8B model with 2 nodes (H100*8), we use a...

The first issue was resolved by the reinstallation, thank you. I had been using v0.5.0 and then went ahead with the update this time, so it is difficult to track down the cause.

I'm training llamafied models. The same issue occurs with Qwen2.5. I'm using the settings below, with zero3.json as the --deepspeed option. Please let me know if there is...

```
cache_dir: ~/cache
environment: LOCAL_MACHINE
debug: false
deepspeed_config:
  deepspeed_config_file: /data/axolotl/deepspeed_configs/zero3.json
  deepspeed_hostfile: /data/axolotl/hosts/hostfile
  deepspeed_multinode_launcher: pdsh
  zero3_init_flag: true
distributed_type: DEEPSPEED
downcast_bf16: 'no'
enable_cpu_affinity: false
machine_rank: 0
main_process_ip: [main_ip]
main_process_port: [main_port]
main_training_function: main...
```

It works fine if you run it with version 0.4.1. It also works in the current version if you run it with cpu_offload, but that takes a very long time.
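For readers unfamiliar with the workaround, here is a minimal sketch of what enabling CPU offload could look like, assuming "cpu_offload" refers to DeepSpeed ZeRO-3 optimizer and parameter offloading. The helper name `with_cpu_offload` and the `base` config values are hypothetical and are not taken from the zero3.json mentioned in these comments.

```python
import json


def with_cpu_offload(zero3_config: dict) -> dict:
    """Return a copy of a ZeRO-3 config with optimizer and parameter offload to CPU."""
    # Deep copy via a JSON round-trip so the caller's dict is left untouched.
    cfg = json.loads(json.dumps(zero3_config))
    zero = cfg.setdefault("zero_optimization", {})
    zero["stage"] = 3
    # DeepSpeed's offload sections: move optimizer state and parameters to host memory.
    zero["offload_optimizer"] = {"device": "cpu", "pin_memory": True}
    zero["offload_param"] = {"device": "cpu", "pin_memory": True}
    return cfg


# Illustrative base config (assumed values, not the actual zero3.json).
base = {
    "zero_optimization": {"stage": 3, "overlap_comm": True},
    "bf16": {"enabled": "auto"},
    "train_micro_batch_size_per_gpu": "auto",
}

offloaded = with_cpu_offload(base)
print(json.dumps(offloaded, indent=2))
```

Offloading trades GPU memory for host-to-device transfers on every step, which would be consistent with the much longer training time reported here.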