Gangsheng Wu

Results: 7 issues by Gangsheng Wu

Replace Checkpoint usage with TorchCheckpoint/TensorflowCheckpoint

### Describe the issue
Hi, I have the following code:
```python
import dpctl
import torch
import intel_extension_for_pytorch

xpu_num = len(dpctl.get_devices(backend="level_zero", device_type="gpu"))
print(f"xpu_num = {xpu_num}")
# device = torch.device("xpu:0")
device = torch.device("cpu:0")
```
...

XPU/GPU
UserExperience

My environment:
```bash
export ONEAPI_DEVICE_SELECTOR="level_zero:gpu"
```
```bash
$ sycl-ls
Warning: ONEAPI_DEVICE_SELECTOR environment variable is set to level_zero:gpu. To see the correct device id, please unset ONEAPI_DEVICE_SELECTOR.
[ext_oneapi_level_zero:gpu:0] Intel(R)...
```

## Why are these changes needed? To leverage the potential of the Intel Gaudi accelerator, we extend Ray Train's capabilities by adding support for Intel Gaudi (HPU) hardware. This PR include...

triage
train

### Feature request I see that release version 1.12 supports FP8, but I didn't find any example code showing how to train an LLM using FP8. How can I...

# What does this PR do?
## Background
When I run inference with the command:
```bash
INPUT=32768 OUTPUT=32768 BATCH_SIZE=12 python gaudi_spawn.py --use_deepspeed --world_size 8 run_generation.py \
    --model_name_or_path Meta-Llama-3.1-70B-Instruct/ \
    --max_input_tokens...
```

review wip