Heyang Sun
> > Hi Yang, from our debug sync you indicated that on the same machine your fellow team members were not seeing issues on the 8-GPU config. May I kindly ask...
Enabled `prepare_past_kv`, `prepare_draft_past_kv` and `update_kv`, and tested on the 15.5B and tiny StarCoder models.
Hi @jiafuzha , the error indicates that an unexpected argument is passed to the [BigDL-wrapped forward](https://github.com/intel-analytics/BigDL/blob/main/python/llm/src/bigdl/llm/transformers/models/mpt.py#L32) of MPT attention. This happens because BigDL currently only supports [mosaicml/mpt-7b-chat](https://huggingface.co/mosaicml/mpt-7b-chat) and [mosaicml/mpt-30b-chat](https://huggingface.co/mosaicml/mpt-30b-chat) that...
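For reference, a minimal sketch of loading one of the currently supported MPT chat models through BigDL-LLM's transformers-style API; the prompt and generation settings below are only illustrative:

```python
import torch
from transformers import AutoTokenizer
from bigdl.llm.transformers import AutoModelForCausalLM

model_path = "mosaicml/mpt-7b-chat"  # mosaicml/mpt-30b-chat is also supported

# load_in_4bit=True applies BigDL-LLM's low-bit optimization;
# trust_remote_code=True is needed for MPT's custom modeling code.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    load_in_4bit=True,
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

prompt = "What is speculative decoding?"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```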
Hi @jiafuzha , please wait a bit; I am working on supporting this feature.
Hi @jiafuzha , rotary embedding has been enabled for MPT in #10208 , you can upgrade `bigdl-llm` in your environment with `pip install --pre --upgrade bigdl-llm[all]`.
Is `import intel_extension_for_pytorch as ipex` necessary? The import performs some initialization work. @rnwang04
Hi @ElliottDyson , thanks for your proposal. Currently we provide many fine-tuning options, e.g. ReLoRA, axolotl and DPO, as shown [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/LLM-Finetuning#running-llm-finetuning-using-ipex-llm-on-intel-gpu), as well as [GaLore](https://github.com/intel-analytics/ipex-llm/pull/10722) and [LISA](https://github.com/intel-analytics/ipex-llm/pull/10743) on the way,...
@leonardozcm please take a look: is this something that our kernel does not support? Thanks.
@Jasonzzt From the log, it appears that PPO also applies PEFT LoRA. Therefore, as with QLoRA, rather than calling `from_pretrained` on a PEFT model with a LoRA config, we should first load the...
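A minimal sketch of that loading order, assuming the standard PEFT API; the model id and LoRA hyperparameters here are placeholders, not the actual PPO configuration:

```python
from bigdl.llm.transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# 1. Load the base model with BigDL-LLM low-bit optimization first ...
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder model id
    load_in_4bit=True,
    trust_remote_code=True,
)

# 2. ... then attach the LoRA adapter on top of it, instead of calling
#    `from_pretrained` on an already-wrapped PEFT model.
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```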
> > @caoyu-noob, you can use the `zero_to_fp32.py` script to convert the ZeRO-3 checkpoints into a regular PyTorch checkpoint. You can find documentation of this script and other checkpoint conversion...
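A minimal sketch of the same conversion through DeepSpeed's Python helpers (the paths are placeholders); the standalone `zero_to_fp32.py` script that DeepSpeed places in the checkpoint directory offers the equivalent command-line flow:

```python
import torch
from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

# Consolidate the sharded ZeRO-3 checkpoint into a single fp32 state dict ...
state_dict = get_fp32_state_dict_from_zero_checkpoint("path/to/checkpoint_dir")

# ... and save it as a regular PyTorch checkpoint.
torch.save(state_dict, "pytorch_model.bin")
```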