Mayank Mishra comments

Results: 187 comments of Mayank Mishra

♻️ replace deprecated functions for communication

> @mayank31398, I think the formatting issues can be fixed by upgrading pre-commit and clang-format. I am not seeing any issues with the formatting in the CI. Are you suggesting...

RuntimeError: This event loop is already running

@syp1997 can you tell me how you are launching the job?

RuntimeError: This event loop is already running

If you launch via the Makefile, that shouldn't be a problem, since I have set it to use only 1 worker in the Makefile.

Finetuning BLOOM

This is the script used for launching 176B: https://github.com/bigscience-workshop/bigscience/blob/master/train/tr11-176B-ml/tr11-176B-ml.slurm

The architecture is not the same, since BLOOM uses ALiBi and GPT uses absolute position embeddings.
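To make the ALiBi point concrete, here is a rough sketch of how ALiBi biases attention scores by query-key distance instead of adding position embeddings to the inputs. This is not BLOOM's actual code: the function names are made up for illustration, it assumes the number of heads is a power of two, and it folds the causal mask into the bias as `-inf` for simplicity.

```python
def alibi_slopes(num_heads):
    # Hypothetical helper: geometric sequence of per-head slopes,
    # 2^(-8/n), 2^(-16/n), ..., 2^(-8). Assumes num_heads is a power of two.
    start = 2 ** (-8 / num_heads)
    return [start ** (i + 1) for i in range(num_heads)]


def alibi_bias(num_heads, seq_len):
    # Returns a [head][query][key] nested list that would be added to the
    # raw attention scores before the softmax.
    # Keys at or before the query get a penalty proportional to distance;
    # future keys get -inf (causal masking, included here as an assumption).
    bias = []
    for slope in alibi_slopes(num_heads):
        head = [
            [-slope * (q - k) if k <= q else float("-inf") for k in range(seq_len)]
            for q in range(seq_len)
        ]
        bias.append(head)
    return bias


if __name__ == "__main__":
    for row in alibi_bias(num_heads=8, seq_len=4)[0]:
        print(row)
```

With absolute embeddings, the position information lives in a learned table added to the token embeddings, so there is no equivalent of this bias term in the attention computation.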

Finetuning BLOOM

For StarCoder, 4D parallelism is used: tensor parallel, pipeline parallel, sequence parallel, and data parallel.

This is the repo used for StarCoder and SantaCoder training: https://github.com/bigcode-project/Megatron-LM

Enhancement: detach dtype for prompt embeddings from the model itself

Opened a PR for this

Any plans to support multi-task tuning?

Hey @brian-pieces, check this one out: https://github.com/huggingface/peft/pull/400. Trying to get this in.

"bloom-ds-zero-inference.py" works but "inference_server.cli --deployment_framework ds_zero" fails

This repo is no longer being maintained. @sevenandseven, I suggest using vLLM or TGI.

fix checkpoints file list to align with DeepSpeed

Hi, this repo is no longer maintained.

Inference(chatbot) does not work as expected on 2 gpus with bigscience/bloom-7b1 model

Um, I am not sure. Maybe your process is getting stuck somewhere?