sujithjoseph
Let me commit this by next week.

Sujith

On Fri, Jan 28, 2022 at 11:13 PM Maarten Grootendorst < ***@***.***> wrote:
> Thus far, I have not tried working with...
I was able to re-create the config file by training on a smaller data set and then saving the model with:

```python
finalmodel = accelerator.unwrap_model(model)
finalmodel.save_pretrained(peft_model_id)
```
How can I easily do inference from a PeftModelForSeq2SeqLM model using Hugging Face pipelines, like this?

```python
import torch
from transformers import pipeline

summarizer = pipeline("summarization", "cdcFT5lra", torch_dtype=torch.bfloat16)
raw_document = 'You must...
```
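One possible route (a sketch, not a verified answer: it assumes the adapter saved under "cdcFT5lra" carries a PeftConfig recording its base model, and it downloads the base weights) is to load the base model, attach the PEFT adapter, and hand the assembled model and tokenizer to the pipeline:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline
from peft import PeftConfig, PeftModel

peft_model_id = "cdcFT5lra"  # adapter directory produced by save_pretrained above

# The adapter config records which base model it was trained on.
config = PeftConfig.from_pretrained(peft_model_id)
base = AutoModelForSeq2SeqLM.from_pretrained(
    config.base_model_name_or_path, torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, peft_model_id)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# pipeline() accepts an instantiated model + tokenizer instead of a model id.
summarizer = pipeline("summarization", model=model, tokenizer=tokenizer)
```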
Thanks @pacman100, really appreciate it! Had a follow-up question. I was trying to load the model in int8:

```python
max_memory={0: "30GIB", 1: "0GIB", 2: "0GIB", 3: "0GIB", 4:...
```
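For reference, a minimal sketch of how a `max_memory` map like the one above can be built programmatically. The helper name is hypothetical; the map's shape (GPU index to size string, pinning everything to GPU 0) mirrors the snippet above:

```python
def make_max_memory(num_gpus: int, first_gpu_gib: int) -> dict:
    """Hypothetical helper: build a max_memory map that pins all weights
    to GPU 0 and gives the remaining GPUs a zero budget. Keys are GPU
    indices; values are size strings as used with device_map="auto"."""
    return {i: f"{first_gpu_gib if i == 0 else 0}GiB" for i in range(num_gpus)}

print(make_max_memory(3, 30))  # → {0: '30GiB', 1: '0GiB', 2: '0GiB'}
```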
The fine-tuned flan-t5-xxl takes around 10-20 seconds on a single 40 GB A100 GPU to answer a prompt. Is there anything that can be done to make it...
Thanks a lot @pacman100 @mayank31398! This has been really insightful! I didn't know that converting the model to TensorRT and serving it via the TRT inference server would be more...
I also see quality issues with the fine-tuned flan-t5-xxl (trained on 500K records), unlike the original model; it's hallucinating a lot. I had used a batch size of 1, as I...
Yes, DeepSpeed ZeRO 3. It worked fine with a batch size of 1, but not 2. I am concerned that the lower batch size is impacting model quality. I had 500K records as...
@mayank31398 I had started with 4 and expanded to 8. My final config has num proc as 8. Doesn't this enable CPU offloading?

```
offload_optimizer_device: cpu
offload_param_device: cpu
```
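For context, a sketch of where those keys sit in an `accelerate` DeepSpeed config file. Everything except the two `offload_*` lines is an illustrative assumption, not taken from this thread:

```
# Illustrative accelerate config fragment (assumed values, except the
# two offload_* keys quoted in the comment above).
distributed_type: DEEPSPEED
num_processes: 8
deepspeed_config:
  zero_stage: 3
  offload_optimizer_device: cpu
  offload_param_device: cpu
```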
I also changed this in the final config: `dynamo_backend: 'INDUCTOR'`