Ashvini Jindal

Results: 4 comments by Ashvini Jindal

Hi @mmliang, apologies for the late response. Have you already made changes to the existing codebase to predict arc_label as well? A few things that will change are data reading, the...

Hi @shihe123, apologies for the late response. The model will be saved under `data/params_*`. Please have a look at the method `def compute_dependencies()`.

Hi @yohan-pg, on a single 4090 GPU, I am getting 29% MFU while training the GPT-2 124M model. What is your GPU setup?
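
For context on how an MFU (model FLOPs utilization) figure like this is typically obtained, here is a minimal sketch. It assumes the common approximation of 6N FLOPs per token for the weight matmuls plus 12·L·H·Q·T for attention; the function name `estimate_mfu` and all of its parameters are hypothetical, and `peak_flops` must be supplied by the reader for their own GPU and precision. This is not necessarily how the number in the comment was measured.

```python
def estimate_mfu(n_params, n_layer, n_head, head_dim, seq_len,
                 tokens_per_iter, seconds_per_iter, peak_flops):
    """Return achieved training FLOPs as a fraction of peak_flops.

    All arguments are illustrative; peak_flops is an assumption the
    caller provides (hardware- and precision-dependent).
    """
    # Approximate FLOPs per token for one forward + backward pass:
    # 6 * N for the weight matmuls, plus 12 * L * H * Q * T for attention.
    flops_per_token = 6 * n_params + 12 * n_layer * n_head * head_dim * seq_len
    # Total FLOPs in one training iteration.
    flops_per_iter = flops_per_token * tokens_per_iter
    # Achieved throughput in FLOPs per second.
    achieved_flops = flops_per_iter / seconds_per_iter
    return achieved_flops / peak_flops
```

For a GPT-2 124M run one would pass `n_layer=12`, `n_head=12`, `head_dim=64`, and `seq_len=1024`, together with the measured tokens per iteration, the measured iteration time, and the peak throughput of the GPU in use.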

I am also seeing a similar issue where the loss is trending downwards but is quite unstable, and the model seems to learn very slowly. I am running full fine-tuning of the latest Phi2 model...