Results: 4 comments by Ashvini Jindal
Hi @mmliang, apologies for the late response. Have you already made changes to the existing code base to predict arc_label as well? A few things that will change are data reading, the...
Hi @shihe123, apologies for the late response. The model will be saved under `data/params_*`. Please have a look at the method `def compute_dependencies()`.
Hi @yohan-pg, on a single 4090 GPU, I am getting 29% MFU while training the GPT-2 124M model. What is your GPU setup?
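For context, MFU (model FLOPs utilization) is the ratio of the FLOPs a training run actually achieves per second to the hardware's peak FLOPs per second. Below is a minimal sketch of how such a figure could be estimated. It assumes the commonly used approximation of `6 * N + 12 * L * H * Q * T` FLOPs per token for a forward and backward pass (N parameters, L layers, H heads, Q head dimension, T sequence length). The function names, the argument names, and the `peak_flops` value are illustrative assumptions, not taken from the thread or from any particular code base.

```python
def flops_per_token(n_params, n_layer, n_head, head_dim, seq_len):
    """Approximate training FLOPs per token (forward + backward).

    Assumption: 6 FLOPs per parameter plus an attention term of
    12 * n_layer * n_head * head_dim * seq_len.
    """
    return 6 * n_params + 12 * n_layer * n_head * head_dim * seq_len


def estimate_mfu(n_params, n_layer, n_head, head_dim, seq_len,
                 tokens_per_iter, seconds_per_iter, peak_flops):
    """Return MFU as a fraction (0.29 would correspond to 29%).

    peak_flops is the hardware's peak FLOPs per second for the precision
    in use; it must be supplied by the caller from the vendor's spec sheet.
    """
    per_iter = flops_per_token(n_params, n_layer, n_head, head_dim, seq_len) * tokens_per_iter
    achieved_per_second = per_iter / seconds_per_iter
    return achieved_per_second / peak_flops


if __name__ == "__main__":
    # Hypothetical throughput numbers for a GPT-2 124M-sized configuration.
    mfu = estimate_mfu(
        n_params=124_000_000, n_layer=12, n_head=12, head_dim=64, seq_len=1024,
        tokens_per_iter=12 * 1024,   # hypothetical: batch of 12 sequences
        seconds_per_iter=0.25,       # hypothetical measured step time
        peak_flops=1.0e14,           # hypothetical placeholder, not a real spec
    )
    print(f"estimated MFU: {mfu:.2%}")
```

Because the result scales inversely with `peak_flops`, two people reporting MFU on the same GPU can get different percentages if they assume different peak figures (for example, different precisions), which is worth checking when comparing setups.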
I am also seeing a similar issue where the loss is trending downwards but is quite unstable, and the model seems to learn very slowly. I am running full fine-tuning of the latest Phi2 model...