Kevin Canwen Xu

Results 36 comments of Kevin Canwen Xu

The server's at http://deplo-mlflo-1s4xwzhh8tic4-97cf518635d8c72d.elb.us-east-2.amazonaws.com/

Hi @Harry-zzh, thanks for your interest in our work! Just to confirm: "3% lower than reported" is in absolute terms, right? Then this is lower than all baselines in Table 1, even...

@Harry-zzh Thanks for the info. Is it on the test set (i.e., the GLUE server) or the validation set? If it's on the test set, could you please also provide the results on the...

By the way, in our NLP experiments, the students in our implementations of KD and of our approach are initialized with pretrained BERT (the well-read student) rather than with the fine-tuned teacher. That's probably the...

Please check the releases of this repository for the encrypted zip. Use the password you receive when you complete the Google Form to decompress it.

Could you try upgrading Transformers? Could you also print the input? I'm not quite sure about this error.

We'll release the code soon. It's actually very simple: we just ask ChatGPT to pick the best response and use that to fine-tune Baize.
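A minimal sketch of that data-selection step, under stated assumptions: `judge` below is a hypothetical stand-in for a call to ChatGPT (in practice it would send the prompt and candidate responses to the API and return the index of the best one); the function names are illustrative, not from the released code.

```python
from typing import Callable, List, Tuple

def build_self_distill_pairs(
    prompts: List[str],
    candidates: List[List[str]],
    judge: Callable[[str, List[str]], int],
) -> List[Tuple[str, str]]:
    """For each prompt, keep only the judge-picked best response
    as a (prompt, response) fine-tuning pair."""
    pairs = []
    for prompt, cands in zip(prompts, candidates):
        best = judge(prompt, cands)  # hypothetical: ChatGPT picks the best candidate
        pairs.append((prompt, cands[best]))
    return pairs

# Toy judge for illustration only: prefer the longest candidate
# (a real judge would query ChatGPT instead).
longest = lambda prompt, cands: max(range(len(cands)), key=lambda i: len(cands[i]))

data = build_self_distill_pairs(
    ["What is 2+2?"],
    [["4", "The answer is 4."]],
    longest,
)
# data -> [("What is 2+2?", "The answer is 4.")]
```

The resulting pairs would then be used as the fine-tuning set for the model.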

> Is that really self-distillation? It sounds more like synthetic data generation and you're still distilling ChatGPT into the model.
>
> Don't get me wrong, it's a...

> Right, but it's the intelligence of ChatGPT you're distilling into your model.
>
> If a child learning math gives 4 answers to a math question, and...

This seems to be a problem with int8. In our tests, it is indeed slower than fp16. We'll investigate this.