One


Any updates on this PR? It's important for evaluating chat/instruction-finetuned models.

I manually modified the hash of AESLC in tensorflow-datasets, and it worked fine.
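For reference, this is roughly the kind of manual patch I mean. The file locations are from my environment and will differ depending on where tensorflow-datasets is installed and which version you have, so treat it as a sketch rather than an exact recipe:

```python
import hashlib
from pathlib import Path

# Placeholder paths from my setup -- adjust to your environment.
archive = Path("~/tensorflow_datasets/downloads/enron_aeslc.zip").expanduser()
checksum_file = Path(
    "/path/to/site-packages/tensorflow_datasets/url_checksums/aeslc.txt"
)

# sha256 of the archive that was actually downloaded.
new_hash = hashlib.sha256(archive.read_bytes()).hexdigest()
print("new sha256:", new_hash, "size:", archive.stat().st_size)

# Copy the stale hash out of the existing aeslc.txt entry, swap it for the
# new one, then re-run download_and_prepare.
stale_hash = "<stale sha256 from the aeslc.txt entry>"
text = checksum_file.read_text()
checksum_file.write_text(text.replace(stale_hash, new_hash))
```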

What is the seed file for the WizardLM and WizardCoder datasets?

BTW, the scripts seem to be missing the error checker and iterative evolution described in the paper. Are these parts necessary?
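For context, my (possibly wrong) reading of the paper is a loop along these lines; `evolve_fn` and `failed_fn` are stand-ins for the evolution prompt and the elimination/error check, which I don't see in the released scripts:

```python
from typing import Callable, List

def evol_instruct(
    seeds: List[str],
    evolve_fn: Callable[[str], str],        # LLM call with an in-depth/in-breadth prompt
    failed_fn: Callable[[str, str], bool],  # elimination ("error") check from the paper
    rounds: int = 4,
) -> List[str]:
    """Sketch of the iterative evolution I expected to find in the scripts."""
    pool = list(seeds)
    current = list(seeds)
    for _ in range(rounds):
        next_round = []
        for instruction in current:
            candidate = evolve_fn(instruction)
            if failed_fn(instruction, candidate):
                # Failed evolution: keep the original instruction for the next round.
                next_round.append(instruction)
            else:
                next_round.append(candidate)
                pool.append(candidate)
        current = next_round
    return pool
```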

@Green-Sky We observed that fine-tuning may still cause performance degradation. It is better to have a model natively pretrained with an 8192 context length.

Thanks! How does it compare to natively long-context base models such as StarCoder (8192)? BTW, if we want an 8192 version of OpenLLaMA, maybe we need a JAX FlashAttention...
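To be clear, I don't mean anything fancy; even query-chunked attention would already help at 8192. A minimal sketch (not a real FlashAttention kernel, and the shapes are just assumptions):

```python
import jax
import jax.numpy as jnp

def chunked_attention(q, k, v, chunk_size: int = 1024):
    """Plain softmax attention computed one query chunk at a time.

    q, k, v: [seq_len, num_heads, head_dim]. This only reduces peak memory
    from O(L^2) to O(chunk * L); a real FlashAttention kernel also fuses the
    softmax so the full logits matrix is never materialized.
    """
    seq_len, _, head_dim = q.shape
    scale = head_dim ** -0.5
    out_chunks = []
    for start in range(0, seq_len, chunk_size):
        q_chunk = q[start:start + chunk_size] * scale
        logits = jnp.einsum("qhd,khd->hqk", q_chunk, k)
        weights = jax.nn.softmax(logits, axis=-1)
        out_chunks.append(jnp.einsum("hqk,khd->qhd", weights, v))
    return jnp.concatenate(out_chunks, axis=0)
```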

Also excited to see V2 13B! It would be even better with stronger coding ability and a native 8192 context length.

I think it's related to how Transformers loads models: it checks the Hugging Face Hub for model updates every time. One temporary solution would be downloading to a local folder and...
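Something along these lines should work as the temporary workaround (the model name is just an example):

```python
from huggingface_hub import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download once into a local folder (example model name).
local_dir = snapshot_download("openlm-research/open_llama_7b")

# Afterwards, load from disk only -- no Hub request on every startup.
# (Setting the HF_HUB_OFFLINE=1 environment variable has a similar effect.)
tokenizer = AutoTokenizer.from_pretrained(local_dir, local_files_only=True)
model = AutoModelForCausalLM.from_pretrained(local_dir, local_files_only=True)
```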