Ganesh
Section 5.1 has this line: "We split the articles in a balanced way, with 10k for training (5k per label), 2k for validation, and 8k for testing." But the "generator=mega~dataset=p0.94.jsonl"...
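One way to cross-check the quoted 10k/2k/8k figures is to count the records in the release file directly. Below is a minimal sketch; the field names `split` and `label` are assumptions about the jsonl schema, not something confirmed by the paper:

```python
import json
from collections import Counter

# Tally (split, label) pairs in the release file so the counts can be
# compared against the 10k train / 2k val / 8k test numbers in Section 5.1.
# The "split" and "label" field names are assumed, not verified.
counts = Counter()
with open("generator=mega~dataset=p0.94.jsonl") as f:
    for line in f:
        record = json.loads(line)
        counts[(record.get("split", "?"), record.get("label", "?"))] += 1

for (split, label), n in sorted(counts.items()):
    print(f"{split}\t{label}\t{n}")
```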
When the model emits ids larger than the size of rev_lang_vocab, it throws an IndexError at this line: https://github.com/davidjurgens/equilid/blob/master/equilid/equilid.py#L667 As a result, the predictions list...
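One way to sidestep the crash is to bounds-check each id before indexing into the vocabulary. The sketch below is hypothetical: the names `output_ids` and the list-like shape of `rev_lang_vocab` are assumptions standing in for the surrounding code, and the demo values are made up:

```python
# Hypothetical guard; names mirror the issue, values are just for the demo.
rev_lang_vocab = ["en", "es", "fr"]   # id -> language code
output_ids = [0, 2, 7]                # 7 falls outside the vocabulary
UNKNOWN = "und"                       # fallback for out-of-vocab ids

# Map each id, substituting a fallback instead of raising IndexError.
predictions = [rev_lang_vocab[i] if 0 <= i < len(rev_lang_vocab) else UNKNOWN
               for i in output_ids]
print(predictions)  # ['en', 'fr', 'und'] instead of a crash
```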
I want to finetune the 7B LLaMA checkpoint using metaseq. The LLaMA checkpoints appear to be the consolidated versions of the model, and it's not clear how to finetune a consolidated model...
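As a first step toward figuring out what resharding would be required, the consolidated checkpoint can at least be loaded and inspected with plain PyTorch. This is a minimal sketch only; it says nothing about the layout metaseq expects, and the filename follows Meta's release convention:

```python
import torch

# Load the consolidated 7B LLaMA checkpoint on CPU and list a few parameter
# names and shapes, to see what would need to be mapped/resharded for metaseq.
state = torch.load("consolidated.00.pth", map_location="cpu")

for name, tensor in list(state.items())[:10]:
    print(name, tuple(tensor.shape))
print(f"{len(state)} tensors total")
```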