Hasan Abed Al Kader Hammoud
@Kimiko-AI I really think this pull request is worth finishing! Very useful - I'd love to see how Prodigy performs on LLM training, having used it before on...
+1 Same issue; currently I have a hardcoded line for Llama-3 to fetch it from the Hub.
This LGTM, but while testing it out I hit what might be an issue with Phi-3 and flash-attention. On a 4xA100 node, a warning is raised when training Phi-3...
Btw, my issue here got resolved when I turned off sample packing. Maybe Phi-3's sample packing isn't compatible with Flash-Attention; see the sketch below. @brianfitzgerald
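For anyone hitting the same warning, here is a minimal sketch of the workaround, assuming an axolotl-style yml config (`flash_attention` and `sample_packing` are standard axolotl keys; the model path and values are illustrative):
```
# Workaround sketch: keep flash-attention enabled but disable sample packing,
# which is what resolved the Phi-3 warning in my runs.
base_model: microsoft/Phi-3-mini-4k-instruct  # illustrative model path
flash_attention: true    # keep FA on
sample_packing: false    # turning this off avoided the warning
```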
FYI https://github.com/OpenAccess-AI-Collective/axolotl/issues/1683 @winglian @brianfitzgerald
This might be an issue related to the Hugging Face transformers library - I'm having the same error in a different setting.
@williambarberjr you could probably pass `max_length: 8192` in the yml file:
```
datasets:
  - path: williambarberjr/L3_8B_Instruct_MarkdownToSummaryConvert
    type: chat_template
    chat_template: llama3
    max_length: 8192
    field_messages: messages
    message_field_role: role
    message_field_content: content
    roles:
      user:
        -...
```
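If samples still get truncated at the old length, the global context window likely needs to agree as well; a minimal sketch, assuming axolotl's top-level `sequence_len` key (value illustrative):
```
# Assumption: the global context length should match the dataset's max_length;
# in axolotl this is set via the top-level sequence_len key.
sequence_len: 8192
```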
@Ahmedn1 was this ever resolved on your end? I'm seeing something similar unless I apply the multipack attn patch.
@Ahmedn1 what are you currently using?