Add support for Qwen2 models

Open g-w1 opened this issue 1 year ago • 1 comments

Description

I added support for Qwen2 models. All this entailed was fixing the Qwen2 architecture loading code to use grouped query attention, which is what Qwen2 expects anyway. Qwen1.5 used a special case of grouped query attention that is equivalent to regular attention, so this change does not break Qwen1.5.
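To illustrate why the change is backward compatible, here is a minimal sketch of how grouped query attention assigns query heads to key/value heads. It is not the code from this PR: the function names and head counts are hypothetical, and it assumes the "special case" means the number of key/value heads equals the number of query heads.

```python
# Hypothetical helpers for illustration only; not part of the library or this PR.

def kv_head_for_query_head(query_head: int, n_heads: int, n_key_value_heads: int) -> int:
    """Return the index of the key/value head that a given query head reads from."""
    if n_heads % n_key_value_heads != 0:
        raise ValueError("n_heads must be divisible by n_key_value_heads")
    # Consecutive query heads are grouped; each group shares one key/value head.
    group_size = n_heads // n_key_value_heads
    return query_head // group_size


def kv_head_mapping(n_heads: int, n_key_value_heads: int) -> list[int]:
    """Map every query head index to its key/value head index."""
    return [
        kv_head_for_query_head(q, n_heads, n_key_value_heads)
        for q in range(n_heads)
    ]


# Head counts below are made up for the example, not taken from a real config.
grouped = kv_head_mapping(n_heads=8, n_key_value_heads=2)
same_count = kv_head_mapping(n_heads=8, n_key_value_heads=8)

print(grouped)     # four query heads share each key/value head
print(same_count)  # every query head has its own key/value head
```

When the two head counts match, the mapping is the identity, so each query head attends with its own keys and values exactly as in regular multi-head attention. Under that assumption, a loader that always takes the grouped path handles both cases.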

Type of change

  • [x] New feature (non-breaking change which adds functionality)

Checklist:

  • [x] I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [x] My changes generate no new warnings
  • [ ] I have added tests that prove my fix is effective or that my feature works
  • [ ] New and existing unit tests pass locally with my changes
  • [x] I have not rewritten tests relating to key interfaces which would affect backward compatibility

g-w1 avatar Jul 08 '24 23:07 g-w1

Great! I should be able to review this and get it into a release early next week.

bryce13950 avatar Jul 13 '24 00:07 bryce13950

Hey! Sorry for not getting to this earlier. I got pulled away to wrap up a couple of things. Looking at it now. Will let you know if anything odd pops up!

bryce13950 avatar Jul 23 '24 21:07 bryce13950

I am going to go ahead and merge this. The implementation seems to be a bit inaccurate, but I don't think that has anything to do with what has been done here, since the same inaccuracies are in Qwen-1X models. Just a word of caution if anyone sees this and decides to start using it.

bryce13950 avatar Jul 23 '24 22:07 bryce13950