Codegen Implementation

Open hhamud opened this issue 1 year ago • 2 comments

Completes #165

TODO:

  • [x] Find out how the CodeGen model differs from GPT-J
  • [ ] Implement the difference
  • [ ] Convert the CodeGen HF model to GGML
  • [ ] Refactor the code to share it with GPT-J

hhamud avatar May 06 '23 15:05 hhamud

How different is this to the original GPT-J implementation? Can the codegen model be implemented by calling into GPT-J with a parameter to use a slightly different computation graph?

I'd really like to avoid any unnecessary duplication if possible, especially as we evolve the models.

philpax avatar May 06 '23 15:05 philpax

> How different is this to the original GPT-J implementation? Can the codegen model be implemented by calling into GPT-J with a parameter to use a slightly different computation graph?
>
> I'd really like to avoid any unnecessary duplication if possible, especially as we evolve the models.

That's more or less what I'm planning to do. At the moment I'm trying to implement CodeGen's treatment of the QKV vectors.
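
A rough sketch of what "GPT-J with a parameter" could look like, limited to the QKV step. Everything here is hypothetical: `QkvLayout`, `split_fused` and `qkv` are illustrative names, not part of the existing GPT-J code, and the assumption that the CodeGen path uses one fused projection split into equal q, k, v chunks in that order has not been checked against the real model.

```rust
/// Hypothetical switch for how Q, K and V are obtained in an attention block.
#[derive(Clone, Copy, Debug, PartialEq)]
enum QkvLayout {
    /// Three separate projection outputs (assumed here for the GPT-J path).
    Separate,
    /// One fused projection output split in three (assumed for the CodeGen path).
    Fused,
}

/// Splits a fused projection output of length 3 * n_embd into (q, k, v).
/// The q, k, v ordering is an assumption made for illustration only.
fn split_fused(fused: &[f32], n_embd: usize) -> (Vec<f32>, Vec<f32>, Vec<f32>) {
    assert_eq!(fused.len(), 3 * n_embd, "fused output must hold q, k and v");
    let (q, rest) = fused.split_at(n_embd);
    let (k, v) = rest.split_at(n_embd);
    (q.to_vec(), k.to_vec(), v.to_vec())
}

/// Shared entry point: only this step branches on the layout, so the rest of
/// the computation graph could stay common to both models.
fn qkv(
    layout: QkvLayout,
    projections: &[Vec<f32>],
    n_embd: usize,
) -> (Vec<f32>, Vec<f32>, Vec<f32>) {
    match layout {
        QkvLayout::Separate => {
            assert_eq!(projections.len(), 3, "expected separate q, k, v outputs");
            (
                projections[0].clone(),
                projections[1].clone(),
                projections[2].clone(),
            )
        }
        QkvLayout::Fused => {
            assert_eq!(projections.len(), 1, "expected one fused output");
            split_fused(&projections[0], n_embd)
        }
    }
}
```

With the branch confined to one function, both layouts return the same `(q, k, v)` shape, so everything downstream of this step would not need to know which model is being evaluated.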

hhamud avatar May 06 '23 16:05 hhamud

No longer useful

hhamud avatar May 18 '23 15:05 hhamud