llm Codegen Implementation

Open hhamud opened this issue 1 year ago • 2 comments

Completes #165

TODO:

[x] Find out how the codegen model differs from GPT-J
[ ] Implement the difference
[ ] Convert the codegen HF model to GGML
[ ] Refactor code to share with GPT-J

May 06 '23 15:05 hhamud

How different is this to the original GPT-J implementation? Can the codegen model be implemented by calling into GPT-J with a parameter to use a slightly different computation graph?

I'd really like to avoid any unnecessary duplication if possible, especially as we evolve the models.

May 06 '23 15:05 philpax

> How different is this to the original GPT-J implementation? Can the codegen model be implemented by calling into GPT-J with a parameter to use a slightly different computation graph?
>
> I'd really like to avoid any unnecessary duplication if possible, especially as we evolve the models.

That's more or less what I'm thinking of doing, but at this stage I'm trying to implement codegen's treatment of the QKV vectors.
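
For illustration only, here is a minimal sketch of how a single layout parameter could select between the two QKV treatments. It is not code from this repository: `QkvLayout` and `split_qkv` are hypothetical names, and the fused layout (`mp_num` chunks, each ordered query, value, key) is an assumption based on the Hugging Face CodeGen attention code that should be checked against the actual checkpoint. The `Separate` case simply treats its input as the three projection outputs concatenated as q, k, v.

```rust
/// How a block obtains its query/key/value vectors.
/// Hypothetical type for illustration; not part of the `llm` crate.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum QkvLayout {
    /// Three separate projections (GPT-J style); the buffer is assumed to
    /// hold their outputs concatenated as [q | k | v].
    Separate,
    /// One fused projection stored as `mp_num` chunks, each laid out as
    /// [query | value | key] (assumed codegen-style layout).
    Fused { mp_num: usize },
}

/// Split one token's projected buffer into (q, k, v), each of length `n_embd`.
pub fn split_qkv(
    layout: QkvLayout,
    buf: &[f32],
    n_embd: usize,
) -> (Vec<f32>, Vec<f32>, Vec<f32>) {
    assert_eq!(buf.len(), 3 * n_embd, "buffer must hold q, k and v");
    match layout {
        QkvLayout::Separate => (
            buf[..n_embd].to_vec(),
            buf[n_embd..2 * n_embd].to_vec(),
            buf[2 * n_embd..].to_vec(),
        ),
        QkvLayout::Fused { mp_num } => {
            assert!(
                mp_num > 0 && n_embd % mp_num == 0,
                "n_embd must be divisible by mp_num"
            );
            let local_dim = n_embd / mp_num;
            let mut q = Vec::with_capacity(n_embd);
            let mut k = Vec::with_capacity(n_embd);
            let mut v = Vec::with_capacity(n_embd);
            // Each chunk carries a slice of q, then v, then k (note the order).
            for chunk in buf.chunks_exact(3 * local_dim) {
                q.extend_from_slice(&chunk[..local_dim]);
                v.extend_from_slice(&chunk[local_dim..2 * local_dim]);
                k.extend_from_slice(&chunk[2 * local_dim..]);
            }
            (q, k, v)
        }
    }
}
```

With something along these lines, the rest of the attention computation could stay shared, and only the step that produces q, k and v would branch on the layout.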

May 06 '23 16:05 hhamud

No longer useful

May 18 '23 15:05 hhamud
