lean_transformer

Inference / LeanGPT.generate

Open justheuristic opened this issue 4 years ago • 1 comment

This is the master discussion for memory-efficient inference. Further notes will be added shortly.

Current quest stage: add a dummy cache that is passed to all attention layers
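To make the "dummy cache" stage concrete, here is a minimal sketch of the plumbing only: one cache object created per call and handed to every attention layer. It is written in plain Python rather than PyTorch to stay focused on the call path, and every name in it (DummyCache, ToyAttention, ToyModel) is hypothetical, not part of lean_transformer. The cache deliberately stores nothing useful yet; it only records which layers touched it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DummyCache:
    """Hypothetical placeholder cache: counts calls per layer, stores no keys/values."""
    calls: Dict[int, int] = field(default_factory=dict)

    def update(self, layer_index: int, keys: List[float], values: List[float]):
        # A real cache would append keys/values to what it saw on earlier steps.
        # The dummy version returns its inputs unchanged.
        self.calls[layer_index] = self.calls.get(layer_index, 0) + 1
        return keys, values


class ToyAttention:
    """Stand-in for an attention layer; the attention math itself is omitted."""

    def __init__(self, layer_index: int):
        self.layer_index = layer_index

    def forward(self, hidden: List[float], cache: Optional[DummyCache] = None):
        keys, values = hidden, hidden  # stand-ins for key/value projections
        if cache is not None:
            keys, values = cache.update(self.layer_index, keys, values)
        return hidden  # identity in place of real attention output


class ToyModel:
    def __init__(self, num_layers: int):
        self.layers = [ToyAttention(i) for i in range(num_layers)]

    def forward(self, hidden: List[float], cache: Optional[DummyCache] = None):
        for layer in self.layers:
            hidden = layer.forward(hidden, cache=cache)  # same cache for all layers
        return hidden


model = ToyModel(num_layers=3)
cache = DummyCache()
output = model.forward([1.0, 2.0], cache=cache)
```

Once every layer accepts the cache argument, the dummy update could be swapped for one that actually keeps past keys and values without changing any call sites.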

justheuristic, Mar 15 '22 13:03

Ideally, this should be available as a .generate method on LeanGPTForPreTraining: https://github.com/learning-at-home/lean_transformer/blob/main/lean_transformer/models/gpt.py#L184
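For reference, here is a rough sketch of the loop such a method would wrap, assuming plain greedy decoding. It is not the LeanGPTForPreTraining API: greedy_generate, next_token_logits and toy_logits are hypothetical names, and the model is replaced by a toy function so the snippet runs with the standard library alone.

```python
from typing import Callable, List, Optional, Sequence


def greedy_generate(
    next_token_logits: Callable[[List[int]], Sequence[float]],
    input_ids: Sequence[int],
    max_new_tokens: int,
    eos_token_id: Optional[int] = None,
) -> List[int]:
    """Append the highest-scoring token until max_new_tokens or EOS is reached."""
    tokens = list(input_ids)
    for _ in range(max_new_tokens):
        # Without a cache, the whole prefix is re-processed on every step.
        logits = next_token_logits(tokens)
        next_id = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_id)
        if eos_token_id is not None and next_id == eos_token_id:
            break
    return tokens


def toy_logits(tokens: List[int], vocab_size: int = 5) -> List[float]:
    """Toy stand-in for a model: always prefers (last token + 1) modulo vocab_size."""
    target = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


generated = greedy_generate(toy_logits, [0], max_new_tokens=3)
stopped_early = greedy_generate(toy_logits, [0], max_new_tokens=10, eos_token_id=2)
```

The loop calls the model on the full token list at each step, which is the repeated work that passing a cache to the attention layers is meant to avoid.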

justheuristic, Mar 15 '22 21:03