ecco minGPT support

minGPT is a minimal PyTorch re-implementation of OpenAI GPT by @karpathy: https://github.com/karpathy/minGPT

It's extremely readable and great for teaching purposes. I was wondering if you would consider extending ecco to also support it?

If I were to work on it, would you prefer to see it in this repository or in a separate "ecco-mingpt" repository?

Feb 23 '21 14:02 MasterScrat

I'm actively working on a mechanism to enable more and more models in the https://github.com/jalammar/ecco/tree/batch-input-activations branch. But they're all from Hugging Face.

Is there a pre-trained minGPT that people can download? What features are you mostly interested in? (Different features require wiring different parts of the Ecco wrapper class into the underlying model.)

Feb 24 '21 09:02 jalammar

minGPT doesn't come with pre-trained models, but you can easily train them yourself. The goal is to go through the whole process to understand how things work rather than to provide SotA checkpoints.

For example, this notebook trains a full character-level GPT: https://github.com/karpathy/minGPT/blob/master/play_char.ipynb

It would be great to be able to see the character-level saliency of the output!

There's also a small-scale "image GPT" notebook. It could also be super cool to visualize image saliency, but I guess that would require a lot more work on the UI side: https://github.com/karpathy/minGPT/blob/master/play_image.ipynb

Feb 24 '21 11:02 MasterScrat

I love character-level models. Indeed, watching their saliency would be interesting. I also agree that the text version is more relevant than the image version for now.

I'd love to add the support to Ecco. Other users have requested adding local models (instead of only relying on online pre-trained models). This sounds like a good model to start with, given it's small and simple.

For saliency to work, Ecco needs to know the name of the embedding layer. These are now specified in https://github.com/jalammar/ecco/blob/main/src/ecco/model-config.yaml. A local model would need to tell Ecco about its embedding layer somehow. For minGPT, we can likely just include it in the list.
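As a rough sketch of the idea (not Ecco's actual API): a registry maps a model name to the dotted attribute path of its embedding layer, and a local model can supply its own path instead. The registry contents, the function name `resolve_embedding_layer`, and the paths `transformer.wte` and `tok_emb` are assumptions for illustration. Plain stand-in objects are used in place of real PyTorch modules so the snippet has no dependencies.

```python
from functools import reduce
from types import SimpleNamespace

# Hypothetical registry, loosely modelled on the idea behind model-config.yaml:
# model name -> dotted attribute path of its token-embedding layer.
# Both the keys and the paths are illustrative assumptions.
EMBEDDING_LAYERS = {
    "gpt2": "transformer.wte",
    "mingpt": "tok_emb",
}


def resolve_embedding_layer(model, model_name, embedding_path=None):
    """Return the embedding layer object of `model`.

    A local model can pass `embedding_path` explicitly; otherwise the
    path is looked up in the registry by `model_name`.
    """
    path = embedding_path or EMBEDDING_LAYERS.get(model_name)
    if path is None:
        raise KeyError(f"No embedding layer known for model {model_name!r}")
    # Walk the dotted path one attribute at a time.
    return reduce(getattr, path.split("."), model)


# Stand-ins for real modules (assumption: a real model exposes its
# embedding layer as a nested attribute in the same way).
fake_embedding = SimpleNamespace(kind="embedding")
fake_mingpt = SimpleNamespace(tok_emb=fake_embedding)
fake_custom = SimpleNamespace(encoder=SimpleNamespace(embed=fake_embedding))

# Registry lookup for a known model name.
layer_from_registry = resolve_embedding_layer(fake_mingpt, "mingpt")

# Explicit path for a local model that is not in the registry.
layer_from_override = resolve_embedding_layer(
    fake_custom, "my-local-model", embedding_path="encoder.embed"
)
```

With something along these lines, adding minGPT would be a one-line registry entry, while other local models could pass the path themselves.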

For displaying the saliency scores, eccoJS might need some tweaks to work better for a character-level model. But we'll have to see how it displays them first.

Feb 25 '21 10:02 jalammar

Please also consider Karpathy's 'llama2.c':

https://github.com/karpathy/llama2.c

Dec 25 '23 23:12 dbl001