
Using Hidden Engrams for long context

SummerSigh opened this issue 2 years ago · 0 comments

Currently, the OA models have a fairly short context window compared to other models. While efforts are underway to expand the context size, I suggest we use hidden engrams (https://github.com/AeroScripts/HiddenEngrams) to expand the model's effective context temporarily. Previously, Aero (the maker of hidden engrams) used this on GPT-J with some very good results. Since instruction-finetuned models produce more predictable and usable output than models that only have causal LM pretraining, it's reasonable to expect hidden engrams to work even better here than in those earlier GPT-J experiments. I could take a look at implementing this, but if anyone else wants to try their hand at making a notebook demo, that would be fine too.
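For anyone who wants to prototype this, here is a rough sketch of the core idea as I understand it: store past conversation chunks keyed by an embedding, then at generation time retrieve the most similar chunks and prepend them to the prompt, so relevant older material re-enters the window. Note this is a minimal sketch, not the HiddenEngrams implementation; in the real approach the keys would come from the LM's hidden states (e.g. GPT-J layer activations), whereas `embed` below is a toy character-trigram hash so the sketch runs without a model, and `EngramStore`/`build_prompt` are hypothetical names.

```python
import zlib
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy stand-in for a model embedding.

    The actual hidden-engrams approach would use a hidden-layer
    activation from the LM here; this deterministic character-trigram
    hash just makes the retrieval loop runnable on its own.
    """
    vec = np.zeros(dim)
    for i in range(len(text) - 2):
        vec[zlib.crc32(text[i:i + 3].encode()) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

class EngramStore:
    """Stores past conversation chunks keyed by their embeddings."""

    def __init__(self) -> None:
        self.texts: list[str] = []
        self.vecs: list[np.ndarray] = []

    def add(self, text: str) -> None:
        self.texts.append(text)
        self.vecs.append(embed(text))

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        # Rank stored chunks by cosine similarity to the query
        # (vectors are unit-normalized, so dot product == cosine).
        q = embed(query)
        sims = [float(v @ q) for v in self.vecs]
        order = sorted(range(len(sims)), key=sims.__getitem__, reverse=True)
        return [self.texts[i] for i in order[:k]]

def build_prompt(store: EngramStore, query: str, k: int = 2) -> str:
    """Prepend the k most relevant stored chunks to the new query."""
    return "\n".join(store.retrieve(query, k) + [query])
```

Usage would look like: feed each turn into `store.add(...)` as it scrolls out of the window, then call `build_prompt` before each generation so the model sees recalled context plus the new message.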

SummerSigh avatar May 12 '23 14:05 SummerSigh