octogen
Design: The memory system for octogen
Background and Why
Even a strong model such as GPT-4 can generate outdated code because its training data is out of date. LLMs do, however, have strong language understanding, so these errors can be corrected through prompts. The memory system has two goals:
- To enable the LLM to use command-line tools and libraries that the training data does not cover. This includes two cases:
  - Tools and code repositories that are not included in the training data at all.
  - Tools or libraries that appear in the training data only in outdated versions, so the model cannot use the current versions correctly.
- To prevent the LLM from repeating the same mistakes. When an LLM gets a tool or code usage wrong, it tends to generate the same incorrect code again each time and only then recall the correct usage from long-term memory before executing the code. Storing the correct tool or code usage in short-term memory, as part of the instructions, avoids this repeated round trip.
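The short-term memory idea in the last point can be sketched as follows. This is only a minimal illustration under assumptions, not octogen's actual implementation: the names `ShortTermMemory`, `remember`, `render`, and `build_instructions`, the `max_items` budget, and the wording of the injected note are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ShortTermMemory:
    """Hypothetical store of corrected tool/code usages, keyed by tool name."""

    max_items: int = 5  # assumed budget so the instructions stay short
    corrections: dict = field(default_factory=dict)

    def remember(self, tool: str, correct_usage: str) -> None:
        # Re-inserting moves the tool to the most recent position.
        self.corrections.pop(tool, None)
        self.corrections[tool] = correct_usage
        # Evict the oldest entries once the budget is exceeded.
        while len(self.corrections) > self.max_items:
            oldest = next(iter(self.corrections))
            del self.corrections[oldest]

    def render(self) -> str:
        # Format the stored corrections as a block of instruction text.
        if not self.corrections:
            return ""
        lines = ["Known correct usages (prefer these over what you remember):"]
        lines += [f"- {tool}: {usage}" for tool, usage in self.corrections.items()]
        return "\n".join(lines)


def build_instructions(base: str, memory: ShortTermMemory) -> str:
    """Append the remembered corrections to the base instructions."""
    notes = memory.render()
    return f"{base}\n\n{notes}" if notes else base


if __name__ == "__main__":
    memory = ShortTermMemory(max_items=2)
    # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement.
    memory.remember("pandas", "use pd.concat([df1, df2]) instead of df1.append(df2)")
    print(build_instructions("You are a coding assistant.", memory))
```

Keeping the corrections in the instructions means the model sees the right usage before it writes any code, rather than after a failed attempt. The small fixed budget is one possible way to keep the prompt from growing without bound.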
Design
Other Memory Designs
- https://arxiv.org/pdf/2310.08560.pdf (this paper can provide some supporting data for the design)
- https://arxiv.org/abs/1909.09436