randaller
I made a dummy modification to make LLaMA act like ChatGPT. It keeps 2048 bytes of context, and it does it pretty well!!! I am running a sliding chat window...
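In case it helps anyone reading along, here is a minimal sketch of the sliding-window idea (my own illustration, not randaller's actual code): keep appending turns to the history and drop the oldest turns once the serialized prompt exceeds the 2048-byte budget. Length is measured in characters here as a rough stand-in for bytes.

```python
MAX_CONTEXT = 2048  # budget for the whole prompt, matching the post above

def build_prompt(history: list[str], user_input: str) -> str:
    """Append the new turn, then slide the window until the prompt fits."""
    history.append(f"User: {user_input}\nBot:")
    prompt = "\n".join(history)
    # Drop the oldest turns first; always keep at least the current turn.
    while len(prompt) > MAX_CONTEXT and len(history) > 1:
        history.pop(0)
        prompt = "\n".join(history)
    return prompt
```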
**As the model was trained on scientific-looking data and wiki text, we need to be "more scientific" when prompting.** Model: 30B, prompt:
```
Write the Python code with detailed comments...
```
For example, a transformers model; I assume this is not enough:
```python
model = Transformer(model_args)
model.state_dict = zipslicer.load(checkpoints[-1], map_location="cpu", debug=True)
model.to("cuda")
```
as it throws an out-of-memory error. How to...
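Not an authoritative answer, but a sketch of one likely fix: `state_dict` is a method on PyTorch modules, so assigning to it only shadows the method and never loads any weights; `load_state_dict()` does the actual copy. Loading on CPU first and casting to fp16 before moving to CUDA also roughly halves the VRAM footprint. `Transformer`, `model_args`, and `checkpoints[-1]` are taken from the snippet above, and whether `zipslicer`'s lazy dict plays nicely with `strict=False` here is an assumption on my part.

```python
import zipslicer  # same loader as in the snippet above

def load_to_gpu(Transformer, model_args, checkpoint_path):
    """Sketch: load weights on CPU, then move a half-precision model to GPU."""
    state_dict = zipslicer.load(checkpoint_path, map_location="cpu")
    model = Transformer(model_args)
    # Assigning to model.state_dict only shadows the nn.Module method;
    # load_state_dict() actually copies the tensors into the model.
    model.load_state_dict(state_dict, strict=False)
    model.half()      # fp16 weights take roughly half the VRAM (if supported)
    return model.to("cuda")  # move only after weights are materialized on CPU
```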