OpenChatKit
Prevent prompt tensors from accumulating in GPU memory
This could help with CUDA out-of-memory (OOM) errors, especially on consumer-grade hardware.
With this change, prompt and output tensors are erased from VRAM.
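
A minimal sketch of the idea, not the actual diff: drop the references to the prompt and output tensors once the decoded text is available, then ask PyTorch's caching allocator to give unused blocks back. The function name `generate_and_release` and the `encode`/`generate`/`decode` callables are hypothetical stand-ins for the tokenizer and model calls. The `torch` import is optional here only so the sketch also runs on a machine without PyTorch.

```python
import gc

try:
    import torch
except ImportError:  # assumption: lets the sketch run without PyTorch installed
    torch = None


def generate_and_release(encode, generate, decode, prompt):
    """Run one inference step and release tensor references before returning.

    encode, generate, and decode are hypothetical callables standing in for
    the tokenizer and model calls of the real code.
    """
    inputs = encode(prompt)      # prompt tensors (on the GPU in real use)
    outputs = generate(inputs)   # output tensors (on the GPU in real use)
    text = decode(outputs)       # plain Python string, held in host memory

    # Drop the references so the tensors become eligible for freeing.
    del inputs, outputs
    gc.collect()

    # Return cached, now-unused blocks from PyTorch's allocator to the driver.
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()

    return text
```

Note that `torch.cuda.empty_cache()` only releases memory that no live tensor still uses, so the `del` has to come first; any other reference to the same tensors (for example a conversation history that stores them) would keep them in VRAM.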