erkintelnyx

Results 1 comments of erkintelnyx

> The thinking switch is based on chat template. So if you must use `LLM.generate` instead of `LLM.chat`, you can call `tokenizer.apply_chat_template` manually just like in HF repo before passing...