> @not-nullptr I do not have mixed-refresh-rate monitors so I cannot easily test this. If you're willing to test patches, I can send you a patch and if it works,...
could we please have a contributor review this PR? json mode is currently unusable with the `llama3` family of models, and i think this PR should fix it
getting this exact issue on `llama3:8b` but not with `mistral:latest`, weirdly enough. regular text generation speeds are exactly the same for both models on my 3080. i think @coder543 is...
https://github.com/ollama/ollama/assets/62841684/a91cb579-4160-445d-ad47-caf888f17a39

https://github.com/ollama/ollama/assets/62841684/fbc5a9b4-0113-4d2a-8467-5b24083433f7

the first video demonstrates my function calling without `"format": "json"`, and the second demonstrates it with `"format": "json"`. you can see the speed difference is insane; same prompt...
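for anyone who wants to reproduce this, something like the sketch below compares the same prompt with and without `"format": "json"` against a local Ollama server. it's just a rough sketch, not the exact code behind my videos; the model name and prompt are placeholders, but `/api/generate`, `format`, and `stream` are the standard Ollama API fields.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// generate sends one non-streaming request to a local Ollama server and
// returns how long the full response took. An empty format string means
// no constraint; "json" turns on JSON mode.
func generate(format string) (time.Duration, error) {
	req := map[string]any{
		"model":  "llama3:8b",                           // placeholder model
		"prompt": "List three colours as a JSON array.", // placeholder prompt
		"stream": false,
	}
	if format != "" {
		req["format"] = format
	}
	body, _ := json.Marshal(req)

	start := time.Now()
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body) // wait for the full response before stopping the clock
	return time.Since(start), nil
}

func main() {
	plain, _ := generate("")
	constrained, _ := generate("json")
	fmt.Printf("without format: %v\nwith \"format\": \"json\": %v\n", plain, constrained)
}
```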
just tested. it seems this solves #3851
i was getting this behaviour too; it's solved in hugh's commit
my reply isn't gonna be very constructive, but does it matter if quality is a tad worse? for any sort of real-time production use case, json mode is unusable...
removing all whitespace seems like an unambiguous improvement too. i don't see why generation quality would be hindered, and every token is more CO2 (like OP said)
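to put a rough number on the whitespace point, here's a toy illustration (nothing to do with the actual grammar change in this PR): pretty-printing the same object adds bytes, and therefore tokens, without adding any information.

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// toy payload; the only point is that whitespace inflates the output
	obj := map[string]any{
		"name": "ollama",
		"tags": []string{"json", "llama3"},
	}

	compact, _ := json.Marshal(obj)
	pretty, _ := json.MarshalIndent(obj, "", "    ")

	fmt.Printf("compact: %d bytes\npretty:  %d bytes\n", len(compact), len(pretty))
}
```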
this has been an issue for years. insane how it's not gonna get fixed
@rozniak yup. would be very helpful for running this over vnc; as of right now it just doesn't open