Brandon Beiler
@Isotr0py After filing an issue on the transformers project, the response was that transformers doesn't serialize config values that are equal to the default values: ref: https://github.com/huggingface/transformers/issues/38981#issuecomment-2996301499 This would...
Also, worth noting that adding those missing config fields to the quantized config.json files resolved the issues and the models seem to be running with expected outputs! - https://huggingface.co/brandonbeiler/InternVL3-8B-FP8-Dynamic -...
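The default-skipping serialization behavior described above can be sketched roughly like this (a generic illustration with hypothetical field names, not transformers' actual implementation):

```python
from dataclasses import dataclass, fields

# Hypothetical quantization config, standing in for a real transformers config.
@dataclass
class QuantConfig:
    quant_method: str = "fp8"
    activation_scheme: str = "dynamic"   # default value
    ignore: tuple = ()                   # default value

def to_diff_dict(cfg: QuantConfig) -> dict:
    """Serialize only fields that differ from their defaults,
    mimicking how config values equal to defaults get dropped
    from the written config.json."""
    defaults = QuantConfig()
    return {
        f.name: getattr(cfg, f.name)
        for f in fields(cfg)
        if getattr(cfg, f.name) != getattr(defaults, f.name)
    }

# Explicitly set, but equal to the default, so it vanishes on save:
cfg = QuantConfig(activation_scheme="dynamic")
print(to_diff_dict(cfg))  # the field is gone, and a loader that expects it breaks
```

This is why hand-adding the missing fields back into the quantized config.json files works: the loader sees the explicit values again instead of relying on its own defaults.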
@bracesproul @megahirt Definitely still seeing this on https://swe.langchain.com/ as of this morning, and have seen it consistently every couple of days that I've checked in the last week. Is it...
The "Think" or "Reason" toggles are becoming fairly standard on AI chat interfaces and apps alike. I think a toggle such as that (or potentially enhanced with a dropdown, on...
Ideally, it would be driven by the standard OpenAI "reasoning" field, with the effort object. But OpenAI's reasoning.effort doesn't support disabling currently, just low/medium/high. So yah, being flexible for various...
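A rough sketch of what mapping such a toggle (plus an effort dropdown) onto a request body could look like; the reasoning.effort shape follows OpenAI's API, while the model name and the handling of the "off" case are assumptions:

```python
# Hypothetical mapping from a UI "Think" toggle to an OpenAI-style request body.
def build_request(prompt: str, think: bool, effort: str = "medium") -> dict:
    body = {"model": "some-reasoning-model", "input": prompt}  # model name is a placeholder
    if think:
        # OpenAI accepts "low" / "medium" / "high" for reasoning effort.
        body["reasoning"] = {"effort": effort}
    # With the toggle off there is no standard way to say "no reasoning",
    # so this sketch simply omits the field and leaves it to the provider.
    return body

print(build_request("hi", think=True, effort="low"))
```

Keeping the "disabled" case as an omitted field is one way to stay flexible across providers that expose different knobs.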
Yah, that's definitely a valid point. Keep it simple and then require the external API to conform to the demands. I think I was coming at it from the standpoint...
Given the massive context sizes in the llama 4 drops, and the equally large model sizes, it seems that parallel context handling could really improve serving llama 4 models. Plus,...
Definitely experiencing this as well with opus 4.5 and auto or manual compacts