LostRuins Concedo
After testing with them changed, I did not notice any performance loss. Leaving them unchanged also does not seem to cause any errors; perhaps on most systems the OS transparently treats...
> In that case, wouldn't it be possible to cache all probs[N] of the initial prompt after a first run, and reuse them as-is for later runs? Yes that is...
Not a fan of this idea. Not only would it again break all prior formats for little reason, it would also add unnecessary padding for those who don't need such...
@philpax hmm I get your point, but I think it will end up as a https://xkcd.com/927/ situation. The problem is that such a "free comment field" is by definition arbitrary...
I like the flexibility of @philpax's suggestion. A few fields should be enforced as mandatory for all models for a model to be considered compliant - the currently existing fields...
Also `"max_tokens": 15821` is not a good idea. It should ideally be half or less of your maximum context length, 1k is a good value.
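For anyone who wants to guard against this on the client side, here is a minimal sketch of that rule of thumb. The helper name `clamp_max_tokens` and the 2048-token context length are assumptions for illustration only, not part of any particular API; substitute whatever context length your model was actually loaded with.

```python
def clamp_max_tokens(requested: int, context_length: int, fraction: float = 0.5) -> int:
    """Cap the requested generation length at a fraction of the context window.

    Hypothetical helper: `fraction=0.5` encodes the "half or less" rule of thumb.
    """
    if context_length <= 0:
        raise ValueError("context_length must be positive")
    limit = max(1, int(context_length * fraction))
    return max(1, min(requested, limit))


# Assumed context length of 2048 tokens (illustrative, not read from any model).
payload = {"max_tokens": 15821}
payload["max_tokens"] = clamp_max_tokens(payload["max_tokens"], context_length=2048)
print(payload)
```

With these assumed numbers the oversized request is cut down to half the context window, which leaves the other half for the prompt.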
@klosax I'd say the ggml magic would take care of that - ideally non-ggml formats *shouldn't* be using it as a container format. No need to over engineer it (my...
I would recommend including stuff that's mainly essential for loading the model - things that are required for proper functioning. Samplers are technically not even dependent on the model -...
Hmm I think that will be fine as an optional parameter, but not as a standard parameter. Standard params should be stuff that is *required* for loading correctly, like use_parallel_residual...
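To make the standard vs. optional split concrete, here is a rough sketch of how a loader could treat the two groups. Only `use_parallel_residual` comes from the discussion; the function name `split_params`, the dict-based metadata, and the `temperature` key are hypothetical and not taken from any existing format.

```python
from typing import Any, Dict, Tuple

# Only `use_parallel_residual` is named in the discussion; a real list would be longer.
REQUIRED_KEYS = {"use_parallel_residual"}


def split_params(metadata: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate parameters needed for loading from free-form optional ones.

    Raises KeyError if a required parameter is absent, since the model
    could not be loaded correctly without it.
    """
    missing = REQUIRED_KEYS - metadata.keys()
    if missing:
        raise KeyError(f"missing required parameters: {sorted(missing)}")
    required = {k: v for k, v in metadata.items() if k in REQUIRED_KEYS}
    optional = {k: v for k, v in metadata.items() if k not in REQUIRED_KEYS}
    return required, optional


# `temperature` stands in for any optional hint a loader is free to ignore.
required, optional = split_params({"use_parallel_residual": True, "temperature": 0.7})
print(required, optional)
```

The point of the design is that a loader only ever fails on the required set; anything in the optional set can be skipped without affecting correctness.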
I went ahead and did it myself, hopefully this helps someone out there.