Stephan Walter
This makes two somewhat tedious changes to ggml.c that should help us try out other tensor types, as has been discussed recently in various issues/PRs. * define a separate...
As rightly pointed out by @jxy [here](https://github.com/ggerganov/llama.cpp/commit/6232f2d7fd7a22d5eeb62182b2f21fcf01359754#commitcomment-108812025), my changes in #703 that limited the calculation to `int8_t` might overflow. This changes the types to `int` instead.
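To illustrate the overflow concern, here is a minimal Python sketch. It assumes signed 4-bit quants in the range [-8, 7] and an accumulator that wraps the way `int8_t` does on a typical two's-complement target. The helper names are hypothetical and this is not the expression from #703.

```python
import ctypes


def wrap_int8(x: int) -> int:
    """Return x as it would be stored in an int8_t (two's-complement wraparound)."""
    return ctypes.c_int8(x).value


def sum_of_products_int8(a, b):
    """Accumulate products while narrowing every intermediate result to 8 bits."""
    acc = 0
    for x, y in zip(a, b):
        acc = wrap_int8(acc + wrap_int8(x * y))
    return acc


def sum_of_products_int(a, b):
    """Same sum with a wide accumulator, standing in for C `int`."""
    return sum(x * y for x, y in zip(a, b))
```

With two pairs of quants at the extreme, `sum_of_products_int8([-8, -8], [-8, -8])` wraps to -128, while `sum_of_products_int` gives the intended 128.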
This combines some ideas from PR #729 and issue #397 to select a scale factor for Q4_0 with low RMS error. In order to KISS, I simply made a table...
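As a rough illustration of picking a scale by RMS error, here is a Python sketch. It assumes a block is quantized to integers in [-8, 7] and that a handful of candidate scales derived from the block's largest magnitude are tried. The candidate divisors below are made up for the example and are not the table from the PR.

```python
import math


def quantize_block(xs, scale):
    """Quantize floats to integers in [-8, 7] for a given scale."""
    if scale == 0.0:
        return [0] * len(xs)
    return [max(-8, min(7, round(x / scale))) for x in xs]


def rms_error(xs, scale):
    """RMS difference between the block and its dequantized values."""
    qs = quantize_block(xs, scale)
    return math.sqrt(sum((x - q * scale) ** 2 for x, q in zip(xs, qs)) / len(xs))


def best_scale(xs, divisors=(7.0, 7.5, 8.0, -7.0, -7.5, -8.0)):
    """Try each candidate scale (hypothetical table) and keep the lowest-error one."""
    amax = max(abs(x) for x in xs)
    candidates = [amax / d for d in divisors]
    return min(candidates, key=lambda s: rms_error(xs, s))
```

Because the plain `amax / 7` scale is among the candidates, the selected scale can never have a higher RMS error than that baseline on the same block.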
This adds support for 2-bit and 3-bit quantization with an FP16 shared scale and 16 quants per block. I don't consider it ready to merge, as we might come up...
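For a feel of what such a block could look like, here is a Python sketch of the 2-bit case. It assumes a layout of one FP16 scale followed by 16 quants packed into a 32-bit word, and a signed mapping to [-2, 1]. The layout, the mapping, and the function names are assumptions for illustration and may differ from the PR.

```python
import struct

QK = 16  # quants per block, as stated in the description


def quantize_q2(xs):
    """Pack 16 floats into 6 bytes: FP16 scale + 16 two-bit quants."""
    if len(xs) != QK:
        raise ValueError("expected %d values" % QK)
    amax = max(abs(x) for x in xs)
    # Round-trip through FP16 so quantization uses the scale that is stored.
    scale = struct.unpack("<e", struct.pack("<e", amax / 2.0))[0]
    packed = 0
    for i, x in enumerate(xs):
        q = round(x / scale) if scale else 0
        q = max(-2, min(1, q))          # clamp to the signed 2-bit range
        packed |= (q + 2) << (2 * i)    # store with an offset so it is unsigned
    return struct.pack("<eI", scale, packed)


def dequantize_q2(block):
    """Inverse of quantize_q2 (up to quantization error)."""
    scale, packed = struct.unpack("<eI", block)
    return [(((packed >> (2 * i)) & 3) - 2) * scale for i in range(QK)]
```

Under these assumptions a block takes 6 bytes for 16 weights, i.e. 3 bits per weight including the scale.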
This seems excessive but I'm not sure how to turn that off (`edited` in `build.yml`?), without inadvertently also removing other triggers (such as a push to the branch the PR...
Apart from adding the AVX2 optimization for Q4_3, this refactors some commonly used intrinsic sequences into `inline` functions.
I hope this isn't too controversial... Q4_3 turns out to be equal to or worse than the Q5 types in all criteria we have: perplexity, file size, token generation speed. In...
The nRF8001 Product Specification 1.3 says that the file name has length 22, but `aci_evt_params_hw_error_t` in file aci_evts.h has size 20. Which is correct?
`nrf_reset_network_force_off` uses `nrf53_errata_161` to decide if the workaround for [anomaly 161](https://infocenter.nordicsemi.com/topic/errata_nRF5340_Rev1/ERR/nRF5340/Rev1/latest/anomaly_340_161.html) is needed: https://github.com/NordicSemiconductor/nrfx/blob/1c721175f22dbb1bf125a570a427b53f810881bb/hal/nrf_reset.h#L175-L187 However, that function only returns `true` when compiled for the network core: https://github.com/NordicSemiconductor/nrfx/blob/1c721175f22dbb1bf125a570a427b53f810881bb/mdk/nrf53_erratas.h#L5876-L5909
BLEAdapter.authenticate waits for the gap_evt_sec_params_request event with timeout=10s. It does not handle a disconnected event during this time, but will keep waiting for the full 10 seconds. A somewhat common...
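One way to avoid waiting out the full timeout is to have the waiter watch for both events. The sketch below assumes that event callbacks push `(name, payload)` tuples into a shared `queue.Queue`. The function, the exception, and the event names are hypothetical and are not the `BLEAdapter` API.

```python
import queue
import time


class DisconnectedError(Exception):
    """Raised when the peer disconnects while we are still waiting."""


def wait_for_sec_params_request(events, timeout=10.0):
    """Return the sec-params payload, or fail early on disconnect or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("no sec_params_request within %.1fs" % timeout)
        try:
            name, payload = events.get(timeout=remaining)
        except queue.Empty:
            raise TimeoutError("no sec_params_request within %.1fs" % timeout) from None
        if name == "disconnected":
            # Stop immediately instead of sleeping until the deadline.
            raise DisconnectedError(payload)
        if name == "sec_params_request":
            return payload
        # Any other event is ignored and the wait continues.
```

The deadline is computed once, so unrelated events arriving in between do not extend the total wait.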