llama.cpp
testing: allclose operator
allclose checks that every corresponding pair of floats in two tensors of identical shape is within an epsilon error tolerance.
See also: https://pytorch.org/docs/stable/generated/torch.allclose.html
bool allclose(const struct ggml_tensor * a, const struct ggml_tensor * b, float epsilon);
Currently, the benchmark (and similar checks) uses the sum of tensor elements as the comparison metric:
// Let's use the F32 result from above as a reference for the q4_0 multiplication
float sum_of_F32_reference = tensor_sum_elements(gf.nodes[0]);
That's not totally wrong, but allclose is a more precise measure: in a sum, per-element errors of opposite sign can cancel or get averaged out, so a tensor with real mismatches can still produce a matching sum.
This issue was closed because it has been inactive for 14 days since being marked as stale.