Support computations using float16

Open mtbarta opened this issue 1 year ago • 0 comments

We use sgemm to run matmuls for every query.

We can roughly halve the cost of these matmuls by supporting float16, which MKL/OpenBLAS support.

Requirements:

  • We need to be able to toggle this behavior. We should expect some loss in result quality from the reduced precision.
  • We need to decide when float16 computation is allowed. For example, are we casting our stored embeddings to float16 instead of float32, and if so, where does that happen?
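
To make the two requirements concrete, here is a minimal sketch of a per-call toggle on the storage side, using only the Python standard library. It is not LintDB code: `to_bytes`, `from_bytes`, `score`, and the `use_float16` flag are hypothetical names, and the dot product is accumulated in Python floats rather than run through an fp16 gemm. It only illustrates that float16 storage halves the bytes per embedding and shifts the score slightly.

```python
import struct


def to_bytes(vec, use_float16):
    """Serialize a vector as little-endian float16 ('e') or float32 ('f')."""
    code = "e" if use_float16 else "f"
    return struct.pack("<%d%s" % (len(vec), code), *vec)


def from_bytes(buf, use_float16):
    """Decode a buffer written by to_bytes with the same flag."""
    width = 2 if use_float16 else 4
    code = "e" if use_float16 else "f"
    return list(struct.unpack("<%d%s" % (len(buf) // width, code), buf))


def score(query, stored, use_float16):
    """Dot product of a query against one stored embedding.

    Accumulation happens in Python floats (double precision), so only the
    storage precision changes here, not the precision of the arithmetic.
    """
    doc = from_bytes(stored, use_float16)
    return sum(q * d for q, d in zip(query, doc))


# Hypothetical data, chosen only for illustration.
embedding = [0.1, -0.25, 0.333, 0.5]
query = [1.0, 2.0, 3.0, 4.0]

stored_f32 = to_bytes(embedding, use_float16=False)
stored_f16 = to_bytes(embedding, use_float16=True)

score_f32 = score(query, stored_f32, use_float16=False)
score_f16 = score(query, stored_f16, use_float16=True)

print(len(stored_f32), len(stored_f16))  # bytes per embedding
print(abs(score_f32 - score_f16))        # drift caused by float16 storage
```

If the cast happens at write time as above, the toggle has to be recorded with the index, because a buffer written as float16 cannot be read back as float32. Casting at query time instead keeps the stored data unchanged but gives up the storage savings.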

mtbarta, Aug 20 '24 19:08