elasticsearch-java
Knn query vector is serialized as array of double, not float
Java API client version
9.0.3
Java version
21
Elasticsearch Version
9.0.3
Problem description
KnnQuery models queryVector as List<Float>, but each element is serialized as a double, which adds roughly 9 bytes per value. For a dense_vector of typical size, e.g. 1024 dimensions, this can add roughly 9 KB to the uncompressed request body.
The cause is that each float is widened to a double when it is passed to jakarta.json.stream.JsonGenerator.write(double), because JsonGenerator has no write(float) overload. A possible solution would be to write the value as a BigDecimal instead, converting the float to a string first:
new java.math.BigDecimal(Float.toString(float))
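To illustrate the widening outside of the client, here is a minimal, standalone sketch using only the JDK. The class and method names (FloatWidening, asDouble, asBigDecimal) are hypothetical and exist only for this illustration; it does not call the Elasticsearch client or any JsonGenerator, it only compares the text forms involved.

```java
import java.math.BigDecimal;

public class FloatWidening {

    // Text form after the float is widened to double, which is what a
    // write(double) call has to work with.
    static String asDouble(float f) {
        return Double.toString((double) f);
    }

    // Text form when the float is first converted to its own shortest
    // decimal string and then wrapped in a BigDecimal (the proposal above).
    static String asBigDecimal(float f) {
        return new BigDecimal(Float.toString(f)).toString();
    }

    public static void main(String[] args) {
        float[] values = {(float) Math.PI, (float) Math.E, 0.501f, 100f, 1e-5f};
        for (float v : values) {
            String wide = asDouble(v);
            String compact = asBigDecimal(v);
            // Print both forms and the number of extra characters the
            // widened form costs.
            System.out.println(wide + " vs " + compact
                    + " (" + (wide.length() - compact.length()) + " extra chars)");
        }
    }
}
```

One caveat with this approach: BigDecimal.toString() renders 1e-5f as 0.000010 rather than 1.0E-5, so the exact text a generator emits for a BigDecimal may not match Float.toString() character for character, even though it denotes the same value.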
A unit test which reproduces the issue
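import static org.assertj.core.api.Assertions.assertThat;

import co.elastic.clients.elasticsearch._types.query_dsl.Query;
import co.elastic.clients.elasticsearch._types.query_dsl.QueryBuilders;
import co.elastic.clients.json.JsonpUtils;
import co.elastic.clients.json.jackson.JacksonJsonpMapper;
import java.util.List;
import org.junit.jupiter.api.Test;

public class EsTest {
    @Test
    public void knnSerialization() {
        // The JSON we expect if each value is written with float precision.
        var expectedJson = """
            {"knn":{"field":"f1","query_vector":[3.1415927,2.7182817,0.501,100.0,1.0E-5]}}""";
        Query query1 = QueryBuilders.knn(k -> k
            .queryVector(List.of((float) Math.PI, (float) Math.E, 0.501f, 100f, 1e-5f))
            .field("f1")
        );
        String actualJson = JsonpUtils.toJsonString(query1, new JacksonJsonpMapper());
        assertThat(actualJson).isEqualTo(expectedJson);
        // Expected :"{"knn":{"field":"f1","query_vector":[3.1415927,2.7182817,0.501,100.0,1.0E-5]}}"
        // Actual :"{"knn":{"field":"f1","query_vector":[3.1415927410125732,2.7182817459106445,0.5009999871253967,100.0,9.999999747378752E-6]}}"
    }
}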