Zipkin not working with Opensearch - appears to be double encoding UTF-8
Describe the Bug
Zipkin fails to start when using Opensearch (but succeeds when using Elasticsearch)
Steps to Reproduce
- run docker compose up and wait for containers
- run curl --verbose http://0.0.0.0:9200 to see Elasticsearch / Opensearch information
- run curl --verbose http://0.0.0.0:9411/health to see Zipkin health
Elasticsearch (working)
docker-compose.yml:
services:
elasticsearch:
container_name: elasticsearch
environment:
- _JAVA_OPTIONS=-Xms512m -Xmx512m -XX:UseSVE=0
- action.destructive_requires_name=false
- discovery.type=single-node
- http.host=0.0.0.0
- transport.host=127.0.0.1
- xpack.monitoring.collection.enabled=false
- xpack.security.enabled=false
- xpack.security.http.ssl.enabled=false
healthcheck:
interval: 5s
retries: 10
start_period: 10s
test: curl --silent http://localhost:9200/_cluster/health | grep --extended-regexp '"status":"(green|yellow)"'
timeout: 10s
image: elastic/elasticsearch:8.17.2
restart: on-failure
ports:
- "9200:9200"
- "9300:9300"
zipkin:
container_name: zipkin
depends_on:
elasticsearch:
condition: service_healthy
environment:
- ES_HOSTS=http://elasticsearch:9200
- JAVA_OPTS=-XX:UseSVE=0
- STORAGE_TYPE=elasticsearch
image: openzipkin/zipkin:latest
ports:
- "9411:9411"
restart: on-failure
output of curl --verbose http://0.0.0.0:9200:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 0.0.0.0:9200...
* Connected to 0.0.0.0 (127.0.0.1) port 9200 (#0)
> GET / HTTP/1.1
> Host: 0.0.0.0:9200
> User-Agent: curl/7.88.1
> Accept: */*
>
< HTTP/1.1 200 OK
< X-elastic-product: Elasticsearch
< content-type: application/json
< content-length: 541
<
{ [541 bytes data]
100 541 100 541 0 0 46409 0 --:--:-- --:--:-- --:--:-- 49181
* Connection #0 to host 0.0.0.0 left intact
{
"name" : "662459531c8d",
"cluster_name" : "docker-cluster",
"cluster_uuid" : "uw14QxYCTM2HEg5OZsnWKg",
"version" : {
"number" : "8.17.2",
"build_flavor" : "default",
"build_type" : "docker",
"build_hash" : "747663ddda3421467150de0e4301e8d4bc636b0c",
"build_date" : "2025-02-05T22:10:57.067596412Z",
"build_snapshot" : false,
"lucene_version" : "9.12.0",
"minimum_wire_compatibility_version" : "7.17.0",
"minimum_index_compatibility_version" : "7.0.0"
},
"tagline" : "You Know, for Search"
}
Highlighted part of response
< content-type: application/json
output of curl --verbose http://0.0.0.0:9411/health:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 0.0.0.0:9411...
* Connected to 0.0.0.0 (127.0.0.1) port 9411 (#0)
> GET /health HTTP/1.1
> Host: 0.0.0.0:9411
> User-Agent: curl/7.88.1
> Accept: */*
>
< HTTP/1.1 200 OK
< content-type: application/json; charset=utf-8
< content-length: 209
< server: Armeria/1.31.3
< date: Mon, 17 Feb 2025 20:21:47 GMT
<
{ [209 bytes data]
100 209 100 209 0 0 20727 0 --:--:-- --:--:-- --:--:-- 20900
* Connection #0 to host 0.0.0.0 left intact
{
"status" : "UP",
"zipkin" : {
"status" : "UP",
"details" : {
"ElasticsearchStorage{initialEndpoints=http://elasticsearch:9200, index=zipkin}" : {
"status" : "UP"
}
}
}
}
Opensearch (not working)
docker-compose.yml:
services:
opensearch:
container_name: opensearch
environment:
- _JAVA_OPTIONS=-XX:UseSVE=0
- action.destructive_requires_name=false
- DISABLE_INSTALL_DEMO_CONFIG=true
- DISABLE_SECURITY_PLUGIN=true
- discovery.type=single-node
- http.host=0.0.0.0
- transport.host=127.0.0.1
healthcheck:
interval: 5s
retries: 10
start_period: 10s
test: curl --silent http://localhost:9200/_cluster/health | grep --extended-regexp '"status":"(green|yellow)"'
timeout: 10s
image: opensearchproject/opensearch:latest
restart: on-failure
ports:
- "9200:9200"
- "9600:9600"
zipkin:
container_name: zipkin
depends_on:
opensearch:
condition: service_healthy
environment:
- ES_HOSTS=http://opensearch:9200
- JAVA_OPTS=-XX:UseSVE=0
- STORAGE_TYPE=elasticsearch
image: openzipkin/zipkin:latest
ports:
- "9411:9411"
restart: on-failure
output of curl --verbose http://0.0.0.0:9200:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 0.0.0.0:9200...
* Connected to 0.0.0.0 (127.0.0.1) port 9200 (#0)
> GET / HTTP/1.1
> Host: 0.0.0.0:9200
> User-Agent: curl/7.88.1
> Accept: */*
>
< HTTP/1.1 200 OK
< content-type: application/json; charset=UTF-8
< content-length: 568
<
{ [568 bytes data]
100 568 100 568 0 0 50818 0 --:--:-- --:--:-- --:--:-- 51636
* Connection #0 to host 0.0.0.0 left intact
{
"name" : "bd49f6011512",
"cluster_name" : "docker-cluster",
"cluster_uuid" : "dBYK3GZERkeEqDPB51Pghg",
"version" : {
"distribution" : "opensearch",
"number" : "2.19.0",
"build_type" : "tar",
"build_hash" : "fd9a9d90df25bea1af2c6a85039692e815b894f5",
"build_date" : "2025-02-05T16:13:57.130576800Z",
"build_snapshot" : false,
"lucene_version" : "9.12.1",
"minimum_wire_compatibility_version" : "7.10.0",
"minimum_index_compatibility_version" : "7.0.0"
},
"tagline" : "The OpenSearch Project: https://opensearch.org/"
}
Highlighted part of response
< content-type: application/json; charset=UTF-8
output of curl --verbose http://0.0.0.0:9411/health:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 0.0.0.0:9411...
* Connected to 0.0.0.0 (127.0.0.1) port 9411 (#0)
> GET /health HTTP/1.1
> Host: 0.0.0.0:9411
> User-Agent: curl/7.88.1
> Accept: */*
>
< HTTP/1.1 503 Service Unavailable
< content-type: application/json; charset=utf-8
< content-length: 1055
< server: Armeria/1.31.3
< date: Mon, 17 Feb 2025 20:08:03 GMT
<
{ [1055 bytes data]
100 1055 100 1055 0 0 104k 0 --:--:-- --:--:-- --:--:-- 114k
* Connection #0 to host 0.0.0.0 left intact
{
"status" : "DOWN",
"zipkin" : {
"status" : "DOWN",
"details" : {
"ElasticsearchStorage{initialEndpoints=http://opensearch:9200, index=zipkin}" : {
"status" : "DOWN",
"details" : {
"error" : "IllegalArgumentException: .version.number not found in response: �\u0006\u0000\u0000sNaPpY\u0000�\u0001\u0000�爞�\u0004l{\n \"name\" : \"bd49f6011512\",\u0001\u001B\u001Ccluster_\u0015#\u0018docker-\r\u00186%\u0000\fuuid\u0005HTdBYK3GZERkeEqDPB51Pghg\t-\u0018version\u0001(<{\n \"distribut\r\u0017(\"opensearch\u00053\u0001�\u0010umber\u00014\u0018\"2.19.0\u0011\u0019 build_typ\t�\btar6\u001A\u0000\fhash\u00057�fd9a9d90df25bea1af2c6a85039692e815b894f5\"\u0001�\u0004 \rY\fdate\u0005?t2025-02-05T16:13:57.130576800Z6t\u0000\u001Csnapshot\u00019\u0010false\rS\u0018lucene_=\u0001\u0018\"9.12.1\u0011?dminimum_wire_compatibility25\u0000\f7.109\u0002\u00115\u0010indexr6\u0000\u00015\f\n }\u0001�\u0018\"taglin\t� The OpenS%tH Project: https://o5� .org/\"\n}\n"
}
}
}
}
}
Expected Behaviour
Zipkin should work with Opensearch the way it does with Elasticsearch
Notes
Since Elasticsearch is returning a response with
< content-type: application/json
and Opensearch is returning a response with
< content-type: application/json; charset=UTF-8
I wonder if the root cause might be that the Opensearch response is being "double encoded" as UTF-8, since the following logic appears to be common to both Elasticsearch and Opensearch:
- https://github.com/openzipkin/zipkin/blob/0f8fc88d33131dc938532322e19126def8cad8e8/zipkin-storage/elasticsearch/src/main/java/zipkin2/elasticsearch/internal/client/HttpCall.java#L245
Zipkin works with OpenSearch 2.17 but fails with the above error on OpenSearch 2.19.
We are having the same issue. Is there any expectation of when this will be solved?
The BaseVersion.convert() method compares Strings with '=='. I'm not sure whether this causes the issue, but it should be an equals() call.
if (parser.currentToken() == JsonToken.VALUE_STRING) {
if (parser.currentName() == "distribution") {
distribution = parser.getText();
} else if (parser.currentName() == "number") {
version = parser.getText();
}
}
EDIT: I raised #3809 to fix this.
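To illustrate the point about '==' versus equals(), here is a minimal standalone sketch. It is not Zipkin code: the class and method names are made up for the demonstration, and it does not use the JSON parser at all. Whether a real parser hands back the same String instance as the literal depends on how it creates field names, so '==' may happen to work in practice; equals() does not rely on that.

```java
public class FieldNameCompare {
  // Reference comparison: true only when both variables point to the same String object.
  public static boolean sameReference(String a, String b) {
    return a == b;
  }

  // Value comparison: true when the character sequences match.
  public static boolean sameValue(String a, String b) {
    return a.equals(b);
  }

  public static void main(String[] args) {
    // new String(...) forces a distinct object, standing in for a name built at runtime.
    String fieldName = new String("number");
    System.out.println(sameReference(fieldName, "number")); // false
    System.out.println(sameValue(fieldName, "number"));     // true
  }
}
```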
I found the root cause. OpenSearch 2.19 returns the JSON with HTTP compression (content-encoding: snappy), and Zipkin fails to decompress it. (OpenSearch 2.17 does not compress it.) I think Zipkin should send its HTTP requests with an Accept-Encoding header of "identity" or "gzip".
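A rough sketch of both halves of this, for anyone who wants to check their own setup. The bytes visible at the start of the error message above (\u0006\u0000\u0000sNaPpY) are consistent with the stream identifier of the Snappy framing format, which the first method looks for. The second method builds a request that asks for an uncompressed body. This uses the JDK's java.net.http client purely for illustration; it is not Zipkin's HTTP client, the class and method names are hypothetical, and whether OpenSearch honours the header in this situation is an assumption I have not verified.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class CompressionProbe {
  // Stream identifier of the Snappy framing format: 0xff 0x06 0x00 0x00 "sNaPpY".
  private static final byte[] SNAPPY_STREAM_ID = {
    (byte) 0xff, 0x06, 0x00, 0x00, 's', 'N', 'a', 'P', 'p', 'Y'
  };

  // Returns true when the body starts with the Snappy framing stream identifier.
  public static boolean looksSnappyFramed(byte[] body) {
    if (body.length < SNAPPY_STREAM_ID.length) {
      return false;
    }
    for (int i = 0; i < SNAPPY_STREAM_ID.length; i++) {
      if (body[i] != SNAPPY_STREAM_ID[i]) {
        return false;
      }
    }
    return true;
  }

  // Builds (but does not send) a GET request that asks the server not to compress the body.
  public static HttpRequest uncompressedGet(String url) {
    return HttpRequest.newBuilder(URI.create(url))
        .header("Accept-Encoding", "identity")
        .GET()
        .build();
  }

  public static void main(String[] args) {
    // Host name taken from the docker-compose.yml in this issue.
    HttpRequest request = uncompressedGet("http://opensearch:9200/");
    System.out.println(request.headers().map());

    // A plain JSON body does not start with the stream identifier.
    byte[] json = "{\"version\":{}}".getBytes(StandardCharsets.UTF_8);
    System.out.println(looksSnappyFramed(json));
  }
}
```

Running looksSnappyFramed over the raw response bytes is a quick way to confirm that compression, rather than character encoding, is what breaks the version parsing.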
I found a workaround for this. You can set http.compression to false in opensearch.yml to disable HTTP compression. Alternatively, you can use HTTPS on OpenSearch, because compression is turned off when HTTPS is used.
http.compression: false