spring-cloud-sleuth icon indicating copy to clipboard operation
spring-cloud-sleuth copied to clipboard

Tags ERROR CANCELLED on Zipkin

Open patpatpat123 opened this issue 3 years ago • 4 comments

Hello Spring Cloud Sleuth Team,

I wanted to start an issue observed using this project

Background, the project is a Java 11 + Spring Boot 2.6.7 + Jubilee 2021.0.1, latest as of this writing.

The issue is that Zipkin always (reproducible 100%) shows a "Cancelled" (screenshots to show the issue below)

Here is the minimal sample reproducible code. I tried my best to water it down to only two classes and one pom.

Instead of attaching a zip / linking a repo, here they are, it is really just couple of files, again, 100% reproducible.

the pom (not very important actually)

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.6.7</version>
        <relativePath/>
    </parent>

    <groupId>io.github.hakky54</groupId>
    <artifactId>java-tutorials</artifactId>
    <version>1.0.0-SNAPSHOT</version>
    <packaging>pom</packaging>

    <licenses>
        <license>
            <name>Apache License, Version 2.0</name>
            <url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
            <distribution>repo</distribution>
        </license>
    </licenses>

    <properties>
        <version.java>11</version.java>
    </properties>

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.cloud</groupId>
                <artifactId>spring-cloud-dependencies</artifactId>
                <version>2021.0.1</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-webflux</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-security</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-sleuth</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-sleuth-zipkin</artifactId>
        </dependency>
        <dependency>
            <groupId>io.projectreactor.netty</groupId>
            <artifactId>reactor-netty-http-brave</artifactId>
        </dependency>
        <dependency>
            <groupId>io.micrometer</groupId>
            <artifactId>micrometer-registry-prometheus</artifactId>
        </dependency>
        <dependency>
            <groupId>io.github.resilience4j</groupId>
            <artifactId>resilience4j-micrometer</artifactId>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.10.1</version>
                <configuration>
                    <source>11</source>
                    <target>11</target>
                </configuration>
            </plugin>
        </plugins>
    </build>

</project>

The controller file (also not very important)

package com.why.cancelled;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.MediaType;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.ResponseBody;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Mono;

@Controller
public class CancelledController {

    private static final Logger LOGGER = LogManager.getLogger(CancelledController.class.getName());
    private final    WebClient webClient;

    @Autowired
    public CancelledController(final WebClient webClient) {
        this.webClient = webClient;
    }

    @ResponseBody
    @PostMapping(path = "/whycancelled", consumes = MediaType.APPLICATION_JSON_VALUE, produces = MediaType.APPLICATION_JSON_VALUE)
    public Mono<String> whycancelled() {
        LOGGER.info("why cancelled when using the bean?");
        return webClient.post().uri("https://httpbin.org/post").retrieve().bodyToMono(String.class);
    }

}


The file where the issue happens

package com.why.cancelled;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.discovery.EnableDiscoveryClient;
import org.springframework.context.annotation.Bean;
import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType;
import org.springframework.http.client.reactive.ReactorClientHttpConnector;
import org.springframework.security.config.web.server.ServerHttpSecurity;
import org.springframework.security.web.server.SecurityWebFilterChain;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.netty.http.HttpProtocol;
import reactor.netty.http.client.HttpClient;

import java.util.Map;
import java.util.function.Function;

@EnableDiscoveryClient
@SpringBootApplication
public class CancelledApplication {

    /**
     * The issue is happening here. Please run a query with the first line comment, and then another test with the second line comment.
     */
    @Bean
    public WebClient getWebClient(HttpClient httpClient) {
        return WebClient.create().mutate().defaultHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE, HttpHeaders.ACCEPT, MediaType.APPLICATION_JSON_VALUE).clientConnector(new ReactorClientHttpConnector(httpClient)).build();
//        return WebClient.create().mutate().defaultHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE, HttpHeaders.ACCEPT, MediaType.APPLICATION_JSON_VALUE).clientConnector(new ReactorClientHttpConnector(HttpClient.create().wiretap(false).metrics(true, Function.identity()).protocol(HttpProtocol.HTTP11))).build();
    }

    @Bean
    public HttpClient getHttpClient() {
        return HttpClient.create().wiretap(false).metrics(true, Function.identity()).protocol(HttpProtocol.HTTP11);
    }

    @Bean
    public SecurityWebFilterChain insecureWebFilterChain(ServerHttpSecurity http) {
        return http.csrf().disable().build();
    }

    public static void main(String[] args) {
        SpringApplication app = new SpringApplication(CancelledApplication.class);
        app.setDefaultProperties(Map.of("server.port", "8765", "spring.application.name", "cancelledapplication", "spring.sleuth.sampler.probability", "1.0", "spring.sleuth.sampler.rate", "10000", "spring.zipkin.base-url", "http://localhost:9411", "spring.zipkin.enabled", "true", "spring.zipkin.sender.type", "web"));
        app.run(args);
    }

}

Structure (screenshot) Screen Shot 2022-04-27 at 9 54 43 PM

Run command: A curl equivalent is ok

POST http://localhost:8765/whycancelled
Content-Type: application/json

{
    "foo": "bar"
}

Case 1: the issue: (please run with )

 @Bean
    public WebClient getWebClient(HttpClient httpClient) {
        return WebClient.create().mutate().defaultHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE, HttpHeaders.ACCEPT, MediaType.APPLICATION_JSON_VALUE).clientConnector(new ReactorClientHttpConnector(httpClient)).build();
//        return WebClient.create().mutate().defaultHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE, HttpHeaders.ACCEPT, MediaType.APPLICATION_JSON_VALUE).clientConnector(new ReactorClientHttpConnector(HttpClient.create().wiretap(false).metrics(true, Function.identity()).protocol(HttpProtocol.HTTP11))).build();
    }

With this, Zipkin will show this: (please see the screenshot, one "picture is worth a thousand words")

Screen Shot 2022-04-27 at 9 55 14 PM

I would like to highlight the part where it shows "Cancelled" and there is something red.

I am having a very hard time understanding why this "red Cancelled" is there.

Actual: There is an extra line in Zipkin showing as cancelled (see above screenshot)

Expected: I would have expected to look like Case 2 (see screenshot) Screen Shot 2022-04-27 at 9 55 53 PM

Case 2: the good (it is really strange, jut one single line change, reproducible 100%, please run with)

  @Bean
    public WebClient getWebClient(HttpClient httpClient) {
//        return WebClient.create().mutate().defaultHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE, HttpHeaders.ACCEPT, MediaType.APPLICATION_JSON_VALUE).clientConnector(new ReactorClientHttpConnector(httpClient)).build();
        return WebClient.create().mutate().defaultHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE, HttpHeaders.ACCEPT, MediaType.APPLICATION_JSON_VALUE).clientConnector(new ReactorClientHttpConnector(HttpClient.create().wiretap(false).metrics(true, Function.identity()).protocol(HttpProtocol.HTTP11))).build();
    }

With case two, where there is the use of the bean, we can see a "correct Zipkin" see above screenshot.

I am very puzzled by this, (spent a lot of time finding the root cause, even more trying to understand why).

Could you please kindly take a look?

Thank you

patpatpat123 avatar Apr 27 '22 13:04 patpatpat123

Is it possible to help on this please?

patpatpat123 avatar May 04 '22 18:05 patpatpat123

Apparently there was a cancel signal (check https://github.com/spring-cloud/spring-cloud-sleuth/blob/3.1.x/spring-cloud-sleuth-instrumentation/src/main/java/org/springframework/cloud/sleuth/instrument/web/client/TraceExchangeFilterFunction.java#L293-L312) you can put a breakpoint in that method and see why is it being thrown.

marcingrzejszczak avatar May 09 '22 13:05 marcingrzejszczak

Understood, thank you @marcingrzejszczak

patpatpat123 avatar May 10 '22 07:05 patpatpat123

Hello @marcingrzejszczak ,

Apologies to reopen this issue. I tried your advice and put breakpoints pretty much everywhere on TraceExchangeFilterFunction. Also, I ran into another interesting classes:

org.springframework.cloud.sleuth.instrument.web.client.HttpClientBeanPostProcessor
org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber

Especially the TracingMapConnect part.

The code is a bit hard to follow on my end.

May I ask if there is a possibility because it is a Bean, this project, Spring Cloud Sleuth will treat the flow differently, and add this cancel, why not using the bean will not show the cancel please?

Thank you

patpatpat123 avatar May 12 '22 03:05 patpatpat123

Hi patpatpat123 did you find any final solution for this problem?

Delirante avatar Dec 08 '22 14:12 Delirante

Hello @Delirante and team,

This issue still persists with the latest SpringBoot 3. I am still having a hard time understanding why. Could you please help?

Thank you

patpatpat123 avatar Dec 09 '22 08:12 patpatpat123

Is this still a problem with Spring Boot 3 and Micrometer Tracing? If it is please create an issue in Spring Framework instrumentaiton

marcingrzejszczak avatar Feb 09 '24 13:02 marcingrzejszczak