fluent-bit icon indicating copy to clipboard operation
fluent-bit copied to clipboard

multiline parser does not concat messages consistently

Open ben-dov opened this issue 1 month ago • 13 comments

Bug Report

Describe the bug

Fluent-bit configuration with multiline parser in k8s does not concat messages consistently. I am comparing the output to a fluentd daemonset, and do not have the same issues. The weird thing is that its very inconsistent, sometime the logs are merged fully or partially, but mostly it does not at all, even when the log is the same format.

To Reproduce I have added configuration snippets related to the issue.

[Service]
    Log_Level    info
    Parsers_File    /fluent-bit/etc/parsers.conf

[Input]
    Name    tail
    Path    /var/log/containers/*.log
    DB    /fluent-bit/tail/pos.db
    Mem_Buf_Limit    50MB
    Tag    kube.*
    multiline.parser    kubernetes-multiline-parser, cri
    Read_from_Head    true

[MULTILINE_PARSER]
    Name    kubernetes-multiline-parser
    Type    regex
    Parser    cri
    Key_Content    message
    Flush_Timeout    15000
    Rule    "start_state" "/^(({\"timestamp\": \")?[0-9]{2,4}(\/|-)[0-9]{2}(\/|-)[0-9]{2})|^({).*/" "cont"
    Rule    "cont" "/^(?!(({\"timestamp\": \")?[0-9]{2,4}(\/|-)[0-9]{2}(\/|-)[0-9]{2})|^({)).*/" "cont"

Expected behavior

I expect the messages to combine correctly

Your Environment

  • Version used: 4.2.0 and 4.1.1
  • Configuration: deamonset (using operator) that send to fluentd
  • Environment name and version: K8S 1.34

ben-dov avatar Nov 26 '25 16:11 ben-dov

Can you provide examples that do not work or a way to reproduce it?

patrick-stephens avatar Nov 27 '25 14:11 patrick-stephens

OpenJDK 64-Bit Server VM warning: Cannot open file /data/logs/mexico_risk_service_api/gc.vgc due to No such file or directory
Listening for transport dt_socket at address: 8000
  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::       (v2.3.12.RELEASE)
2025-11-28 19:22:10.615 [main] INFO  com.tanzhishuju.CourtApplication - Starting CourtApplication v1.0 on badasset-ah-court-service-api-556468498d-799ml with PID 1 (/root/app.jar started by root in /)
2025-11-28 19:22:10.636 [main] DEBUG com.tanzhishuju.CourtApplication - Running with Spring Boot v2.3.12.RELEASE, Spring v5.2.15.RELEASE
2025-11-28 19:22:10.637 [main] INFO  com.tanzhishuju.CourtApplication - The following profiles are active: staging
2025-11-28 19:22:19.592 [main] INFO  c.g.l.r.s.boot.core.RetrofitClientScannerRegistrar - Scan the @RetrofitClient annotated interface using the @RetrofitScan configuration. packages=[com.tanzhishuju.client]
2025-11-28 19:22:22.759 [main] INFO  o.s.d.r.config.RepositoryConfigurationDelegate - Multiple Spring Data modules found, entering strict repository configuration mode!
2025-11-28 19:22:22.779 [main] INFO  o.s.d.r.config.RepositoryConfigurationDelegate - Bootstrapping Spring Data Redis repositories in DEFAULT mode.
2025-11-28 19:22:23.001 [main] INFO  o.s.d.r.config.RepositoryConfigurationDelegate - Finished Spring Data repository scanning in 125ms. Found 0 Redis repository interfaces.
2025-11-28 19:22:25.031 [main] INFO  o.s.c.s.PostProcessorRegistrationDelegate$BeanPostProcessorChecker - Bean 'org.springframework.retry.annotation.RetryConfiguration' of type [org.springframework.retry.annotation.RetryConfiguration] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2025-11-28 19:22:34.872 [main] INFO  o.s.boot.web.embedded.tomcat.TomcatWebServer - Tomcat initialized with port(s): 7086 (http)
2025-11-28 19:22:34.970 [main] INFO  org.apache.catalina.core.StandardService - Starting service [Tomcat]
2025-11-28 19:22:34.971 [main] INFO  org.apache.catalina.core.StandardEngine - Starting Servlet engine: [Apache Tomcat/9.0.46]
2025-11-28 19:22:35.572 [main] INFO  o.a.c.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring embedded WebApplicationContext
2025-11-28 19:22:35.615 [main] INFO  o.s.b.w.s.c.ServletWebServerApplicationContext - Root WebApplicationContext: initialization completed in 24028 ms
2025-11-28 19:22:43.368 [main] INFO  org.redisson.Version - Redisson 3.18.0
2025-11-28 19:22:45.125 [redisson-netty-2-5] INFO  o.r.connection.pool.MasterPubSubConnectionPool - 1 connections initialized for 192.168.199.234/192.168.199.234:6379
2025-11-28 19:22:45.267 [redisson-netty-2-19] INFO  org.redisson.connection.pool.MasterConnectionPool - 8 connections initialized for 192.168.199.234/192.168.199.234:6379
2025-11-28 19:22:46.247 [main] INFO  com.tanzhishuju.helper.ProxyIpHelper - 创建OkHttpClient,代理IP: i226.kdltps.com
2025-11-28 19:22:47.118 [main] INFO  com.tanzhishuju.helper.ProxyIpHelper - SSLSocketFactory创建成功
2025-11-28 19:22:47.321 [main] INFO  com.tanzhishuju.helper.ProxyIpHelper - SSL配置已添加到OkHttpClient
2025-11-28 19:22:47.322 [main] INFO  com.tanzhishuju.helper.ProxyIpHelper - OkHttpClient创建完成
2025-11-28 19:22:47.322 [main] INFO  com.tanzhishuju.helper.ProxyIpHelper - 创建OkHttpClient,代理IP: i225.kdltps.com
2025-11-28 19:22:47.368 [main] INFO  com.tanzhishuju.helper.ProxyIpHelper - SSLSocketFactory创建成功
2025-11-28 19:22:47.369 [main] INFO  com.tanzhishuju.helper.ProxyIpHelper - SSL配置已添加到OkHttpClient
2025-11-28 19:22:47.369 [main] INFO  com.tanzhishuju.helper.ProxyIpHelper - OkHttpClient创建完成
 _ _   |_  _ _|_. ___ _ |    _ 
| | |\/|_)(_| | |_\  |_)||_|_\ 
     /               |         
                        3.4.1 
2025-11-28 19:23:16.548 [main] INFO  o.s.b.actuate.endpoint.web.EndpointLinksResolver - Exposing 2 endpoint(s) beneath base path '/actuator'
2025-11-28 19:23:17.417 [main] INFO  s.d.s.w.PropertySourcedRequestMappingHandlerMapping - Mapped URL path [/v2/api-docs] onto method [springfox.documentation.swagger2.web.Swagger2Controller#getDocumentation(String, HttpServletRequest)]
2025-11-28 19:23:18.954 [main] INFO  o.s.scheduling.concurrent.ThreadPoolTaskExecutor - Initializing ExecutorService 'applicationTaskExecutor'
2025-11-28 19:23:22.426 [main] INFO  o.s.boot.web.embedded.tomcat.TomcatWebServer - Tomcat started on port(s): 7086 (http) with context path ''
2025-11-28 19:23:22.429 [main] INFO  s.d.s.web.plugins.DocumentationPluginsBootstrapper - Context refreshed
2025-11-28 19:23:22.519 [main] INFO  s.d.s.web.plugins.DocumentationPluginsBootstrapper - Found 1 custom documentation plugin(s)
2025-11-28 19:23:22.820 [main] INFO  s.d.spring.web.scanners.ApiListingReferenceScanner - Scanning for api listing references
2025-11-28 19:23:24.948 [main] INFO  s.d.s.w.r.operation.CachingOperationNameGenerator - Generating unique operation named: handleUsingGET_1
2025-11-28 19:23:24.954 [main] INFO  s.d.s.w.r.operation.CachingOperationNameGenerator - Generating unique operation named: handleUsingGET_2
2025-11-28 19:23:25.221 [main] INFO  com.tanzhishuju.CourtApplication - Started CourtApplication in 79.187 seconds (JVM running for 84.782)
2025-11-28 19:23:31.921 [http-nio-7086-exec-1] INFO  o.a.c.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring DispatcherServlet 'dispatcherServlet'
2025-11-28 19:23:31.922 [http-nio-7086-exec-1] INFO  org.springframework.web.servlet.DispatcherServlet - Initializing Servlet 'dispatcherServlet'
2025-11-28 19:23:31.965 [http-nio-7086-exec-1] INFO  org.springframework.web.servlet.DispatcherServlet - Completed initialization in 40 ms
2025-11-28 19:23:32.056 [http-nio-7086-exec-1] INFO  c.tanzhishuju.interceptor.UserIdContextInterceptor - UserIdContextInterceptor URI: /health-check, Query String: null
2025-11-28 19:23:32.056 [http-nio-7086-exec-1] INFO  c.tanzhishuju.interceptor.UserIdContextInterceptor - Got userId from parameter: null
2025-11-28 19:23:32.057 [http-nio-7086-exec-1] WARN  c.tanzhishuju.interceptor.UserIdContextInterceptor - No userId found in request for URI: /health-check
2025-11-28 19:23:32.084 [http-nio-7086-exec-1] ERROR o.a.c.c.C.[.[localhost].[/].[dispatcherServlet] - Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is com.tanzhishuju.common.exception.BusinessException: No userId found in request] with root cause
com.tanzhishuju.common.exception.BusinessException: No userId found in request
	at com.tanzhishuju.interceptor.UserIdContextInterceptor.preHandle(UserIdContextInterceptor.java:34)
	at org.springframework.web.servlet.HandlerExecutionChain.applyPreHandle(HandlerExecutionChain.java:151)
	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1035)
	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:943)
	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
	at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:626)
	at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:733)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:227)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
	at com.github.xiaoymin.swaggerbootstrapui.filter.SecurityBasicAuthFilter.doFilter(SecurityBasicAuthFilter.java:80)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
	at com.github.xiaoymin.swaggerbootstrapui.filter.ProductionSecurityFilter.doFilter(ProductionSecurityFilter.java:53)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
	at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
	at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)

guilinonline avatar Nov 28 '25 12:11 guilinonline

I’m also testing the multiline log collection feature, but I haven’t succeeded yet.

guilinonline avatar Nov 28 '25 12:11 guilinonline

These are not the actual kubelet logs though, this is an application log - FB will be reading the kubelet logs not the output of the pod via kubectl logs. It's vital you look at the actual raw logs on disk, not the output you think your pod is generating.

Can you reduce it down to just the input that's failing and the expected vs actual output? Or are you saying that whole log file fails in some way?

    multiline.parser    kubernetes-multiline-parser, cri

Note that parsers are applied in order so if any of them succeed then later ones will not be used. I don't think this is what you want: you likely first want to cope with the CRI format (the kubelet may decide to break your logs however it wants) then have a later multiline filter/processor do the actual recombination of your application logs, e.g. https://www.fluent.do/blog/kubernetes-logging#application-specific-parsing.

tail with cri --> multiline processor/filter -->

Maybe how the kubelet splits the logs is what is then triggering your "random" failures.

[MULTILINE_PARSER]
    Name    kubernetes-multiline-parser
    Type    regex
    Parser    cri

You seem to be mixing cri usage in there with some other non-multiline cri usage too. I think this is confusing things a bit with my previous comment too.

patrick-stephens avatar Nov 28 '25 12:11 patrick-stephens

In the latest version, this syntax is theoretically valid: multiline.parser kubernetes-multiline-parser, cri But in my testing, it didn’t work as expected. I noticed that when querying logs in VictoriaLogs, the Java stack trace errors weren’t being rendered as a single line. I reviewed the multiline processing documentation on the official Fluent Bit website thoroughly, and neither Fluent Bit’s built-in multiline processors nor my custom configurations produced the desired result. Please see the code snippet below.

[INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/containers/*.log
        DB                /var/log/flb_kube.db
        multiline.parser  multiline-regex-java,cri
        multiline_flush   5
        Mem_Buf_Limit     10MB
        Skip_empty_lines  On
        Refresh_Interval  5
        read_from_head    true

[MULTILINE_PARSER]
        name          multiline-regex-java
        type          regex
        flush_timeout 1000
        Rule          "start_state"   "^\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}"  "cont"
        Rule          "cont"          "^\\s+(at |Caused by: |\\.{3})"               "cont"

=======================================================================

fluent bit built-in:

[INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/containers/*.log
        DB                /var/log/flb_kube.db
        multiline.parser  java,cri
        multiline_flush   5
        Mem_Buf_Limit     10MB
        Skip_empty_lines  On
        Refresh_Interval  5
        read_from_head    true

guilinonline avatar Nov 28 '25 15:11 guilinonline

[OUTPUT]
        Name http
        Match kube.*
        host 192.168.198.5
        port 9428
        compress gzip
        uri /insert/jsonline?_stream_fields=stream&_msg_field=log&_time_field=time
        format json_lines
        json_date_format iso8601
        header AccountID 0
        header ProjectID 0

guilinonline avatar Nov 28 '25 15:11 guilinonline

What's your actual input log file though? That's what you need to check against, not anything else where assumptions are made. If we have the input log file then it should be consistently failing.

Trying to debug via another stack at the end will be difficult.

patrick-stephens avatar Nov 28 '25 16:11 patrick-stephens

OpenJDK 64-Bit Server VM warning: Cannot open file /data/logs/mexico_risk_service_api/gc.vgc due to No such file or directory
Listening for transport dt_socket at address: 8000
  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::       (v2.3.12.RELEASE)
2025-11-28 19:22:10.615 [main] INFO  com.tanzhishuju.CourtApplication - Starting CourtApplication v1.0 on badasset-ah-court-service-api-556468498d-799ml with PID 1 (/root/app.jar started by root in /)
2025-11-28 19:22:10.636 [main] DEBUG com.tanzhishuju.CourtApplication - Running with Spring Boot v2.3.12.RELEASE, Spring v5.2.15.RELEASE
2025-11-28 19:22:10.637 [main] INFO  com.tanzhishuju.CourtApplication - The following profiles are active: staging
2025-11-28 19:22:19.592 [main] INFO  c.g.l.r.s.boot.core.RetrofitClientScannerRegistrar - Scan the @RetrofitClient annotated interface using the @RetrofitScan configuration. packages=[com.tanzhishuju.client]
2025-11-28 19:22:22.759 [main] INFO  o.s.d.r.config.RepositoryConfigurationDelegate - Multiple Spring Data modules found, entering strict repository configuration mode!
2025-11-28 19:22:22.779 [main] INFO  o.s.d.r.config.RepositoryConfigurationDelegate - Bootstrapping Spring Data Redis repositories in DEFAULT mode.
2025-11-28 19:22:23.001 [main] INFO  o.s.d.r.config.RepositoryConfigurationDelegate - Finished Spring Data repository scanning in 125ms. Found 0 Redis repository interfaces.
2025-11-28 19:22:25.031 [main] INFO  o.s.c.s.PostProcessorRegistrationDelegate$BeanPostProcessorChecker - Bean 'org.springframework.retry.annotation.RetryConfiguration' of type [org.springframework.retry.annotation.RetryConfiguration] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2025-11-28 19:22:34.872 [main] INFO  o.s.boot.web.embedded.tomcat.TomcatWebServer - Tomcat initialized with port(s): 7086 (http)
2025-11-28 19:22:34.970 [main] INFO  org.apache.catalina.core.StandardService - Starting service [Tomcat]
2025-11-28 19:22:34.971 [main] INFO  org.apache.catalina.core.StandardEngine - Starting Servlet engine: [Apache Tomcat/9.0.46]
2025-11-28 19:22:35.572 [main] INFO  o.a.c.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring embedded WebApplicationContext
2025-11-28 19:22:35.615 [main] INFO  o.s.b.w.s.c.ServletWebServerApplicationContext - Root WebApplicationContext: initialization completed in 24028 ms
2025-11-28 19:22:43.368 [main] INFO  org.redisson.Version - Redisson 3.18.0
2025-11-28 19:22:45.125 [redisson-netty-2-5] INFO  o.r.connection.pool.MasterPubSubConnectionPool - 1 connections initialized for 192.168.199.234/192.168.199.234:6379
2025-11-28 19:22:45.267 [redisson-netty-2-19] INFO  org.redisson.connection.pool.MasterConnectionPool - 8 connections initialized for 192.168.199.234/192.168.199.234:6379
2025-11-28 19:22:46.247 [main] INFO  com.tanzhishuju.helper.ProxyIpHelper - 创建OkHttpClient,代理IP: i226.kdltps.com
2025-11-28 19:22:47.118 [main] INFO  com.tanzhishuju.helper.ProxyIpHelper - SSLSocketFactory创建成功
2025-11-28 19:22:47.321 [main] INFO  com.tanzhishuju.helper.ProxyIpHelper - SSL配置已添加到OkHttpClient
2025-11-28 19:22:47.322 [main] INFO  com.tanzhishuju.helper.ProxyIpHelper - OkHttpClient创建完成
2025-11-28 19:22:47.322 [main] INFO  com.tanzhishuju.helper.ProxyIpHelper - 创建OkHttpClient,代理IP: i225.kdltps.com
2025-11-28 19:22:47.368 [main] INFO  com.tanzhishuju.helper.ProxyIpHelper - SSLSocketFactory创建成功
2025-11-28 19:22:47.369 [main] INFO  com.tanzhishuju.helper.ProxyIpHelper - SSL配置已添加到OkHttpClient
2025-11-28 19:22:47.369 [main] INFO  com.tanzhishuju.helper.ProxyIpHelper - OkHttpClient创建完成
 _ _   |_  _ _|_. ___ _ |    _ 
| | |\/|_)(_| | |_\  |_)||_|_\ 
     /               |         
                        3.4.1 
2025-11-28 19:23:16.548 [main] INFO  o.s.b.actuate.endpoint.web.EndpointLinksResolver - Exposing 2 endpoint(s) beneath base path '/actuator'
2025-11-28 19:23:17.417 [main] INFO  s.d.s.w.PropertySourcedRequestMappingHandlerMapping - Mapped URL path [/v2/api-docs] onto method [springfox.documentation.swagger2.web.Swagger2Controller#getDocumentation(String, HttpServletRequest)]
2025-11-28 19:23:18.954 [main] INFO  o.s.scheduling.concurrent.ThreadPoolTaskExecutor - Initializing ExecutorService 'applicationTaskExecutor'
2025-11-28 19:23:22.426 [main] INFO  o.s.boot.web.embedded.tomcat.TomcatWebServer - Tomcat started on port(s): 7086 (http) with context path ''
2025-11-28 19:23:22.429 [main] INFO  s.d.s.web.plugins.DocumentationPluginsBootstrapper - Context refreshed
2025-11-28 19:23:22.519 [main] INFO  s.d.s.web.plugins.DocumentationPluginsBootstrapper - Found 1 custom documentation plugin(s)
2025-11-28 19:23:22.820 [main] INFO  s.d.spring.web.scanners.ApiListingReferenceScanner - Scanning for api listing references
2025-11-28 19:23:24.948 [main] INFO  s.d.s.w.r.operation.CachingOperationNameGenerator - Generating unique operation named: handleUsingGET_1
2025-11-28 19:23:24.954 [main] INFO  s.d.s.w.r.operation.CachingOperationNameGenerator - Generating unique operation named: handleUsingGET_2
2025-11-28 19:23:25.221 [main] INFO  com.tanzhishuju.CourtApplication - Started CourtApplication in 79.187 seconds (JVM running for 84.782)
2025-11-28 19:23:31.921 [http-nio-7086-exec-1] INFO  o.a.c.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring DispatcherServlet 'dispatcherServlet'
2025-11-28 19:23:31.922 [http-nio-7086-exec-1] INFO  org.springframework.web.servlet.DispatcherServlet - Initializing Servlet 'dispatcherServlet'
2025-11-28 19:23:31.965 [http-nio-7086-exec-1] INFO  org.springframework.web.servlet.DispatcherServlet - Completed initialization in 40 ms
2025-11-28 19:23:32.056 [http-nio-7086-exec-1] INFO  c.tanzhishuju.interceptor.UserIdContextInterceptor - UserIdContextInterceptor URI: /health-check, Query String: null
2025-11-28 19:23:32.056 [http-nio-7086-exec-1] INFO  c.tanzhishuju.interceptor.UserIdContextInterceptor - Got userId from parameter: null
2025-11-28 19:23:32.057 [http-nio-7086-exec-1] WARN  c.tanzhishuju.interceptor.UserIdContextInterceptor - No userId found in request for URI: /health-check
2025-11-28 19:23:32.084 [http-nio-7086-exec-1] ERROR o.a.c.c.C.[.[localhost].[/].[dispatcherServlet] - Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is com.tanzhishuju.common.exception.BusinessException: No userId found in request] with root cause
com.tanzhishuju.common.exception.BusinessException: No userId found in request
	at com.tanzhishuju.interceptor.UserIdContextInterceptor.preHandle(UserIdContextInterceptor.java:34)
	at org.springframework.web.servlet.HandlerExecutionChain.applyPreHandle(HandlerExecutionChain.java:151)
	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1035)
	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:943)
	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
	at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:626)
	at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:733)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:227)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
	at com.github.xiaoymin.swaggerbootstrapui.filter.SecurityBasicAuthFilter.doFilter(SecurityBasicAuthFilter.java:80)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
	at com.github.xiaoymin.swaggerbootstrapui.filter.ProductionSecurityFilter.doFilter(ProductionSecurityFilter.java:53)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
	at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
	at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)

The above shows the log format. My environment is based on Kubernetes (k8s) and containerd. I have mounted the log directories generated by pods to the Fluent Bit container directory, allowing it to read these log files directly.

guilinonline avatar Nov 29 '25 02:11 guilinonline

That is not your actual log file on disk though, that's what you need to check and see if your parser is failing there - maybe the kubelet is splitting it in some fashion as well.

patrick-stephens avatar Dec 01 '25 10:12 patrick-stephens

hey @patrick-stephens sorry for the late response, I will probably be able to add more info next week.

ben-dov avatar Dec 01 '25 13:12 ben-dov

Good day @patrick-stephens, here is an example log that failed to combine (copied from the container log by opening with less)

2025-12-01T19:24:21.152992159Z stderr F 2025-12-01 19:24:21.151+0000 00022 E C                       test xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-xxxxxxxxxxxxxxxx <ERROR> Exception in C stage which allows continuing on failure
2025-12-01T19:24:21.153021784Z stderr F Traceback (most recent call last):
2025-12-01T19:24:21.153025326Z stderr F   File "/opt/A/B/C/__main__.py", line 370, in async_main
2025-12-01T19:24:21.153028665Z stderr F     await _execute_stage_increase_timeout_if_needed(
2025-12-01T19:24:21.153031911Z stderr F   File "/opt/A/B/C/__main__.py", line 435, in _execute_stage_increase_timeout_if_needed
2025-12-01T19:24:21.153034347Z stderr F     async for attempt in AsyncRetrying(
2025-12-01T19:24:21.15303714Z stderr F   File "/usr/lib/python3.11/site-packages/tenacity/asyncio/__init__.py", line 166, in __anext__
2025-12-01T19:24:21.153039617Z stderr F     do = await self.iter(retry_state=self._retry_state)
2025-12-01T19:24:21.153041934Z stderr F          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-12-01T19:24:21.153044404Z stderr F   File "/usr/lib/python3.11/site-packages/tenacity/asyncio/__init__.py", line 153, in iter
2025-12-01T19:24:21.153046665Z stderr F     result = await action(retry_state)
2025-12-01T19:24:21.153049101Z stderr F              ^^^^^^^^^^^^^^^^^^^^^^^^^
2025-12-01T19:24:21.153051474Z stderr F   File "/usr/lib/python3.11/site-packages/tenacity/_utils.py", line 99, in inner
2025-12-01T19:24:21.153054229Z stderr F     return call(*args, **kwargs)
2025-12-01T19:24:21.153056513Z stderr F            ^^^^^^^^^^^^^^^^^^^^^
2025-12-01T19:24:21.153058786Z stderr F   File "/usr/lib/python3.11/site-packages/tenacity/__init__.py", line 400, in <lambda>
2025-12-01T19:24:21.153061025Z stderr F     self._add_action_func(lambda rs: rs.outcome.result())
2025-12-01T19:24:21.15306344Z stderr F                                      ^^^^^^^^^^^^^^^^^^^
2025-12-01T19:24:21.153065793Z stderr F   File "/usr/lib/python3.11/concurrent/futures/_base.py", line 449, in result
2025-12-01T19:24:21.153068256Z stderr F     return self.__get_result()
2025-12-01T19:24:21.153070606Z stderr F            ^^^^^^^^^^^^^^^^^^^
2025-12-01T19:24:21.153072918Z stderr F   File "/usr/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
2025-12-01T19:24:21.153075101Z stderr F     raise self._exception
2025-12-01T19:24:21.153077486Z stderr F   File "/opt/A/B/C/__main__.py", line 442, in _execute_stage_increase_timeout_if_needed
2025-12-01T19:24:21.153079826Z stderr F     await stage.create_engine_and_run(parallel_partition_params)
2025-12-01T19:24:21.153082184Z stderr F   File "/opt/A/B/C/stage.py", line 402, in create_engine_and_run
2025-12-01T19:24:21.153084688Z stderr F     await self.run()
2025-12-01T19:24:21.153087172Z stderr F   File "/opt/A/B/C/stages/recalls.py", line 285, in run
2025-12-01T19:24:21.153091415Z stderr F     await self.process_partitions_in_parallel(
2025-12-01T19:24:21.153094054Z stderr F   File "/opt/A/B/C/stage.py", line 213, in process_partitions_in_parallel
2025-12-01T19:24:21.153096266Z stderr F     raise RuntimeError(
2025-12-01T19:24:21.153098596Z stderr F RuntimeError: 1 parallel partition subprocess failed with error codes: {4} (__main__.py:378)

ben-dov avatar Dec 01 '25 19:12 ben-dov

hey @patrick-stephens, were you able to replicate the issue on your end?

ben-dov avatar Dec 10 '25 18:12 ben-dov

I am not looking at this I'm afraid, was just trying to triage it with enough information to replicate it.

patrick-stephens avatar Dec 11 '25 10:12 patrick-stephens