When openobserve as syslog server, tcp connection was closed repeatly almost every 60s
Which OpenObserve functionalities are the source of the bug?
ingestion
Is this a regression?
Yes
Description
When openobserve as syslog server, tcp connection was closed repeatly almost every 60s, client print these annoying logs. Even so, logs can still be collected.
Jul 26 15:34:16 DX-C9 rsyslogd: omfwd: remote server at 10.11.10.24:5514 seems to have closed connection. This often happens when the remote peer (or an interim system like a load balancer or firewall) shuts down or aborts a connection. Rsyslog will re-open the connection if configured to do so (we saw a generic IO Error, which usually goes along with that behaviour). [v8.2406.0 try https://www.rsyslog.com/e/2027 ]
Jul 26 15:34:16 DX-C9 rsyslogd: action 'action-2-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2406.0 try https://www.rsyslog.com/e/2007 ]
Jul 26 15:34:16 DX-C9 rsyslogd: action 'action-2-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2406.0 try https://www.rsyslog.com/e/2359 ]
Jul 26 15:34:16 DX-C9 rsyslogd: omfwd: remote server at 10.11.10.24:5514 seems to have closed connection. This often happens when the remote peer (or an interim system like a load balancer or firewall) shuts down or aborts a connection. Rsyslog will re-open the connection if configured to do so (we saw a generic IO Error, which usually goes along with that behaviour). [v8.2406.0 try https://www.rsyslog.com/e/2027 ]
Jul 26 15:34:16 DX-C9 rsyslogd: action 'action-2-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2406.0 try https://www.rsyslog.com/e/2007 ]
Jul 26 15:34:16 DX-C9 rsyslogd: action 'action-2-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2406.0 try https://www.rsyslog.com/e/2359 ]
Jul 26 15:34:16 DX-C9 rsyslogd: omfwd: remote server at 10.11.10.24:5514 seems to have closed connection. This often happens when the remote peer (or an interim system like a load balancer or firewall) shuts down or aborts a connection. Rsyslog will re-open the connection if configured to do so (we saw a generic IO Error, which usually goes along with that behaviour). [v8.2406.0 try https://www.rsyslog.com/e/2027 ]
Jul 26 15:34:16 DX-C9 rsyslogd: action 'action-2-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2406.0 try https://www.rsyslog.com/e/2007 ]
Jul 26 15:34:16 DX-C9 rsyslogd: action 'action-2-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2406.0 try https://www.rsyslog.com/e/2359 ]
Jul 26 15:35:16 DX-C9 rsyslogd: omfwd: remote server at 10.11.10.24:5514 seems to have closed connection. This often happens when the remote peer (or an interim system like a load balancer or firewall) shuts down or aborts a connection. Rsyslog will re-open the connection if configured to do so (we saw a generic IO Error, which usually goes along with that behaviour). [v8.2406.0 try https://www.rsyslog.com/e/2027 ]
Jul 26 15:35:16 DX-C9 rsyslogd: action 'action-2-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2406.0 try https://www.rsyslog.com/e/2007 ]
Jul 26 15:35:16 DX-C9 rsyslogd: action 'action-2-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2406.0 try https://www.rsyslog.com/e/2359 ]
Jul 26 15:36:16 DX-C9 rsyslogd: omfwd: remote server at 10.11.10.24:5514 seems to have closed connection. This often happens when the remote peer (or an interim system like a load balancer or firewall) shuts down or aborts a connection. Rsyslog will re-open the connection if configured to do so (we saw a generic IO Error, which usually goes along with that behaviour). [v8.2406.0 try https://www.rsyslog.com/e/2027 ]
Jul 26 15:36:16 DX-C9 rsyslogd: action 'action-2-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2406.0 try https://www.rsyslog.com/e/2007 ]
Jul 26 15:36:16 DX-C9 rsyslogd: action 'action-2-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2406.0 try https://www.rsyslog.com/e/2359 ]
Jul 26 15:36:16 DX-C9 rsyslogd: omfwd: remote server at 10.11.10.24:5514 seems to have closed connection. This often happens when the remote peer (or an interim system like a load balancer or firewall) shuts down or aborts a connection. Rsyslog will re-open the connection if configured to do so (we saw a generic IO Error, which usually goes along with that behaviour). [v8.2406.0 try https://www.rsyslog.com/e/2027 ]
Jul 26 15:36:16 DX-C9 rsyslogd: action 'action-2-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2406.0 try https://www.rsyslog.com/e/2007 ]
Jul 26 15:36:16 DX-C9 rsyslogd: action 'action-2-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2406.0 try https://www.rsyslog.com/e/2359 ]
Jul 26 15:37:16 DX-C9 rsyslogd: omfwd: remote server at 10.11.10.24:5514 seems to have closed connection. This often happens when the remote peer (or an interim system like a load balancer or firewall) shuts down or aborts a connection. Rsyslog will re-open the connection if configured to do so (we saw a generic IO Error, which usually goes along with that behaviour). [v8.2406.0 try https://www.rsyslog.com/e/2027 ]
Jul 26 15:37:16 DX-C9 rsyslogd: action 'action-2-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2406.0 try https://www.rsyslog.com/e/2007 ]
Jul 26 15:37:16 DX-C9 rsyslogd: action 'action-2-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2406.0 try https://www.rsyslog.com/e/2359 ]
Jul 26 15:37:16 DX-C9 rsyslogd: omfwd: remote server at 10.11.10.24:5514 seems to have closed connection. This often happens when the remote peer (or an interim system like a load balancer or firewall) shuts down or aborts a connection. Rsyslog will re-open the connection if configured to do so (we saw a generic IO Error, which usually goes along with that behaviour). [v8.2406.0 try https://www.rsyslog.com/e/2027 ]
Jul 26 15:37:16 DX-C9 rsyslogd: action 'action-2-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2406.0 try https://www.rsyslog.com/e/2007 ]
Jul 26 15:37:16 DX-C9 rsyslogd: action 'action-2-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2406.0 try https://www.rsyslog.com/e/2359 ]
When i select rsyslog as a syslog server on the same server 10.11.10.24:514, everything is ok(no annoying logs).
tcpdump show me, the connection was RST, so i think the reason may was the server side. no firewall or load balancer.
Please provide a link to a minimal reproduction of the bug
No response
Please provide the exception or error you saw
No response
Please provide the version you discovered this bug in (check about page for version information)
openobserve v0.10.8-rc5
os: Ubuntu 22.04.4 LTS
Anything else?
No response
I'm pretty sure this has to do with TCP keepalive, are you sending syslog during those 60seconds?
I'm pretty sure this has to do with TCP keepalive, are you sending syslog during those 60seconds?
Sorry to reply too late, sometime it has log during this 60, The result is still the sameγ
As can be seen from the above test, the same log is printed each time.
I am getting similar issuse when logging to OpenObserver as a syslog server running under Unraid 6.12.11, except I'm getting more frequent errors:
Aug 22 07:52:47 unraid01 rsyslogd: omfwd: remote server at 192.168.3.172:1514 seems to have closed connection. This often happens when the remote peer (or an interim system like a load balancer or firewall) shuts down or aborts a connection. Rsyslog will re-open the connection if configured to do so (we saw a generic IO Error, which usually goes along with that behaviour). [v8.2102.0 try https://www.rsyslog.com/e/2027 ] Aug 22 07:52:47 unraid01 rsyslogd: action 'action-2-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2102.0 try https://www.rsyslog.com/e/2007 ] Aug 22 07:52:47 unraid01 rsyslogd: action 'action-2-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2102.0 try https://www.rsyslog.com/e/2359 ] Aug 22 07:52:49 unraid01 kernel: eth0: renamed from veth42b5b63 Aug 22 07:52:49 unraid01 rsyslogd: omfwd: remote server at 192.168.3.172:1514 seems to have closed connection. This often happens when the remote peer (or an interim system like a load balancer or firewall) shuts down or aborts a connection. Rsyslog will re-open the connection if configured to do so (we saw a generic IO Error, which usually goes along with that behaviour). [v8.2102.0 try https://www.rsyslog.com/e/2027 ] Aug 22 07:52:49 unraid01 rsyslogd: action 'action-2-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2102.0 try https://www.rsyslog.com/e/2007 ] Aug 22 07:52:49 unraid01 rsyslogd: action 'action-2-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2102.0 try https://www.rsyslog.com/e/2359 ] Aug 22 07:52:49 unraid01 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): vethb277e8f: link becomes ready Aug 22 07:52:49 unraid01 kernel: docker0: port 2(vethb277e8f) entered blocking state Aug 22 07:52:49 unraid01 kernel: docker0: port 2(vethb277e8f) entered forwarding state Aug 22 07:53:12 unraid01 kernel: veth42b5b63: renamed from eth0 Aug 22 07:53:12 unraid01 rsyslogd: omfwd: remote server at 192.168.3.172:1514 seems to have closed connection. This often happens when the remote peer (or an interim system like a load balancer or firewall) shuts down or aborts a connection. Rsyslog will re-open the connection if configured to do so (we saw a generic IO Error, which usually goes along with that behaviour). [v8.2102.0 try https://www.rsyslog.com/e/2027 ] Aug 22 07:53:12 unraid01 rsyslogd: action 'action-2-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2102.0 try https://www.rsyslog.com/e/2007 ] Aug 22 07:53:12 unraid01 rsyslogd: action 'action-2-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2102.0 try https://www.rsyslog.com/e/2359 ] Aug 22 07:53:12 unraid01 kernel: docker0: port 2(vethb277e8f) entered disabled state Aug 22 07:53:12 unraid01 kernel: docker0: port 2(vethb277e8f) entered disabled state Aug 22 07:53:12 unraid01 kernel: device vethb277e8f left promiscuous mode Aug 22 07:53:12 unraid01 kernel: docker0: port 2(vethb277e8f) entered disabled state Aug 22 07:53:12 unraid01 rsyslogd: omfwd: remote server at 192.168.3.172:1514 seems to have closed connection. This often happens when the remote peer (or an interim system like a load balancer or firewall) shuts down or aborts a connection. Rsyslog will re-open the connection if configured to do so (we saw a generic IO Error, which usually goes along with that behaviour). [v8.2102.0 try https://www.rsyslog.com/e/2027 ] Aug 22 07:53:12 unraid01 rsyslogd: action 'action-2-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2102.0 try https://www.rsyslog.com/e/2007 ] Aug 22 07:53:12 unraid01 rsyslogd: action 'action-2-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2102.0 try https://www.rsyslog.com/e/2359 ]
@kmephistoh @doychi do you mind we give you a dev version for help testing the fix?
Hi,
I'd love to, but I'll have to work out how to get it installed under Unraid.
Thanks, Justin
Get Outlook for Androidhttps://aka.ms/AAb9ysg
From: Hengfei Yang @.> Sent: Tuesday, August 27, 2024 3:22:45 PM To: openobserve/openobserve @.> Cc: Doychi @.>; Mention @.> Subject: Re: [openobserve/openobserve] When openobserve as syslog server, tcp connection was closed repeatly almost every 60s (Issue #4067)
@kmephistohhttps://github.com/kmephistoh @doychihttps://github.com/doychi do you mind we give you a dev version for help testing the fix?
β Reply to this email directly, view it on GitHubhttps://github.com/openobserve/openobserve/issues/4067#issuecomment-2311596141, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ACT6YFV3FDZBG5EPRSFNG6TZTQEKLAVCNFSM6AAAAABLQAI3BCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMJRGU4TMMJUGE. You are receiving this because you were mentioned.Message ID: @.***>
@doychi we only have docker image for dev version. maybe let's wait for some days, we will release a binary version then i will back to you.
@kmephistoh @doychi do you mind we give you a dev version for help testing the fix? ok, i'm waiting here.
@kmephistoh can you try this version:
public.ecr.aws/zinclabs/openobserve-dev:v0.11.0-rc3-e7e01c5
public.ecr.aws/zinclabs/openobserve-dev:v0.11.0-rc3-e7e01c5
Can't download
@kmephistoh this is docker image, you can access it by browser, you can use
docker pull public.ecr.aws/zinclabs/openobserve-dev:v0.11.0-rc3-684dd39
or directly run it:
docker run -tdi --name o2-test -e ZO_ROOT_USER_EMAIL="[email protected]" -e ZO_ROOT_USER_PASSWORD="Complexpass#123" -e ZO_DATA_DIR="/data" -p 5080:5080 -p 5514:5514 public.ecr.aws/zinclabs/openobserve-dev:v0.11.0-rc3-684dd39
Please try this version, we added some debug logs.
I have installed the image and will keep an eye on it today and let you know how it goes.
The new image seems to have resolved the issue for me.
Can you let me know when this is likely to make it the latest stable release, so I can switch back to the stable line?
sure, will make release next week
@kmephistoh this is docker image, you can access it by browser, you can use
docker pull public.ecr.aws/zinclabs/openobserve-dev:v0.11.0-rc3-684dd39or directly run it:
docker run -tdi --name o2-test -e ZO_ROOT_USER_EMAIL="[email protected]" -e ZO_ROOT_USER_PASSWORD="Complexpass#123" -e ZO_DATA_DIR="/data" -p 5080:5080 -p 5514:5514 public.ecr.aws/zinclabs/openobserve-dev:v0.11.0-rc3-684dd39Please try this version, we added some debug logs.
I try this version, the issue's problem was resolved. But other normal log are loss sometime. Thank you for your help
We released v0.11.0, both of you can test it now.
We released
v0.11.0, both of you can test it now.
The problem was resolved, thanks for your help.