
tests: remove leak option

Open HuSharp opened this issue 1 year ago • 4 comments

What problem does this PR solve?

Issue Number: Close #7782

What is changed and how does it work?

For example, we hit a goroutine leak whose top stack frame is runtime_pollWait, and which originates from the dashboard:

Goroutine 1362 in state IO wait, with internal/poll.runtime_pollWait on top of the stack:
goroutine 1362 [IO wait]:
internal/poll.runtime_pollWait(0x14dc55908, 0x72)
	/opt/homebrew/opt/go/libexec/src/runtime/netpoll.go:343 +0xa0
internal/poll.(*pollDesc).wait(0x14008d8e580?, 0x0?, 0x0)
	/opt/homebrew/opt/go/libexec/src/internal/poll/fd_poll_runtime.go:84 +0x28
internal/poll.(*pollDesc).waitRead(...)
	/opt/homebrew/opt/go/libexec/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0x14008d8e580)
	/opt/homebrew/opt/go/libexec/src/internal/poll/fd_unix.go:611 +0x250
net.(*netFD).accept(0x14008d8e580)
	/opt/homebrew/opt/go/libexec/src/net/fd_unix.go:172 +0x28
net.(*TCPListener).accept(0x140088d1840)
	/opt/homebrew/opt/go/libexec/src/net/tcpsock_posix.go:152 +0x28
net.(*TCPListener).Accept(0x140088d1840)
	/opt/homebrew/opt/go/libexec/src/net/tcpsock.go:315 +0x2c
github.com/pingcap/tidb-dashboard/pkg/tidb.(*proxy).run(0x14008d9e1e0, {0x104f79778?, 0x14007fd5c20})
	/Users/pingcap/go/pkg/mod/github.com/pingcap/[email protected]/pkg/tidb/proxy.go:227 +0x37c
created by github.com/pingcap/tidb-dashboard/pkg/tidb.(*Forwarder).Start in goroutine 1296
	/Users/pingcap/go/pkg/mod/github.com/pingcap/[email protected]/pkg/tidb/forwarder.go:57 +0x1d4

We also hit a goroutine leak whose top stack frame is runtime_pollWait, but its root cause is go.etcd.io/etcd/pkg/transport.timeoutConn.Read, which is different from the dashboard case:

internal/poll.runtime_pollWait(0x10f8cc2a0, 0x72)
	/opt/homebrew/opt/go/libexec/src/runtime/netpoll.go:343 +0xa0
internal/poll.(*pollDesc).wait(0x140016ac400?, 0x1400159a000?, 0x0)
	/opt/homebrew/opt/go/libexec/src/internal/poll/fd_poll_runtime.go:84 +0x28
internal/poll.(*pollDesc).waitRead(...)
	/opt/homebrew/opt/go/libexec/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0x140016ac400, {0x1400159a000, 0x1000, 0x1000})
	/opt/homebrew/opt/go/libexec/src/internal/poll/fd_unix.go:164 +0x200
net.(*netFD).Read(0x140016ac400, {0x1400159a000?, 0x14000436160?, 0x140013dcc60?})
	/opt/homebrew/opt/go/libexec/src/net/fd_posix.go:55 +0x28
net.(*conn).Read(0x140016b81c0, {0x1400159a000?, 0x14001141bd8?, 0x1?})
	/opt/homebrew/opt/go/libexec/src/net/net.go:179 +0x34
go.etcd.io/etcd/pkg/transport.timeoutConn.Read({{0x106cf7120?, 0x140016b81c0?}, 0x14001141c78?, 0x1041334cc?}, {0x1400159a000?, 0x104133074?, 0x1400039b068?})
	/Users/pingcap/go/pkg/mod/go.etcd.io/[email protected]/pkg/transport/timeout_conn.go:43 +0xa8
net/http.(*persistConn).Read(0x140013dcc60, {0x1400159a000?, 0x104133570?, 0x140013caf00?})
	/opt/homebrew/opt/go/libexec/src/net/http/transport.go:1954 +0x50
bufio.(*Reader).fill(0x14001073d40)
	/opt/homebrew/opt/go/libexec/src/bufio/bufio.go:113 +0xf8
bufio.(*Reader).Peek(0x14001073d40, 0x1)
	/opt/homebrew/opt/go/libexec/src/bufio/bufio.go:151 +0x60
net/http.(*persistConn).readLoop(0x140013dcc60)
	/opt/homebrew/opt/go/libexec/src/net/http/transport.go:2118 +0x14c
created by net/http.(*Transport).dialConn in goroutine 316
	/opt/homebrew/opt/go/libexec/src/net/http/transport.go:1776 +0x1144

When we ignore leaks by their top stack frame, the dashboard leak is hidden along with the etcd one, so dashboard errors are never exposed. A better way to treat these two issues is to:

  • wait for the etcd timeout (the goroutine is still in the process of exiting)
  • troubleshoot the dashboard

So we need a more fine-grained check.
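
To illustrate why matching on the top stack frame is too coarse, here is a minimal, standard-library-only sketch. It is not the code in this PR: the helpers `frames`, `topIs`, and `anyIs` are hypothetical names, the two sample stacks are abbreviated copies of the traces above (file lines omitted), and treating every line that ends in `)` as a call frame is a simplifying assumption. Leak checkers such as go.uber.org/goleak expose similar rules (`IgnoreTopFunction`, and `IgnoreAnyFunction` in newer releases), but whether those fit this test suite is left open here.

```go
package main

import (
	"fmt"
	"strings"
)

// Abbreviated copies of the two traces from the PR description.
const dashboardStack = `goroutine 1362 [IO wait]:
internal/poll.runtime_pollWait(0x14dc55908, 0x72)
internal/poll.(*FD).Accept(0x14008d8e580)
net.(*TCPListener).Accept(0x140088d1840)
github.com/pingcap/tidb-dashboard/pkg/tidb.(*proxy).run(0x14008d9e1e0, {0x104f79778?, 0x14007fd5c20})`

const etcdStack = `internal/poll.runtime_pollWait(0x10f8cc2a0, 0x72)
internal/poll.(*FD).Read(0x140016ac400, {0x1400159a000, 0x1000, 0x1000})
net.(*conn).Read(0x140016b81c0, {0x1400159a000?, 0x14001141bd8?, 0x1?})
go.etcd.io/etcd/pkg/transport.timeoutConn.Read({{0x106cf7120?, 0x140016b81c0?}, 0x14001141c78?, 0x1041334cc?}, {0x1400159a000?, 0x104133074?, 0x1400039b068?})
net/http.(*persistConn).readLoop(0x140013dcc60)`

// frames returns the call-frame lines of one goroutine dump.
// Assumption: a call frame is any line ending in ")"; headers,
// file/line entries, and "created by" lines do not end that way.
func frames(stack string) []string {
	var out []string
	for _, line := range strings.Split(stack, "\n") {
		line = strings.TrimSpace(line)
		if strings.HasSuffix(line, ")") {
			out = append(out, line)
		}
	}
	return out
}

// topIs reports whether fn is the function on top of the stack.
func topIs(stack, fn string) bool {
	f := frames(stack)
	return len(f) > 0 && strings.HasPrefix(f[0], fn+"(")
}

// anyIs reports whether fn appears in any frame of the stack.
func anyIs(stack, fn string) bool {
	for _, f := range frames(stack) {
		if strings.HasPrefix(f, fn+"(") {
			return true
		}
	}
	return false
}

func main() {
	const top = "internal/poll.runtime_pollWait"
	const etcd = "go.etcd.io/etcd/pkg/transport.timeoutConn.Read"

	// A top-frame rule matches both stacks, so it hides the dashboard leak too.
	fmt.Println("top rule:", topIs(dashboardStack, top), topIs(etcdStack, top))
	// A rule keyed on the etcd frame matches only the etcd stack.
	fmt.Println("any rule:", anyIs(dashboardStack, etcd), anyIs(etcdStack, etcd))
}
```

In this sketch the top-frame rule matches both stacks, while the rule keyed on the etcd frame matches only the etcd stack and leaves the dashboard leak visible for troubleshooting.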

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Code changes

Release note

None.

HuSharp, Jan 30 '24 06:01

[REVIEW NOTIFICATION]

This pull request has not been approved.

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment. After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review. Reviewer can cancel approval by submitting a request changes review.

ti-chi-bot[bot], Jan 30 '24 06:01

Skipping CI for Draft Pull Request. If you want CI signal for your change, please convert it to an actual PR. You can still manually trigger a test run with /test all

ti-chi-bot[bot], Jan 30 '24 06:01

Codecov Report

Merging #7777 (654845c) into master (f0699ba) will increase coverage by 0.14%. Report is 71 commits behind head on master. The diff coverage is 55.69%.

:exclamation: Current head 654845c differs from pull request most recent head b97929a. Consider uploading reports for the commit b97929a to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7777      +/-   ##
==========================================
+ Coverage   73.45%   73.59%   +0.14%     
==========================================
  Files         432      433       +1     
  Lines       47843    47871      +28     
==========================================
+ Hits        35142    35230      +88     
+ Misses       9663     9620      -43     
+ Partials     3038     3021      -17     
Flag Coverage Δ
unittests 73.59% <55.69%> (+0.14%) :arrow_up:

Flags with carried forward coverage won't be shown.

codecov[bot], Jan 31 '24 13:01

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

ti-chi-bot[bot], Mar 29 '24 18:03