go-control-plane icon indicating copy to clipboard operation
go-control-plane copied to clipboard

server: use separate goroutines for sotw bidi streams (#530)

Open rueian opened this issue 3 years ago • 21 comments

To address the #530, I follow the #451 and separate goroutines for both bidi streams.

I also reuse the same chan cache.Response for all watches and remove the reflection.

rueian avatar Dec 21 '21 09:12 rueian

@rueian can you rebase your branch of what was merged? The PR this was based off of is now in main

alecholmez avatar Jan 06 '22 20:01 alecholmez

@rueian can you rebase your branch of what was merged? The PR this was based off of is now in main

Sure, I have rebased it. Thanks.

rueian avatar Jan 07 '22 03:01 rueian

Hi @alecholmez, could you review this if you have free time?

rueian avatar Jan 10 '22 10:01 rueian

Hi @alecholmez, could you review this if you have free time?

Yes sorry will review this now

alecholmez avatar Jan 11 '22 20:01 alecholmez

Is there any other way we can approach a fix for the deadlock in linear cache without adding goroutines?

currently every resource update causes a response to be sent via sending on the responder channels, we could batch and coalesce updates with a timer to ensure multiple updates wont be blocked etc. before a response is actually sent

sunjayBhatia avatar Jan 12 '22 21:01 sunjayBhatia

also would be nice to have a test that demonstrates this issue, with the new refactor as well, having a harder time piecing together the sequence of events, honestly havent used/looked at the linearcache much myself

sunjayBhatia avatar Jan 12 '22 22:01 sunjayBhatia

Hi @sunjayBhatia,

Demonstrating the issue is not easy and it is kind of flaky. However, I have still tried to make a test case to do that.

The TestSOTWLinearCacheIntegrationDeadLock test case just tries to simulate what could happen to the LinearCache and SOTW server concurrently with many iterations. And I never pass the test with previous version of SOTW server.

Please let me know what you think about the test.

rueian avatar Jan 13 '22 16:01 rueian

taking a look 👍🏽

sunjayBhatia avatar Jan 13 '22 20:01 sunjayBhatia

Hi @alecholmez, @sunjayBhatia

Is there any thing I should do to let this PR be merged?

rueian avatar Jan 20 '22 16:01 rueian

@snowp can you actually give this a look too? I'd like more eyes on this

alecholmez avatar Feb 05 '22 05:02 alecholmez

I was wondering why the delta server won't be deadlocked when I wrote the TestSOTWLinearCacheIntegrationDeadLock test case. And it turns out that the delta server also uses separated goroutine: https://github.com/envoyproxy/go-control-plane/blob/main/pkg/server/delta/v3/server.go#L176-L181

rueian avatar Feb 14 '22 07:02 rueian

ping @snowp

alecholmez avatar Mar 04 '22 15:03 alecholmez

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

github-actions[bot] avatar Apr 03 '22 16:04 github-actions[bot]

not stale

rueian avatar Apr 03 '22 16:04 rueian

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

github-actions[bot] avatar May 03 '22 20:05 github-actions[bot]

not stale

rueian avatar May 04 '22 00:05 rueian

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

github-actions[bot] avatar Jun 03 '22 08:06 github-actions[bot]

Not staled

rueian avatar Jun 03 '22 08:06 rueian

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

github-actions[bot] avatar Jul 03 '22 16:07 github-actions[bot]

Not staled

rueian avatar Jul 03 '22 23:07 rueian

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

github-actions[bot] avatar Aug 03 '22 16:08 github-actions[bot]