matrixone
                                
                                
                                
                                    matrixone copied to clipboard
                            
                            
                            
                        resubmit pipeline client max connections is too large
What type of PR is this?
- [ ] API-change
 - [ ] BUG
 - [x] Improvement
 - [ ] Documentation
 - [ ] Feature
 - [ ] Test and CI
 - [ ] Code Refactoring
 
Which issue(s) this PR fixes:
issue #12652
What this PR does / why we need it:
让stream rpc复用tcp连接,减少多cn之间的tcp连接数 可以将配置的最大连接数从10万降低到10.
上次被revert是因为checkin regression没有跑过,现在已经实现了shuffle join的阻塞控制,checkin regression已经可以跑过了。 准备多跑几轮checkin regression,然后再多观察观察多cn的tpch,以及长稳测试情况再合并。
@badboynt1 Thanks for your contributions!
Pull Request Review:
Title:
The title of the pull request is clear and concise, indicating that the maximum number of connections for the pipeline client is being reduced.
Body:
The body of the pull request provides information on the type of PR, the related issue, and the reason for the changes. It mentions that the change is aimed at reusing TCP connections for stream RPC to reduce the number of TCP connections between multiple CNs. Additionally, it states that the maximum connection number is being reduced from 100,000 to 10.
Changes:
- 
pkg/cnservice/cnclient/client.go:
- The change in this file modifies the 
NewStreamfunction to passfalseinstead oftruewhen creating a new stream, which likely affects the behavior of reusing TCP connections. - The constant 
dfMaxSenderNumberis reduced from 100,000 to 10, which limits the maximum number of senders. 
 - The change in this file modifies the 
 - 
pkg/sql/compile/scope.go:
- In the 
notifyAndReceiveFromRemotefunction, theClosemethod ofstreamSenderis now called withfalseinstead oftrue. 
 - In the 
 - 
pkg/sql/compile/scopeRemoteRunTypes.go:
- The 
closemethod inmessageSenderOnClientnow callsCloseonstreamSenderwithfalseinstead oftrue. 
 - The 
 
Feedback and Suggestions:
- 
Security Concern:
- The change in passing 
falseinstead oftruein theClosemethods across multiple files raises a security concern. It's crucial to ensure that closing connections properly and securely is maintained. Verify that changing this parameter does not introduce any vulnerabilities or unexpected behavior, especially related to resource management and connection handling. 
 - The change in passing 
 - 
Code Consistency:
- It's essential to maintain consistency in parameter usage across the codebase. If the parameter 
trueorfalsesignifies specific behavior, ensure it is consistently applied throughout the codebase to avoid confusion and potential bugs. 
 - It's essential to maintain consistency in parameter usage across the codebase. If the parameter 
 - 
Documentation:
- Consider updating the comments or documentation to reflect the changes made in the code. Clear documentation helps developers understand the purpose and behavior of the functions and constants being modified.
 
 - 
Testing:
- After making these changes, thorough testing is crucial to ensure that the functionality of reusing TCP connections and limiting the maximum number of senders works as expected without any regressions.
 
 - 
Optimization:
- While reducing the maximum sender number from 100,000 to 10 can be a valid optimization, it's essential to consider the impact on performance and scalability. Monitor the application's behavior under load to ensure the new limit does not adversely affect performance.
 
 - 
Error Handling:
- Verify that error handling mechanisms are in place for cases where connections fail to close properly. Ensure that any potential errors are appropriately handled to prevent resource leaks or unexpected behavior.
 
 - 
Code Review:
- Conduct a thorough code review to ensure that all changes align with the project's coding standards and best practices. Address any potential code smells or areas for improvement during the review process.
 
 
By addressing the security concern, maintaining code consistency, updating documentation, conducting thorough testing, optimizing performance, ensuring proper error handling, and conducting a comprehensive code review, the pull request can be enhanced to improve the quality and reliability of the codebase.
https://github.com/matrixorigin/ci-test/actions/runs/9121513159 连续跑了三次checkin regression都成功了
https://github.com/matrixorigin/ci-test/actions/runs/9185551821 最新的一次也成功了