fix: enhance token server request handling and add max frame length validation to prevent memory issues
English Version
Describe what this PR does / why we need it
This PR enhances the token server's request handling and adds max frame length validation to prevent potential security risks and performance issues. When the company's security team performs port scanning using nmap, malformed packets may be sent to the token server. These packets may contain abnormally large length fields, which could cause the server to create extremely large byte arrays, leading to excessive memory consumption and Full GC issues.
Symptom
Even with maxFrameLength capped at 1024, decoding a packet that is mistaken for a ping request allocates a temporary byte array of about 16 MB, which puts pressure on the heap.
Reproduction Steps
- Start the demo server locally: com.alibaba.csp.sentinel.demo.cluster.ClusterServerDemo
- Install the nmap command-line tool locally: brew install nmap
- Run the nmap scan locally:
nmap -oX - 127.0.0.1 -p 11111 -T4 -sT -sV -Pn -n --host-timeout 300000ms --max-retries 1 --min-parallelism 16 --max-scan-delay 5s
- Set a breakpoint in com.alibaba.csp.sentinel.cluster.server.codec.data.PingRequestDataDecoder.decode to see the oversized decoded length
Root cause
According to nmap's service-probe rule file, a probe of type DNSVersionBindReqTCP is mis-decoded by the token server as a ping packet. https://raw.githubusercontent.com/nmap/nmap/refs/heads/master/nmap-service-probes
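The mis-decoding can be illustrated with a small, standalone sketch. Two things here are assumptions rather than facts taken from this PR: the probe bytes (a 2-byte DNS-over-TCP length prefix followed by a version.bind TXT/CHAOS query, as the DNSVersionBindReqTCP entry is commonly listed in nmap-service-probes) and the token-server field order (2-byte frame length, 4-byte xid, 1-byte type, then a 4-byte name length for ping). The class and method names are hypothetical, and plain java.nio.ByteBuffer stands in for Netty's ByteBuf.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: shows how a DNS probe could be read as a ping request.
final class ProbeMisdecodeSketch {

    /** Builds the assumed 32-byte probe: 2-byte length prefix + 30-byte DNS query. */
    static byte[] dnsVersionBindProbe() {
        ByteBuffer buf = ByteBuffer.allocate(32);   // big-endian by default
        buf.putShort((short) 30);                   // DNS-over-TCP length prefix (0x001E)
        buf.putShort((short) 0x0006);               // DNS transaction id
        buf.putShort((short) 0x0100);               // DNS flags
        buf.putShort((short) 1);                    // QDCOUNT = 1
        buf.putShort((short) 0).putShort((short) 0).putShort((short) 0); // AN/NS/AR counts
        buf.put((byte) 7).put("version".getBytes(StandardCharsets.US_ASCII));
        buf.put((byte) 4).put("bind".getBytes(StandardCharsets.US_ASCII));
        buf.put((byte) 0);                          // end of QNAME
        buf.putShort((short) 16);                   // QTYPE = TXT
        buf.putShort((short) 3);                    // QCLASS = CHAOS
        return buf.array();
    }

    /** Reads the bytes using the ASSUMED token-server layout: {frameLength, xid, type, length}. */
    static int[] decodeAsTokenRequest(byte[] wire) {
        ByteBuffer in = ByteBuffer.wrap(wire);
        int frameLength = in.getShort() & 0xFFFF;   // passes a 1024-byte frame limit
        int xid = in.getInt();                      // DNS id + flags read as the request id
        int type = in.get();                        // first QDCOUNT byte read as the message type
        int length = in.getInt();                   // next four bytes read as the ping name length
        return new int[] {frameLength, xid, type, length};
    }

    public static void main(String[] args) {
        int[] f = decodeAsTokenRequest(dnsVersionBindProbe());
        System.out.printf("frame=%d xid=%d type=%d length=%d%n", f[0], f[1], f[2], f[3]);
    }
}
```

Under these assumptions the frame is only 30 bytes, so it passes the frame-length check, yet the bytes 01 00 00 00 are read as a length of 16777216, which is consistent with the 16M allocation described above.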
Does this pull request fix one issue?
Fixes the issue where malformed packets with abnormally large length fields could cause excessive memory allocation and Full GC in the token server.
Describe how you did it
- Added a constant NETTY_MAX_FRAME_LENGTH with a value of 1024 in ServerConstants.java to define the maximum frame length allowed.
- Modified NettyTransportServer.java to use the NETTY_MAX_FRAME_LENGTH constant in the LengthFieldBasedFrameDecoder instead of a hardcoded value.
- Enhanced ParamFlowRequestDataDecoder.java to validate the string parameter length against the maximum frame length and throw an exception if it exceeds the limit.
- Improved PingRequestDataDecoder.java to detect and log abnormal packets that may be port scanning attempts.
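The checks listed above can be sketched roughly as follows. This is not the code in the PR: the class and method names are hypothetical, java.nio.ByteBuffer stands in for Netty's ByteBuf, System.err stands in for the project's logger, and the extra comparison against the remaining bytes is an assumption added for illustration. Only the constant name and its value of 1024 come from the description.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of validating a length field before allocating.
final class LengthGuardSketch {

    /** Name and value taken from the PR description. */
    static final int NETTY_MAX_FRAME_LENGTH = 1024;

    /** Ping-style handling: log and return null instead of allocating for a suspicious length. */
    static String decodePingName(ByteBuffer source) {
        if (source.remaining() < 4) {
            return null;
        }
        int length = source.getInt();
        if (length <= 0) {
            return null;
        }
        // Assumption: a length beyond the frame limit or the remaining bytes cannot be valid.
        if (length > NETTY_MAX_FRAME_LENGTH || length > source.remaining()) {
            System.err.println("[warn] abnormal ping length " + length + ", possibly a port scan; ignoring");
            return null;
        }
        byte[] bytes = new byte[length];            // bounded by NETTY_MAX_FRAME_LENGTH
        source.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Param-style handling: reject an invalid string parameter length with an exception. */
    static String decodeStringParam(ByteBuffer source) {
        int length = source.getInt();
        if (length < 0 || length > NETTY_MAX_FRAME_LENGTH || length > source.remaining()) {
            throw new IllegalArgumentException("invalid string parameter length: " + length);
        }
        byte[] bytes = new byte[length];
        source.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer ok = ByteBuffer.allocate(8);
        ok.putInt(4).put("demo".getBytes(StandardCharsets.UTF_8));
        ok.flip();
        System.out.println(decodePingName(ok));     // well-formed request decodes normally

        ByteBuffer bad = ByteBuffer.allocate(25);
        bad.putInt(0x01000000);                     // 16777216, far above the limit
        bad.rewind();
        System.out.println(decodePingName(bad));    // rejected before any large allocation
    }
}
```

The point of the design is that the length is checked before `new byte[length]` runs, so a hostile or accidental length field costs a log line rather than a 16 MB allocation.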
Describe how to verify it
- Start the Sentinel cluster token server
- Send a normal request to verify that the server functions properly
- Send a malformed packet with an abnormally large length field to simulate a port scanning attempt
- Verify that the server logs a warning message and handles the packet gracefully without creating large byte arrays
- Check that requests with string parameters exceeding the maximum frame length are rejected with an appropriate exception
Special notes for reviews
This fix addresses a potential security and performance issue where malformed packets could cause excessive memory allocation. The solution introduces proper validation and limits on packet sizes to prevent denial-of-service scenarios.
This issue could be exploited maliciously and affect the availability of microservice processes.