[FR] Enhance self-learning collection of http2/gRPC header key values
Search before asking
- [X] I had searched in the issues and found no similar feature requirement.
Description
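Requirement: DeepFlow v6.4 implemented high-performance decoding of HTTP2 compressed headers with eBPF kprobe, automatically learning the compression dictionary of both communicating peers. In practice, however, the collected custom headers are lost, reordered, or overwritten. I would like to solve the custom-header matching problem by collecting and matching on the value only.
Source article: https://www.deepflow.io/blog/zh/053-high-performance-decoding-of-http2-compressed-headers-using-ebpf-kprobe/
Limitations:
- For HTTP2 long-lived connections that already existed before deepflow-agent started, dynamic table entries created earlier cannot be decoded.
- When cBPF is used, packet loss, retransmission, and reordering in the network can make the reconstruction of compressed headers inaccurate (eBPF kprobe does not have this limitation).
- In actual tests on v6.5, the compression dictionary can apparently get out of order, so the collected keys and values no longer correspond.
Problem description: because the compression dictionary can get out of order, the collected keys and values do not correspond. Observed in testing with the following configuration: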
static_config:
  l7-protocol-advanced-features:
    extra-log-fields:
      http2:
      - field-name: "x-custom-code"
      - field-name: "x-custom-msg"
      - field-name: "x-custom-data"
Send an HTTP2/gRPC request:
:authority: www.xxxx.com
:method: POST
:path: /list?aid=6383&sdk_version=5.1.18_zip&device_platform=web&zip=1
:scheme: https
accept: */*
accept-encoding: gzip, deflate, br, zstd
accept-language: zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6
content-encoding: gzip
content-length: 5368
content-type: application/json; charset=utf-8
origin: https://www.xxxx.com
priority: u=1, i
referer: https://www.xxxx.com/
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36 Edg/129.0.0.0
x-custom-code: 200
x-custom-msg: success
x-custom-data: {"test": "data"}
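Technical background: https://kiosk007.top/post/http-2-0-header-compression/
The HTTP2 index table consists of the static table (RFC 7541) and the dynamic table.
Location of the server code that writes the data to storage: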
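To illustrate how an index table can drift, here is a minimal sketch of HPACK-style index lookup. It is only one possible way a key/value mismatch could arise, not a confirmed diagnosis of deepflow-agent. The `DynamicTable` class is hypothetical, ignores size-based eviction, and carries only a small excerpt of the RFC 7541 static table.

```python
from collections import deque

# Small excerpt of the RFC 7541 static table (the full table has 61 entries).
STATIC_TABLE_EXCERPT = {
    2: (":method", "GET"),
    3: (":method", "POST"),
    54: ("server", ""),
    58: ("user-agent", ""),
}
STATIC_TABLE_SIZE = 61


class DynamicTable:
    """Hypothetical, simplified dynamic table: newest entry first, no eviction."""

    def __init__(self):
        self.entries = deque()

    def insert(self, name, value):
        # New entries always take the lowest dynamic index (62).
        self.entries.appendleft((name, value))

    def lookup(self, index):
        if index <= STATIC_TABLE_SIZE:
            # Returns None for static indexes missing from the excerpt.
            return STATIC_TABLE_EXCERPT.get(index)
        position = index - STATIC_TABLE_SIZE - 1
        if position < len(self.entries):
            return self.entries[position]
        return None  # the observer never learned this entry


# The real peer has seen every insertion on the connection.
peer = DynamicTable()
for name in ("x-custom-code", "x-custom-msg", "x-custom-data"):
    peer.insert(name, "")

# An observer that missed the most recent insertion (assumed scenario).
observer = DynamicTable()
for name in ("x-custom-code", "x-custom-msg"):
    observer.insert(name, "")

print(peer.lookup(63))      # name the peer means by index 63
print(observer.lookup(63))  # name the observer resolves for index 63
```

Because every insertion shifts the existing dynamic indexes by one, a single missed insertion is enough for the observer to resolve the same index to a different header name than the peer intended.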
deepflow\server\ingester\flow_log\log_data\l7_flow_log.go
// AttributeNames and AttributeValues are parallel arrays.
// They map one-to-one as key => value: AttributeNames[i] => AttributeValues[i]
h.AttributeNames = append(h.AttributeNames, l.ExtInfo.AttributeNames...)
h.AttributeValues = append(h.AttributeValues, l.ExtInfo.AttributeValues...)
h.MetricsNames = append(h.MetricsNames, l.ExtInfo.MetricsNames...)
h.MetricsValues = append(h.MetricsValues, l.ExtInfo.MetricsValues...)
Examples of stored results:
# Case 1: correct (a minority of records)
AttributeNames = ["rpc_services","x-custom-code","x-custom-msg","x-custom-data"]
AttributeValues = ["xxx","200","success","{\"test\": \"data\"}"]
# Case 2: incorrect (the majority of records)
# x-custom-msg is overwritten by x-custom-code; the index table was decoded out of order
AttributeNames = ["rpc_services","x-custom-code","x-custom-code","x-custom-data"]
AttributeValues = ["xxx","200","success","{\"test\": \"data\"}"]
# x-custom-code is overwritten by x-custom-data; the index table was decoded out of order
AttributeNames = ["rpc_services","x-custom-data","x-custom-msg","x-custom-data"]
AttributeValues = ["xxx","200","success","{\"test\": \"data\"}"]
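Since the names and values are stored as parallel arrays, a misaligned record can be spotted by checking for duplicated or missing configured names. The following is a minimal sketch under that assumption; the helper names `find_duplicate_names` and `missing_fields` are hypothetical and not part of DeepFlow.

```python
def find_duplicate_names(attribute_names):
    """Return names that occur more than once, in first-seen order."""
    seen, duplicates = set(), []
    for name in attribute_names:
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return duplicates


def missing_fields(attribute_names, configured_fields):
    """Return configured field names that never appear in the record."""
    return [field for field in configured_fields if field not in attribute_names]


configured = ["x-custom-code", "x-custom-msg", "x-custom-data"]

# Case 2 from the examples above: x-custom-msg was replaced by x-custom-code.
names = ["rpc_services", "x-custom-code", "x-custom-code", "x-custom-data"]
values = ["xxx", "200", "success", '{"test": "data"}']

print(find_duplicate_names(names))        # duplicated keys
print(missing_fields(names, configured))  # configured keys that were lost

# Collapsing the arrays into a dict silently drops one of the values.
record = dict(zip(names, values))
print(record["x-custom-code"])
```

A record that reports a duplicate together with a missing configured field matches the overwrite pattern shown in case 2.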
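Proposed solution:
Idea: since the self-learned HTTP2 header index table still has shortcomings, start from values that have a distinctive format and complete the keys through configuration.
Start with a simple, general scenario, separator-based handling. Define one header:
# Defined header key: x-custom-content. The key itself carries no meaning; when Wireshark or DeepFlow cannot learn it, it shows up as unknown
# Specific string separator: !#!
x-custom-content: "200!#!success!#!{\"test\": \"data\"}"
# The protocol decoder may actually produce: unknown:"200!#!success!#!{\"test\": \"data\"}"
Add a configuration. Several schemes are possible; the following is the result after hands-on testing: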
static_config:
  l7-protocol-advanced-features:
    extra-log-fields:
      http2:
      - field-name: "x-custom-code"
        match-value-rule: "!#!"
        field-value-index: 0
      - field-name: "x-custom-msg"
        match-value-rule: "!#!"
        field-value-index: 1
      - field-name: "x-custom-data"
        match-value-rule: "!#!"
        field-value-index: 2
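Because values containing the special separator are rare, any header value that can be split by the separator into two or more parts is completed using the matching rules and the predefined keys.
The completed result is identical to a correctly self-learned header result, and the behavior is stable.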
AttributeNames = ["rpc_services","x-custom-code","x-custom-msg","x-custom-data"]
AttributeValues = ["xxx","200","success","{\"test\": \"data\"}"]
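The separator-based completion described above can be sketched as follows. This is only an illustration of the proposal, not DeepFlow's implementation: the rule keys mirror the proposed configuration, and the function name `complete_by_separator` is hypothetical.

```python
# Rules mirroring the proposed configuration (the rule format is an assumption).
SEPARATOR_RULES = [
    {"field-name": "x-custom-code", "match-value-rule": "!#!", "field-value-index": 0},
    {"field-name": "x-custom-msg", "match-value-rule": "!#!", "field-value-index": 1},
    {"field-name": "x-custom-data", "match-value-rule": "!#!", "field-value-index": 2},
]


def complete_by_separator(value, rules):
    """Split one header value by each rule's separator and pick the indexed part.

    A rule applies only if the value splits into at least two parts.
    Returns parallel (names, values) lists, like AttributeNames/AttributeValues.
    """
    names, values = [], []
    for rule in rules:
        parts = value.split(rule["match-value-rule"])
        if len(parts) < 2:
            continue  # the separator is absent, so this is not an injected value
        index = rule["field-value-index"]
        if index < len(parts):
            names.append(rule["field-name"])
            values.append(parts[index])
    return names, values


# The key may have been decoded as "unknown"; only the value is used.
names, values = complete_by_separator('200!#!success!#!{"test": "data"}', SEPARATOR_RULES)
print(names)
print(values)
```

Ordinary header values that do not contain the separator yield no fields, so they are left untouched by these rules.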
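Additional scenario: regular-expression matching (redundant-field approach)
# Defined header key: x-custom-content. Under the HTTP2 standard it is a dynamic-table field, and the decoded key carries no real meaning
# Specific string separator: !#!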
x-custom-code: "x-custom-code:200"
x-custom-msg: "x-custom-msg:success"
x-custom-data: "x-custom-data:{\"test\": \"data\"}"
# The protocol decoder may actually produce:
# unknown: "x-custom-code:200"
# unknown: "x-custom-msg:success"
# unknown: "x-custom-data:{\"test\": \"data\"}"
Add a configuration:
static_config:
  l7-protocol-advanced-features:
    extra-log-fields:
      http2:
      - field-name: "x-custom-code"
        match-value-rule: "^x-custom-code:(.*)"
        field-value-index: 0
      - field-name: "x-custom-msg"
        match-value-rule: "^x-custom-msg:(.*)"
        field-value-index: 0
      - field-name: "x-custom-data"
        match-value-rule: "^x-custom-data:(.*)"
        field-value-index: 0
Example pseudocode:
import re

input_string = "x-custom-msg:success"
pattern = r"^x-custom-msg:(.*)"
match = re.match(pattern, input_string)
if match:
    # group(1) is the text captured after the "x-custom-msg:" prefix
    result = match.group(1)
    print("Match succeeded!")
    print("Extracted content:", result)  # success
else:
    print("Match failed")
Result after matching and parsing:
# x-custom-code: "200"
# x-custom-msg: "success"
# x-custom-data: "{\"test\": \"data\"}"
AttributeNames = ["rpc_services","x-custom-code","x-custom-msg","x-custom-data"]
AttributeValues = ["xxx","200","success","{\"test\": \"data\"}"]
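Extending the single-pattern example to the full rule list might look like the sketch below. It assumes that `field-value-index: 0` refers to the first capture group, which the proposal does not state explicitly; the function name `complete_by_regex` is hypothetical.

```python
import re

# Rules mirroring the proposed configuration (the rule format is an assumption).
REGEX_RULES = [
    {"field-name": "x-custom-code", "match-value-rule": r"^x-custom-code:(.*)", "field-value-index": 0},
    {"field-name": "x-custom-msg", "match-value-rule": r"^x-custom-msg:(.*)", "field-value-index": 0},
    {"field-name": "x-custom-data", "match-value-rule": r"^x-custom-data:(.*)", "field-value-index": 0},
]


def complete_by_regex(header_values, rules):
    """Recover field names from header values whose keys decoded as unknown.

    Each value is tested against every rule; the first matching rule wins.
    Assumption: field-value-index N selects capture group N + 1.
    """
    compiled = [
        (rule["field-name"], re.compile(rule["match-value-rule"]), rule["field-value-index"])
        for rule in rules
    ]
    names, values = [], []
    for value in header_values:
        for field_name, pattern, group_index in compiled:
            match = pattern.match(value)
            if match:
                names.append(field_name)
                values.append(match.group(group_index + 1))
                break
    return names, values


# Values as they might arrive when every key was decoded as "unknown".
unknown_values = [
    "x-custom-code:200",
    "x-custom-msg:success",
    'x-custom-data:{"test": "data"}',
    "gzip, deflate, br, zstd",  # unrelated header value, matches no rule
]
names, values = complete_by_regex(unknown_values, REGEX_RULES)
print(names)
print(values)
```

Because the key name is repeated inside the value, the result does not depend on which key the index table resolved, at the cost of sending each name twice.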
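Note: when the HTTP2 static-table fields user-agent and server are used to carry the data, DeepFlow's collection is much more stable, but the corresponding server code has to be modified. Repurposing static-table fields does not conform to the protocol standard and is insecure, so please consider whether dynamic-table handling can be supported to cover custom HTTP2 header scenarios.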
@sharang
Use case
No response
Related issues
No response
Are you willing to submit a PR?
- [x] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
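@Fancyki1 The approach you describe is a good one. It effectively defines a convention for HTTP/gRPC header injection: by relying on a distinctive value format, everything that needs to be injected is packed into a single value.
Let us think about how to promote this approach at the specification level.
Does this problem also occur in versions earlier than 6.4?
@gbling The source article covers this: https://www.deepflow.io/blog/zh/053-high-performance-decoding-of-http2-compressed-headers-using-ebpf-kprobe/
Versions before 6.4 do not support this feature at all.
@Fancyki1 Just to confirm, does the same issue occur with HTTP1.1?
@gbling For HTTP1.1 you can implement the parsing with a wasm plugin; this feature is not needed there.
@Fancyki1 Here is the situation: when testing distributed tracing, we correlate traces through a custom http_log_x_request_id. Our internal calls all use HTTP1.1, and the traces are sometimes incomplete. I would like to clarify whether this feature takes effect only for HTTP2/gRPC or for HTTP1.1 as well.
@gbling Please read the documentation more carefully; it is all written there.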
## Configuration to extract the customized header fields of HTTP, HTTP2, GRPC protocol etc
#extra-log-fields:
## for example:
## http:
## - field-name: "user-agent"
## - field-name: "cookie"
# http: []
# http2: []
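With a version later than v6.4, configuring http enables this for HTTP1.1. HTTP1.1 has no HTTP2 index table, so it does not suffer from out-of-order or incomplete collection; just use it directly. You should also be clear about what you want to achieve. If the goal is distributed tracing, this feature is unrelated. If you want to use it to check whether every request carries http_log_x_request_id, it can help with troubleshooting.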