[BUG] No symbol table visible in agent memory pprof output
Search before asking
- [X] I had searched in the issues and found no similar feature requirement.
DeepFlow Component
Agent
What you expected to happen
I expect to be able to see the detailed memory usage from the heap pprof file.
How to reproduce
- Starting from branch v6.4, cherry-pick https://github.com/deepflowio/deepflow/pull/5280/commits/2036076c81f61e6087bc79b7eb1ffcffc2dd0180 ; the final changes are at https://github.com/qyzhaoxun/deepflow/tree/v6.4
- Build in a container
docker run --privileged --rm -it -v $(pwd):/deepflow hub.deepflow.yunshan.net/public/rust-build bash -c "cd /deepflow/agent && cargo build"
- Build the container image
FROM registry.cn-hongkong.aliyuncs.com/deepflow-ce/deepflow-agent:v6.4
RUN rm /usr/bin/deepflow-agent
ADD ./deepflow-agent.tgz /usr/bin/
- Capture the heap pprof file and generate an SVG
DeepFlow version
No response
DeepFlow agent list
v6.4 agent, started in standalone mode
Kubernetes CNI
Not applicable
Operation-System/Kernel version
"Ubuntu 22.04 LTS" Linux VM-11-12-ubuntu 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Anything else
No response
Are you willing to submit a PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
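I'd also like to make a feature request: could heap pprof be turned into a configuration option and merged into main and v6.4? It would be disabled by default, but could be enabled through configuration when needed.
@qyzhaoxun Are there any errors? Please also post the command you used to generate the SVG.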
jeprof --svg ./deepflow-agent ./agent.profile >profile.svg
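No errors.
Cargo.toml sets debug=true, so every agent build includes a symbol table; please check that. The heap profile above was generated from your branch and works. Also, running inside a container may be the problem; try running the agent directly on the host.
I'm running in a container environment. Is any extra configuration needed? @yuanchaoa
Also, are there any requirements for the build command? Do I need to add --release, i.e. cargo build --release?
Yes, use cargo build --release.
How should I run pprof when the agent is in a container? I collected the heap file from the container and then ran jeprof on the node. @yuanchaoa I built with --release, and the symbols still cannot be found.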
The agent's memory pprof is implemented with this library: https://crates.io/crates/jemalloc_pprof . One section of its documentation should help you:
@qyzhaoxun The following agent settings can be used to reduce memory usage (from https://github.com/deepflowio/deepflow/blob/main/server/agent_config/example.yaml ):
Which NICs cBPF captures from
## Regular Expression for TAP (Traffic Access Point)
## Length: [0, 65535]
## Default:
## Localhost: lo
## Common NIC: eth.*|en[osipx].*
## QEMU VM NIC: tap.*
## Flannel: veth.*
## Calico: cali.*
## Cilium: lxc.*
## Kube-OVN: [0-9a-f]+_h$
## Note: Regular expression of NIC name for collecting traffic
#tap_interface_regex: ^(tap.*|cali.*|veth.*|eth.*|en[osipx].*|lxc.*|lo|[0-9a-f]+_h)$
By default the lo interface is captured as well. If you don't need it, removing it from the regex reduces memory consumption.
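Before changing the agent config, the default pattern with the `lo` alternative removed can be checked against a few interface names. This is only a sketch in Python, not agent code: the interface names are invented for illustration, and it assumes the pattern behaves the same under Python's `re` module as it does in the agent's own regex engine.

```python
import re

# Default tap_interface_regex, copied from example.yaml.
DEFAULT_REGEX = r"^(tap.*|cali.*|veth.*|eth.*|en[osipx].*|lxc.*|lo|[0-9a-f]+_h)$"
# The same pattern with the `lo` alternative removed.
NO_LO_REGEX = r"^(tap.*|cali.*|veth.*|eth.*|en[osipx].*|lxc.*|[0-9a-f]+_h)$"


def captured(pattern, names):
    """Return the interface names that the pattern would select."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.match(name)]


# Hypothetical interface names, for illustration only.
nics = ["lo", "eth0", "ens3", "cali1234abcd", "docker0"]

print("default:   ", captured(DEFAULT_REGEX, nics))
print("without lo:", captured(NO_LO_REGEX, nics))
```

If the second list is what you want captured, the `NO_LO_REGEX` string is the value to put in `tap_interface_regex`.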
Which traffic cBPF ignores
## Traffic Capture Filter
## Length: [1, 512]
## Note: If not configured, all traffic will be collected. Please
## refer to BPF syntax: https://biot.com/capstats/bpf.html
#capture_bpf:
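If you know for certain that some traffic is of no interest, you can configure a BPF expression to filter it out.
cBPF packet capture truncation and application protocol parsing truncation ⭐️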
## Maximum Packet Capture Length
## Unit: bytes. Default: 65535. Range: [128, 65535]
## Note: DPDK environment does not support this configuration.
#capture_packet_size: 65535
## Protocol Identification Maximum Packet Length
## Default: 1024. Bpf Range: [256, 65535], Ebpf Range: [256, 8192]
## Note: The maximum data length used for application protocol identification,
## note that the effective value is less than or equal to the value of
## capture_packet_size.
#l7_log_packet_size: 1024
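Our application protocol parsing currently handles at most 8192 bytes, so both settings can be set to the same value somewhere between 1024 and 8192. Lowering capture_packet_size helps reduce memory usage.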
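The relationship between the two settings can be sketched as follows. This is an illustration of the config comments, not the agent's implementation: it assumes "the effective value is less than or equal to capture_packet_size" amounts to taking the minimum of the two, and both function names are hypothetical.

```python
def effective_l7_length(capture_packet_size, l7_log_packet_size):
    """Assumed behaviour: the parse length cannot exceed the capture length."""
    return min(capture_packet_size, l7_log_packet_size)


def unified_value_ok(value):
    """True if `value` lies in the 1024..8192 range suggested in this thread."""
    return 1024 <= value <= 8192


# Defaults: capture 65535 bytes, parse the first 1024.
print(effective_l7_length(65535, 1024))
# A capture length below l7_log_packet_size caps the parse length too.
print(effective_l7_length(512, 1024))
# Candidate value to use for both settings.
print(unified_value_ok(4096))
```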
Disable tunnel decapsulation attempts
## Decapsulation Tunnel Protocols
## Default: [1, 2], means VXLAN and IPIP. Options: 1 (VXLAN), 2 (IPIP), 3 (GRE), 4 (Geneve)
#decap_type:
#- 1
#- 2
This helps reduce CPU consumption.
Disable parsing of X-Forwarded-For, X-Request-ID, TraceID, and SpanID
## HTTP Real Client Key
## Default: X-Forwarded-For.
## Note: It is used to extract the real client IP field in the HTTP header,
## such as X-Forwarded-For, etc. Leave it empty to disable this feature.
#http_log_proxy_client: X-Forwarded-For
## HTTP X-Request-ID Key
## Default: X-Request-ID
## Note: It is used to extract the fields in the HTTP header that are used
## to uniquely identify the same request before and after the gateway,
## such as X-Request-ID, etc. This feature can be turned off by setting
## it to empty.
#http_log_x_request_id: X-Request-ID
## TraceID Keys
## Default: traceparent, sw8.
## Note: Used to extract the TraceID field in HTTP and RPC headers, supports filling
## in multiple values separated by commas. This feature can be turned off by
## setting it to empty.
#http_log_trace_id: traceparent, sw8
## SpanID Keys
## Default: traceparent, sw8.
## Note: Used to extract the SpanID field in HTTP and RPC headers, supports filling
## in multiple values separated by commas. This feature can be turned off by
## setting it to empty.
#http_log_span_id: traceparent, sw8
If you don't care about these fields in l7_flow_log, you can turn them off.
Reduce the cBPF buffer size ⭐️
###############
## AF_PACKET ##
###############
## AF_PACKET Blocks Switch
## Note: When tap_mode != 2, you need to explicitly turn on this switch to
## configure 'afpacket-blocks'.
#afpacket-blocks-enabled: false
## AF_PACKET Blocks
## Default: 128, Range: [8, +oo)
## Note: deepflow-agent will automatically calculate the number of blocks
## used by AF_PACKET according to max_memory, which can also be specified
## using this configuration item. The size of each block is fixed at 1MB.
#afpacket-blocks: 128
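By default a suitable afpacket-blocks value is calculated from max-memory (the chosen value is visible in the agent log). If you want to reduce memory further, you can set it explicitly. One block = 1MB.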
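Since each block is fixed at 1MB, the buffer cost of a given afpacket-blocks value is simple arithmetic. The sketch below assumes a single AF_PACKET ring; whether the agent allocates more than one ring is not stated here, and the function name is hypothetical.

```python
BLOCK_SIZE_MB = 1  # each AF_PACKET block is fixed at 1MB, per the config comment
MIN_BLOCKS = 8     # lower bound of the documented range [8, +oo)


def afpacket_buffer_mb(blocks):
    """Approximate AF_PACKET buffer memory for one ring, in MB."""
    if blocks < MIN_BLOCKS:
        raise ValueError("afpacket-blocks must be at least 8")
    return blocks * BLOCK_SIZE_MB


for blocks in (128, 64, 8):
    print(f"afpacket-blocks={blocks}: about {afpacket_buffer_mb(blocks)} MB")
```

Note that outside tap_mode 2 the explicit value only takes effect when afpacket-blocks-enabled is set to true, as the config comment above says.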
Reduce the eBPF buffer size ⭐️
## eBPF dispatch ring size
## Default: 65536. Range: [8192, 131072]
## Note: The size of the ring cache queue, The value is 2^n ( n range [13, 17] ).
## If the value is between 2^n and 2^(n+1), it will be automatically adjusted by the ebpf configurator to the minimum value (2^n).
#ring-size: 65536
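You can think of each unit here (a pointer) as referring to storage of at most l7_log_packet_size bytes (1KB by default). So with the default settings this can consume up to 64K * 1KB = 64MB of memory.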
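The worst-case estimate above can be reproduced for other ring-size values. This is a rough sketch under the same simplification (every slot holds a full l7_log_packet_size payload); the rounding follows the config comment, and the function names are hypothetical rather than taken from the agent.

```python
RING_MIN, RING_MAX = 8192, 131072  # documented range, 2^13 .. 2^17


def effective_ring_size(ring_size):
    """Round down to a power of two, as the config comment describes."""
    if not RING_MIN <= ring_size <= RING_MAX:
        raise ValueError("ring-size outside the documented range")
    return 1 << (ring_size.bit_length() - 1)


def worst_case_mb(ring_size, l7_log_packet_size=1024):
    """Upper bound in MB: every slot holds a full l7_log_packet_size payload."""
    return effective_ring_size(ring_size) * l7_log_packet_size / (1024 * 1024)


print(worst_case_mb(65536))         # default ring-size
print(worst_case_mb(8192))          # smallest allowed ring-size
print(effective_ring_size(100000))  # a non-power-of-two value is rounded down
```

With the defaults this gives the 64MB figure quoted above; dropping ring-size to 8192 brings the same bound down to 8MB.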
Other settings that can reduce data volume
https://deepflow.io/docs/zh/best-practice/reduce-storage-overhead/