[venus] Metrics monitoring for all components
Discussed in https://github.com/filecoin-project/venus/discussions/4950
Originally posted by hunjixin June 21, 2022
Checklist
- [X] This is not a new feature or an enhancement to the Filecoin protocol. If it is, please open an FIP issue.
- [X] This is not a new feature request. If it is, please file a feature request instead.
- [X] This is not brainstorming ideas. If you have an idea you'd like to discuss, please open a new discussion on the venus forum and select the category as Ideas.
- [X] I have a specific, actionable, and well motivated improvement to propose.
Venus component
- [X] venus daemon - [chain service] chain sync
- [X] venus auth - [chain service] authentication
- [X] venus messager - [chain service] message management (mpool)
- [X] venus gateway - [chain service] gateway
- [X] venus miner - [chain service] mining and block production
- [X] venus sealer/worker - sealing
- [X] venus sealer - proving (WindowPoSt)
- [X] venus market - storage deal
- [X] venus market - retrieval deal
- [X] venus market - data transfer
- [ ] venus light-weight client
- [ ] venus JSON-RPC API
- [ ] Other
Improvement Suggestion
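Add monitoring support for the key metrics of the venus components, so that operators and users can watch these metrics directly and intervene promptly to fix problems when an anomaly occurs.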
venus
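- Processing time for a new tipset
- Number of messages in the mpool
- Height, number of blocks, number of messages, weight
- Time difference between receiving a new block and the expected message
venus-messager
- For each address over a recent period: balance, nonce, and message status (how many are waiting to be packed, how many failed, how many failed messages are still awaiting handling)
- Per round: number of selected messages, pushed messages, messages waiting to be packed, and failed messages
- How many messages have been blocked for more than 3 minutes, and for more than 5 minutes
- Time until the block events received from venus stabilize, so that sync problems are detected promptly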
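The per-address status counts in the venus-messager list could be derived from a message list roughly as follows. This is only a minimal sketch: the `Msg` type, the state names, and `countByAddress` are hypothetical and are not taken from the venus-messager code.

```go
package main

import "fmt"

// MsgState is a hypothetical message state; the real venus-messager states may differ.
type MsgState string

const (
	UnFill MsgState = "unfill" // assumed: not yet filled in and signed
	Fill   MsgState = "fill"   // assumed: filled in, waiting to land on chain
	Failed MsgState = "failed"
)

// Msg is a hypothetical, simplified message record.
type Msg struct {
	From  string
	State MsgState
}

// countByAddress returns, for each sender address, the number of messages in each state.
func countByAddress(msgs []Msg) map[string]map[MsgState]int64 {
	out := make(map[string]map[MsgState]int64)
	for _, m := range msgs {
		if out[m.From] == nil {
			out[m.From] = make(map[MsgState]int64)
		}
		out[m.From][m.State]++
	}
	return out
}

func main() {
	msgs := []Msg{
		{From: "f01000", State: UnFill},
		{From: "f01000", State: Fill},
		{From: "f01000", State: Fill},
		{From: "f02000", State: Failed},
	}
	for addr, counts := range countByAddress(msgs) {
		fmt.Println(addr, counts[UnFill], counts[Fill], counts[Failed])
	}
}
```

Each inner count could then be reported as one gauge value tagged with the address.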
venus-gateway
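- Number of wallets connected to the gateway, number of addresses, IP location
- Number of miners connected to the gateway, number of addresses, IP location
- Number of signatures made through the gateway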
venus-market
- Number of storage/retrieval deals received within a period, and their success rate
venus-miner
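- Number of block production rights won within a period
- Proof computation time
- Signing time
- Time to fetch the mining base
- Pre-computation time
venus-auth/venus-wallet
To be decided, for security reasons.
How about these?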
Venus
- Block validation time
- Memory / CPU usage
- Number of goroutines
- IPLD block read latency
- Bandwidth usage
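For the process-level items in this list, the goroutine count and memory figures are available from the Go standard library. A minimal sketch follows; the `RuntimeSnapshot` type and `takeSnapshot` are illustrative names only, and CPU and bandwidth usage are not covered because they need OS-level or network-level sources.

```go
package main

import (
	"fmt"
	"runtime"
)

// RuntimeSnapshot holds a few process-level values that could be exported as gauges.
type RuntimeSnapshot struct {
	Goroutines int
	HeapAlloc  uint64 // bytes of allocated heap objects
	NumGC      uint32 // completed GC cycles
}

// takeSnapshot reads the current values from the Go runtime.
func takeSnapshot() RuntimeSnapshot {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return RuntimeSnapshot{
		Goroutines: runtime.NumGoroutine(),
		HeapAlloc:  ms.HeapAlloc,
		NumGC:      ms.NumGC,
	}
}

func main() {
	s := takeSnapshot()
	fmt.Printf("goroutines=%d heap_alloc_bytes=%d gc_cycles=%d\n", s.Goroutines, s.HeapAlloc, s.NumGC)
}
```

`runtime.ReadMemStats` briefly stops the world, so a sampling interval of several seconds is probably more appropriate than calling it on every request.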
Cluster
- windowPost computation time
- winningPost computation time
venus-market
- Deal transfers: speed, count, duration, status
- Retrieval transfers: speed, count, duration, status
https://github.com/filecoin-project/venus/issues/4960 https://github.com/filecoin-project/venus/issues/5054
lotus already has "Validated X messages (X per second)", which reports how many messages were validated within a few seconds.
- [ ] https://github.com/filecoin-project/lotus/pull/9052
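The work on the individual components is done. Next we need to push forward the work with farcast: on the one hand they inherit our metrics, and on the other we need to add some venus-specific metrics.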
venus-messager metrics
Address
# Address balance
WalletBalance = stats.Int64("wallet_balance", "Wallet balance", stats.UnitDimensionless)
# Nonce of the address as stored in the database
WalletDBNonce = stats.Int64("wallet_db_nonce", "Wallet nonce in db", stats.UnitDimensionless)
# Nonce of the address on chain
WalletChainNonce = stats.Int64("wallet_chain_nonce", "Wallet nonce on the chain", stats.UnitDimensionless)
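One plausible use of the two nonce measures is to watch how far the database nonce runs ahead of the chain nonce for an address, since a growing gap suggests messages are not landing. The sketch below is an assumption about how the values might be combined, not something the messager does itself; `nonceGap` is a hypothetical helper.

```go
package main

import "fmt"

// nonceGap returns how far the locally stored nonce is ahead of the on-chain nonce.
// It returns 0 when the chain nonce is equal to or ahead of the stored one.
func nonceGap(dbNonce, chainNonce uint64) uint64 {
	if dbNonce <= chainNonce {
		return 0
	}
	return dbNonce - chainNonce
}

func main() {
	fmt.Println(nonceGap(120, 115)) // 5 messages assigned a nonce but not yet on chain
	fmt.Println(nonceGap(115, 115)) // 0, fully caught up
}
```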
Message counts
# Number of unfill messages; can be grouped by address
NumOfUnFillMsg = stats.Int64("num_of_unfill_msg", "The number of unFill msg", stats.UnitDimensionless)
# Number of fill messages; can be grouped by address
NumOfFillMsg = stats.Int64("num_of_fill_msg", "The number of fill Msg", stats.UnitDimensionless)
# Number of failed messages
NumOfFailedMsg = stats.Int64("num_of_failed_msg", "The number of failed msg", stats.UnitDimensionless)
# Number of fill messages still not on chain after three minutes
NumOfMsgBlockedThreeMinutes = stats.Int64("blocked_three_minutes_msgs", "Number of messages blocked for more than 3 minutes", stats.UnitDimensionless)
# Number of fill messages still not on chain after five minutes
NumOfMsgBlockedFiveMinutes = stats.Int64("blocked_five_minutes_msgs", "Number of messages blocked for more than 5 minutes", stats.UnitDimensionless)
Per-round message selection
# Number of selected messages
SelectedMsgNumOfLastRound = stats.Int64("selected_msg_num", "Number of selected messages in the last round", stats.UnitDimensionless)
# Fill messages not yet on chain
ToPushMsgNumOfLastRound = stats.Int64("topush_msg_num", "Number of to-push messages in the last round", stats.UnitDimensionless)
# Number of expired messages
ExpiredMsgNumOfLastRound = stats.Int64("expired_msg_num", "Number of expired messages in the last round", stats.UnitDimensionless)
# Number of errored messages
ErrMsgNumOfLastRound = stats.Int64("err_msg_num", "Number of err messages in the last round", stats.UnitDimensionless)
head
# Time taken for the chain head to stabilize
ChainHeadStableDelay = stats.Int64("chain_head_stable_s", "Delay of chain head stabilization", stats.UnitSeconds)
venus-gateway
Wallet
# Wallet registration
WalletRegister = stats.Int64("wallet_register", "Wallet register", stats.UnitDimensionless)
# Wallet unregistration
WalletUnregister = stats.Int64("wallet_unregister", "Wallet unregister", stats.UnitDimensionless)
# Number of wallets
WalletNum = stats.Int64("wallet_num", "Wallet count", stats.UnitDimensionless)
# Number of addresses held by the wallet
WalletAddressNum = stats.Int64("wallet_address_num", "Address owned by wallet", stats.UnitDimensionless)
# Wallet source
WalletSource = stats.Int64("wallet_source", "Wallet IP", stats.UnitDimensionless)
# Address added to a wallet
WalletAddAddr = stats.Int64("wallet_add_addr", "Wallet add a new address", stats.UnitDimensionless)
# Address removed from a wallet
WalletRemoveAddr = stats.Int64("wallet_remove_addr", "Wallet remove an address", stats.UnitDimensionless)
# Number of wallet connections
WalletConnNum = stats.Int64("wallet_conn_num", "Wallet connection count", stats.UnitDimensionless)
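The wallet and address counts are gauge-style values that follow register and unregister events. A minimal in-memory sketch of that bookkeeping is below; the `Registry` type and its methods are hypothetical and do not reflect how venus-gateway tracks wallets internally.

```go
package main

import (
	"fmt"
	"sync"
)

// Registry is a hypothetical in-memory view of connected wallets and their addresses.
type Registry struct {
	mu      sync.Mutex
	wallets map[string]map[string]struct{} // wallet account -> set of addresses
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{wallets: make(map[string]map[string]struct{})}
}

// Register adds addresses for a wallet, creating the wallet entry if needed.
func (r *Registry) Register(wallet string, addrs ...string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.wallets[wallet] == nil {
		r.wallets[wallet] = make(map[string]struct{})
	}
	for _, a := range addrs {
		r.wallets[wallet][a] = struct{}{}
	}
}

// Unregister removes a wallet and all of its addresses.
func (r *Registry) Unregister(wallet string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.wallets, wallet)
}

// Counts returns the gauge values: number of wallets and total number of addresses.
func (r *Registry) Counts() (walletNum, addressNum int64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for _, addrs := range r.wallets {
		walletNum++
		addressNum += int64(len(addrs))
	}
	return walletNum, addressNum
}

func main() {
	r := NewRegistry()
	r.Register("wallet-a", "f3aaa", "f3bbb")
	r.Register("wallet-b", "f3ccc")
	r.Unregister("wallet-b")
	wallets, addrs := r.Counts()
	fmt.Println(wallets, addrs) // 1 2
}
```

Recomputing the counts from the registry on each change keeps the gauges correct even if an event is delivered twice, which is harder to guarantee with separate increment and decrement counters.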
Miner
# Miner registration
MinerRegister = stats.Int64("miner_register", "Miner register", stats.UnitDimensionless)
# Miner unregistration
MinerUnregister = stats.Int64("miner_unregister", "Miner unregister", stats.UnitDimensionless)
# Number of miners
MinerNum = stats.Int64("miner_num", "Miner count", stats.UnitDimensionless)
# Miner source
MinerSource = stats.Int64("miner_source", "Miner IP", stats.UnitDimensionless)
# Number of miner connections
MinerConnNum = stats.Int64("miner_conn_num", "Miner connection count", stats.UnitDimensionless)
API calls
# Time spent signing (milliseconds)
WalletSign = stats.Float64("wallet_sign", "Call WalletSign spent time", stats.UnitMilliseconds)
# Time spent listing wallet addresses (milliseconds)
WalletList = stats.Float64("wallet_list", "Call WalletList spent time", stats.UnitMilliseconds)
# Time spent computing the winning PoSt (milliseconds)
ComputeProof = stats.Float64("compute_proof", "Call ComputeProof spent time", stats.UnitMilliseconds)
# Time spent calling IsUnsealed (milliseconds)
IsUnsealed = stats.Float64("is_unsealed", "Call IsUnsealed spent time", stats.UnitMilliseconds)
# Time spent calling SectorsUnsealPiece (milliseconds)
SectorsUnsealPiece = stats.Float64("sectors_unseal_piece", "Call SectorsUnsealPiece spent time", stats.UnitMilliseconds)