Open Grant Proposal: `Data Onboarding Metrics`
Name of Project: Data Onboarding Metrics - Venus


Proposal Category: Choose one of core-dev, devtools-libraries


Proposer: ipfs-force-community


(Optional) Technical Sponsor:


Do you agree to open source all work you do on behalf of this RFP and dual-license under MIT, APACHE2, or GPL licenses?: Yes
Project Description
One of the problems that new SPs, and even many veteran SPs, face every day when they onboard large numbers of sectors is getting a clear picture of the health of their storage system so that they can diagnose what has gone wrong in their pipeline. Many things can go wrong when moving sectors through an SP's storage system, such as the chain head falling out of sync, messages getting stuck in the mpool, missed block-producing rounds, and high API latency. SPs have to navigate these anomalies all the time and respond to them quickly.


This is where Data Onboarding Metrics for Venus Filecoin comes into play. We propose to build a set of critical metrics for each component of Venus Filecoin that reflect the live health of a storage system, so that operators know what is going on in their systems and can react to different situations without relying on guesswork, digging through large volumes of logs, or extensive dev-ops experience.

Value

We see many benefits that Data Onboarding Metrics could bring to SPs, letting them take back control of their storage systems instead of spending a lot of time troubleshooting a black box. We believe metrics give SPs the tools to minimize the impact of operational errors, to check whether windowPost messages are sent out in time, to monitor the time and latency of PoST computation, and more, so that SPs are not unintentionally penalized by the protocol.


- Live heartbeat map of all critical information about your storage system
- Lower protocol penalties from mechanisms such as PCD, PoST slashes, missed blocks, etc.
- Easy integration with third-party monitoring solutions
Deliverables
| Milestone | Area | Deliverable | Funding |
|---|---|---|---|
| A | Spec | Begin collecting feedback from the community on which metrics they would like to see developed to ease their data-onboarding efforts | - |
| A | Spec | Initial design of the metrics system for each component: daemon, miner, messager, gateway, market, cluster | 5,000 |
| B | Impl | Implementation of metrics to be collected from miner component as the first MVP | 5,000 |
| B | Doc | Initial documentation on configurations and usages of the metrics system | 3,000 |
| C | Impl | Implementation of metrics to be collected from messager component | 3,000 |
| C | Impl | Implementation of metrics to be collected from gateway component | 3,000 |
| C | Impl | Implementation of metrics to be collected from daemon component, and completion of metrics for all other chain service components | 3,000 |
| D | Impl | Implementation of metrics to be collected from market component | 5,000 |
| E | Impl | Implementation of metrics to be collected from cluster component | 5,000 |
| F | Spec | Release full documentation on the metrics system along with practical tutorials | 5,000 |
| G | Spec | Collect community feedback on additional metrics they would like to have after their initial tryouts | - |
| H | Impl | Revisit each component and add new metrics that the community deems necessary | 6,000 |
| - | - | Project Management: A dedicated project management budget will help coordinate work between different collaborators, as well as outreach to key stakeholders and key users | 5,000 |
| - | - | Total | $48,000 |
Development Roadmap

The development can be loosely broken down into three parts: 1) design, 2) implementation, and 3) maintenance of the metrics system.
Design
This phase includes milestones A and B in the deliverables table above. The team will collect ideas from the community, conceive the first design of the metrics system, and build a POC/MVP for the miner component. An embedded exporter that allows custom configuration will be included for easier integration with third-party tools. A metrics module will be added to the miner project, which may contain the parameters below for SPs to monitor their storage pipeline.
// latency for GetBaseInfo API
GetBaseInfoDuration (Milliseconds)
// latency for ComputeTicket API
ComputeTicketDuration (Milliseconds)
// latency for IsRoundWinner API
IsRoundWinnerDuration (Milliseconds)
// latency for ComputeProof API
ComputeProofDuration (Seconds)
// number of blocks produced
NumberOfBlock (Dimensionless)
// number of rounds in which miner_id is the winner
NumberOfIsRoundWinner (Dimensionless)
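As an illustration of how the miner metrics above and an embedded exporter might fit together, here is a minimal sketch in Go using only the standard library. The `Registry` type, its methods, and the plain `name value` output (loosely modelled on the Prometheus text exposition format) are assumptions made for this example, not the Venus implementation. For simplicity the sketch stores every latency in milliseconds, whereas the list above gives ComputeProofDuration in seconds.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
	"time"
)

// Registry is a hypothetical in-memory store for the miner metrics listed
// above. It only illustrates the idea and is not the Venus implementation.
type Registry struct {
	mu       sync.Mutex
	counters map[string]int64   // dimensionless counters, e.g. NumberOfBlock
	sumMs    map[string]float64 // total observed latency per API, in milliseconds
	count    map[string]int64   // number of latency samples per API
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{
		counters: map[string]int64{},
		sumMs:    map[string]float64{},
		count:    map[string]int64{},
	}
}

// Inc adds one to a dimensionless counter.
func (r *Registry) Inc(name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.counters[name]++
}

// Observe records one latency sample under the given metric name.
func (r *Registry) Observe(name string, d time.Duration) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.sumMs[name] += float64(d) / float64(time.Millisecond)
	r.count[name]++
}

// Timed runs fn and records its wall-clock duration under name.
func (r *Registry) Timed(name string, fn func() error) error {
	start := time.Now()
	err := fn()
	r.Observe(name, time.Since(start))
	return err
}

// sortedKeys returns the keys of m in alphabetical order so output is stable.
func sortedKeys(m map[string]int64) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// Render returns all metrics as "name value" lines that an exporter
// endpoint could serve to a scraper.
func (r *Registry) Render() string {
	r.mu.Lock()
	defer r.mu.Unlock()
	var b strings.Builder
	for _, n := range sortedKeys(r.counters) {
		fmt.Fprintf(&b, "%s %d\n", n, r.counters[n])
	}
	for _, n := range sortedKeys(r.count) {
		fmt.Fprintf(&b, "%s_sum %g\n%s_count %d\n", n, r.sumMs[n], n, r.count[n])
	}
	return b.String()
}

func main() {
	reg := NewRegistry()
	// Stand-in for a real GetBaseInfo API call.
	_ = reg.Timed("GetBaseInfoDuration", func() error {
		time.Sleep(5 * time.Millisecond)
		return nil
	})
	reg.Inc("NumberOfIsRoundWinner")
	reg.Inc("NumberOfBlock")
	fmt.Print(reg.Render())
}
```

Keeping a sum and a sample count per API lets a monitoring tool derive an average latency over any scrape interval; a histogram would be needed to see tail latency.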
Implementation
This phase includes milestones C to E in the deliverables table above. The team will continue to collect ideas from the community while implementing the metrics system for the rest of the Venus components. The parameters that the metrics module will adopt are listed below.
messager
// The metrics below are updated per wallet address
WalletBalance (UnitDimensionless)
WalletDBNonce (UnitDimensionless)
WalletChainNonce (UnitDimensionless)
// Current number of messages that are waiting for venus-messager to fill out parameters like signature, gas usage, nonce etc.
// This metric is updated on a per wallet address granularity
NumOfUnFillMsg (UnitDimensionless)
// Current number of messages that venus-messager has filled out parameters like signature, gas usage, nonce etc.
// This metric is updated on a per wallet address granularity
NumOfFillMsg (UnitDimensionless)
// Current number of messages that venus-messager has failed to fill out parameters like signature, gas usage, nonce etc.
// This metric is updated on a per wallet address granularity
NumOfFailedMsg (UnitDimensionless)
// Current number of messages that have not been on chain for more than 3 minutes
NumOfMsgBlockedThreeMinutes (UnitDimensionless)
// Current number of messages that have not been on chain for more than 5 minutes
NumOfMsgBlockedFiveMinutes (UnitDimensionless)
// Number of messages selected by venus-messager during the last round of message pushing
SelectedMsgNumOfLastRound (UnitDimensionless)
// Number of messages pushed by venus-messager during the last round of message pushing
ToPushMsgNumOfLastRound (UnitDimensionless)
// Number of messages expired by venus-messager during the last round of message pushing
ExpiredMsgNumOfLastRound (UnitDimensionless)
// Number of messages that encountered errors during the last round of message pushing
ErrMsgNumOfLastRound (UnitDimensionless)
// Current difference between the chain head time and the system time of the venus-messager machine
ChainHeadStableDelay (UnitSeconds)
// Histogram of the difference between the chain head time and the system time of the venus-messager machine
ChainHeadStableDuration (UnitSeconds)
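To make the blocked-message and chain-head metrics more concrete, the following Go sketch shows one way they could be derived from a queue of pending messages. The `PendingMsg` type, the helper functions, and the sample data are hypothetical and chosen only for illustration; how venus-messager actually tracks pending messages is not described in this proposal.

```go
package main

import (
	"fmt"
	"time"
)

const (
	threeMinutes = 3 * time.Minute
	fiveMinutes  = 5 * time.Minute
)

// PendingMsg is a hypothetical record of a message that has been pushed
// but has not yet landed on chain.
type PendingMsg struct {
	ID       string
	PushedAt time.Time
}

// CountBlocked returns how many messages have been waiting longer than
// threshold as of now.
func CountBlocked(msgs []PendingMsg, now time.Time, threshold time.Duration) int {
	n := 0
	for _, m := range msgs {
		if now.Sub(m.PushedAt) > threshold {
			n++
		}
	}
	return n
}

// ChainHeadDelay returns, in seconds, how far the local clock is ahead of
// the chain head timestamp (given as Unix seconds).
func ChainHeadDelay(headUnix int64, now time.Time) float64 {
	return now.Sub(time.Unix(headUnix, 0)).Seconds()
}

// sampleQueue builds a fixed example: three messages pushed 1, 4, and 6
// minutes before now.
func sampleQueue() ([]PendingMsg, time.Time) {
	now := time.Unix(1700000000, 0)
	msgs := []PendingMsg{
		{ID: "msg-a", PushedAt: now.Add(-1 * time.Minute)},
		{ID: "msg-b", PushedAt: now.Add(-4 * time.Minute)},
		{ID: "msg-c", PushedAt: now.Add(-6 * time.Minute)},
	}
	return msgs, now
}

func main() {
	msgs, now := sampleQueue()
	fmt.Println("NumOfMsgBlockedThreeMinutes:", CountBlocked(msgs, now, threeMinutes)) // 2
	fmt.Println("NumOfMsgBlockedFiveMinutes:", CountBlocked(msgs, now, fiveMinutes))   // 1
	fmt.Println("ChainHeadStableDelay:", ChainHeadDelay(now.Unix()-45, now))           // 45
}
```

A message that is older than five minutes is counted in both blocked gauges, so the three-minute value is always at least as large as the five-minute value.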
gateway
// Number of wallets connected to the gateway
WalletCount
// Number of wallet addresses connected to the gateway
WalletAddressCount
// IP of each remote wallet connected to the gateway
WalletIPAddress
// Number of SPs connected to the gateway
SPCount
// Number of SP addresses connected to the gateway
SPAddressCount
// IP of each remote SP connected to the gateway
SPIPAddress
// Number of signatures the gateway initiated
SignCount
market
// Count of storage deals accepted
StorageDealAccepted
// Number of active data transfers
NumberOfActiveTransfer
// Speed of data transfer, per transfer, unit = Mbps
DataTransferSpeed
// The rate of successful data transfer
SuccessTransferRate
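The two derived market metrics above are simple ratios. The Go sketch below shows one possible way to compute them; the function names are hypothetical, and it assumes that Mbps means 10^6 bits per second and that the success rate is the number of successful transfers divided by all finished transfers.

```go
package main

import "fmt"

// TransferSpeedMbps converts a byte count moved over a number of seconds
// into megabits per second (assuming 1 Mb = 10^6 bits).
func TransferSpeedMbps(bytes int64, seconds float64) float64 {
	if seconds <= 0 {
		return 0 // no elapsed time yet, so no meaningful speed
	}
	return float64(bytes) * 8 / 1e6 / seconds
}

// SuccessRate returns the fraction of finished transfers that succeeded.
func SuccessRate(succeeded, total int) float64 {
	if total == 0 {
		return 0 // assumption: report 0 when nothing has finished yet
	}
	return float64(succeeded) / float64(total)
}

func main() {
	// 125,000,000 bytes in 10 seconds is 100 Mbps.
	fmt.Println("DataTransferSpeed:", TransferSpeedMbps(125000000, 10))
	// 3 successful transfers out of 4 is 0.75.
	fmt.Println("success rate:", SuccessRate(3, 4))
}
```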
daemon
TBD
cluster
// Count of new sectors, per miner_id
SectorManagerNewSector
// Count of preCommit, per miner_id
SectorManagerPreCommitSector
// Count of commit, per miner_id
SectorManagerCommitSector
// Time taken to compute winningPost, per miner_id, unit = Seconds
ProverWinningPostDuration
// Time taken to compute windowPost, per miner_id, unit = Minutes
ProverWindowPostDuration
// Completion rate for partitions that have passed windowPost, per miner_id
// E.g. ProverWindowPostCompleteRate=0.9 when 9 out of 10 partitions complete windowPost submission
ProverWindowPostCompleteRate
// Latency of sector manager API calls, unit = ms
APIRequestDuration
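The completion-rate example given for ProverWindowPostCompleteRate can be sketched in Go as follows. The `PartitionResult` type and the function name are hypothetical, and returning 0 for a miner with no partitions is an assumption of this sketch rather than a decision stated in the proposal.

```go
package main

import "fmt"

// PartitionResult is a hypothetical record of whether a partition's
// windowPost was submitted successfully.
type PartitionResult struct {
	Index     int
	Submitted bool
}

// WindowPostCompleteRate returns the fraction of partitions whose
// windowPost submission completed.
func WindowPostCompleteRate(parts []PartitionResult) float64 {
	if len(parts) == 0 {
		return 0 // assumption: avoid dividing by zero when there is nothing to prove
	}
	done := 0
	for _, p := range parts {
		if p.Submitted {
			done++
		}
	}
	return float64(done) / float64(len(parts))
}

func main() {
	// Ten partitions for one miner_id, with partition 3 failing to submit.
	parts := make([]PartitionResult, 10)
	for i := range parts {
		parts[i] = PartitionResult{Index: i, Submitted: i != 3}
	}
	fmt.Println("ProverWindowPostCompleteRate:", WindowPostCompleteRate(parts)) // 0.9
}
```

In practice the rate would be tracked per miner_id, for example by keeping one slice of results per miner and exporting the value with a miner_id label.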
Note that none of these metrics are final; more parameters may be added as the community sees fit.
Maintenance
This phase includes milestones F to H in the deliverables table above. The team will continue to collect ideas and feedback from the community while iterating on the metrics system for all Venus components. Documentation and easy-to-follow tutorials will be produced to help drive adoption of the metrics system by the broader community. We hope that by the end of this phase SPs will have the tools they need to remove obstacles when onboarding large numbers of sectors.
Total Budget Requested
The total budget requested is $48,000. The budget breakdown follows the deliverables of each milestone, defined above.

Maintenance and Upgrade Plans
The goal of the team is to support the metrics system long term, which includes continuously adding critical parameters that the community deems worth monitoring, thereby easing the process of onboarding large amounts of data to the network.

Team
Team Members
Force community engineering team
Team Member LinkedIn Profiles
Team Website
https://forcecommunity.io/ 

Relevant Experience
Force community has been an active contributor to the Web3 ecosystem, and to the Filecoin ecosystem in general. The engineering team from Force community has a track record of contributing code to Lotus as far back as Testnet and Space Race.

Team code repositories
https://github.com/ipfs-force-community 

Additional information
Force community is committed to becoming a major contributor to Web3 infrastructure, and we see Filecoin at the core of the big Web3 migration. We hope to accelerate Web3 adoption by contributing our software development capacity to the cause and working hand in hand with other ecosystem developers around the globe on this historic journey!