stacks-blockchain-api

Adding Endpoint to further improve fee estimation (taking what's in the mempool into account)

Open Hero-Gamer opened this issue 3 years ago • 14 comments

Is your feature request related to a problem? Please describe.

  • During high mempool traffic, fee estimation could be improved by taking the transactions currently in the mempool into account.

Describe the solution you'd like

  • Add an endpoint that takes yet-to-be-processed mempool transactions into account (i.e. one that looks at the future as well as the past). @whoabuddy has produced an approach we could reference, described as follows: Here's one approach I've used for calculating fees. It gets the average and median of the mempool, then averages those two, and can apply a multiplier: https://github.com/citycoins/scripts/blob/feat/implement-v2/src/get-network-status.ts

  • It's a newer version - so it requires: npx ts-node src/script_name

Sample output:

✔ Check all TX? (default: first 200) … yes

currentBlock: 67334
mempoolTxCount: 205
Processed 200 of 202
Processed 202 of 202
maxFee: 10.000000 STX
avgFee: 0.183126 STX
median: 0.003000 STX
multiplier: 1
optimalFee: 0.093063 STX

Describe alternatives you've considered

  • A clear and concise description of any alternative solutions or features you've considered.

Additional context

  • I'll just add a minor point that, many tx in the mempool are not really valid txs, either outdated or they could just be hanging there due to bad nonce, or past obsolete nonces, etc.

Side note:

  • maybe this endpoint might help the Explorer to display block fullness visually (similar to mempool.space), as per conversation here: https://discord.com/channels/621759717756370964/911531946738339900/991858648538173511

cc: @whoabuddy @wileyj @saralab

Hero-Gamer avatar Jul 11 '22 15:07 Hero-Gamer

+1 to a new endpoint at the very least that averages the tx fee more accurately.

wileyj avatar Jul 11 '22 15:07 wileyj

This approach has been working well for me including during times of higher congestion.

Using only the average values can be a bit misleading when you do get a few accounts that spend unexpectedly high fees, and this script does nothing to verify that transactions in the mempool will be processed.

I'll just add a minor point that, many tx in the mempool are not really valid txs, either outdated or they could just be hanging there due to bad nonce, or past obsolete nonces, etc.

Invalid transactions should be removed after 256 blocks iirc, so while this fact is true, they can still be a valid data point for attempted transactions. They could be balanced out by the current fee estimation technique of looking at past transactions.

Should the title be updated here? I feel like the conversation is about a different/better fee estimate endpoint versus block fullness?

whoabuddy avatar Jul 11 '22 16:07 whoabuddy

Our current fee estimation accounts for a variety of scenarios and aspects. +1 Basing Fee Estimates on average is not the best approach!

We have a rather elaborate fee estimator where:

  • The fee rates are weighted by transaction size during calculation
  • The fee estimator computes estimates from data from the past 5 blocks by default, although the exact size of the window is configurable by miners.
  • The fee estimator returns a “fuzzed” version of the low, middle, and high estimates: some random noise is added to the baseline estimate.
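To make the three points above concrete, here is a rough sketch of a size-weighted percentile over a window of recent blocks, with noise added at the end. It is not the node's actual implementation: the names `FeeSample`, `weightedPercentile`, `fuzz` and `estimateFeeRates`, the 5th/50th/95th percentiles, and the 5% noise bound are all assumptions chosen for illustration.

```typescript
// One mined transaction: its fee rate and its size, which is used as the weight.
interface FeeSample {
  feeRate: number;
  size: number;
}

// Percentile where each sample counts in proportion to its size.
function weightedPercentile(samples: FeeSample[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a.feeRate - b.feeRate);
  const totalWeight = sorted.reduce((sum, s) => sum + s.size, 0);
  const target = (p * totalWeight) / 100;
  let cumulative = 0;
  for (const s of sorted) {
    cumulative += s.size;
    if (cumulative >= target) return s.feeRate;
  }
  return sorted[sorted.length - 1]!.feeRate;
}

// Multiply by a random factor in [1 - maxFraction, 1 + maxFraction] (hypothetical fuzzing rule).
function fuzz(value: number, maxFraction: number, random: () => number = Math.random): number {
  const noise = (random() * 2 - 1) * maxFraction;
  return value * (1 + noise);
}

// Low / middle / high estimates from the last `windowSize` blocks.
function estimateFeeRates(
  blocks: FeeSample[][],
  windowSize = 5,
  maxFuzz = 0.05,
  random: () => number = Math.random,
) {
  if (windowSize < 1) throw new Error("windowSize must be at least 1");
  const recent = blocks.slice(-windowSize);
  const samples = ([] as FeeSample[]).concat(...recent);
  return {
    low: fuzz(weightedPercentile(samples, 5), maxFuzz, random),
    middle: fuzz(weightedPercentile(samples, 50), maxFuzz, random),
    high: fuzz(weightedPercentile(samples, 95), maxFuzz, random),
  };
}
```

Passing the random source as a parameter is only there so the sketch can be checked deterministically.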

This blog post has some great detail on how the Fee Estimator works: https://www.hiro.so/blog/improved-fee-estimations-in-the-hiro-wallet-and-stacks-api

And if you prefer a Video: https://youtu.be/0NqE4PMhWqc

cc: @zone117x

saralab avatar Jul 12 '22 15:07 saralab

Yep, I think the API could help out here, but not exactly sure how. A simple average of fees may work in some cases, but the couple references above which work with Bitcoin tx fees are not very applicable in this scenario. To be precise, Bitcoin txs & fees are almost entirely dependent on a very simple metric of "byte size". Stacks is more akin to Ethereum gas prices. And instead of one parameter (byte size), several tx parameters are used to calculate fee: write count, write length, read count, read length, runtime, and size in number of bytes.
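As a rough illustration of why a single "byte size" number is not enough here, the sketch below models the six parameters listed above and collapses them into one "share of a block" figure by taking the most constrained dimension. The field names, the `blockFraction` helper and the choice of `max` are assumptions for this example, not how the node necessarily does it, and the limits are supplied by the caller rather than taken from any real configuration.

```typescript
// The six cost dimensions named above. Field names are illustrative.
interface TxCost {
  writeCount: number;
  writeLength: number;
  readCount: number;
  readLength: number;
  runtime: number;
  sizeBytes: number;
}

const DIMENSIONS: (keyof TxCost)[] = [
  "writeCount",
  "writeLength",
  "readCount",
  "readLength",
  "runtime",
  "sizeBytes",
];

// Fraction of a block this tx would use, taken as its largest per-dimension share.
// Assumption: a block counts as full once any single dimension reaches its limit.
function blockFraction(cost: TxCost, limit: TxCost): number {
  return Math.max(...DIMENSIONS.map((dim) => cost[dim] / limit[dim]));
}
```

Two transactions with the same byte size can therefore take very different shares of a block, which is why a fee per byte alone can mislead.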

The Stacks Blockchain node RPC endpoint /v2/fees/transfer is easily able to calculate fee estimates using historical, mined txs, because the relatively expensive work to calculate those parameters has already been done. In this case, this repo (the Stacks Blockchain API) cannot realistically calculate those same parameters for the mempool transactions.
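For reference, a minimal sketch of calling that endpoint from a client. It assumes the response body is a bare JSON number giving a fee rate per byte; the `nodeUrl` argument, the `FetchLike` type and both helper names are hypothetical and exist only for this example.

```typescript
// Minimal shape of a fetch-style function, so the sketch does not depend on a runtime.
type FetchLike = (
  url: string,
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Ask the node for its transfer fee rate (assumed to be a plain JSON number).
async function fetchTransferFeeRate(nodeUrl: string, fetchImpl: FetchLike): Promise<number> {
  const res = await fetchImpl(`${nodeUrl}/v2/fees/transfer`);
  if (!res.ok) {
    throw new Error(`fee request failed with status ${res.status}`);
  }
  const body = await res.json();
  if (typeof body !== "number" || !Number.isFinite(body)) {
    throw new Error("unexpected response from /v2/fees/transfer");
  }
  return body;
}

// Turn a per-byte rate into a fee for a transaction of a given serialized size.
function feeForTransaction(feeRatePerByte: number, txSizeBytes: number): number {
  return Math.ceil(feeRatePerByte * txSizeBytes);
}
```

In a runtime with a global `fetch`, usage would look something like `fetchTransferFeeRate(nodeUrl, fetch).then((rate) => feeForTransaction(rate, 180))`. Note that this only reflects historical, mined transactions, which is the limitation being discussed.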

My feeling is that, ideally, the Stacks core node needs a new endpoint which has this ability, similar to its "mock miner" mode. It could emit those tx parameters in the mempool events.

But I'm curious to hear from @kantai @gregorycoppola @pavitthrap about possible (probably more heuristic based?) options that could be implemented in the API.

zone117x avatar Jul 12 '22 15:07 zone117x

@saralab I think the current fee estimator works great, but since it only points to historical data as it mentions in the article, finding a competitive fee can be difficult in times of high congestion especially during the initial spike.

While I think there are definitely grounds to look at something more complex like what @zone117x is suggesting, I also think there's some value in the basic data that's available from the current transactions in the mempool. While it doesn't take into account the different cost dimensions I do think the information is useful.

In the script linked above, it:

  • calculates the average fee of all tx in the mempool
  • calculates the median fee of all tx in the mempool
  • calculates the average of the median and average above
  • (optionally) applies a multiplier, although in practice this hasn't been needed
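The four steps above can be sketched roughly as follows. This is a simplified reconstruction from the description, not the linked script itself; `summarizeMempoolFees` and `MempoolFeeSummary` are names made up for this example, and fees are assumed to be in microSTX.

```typescript
// Summary statistics over mempool fees, all in microSTX (assumed unit).
interface MempoolFeeSummary {
  maxFee: number;
  avgFee: number;
  medianFee: number;
  optimalFee: number;
}

function summarizeMempoolFees(fees: number[], multiplier = 1): MempoolFeeSummary {
  if (fees.length === 0) {
    throw new Error("no mempool transactions to summarize");
  }
  const sorted = [...fees].sort((a, b) => a - b);

  // Step 1: average fee of all tx in the mempool.
  const total = sorted.reduce((sum, fee) => sum + fee, 0);
  const avgFee = total / sorted.length;

  // Step 2: median fee (mean of the two middle values for an even count).
  const mid = Math.floor(sorted.length / 2);
  const medianFee =
    sorted.length % 2 === 1 ? sorted[mid]! : (sorted[mid - 1]! + sorted[mid]!) / 2;

  // Steps 3 and 4: average the two, then apply the optional multiplier.
  const optimalFee = ((avgFee + medianFee) / 2) * multiplier;

  return { maxFee: sorted[sorted.length - 1]!, avgFee, medianFee, optimalFee };
}
```

With the sample output earlier in the thread, an average of 0.183126 STX and a median of 0.003000 STX give (0.183126 + 0.003000) / 2 = 0.093063 STX, which matches the reported optimalFee at a multiplier of 1.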

Instead of trying to compute and suggest a fee based on these values, maybe it's just better to expose them in a simpler way?

e.g. when querying fees in respect to the mempool:

{
  "total_mempool_txs": 202,
  "maxFee": 10000000,
  "avgFee": 183126,
  "medianFee": 3000
}

This data could also be broken into 5th, 50th and 95th percentiles similar to the current fee estimator, or whatever type of grouping that may make it easier to estimate fees during higher congestion.
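A percentile breakdown like that could look something like the sketch below. It uses the nearest-rank definition of a percentile, which is just one reasonable choice; the `percentile` and `mempoolFeePercentiles` helpers and the `p5`/`p50`/`p95` field names are hypothetical.

```typescript
// Nearest-rank percentile over an ascending-sorted array, for 0 < p <= 100.
function percentile(sortedAsc: number[], p: number): number {
  if (sortedAsc.length === 0) throw new Error("no values");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const rank = Math.ceil((p * sortedAsc.length) / 100);
  return sortedAsc[Math.max(rank, 1) - 1]!;
}

// 5th, 50th and 95th percentile of mempool fees (fees assumed to be in microSTX).
function mempoolFeePercentiles(fees: number[]) {
  const sorted = [...fees].sort((a, b) => a - b);
  return {
    p5: percentile(sorted, 5),
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
  };
}
```

Compared with a plain average, the 50th and 95th percentiles are much less sensitive to a few accounts paying unexpectedly high fees.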

whoabuddy avatar Jul 12 '22 18:07 whoabuddy

I'd invite everyone to join me at the next degen mint (by "us" I mean Stacks users: not Hiro users or builders, but average community members who are not builders and are just NFT traders) and try to mint 10 or 20 NFTs (yes, that's what we do, from multiple wallets) over a few blocks, starting from the block height of the initial spike.

Once you experience it first hand, you will get a better feel for what Stacks users need in practice on the battlefield. :)

Agree with this comment very much "I think the current fee estimator works great, but since it only points to historical data as it mentions in the article, finding a competitive fee can be difficult in times of high congestion especially during the initial spike."

Hero-Gamer avatar Jul 12 '22 18:07 Hero-Gamer

Instead of trying to compute and suggest a fee based on these values, maybe it's just better to expose them in a simpler way?

e.g. when querying fees in respect to the mempool:

{
  "total_mempool_txs": 202,
  "maxFee": 10000000,
  "avgFee": 183126,
  "medianFee": 3000
}

This would be easy to implement in the API, but I'm not sure how it will be used. Do we expect wallets to use these values to calculate tx fees? If so, how would a wallet decide when to use this vs the existing fee estimation endpoint?

More importantly, if any product (like a wallet) decided to use these values to determine fees, does this work in an adversarial environment? Doesn't this make it trivial to perform front-running? E.g. someone could flood the mempool with extremely low tx fees, so that when you try to mint an NFT (with a low tx fee due to this simple averaging), the adversary can easily front-run you with a higher tx fee?

There's probably additional adversarial behaviors that this could open the door for.

I'd like to get feedback from the @hirosystems/blockchain-team on this before we commit to implementing this.

zone117x avatar Jul 14 '22 15:07 zone117x

I think the ideal way to incorporate the current mempool into fee estimation is to provide an estimate based on the estimated fee rate required for block inclusion -- i.e., this fee rate will include the transaction in the next block, this fee rate will include the transaction in the second block, etc.

To calculate that, you need to be able to estimate both the fee rate and the cost of transactions in the mempool (because that tells you how much of the block is being taken up). I think this could be calculated from the API side, by essentially re-implementing the fee estimation from the blockchain node (the fee estimation code only reads event API data, so it's possible), and then extending that code to peek at the mempool and estimate block fill. This wouldn't be a small undertaking, but it is possible.
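A very simplified sketch of the block-fill idea: sort the mempool by fee rate, pack transactions into consecutive blocks, and report the lowest fee rate that made it into each block. It assumes each pending transaction already has a known `cost` expressed as a fraction of one block, which is exactly the hard part described above; `PendingTx` and `inclusionThresholds` are hypothetical names.

```typescript
// A pending tx with its fee and its estimated cost as a fraction of one block (0 < cost <= 1).
interface PendingTx {
  fee: number;
  cost: number;
}

// For each of the next `blockCount` blocks, the lowest fee rate (fee / cost) packed into it.
// 0 means the block still has spare room after the whole mempool has been packed.
function inclusionThresholds(mempool: PendingTx[], blockCount: number): number[] {
  const sorted = mempool
    .filter((tx) => tx.cost > 0 && tx.cost <= 1)
    .sort((a, b) => b.fee / b.cost - a.fee / a.cost);

  const thresholds: number[] = Array.from({ length: blockCount }, () => 0);
  let block = 0;
  let used = 0;
  let exhausted = true;

  for (const tx of sorted) {
    if (used + tx.cost > 1) {
      // Current block is full: move on to the next one.
      block += 1;
      used = 0;
      if (block >= blockCount) {
        exhausted = false;
        break;
      }
    }
    if (block >= blockCount) {
      exhausted = false;
      break;
    }
    used += tx.cost;
    thresholds[block] = tx.fee / tx.cost;
  }

  // If the mempool ran out with room left in the last block touched, any fee rate gets in.
  if (exhausted && block < blockCount && used < 1) {
    thresholds[block] = 0;
  }
  return thresholds;
}
```

The packing here is strictly in fee-rate order and never skips ahead to fit a smaller transaction, so it only approximates what a real miner would do.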

kantai avatar Jul 14 '22 15:07 kantai

If you add mempool transactions, would that open up "attack vectors"? Of course, users always have the option to set the fee manually, but influencing the default could be lucrative.

One attack vector could be that miners influence the wallet's default fees by sending thousands of bogus transactions, just to fill the mempool with high-fee transactions that can never be processed (for example, transactions with out-of-order nonces).

If you only include transactions that have actually been mined it is much harder to influence… as it would cost you the actual fees.

Perhaps another possible solution could be to monitor the total number of transactions in the mempool and then, when it goes over a threshold (i.e. >2000), base the average fee for the wallet on a shorter timeframe (1 or 2 blocks instead of the usual amount (50?)). I think in that case you’ll get fee estimates more fitting to the situation.
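Sketched in code, that idea might look like this. The defaults (2000 transactions, 2 blocks, 50 blocks) are just the tentative numbers from the comment above, and `selectWindowSize` and `averageRecentFee` are hypothetical helpers that only use fees from already-mined blocks.

```typescript
// Pick how many recent blocks to average over, based on mempool size.
function selectWindowSize(
  mempoolTxCount: number,
  threshold = 2000,
  normalWindow = 50,
  congestedWindow = 2,
): number {
  return mempoolTxCount > threshold ? congestedWindow : normalWindow;
}

// Average fee over the most recent `windowSize` mined blocks (oldest block first).
function averageRecentFee(blockFees: number[][], windowSize: number): number {
  if (windowSize < 1) throw new Error("windowSize must be at least 1");
  const recent = ([] as number[]).concat(...blockFees.slice(-windowSize));
  if (recent.length === 0) throw new Error("no mined transactions in window");
  return recent.reduce((sum, fee) => sum + fee, 0) / recent.length;
}
```

Because the mempool is only used as a congestion signal and the fees themselves still come from mined transactions, flooding the mempool can shorten the window but cannot inject made-up fee values.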

314159265359879 avatar Jul 14 '22 17:07 314159265359879

+1

Make it so!

unclemantis avatar Jul 15 '22 07:07 unclemantis

Thanks @314159265359879 So I guess the question becomes:

Is the negative impact of A, having the wallet's gas fee suggestion adjust automatically and frequently (based on unconfirmed txs in the mempool), greater than the negative impact of B, not adjusting the fee as frequently (based on confirmed blocks only, which leaves txs pending during mempool spikes, with users perhaps relying on third-party tools like the one Jason made to get more live gas suggestions and enter them manually)?

I can see how A, if built into the wallet, could have a greater negative impact than B. Any thoughts?

Hero-Gamer avatar Jul 16 '22 18:07 Hero-Gamer

may I suggest bumping this to a P1 or P2?

unclemantis avatar Jul 26 '22 01:07 unclemantis

My understanding so far is that the Stacks node would need to have a new feature where it performs a (temporary) mine/evaluation of each mempool tx, and emit the associated cost data to the API so that it could then try to factor that into some cost estimation algorithm.

I think @kantai suggested an alternative approach where the API implements/ports some cost estimation capabilities that the Stacks node currently uses. I'm still not sure exactly what that would entail or if it's reasonable for us to implement anytime soon.

Additionally, regardless of the above approaches/issues, there still seem to be unresolved security/reliability problems around mempool tx fee estimation, which I discussed in my last comment and which @314159265359879 also discussed above.

I'm not sure what actionable items the API could do right now to help with this issue.

zone117x avatar Aug 01 '22 15:08 zone117x

To add from my side - I didn't think of the security implications of this and agree that while this method can give a good estimate it would not be reliable for production with simply the avg + median.

For the ideas on the node side, there is work going on at https://github.com/stacks-network/stacks-blockchain/issues/3229 to optimize how the miner walks the mempool. Maybe there is some overlap there that could help generate/expose data for reliable mempool fee estimation?

whoabuddy avatar Aug 01 '22 15:08 whoabuddy

Closing this due to inactivity. Please reopen if this is still desired.

smcclellan avatar Sep 08 '23 21:09 smcclellan