Question. Graph-node deployment takes too much space

Open statzky opened this issue 1 year ago • 0 comments

Hello everyone, I'm sorry if this is not the right place to ask.

I have a graph-node deployed on a virtual machine (in Docker), and I'm running into a problem with disk space consumption. My setup details are as follows:

Network: Polygon
Start block: 2,000,000
Current block: 59,606,297
Database size after initial indexing: 35 GB (at block 58,908,274)

The problem is that the database size increases by approximately 7 GB every day. Given the number of new blocks generated each day, this seems unusually high.
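To put rough numbers on "unusually high", here is a minimal sketch that compares storage per block during the initial sync with storage per block now. The block numbers and sizes come from the figures above; the 2-second Polygon block time is my assumption, not a measured value, so the result is only an order-of-magnitude estimate.

```python
# Figures taken from the setup details in this post.
START_BLOCK = 2_000_000
INDEXED_BLOCK = 58_908_274      # block at which the database was 35 GB
INITIAL_DB_GB = 35
DAILY_GROWTH_GB = 7

# ASSUMPTION: roughly one Polygon block every 2 seconds.
ASSUMED_BLOCK_TIME_S = 2.0

KB_PER_GB = 1024 ** 2

blocks_indexed = INDEXED_BLOCK - START_BLOCK
blocks_per_day = 86_400 / ASSUMED_BLOCK_TIME_S

# Average storage used per block during the initial sync.
initial_kb_per_block = INITIAL_DB_GB * KB_PER_GB / blocks_indexed
# Average storage used per block at the current growth rate.
current_kb_per_block = DAILY_GROWTH_GB * KB_PER_GB / blocks_per_day

ratio = current_kb_per_block / initial_kb_per_block

print(f"initial sync: {initial_kb_per_block:.2f} KB per block")
print(f"current rate: {current_kb_per_block:.2f} KB per block")
print(f"growth per block is about {ratio:.0f}x the initial average")
```

Under that assumption, each new block currently costs a few hundred times more disk space than a block did during the initial sync, which is why I doubt that subgraph data alone explains the growth.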

Here are some additional details:

I have 5 subgraphs deployed to my node.
Subgraphs are tracking events from certain smart contracts.
Using PostgreSQL for the database (nothing special, just like all other subgraphs).
My contracts have had no new transactions lately (for the past few weeks).

I suspect that the rapid increase in disk space usage might not be solely due to new blocks and their associated data. Some potential culprits or considerations might be:

Node configuration (everything is left at the defaults; maybe I should change something?).
Database indexing or maintenance issues.
Possible redundant data or "bloat" in the database.
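One way I could narrow these down is to ask Postgres which tables are largest and how many dead rows they hold. The sketch below is only an illustration: `largest_tables` is a hypothetical helper name, and it assumes a DB-API connection whose driver uses `%s` placeholders (psycopg2, for example). The catalog view `pg_stat_user_tables` and the function `pg_total_relation_size` are standard PostgreSQL.

```python
# Query the standard statistics view for the biggest tables, together with
# live/dead row counts and the time of the last autovacuum run.
TABLE_STATS_SQL = """
SELECT schemaname,
       relname,
       pg_total_relation_size(relid) AS total_bytes,
       n_live_tup,
       n_dead_tup,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT %s
"""


def largest_tables(conn, limit=20):
    """Return the `limit` largest tables as a list of row tuples.

    `conn` is assumed to be an open DB-API connection to the graph-node
    database (hypothetical helper, not part of graph-node itself).
    """
    cur = conn.cursor()
    try:
        cur.execute(TABLE_STATS_SQL, (limit,))
        return cur.fetchall()
    finally:
        cur.close()
```

If the largest tables show a high `n_dead_tup` relative to `n_live_tup`, bloat would be a plausible cause; if they are simply large with few dead rows, the growth is more likely real data being written.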

I know about GRAPH_ETHEREUM_CLEANUP_BLOCKS, but I want to use my graph-node for production purposes.

Should I use Postgres VACUUM or ANALYZE functionality?

Could anyone provide insights or suggestions on what might be causing this rapid increase in database size and how to optimize it?

Jul 20 '24 23:07 statzky