Optimize data-node PostgreSQL usage
Feature Overview
We run a data node that uses PostgreSQL as its backend. After 230 days of running the network it uses 2.2TB of storage.
This already makes it hard to run an archival data node, and if the data growth is linear it works out to roughly 3.5TB per year (2.2TB / 230 days is about 9.6GB per day). There is also not much traffic on the network at the moment, so these numbers are likely a lower bound.
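For context, the overall database size can be checked directly in PostgreSQL:

```sql
-- Total on-disk size of the current database.
SELECT pg_size_pretty(pg_database_size(current_database()));
```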
However, most of the data comes from a single table: orders. The biggest tables are:
- Orders: 2.1TB
- Market data: 320GB
- Ledger: 90GB
- Balances: 36GB
The remaining tables are smaller.
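For reference, per-table sizes like the ones above can be measured with a standard PostgreSQL query (a minimal sketch; it assumes the data node's tables live in the `public` schema, and a partitioned or hypertable layout would need a schema-aware variant):

```sql
-- Largest tables, including their indexes and TOAST storage.
SELECT c.relname AS table_name,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'          -- ordinary tables only
  AND n.nspname = 'public'     -- assumed schema
ORDER BY pg_total_relation_size(c.oid) DESC
LIMIT 10;
```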
Maybe we can think about optimizing the orders table? One idea is to store data pointers instead of raw data, so that the fields repeated across versions of the same order are stored only once (see the sketch below). We should think about it.
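As a rough illustration of the data-pointer idea, the order history could be normalised into an immutable part stored once and a small per-update part that references it. This is a minimal sketch under that assumption; all table and column names here are hypothetical, not the actual data-node schema:

```sql
-- Hypothetical sketch: store the immutable part of an order once,
-- and have each update row point at it by ID instead of repeating
-- every field.
CREATE TABLE orders_static (
    id        BYTEA PRIMARY KEY,    -- order ID
    market_id BYTEA NOT NULL,
    party_id  BYTEA NOT NULL,
    side      SMALLINT NOT NULL
    -- ... other fields that never change across versions of an order
);

CREATE TABLE orders_updates (
    order_id  BYTEA NOT NULL REFERENCES orders_static (id),  -- the "data pointer"
    vega_time TIMESTAMPTZ NOT NULL,                          -- time of the update
    price     NUMERIC,
    size      BIGINT,
    remaining BIGINT,
    status    SMALLINT,
    PRIMARY KEY (order_id, vega_time)
);
```

Each order amendment would then cost one small row pointing at the shared static row, instead of a full copy of every order field.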