graph-node
[Feature] Allow for both pruned and unpruned versions to be live simultaneously
Description
As the title suggests, allow for both pruned and unpruned versions to be live simultaneously.
This means Subgraph X would live in two different forms, pruned and unpruned. Most importantly, both must be accessible simultaneously, with the choice between them driven by what each query needs.
If the block/data range queried falls inside the range retained by the pruned version, send the query to the pruned version. If not, send the historical query to the unpruned version.
This has to be done with zero input from the infrastructure operator.
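The routing rule described above could be sketched roughly as follows. This is only an illustration of the requested behavior, not how graph-node works today: the `Deployment` class, its `earliest_block` / `latest_block` fields, and the `route` function are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    """Hypothetical description of one live copy of a subgraph."""
    name: str
    earliest_block: int  # oldest block for which history is still retained
    latest_block: int    # block the deployment has indexed up to


def route(pruned: Deployment, unpruned: Deployment,
          first_block: Optional[int] = None,
          last_block: Optional[int] = None) -> Deployment:
    """Pick the deployment that should serve a query.

    A query with no block constraint is treated as a query at the chain
    head and goes to the pruned copy. A query over [first_block, last_block]
    goes to the pruned copy only if the whole range is still retained there.
    """
    if first_block is None:
        return pruned
    last = first_block if last_block is None else last_block
    if first_block > last:
        raise ValueError("first_block must not exceed last_block")
    if pruned.earliest_block <= first_block and last <= pruned.latest_block:
        return pruned
    # Anything reaching back past the pruned history needs the full copy.
    return unpruned
```

The operator never appears in this decision: the only inputs are the query's block range and the history each copy retains, which matches the "zero input" requirement.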
Are you aware of any blockers that must be resolved before implementing this feature? If so, which? Link to any relevant GitHub issues.
No response
Some information to help us out
- [ ] Tick this box if you plan on implementing this feature yourself.
- [X] I have searched the issue tracker to make sure this issue is not a duplicate.
This would be very similar to split entity history. But while split entity history tries to achieve this within one subgraph, keeping separate subgraphs would treat the pruned and unpruned versions as fairly disconnected. Separate subgraphs will also cause more data duplication than split entity history, e.g., because immutable entities will have to be stored in both as duplicates.
> Separate subgraphs will also cause more data duplication than split entity history, e.g., because immutable entities will have to be stored for both as duplicates.
Finally giving ZFS dedup its time in the spotlight lol
@trader-payne can you clarify your goal here? It seems that you're not worried about database size, but want to serve fast queries with the pruned version?
@azf20 yes, correct. I want to have two (or maybe even more) versions of the same subgraph active at once. As a broad example, all for subgraph "A":
- subgraph A pruned, with degen database settings
- subgraph A unpruned with conservative database settings for OLTP
- subgraph A unpruned with database settings for long SQL queries for analytics etc
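To make the three variants above concrete, here is a minimal sketch of how they might be described and selected by workload. Everything in it is an assumption for illustration: the `Workload` categories, the `Variant` class, the labels, and the `variant_for` helper are hypothetical and do not correspond to any existing graph-node configuration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Sequence


class Workload(Enum):
    LATEST = "latest"          # fast queries near the chain head
    HISTORICAL = "historical"  # OLTP-style lookups at older blocks
    ANALYTICS = "analytics"    # long-running SQL queries


@dataclass(frozen=True)
class Variant:
    """Hypothetical description of one live version of subgraph A."""
    label: str
    pruned: bool
    serves: FrozenSet[Workload]


# Listed in order of preference: the first variant that serves a workload wins.
VARIANTS = (
    Variant("A-pruned-fast", True, frozenset({Workload.LATEST})),
    Variant("A-unpruned-oltp", False,
            frozenset({Workload.LATEST, Workload.HISTORICAL})),
    Variant("A-unpruned-analytics", False, frozenset({Workload.ANALYTICS})),
)


def variant_for(workload: Workload,
                variants: Sequence[Variant] = VARIANTS) -> Variant:
    """Return the first variant that declares support for the workload."""
    for variant in variants:
        if workload in variant.serves:
            return variant
    raise LookupError(f"no variant serves {workload.value!r} queries")
```

Each variant would sit on a database tuned for its own workload, which is the point of keeping them as separate live versions.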
OK cool. Would a read replica be an alternative approach for the third case?
Could be, but I've never tried. I don't know if you can have different Postgres settings in this case. Maybe @lutter might know. But I know replicas require a lot of things that I normally get rid of for better indexing/data ingestion speed.
No, a read replica won't work here since the data will be exactly the same as in the main database.
Looks like this issue has been open for 6 months with no activity. Is it still relevant? If not, please remember to close it.