
[Feature] Allow for both pruned and unpruned versions to be live simultaneously

Open trader-payne opened this issue 1 year ago • 8 comments

Description

As the title suggests, allow for both pruned and unpruned versions to be live simultaneously.

This means I have Subgraph X living in two different forms, pruned and unpruned. Most importantly, both must be accessible simultaneously, depending on what each query needs.

If the block/data range queried falls inside the pruned version's range, send the query to it. If not, send the historical query to the unpruned version.

This has to be done with zero input from the infrastructure operator.
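
As a rough illustration of the routing rule described above, here is a minimal Python sketch. It is not graph-node code: the `Deployment` type, the `pick_deployment` function, and the block numbers are all hypothetical, and it assumes each copy of the subgraph can report the earliest and latest block it can answer for.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    """One live copy of a subgraph (hypothetical type, not a graph-node API)."""
    name: str
    earliest_block: int  # oldest block this copy can still answer for
    latest_block: int    # newest block this copy has indexed


def pick_deployment(pruned: Deployment, unpruned: Deployment,
                    query_block: Optional[int] = None) -> Deployment:
    """Prefer the pruned copy whenever it still holds the requested block."""
    # Assumption: a query without an explicit block targets the latest
    # state, which the pruned copy is assumed to retain.
    if query_block is None:
        return pruned
    if pruned.earliest_block <= query_block <= pruned.latest_block:
        return pruned
    # Anything outside the pruned range is treated as a historical query.
    return unpruned


# Illustrative block numbers only.
pruned = Deployment("A-pruned", earliest_block=19_000_000, latest_block=19_010_000)
unpruned = Deployment("A-unpruned", earliest_block=0, latest_block=19_010_000)

recent = pick_deployment(pruned, unpruned, 19_005_000)    # inside the pruned range
historical = pick_deployment(pruned, unpruned, 1_000_000)  # older than the pruned range
```

The decision uses only the ranges the deployments report themselves, which is one way the "zero input from the operator" requirement might be met.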

Are you aware of any blockers that must be resolved before implementing this feature? If so, which? Link to any relevant GitHub issues.

No response

Some information to help us out

  • [ ] Tick this box if you plan on implementing this feature yourself.
  • [X] I have searched the issue tracker to make sure this issue is not a duplicate.

trader-payne avatar Jan 15 '24 18:01 trader-payne

This would be very similar to split entity history, but while split entity history tries to achieve this within one subgraph, keeping separate subgraphs would treat the pruned and unpruned versions as fairly disconnected. Separate subgraphs will also cause more data duplication than split entity history, e.g., because immutable entities will have to be stored for both as duplicates.

lutter avatar Jan 15 '24 18:01 lutter

> Separate subgraphs will also cause more data duplication than split entity history, e.g., because immutable entities will have to be stored for both as duplicates.

Finally giving ZFS dedup its time in the spotlight lol

trader-payne avatar Jan 15 '24 18:01 trader-payne

@trader-payne can you clarify your goal here? It seems that you're not worried about database size, but want to serve fast queries with the pruned version?

azf20 avatar Feb 02 '24 13:02 azf20

@azf20 yes, correct. I want to have two (or maybe even more) versions of the same subgraph active at once. As a broad example, all for subgraph "A":

  • subgraph A pruned, with degen database settings
  • subgraph A unpruned with conservative database settings for OLTP
  • subgraph A unpruned with database settings for long SQL queries for analytics etc

trader-payne avatar Feb 02 '24 13:02 trader-payne

OK cool. Would a read replica be an alternative approach for the third case?

azf20 avatar Feb 02 '24 14:02 azf20

Could be, but I've never tried. I don't know if you can have different Postgres settings in this case. Maybe @lutter might know. But I know replicas require a lot of things that I normally get rid of, for better indexing/data ingestion speed.

trader-payne avatar Feb 02 '24 14:02 trader-payne

No, a read replica won't work here since the data will be exactly the same as in the main database.

lutter avatar Feb 07 '24 19:02 lutter

Looks like this issue has been open for 6 months with no activity. Is it still relevant? If not, please remember to close it.

github-actions[bot] avatar Aug 18 '24 00:08 github-actions[bot]