Optimize schema sync DB queries

Open kmcginnes opened this issue 9 months ago • 2 comments

Some users report timeouts when syncing the schema on larger databases.

We should investigate whether there is anything we can do to improve the chances of a successful sync.

Related Issues

  • #225

Steps today

  1. Fetch summary schema
  2. Fetch attributes for one node of each label
  3. Fetch attributes for one edge of each label

Gremlin

This is the query that runs after the summary query in my test environment. It fetches the vertex attributes:

g.V()
  .project(
    "Comment","Organization","vertex","software","Post","Airport2","Region2",
    "Forum","Country2","person","Tag","TagClass","Person","Place"
  )
  .by(V().hasLabel("Comment").limit(1))
  .by(V().hasLabel("Organization").limit(1))
  .by(V().hasLabel("vertex").limit(1))
  .by(V().hasLabel("software").limit(1))
  .by(V().hasLabel("Post").limit(1))
  .by(V().hasLabel("Airport2").limit(1))
  .by(V().hasLabel("Region2").limit(1))
  .by(V().hasLabel("Forum").limit(1))
  .by(V().hasLabel("Country2").limit(1))
  .by(V().hasLabel("person").limit(1))
  .by(V().hasLabel("Tag").limit(1))
  .by(V().hasLabel("TagClass").limit(1))
  .by(V().hasLabel("Person").limit(1))
  .by(V().hasLabel("Place").limit(1))
  .limit(1)

The query that gets the edge attributes:

g.E()
  .project(
    "islocatedIn","hasCreator","studyAt","hasTag","workAt","hasMember",
    "WITHIN","isPartOf","KNOWS","hasModerator","hasInterest","isLocatedIn",
    "isSubClass","containerOf","replyOf","hasType","knows","likes"
  )
  .by(V().bothE("islocatedIn").limit(1))
  .by(V().bothE("hasCreator").limit(1))
  .by(V().bothE("studyAt").limit(1))
  .by(V().bothE("hasTag").limit(1))
  .by(V().bothE("workAt").limit(1))
  .by(V().bothE("hasMember").limit(1))
  .by(V().bothE("WITHIN").limit(1))
  .by(V().bothE("isPartOf").limit(1))
  .by(V().bothE("KNOWS").limit(1))
  .by(V().bothE("hasModerator").limit(1))
  .by(V().bothE("hasInterest").limit(1))
  .by(V().bothE("isLocatedIn").limit(1))
  .by(V().bothE("isSubClass").limit(1))
  .by(V().bothE("containerOf").limit(1))
  .by(V().bothE("replyOf").limit(1))
  .by(V().bothE("hasType").limit(1))
  .by(V().bothE("knows").limit(1))
  .by(V().bothE("likes").limit(1))
  .limit(1)

Improvements

  • Batch vertices in groups of 10
  • Use union as mentioned in #225
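
As a rough illustration of how the two ideas above could combine, the sketch below splits the vertex labels into batches of 10 and builds one union-based query per batch. Everything in it is an assumption made for illustration: the helper names (`chunk`, `quote`, `vertexSampleQueries`) are hypothetical and not taken from graph-explorer, and the `g.inject(1).union(...)` shape is just one plausible way to use union, not necessarily what #225 proposes. The generated queries have not been run against any particular database.

```typescript
// Hypothetical helpers; names are illustrative and not from graph-explorer.

/** Split a list into consecutive batches of at most `size` items. */
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

/** Escape a label for use inside a double-quoted Gremlin string literal. */
function quote(label: string): string {
  return `"${label.replace(/\\/g, "\\\\").replace(/"/g, '\\"')}"`;
}

/**
 * Build one query per batch of vertex labels. Each query samples a single
 * vertex per label through a union of independent branches.
 * Assumption: the target database accepts a mid-traversal V() step.
 */
function vertexSampleQueries(labels: string[], batchSize = 10): string[] {
  return chunk(labels, batchSize).map((batch) => {
    const branches = batch
      .map((label) => `    __.V().hasLabel(${quote(label)}).limit(1)`)
      .join(",\n");
    return `g.inject(1)\n  .union(\n${branches}\n  )`;
  });
}

// Usage with the vertex labels from the test environment above.
const vertexLabels = [
  "Comment", "Organization", "vertex", "software", "Post", "Airport2",
  "Region2", "Forum", "Country2", "person", "Tag", "TagClass", "Person",
  "Place",
];
const batchedQueries = vertexSampleQueries(vertexLabels);
console.log(batchedQueries.join("\n\n"));
```

With the 14 labels above this produces two queries, one covering 10 labels and one covering the remaining 4, so each request touches fewer labels and a timeout in one batch would not have to fail the whole sync.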

kmcginnes, May 06 '24 23:05

Please see #226 and #225

dsaban-lightricks, May 07 '24 06:05

@dsaban-lightricks Thank you!

We will consider that approach when we start work on this issue.

kmcginnes, May 07 '24 23:05