janusgraph
janusgraph copied to clipboard
Add batch query support for drop step [tp-tests]
- Reuse multi-query optimization for TinkerPop
DropStep
- Change restriction on eligible multi-query traversals and allow multi-query optimizations to be used for queries with steps which contain
drop()
step - Add release template for JanusGraph 1.1.0
Benchmarks:
CQLMultiQueryDropBenchmark.dropVertices N/A true N/A 5000 avgt 5 98.069 ± 24.496 ms/op
CQLMultiQueryDropBenchmark.dropVertices N/A false N/A 5000 avgt 5 1451.917 ± 90.617 ms/op
CQLMultiQueryDropBenchmark.dropVerticesGremlinQuery N/A true N/A 5000 avgt 5 106.444 ± 38.290 ms/op
CQLMultiQueryDropBenchmark.dropVerticesGremlinQuery N/A false N/A 5000 avgt 5 1452.070 ± 130.414 ms/op
Conclusion: same as with direct usage of multi-query the usage of Gremlin drop()
step benefits from re-using multi-query capabilities. Performance of drop
query processing improved more than 13 times.
TODO:
- Update from
TinkerPop
3.7.3-SNAPSHOT to 3.7.3 when it's released.
Related https://github.com/apache/tinkerpop/pull/2609 Related https://github.com/apache/tinkerpop/pull/2612
Thank you for contributing to JanusGraph!
In order to streamline the review of the contribution we ask you to ensure the following steps have been taken:
For all changes:
- [ ] Is there an issue associated with this PR? Is it referenced in the commit message?
- [ ] Does your PR body contain #xyz where xyz is the issue number you are trying to resolve?
- [x] Has your PR been rebased against the latest commit within the target branch (typically
master
)? - [x] Is your initial contribution a single, squashed commit?
For code changes:
- [x] Have you written and/or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
- [ ] If applicable, have you updated the LICENSE.txt file, including the main LICENSE.txt file in the root of this repository?
- [ ] If applicable, have you updated the NOTICE.txt file, including the main NOTICE.txt file found in the root of this repository?
For documentation related changes:
- [x] Have you ensured that format looks appropriate for the output in which it is rendered?
Currently I'm stuck on the following two TinkerPop tests:
Error: Failures:
Error: org.apache.tinkerpop.gremlin.process.traversal.step.filter.DropTest$Traversals.g_E_propertiesXweightX_drop
Error: Run 1: DropTest$Traversals>DropTest.g_E_propertiesXweightX_drop:98->DropTest.lambda$g_E_propertiesXweightX_drop$1:98 expected:<0> but was:<1>
Error: Run 2: DropTest$Traversals>DropTest.g_E_propertiesXweightX_drop:98->DropTest.lambda$g_E_propertiesXweightX_drop$1:98 expected:<0> but was:<1>
[INFO]
Error: org.apache.tinkerpop.gremlin.process.traversal.step.filter.DropTest$Traversals.g_V_properties_propertiesXstartTimeX_drop
Error: Run 1: DropTest$Traversals>DropTest.g_V_properties_propertiesXstartTimeX_drop:109->DropTest.lambda$g_V_properties_propertiesXstartTimeX_drop$3:109 expected:<0> but was:<1>
Error: Run 2: DropTest$Traversals>DropTest.g_V_properties_propertiesXstartTimeX_drop:109->DropTest.lambda$g_V_properties_propertiesXstartTimeX_drop$3:109 expected:<0> but was:<2>
The first test checks that drop()
works for Edge properties. The second one checks that drop()
works for meta properties.
For some reason those two TinkerPop tests are failing with this PR. However, I verified that the removal operation works in this PR by adding testMetaPropertiesDrop
and testEdgePropertiesDrop
tests (similarly as in those TinkerPop tests).
It can be seen that these added tests are passing. For now the reason for these two TinkerPop tests to fail is unknown to me.
Additional pair of eyes would me much appreciated.
TinkerPop tests for inmemory database can be triggered with:
mvn clean install --projects janusgraph-inmemory -DskipTests=true --batch-mode --also-make
mvn verify --projects janusgraph-inmemory -Dtest.skip.tp=false -Dtest=org.janusgraph.blueprints.inmemory.process.InMemoryMultiQueryJanusGraphProcessTest -pl janusgraph-inmemory
One of the observations is that if I change the default barrier step size from the default 2500
to 1
(query.batch.limited-size=1
) then all tests passes.
I.e. it means that the fact that we store those properties in the barrier before we actually remove them changes the result of those two tests.
Currently I'm stuck on the following two TinkerPop tests:
Error: Failures: Error: org.apache.tinkerpop.gremlin.process.traversal.step.filter.DropTest$Traversals.g_E_propertiesXweightX_drop Error: Run 1: DropTest$Traversals>DropTest.g_E_propertiesXweightX_drop:98->DropTest.lambda$g_E_propertiesXweightX_drop$1:98 expected:<0> but was:<1> Error: Run 2: DropTest$Traversals>DropTest.g_E_propertiesXweightX_drop:98->DropTest.lambda$g_E_propertiesXweightX_drop$1:98 expected:<0> but was:<1> [INFO] Error: org.apache.tinkerpop.gremlin.process.traversal.step.filter.DropTest$Traversals.g_V_properties_propertiesXstartTimeX_drop Error: Run 1: DropTest$Traversals>DropTest.g_V_properties_propertiesXstartTimeX_drop:109->DropTest.lambda$g_V_properties_propertiesXstartTimeX_drop$3:109 expected:<0> but was:<1> Error: Run 2: DropTest$Traversals>DropTest.g_V_properties_propertiesXstartTimeX_drop:109->DropTest.lambda$g_V_properties_propertiesXstartTimeX_drop$3:109 expected:<0> but was:<2>
The first test checks that
drop()
works for Edge properties. The second one checks thatdrop()
works for meta properties. For some reason those two TinkerPop tests are failing with this PR. However, I verified that the removal operation works in this PR by addingtestMetaPropertiesDrop
andtestEdgePropertiesDrop
tests (similarly as in those TinkerPop tests). It can be seen that these added tests are passing. For now the reason for these two TinkerPop tests to fail is unknown to me. Additional pair of eyes would me much appreciated.TinkerPop tests for inmemory database can be triggered with:
mvn clean install --projects janusgraph-inmemory -DskipTests=true --batch-mode --also-make mvn verify --projects janusgraph-inmemory -Dtest.skip.tp=false -Dtest=org.janusgraph.blueprints.inmemory.process.InMemoryMultiQueryJanusGraphProcessTest -pl janusgraph-inmemory
I noticed, I am not sure if it is related, but you might have missed adding JanusGraphDropStep
here:
https://github.com/JanusGraph/janusgraph/blob/a2e3521d28de37ca01d08166777d77c0943a9db5/janusgraph-backend-testutils/src/main/java/org/janusgraph/graphdb/database/LazyLoadGraphTest.java#L40
I noticed, I am not sure if it is related, but you might have missed adding JanusGraphDropStep here:
JanusGraphDropStep
is replaced as part of JanusGraphLocalQueryOptimizerStrategy
. So I believe it should be fine.
I feels that those two TinkerPop tests don't like that I'm storing properties for removal in the barrier step for some reason.
However, I can't figure out "why" yet.
After a bit of debugging I found the root cause of the problem.
When removing Edge properties I see that each edge property is wrapped into B_O_Traverser
which is passed to NoOpBarrier
step. That traverser is added to the TraverserSet
here.
After I loaded the same graph as in TinkerPop tests locally I noticed that 4 out of 6 edge properties were removed, but 2 didn't because they were considered the "same" properties as were already added into the TraverserSet
(thus their Traverser's merged).
The root cause is that relation
is not checked on equality for SimpleJanusGraphProperty
. This bug was introduced 9 years ago here: https://github.com/JanusGraph/janusgraph/commit/795d7b0131c1e62fdf354e0a8e4718d6627f1418
The fix should be a simple one and I will add a separate test case with barrier()
+ drop()
of edge properties with the same key
and value
.
After a bit of debugging I found the root cause of the problem. When removing Edge properties I see that each edge property is wrapped into
B_O_Traverser
which is passed toNoOpBarrier
step. That traverser is added to theTraverserSet
here. After I loaded the same graph as in TinkerPop tests locally I noticed that 4 out of 6 edge properties were removed, but 2 didn't because they were considered the "same" properties as were already added into theTraverserSet
(thus their Traverser's merged). The root cause is thatrelation
is not checked on equality forSimpleJanusGraphProperty
. This bug was introduced 9 years ago here: 795d7b0The fix should be a simple one and I will add a separate test case with
barrier()
+drop()
of edge properties with the samekey
andvalue
.
Huh. Seems I was a bit wrong. If I fix that drop
tests then dedup
tests start to fail. Hmm. Now I'm confused where the fix should happen.
Seems I'm missing how barrier
step suppose to work. It could be that we actually don't have any bug which I described above and it's actually intended to be like that (at least I verified that the behavior is consistent between TinkerGraph and JanusGraph, so I guess we are fine there).
However, I think it makes sense to double check with the TinkerPop community regrading barrier
and dedup
behavior. Thus, I opened a thread here: https://discord.com/channels/838910279550238720/1241219665376706603
If that is confirmed to be the intended behavior then I will try to think about a bit different solution.
One of the potential solutions is to make out own NoOpBarrierStep
which skips anything except Vertex
and insert that barrier step as part of multi-query optimization instead of the standard NoOpBarrierStep
.
However, there are many places in code logic where we need to check and sometimes cast a step to NoOpBarrierStep
. It would make sense to extend NoOpBarrierStep
and overwrite part of the logic, but NoOpBarrierStep
is currently final
and the necessary properties are private
which makes things a bit challenging.
Potentially, when the following PR is merged and the next TinkerPop version is released it should be easier to extend logic: https://github.com/apache/tinkerpop/pull/2612
Both TinkerPop PRs are now merged (https://github.com/apache/tinkerpop/pull/2612 and https://github.com/apache/tinkerpop/pull/2609). This will allow us to extend NoOpBarrierStep
to skip everything except vertices. Thus should be also an improvement by itself because there is no reason for multi-query strategy to inject barrier
steps which trigger batching for anything except edges because we multi-query step registers Vertices
only as for now, and thus anything else is just a potential waist of operations.
When we replace it with our own barrier step it will allow us:
- Safely open multi-query strategy for
drop()
step. - Slightly improve processing of edges / properties / meta properties because they won't be affected by the barrier anymore (i.e. a bit less of unnecessary operations).
Now we need to wait for TinkerPop 3.7.3 to proceed without doing major hacks.
The PR is now ready for the review. All tests passed as can be seen from my branch (here).
Notes:
- This PR is using TinkerPop 3.7.3-SNAPSHOT. Thus, we can't merge it until TinkerPop 3.7.3 is released.
- The attached link to my branch shows that all tests (including tp-tests) passed. However, if berkeleydb tp-tests fail for this PR - it's not related to this PR. We currently have flaky berkeleydb configuration in
master
branch which makes tp-tests for berkeleydb fail often. This problem appeared after merging the following PR #4425 .
@ntisseyre let me know if you want to review it or I will merge this PR by following lazy consensus
@ntisseyre let me know if you want to review it or I will merge this PR by following lazy consensus
Let's merge, thanks @porunov !