Update TinkerPop 3.6.1
Signed-off-by: Jan Jansen [email protected]
Thank you for contributing to JanusGraph!
In order to streamline the review of the contribution we ask you to ensure the following steps have been taken:
For all changes:
- [ ] Is there an issue associated with this PR? Is it referenced in the commit message?
- [ ] Does your PR body contain #xyz where xyz is the issue number you are trying to resolve?
- [ ] Has your PR been rebased against the latest commit within the target branch (typically master)?
- [ ] Is your initial contribution a single, squashed commit?
For code changes:
- [ ] Have you written and/or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
- [ ] If applicable, have you updated the LICENSE.txt file, including the main LICENSE.txt file in the root of this repository?
- [ ] If applicable, have you updated the NOTICE.txt file, including the main NOTICE.txt file found in the root of this repository?
For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in which it is rendered?
I will split this update into multiple PRs.
@porunov The HBase upgrade is a pain.
I imagine it is ...
@porunov @FlorianHockmann @li-boxuan @rngcntr Does anyone have a bit of time to look into the HBase Hadoop tests?
This looks like one of the bugs that is stopping me from getting this running: https://github.com/apache/hbase/pull/4819/files
Other Apache projects have deactivated their tests for the combination of HBase 2 with Hadoop 3. https://github.com/apache/ranger
@farodin91
The TP tests are skipped.
It would be nice to run them by adding [tp-tests] to the commit message.
@mad I would like to find a way to fix the HBase tests beforehand.
@farodin91
Some info about the HBase issue:
A scan of the janusgraph table in HBase returns data, but the metadata says no data exists.
Scan response:
hbase:003:0> scan 'janusgraph', {LIMIT => 10}
ROW COLUMN+CELL
\x00\x00\x00\x00\x00\x00\x00\x03 column=i:\xFF\xFF\xFF\xFF\xFF\xFE\xC7\x7F\x00\x00\x01\x83\xD0\xEB\xE0\x807f000101691244-bic-pc1, timestamp=2022-10-13T10:37:42.913, value=
\x00\x00\x00\x00\x00\x00\x00\x04 column=i:\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x9B\x00\x00\x01\x83\xD0\xEB\xDF(7f000101691244-bic-pc1, timestamp=2022-10-13T10:37:42.569, value=
\x00\x00\x00\x00\x00\x00\x00\x04 column=i:\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xCD\x00\x00\x01\x83\xD0\xEB\xDD\xCA7f000101691244-bic-pc1, timestamp=2022-10-13T10:37:42.219, value=
\x00\x00\x00\x00\x00\x00\x02\x0D column=e:\x02, timestamp=2022-10-13T10:37:44.127, value=\x00\x01\x08\x80
\x00\x00\x00\x00\x00\x00\x02\x0D column=e:\x10\xC0, timestamp=2022-10-13T10:37:44.127, value=\xA0vl\x1EvertexKe\xF9\x04\x80
\x00\x00\x00\x00\x00\x00\x02\x0D column=e:\x10\xC2\x80\x14\x00, timestamp=2022-10-13T10:37:44.127, value=\x8F\x00\x01\x8E\x00\x8F\x80
\x00\x00\x00\x00\x00\x00\x02\x0D column=e:\x10\xC2\x80\x18\x00, timestamp=2022-10-13T10:37:44.127, value=\x8F\x00\x01\x8E\x00\x90\x80
\x00\x00\x00\x00\x00\x00\x02\x0D column=e:\x10\xC4, timestamp=2022-10-13T10:37:44.127, value=\x00\x82\x0C\x80
\x00\x00\x00\x00\x00\x00\x02\x0D column=e:\x10\xC8, timestamp=2022-10-13T10:37:44.127, value=\x00\x80\x00\x01\x83\xD0\xEB\xE1\xFF\x10\x80
\x18\xD4{\x96\x10\xA5\xA0vl\x1EvertexKe\xF9 column=g:\x00, timestamp=2022-10-13T10:37:44.127, value=\x04\x8D
configuration column=s:graph.janusgraph-version, timestamp=2022-10-13T10:37:42.043, value=\x92\xA01.0.0-SNAPSHO\xD4
configuration column=s:graph.storage-version, timestamp=2022-10-13T10:37:42.047, value=\x92\xA0\xB2
configuration column=s:graph.timestamps, timestamp=2022-10-13T10:37:42.033, value=\xB6\x82
configuration column=s:hidden.frozen, timestamp=2022-10-13T10:37:42.049, value=\x8F\x01
configuration column=s:ids.num-partitions, timestamp=2022-10-13T10:37:42.035, value=\x8C\x82
configuration column=s:storage.drop-on-clear, timestamp=2022-10-13T10:37:42.039, value=\x8F\x00
configuration column=s:system-registration.7f000101691244-bic-pc1.startup-time, timestamp=2022-10-13T10:37:42.153, value=\xC1\x80\x00\x00\x00cG\xEAv\x01\x11ta\x80
\x88\x00\x00\x00\x00\x00\x00\x00 column=i:\xFF\xFF\xFF\xFF\xFF\xFF\xD8\xEF\x00\x00\x01\x83\xD0\xEB\xE2\x077f000101691244-bic-pc1, timestamp=2022-10-13T10:37:43.303, value=
\x88\x00\x00\x00\x00\x00\x00\x03 column=i:\xFF\xFF\xFF\xFF\xFF\xFE\xC7\x7F\x00\x00\x01\x83\xD0\xEB\xE3c7f000101691244-bic-pc1, timestamp=2022-10-13T10:37:43.651, value=
\x88\x00\x00\x00\x00\x00\x00\x80 column=e:\x02, timestamp=2022-10-13T10:37:44.001, value=\x00\x01\x04\x91
\x88\x00\x00\x00\x00\x00\x00\x80 column=e:$, timestamp=2022-10-13T10:37:44.001, value=\x04\x8D\x08\x91\xFF
\xFA<[T\x11\xA5\x82 column=g:\x00\x04\x8D\x0C\x80, timestamp=2022-10-13T10:37:44.127, value=\x04\x8D
9 row(s)
Took 0.0498 seconds
Metadata response:
hbase:004:0> scan 'hbase:meta', {FILTER=>"PrefixFilter('janusgraph')", COLUMNS=>['info:regioninfo']}
ROW COLUMN+CELL
janusgraph,,1665657444679.32f32af13dd119840a3d0b3c75e01e99. column=info:regioninfo, timestamp=2022-10-13T10:37:41.600, value={ENCODED => 32f32af13dd119840a3d0b3c75e01e99, NAME => 'janusgraph,,1665657444679.32f32af13dd119840a3d0b3c75e01e99.', STARTKEY => '', ENDKEY => ''}
1 row(s)
STARTKEY and ENDKEY are both empty.
That leads to empty input splits in org.janusgraph.hadoop.formats.hbase.HBaseBinaryInputFormat#getSplits
@mad Any idea why the metadata is empty?
@farodin91
Actually, the root cause is Spark: https://spark.apache.org/docs/3.2.0/core-migration-guide.html#upgrading-from-core-31-to-32
Since Spark 3.2, spark.hadoopRDD.ignoreEmptySplits is set to true by default which means Spark will not create empty partitions for empty input splits. To restore the behavior before Spark 3.2, you can set spark.hadoopRDD.ignoreEmptySplits to false.
So, just add spark.hadoopRDD.ignoreEmptySplits=false to hbase-read.properties and hbase-read-snapshot.properties.
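For anyone who wants to double-check the two files after editing them, here is a minimal sketch in plain Java. It assumes the setting is written as an ordinary key=value line, and it treats a missing key as true because that is the Spark 3.2 default quoted above. The class and method names are made up for illustration and are not part of JanusGraph or Spark.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of JanusGraph or Spark.
final class IgnoreEmptySplitsCheck {
    static final String KEY = "spark.hadoopRDD.ignoreEmptySplits";

    // Parse properties-file text (e.g. the contents of hbase-read.properties).
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // True only if the file explicitly sets the key to false.
    // A missing key falls back to "true", the default since Spark 3.2.
    static boolean keepsEmptySplits(Properties props) {
        return "false".equalsIgnoreCase(props.getProperty(KEY, "true").trim());
    }

    public static void main(String[] args) {
        Properties props = parse(KEY + "=false\n");
        System.out.println(KEY + " explicitly false: " + keepsEmptySplits(props));
    }
}
```

In a real check, the text passed to parse would be read from hbase-read.properties and hbase-read-snapshot.properties instead of the inline string.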
@mad Thank you.
@mad Updated
@farodin91
TinkerPop 3.6 has some breaking changes; see the entries marked (breaking) in https://github.com/apache/tinkerpop/blob/master/CHANGELOG.asciidoc
Do we need to take this into account?
@mad Yes, I've handled multiple breaking changes: https://issues.apache.org/jira/browse/TINKERPOP-2507 and the Gryo removal.
@porunov Would you like to review it again?