shark
[WIP] Add CLI Support for Catalyst
- Support reloading the cached RDDs on startup
- Support switching the CLI between Hive and Catalyst execution modes
$ bin/shark
catalyst> show tables;
Execution Mode: catalyst
OK
shark_test1
shark_test1_cached
Time taken: 0.011 seconds
catalyst> explain select * from shark_test1;
Execution Mode: catalyst
== Logical Plan ==
Project [key#0,val#1]
MetastoreRelation default, shark_test1, None
== Optimized Logical Plan ==
MetastoreRelation default, shark_test1, None
== Physical Plan ==
HiveTableScan [key#0,val#1], (MetastoreRelation default, shark_test1, None), None
Time taken: 0.172 seconds
catalyst> set shark.exec.mode=hive;
hive> explain select * from shark_test1;
Execution Mode: hive
OK
ABSTRACT SYNTAX TREE:
(TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME shark_test1))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF))))
STAGE DEPENDENCIES:
Stage-0 is a root stage
STAGE PLANS:
Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
TableScan
alias: shark_test1
Select Operator
expressions:
expr: key
type: int
expr: val
type: string
outputColumnNames: _col0, _col1
ListSink
Time taken: 0.107 seconds
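For reviewers, the mode switch shown in the session above can be pictured roughly as follows. This is only a minimal sketch under stated assumptions, not the code in this PR: `ExecModeSketch`, `ExecMode`, `nextMode`, and `prompt` are hypothetical names introduced for illustration, and only the `shark.exec.mode` key and the `catalyst>` / `hive>` prompts come from the session itself.

```scala
// Hypothetical sketch of the CLI mode switch; none of these names come from the PR.
object ExecModeSketch {
  sealed trait ExecMode { def name: String }
  case object Catalyst extends ExecMode { val name = "catalyst" }
  case object Hive extends ExecMode { val name = "hive" }

  // Matches "set shark.exec.mode=<value>" with an optional trailing semicolon.
  private val SetMode = """(?i)\s*set\s+shark\.exec\.mode\s*=\s*(\w+)\s*;?\s*""".r

  /** Returns the mode in effect after reading `line`.
    * Unrelated commands and unknown mode values leave the mode unchanged. */
  def nextMode(current: ExecMode, line: String): ExecMode = line match {
    case SetMode(value) =>
      value.toLowerCase match {
        case "catalyst" => Catalyst
        case "hive"     => Hive
        case _          => current
      }
    case _ => current
  }

  /** The prompt follows the active mode, as in the session above. */
  def prompt(mode: ExecMode): String = s"${mode.name}> "
}
```

With this sketch, feeding `set shark.exec.mode=hive;` to `nextMode` while in Catalyst mode yields `Hive`, and `prompt` then renders `hive> `. Any other statement leaves the mode untouched.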
@marmbrus Can you review this for me? Sorry, it's a lot of code, but most of it is copied from Shark.
Merged build triggered.
Merged build started.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/Shark-Pull-Request-Builder/12203/
I'm still finding some jar conflict issues; I will keep updating.
Merged build triggered.
Merged build started.
SharkServer2Suite failed in my local test. It seems to be a namespace conflict with the rewritten classes CliService.java / HiveServer2.java; I will figure out how to fix that soon.
Also, I removed the cached RDD reload code; it will go into the next PR.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/Shark-Pull-Request-Builder/12204/
Merged build triggered.
Merged build started.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/Shark-Pull-Request-Builder/12205/
Merged build triggered.
Merged build started.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/Shark-Pull-Request-Builder/12206/
@chenghao-intel thanks for working on this. I think it is ok to not have the other features for now. We just need a CLI that we can use to query.
The CLI is ready now, and it passes the unit tests locally (SharkServer2 still doesn't work locally). But Jenkins failed to retrieve the httpclient jar. @rxin, can you check this locally too, if possible? I am not sure whether some environment setting only works on my machine.
Jenkins, retest this please.
Merged build triggered.
Merged build started.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/Shark-Pull-Request-Builder/12207/
It still failed to retrieve the httpclient jar.
Could it be missing a repository?
Actually, I've added 3 more repositories.
I confirm that I can build this locally.
@pwendell can we clear the .m2 / .ivy2 cache on the Jenkins machine?