
Apache Drill initialisation action bug

Open e-compagno opened this issue 3 years ago • 1 comments

It seems there is a bug in the Apache Drill initialisation actions:

Following the instructions in https://github.com/GoogleCloudDataproc/initialization-actions/tree/master/drill, I created a cluster via

gcloud beta dataproc clusters create cluster-drill \
    --region us-central1 \
    --no-address \
    --zone us-central1-c \
    --single-node \
    --master-machine-type n1-standard-4 \
    --master-boot-disk-size 500 \
    --image-version 2.0-debian10 \
    --project myproject \
    --initialization-actions 'gs://goog-dataproc-initialization-actions-us-central1/drill/drill.sh' \
    --optional-components=zookeeper

However, once the cluster is created, Drill is not running properly: sudo ./drillbit.sh status returns

/usr/lib/drill/drillbit.pid file is present but drillbit is not running.

If I try starting it manually with sudo ./drillbit.sh start, I get this error message in the log file /usr/lib/drill/log/drillbit.out:

ERROR o.a.c.f.imps.CuratorFrameworkImpl - Ensure path threw exception
org.apache.zookeeper.KeeperException$UnimplementedException: KeeperErrorCode = Unimplemented for /drill

However, I am able to run a standalone version of drill with

sudo /usr/lib/drill/bin/drill-embedded
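
To check whether a node is hitting the same failure, the drillbit log can be scanned for the exception quoted above. This is a minimal sketch, assuming the log lives at /usr/lib/drill/log/drillbit.out as in this report; the function name drill_zk_unimplemented is hypothetical and not part of Drill or the initialisation action.

```shell
# Hypothetical helper (not part of Drill): report whether a drillbit log
# contains the ZooKeeper "Unimplemented" exception quoted in this issue.
# Usage: drill_zk_unimplemented [LOG_FILE]
# Exit status: 0 = exception found, 1 = not found, 2 = log not readable.
drill_zk_unimplemented() {
    # Default path is the log location mentioned in this report (assumption
    # for other setups).
    log_file="${1:-/usr/lib/drill/log/drillbit.out}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 2
    fi
    # -F matches a fixed string, so '$' and '.' are not treated as regex syntax.
    # -q suppresses output; only the exit status is used.
    grep -Fq 'KeeperException$UnimplementedException' "$log_file"
}
```

On the master node it could be called as drill_zk_unimplemented && echo "found the Unimplemented exception"; the log may need to be read with sudo depending on file permissions.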

e-compagno, Jun 11 '21 10:06

Apparently it is a version incompatibility issue with ZooKeeper and with the GCS connector.

To install Drill properly, the previous configuration has to be modified to

gcloud beta dataproc clusters create cluster-drill \
    --region us-central1 \
    --no-address \
    --zone us-central1-c \
    --single-node \
    --master-machine-type n1-standard-4 \
    --master-boot-disk-size 500 \
    --image-version 2.0-debian10 \
    --project myproject \
    --initialization-actions 'gs://goog-dataproc-initialization-actions-us-central1/drill/drill.sh' \
    --optional-components=ZOOKEEPER \
    --metadata GCS_CONNECTOR_VERSION=2.0.1

In any case, I would suggest making Drill an optional component in Dataproc to simplify the initialisation actions.

e-compagno, Jun 14 '21 13:06