
CDAP pipeline: native compute profile fails with error

Open kwop opened this issue 4 years ago • 3 comments

Similar to this issue: https://github.com/cdapio/cdap-operator/issues/47

CDAP operator version: v2.0.2-rc0
CDAP version: 6.3.0
Kubernetes version:

Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:50:19Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.7", GitCommit:"b4455102ef392bf7d594ef96b97a4caa79d729d9", GitTreeState:"clean", BuildDate:"2020-06-17T11:32:20Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}

When I use the "native compute profile" on any pipeline, the run fails with io.kubernetes.client.ApiException: Unprocessable Entity in the appfabric container logs.

Is this related to the Hadoop user not being set?

I am using a Hadoop container (sequenceiq/hadoop-docker:2.7.1), as described in the blog post. Am I missing something?

Stack trace:

2020-11-19 11:30:34,642 - INFO  [sys-app-management-service:i.c.c.i.s.SystemAppManagementService@109] - System app config file /opt/cdap/master/system-app-config/..2020_11_19_11_26_26.773026893 does not exist.
2020-11-19 11:30:38,394 - WARN  [program.status:i.c.c.i.a.r.d.DistributedProgramRuntimeService@172] - Twill RunId does not exist for the program program:system.pipeline.-SNAPSHOT.service.studio, runId a3966da1-2a5a-11eb-a9ee-4e9d09a2764f
2020-11-19 11:30:39,338 - INFO  [program-start-0:i.c.c.i.a.r.d.DistributedProgramRunner@494] - Starting Service Program 'studio' with Arguments [logical.start.time=1605785434485]
2020-11-19 11:30:39,340 - INFO  [program-start-0:i.c.c.i.a.r.d.DistributedProgramRunner$1@299] - Starting program:system.pipeline.-SNAPSHOT.service.studio with debugging enabled: false, logback: file:/etc/cdap/conf/logback-container.xml
2020-11-19 11:30:39,670 - INFO  [program-start-0:i.c.c.k.r.KubeTwillPreparer@520] - Created deployment cdap-cdap-service-system-pipeline-studio-a193e088-e3b0-4fdf-a6c1-75278007d1af in Kubernetes
2020-11-19 11:30:39,742 - INFO  [program-start-0:i.c.c.i.a.r.d.AbstractTwillProgramController$1@69] - Twill program running: program_run:system.pipeline.-SNAPSHOT.service.studio.a3966da1-2a5a-11eb-a9ee-4e9d09a2764f, twill runId: a193e088-e3b0-4fdf-a6c1-75278007d1af
2020-11-19 11:31:32,700 - INFO  [appfabric-executor-18:i.c.c.i.a.s.ProgramLifecycleService@515] - Attempt to run Workflow program DataPipelineWorkflow as user root
2020-11-19 11:31:36,138 - WARN  [program.status:i.c.c.i.a.r.d.DistributedProgramRuntimeService@172] - Twill RunId does not exist for the program program:default.read_and_write_hdfs_v1.-SNAPSHOT.workflow.DataPipelineWorkflow, runId c64388b2-2a5a-11eb-a074-4e9d09a2764f
2020-11-19 11:31:53,755 - INFO  [program-start-0:i.c.c.i.a.r.d.DistributedProgramRunner@494] - Starting Workflow Program 'DataPipelineWorkflow' with Arguments [logical.start.time=1605785492667]
2020-11-19 11:31:53,759 - INFO  [program-start-0:i.c.c.i.a.r.d.DistributedProgramRunner$1@299] - Starting program:default.read_and_write_hdfs_v1.-SNAPSHOT.workflow.DataPipelineWorkflow with debugging enabled: false, logback: file:/etc/cdap/conf/logback-container.xml
2020-11-19 11:31:54,837 - ERROR [program-start-0:i.c.c.a.r.AbstractProgramRuntimeService@184] - Exception while trying to run program
java.lang.RuntimeException: Unable to create Kubernetes resource while attempting to start program.
        at io.cdap.cdap.k8s.runtime.KubeTwillPreparer.start(KubeTwillPreparer.java:328) ~[na:na]
        at io.cdap.cdap.internal.app.runtime.distributed.DistributedProgramRunner$1.call(DistributedProgramRunner.java:354) ~[na:na]
        at io.cdap.cdap.internal.app.runtime.distributed.DistributedProgramRunner$1.call(DistributedProgramRunner.java:252) ~[na:na]
        at io.cdap.cdap.security.impersonation.ImpersonationUtils$1.run(ImpersonationUtils.java:47) ~[na:na]
        at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_252]
        at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_252]
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893) ~[hadoop-common-2.9.2.jar:na]
        at io.cdap.cdap.security.impersonation.ImpersonationUtils.doAs(ImpersonationUtils.java:44) ~[na:na]
        at io.cdap.cdap.security.impersonation.DefaultImpersonator.doAs(DefaultImpersonator.java:74) ~[na:na]
        at io.cdap.cdap.security.impersonation.DefaultImpersonator.doAs(DefaultImpersonator.java:63) ~[na:na]
        at io.cdap.cdap.internal.app.runtime.distributed.DistributedProgramRunner.run(DistributedProgramRunner.java:366) ~[na:na]
        at io.cdap.cdap.app.runtime.AbstractProgramRuntimeService.lambda$run$2(AbstractProgramRuntimeService.java:180) ~[na:na]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_252]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_252]
        at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_252]
Caused by: io.kubernetes.client.ApiException: Unprocessable Entity
        at io.kubernetes.client.ApiClient.handleResponse(ApiClient.java:882) ~[na:na]
        at io.kubernetes.client.ApiClient.execute(ApiClient.java:798) ~[na:na]
        at io.kubernetes.client.apis.CoreV1Api.createNamespacedConfigMapWithHttpInfo(CoreV1Api.java:7548) ~[na:na]
        at io.kubernetes.client.apis.CoreV1Api.createNamespacedConfigMap(CoreV1Api.java:7530) ~[na:na]
        at io.cdap.cdap.k8s.runtime.KubeTwillPreparer.createConfigMap(KubeTwillPreparer.java:588) ~[na:na]
        at io.cdap.cdap.k8s.runtime.KubeTwillPreparer.start(KubeTwillPreparer.java:324) ~[na:na]
        ... 14 common frames omitted
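
For anyone triaging a similar trace, here is a minimal Python sketch (not part of CDAP or the Kubernetes client; the helper names `root_cause` and `first_frame_in` are hypothetical) that picks out the innermost "Caused by" section and the first Kubernetes API frame and CDAP frame inside it. Applied to an excerpt of the trace above, it shows that the Unprocessable Entity response comes back from the ConfigMap creation call made in KubeTwillPreparer.createConfigMap. The log does not include the API response body, so the specific validation reason cannot be recovered from it.

```python
import re

# Excerpt of the trace from the report above (outer frames trimmed for brevity).
TRACE = """\
java.lang.RuntimeException: Unable to create Kubernetes resource while attempting to start program.
        at io.cdap.cdap.k8s.runtime.KubeTwillPreparer.start(KubeTwillPreparer.java:328) ~[na:na]
Caused by: io.kubernetes.client.ApiException: Unprocessable Entity
        at io.kubernetes.client.ApiClient.handleResponse(ApiClient.java:882) ~[na:na]
        at io.kubernetes.client.ApiClient.execute(ApiClient.java:798) ~[na:na]
        at io.kubernetes.client.apis.CoreV1Api.createNamespacedConfigMapWithHttpInfo(CoreV1Api.java:7548) ~[na:na]
        at io.kubernetes.client.apis.CoreV1Api.createNamespacedConfigMap(CoreV1Api.java:7530) ~[na:na]
        at io.cdap.cdap.k8s.runtime.KubeTwillPreparer.createConfigMap(KubeTwillPreparer.java:588) ~[na:na]
        at io.cdap.cdap.k8s.runtime.KubeTwillPreparer.start(KubeTwillPreparer.java:324) ~[na:na]
"""

# "    at <fully.qualified.Class>.<method>(" -> (class, method)
FRAME = re.compile(r"^\s+at\s+([\w.$]+)\.([\w$<>]+)\(")
# "Caused by: <exception class>: <message>"
CAUSE = re.compile(r"^Caused by:\s+([\w.$]+):\s*(.*)$")


def root_cause(trace):
    """Return (exception, message, frames) for the last 'Caused by' section."""
    exception, message, frames = None, None, []
    for line in trace.splitlines():
        cause = CAUSE.match(line)
        if cause:
            # A new cause starts: reset the frame list so only the innermost survives.
            exception, message, frames = cause.group(1), cause.group(2), []
            continue
        frame = FRAME.match(line)
        if frame and exception is not None:
            frames.append((frame.group(1), frame.group(2)))
    return exception, message, frames


def first_frame_in(frames, package_prefix):
    """Return the first (class, method) frame whose class starts with the prefix."""
    for cls, method in frames:
        if cls.startswith(package_prefix):
            return cls, method
    return None


exception, message, frames = root_cause(TRACE)
api_call = first_frame_in(frames, "io.kubernetes.client.apis.")
cdap_call = first_frame_in(frames, "io.cdap.")

print(exception, "-", message)
print("Kubernetes API call:", api_call)
print("CDAP caller:", cdap_call)
```

This only narrows down which API call was rejected; it says nothing about why the ConfigMap was rejected.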

My CDAP master config:

apiVersion: cdap.cdap.io/v1alpha1
kind: CDAPMaster
metadata:
  name: {{ .Release.Name }}
  namespace: {{.Release.Namespace}}
spec:
  runtime: {}
  locationURI: hdfs://{{ .Release.Name }}-hadoop:9000
  serviceAccountName: cdap
  router:
    replicas: 2
  userInterface:
    replicas: 2
  securitySecret: {{ .Release.Name }}
  image: {{.Values.cdap.baseContainer}}
  userInterfaceImage: {{.Values.cdap.baseContainer}}
  config:
    master.environment.k8s.namespace: {{ .Release.Namespace }}
    enable.preview: "true"
    data.storage.implementation: postgresql
    data.storage.sql.jdbc.connection.url: jdbc:postgresql://{{.Release.Name}}-postgresql:5432/cdap
    data.storage.sql.jdbc.driver.name: org.postgresql.Driver
    metadata.storage.implementation: elastic
    metadata.elasticsearch.cluster.hosts: elasticsearch-master
    hdfs.user: root

kwop, Nov 19 '20 11:11

The native profile is currently not supported in the k8s environment.

chtyim, Nov 21 '20 01:11

Is this feature being added any time soon?

devlinmr, May 06 '21 16:05

We are starting work on it in the CDAP project, aiming for the 6.6.0 release.

albertshau, May 06 '21 16:05