
StreamPark FAQ

Open xinzhuxiansheng opened this issue 2 years ago • 18 comments

StreamPark ── A magical framework that makes Flink & Spark easier!

FAQ

Here is a compilation of the popular issues most frequently raised in user feedback. If you have a new question, please submit an issue. Please do not ask questions here; this is not a question area.

xinzhuxiansheng avatar Dec 09 '21 16:12 xinzhuxiansheng

1. Maven install error: Failed to run task 'npm install'


Because the front end uses Node.js, make sure Node.js is installed on the machine doing the compilation, and that the Node.js version is not too old. You can enter streamx-console-webapp and manually run npm install to try the build; if it still fails, look up Node.js build issues and try to resolve the problem yourself.
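The check above can be sketched as a quick shell routine (the module path is taken from the answer; "recent LTS" is deliberately vague, so match whatever your branch's package.json requires):

```shell
# Sanity-check the Node.js toolchain before running `mvn install`.
if command -v node >/dev/null 2>&1; then
  echo "node found: $(node -v)"
else
  echo "node missing: install a recent Node.js LTS first"
fi
# Then reproduce the failing step by hand to see the real error:
#   cd streamx-console/streamx-console-webapp && npm install
```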

xinzhuxiansheng avatar Jan 12 '22 15:01 xinzhuxiansheng

  1. #530

exception like this:

Caused by: org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn Application Cluster
at org.apache.flink.yarn.YarnClusterDescriptor.deployApplicationCluster(YarnClusterDescriptor.java:465)
at com.streamxhub.streamx.flink.submit.impl.YarnApplicationSubmit$$anon$1.call(YarnApplicationSubmit.scala:80)
at com.streamxhub.streamx.flink.submit.impl.YarnApplicationSubmit$$anon$1.call(YarnApplicationSubmit.scala:64)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1926)
... 157 more
Caused by: java.lang.NumberFormatException: For input string: "30s"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:589)
at java.lang.Long.parseLong(Long.java:631)
at org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1435)
at org.apache.hadoop.hdfs.client.impl.DfsClientConf.<init>(DfsClientConf.java:255)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:319)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:303)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:159)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3247)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3296)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3264)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:475)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:228)
at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:769)
at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:592)
at org.apache.flink.yarn.YarnClusterDescriptor.deployApplicationCluster(YarnClusterDescriptor.java:458)
... 162 more

Fixed; see #2443.
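For background (a hedged note, not from the thread): the NumberFormatException means some Hadoop configuration value carrying a time-unit suffix (the "30s") is being read through Configuration.getLong, which the HDFS client on the classpath cannot parse. If you cannot take the fix referenced above, one workaround is to strip the unit suffix in the offending property; the property name below is an assumption, so search your hdfs-site.xml for whichever value ends in a unit like "30s":

```xml
<!-- Illustrative fragment only; verify the actual property in your cluster. -->
<property>
  <name>dfs.client.datanode-restart.timeout</name>
  <value>30</value> <!-- was "30s"; a plain number parses via getLong -->
</property>
```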

xinzhuxiansheng avatar Jan 12 '22 15:01 xinzhuxiansheng

  1. After streamx-console starts, app.home is not set and a NullPointerException is thrown


The streamx-console initialization check failed. If started locally for development and debugging, make sure the -Dapp.home parameter is explicitly specified in the VM options; more detail: http://www.streamxhub.com/docs/user-guide/development/#vm-options

xinzhuxiansheng avatar Jan 12 '22 15:01 xinzhuxiansheng

  1. Cause: java.sql.SQLSyntaxErrorException: Table 'streamx.t_setting' doesn't exist

### Cause: java.sql.SQLSyntaxErrorException: Table 'streamx.t_setting' doesn't exist
; bad SQL grammar []; nested exception is java.sql.SQLSyntaxErrorException: Table 'streamx.t_setting' doesn't exist
	at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:160)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsBeforeInitialization(AbstractAutowireCapableBeanFactory.java:415)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1786)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:594)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:516)
	at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:324)
	at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:234)
	at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:322)
	at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:202)
	at org.springframework.beans.factory.config.DependencyDescriptor.resolveCandidate(DependencyDescriptor.java:276)
	at org.springframework.beans.factory.support.DefaultListableBeanFactory.doResolveDependency(DefaultListableBeanFactory.java:1307)
	at org.springframework.beans.factory.support.DefaultListableBeanFactory.resolveDependency(DefaultListableBeanFactory.java:1227)
	at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement.inject(AutowiredAnnotationBeanPostProcessor.java:640)
	... 57 common frames omitted

For StreamX v1.2.0 and later, as well as the main branch (not including v1.2.0), you need to run the SQL scripts manually to initialize the table structure. See: https://github.com/streamxhub/streamx/tree/main/streamx-console/streamx-console-service/src/assembly/script

  • final: for fresh deployments
  • upgrade: incremental table-structure changes for upgrading from an older version
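Importing the scripts might look like the following sketch; the database name "streamx" comes from the error message above, while the MySQL host and credentials are assumptions for your environment:

```shell
# Sketch, assuming a local MySQL and a database named "streamx"; script
# names follow the repository directory linked above -- verify before running.
# Fresh deployment:
#   mysql -u root -p streamx < final.sql
# Incremental upgrade from an older version:
#   mysql -u root -p streamx < upgrade.sql
echo "pick final.sql for a fresh install, upgrade.sql for an upgrade"
```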

xinzhuxiansheng avatar Jan 12 '22 15:01 xinzhuxiansheng

  1. java.io.InvalidClassException: scala.collection.immutable.Set$EmptySet$

2021-12-02 18:01:27 | INFO  | XNIO-1 task-4 | com.streamxhub.streamx.console.core.entity.Application ] local appHome:~/streamx_workspace/workspace/1466345568741457922
2021-12-02 18:01:28 | INFO  | XNIO-1 task-4 | com.streamxhub.streamx.flink.proxy.FlinkShimsProxy ] [StreamX] 
----------------------------------------- flink version -----------------------------------
     flinkHome    : /data/flink-1.14.0
     distJarName  : flink-dist_2.12-1.14.0.jar
     flinkVersion : 1.14.0
     majorVersion : 1.14
     scalaVersion : 2.12
     shimsVersion : streamx-flink-shims_flink-1.14
-------------------------------------------------------------------------------------------

java.io.InvalidClassException: scala.collection.immutable.Set$EmptySet$; local class incompatible: stream classdesc serialVersionUID = -1118802231467657162, local class serialVersionUID = -2443710944435909512
        at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
        at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2001)
        at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1848)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2158)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2403)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2327)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2185)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2403)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2327)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2185)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2403)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2327)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2185)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:501)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:459)

Currently StreamX only supports Scala 2.11, so you need to change both your custom program's Scala version and the Flink installation that FLINK_HOME points to to a Scala 2.11 build. This has been verified: after switching the program and the Flink package to Scala 2.11, it works. OK :)


xinzhuxiansheng avatar Jan 12 '22 15:01 xinzhuxiansheng

  1. Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient


The configuration file is missing metastore.uri; adding it resolves the problem. See #219.

xinzhuxiansheng avatar Jan 12 '22 15:01 xinzhuxiansheng

  1. java.lang.RuntimeException: java.io.IOException: com.sun.jna.LastErrorException: [2] No such file or directory

java.lang.RuntimeException: java.io.IOException: com.sun.jna.LastErrorException: [2] No such file or directory
	at com.github.dockerjava.httpclient5.ApacheDockerHttpClientImpl.execute(ApacheDockerHttpClientImpl.java:187)
	at com.github.dockerjava.httpclient5.ApacheDockerHttpClient.execute(ApacheDockerHttpClient.java:9)
	at com.github.dockerjava.core.DefaultInvocationBuilder.execute(DefaultInvocationBuilder.java:228)
	at com.github.dockerjava.core.DefaultInvocationBuilder.lambda$executeAndStream$1(DefaultInvocationBuilder.java:269)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: com.sun.jna.LastErrorException: [2] No such file or directory
	at com.github.dockerjava.transport.DomainSocket.<init>(DomainSocket.java:63)
	at com.github.dockerjava.transport.BsdDomainSocket.<init>(BsdDomainSocket.java:43)
	at com.github.dockerjava.transport.DomainSocket.get(DomainSocket.java:138)
	at com.github.dockerjava.transport.UnixSocket.get(UnixSocket.java:27)
	at com.github.dockerjava.httpclient5.ApacheDockerHttpClientImpl$2.createSocket(ApacheDockerHttpClientImpl.java:145)
	at org.apache.hc.client5.http.impl.io.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:125)
	at org.apache.hc.client5.http.impl.io.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:407)
	at org.apache.hc.client5.http.impl.classic.InternalExecRuntime.connectEndpoint(InternalExecRuntime.java:168)
	at org.apache.hc.client5.http.impl.classic.InternalExecRuntime.connectEndpoint(InternalExecRuntime.java:178)
	at org.apache.hc.client5.http.impl.classic.ConnectExec.execute(ConnectExec.java:136)

Check whether the Docker daemon is running.
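A quick check that the daemon is reachable might look like this; /var/run/docker.sock is the default Unix-socket path, so adjust it if your daemon is configured otherwise:

```shell
# Check for the default Docker daemon socket.
if [ -S /var/run/docker.sock ]; then
  echo "docker socket present"
else
  echo "docker socket missing: start the Docker daemon"
fi
# `docker info` also fails fast with a clear message when the daemon is down.
```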


xinzhuxiansheng avatar Jan 12 '22 15:01 xinzhuxiansheng

  1. #571


Resolution: check the dependency hierarchy to see where the logging libraries conflict, and fix the log4j conflict in the UDF jar.

Requirement / improvement suggestion: when starting the Flink job, add the JVM parameter -Dlog4j.ignoreTC=true.

wolfboys avatar Jan 12 '22 23:01 wolfboys

  1. Could not find a suitable table factory for 'org.apache.flink.table.factories.TableSourceFactory' in the classpath

2021-12-24T04:43:28.901627819Z Caused by: org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a suitable table factory for 'org.apache.flink.table.factories.TableSourceFactory' in
2021-12-24T04:43:28.901630469Z the classpath.
2021-12-24T04:43:28.901632969Z
2021-12-24T04:43:28.901635342Z Reason: Required context properties mismatch.
2021-12-24T04:43:28.901637804Z
2021-12-24T04:43:28.901640154Z The matching candidates:
2021-12-24T04:43:28.901642578Z org.apache.flink.table.sources.CsvAppendTableSourceFactory
2021-12-24T04:43:28.901645059Z Mismatched properties:
2021-12-24T04:43:28.901647812Z 'connector.type' expects 'filesystem', but is 'kafka'
2021-12-24T04:43:28.901650202Z 'format.type' expects 'csv', but is 'json'

Flink version: 1.14.0

Solution: the flink-kafka-connector parameters are incorrect; refer to the Flink official documentation:

CREATE TABLE user_log (
  user_id VARCHAR,
  item_id VARCHAR,
  category_id VARCHAR,
  behavior VARCHAR,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'kafka-1:9092,kafka-2:9092,kafka-3:9092',
  'properties.group.id' = 'testGroup',
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'json'
);

CREATE TABLE pvuv_sink (
  dt VARCHAR PRIMARY KEY,
  pv BIGINT,
  uv BIGINT
) WITH (
  'connector' = 'jdbc',                          -- use the jdbc connector
  'url' = 'jdbc:mysql://test-mysql:3306/test',   -- jdbc url
  'table-name' = 'pvuv_sink',                    -- table name
  'username' = 'root',                           -- username
  'password' = '123456'                          -- password
);

INSERT INTO pvuv_sink
SELECT
  DATE_FORMAT(ts, 'yyyy-MM-dd HH:00') dt,
  COUNT(*) AS pv,
  COUNT(DISTINCT user_id) AS uv
FROM user_log
GROUP BY DATE_FORMAT(ts, 'yyyy-MM-dd HH:00');

In addition, the Kafka message format

{"user_id": "543462", "item_id":"1715", "category_id": "1464116", "behavior": "pv", "ts":"2021-02-01T01:00:00Z"}
{"user_id": "662867", "item_id":"2244074","category_id":"1575622","behavior": "pv", "ts":"2021-02-01T01:00:00Z"}
{"user_id": "662867", "item_id":"2244074","category_id":"1575622","behavior": "pv", "ts":"2021-02-01T01:00:00Z"}
{"user_id": "662867", "item_id":"2244074","category_id":"1575622","behavior": "learning flink", "ts":"2021-02-01T01:00:00Z"}

must be changed to

{"user_id": "543462", "item_id":"1715", "category_id": "1464116", "behavior": "pv", "ts":"2021-02-01 01:00:00"}
{"user_id": "662867", "item_id":"2244074","category_id":"1575622","behavior": "pv", "ts":"2021-02-01 01:00:00"}
{"user_id": "662867", "item_id":"2244074","category_id":"1575622","behavior": "pv", "ts":"2021-02-01 01:00:00"}
{"user_id": "662867", "item_id":"2244074","category_id":"1575622","behavior": "learning flink", "ts":"2021-02-01 01:00:00"}

Otherwise, message parsing fails.


wolfboys avatar Jan 12 '22 23:01 wolfboys

  1. Windows IDEA environment: could not submit Flink job to remote YARN cluster

Dynamically modify the entry parameters of the submitted job. Entry point: the YarnClientImpl class, method submitApplication, at the submission call this.rmClient.submitApplication(request). In the request's CLASSPATH and _FLINK_CLASSPATH parameter values, replace the Windows separator ";" with the Linux separator ":".

This should be the same issue as the one reported upstream: https://issues.apache.org/jira/browse/FLINK-17858
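The separator rewrite described above can be illustrated in isolation (a stand-alone sketch; the real fix patches the CLASSPATH and _FLINK_CLASSPATH values inside the YARN submission request):

```shell
# Hypothetical illustration: rewrite the Windows classpath separator ";"
# to the Linux ":" before the request reaches YARN.
win_cp='lib/a.jar;lib/b.jar;lib/c.jar'
linux_cp=$(printf '%s' "$win_cp" | tr ';' ':')
echo "$linux_cp"
```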

wolfboys avatar Jan 12 '22 23:01 wolfboys

Build fails when starting up in IDEA.

Su1024 avatar Apr 25 '22 10:04 Su1024

Using JDBC to connect to MySQL with a primary key defined: when the id is duplicated, the write fails. According to the documentation, once a primary key is defined the JDBC connector should run in upsert mode, so why does this error occur?

Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '1' for key 'PRIMARY'

CREATE TABLE test_sink (
  id varchar PRIMARY KEY NOT ENFORCED,
  name varchar,
  class varchar
) WITH (
  'connector.type' = 'jdbc',
  'connector.url' = 'jdbc:mysql:///database',
  'connector.table' = 'kafka_test',
  'connector.username' = 'root',
  'connector.password' = '***********',
  'connector.write.flush.max-rows' = '1'
);

Yougetadad avatar May 23 '22 07:05 Yougetadad
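An editorial note on the question above, hedged and not from the thread: the legacy property style ('connector.type' = 'jdbc') selects the old JDBC sink, which does not derive upsert mode from the primary key declared in the DDL; the newer factory, selected with 'connector' = 'jdbc', does key upserts on the declared primary key. A sketch of the newer option style, reusing names from the question:

```sql
CREATE TABLE test_sink (
  id VARCHAR PRIMARY KEY NOT ENFORCED,
  name VARCHAR,
  class VARCHAR
) WITH (
  'connector' = 'jdbc',              -- newer factory: upserts on the PK
  'url' = 'jdbc:mysql:///database',
  'table-name' = 'kafka_test',
  'username' = 'root',
  'password' = '***********'
);
```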

Build fails when starting up in IDEA.

Have you solved the problem ? @Su1024

wanghuan2054 avatar Jun 27 '22 09:06 wanghuan2054

For the 2.0.0 Docker deployment: the .env references an existing MySQL. Does the streampark database have to be imported manually from the SQL scripts, or is it created automatically when Docker starts for the first time?

bulolo avatar Dec 14 '22 07:12 bulolo

For the 2.0.0 Docker deployment: the .env references an existing MySQL. Does the streampark database have to be imported manually from the SQL scripts, or is it created automatically when Docker starts for the first time?

You have to enter the MySQL container yourself and execute the SQL; it is not initialized by default.

0akarma avatar Dec 28 '22 10:12 0akarma

  1. Does StreamPark integrate with Flink CDC? Is real-time, log-based synchronization supported?

Yes. Both DataStream jobs written with Flink CDC and Flink SQL jobs are supported; any standard Flink job is supported. For a Flink SQL job, the connector must be a standard Flink SQL connector implemented according to the Flink specification; just add the corresponding dependency jar or POM entry.
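For example, a minimal Flink SQL source sketch using the mysql-cdc connector (all connection values below are placeholders, and the job needs the flink-connector-mysql-cdc dependency on its classpath):

```sql
CREATE TABLE orders_source (
  order_id BIGINT,
  amount DECIMAL(10, 2),
  PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
  'connector' = 'mysql-cdc',        -- provided by flink-connector-mysql-cdc
  'hostname' = 'mysql-host',
  'port' = '3306',
  'username' = 'flinkuser',
  'password' = 'flinkpw',
  'database-name' = 'mydb',
  'table-name' = 'orders'
);
```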

wolfboys avatar Jan 12 '23 02:01 wolfboys

Compile using this command: mvn clean install -DskipTests -Dcheckstyle.skip -Dmaven.javadoc.skip=true

2000liux avatar Jan 17 '23 10:01 2000liux

The submitted Flink SQL job fails to run, and the cause of the failure cannot be found.

Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.streampark.flink.client.FlinkClient$.$anonfun$proxy$1(FlinkClient.scala:80)
	at org.apache.streampark.flink.proxy.FlinkShimsProxy$.$anonfun$proxy$1(FlinkShimsProxy.scala:60)
	at org.apache.streampark.common.util.ClassLoaderUtils$.runAsClassLoader(ClassLoaderUtils.scala:38)
	at org.apache.streampark.flink.proxy.FlinkShimsProxy$.proxy(FlinkShimsProxy.scala:60)
	at org.apache.streampark.flink.client.FlinkClient$.proxy(FlinkClient.scala:75)
	at org.apache.streampark.flink.client.FlinkClient$.submit(FlinkClient.scala:49)
	at org.apache.streampark.flink.client.FlinkClient.submit(FlinkClient.scala)
	at org.apache.streampark.console.core.service.impl.ApplicationServiceImpl.lambda$start$10(ApplicationServiceImpl.java:1544)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
	... 3 more
Caused by: java.lang.NoSuchFieldError: CANCEL_ENABLE
	at org.apache.streampark.flink.client.trait.FlinkClientTrait.submit(FlinkClientTrait.scala:102)
	at org.apache.streampark.flink.client.trait.FlinkClientTrait.submit$(FlinkClientTrait.scala:63)
	at org.apache.streampark.flink.client.impl.YarnApplicationClient$.submit(YarnApplicationClient.scala:44)
	at org.apache.streampark.flink.client.FlinkClientHandler$.submit(FlinkClientHandler.scala:40)
	at org.apache.streampark.flink.client.FlinkClientHandler.submit(FlinkClientHandler.scala)

Upgrade to version 2.1.1.

changeme2012 avatar Jun 05 '23 05:06 changeme2012


  1. It can be resolved by changing the dependency scope to provided.
  2. Are there other ways to solve it, for example a fat jar?
  3. Why does changing the child-first / parent-first option not take effect?

3yekn1 avatar Sep 04 '23 03:09 3yekn1

An error is reported when the source code is compiled


At present, you can comment out these two files to compile, then uncomment them and compile again. You can try it.

caicancai avatar Sep 04 '23 08:09 caicancai

  1. After streamx-console starts, app.home is not set and a NullPointerException is thrown


The streamx-console initialization check failed. If started locally for development and debugging, make sure the -Dapp.home parameter is explicitly specified in the VM options; more detail: http://www.streamxhub.com/docs/user-guide/development/#vm-options

The link http://www.streamxhub.com/docs/user-guide/development/#vm-options has expired; now you can refer to this address: https://streampark.apache.org/zh-CN/docs/user-guide/deployment/

liyichencc avatar Sep 06 '23 02:09 liyichencc