
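My Flink job can read data from HDFS when I run it locally in IDEA, but after I package the project and submit it to a Flink cluster it can no longer read from HDFS and fails with the error below. Why is that?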

 The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: Job failed. (JobID: 74a2d820909fee963c4dea371b5c236c)
    at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:483)
    at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
    at org.apache.flink.streaming.api.scala.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.scala:654)
    at org.myflink.quickstart.WordCount$.main(WordCount.scala:20)
    at org.myflink.quickstart.WordCount.main(WordCount.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:423)
    at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813)
    at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
    at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
    at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
    at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
    at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265)
    ... 19 more
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 'hdfs'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.
    at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:403)
    at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:318)
    at org.apache.flink.streaming.api.functions.source.ContinuousFileMonitoringFunction.run(ContinuousFileMonitoringFunction.java:196)
    at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:93)
    at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:57)
    at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:97)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Hadoop is not in the classpath/dependencies.
    at org.apache.flink.core.fs.UnsupportedSchemeFactory.create(UnsupportedSchemeFactory.java:64)
    at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:399)
    ... 8 more

My local bashrc already sets:

HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop

flink-conf.yaml also has the following setting:

env.hadoop.conf.dir=/usr/local/hadoop/etc/hadoop

What is causing this?

从大数据到人工智能 2019-06-06 13:19:23
2 answers
  • The JAR contains a core-default.xml file. Fix that. This file is what keeps the hdfs address from being recognized, so the file cannot be read.

    2019-09-03 11:26:11
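  • I am using flink-1.7.2 with hadoop-2.7.2. I had also configured hadoop_conf_dir and the env setting, but reading data from HDFS still failed with this error. After I added flink-shaded-hadoop2-uber-1.7.2.jar (the build matching my Flink and Hadoop versions, downloaded from the official website) to flink/lib, the error went away. I'm not sure whether your problem can be solved the same way.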

    2019-07-17 23:36:50
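Following up on the first answer: one way to see whether the job JAR really bundles a core-default.xml is to list the JAR's entries. Below is a minimal sketch using only the JDK. The class name JarConfigScan and the JAR path passed on the command line are hypothetical placeholders, not part of Flink or Hadoop.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper for illustration; not part of Flink or Hadoop.
class JarConfigScan {

    // Returns the full names of all entries in the JAR whose file name
    // (last path segment) equals targetName, e.g. "core-default.xml".
    static List<String> findEntries(String jarPath, String targetName) throws IOException {
        List<String> hits = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                // Compare only the last path segment so nested copies are found too.
                String fileName = name.substring(name.lastIndexOf('/') + 1);
                if (fileName.equals(targetName)) {
                    hits.add(name);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the job JAR (placeholder, supply your own).
        for (String hit : findEntries(args[0], "core-default.xml")) {
            System.out.println(hit);
        }
    }
}
```

core-default.xml normally ships inside Hadoop's own hadoop-common JAR, so a hit here suggests the job JAR is carrying its own copy of Hadoop's defaults. How to fix that depends on the build setup, which the answer does not describe.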
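The second answer's fix lines up with the innermost cause in the trace, "Hadoop is not in the classpath/dependencies": HADOOP_CONF_DIR points at configuration files, not at the Hadoop classes themselves. A rough way to check whether the Hadoop file system classes are visible on a given classpath is a small probe like the one below. HadoopClasspathProbe is a made-up name for this sketch, and it assumes the two Hadoop class names listed are the relevant ones for the hdfs scheme.

```java
// Hypothetical diagnostic for illustration; not part of Flink.
class HadoopClasspathProbe {

    // True if the named class can be located by this class's class loader.
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, HadoopClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: these are the Hadoop classes needed for the hdfs scheme.
        String[] classes = {
            "org.apache.hadoop.fs.FileSystem",
            "org.apache.hadoop.hdfs.DistributedFileSystem"
        };
        for (String c : classes) {
            System.out.println(c + " loadable: " + isLoadable(c));
        }
    }
}
```

Run it on a cluster node with Flink's lib directory on the classpath, for example `java -cp "/path/to/flink/lib/*:." HadoopClasspathProbe` (the path is a placeholder). If both lines report false, no Hadoop JAR is reachable from flink/lib, which would be consistent with the error above.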
