
Configured S3 as the state storage; the job fails with a 403 error

How should I troubleshoot this? The full exception stack trace is below:

```
org.apache.flink.runtime.rest.handler.RestHandlerException: Could not execute application.
    at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleRequest$1(JarRunHandler.java:110)
    at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:836)
    at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:811)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1609)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.util.concurrent.CompletionException: org.apache.flink.util.FlinkRuntimeException: Could not execute application.
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606)
    ... 1 more
Caused by: org.apache.flink.util.FlinkRuntimeException: Could not execute application.
    at org.apache.flink.client.deployment.application.DetachedApplicationRunner.tryExecuteJobs(DetachedApplicationRunner.java:88)
    at org.apache.flink.client.deployment.application.DetachedApplicationRunner.run(DetachedApplicationRunner.java:70)
    at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleRequest$0(JarRunHandler.java:104)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    ... 1 more
Caused by: org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: Failed to execute sql
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222)
    at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114)
    at org.apache.flink.client.deployment.application.DetachedApplicationRunner.tryExecuteJobs(DetachedApplicationRunner.java:84)
    ... 4 more
Caused by: org.apache.flink.table.api.TableException: Failed to execute sql
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:777)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:742)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:856)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:730)
    at com.hzsun.zbp.OneFormEtl.main(OneFormEtl.java:305)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355)
    ... 7 more
Caused by: org.apache.flink.util.FlinkException: Failed to execute job 'OneFormEtl'.
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1969)
    at org.apache.flink.client.program.StreamContextEnvironment.executeAsync(StreamContextEnvironment.java:137)
    at org.apache.flink.table.planner.delegation.ExecutorBase.executeAsync(ExecutorBase.java:55)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:759)
    ... 16 more
Caused by: java.lang.RuntimeException: org.apache.flink.runtime.client.JobInitializationException: Could not start the JobMaster.
    at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:316)
    at org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:75)
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
    at java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:457)
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
Caused by: org.apache.flink.runtime.client.JobInitializationException: Could not start the JobMaster.
    at org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.lambda$new$0(DefaultJobMasterServiceProcess.java:97)
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1609)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.util.concurrent.CompletionException: org.apache.flink.util.FlinkRuntimeException: Failed to create checkpoint storage at checkpoint coordinator side.
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606)
    ... 7 more
Caused by: org.apache.flink.util.FlinkRuntimeException: Failed to create checkpoint storage at checkpoint coordinator side.
    at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.<init>(CheckpointCoordinator.java:325)
    at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.<init>(CheckpointCoordinator.java:241)
    at org.apache.flink.runtime.executiongraph.DefaultExecutionGraph.enableCheckpointing(DefaultExecutionGraph.java:448)
    at org.apache.flink.runtime.executiongraph.DefaultExecutionGraphBuilder.buildGraph(DefaultExecutionGraphBuilder.java:311)
    at org.apache.flink.runtime.scheduler.DefaultExecutionGraphFactory.createAndRestoreExecutionGraph(DefaultExecutionGraphFactory.java:107)
    at org.apache.flink.runtime.scheduler.SchedulerBase.createAndRestoreExecutionGraph(SchedulerBase.java:342)
    at org.apache.flink.runtime.scheduler.SchedulerBase.<init>(SchedulerBase.java:190)
    at org.apache.flink.runtime.scheduler.DefaultScheduler.<init>(DefaultScheduler.java:122)
    at org.apache.flink.runtime.scheduler.DefaultSchedulerFactory.createInstance(DefaultSchedulerFactory.java:132)
    at org.apache.flink.runtime.jobmaster.DefaultSlotPoolServiceSchedulerFactory.createScheduler(DefaultSlotPoolServiceSchedulerFactory.java:110)
    at org.apache.flink.runtime.jobmaster.JobMaster.createScheduler(JobMaster.java:340)
    at org.apache.flink.runtime.jobmaster.JobMaster.<init>(JobMaster.java:317)
    at org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.internalCreateJobMasterService(DefaultJobMasterServiceFactory.java:107)
    at org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.lambda$createJobMasterService$0(DefaultJobMasterServiceFactory.java:95)
    at org.apache.flink.util.function.FunctionUtils.lambda$uncheckedSupplier$4(FunctionUtils.java:112)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    ... 7 more
Caused by: java.nio.file.AccessDeniedException: s3://test/flink-checkpoints/91801a52f8ea3d9b182c603abb60bf7a/shared: getFileStatus on s3://test/flink-checkpoints/91801a52f8ea3d9b182c603abb60bf7a/shared: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: null; S3 Extended Request ID: null; Proxy: null), S3 Extended Request ID: null:403 Forbidden
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:218)
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:145)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2184)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2149)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerMkdirs(S3AFileSystem.java:2037)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:2007)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2326)
    at org.apache.flink.fs.s3hadoop.common.HadoopFileSystem.mkdirs(HadoopFileSystem.java:183)
    at org.apache.flink.core.fs.PluginFileSystemFactory$ClassLoaderFixingFileSystem.mkdirs(PluginFileSystemFactory.java:162)
    at org.apache.flink.runtime.state.filesystem.FsCheckpointStorageAccess.initializeBaseLocations(FsCheckpointStorageAccess.java:113)
    at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.<init>(CheckpointCoordinator.java:323)
    ... 22 more
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: null; S3 Extended Request ID: null; Proxy: null), S3 Extended Request ID: null
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1811)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1395)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1371)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5062)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5008)
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1338)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getObjectMetadata$4(S3AFileSystem.java:1235)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:317)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:280)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:1232)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2169)
    ... 31 more
```
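In a trace this long, the actionable part is usually the innermost `Caused by`. The following is a minimal sketch in Python that pulls the last cause, the HTTP status code, and the first S3 path out of pasted trace text. The helper names `root_cause` and `summarize_s3_failure` are hypothetical and not part of Flink or Hadoop, and `SAMPLE` is a shortened excerpt of the trace above.

```python
import re

# Shortened excerpt of the trace above, used only to exercise the helpers.
SAMPLE = (
    "org.apache.flink.util.FlinkRuntimeException: Failed to create checkpoint storage "
    "at checkpoint coordinator side.\n"
    "    at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.<init>(CheckpointCoordinator.java:325)\n"
    "Caused by: java.nio.file.AccessDeniedException: "
    "s3://test/flink-checkpoints/91801a52f8ea3d9b182c603abb60bf7a/shared: getFileStatus on ...\n"
    "    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:218)\n"
    "Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden "
    "(Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: null)\n"
    "    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)\n"
)


def root_cause(trace: str) -> str:
    """Return the innermost 'Caused by:' entry, cut before its first stack frame."""
    last = trace.split("Caused by: ")[-1]
    # A frame looks like " at pkg.Class.method(": whitespace, "at", a dotted name, "(".
    frame = re.search(r"\s+at [\w$.<>]+\(", last)
    return last[: frame.start()].strip() if frame else last.strip()


def summarize_s3_failure(trace: str) -> dict:
    """Collect the exception class, HTTP status, and first S3 path found in a trace."""
    cause = root_cause(trace)
    status = re.search(r"Status Code: (\d+)", cause)
    path = re.search(r"s3a?://[^\s:]+", trace)
    return {
        "exception": cause.split(":", 1)[0],
        "status": int(status.group(1)) if status else None,
        "path": path.group(0) if path else None,
    }


if __name__ == "__main__":
    print(summarize_s3_failure(SAMPLE))
```

For this trace the summary points at a 403 from the S3 client while Flink was checking the `shared` directory under the checkpoint path, which narrows the search to credentials, endpoint, and permissions on that prefix.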

sympathetic 2023-04-07 14:53:59
3 answers
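  • WeChat official account: 网络技术联盟站; InfoQ contracted author; Alibaba Cloud community contracted author; Huawei Cloud Expert; BOSS直聘 top creator; Tencent Classroom lead creator. Blog and forum: https://www.wljslmz.cn, engineer navigation site: https://www.wljslmz.com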

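    In Realtime Compute for Apache Flink, a 403 error when S3 is used for state storage usually means the request to S3 was denied for lack of permission. Some possible fixes:

    1. Check the access key ID and secret access key: Realtime Compute for Apache Flink needs both configured in order to reach the S3 bucket. Make sure they are correct and carry enough permission for the required operations, such as reading and writing state.

    2. Check the permission policy: the bucket's access policy must grant the access key used by Realtime Compute for Apache Flink permission for those same operations. Inspect the bucket policy and confirm that it allows requests from the Flink job.

    3. Check network connectivity: make sure the environment running Realtime Compute for Apache Flink can reach the S3 bucket. For example, if Flink runs inside a virtual private cloud (VPC), the VPC networking must be configured so that Flink can access the bucket.

    4. Check the bucket region: Realtime Compute for Apache Flink must connect to the correct S3 region. Make sure the region specified in the Flink configuration matches the bucket's actual region.

    5. Check that the bucket exists and is named correctly: make sure the bucket name in the Flink configuration is correct and that the bucket has already been created.

    If none of the above resolves the problem, review the configuration and permission settings between Realtime Compute for Apache Flink and the S3 bucket in more detail. If needed, contact Alibaba Cloud technical support for further help.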
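To make the policy check concrete: the trace fails in `getFileStatus`, which S3A implements with a metadata (HEAD) request on the checkpoint path. On AWS S3, a HEAD on a key that does not exist yet returns 403 rather than 404 when the caller lacks `s3:ListBucket`, so a policy that grants only object-level actions can produce exactly this error; that is one possible cause here, not a confirmed one, and S3-compatible stores may behave differently. The sketch below builds an IAM policy document covering a checkpoint prefix. The function name `checkpoint_policy` is hypothetical, the action list is an assumption about what a checkpointing job typically needs, and the bucket `test` and prefix `flink-checkpoints` are taken from the trace.

```python
import json


def checkpoint_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy document for reading and writing one checkpoint prefix.

    The action list is an assumption, not an official Flink requirement:
    listing the bucket plus get/put/delete on objects under the prefix.
    """
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level permission: lets existence checks on missing keys
                # come back as 404 instead of 403.
                "Sid": "ListCheckpointBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Object-level permissions under the checkpoint prefix only.
                "Sid": "ReadWriteCheckpointObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*"],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(checkpoint_policy("test", "flink-checkpoints"), indent=2))
```

Comparing the printed document with the policy actually attached to the access key shows quickly whether a bucket-level or object-level grant is missing.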

    2023-04-08 23:44:35
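  • The heroes of the age rise from our generation; once in the jianghu, the years press on; empires and ambitions pass in talk and laughter, no match for one drunken night of life.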

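    A 403 when accessing S3 object storage through an internal domain name corresponds to the 403 error code returned when accessing COS. The symptom appears in three situations: when uploading or downloading resources through the COS API or SDK, when accessing COS resources with a temporary key or a sub-account, and when modifying the COS bucket configuration. Under IIS, the cause may be that no default document is configured or that ASP access is not enabled. For the former, create a default.asp file in the working directory; for the latter, enable ASP in the IIS site management console.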

    2023-04-07 17:15:43
  • Persistence is a lonely and long road.

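    The 403 error means S3 returned an access-denied response. You can try the following checks:

    1. Check that the S3 keys are correct: Flink needs a valid AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to access the S3 state backend. Make sure the keys are correct and have enough permission to access the state storage.

    2. Check that the S3 bucket exists: make sure the specified bucket exists and that you have permission to access it. You can verify this in the AWS S3 console or with the AWS CLI.

    3. Check the S3 access policy: make sure the bucket has an access policy that allows Flink to access it. You can configure these policies in the AWS S3 console.

    4. Check the S3 state backend configuration: make sure Flink is correctly configured to use S3 as the state backend. Review the settings in the configuration file and confirm they match your bucket and credentials.

    5. Check network connectivity: make sure the machine running Flink can reach the S3 bucket and is not blocked by a firewall or proxy. You can test the connection with the AWS CLI.

    6. Check the S3 region: make sure the region (or endpoint) Flink is configured with matches the region the bucket actually lives in; a mismatch can cause access errors. You can see the bucket's region in the AWS S3 console.

    7. Check S3 limits: make sure the bucket is not hitting a service limit. This is an unlikely cause, since S3 does not cap the number of objects in a bucket and request throttling normally surfaces as 503 Slow Down rather than 403.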
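The configuration check in step 4 can be partly automated. The sketch below parses a flat `flink-conf.yaml` and flags a few common mistakes. It assumes the standard Flink S3 filesystem options `state.checkpoints.dir`, `s3.access-key`, `s3.secret-key`, `s3.endpoint`, and `s3.path.style.access`; the function names `parse_flat_conf` and `check_s3_conf` are hypothetical, and the rules are heuristics rather than an exhaustive validation. Missing keys are not flagged on their own, because credentials may also come from the environment or an instance role.

```python
def parse_flat_conf(text: str) -> dict:
    """Parse flat 'key: value' lines, skipping blank lines and '#' comments."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        # Split on the first colon only, so values such as s3://... stay intact.
        key, value = line.split(":", 1)
        conf[key.strip()] = value.strip()
    return conf


def check_s3_conf(conf: dict) -> list:
    """Return human-readable warnings for likely S3 misconfigurations."""
    problems = []
    checkpoint_dir = conf.get("state.checkpoints.dir", "")
    if not checkpoint_dir.startswith(("s3://", "s3a://", "s3p://")):
        problems.append("state.checkpoints.dir does not point at an S3 path")
    has_access = bool(conf.get("s3.access-key"))
    has_secret = bool(conf.get("s3.secret-key"))
    if has_access != has_secret:
        problems.append("only one of s3.access-key / s3.secret-key is set")
    path_style = conf.get("s3.path.style.access", "false").lower()
    if conf.get("s3.endpoint") and path_style != "true":
        # Heuristic: many S3-compatible stores need path-style addressing.
        problems.append("custom s3.endpoint set without s3.path.style.access: true")
    return problems


if __name__ == "__main__":
    # Hypothetical file contents; the key values are placeholders.
    example = """
    state.checkpoints.dir: s3://test/flink-checkpoints
    s3.access-key: my-access-key
    s3.secret-key: my-secret-key
    """
    for warning in check_s3_conf(parse_flat_conf(example)) or ["no obvious problems"]:
        print(warning)
```

An empty result does not prove the credentials are valid; it only rules out the structural mistakes listed in the function.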

    2023-04-07 15:01:06
