
Why does triggering a Flink savepoint fail?

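    Triggering a savepoint on my Flink job always fails with the error below. It only reports a timeout, and there is no further information to look at. I read that the checkpoint timeout can be configured, so I set it to 2 min, but the savepoint trigger fails before 2 min have elapsed. My state is also not large.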

   


 The program finished with the following exception:

org.apache.flink.util.FlinkException: Triggering a savepoint for the job 00000000000000000000000000000000 failed.
    at org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend.java:777)
    at org.apache.flink.client.cli.CliFrontend.lambda$savepoint$9(CliFrontend.java:754)
    at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1002)
    at org.apache.flink.client.cli.CliFrontend.savepoint(CliFrontend.java:751)
    at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1072)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
Caused by: java.util.concurrent.TimeoutException
    at org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:1255)
    at org.apache.flink.runtime.concurrent.DirectExecutorService.execute(DirectExecutorService.java:217)
    at org.apache.flink.runtime.concurrent.FutureUtils.lambda$orTimeout$15(FutureUtils.java:582)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

*From the Flink mailing list archive, compiled by volunteers

EXCEED 2021-12-02 14:52:40
1 answer
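  • This error means that after the client sent the request to trigger the checkpoint, the JobManager did not report the checkpoint result in time. There are many possible reasons for that, for example the checkpoint timing out or a network communication problem. Open the Flink web UI to see whether it shows more information, or check the JobManager and TaskManager logs. *From the Flink mailing list archive, compiled by volunteers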
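
    The point about the client not getting a result in time can be illustrated with a toy model. The sketch below is plain Python, not Flink code: `trigger_savepoint`, `work_seconds`, `checkpoint_timeout` and `client_timeout` are hypothetical names invented for the illustration. It only shows that when the caller waits for less time than the checkpoint timeout, the caller can report a timeout first, which would match an error appearing before the configured 2 min. Which timeout actually fired in the asker's job cannot be determined from the stack trace alone.

```python
import concurrent.futures
import time


def trigger_savepoint(work_seconds, checkpoint_timeout, client_timeout):
    """Toy model of a client waiting on a checkpoint (hypothetical, not a Flink API).

    Returns "completed", "checkpoint timeout", or "client timeout".
    """

    def checkpoint():
        # Stand-in for the snapshot work done on the cluster side.
        if work_seconds > checkpoint_timeout:
            # The checkpoint itself is aborted once its own timeout elapses.
            time.sleep(checkpoint_timeout)
            return "checkpoint timeout"
        time.sleep(work_seconds)
        return "completed"

    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(checkpoint)
        try:
            # The client only waits client_timeout seconds for the result.
            return future.result(timeout=client_timeout)
        except concurrent.futures.TimeoutError:
            # The client gives up, even though the checkpoint may still be running.
            return "client timeout"


if __name__ == "__main__":
    # Checkpoint would finish within its own timeout, but the client stops waiting earlier.
    print(trigger_savepoint(work_seconds=0.3, checkpoint_timeout=1.0, client_timeout=0.05))
```

    In this model, raising only `checkpoint_timeout` does not help when `client_timeout` is the shorter of the two, which is why the JobManager and TaskManager logs are the place to find out which side gave up.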

    2021-12-02 15:02:51