
How should I handle a Flink SQL streaming job that ends abnormally?

I started a yarn-session: bin/yarn-session.sh -jm 1g -tm 4g -s 4 -qu root.flink -nm fsql-cli  2>&1 &

Then I submitted a SQL statement through sql-client.

Its main logic joins a Kafka table with a Hive dimension table and writes the aggregated result to MySQL.

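While the job runs, its status frequently changes to SUCCEEDED, sometimes after only a few hours and sometimes after several dozen hours, as shown in this screenshot: https://s1.ax1x.com/2020/06/29/Nf2dIA.png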

The log shows an exception reported at INFO level. The log around 15:34, when the job ended, is as follows:

2020-06-29 14:53:20,260 INFO org.apache.flink.api.common.io.LocatableInputSplitAssigner - Assigning remote split to host uhadoop-op3raf-core12
2020-06-29 14:53:22,845 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: HiveTableSource(vid, q70) TablePath: dw.video_pic_title_q70, PartitionPruned: false, PartitionNums: null (1/1) (68c24aa59c898cefbb20fbc929ddbafd) switched from RUNNING to FINISHED.
2020-06-29 15:34:52,982 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Shutting YarnSessionClusterEntrypoint down with application status SUCCEEDED. Diagnostics null.
2020-06-29 15:34:52,984 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Shutting down rest endpoint.
2020-06-29 15:34:53,072 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Removing cache directory /tmp/flink-web-cdb67193-05ee-4a83-b957-9b7a9d85c23f/flink-web-ui
2020-06-29 15:34:53,073 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - http://uhadoop-op3raf-core1:44664 lost leadership
2020-06-29 15:34:53,074 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Shut down complete.
2020-06-29 15:34:53,074 INFO org.apache.flink.yarn.YarnResourceManager - Shut down cluster because application is in SUCCEEDED, diagnostics null.
2020-06-29 15:34:53,076 INFO org.apache.flink.yarn.YarnResourceManager - Unregister application from the YARN Resource Manager with final status SUCCEEDED.
2020-06-29 15:34:53,088 INFO org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl - Waiting for application to be successfully unregistered.
2020-06-29 15:34:53,306 INFO org.apache.flink.runtime.entrypoint.component.DispatcherResourceManagerComponent - Closing components.
2020-06-29 15:34:53,308 INFO org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess - Stopping SessionDispatcherLeaderProcess.
2020-06-29 15:34:53,309 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Stopping dispatcher akka.tcp://flink@uhadoop-op3raf-core1:38817/user/dispatcher.
2020-06-29 15:34:53,310 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Stopping all currently running jobs of dispatcher akka.tcp://flink@uhadoop-op3raf-core1:38817/user/dispatcher.
2020-06-29 15:34:53,311 INFO org.apache.flink.runtime.jobmaster.JobMaster - Stopping the JobMaster for job default: insert into rt_app.app_video_cover_abtest_test ...
2020-06-29 15:34:53,322 INFO org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl - Interrupted while waiting for queue
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2048)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:287)
2020-06-29 15:34:53,324 INFO org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy - Opening proxy : uhadoop-op3raf-core12:23333

P.S.

  1. Data is continuously being written to Kafka.
  2. The Flink version is 1.10.0.

Why does the job status change to SUCCEEDED?

Thanks, everyone!

*From the Flink mailing list archive, compiled by volunteers

毛毛虫雨 2021-12-06 16:19:24
1 answer
  • Judging from your log, your data source is a Hive table? Check whether the job is running in batch mode rather than streaming mode.

    *From the Flink mailing list archive, compiled by volunteers

    2021-12-06 16:53:32
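
One place to check the mode suggested in the answer is the SQL client's environment file. The fragment below is only a sketch: it assumes the client was started with the stock conf/sql-client-defaults.yaml from a Flink 1.10 distribution, and the values are illustrative rather than taken from the asker's setup.

```yaml
# conf/sql-client-defaults.yaml (excerpt; values are illustrative assumptions)
execution:
  # Planner used by the SQL client; shown here as an assumption.
  planner: blink
  # 'streaming' keeps the job running on unbounded input;
  # 'batch' lets the job finish once all inputs are fully read.
  type: streaming
```

If type turns out to be batch, switching it to streaming and resubmitting the statement would be the first thing to try. Note that the log above shows the HiveTableSource task switching to FINISHED, which suggests the Hive table is read once as bounded input in either mode.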