How should I handle a Flink SQL streaming job that ends abnormally?

I started a yarn-session: bin/yarn-session.sh -jm 1g -tm 4g -s 4 -qu root.flink -nm fsql-cli 2>&1 &

Then I submitted a SQL statement through sql-client.

Its main logic is to join a Kafka table with a Hive dimension table and write the aggregated result to MySQL.

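While it runs, the job status frequently changes to succeeded, sometimes after only a few hours and sometimes after a few dozen hours, as shown in this screenshot: https://s1.ax1x.com/2020/06/29/Nf2dIA.png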

The log contains an exception logged at INFO level. The log from around 15:34, when the job ended, is as follows:

2020-06-29 14:53:20,260 INFO org.apache.flink.api.common.io.LocatableInputSplitAssigner - Assigning remote split to host uhadoop-op3raf-core12
2020-06-29 14:53:22,845 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: HiveTableSource(vid, q70) TablePath: dw.video_pic_title_q70, PartitionPruned: false, PartitionNums: null (1/1) (68c24aa59c898cefbb20fbc929ddbafd) switched from RUNNING to FINISHED.
2020-06-29 15:34:52,982 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Shutting YarnSessionClusterEntrypoint down with application status SUCCEEDED. Diagnostics null.
2020-06-29 15:34:52,984 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Shutting down rest endpoint.
2020-06-29 15:34:53,072 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Removing cache directory /tmp/flink-web-cdb67193-05ee-4a83-b957-9b7a9d85c23f/flink-web-ui
2020-06-29 15:34:53,073 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - http://uhadoop-op3raf-core1:44664 lost leadership
2020-06-29 15:34:53,074 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Shut down complete.
2020-06-29 15:34:53,074 INFO org.apache.flink.yarn.YarnResourceManager - Shut down cluster because application is in SUCCEEDED, diagnostics null.
2020-06-29 15:34:53,076 INFO org.apache.flink.yarn.YarnResourceManager - Unregister application from the YARN Resource Manager with final status SUCCEEDED.
2020-06-29 15:34:53,088 INFO org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl - Waiting for application to be successfully unregistered.
2020-06-29 15:34:53,306 INFO org.apache.flink.runtime.entrypoint.component.DispatcherResourceManagerComponent - Closing components.
2020-06-29 15:34:53,308 INFO org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess - Stopping SessionDispatcherLeaderProcess.
2020-06-29 15:34:53,309 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Stopping dispatcher akka.tcp://flink@uhadoop-op3raf-core1:38817/user/dispatcher.
2020-06-29 15:34:53,310 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Stopping all currently running jobs of dispatcher akka.tcp://flink@uhadoop-op3raf-core1:38817/user/dispatcher.
2020-06-29 15:34:53,311 INFO org.apache.flink.runtime.jobmaster.JobMaster - Stopping the JobMaster for job default: insert into rt_app.app_video_cover_abtest_test ...
2020-06-29 15:34:53,322 INFO org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl - Interrupted while waiting for queue
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2048)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:287)
2020-06-29 15:34:53,324 INFO org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy - Opening proxy : uhadoop-op3raf-core12:23333
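
For anyone sifting through a long JobManager log like this one, here is a minimal Python sketch that pulls out only the task state transitions and the cluster shutdown status. It assumes one log entry per line in the `timestamp LEVEL logger - message` layout seen above. The function name `summarize` and the file name `jobmanager.log` are made up for illustration and are not part of Flink.

```python
import re

# Assumed layout: "<timestamp> <LEVEL> <logger> - <message>", one entry per line.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<logger>\S+)\s+-\s+(?P<msg>.*)$"
)
# e.g. "... switched from RUNNING to FINISHED."
TRANSITION_RE = re.compile(r"switched from (?P<old>[A-Z_]+) to (?P<new>[A-Z_]+)")
# e.g. "Shutting ... down with application status SUCCEEDED."
SHUTDOWN_RE = re.compile(r"with application status (?P<status>[A-Z_]+)")


def summarize(lines):
    """Return (timestamp, kind, subject, state) tuples for notable entries."""
    events = []
    for line in lines:
        entry = LINE_RE.match(line.strip())
        if entry is None:
            continue  # stack-trace lines and other continuations are skipped
        msg = entry.group("msg")
        transition = TRANSITION_RE.search(msg)
        if transition:
            # Everything before "switched from" names the task.
            task = msg[: transition.start()].strip()
            events.append((entry.group("ts"), "transition", task, transition.group("new")))
            continue
        shutdown = SHUTDOWN_RE.search(msg)
        if shutdown:
            component = entry.group("logger").rsplit(".", 1)[-1]
            events.append((entry.group("ts"), "shutdown", component, shutdown.group("status")))
    return events


if __name__ == "__main__":
    # "jobmanager.log" is a placeholder path; point it at the real log file.
    with open("jobmanager.log", encoding="utf-8") as fh:
        for event in summarize(fh):
            print(*event, sep=" | ")
```

Run against the excerpt above, this would surface the HiveTableSource transition from RUNNING to FINISHED at 14:53:22 and the shutdown with status SUCCEEDED at 15:34:52, which makes the order of events easier to see.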

P.S.:

Data is being written to Kafka continuously.
The Flink version is 1.10.0.

Why does the job status change to SUCCEEDED?

Thanks, everyone!

*From the Flink mailing list archive, compiled by volunteers
