
Connecting to ZooKeeper: ConnLoss leads to OOM

Dependency versions

zookeeper: 3.4.9

    <dependency>
        <groupId>org.apache.dubbo</groupId>
        <artifactId>dubbo-spring-boot-starter</artifactId>
        <version>2.7.8</version>
    </dependency>

    <dependency>
        <groupId>org.apache.curator</groupId>
        <artifactId>curator-recipes</artifactId>
        <version>2.10.0</version>
    </dependency>

Dubbo configuration

    dubbo.protocol.lazy=true
    dubbo.protocol.heartbeat=3000
    dubbo.registry.address=
    dubbo.registry.register=false
    dubbo.consumer.check=false
    dubbo.application.qos.enable=false
    dubbo.registry.file=./dubboregistry/dubbo-registry.properties
    dubbo.provider.timeout=10000
    dubbo.provider.threadpool=fixed
    dubbo.provider.threads=100
    dubbo.provider.protocol=dubbo
    dubbo.provider.cluster=failfast
    dubbo.provider.loadbalance=roundrobin
    dubbo.provider.server=netty
    dubbo.consumer.lazy=true
    dubbo.consumer.timeout=10000
    dubbo.application.logger=slf4j

The error logs are as follows:

    2021-11-11 09:49:40 | siac | * | Curator-Framework-0 | INFO | o.a.curator.framework.state.ConnectionStateManager | | ConnectionStateManager.java:228 | State change: LOST
    2021-11-11 09:49:41 | siac | * | Curator-Framework-0 | INFO | o.apache.curator.framework.recipes.cache.TreeCache | | TreeCache.java:469 | Unknown event CuratorEventImpl{type=WATCHED, resultCode=-4, path='null', name='null', children=null, context=null, stat=null, data=null, watchedEvent=WatchedEvent state:Disconnected type:None path:null, aclList=null}
    2021-11-11 09:49:41 | siac | * | Curator-Framework-0 | INFO | o.apache.curator.framework.recipes.cache.TreeCache | | TreeCache.java:469 | Unknown event CuratorEventImpl{type=WATCHED, resultCode=-4, path='null', name='null', children=null, context=null, stat=null, data=null, watchedEvent=WatchedEvent state:Disconnected type:None path:null, aclList=null}

    2021-11-11 09:49:48 | siac | * | DubboRegistryRetryTimer-thread-1 | INFO | o.apache.dubbo.registry.retry.FailedSubscribedTask | | AbstractRetryTask.java:121 | [DUBBO] retry subscribe : consumer://10.135.101.104/com.authority.service.IAuthorityService?application=*&category=providers,configurators,routers&check=false&dubbo=2.0.2&init=false&interface=com.authority.service.IAuthorityService&lazy=true&logger=slf4j&metadata-type=remote&methods=getRoleInfoByRoleEnName,updateRole,addResources,getAuthorithByUsername,deleteResource,getOperateLogList,addLog,disableRole,getUsefulRoles,getResourcesDetail,batchInsertAuthUserRole,getRoles,powerList,getResourcesTree,getUserByUsername,getResourcesList,getRole,getAuthTreeByRoleId,selectBdList,getUserInfoByRoleId,getRoleByUsername,modifyResource,getUserInfoByRoleEnName,powerRedis,userRole,addRole,getRolesByResKey&pid=948&qos.enable=false&release=2.7.8&side=consumer&sticky=false&timeout=10000&timestamp=1636446226827, dubbo version: 2.7.8, current host: 10.135.101.104
    2021-11-11 09:49:48 | siac | * | Curator-Framework-0 | INFO | o.apache.curator.framework.recipes.cache.TreeCache | | TreeCache.java:469 | Unknown event CuratorEventImpl{type=WATCHED, resultCode=-4, path='null', name='null', children=null, context=null, stat=null, data=null, watchedEvent=WatchedEvent state:Disconnected type:None path:null, aclList=null}

    2021-11-11 09:49:47 | siac | * | main-SendThread(10.135.101.101:2181) | WARN | org.apache.zookeeper.ClientCnxn | | ClientCnxn.java:1102 | Session 0x17a78d8154236ec for server 10.135.101.101/10.135.101.101:2181, unexpected error, closing socket connection and attempting reconnect
    java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:65)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
    2021-11-11 09:49:47 | siac | * | Curator-ConnectionStateManager-0 | WARN | o.a.d.r.zookeeper.curator.CuratorZookeeperClient | | CuratorZookeeperClient.java:376 | [DUBBO] Curator zookeeper connection recovered from connection lose, reuse the old session 17a78d8154236ec, dubbo version: 2.7.8, current host: 10.135.101.104
    2021-11-11 09:49:47 | siac | * | Curator-ConnectionStateManager-0 | WARN | o.a.dubbo.registry.zookeeper.ZookeeperRegistry | | ZookeeperRegistry.java:89 | [DUBBO] Trying to fetch the latest urls, in case there're provider changes during connection loss. Since ephemeral ZNode will not get deleted for a connection lose, there's no need to re-register url of this instance., dubbo version: 2.7.8, current host: 10.135.101.104

    2021-11-11 10:23:46 | siac | * | main-EventThread | ERROR | o.a.curator.framework.imps.CuratorFrameworkImpl | | CuratorFrameworkImpl.java:557 | Background operation retry gave up
    org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728)
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:516)
        at org.apache.curator.framework.imps.GetChildrenBuilderImpl$2.processResult(GetChildrenBuilderImpl.java:166)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:593)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)

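The direct cause of the out-of-memory error is shown here:

[Image: 提问72.png]

For the vast majority of ClientCnxn$Packet instances, the AsyncCallback cb field is a GetChildrenBuilderImpl.

Live objects:

[Image: 提问73.png]

I have run into this problem twice.

Originally asked by GitHub user shenhuaxin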

Posted by 大圣东游 on 2023-05-11 19:19:48
1 answer
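  • This is probably a defect in older Curator versions. The curator-recipes version used on the master branch has already been upgraded to 4.2.0.

    I suggest checking your application code; your situation is almost identical to the one described in the post linked below.

    https://blog.csdn.net/a17816876003/article/details/107899354

    Judging from the screenshots, your Xmx appears to be under 400M and the largest live object is under 40M, so a memory leak is unlikely. If you have not configured a maximum number of reconnection attempts for a lost zk connection, the default is to reconnect indefinitely; on an unstable network, continuous reconnection consumes a lot of memory.

    In addition, version 2.7.8 has known problems that can lead to OOM, such as registered URLs containing a timestamp and therefore being added repeatedly, so it is best to upgrade Dubbo.

    Originally answered by GitHub user zrlw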

    2023-05-12 11:06:47
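
To illustrate the reconnect advice in the answer above, here is a minimal Java sketch of a retry loop with a hard cap on attempts. The class `BoundedRetry` and its `attempt` method are hypothetical names made up for this example; they are not part of Dubbo or Curator. In a real client the cap would normally come from the retry policy handed to Curator (for example `RetryNTimes` or `ExponentialBackoffRetry`), and how Dubbo 2.7.8 wires that policy should be checked against its source rather than assumed from this sketch.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical illustration only: a retry loop that gives up after a fixed number of retries. */
final class BoundedRetry {

    /**
     * Runs op until it reports success or the retry budget is used up.
     *
     * @param op         the operation to try; returns true on success
     * @param maxRetries number of retries allowed after the first attempt
     * @param sleepMs    pause between attempts, in milliseconds
     * @return the number of attempts made on success, or -1 if every attempt failed
     *         or the thread was interrupted while waiting
     */
    public static int attempt(BooleanSupplier op, int maxRetries, long sleepMs) {
        for (int i = 0; i <= maxRetries; i++) {
            if (op.getAsBoolean()) {
                return i + 1;
            }
            if (i < maxRetries) {
                try {
                    Thread.sleep(sleepMs);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated connection that comes back on the third try.
        int attempts = attempt(() -> ++calls[0] >= 3, 5, 10L);
        System.out.println("succeeded after " + attempts + " attempts");

        // Simulated connection that never comes back: the loop stops instead of retrying forever.
        int result = attempt(() -> false, 3, 10L);
        System.out.println("gave up, result = " + result);
    }
}
```

The point of the cap is that a failed operation is eventually dropped and reported, so work queued behind a dead connection cannot pile up without limit.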
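
The answer's remark about timestamps in registered URLs can be made concrete with a small model. The sketch below is not Dubbo's code: `UrlKeyDemo`, `stripTimestamp` and `distinctKeys` are hypothetical names, and the URL is a shortened form of the consumer URL in the log. It only shows why a collection keyed by the full URL string keeps growing when each entry carries a different `timestamp` parameter, and why a key with that parameter removed does not.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration only: how a per-entry timestamp defeats de-duplication. */
final class UrlKeyDemo {

    /** Returns the URL with any "timestamp=..." query parameter removed. */
    public static String stripTimestamp(String url) {
        int q = url.indexOf('?');
        if (q < 0) {
            return url;
        }
        StringBuilder sb = new StringBuilder(url.substring(0, q));
        char sep = '?';
        for (String param : url.substring(q + 1).split("&")) {
            if (param.isEmpty() || param.startsWith("timestamp=")) {
                continue;
            }
            sb.append(sep).append(param);
            sep = '&';
        }
        return sb.toString();
    }

    /**
     * Adds the same logical URL to a set `rounds` times, each time with a new timestamp,
     * and returns how many distinct keys the set ends up holding.
     */
    public static int distinctKeys(int rounds, boolean strip) {
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < rounds; i++) {
            String url = "consumer://10.135.101.104/com.authority.service.IAuthorityService"
                    + "?side=consumer&timeout=10000&timestamp=" + (1636446226827L + i);
            seen.add(strip ? stripTimestamp(url) : url);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("keyed by full URL:        " + distinctKeys(1000, false)); // 1000 entries
        System.out.println("keyed without timestamp:  " + distinctKeys(1000, true));  // 1 entry
    }
}
```

Whether this is the exact mechanism behind the 2.7.8 issue the answer refers to is not confirmed by the thread; upgrading Dubbo, as the answer suggests, remains the practical fix.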
