Deploying a Redis Cluster on a Cloud Server and Connecting Clients Through the Public IP


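Contents

1. Configuration files
2. Starting the services and creating the cluster
(1) Start six Redis services
(2) Create the cluster with the client command
3. Client connection
(1) Client configuration
(2) Test case
(3) Error log analysis
4. Solving the problem
(1) Check the redis.conf configuration file
(2) Modify the configuration file
(3) Restart the Redis services and recreate the cluster
5. Lettuce client connection problem during failover
(1) Test case
(2) Stop one of the master nodes to simulate a crash
(3) Solutions
1) Switch the Redis client
2) Configure Redis Cluster topology refresh in the Lettuce client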

1. Configuration files

Six configuration files were prepared: redis-6381.conf, redis-6382.conf, redis-6383.conf, redis-6384.conf, redis-6385.conf, and redis-6386.conf. Their contents are as follows:

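# This configuration is trimmed down; compare it with the full official conf file for the complete set of options. Change the port number in each file accordingly.
# Run as a background daemon
daemonize yes
# Port number
port 6381
# Bind addresses. Exposing Redis to the public network is not recommended, so this binds the server's private IP and the loopback address
bind 172.17.0.13 127.0.0.1
# Directory where Redis stores its data files
dir /redis/workingDir
# Log file
logfile "/redis/logs/cluster-node-6381.log"
# Enable AOF
appendonly yes
# Enable cluster mode
cluster-enabled yes
# Cluster state file: holds the state of the other nodes, persisted variables, etc. It is generated automatically under the dir configured above
cluster-config-file cluster-node-6381.conf
# Maximum time (in milliseconds) a cluster node may be unreachable; if a master is unreachable for longer than this, a failover is triggered
cluster-node-timeout 5000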


Note: the Redis version used here is 6.0.4.
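
Since the six files differ only in the port number, they can be generated from one template. The following is a minimal sketch rather than part of the original setup: the class name `RedisConfGenerator` and its methods are hypothetical, and the template simply repeats the trimmed settings listed above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: renders redis-<port>.conf for ports 6381..6386.
final class RedisConfGenerator {
    // Same settings as the trimmed config above; %1$d is replaced by the port.
    private static final String TEMPLATE = String.join("\n",
            "daemonize yes",
            "port %1$d",
            "bind 172.17.0.13 127.0.0.1",
            "dir /redis/workingDir",
            "logfile \"/redis/logs/cluster-node-%1$d.log\"",
            "appendonly yes",
            "cluster-enabled yes",
            "cluster-config-file cluster-node-%1$d.conf",
            "cluster-node-timeout 5000",
            "");

    static String render(int port) {
        return String.format(TEMPLATE, port);
    }

    static String fileName(int port) {
        return "redis-" + port + ".conf";
    }

    public static void main(String[] args) throws IOException {
        // Output directory is taken from the first argument, defaulting to the current directory.
        Path outDir = Paths.get(args.length > 0 ? args[0] : ".");
        for (int port = 6381; port <= 6386; port++) {
            byte[] content = render(port).getBytes(StandardCharsets.UTF_8);
            Files.write(outDir.resolve(fileName(port)), content);
        }
    }
}
```

Running the class writes the six files into the chosen directory; the bind address and paths would need adjusting for a different server.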

2. Starting the services and creating the cluster

(1) Start six Redis services

redis-server redis-6381.conf
redis-server redis-6382.conf
redis-server redis-6383.conf
redis-server redis-6384.conf
redis-server redis-6385.conf
redis-server redis-6386.conf


(2) Create the cluster with the client command

Create the cluster, assigning one replica to each master node:

redis-cli --cluster create \
172.17.0.13:6381 172.17.0.13:6382 172.17.0.13:6383 \
172.17.0.13:6384 172.17.0.13:6385 172.17.0.13:6386 \
--cluster-replicas 1
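
With six nodes and one replica per master, the cluster ends up with three masters that share the 16384 hash slots. The sketch below shows one way to split the slots evenly; it is an illustration under the assumption that the split is done by rounding a fractional cursor, and the class name `SlotSplit` is hypothetical. For three masters it yields the same ranges that appear in the node files later in this article.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of dividing the 16384 hash slots among the masters.
final class SlotSplit {
    static final int TOTAL_SLOTS = 16384;

    /** Returns one {first, last} pair per master, together covering slots 0..16383. */
    static List<int[]> split(int masters) {
        List<int[]> ranges = new ArrayList<>();
        double slotsPerNode = TOTAL_SLOTS / (double) masters;
        double cursor = 0.0;
        int first = 0;
        for (int i = 0; i < masters; i++) {
            int last = (int) Math.round(cursor + slotsPerNode - 1);
            // The last master always takes whatever remains.
            if (last > TOTAL_SLOTS - 1 || i == masters - 1) {
                last = TOTAL_SLOTS - 1;
            }
            if (last < first) {
                last = first;
            }
            ranges.add(new int[] {first, last});
            first = last + 1;
            cursor += slotsPerNode;
        }
        return ranges;
    }

    public static void main(String[] args) {
        int nodes = 6;
        int replicasPerMaster = 1;                       // --cluster-replicas 1
        int masters = nodes / (replicasPerMaster + 1);   // 3 masters
        for (int[] r : split(masters)) {
            System.out.println(r[0] + "-" + r[1]);
        }
        // prints 0-5460, 5461-10922, 10923-16383 (one range per line)
    }
}
```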

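3. Client connection

(1) Client configuration

import java.util.Arrays;

import io.lettuce.core.ReadFrom;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisClusterConfiguration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.connection.lettuce.LettuceClientConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;

@Configuration
public class RedisClusterConfig {
  @Bean
  public RedisConnectionFactory redisConnectionFactory() {
    // Client-side read/write splitting: prefer reading from replicas
    LettuceClientConfiguration clientConfig = LettuceClientConfiguration.builder()
            .readFrom(ReadFrom.REPLICA_PREFERRED)
            .build();
    // Seed nodes, addressed through the server's public IP
    RedisClusterConfiguration redisClusterConfiguration = new RedisClusterConfiguration(Arrays.asList(
            "122.51.151.130:6381",
            "122.51.151.130:6382",
            "122.51.151.130:6383",
            "122.51.151.130:6384",
            "122.51.151.130:6385",
            "122.51.151.130:6386"));
    return new LettuceConnectionFactory(redisClusterConfiguration, clientConfig);
  }
}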

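(2) Test case

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.test.context.junit4.SpringRunner;

@RunWith(SpringRunner.class)
@SpringBootTest(classes = Application.class)
public class RedisClusterTest {
  @Autowired
  private StringRedisTemplate stringRedisTemplate;
  @Test
  public void readFromReplicaWriteToMasterTest() {
    System.out.println("Setting value...");
    stringRedisTemplate.opsForValue().set("username", "Nick");
    System.out.println("Value read: " + stringRedisTemplate.opsForValue().get("username"));
  }
}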


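(3) Error log analysis

The client reaches the seed nodes through the public IP, but the topology returned by the cluster lists every node under its private address 172.17.0.13, which is unreachable from outside the cloud network, so every connection attempt times out: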

2020-08-14 14:57:49.180  WARN 22012 --- [ioEventLoop-6-4] i.l.c.c.topology.ClusterTopologyRefresh  : Unable to connect to [172.17.0.13:6384]: connection timed out: /172.17.0.13:6384
2020-08-14 14:57:49.180  WARN 22012 --- [ioEventLoop-6-3] i.l.c.c.topology.ClusterTopologyRefresh  : Unable to connect to [172.17.0.13:6383]: connection timed out: /172.17.0.13:6383
2020-08-14 14:57:49.182  WARN 22012 --- [ioEventLoop-6-2] i.l.c.c.topology.ClusterTopologyRefresh  : Unable to connect to [172.17.0.13:6382]: connection timed out: /172.17.0.13:6382
2020-08-14 14:57:49.182  WARN 22012 --- [ioEventLoop-6-1] i.l.c.c.topology.ClusterTopologyRefresh  : Unable to connect to [172.17.0.13:6381]: connection timed out: /172.17.0.13:6381
2020-08-14 14:57:49.190  WARN 22012 --- [ioEventLoop-6-1] i.l.c.c.topology.ClusterTopologyRefresh  : Unable to connect to [172.17.0.13:6385]: connection timed out: /172.17.0.13:6385
2020-08-14 14:57:49.191  WARN 22012 --- [ioEventLoop-6-2] i.l.c.c.topology.ClusterTopologyRefresh  : Unable to connect to [172.17.0.13:6386]: connection timed out: /172.17.0.13:6386
2020-08-14 14:57:59.389  WARN 22012 --- [ioEventLoop-6-3] i.l.core.cluster.RedisClusterClient      : connection timed out: /172.17.0.13:6382
2020-08-14 14:58:09.391  WARN 22012 --- [ioEventLoop-6-4] i.l.core.cluster.RedisClusterClient      : connection timed out: /172.17.0.13:6381
2020-08-14 14:58:19.393  WARN 22012 --- [ioEventLoop-6-1] i.l.core.cluster.RedisClusterClient      : connection timed out: /172.17.0.13:6383
2020-08-14 14:58:29.396  WARN 22012 --- [ioEventLoop-6-2] i.l.core.cluster.RedisClusterClient      : connection timed out: /172.17.0.13:6384
2020-08-14 14:58:39.399  WARN 22012 --- [ioEventLoop-6-3] i.l.core.cluster.RedisClusterClient      : connection timed out: /172.17.0.13:6386
2020-08-14 14:58:49.402  WARN 22012 --- [ioEventLoop-6-4] i.l.core.cluster.RedisClusterClient      : connection timed out: /172.17.0.13:6385
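
The common thread in these warnings is that the address Lettuce tries to reach is a private one. As a rough diagnostic sketch (the class `UnreachableNodeCheck` and its methods are hypothetical, and the regular expression only targets the `Unable to connect to [ip:port]` message shown above), the address can be pulled out of a log line and tested against the private ranges:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper for the ClusterTopologyRefresh warnings above.
final class UnreachableNodeCheck {
    // Matches the "[ip:port]" part of the "Unable to connect to" warning.
    private static final Pattern ADDRESS =
            Pattern.compile("Unable to connect to \\[([0-9.]+):(\\d+)\\]");

    /** Returns the IP from the warning, or null if the line has a different shape. */
    static String extractHost(String logLine) {
        Matcher m = ADDRESS.matcher(logLine);
        return m.find() ? m.group(1) : null;
    }

    /** True for RFC 1918 (10/8, 172.16/12, 192.168/16) and loopback addresses. */
    static boolean isPrivate(String ip) {
        try {
            InetAddress address = InetAddress.getByName(ip); // IP literal, so no DNS lookup
            return address.isSiteLocalAddress() || address.isLoopbackAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IP literal: " + ip, e);
        }
    }

    public static void main(String[] args) {
        String line = "i.l.c.c.topology.ClusterTopologyRefresh  : "
                + "Unable to connect to [172.17.0.13:6384]: connection timed out: /172.17.0.13:6384";
        String host = extractHost(line);
        System.out.println(host + " private? " + isPrivate(host));
        System.out.println("122.51.151.130 private? " + isPrivate("122.51.151.130"));
    }
}
```

A private address in this position indicates that the cluster is announcing an address the external client cannot route to, which is what the next section addresses.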


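4. Solving the problem

(1) Check the redis.conf configuration file

The settings that let Redis announce a public IP can be found in redis.conf. The section below is aimed mainly at special deployments such as Docker, but it can also be used to specify a node's public IP, port, and bus port manually (the bus port defaults to the service port plus 10000).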

########################## CLUSTER DOCKER/NAT support  ########################
# In certain deployments, Redis Cluster nodes address discovery fails, because
# addresses are NAT-ted or because ports are forwarded (the typical case is
# Docker and other containers).
#
# In order to make Redis Cluster working in such environments, a static
# configuration where each node knows its public address is needed. The
# following two options are used for this scope, and are:
#
# * cluster-announce-ip
# * cluster-announce-port
# * cluster-announce-bus-port
#
# Each instruct the node about its address, client port, and cluster message
# bus port. The information is then published in the header of the bus packets
# so that other nodes will be able to correctly map the address of the node
# publishing the information.
#
# If the above options are not used, the normal Redis Cluster auto-detection
# will be used instead.
#
# Note that when remapped, the bus port may not be at the fixed offset of
# clients port + 10000, so you can specify any port and bus-port depending
# on how they get remapped. If the bus-port is not set, a fixed offset of
# 10000 will be used as usually.
#
# Example:
#
# cluster-announce-ip 10.1.1.5
# cluster-announce-port 6379
# cluster-announce-bus-port 6380


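(2) Modify the configuration file

Once the public IP is specified, the nodes in the cluster communicate with each other through the public IP, that is, over the external network. The bus ports (16381 and so on, as configured below) must therefore be opened in the cloud server's security group; otherwise the cluster stays in the fail state.

# This configuration is trimmed down; compare it with the full official conf file for the complete set of options. Change the port number in each file accordingly.
# Run as a background daemon
daemonize yes
# Port number
port 6381
# Bind addresses. Exposing Redis to the public network is not recommended, so this binds the server's private IP and the loopback address
bind 172.17.0.13 127.0.0.1
# Directory where Redis stores its data files
dir /redis/workingDir
# Log file
logfile "/redis/logs/cluster-node-6381.log"
# Enable AOF
appendonly yes
# Enable cluster mode
cluster-enabled yes
# Cluster state file: holds the state of the other nodes, persisted variables, etc. It is generated automatically under the dir configured above
cluster-config-file cluster-node-6381.conf
# Maximum time (in milliseconds) a cluster node may be unreachable; if a master is unreachable for longer than this, a failover is triggered
cluster-node-timeout 5000
# The public IP must be announced when deploying on a cloud server
cluster-announce-ip 122.51.151.130
# Cluster bus port, used for communication with the other nodes
cluster-announce-bus-port 16381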
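
The two extra lines follow a simple rule per node: the same public IP, and a bus port equal to the service port plus 10000. A small sketch of that rule, which also lists the ports that would need opening in the security group, might look like this (the class `AnnounceSettings` and its method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: derives the announce settings and the ports to open per node.
final class AnnounceSettings {
    static final int BUS_PORT_OFFSET = 10000; // default offset between service port and bus port

    static int busPort(int port) {
        return port + BUS_PORT_OFFSET;
    }

    /** The two lines appended to redis-<port>.conf in this article. */
    static String announceLines(String publicIp, int port) {
        return "cluster-announce-ip " + publicIp + "\n"
                + "cluster-announce-bus-port " + busPort(port) + "\n";
    }

    /** Service ports plus bus ports for the inclusive range firstPort..lastPort. */
    static List<Integer> portsToOpen(int firstPort, int lastPort) {
        List<Integer> ports = new ArrayList<>();
        for (int p = firstPort; p <= lastPort; p++) {
            ports.add(p);
        }
        for (int p = firstPort; p <= lastPort; p++) {
            ports.add(busPort(p));
        }
        return ports;
    }

    public static void main(String[] args) {
        System.out.print(announceLines("122.51.151.130", 6381));
        System.out.println("Open in the security group: " + portsToOpen(6381, 6386));
    }
}
```

For the six nodes here that gives twelve ports in total: 6381 to 6386 for clients and 16381 to 16386 for the cluster bus.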

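(3) Restart the Redis services and recreate the cluster

At this point we can compare the contents of the node state file cluster-node-6381.conf before and after the change.

Before specifying the public IP: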

[universe@VM_0_13_centos workingDir]$ cat cluster-node-6381.conf 
34287d78c1e9c4ff49880bb976707a0c17676f82 172.17.0.13:6384@16384 slave 1a206270f835a79e43e281df5f6f8215ab49d713 0 1597390563209 4 connected
e306ae5e3ead5f2a837d3bdc0b95c0bd8e3cff99 172.17.0.13:6383@16383 master - 0 1597390565212 3 connected 10923-16383
0932cc203a19f37a3f5ebca8278962f5b325c67e 172.17.0.13:6385@16385 slave 2cc1aed536ff5b48c2fdd94f16cd96cefc4fd4ef 0 1597390564711 5 connected
2cc1aed536ff5b48c2fdd94f16cd96cefc4fd4ef 172.17.0.13:6382@16382 master - 0 1597390565000 2 connected 5461-10922
1a206270f835a79e43e281df5f6f8215ab49d713 172.17.0.13:6381@16381 myself,master - 0 1597390564000 1 connected 0-5460
0f63accb455594d0625cffa8d09aacc580d7e428 172.17.0.13:6386@16386 slave e306ae5e3ead5f2a837d3bdc0b95c0bd8e3cff99 0 1597390564210 6 connected


After specifying the public IP:

[universe@VM_0_13_centos workingDir]$ cat cluster-node-6381.conf 
e2691ffd4bf7d867bc91b3b91c7b233a5f1e5dd2 122.51.151.130:6384@16384 master - 0 1597389992286 7 connected 10923-16383
511668874d39a7b1f701cc3df6f21d00510bfeae 122.51.151.130:6383@16383 slave e2691ffd4bf7d867bc91b3b91c7b233a5f1e5dd2 0 1597389991283 7 connected
e77e540ef4115abe920fb191f354b81f42e7b4ed 122.51.151.130:6381@16381 myself,master - 0 1597389991000 1 connected 0-5460
2a3ea359311b34cd59e10da7d2f1bba48403f0ee 122.51.151.130:6385@16385 slave e77e540ef4115abe920fb191f354b81f42e7b4ed 0 1597389990583 5 connected
2bf4f01a4dba802eb1a50d9510947a4af0ac92ef 122.51.151.130:6382@16382 master - 0 1597389992789 2 connected 5461-10922
2b7671e002143b329c9c6c969bfb825a86fb41b2 122.51.151.130:6386@16386 slave 2bf4f01a4dba802eb1a50d9510947a4af0ac92ef 0 1597389991784 6 connected
vars currentEpoch 7 lastVoteEpoch 7


Every node now advertises the public IP. Running the test case again, everything works as expected.
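
To check the advertised addresses programmatically rather than by eye, each node line can be split into its fields. The sketch below assumes the line layout seen in the output above for Redis 6.0.4 (node ID, then `ip:port@busport`, then comma-separated flags); the class `ClusterNodeLine` is hypothetical, and the trailing `vars ...` line is not a node line and should be skipped by the caller.

```java
import java.util.Arrays;

// Hypothetical parser for one node line of the cluster state file shown above.
final class ClusterNodeLine {
    final String id;
    final String host;
    final int port;
    final int busPort;
    final boolean master;

    private ClusterNodeLine(String id, String host, int port, int busPort, boolean master) {
        this.id = id;
        this.host = host;
        this.port = port;
        this.busPort = busPort;
        this.master = master;
    }

    static ClusterNodeLine parse(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length < 8) {
            throw new IllegalArgumentException("not a node line: " + line);
        }
        String address = fields[1];               // e.g. 122.51.151.130:6381@16381
        int colon = address.lastIndexOf(':');
        int at = address.indexOf('@', colon + 1);
        if (colon < 0 || at < 0) {
            throw new IllegalArgumentException("unexpected address field: " + address);
        }
        String host = address.substring(0, colon);
        int port = Integer.parseInt(address.substring(colon + 1, at));
        int busPort = Integer.parseInt(address.substring(at + 1));
        // Flags look like "myself,master" or "slave".
        boolean master = Arrays.asList(fields[2].split(",")).contains("master");
        return new ClusterNodeLine(fields[0], host, port, busPort, master);
    }

    public static void main(String[] args) {
        ClusterNodeLine node = parse("e77e540ef4115abe920fb191f354b81f42e7b4ed "
                + "122.51.151.130:6381@16381 myself,master - 0 1597389991000 1 connected 0-5460");
        System.out.println(node.host + ":" + node.port + " bus=" + node.busPort + " master=" + node.master);
    }
}
```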

5. Lettuce client connection problem during failover

(1) Test case

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.test.context.junit4.SpringRunner;

@RunWith(SpringRunner.class)
@SpringBootTest(classes = Application.class)
public class RedisClusterTest {
  @Autowired
  private StringRedisTemplate stringRedisTemplate;
  @Test
  public void automaticFailoverTest() throws InterruptedException {
    int count = 0;
    // Keep writing and reading until the test is stopped manually
    while (true) {
      try {
        stringRedisTemplate.opsForValue().set("count", String.valueOf(++count));
        System.out.println("Set count to: " + count);
        System.out.println("Read count: " + stringRedisTemplate.opsForValue().get("count"));
        Thread.sleep(2000);
      } catch (Exception e) {
        System.out.println("A master switch may have occurred, retrying...");
        Thread.sleep(3000);
      }
    }
  }
}

(2) Stop one of the master nodes to simulate a crash

The log looks like this:

2020-08-20 19:33:25.118  INFO 13696 --- [xecutorLoop-1-1] i.l.core.protocol.ConnectionWatchdog     : Reconnecting, last destination was /122.51.151.130:6384
2020-08-20 19:33:26.213  WARN 13696 --- [ioEventLoop-6-1] i.l.core.protocol.ConnectionWatchdog     : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
2020-08-20 19:33:31.015  INFO 13696 --- [xecutorLoop-1-2] i.l.core.protocol.ConnectionWatchdog     : Reconnecting, last destination was 122.51.151.130:6384
2020-08-20 19:33:32.107  WARN 13696 --- [ioEventLoop-6-2] i.l.core.protocol.ConnectionWatchdog     : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
2020-08-20 19:33:36.616  INFO 13696 --- [xecutorLoop-1-2] i.l.core.protocol.ConnectionWatchdog     : Reconnecting, last destination was 122.51.151.130:6384
2020-08-20 19:33:37.709  WARN 13696 --- [ioEventLoop-6-2] i.l.core.protocol.ConnectionWatchdog     : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
2020-08-20 19:33:42.016  INFO 13696 --- [xecutorLoop-1-4] i.l.core.protocol.ConnectionWatchdog     : Reconnecting, last destination was 122.51.151.130:6384
2020-08-20 19:33:43.110  WARN 13696 --- [ioEventLoop-6-4] i.l.core.protocol.ConnectionWatchdog     : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
2020-08-20 19:33:47.216  INFO 13696 --- [xecutorLoop-1-1] i.l.core.protocol.ConnectionWatchdog     : Reconnecting, last destination was 122.51.151.130:6384
2020-08-20 19:33:48.317  WARN 13696 --- [ioEventLoop-6-1] i.l.core.protocol.ConnectionWatchdog     : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
2020-08-20 19:33:56.515  INFO 13696 --- [xecutorLoop-1-2] i.l.core.protocol.ConnectionWatchdog     : Reconnecting, last destination was 122.51.151.130:6384
2020-08-20 19:33:57.605  WARN 13696 --- [ioEventLoop-6-2] i.l.core.protocol.ConnectionWatchdog     : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
2020-08-20 19:34:14.016  INFO 13696 --- [xecutorLoop-1-3] i.l.core.protocol.ConnectionWatchdog     : Reconnecting, last destination was 122.51.151.130:6384
2020-08-20 19:34:15.113  WARN 13696 --- [ioEventLoop-6-3] i.l.core.protocol.ConnectionWatchdog     : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
A master switch may have occurred, retrying...
2020-08-20 19:34:45.116  INFO 13696 --- [xecutorLoop-1-4] i.l.core.protocol.ConnectionWatchdog     : Reconnecting, last destination was 122.51.151.130:6384
2020-08-20 19:34:46.212  WARN 13696 --- [ioEventLoop-6-4] i.l.core.protocol.ConnectionWatchdog     : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
2020-08-20 19:35:16.216  INFO 13696 --- [xecutorLoop-1-1] i.l.core.protocol.ConnectionWatchdog     : Reconnecting, last destination was 122.51.151.130:6384
2020-08-20 19:35:17.310  WARN 13696 --- [ioEventLoop-6-1] i.l.core.protocol.ConnectionWatchdog     : Cannot reconnect to [122.51.151.130:6384]: Connection refused: no further information: /122.51.151.130:6384
A master switch may have occurred, retrying...


Even after a long wait, the client was still stuck trying to reconnect to the stopped node. Something seemed to be wrong with the Lettuce client.

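(3) Solutions

1) Switch the Redis client

After switching the client to Jedis and simulating the master crash again, the client connection recovered after a short while.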

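import java.util.Arrays;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisClusterConfiguration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.connection.jedis.JedisConnectionFactory;

@Configuration
public class RedisClusterConfig {
  @Bean
  public RedisConnectionFactory redisConnectionFactory() {
    // Same seed nodes as before; only the connection factory changes
    RedisClusterConfiguration redisClusterConfiguration = new RedisClusterConfiguration(Arrays.asList(
            "122.51.151.130:6381",
            "122.51.151.130:6382",
            "122.51.151.130:6383",
            "122.51.151.130:6384",
            "122.51.151.130:6385",
            "122.51.151.130:6386"));
    return new JedisConnectionFactory(redisClusterConfiguration);
  }
}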

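2) Configure Redis Cluster topology refresh in the Lettuce client

Does Lettuce really not support reconnecting after a master/replica switch? That cannot be the case. The Lettuce project documents its Redis Cluster support on GitHub at the following addresses:

https://github.com/lettuce-io/lettuce-core/wiki/Redis-Cluster

https://github.com/lettuce-io/lettuce-core/wiki/Client-options#cluster-specific-options


Following the documentation, modify the client configuration as follows: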

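import java.time.Duration;
import java.util.Arrays;

import io.lettuce.core.ClientOptions;
import io.lettuce.core.cluster.ClusterClientOptions;
import io.lettuce.core.cluster.ClusterTopologyRefreshOptions;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisClusterConfiguration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.connection.lettuce.LettuceClientConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;

@Configuration
public class RedisClusterConfig {
  @Bean
  public RedisConnectionFactory redisConnectionFactory() {
    // Enable both adaptive and periodic cluster topology refresh. Without them, when the master
    // of a slot range goes down the service stays unavailable until the failed node comes back.
    ClusterTopologyRefreshOptions clusterTopologyRefreshOptions = ClusterTopologyRefreshOptions.builder()
            // Adaptive refresh: without it, changes to the Redis cluster lead to connection errors
            .enableAllAdaptiveRefreshTriggers()
            // Timeout between adaptive refreshes (the default is 30 seconds)
            .adaptiveRefreshTriggersTimeout(Duration.ofSeconds(30))
            // Periodic refresh is off by default; once enabled, the default period is 60 seconds
            // (ClusterTopologyRefreshOptions.DEFAULT_REFRESH_PERIOD). enablePeriodicRefresh(d) is
            // equivalent to enablePeriodicRefresh().refreshPeriod(d).
            .enablePeriodicRefresh(Duration.ofSeconds(20))
            .build();
    ClientOptions clientOptions = ClusterClientOptions.builder()
            .topologyRefreshOptions(clusterTopologyRefreshOptions)
            .build();
    // Lettuce client configuration carrying the topology refresh options
    LettuceClientConfiguration clientConfig = LettuceClientConfiguration.builder()
            .clientOptions(clientOptions)
            .build();
    RedisClusterConfiguration redisClusterConfiguration = new RedisClusterConfiguration(Arrays.asList(
            "122.51.151.130:6381",
            "122.51.151.130:6382",
            "122.51.151.130:6383",
            "122.51.151.130:6384",
            "122.51.151.130:6385",
            "122.51.151.130:6386"));
    return new LettuceConnectionFactory(redisClusterConfiguration, clientConfig);
  }
}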
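
Why a refresh matters can be shown with a toy model. The sketch below is not Lettuce's implementation; it is a simplified illustration in which the class `TopologyCacheSketch`, its methods, and the choice of 6383 as the promoted replica are all assumptions made for the example. The client keeps a cached copy of which master serves each slot range and only sees a failover once something triggers a refresh.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Toy model of a client-side topology cache; not how Lettuce is actually implemented.
final class TopologyCacheSketch {
    // Stands in for asking a live cluster node for the current slot layout.
    private final Supplier<Map<String, String>> clusterView;
    private Map<String, String> slotRangeToMaster;

    TopologyCacheSketch(Supplier<Map<String, String>> clusterView) {
        this.clusterView = clusterView;
        refresh();
    }

    /** Takes a fresh snapshot of the cluster layout. */
    void refresh() {
        slotRangeToMaster = new HashMap<>(clusterView.get());
    }

    /** Returns the master the client currently believes serves the slot range. */
    String masterFor(String slotRange) {
        return slotRangeToMaster.get(slotRange);
    }

    /** Adaptive trigger: a failed connection is a hint that the layout changed. */
    void onConnectFailure(String node) {
        refresh();
    }

    public static void main(String[] args) {
        Map<String, String> live = new HashMap<>();
        live.put("10923-16383", "122.51.151.130:6384");
        TopologyCacheSketch cache = new TopologyCacheSketch(() -> live);

        // Failover: assume the replica on 6383 is promoted after 6384 is stopped.
        live.put("10923-16383", "122.51.151.130:6383");
        System.out.println("stale:     " + cache.masterFor("10923-16383"));

        cache.onConnectFailure("122.51.151.130:6384");
        System.out.println("refreshed: " + cache.masterFor("10923-16383"));
    }
}
```

Without the refresh call the cached entry keeps pointing at the stopped node, which mirrors the endless reconnect attempts in the log above. A periodic refresh would simply call `refresh()` on a timer instead of waiting for a failure.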