Redis4.x新特性 -- 萌萌的MEMORY DOCTOR

本文涉及的产品
云数据库 Tair(兼容Redis),内存型 2GB
Redis 开源版,标准版 2GB
推荐场景:
搭建游戏排行榜
简介: MEMORY DOCTOR命令是Redis4.x版本新增MEMORY 命令下的一个子命令,它可以通过诊断给出关于redis内存使用方面的建议,在不同的状态下会有不同的分析结果。

Redis4.x版本去年发布以后,新增了许多新的功能特性。大致翻看下来,一个叫MEMORY DOCTOR的命令吸引了我的注意。MEMORY DOCTOR命令是Redis4.x版本新增MEMORY 命令下的一个子命令,它可以通过诊断给出关于redis内存使用方面的建议,在不同的状态下会有不同的分析结果。此时我的脑海里第一个闪过念头:最强AI?redis是不是通过什么复杂的人工智能算法,对其使用的内存状况做了全方位的分析进而给出了最合理的优化建议?它是不是和AlphaGo一样通过学习会变得越来越强?伴随着疑问我尝试的从github上找了找相关的源码,最后知道真相的我眼泪掉来,不得不说MEMORY DOCTOR的实现真的挺萌的,萌的可爱。什么最强AI,我上我也行呀。

使用展示

doctor

源码分析

MEMORY DOCTOR功能源码地址 https://github.com/antirez/redis/blob/unstable/src/object.c
实现逻辑全在1080行开始的getMemoryDoctorReport方法中

/* This implements MEMORY DOCTOR. An human readable analysis of the Redis
 * memory condition. */
sds getMemoryDoctorReport(void) {
    int empty = 0;          /* Instance is empty or almost empty. */
    int big_peak = 0;       /* Memory peak is much larger than used mem. */
    int high_frag = 0;      /* High fragmentation. */
    int high_alloc_frag = 0;/* High allocator fragmentation. */
    int high_proc_rss = 0;  /* High process rss overhead. */
    int high_alloc_rss = 0; /* High rss overhead. */
    int big_slave_buf = 0;  /* Slave buffers are too big. */
    int big_client_buf = 0; /* Client buffers are too big. */
    int many_scripts = 0;   /* Script cache has too many scripts. */
    int num_reports = 0;
    struct redisMemOverhead *mh = getMemoryOverheadData();

    if (mh->total_allocated < (1024*1024*5)) {
        empty = 1;
        num_reports++;
    } else {
        /* Peak is > 150% of current used memory? */
        if (((float)mh->peak_allocated / mh->total_allocated) > 1.5) {
            big_peak = 1;
            num_reports++;
        }

        /* Fragmentation is higher than 1.4 and 10MB ?*/
        if (mh->total_frag > 1.4 && mh->total_frag_bytes > 10<<20) {
            high_frag = 1;
            num_reports++;
        }

        /* External fragmentation is higher than 1.1 and 10MB? */
        if (mh->allocator_frag > 1.1 && mh->allocator_frag_bytes > 10<<20) {
            high_alloc_frag = 1;
            num_reports++;
        }

        /* Allocator fss is higher than 1.1 and 10MB ? */
        if (mh->allocator_rss > 1.1 && mh->allocator_rss_bytes > 10<<20) {
            high_alloc_rss = 1;
            num_reports++;
        }

        /* Non-Allocator fss is higher than 1.1 and 10MB ? */
        if (mh->rss_extra > 1.1 && mh->rss_extra_bytes > 10<<20) {
            high_proc_rss = 1;
            num_reports++;
        }

        /* Clients using more than 200k each average? */
        long numslaves = listLength(server.slaves);
        long numclients = listLength(server.clients)-numslaves;
        if (mh->clients_normal / numclients > (1024*200)) {
            big_client_buf = 1;
            num_reports++;
        }

        /* Slaves using more than 10 MB each? */
        if (numslaves > 0 && mh->clients_slaves / numslaves > (1024*1024*10)) {
            big_slave_buf = 1;
            num_reports++;
        }

        /* Too many scripts are cached? */
        if (dictSize(server.lua_scripts) > 1000) {
            many_scripts = 1;
            num_reports++;
        }
    }

    sds s;
    if (num_reports == 0) {
        s = sdsnew(
        "Hi Sam, I can't find any memory issue in your instance. "
        "I can only account for what occurs on this base.\n");
    } else if (empty == 1) {
        s = sdsnew(
        "Hi Sam, this instance is empty or is using very little memory, "
        "my issues detector can't be used in these conditions. "
        "Please, leave for your mission on Earth and fill it with some data. "
        "The new Sam and I will be back to our programming as soon as I "
        "finished rebooting.\n");
    } else {
        s = sdsnew("Sam, I detected a few issues in this Redis instance memory implants:\n\n");
        if (big_peak) {
            s = sdscat(s," * Peak memory: In the past this instance used more than 150% the memory that is currently using. The allocator is normally not able to release memory after a peak, so you can expect to see a big fragmentation ratio, however this is actually harmless and is only due to the memory peak, and if the Redis instance Resident Set Size (RSS) is currently bigger than expected, the memory will be used as soon as you fill the Redis instance with more data. If the memory peak was only occasional and you want to try to reclaim memory, please try the MEMORY PURGE command, otherwise the only other option is to shutdown and restart the instance.\n\n");
        }
        if (high_frag) {
            s = sdscatprintf(s," * High total RSS: This instance has a memory fragmentation and RSS overhead greater than 1.4 (this means that the Resident Set Size of the Redis process is much larger than the sum of the logical allocations Redis performed). This problem is usually due either to a large peak memory (check if there is a peak memory entry above in the report) or may result from a workload that causes the allocator to fragment memory a lot. If the problem is a large peak memory, then there is no issue. Otherwise, make sure you are using the Jemalloc allocator and not the default libc malloc. Note: The currently used allocator is \"%s\".\n\n", ZMALLOC_LIB);
        }
        if (high_alloc_frag) {
            s = sdscatprintf(s," * High allocator fragmentation: This instance has an allocator external fragmentation greater than 1.1. This problem is usually due either to a large peak memory (check if there is a peak memory entry above in the report) or may result from a workload that causes the allocator to fragment memory a lot. You can try enabling 'activedefrag' config option.\n\n");
        }
        if (high_alloc_rss) {
            s = sdscatprintf(s," * High allocator RSS overhead: This instance has an RSS memory overhead is greater than 1.1 (this means that the Resident Set Size of the allocator is much larger than the sum what the allocator actually holds). This problem is usually due to a large peak memory (check if there is a peak memory entry above in the report), you can try the MEMORY PURGE command to reclaim it.\n\n");
        }
        if (high_proc_rss) {
            s = sdscatprintf(s," * High process RSS overhead: This instance has non-allocator RSS memory overhead is greater than 1.1 (this means that the Resident Set Size of the Redis process is much larger than the RSS the allocator holds). This problem may be due to Lua scripts or Modules.\n\n");
        }
        if (big_slave_buf) {
            s = sdscat(s," * Big replica buffers: The replica output buffers in this instance are greater than 10MB for each replica (on average). This likely means that there is some replica instance that is struggling receiving data, either because it is too slow or because of networking issues. As a result, data piles on the master output buffers. Please try to identify what replica is not receiving data correctly and why. You can use the INFO output in order to check the replicas delays and the CLIENT LIST command to check the output buffers of each replica.\n\n");
        }
        if (big_client_buf) {
            s = sdscat(s," * Big client buffers: The clients output buffers in this instance are greater than 200K per client (on average). This may result from different causes, like Pub/Sub clients subscribed to channels bot not receiving data fast enough, so that data piles on the Redis instance output buffer, or clients sending commands with large replies or very large sequences of commands in the same pipeline. Please use the CLIENT LIST command in order to investigate the issue if it causes problems in your instance, or to understand better why certain clients are using a big amount of memory.\n\n");
        }
        if (many_scripts) {
            s = sdscat(s," * Many scripts: There seem to be many cached scripts in this instance (more than 1000). This may be because scripts are generated and `EVAL`ed, instead of being parameterized (with KEYS and ARGV), `SCRIPT LOAD`ed and `EVALSHA`ed. Unless `SCRIPT FLUSH` is called periodically, the scripts' caches may end up consuming most of your memory.\n\n");
        }
        s = sdscat(s,"I'm here to keep you safe, Sam. I want to help you.\n");
    }
    freeMemoryOverheadData(mh);
    return s;
}

通过阅读源码,我们不难发现redis会从Instance is empty or not,Memory used peak,High fragmentation,High allocator fragmentation,High process rss overhead,High rss overhead,Slave buffers are too big,Client buffers are too big,Script cache has too many 这八个维度来分析一个redis节点的内存使用状况,每个维度都有相应的触发条件,然后再分别给每个维度提出对应解决建议,具体见下表:

维度 触发条件 建议
Instance is empty or almost empty 节点内存空间使用小于5M
Memory peak is much larger 节点在过去使用内存的峰值超过150%的当前内存 导致内存碎片率较大,但这它是无害的,可以手动执行MEMORY PURGE命令或者重启节点
High fragmentation memory_fragmentation 和 RSS memory之和大于1.4 通常是由于工作负载导致分配器大量分段内存造成的,请确保使用的内存分配器是Jemalloc
High allocator fragmentation 节点的allocator external fragmentation大于1.1 建议尝试启用"activedefrag"配置选项开启自动内存碎片整理
High rss overhead 节点的non-allocator RSS memory开销大于1.1 意味着Redis进程的驻留集大小远大于分配器保存的RSS,通常可能是由于Lua脚本或模块化功能引起的
High process rss overhead 节点的RSS memory大于1.1 此通常是由于峰值内存较大,可以尝试使用MEMORY PURGE命令来回收它
Slave buffers are too big 节点的output buffers对于每个replica(平均)大于10MB 一些replica正在努力接收数据,要么是因为它太慢,要么是因为网络问题.可以使用INFO输出来检查副本延迟.并使用CLIENT LIST命令检查每个副本的输出缓冲区
Client buffers are too big 节点的 clients output buffers大于每个客户端200K(平均) 可能是由于不同的原因造成的,使用CLIENT LIST命令调查问题
Script cache has too many 节点中缓存脚超过1000个 定期调用SCRIPT FLUSH命令

到此为止,我们应该彻底弄清楚了MEMORY DOCTOR的实现机制了,它到底在什么情况下会给相应的诊断意见。虽然他的实现有那么些许的萌萌哒,但又显得那么的合情合理。后续我们的RedisManager会基于MEMORY DOCTOR命令增加4.0以上集群的健康诊断功能。

些许想法

1.MEMORY DOCTOR这几个对于内存状况维度分析的指标和执行info命令得到的结果如此的吻合,对于3.0甚至2.8版本那些不支持MEMORY DOCTOR命令的redis版本,我们是不是自己也可以实现一套内存诊断机制,给出对应的节点解决方案?
2.诊断意见中的那个Sam到底是谁。。

相关实践学习
基于Redis实现在线游戏积分排行榜
本场景将介绍如何基于Redis数据库实现在线游戏中的游戏玩家积分排行榜功能。
云数据库 Redis 版使用教程
云数据库Redis版是兼容Redis协议标准的、提供持久化的内存数据库服务,基于高可靠双机热备架构及可无缝扩展的集群架构,满足高读写性能场景及容量需弹性变配的业务需求。 产品详情:https://www.aliyun.com/product/kvstore &nbsp; &nbsp; ------------------------------------------------------------------------- 阿里云数据库体验:数据库上云实战 开发者云会免费提供一台带自建MySQL的源数据库&nbsp;ECS 实例和一台目标数据库&nbsp;RDS实例。跟着指引,您可以一步步实现将ECS自建数据库迁移到目标数据库RDS。 点击下方链接,领取免费ECS&amp;RDS资源,30分钟完成数据库上云实战!https://developer.aliyun.com/adc/scenario/51eefbd1894e42f6bb9acacadd3f9121?spm=a2c6h.13788135.J_3257954370.9.4ba85f24utseFl
目录
相关文章
|
8月前
|
存储 缓存 NoSQL
聊聊 Redis 的高级特性之一: 发布订阅
Redis 发布订阅 (pub/sub) 是一种消息通信模式:发送者 (pub) 发送消息,订阅者 (sub) 接收消息。 图中,消费者1和消费者2 订阅了 Redis 服务的频道 channel ,当生产者通过 PUBLISH 命令发送给频道 channel 时, 这个消息就会被发送给订阅它的两个客户端。
聊聊 Redis 的高级特性之一: 发布订阅
|
2月前
|
存储 缓存 监控
利用 Redis 缓存特性避免缓存穿透的策略与方法
【10月更文挑战第23天】通过以上对利用 Redis 缓存特性避免缓存穿透的详细阐述,我们对这一策略有了更深入的理解。在实际应用中,我们需要根据具体情况灵活运用这些方法,并结合其他技术手段,共同保障系统的稳定和高效运行。同时,要不断关注 Redis 缓存特性的发展和变化,及时调整策略,以应对不断出现的新挑战。
81 10
|
3月前
|
存储 消息中间件 NoSQL
【redis】redis的特性和主要应用场景
【redis】redis的特性和主要应用场景
235 1
|
3月前
|
NoSQL 关系型数据库 MySQL
Redis 事务特性、原理、具体命令操作全方位诠释 —— 零基础可学习
本文全面阐述了Redis事务的特性、原理、具体命令操作,指出Redis事务具有原子性但不保证一致性、持久性和隔离性,并解释了Redis事务的适用场景和WATCH命令的乐观锁机制。
514 0
Redis 事务特性、原理、具体命令操作全方位诠释 —— 零基础可学习
|
6月前
|
消息中间件 缓存 NoSQL
Redis快速度特性及为什么支持多线程及应用场景
Redis快速度特性及为什么支持多线程及应用场景
137 11
|
5月前
|
NoSQL Java 调度
Lettuce的特性和内部实现问题之Redis的管道模式提升性能的问题如何解决
Lettuce的特性和内部实现问题之Redis的管道模式提升性能的问题如何解决
|
5月前
|
NoSQL 网络协议 安全
Lettuce的特性和内部实现问题之Lettuce天然地使用管道模式与Redis交互的问题如何解决
Lettuce的特性和内部实现问题之Lettuce天然地使用管道模式与Redis交互的问题如何解决
|
7月前
|
监控 NoSQL 网络安全
Redis 6和7:探索新版本中的新特性
Redis 6和7:探索新版本中的新特性
|
8月前
|
存储 NoSQL 关系型数据库
redis-学习笔记(概念, 相关名词, 特性, 优势: 快)
redis-学习笔记(概念, 相关名词, 特性, 优势: 快)
47 0
|
8月前
|
缓存 NoSQL 中间件
redis内存溢出报错--OOM command not allowed when used memory > 'maxmemory'
该内容是关于Redis缓存服务器的使用指南。通过Xshell连接IP地址为25.218.153.193或206的主机,进入/data/iuap/middleware/redis-30001/bin目录,使用`redis-cli`连接到IP为206的30003端口。登录时需`auth yonyou*123`,可运行`info`和`info memory`查看状态,`flushall`清理缓存。在清理前,要备份/data/iuap/middleware/redis-30003/data/下的.aof和.rdb文件,利用tar命令打包并移至/tmp目录。