PostgreSQL initdb Source Code Analysis (Part 17)

Introduction:

Continuing the analysis, this installment looks at:

    setup_collation()

Expanded, the function is:

/*
 * populate pg_collation
 */
static void
setup_collation(void)
{
#if defined(HAVE_LOCALE_T) && !defined(WIN32)
    int            i;
    FILE       *locale_a_handle;
    char        localebuf[NAMEDATALEN];
    int            count = 0;

    PG_CMD_DECL;
#endif

    fputs(_("creating collations ... "), stdout);
    fflush(stdout);

#if defined(HAVE_LOCALE_T) && !defined(WIN32)
    snprintf(cmd, sizeof(cmd),
             "\"%s\" %s template1 >%s",
             backend_exec, backend_options,
             DEVNULL);

    locale_a_handle = popen_check("locale -a", "r");
    if (!locale_a_handle)
        return;                    /* complaint already printed */

    PG_CMD_OPEN;

    PG_CMD_PUTS("CREATE TEMP TABLE tmp_pg_collation ( "
                "    collname name, "
                "    locale name, "
                "    encoding int) WITHOUT OIDS;\n");

    while (fgets(localebuf, sizeof(localebuf), locale_a_handle))
    {
        size_t        len;
        int            enc;
        bool        skip;
        char       *quoted_locale;
        char        alias[NAMEDATALEN];

        len = strlen(localebuf);

        if (len == 0 || localebuf[len - 1] != '\n')
        {
            if (debug)
                fprintf(stderr, _("%s: locale name too long, skipped: %s\n"),
                        progname, localebuf);
            continue;
        }
        localebuf[len - 1] = '\0';

        /*
         * Some systems have locale names that don't consist entirely of ASCII
         * letters (such as "bokmål" or "français").  This is
         * pretty silly, since we need the locale itself to interpret the
         * non-ASCII characters. We can't do much with those, so we filter
         * them out.
         */
        skip = false;
        for (i = 0; i < len; i++)
        {
            if (IS_HIGHBIT_SET(localebuf[i]))
            {
                skip = true;
                break;
            }
        }
        if (skip)
        {
            if (debug)
                fprintf(stderr, _("%s: locale name has non-ASCII characters, skipped: %s\n"),
                        progname, localebuf);
            continue;
        }

        enc = pg_get_encoding_from_locale(localebuf, debug);
        if (enc < 0)
        {
            /* error message printed by pg_get_encoding_from_locale() */
            continue;
        }
        if (!PG_VALID_BE_ENCODING(enc))
            continue;            /* ignore locales for client-only encodings */
        if (enc == PG_SQL_ASCII)
            continue;            /* C/POSIX are already in the catalog */

        count++;

        quoted_locale = escape_quotes(localebuf);

        PG_CMD_PRINTF3("INSERT INTO tmp_pg_collation VALUES (E'%s', E'%s', %d);\n",
                       quoted_locale, quoted_locale, enc);

        /*
         * Generate aliases such as "en_US" in addition to "en_US.utf8" for
         * ease of use.  Note that collation names are unique per encoding
         * only, so this doesn't clash with "en_US" for LATIN1, say.
         */
        if (normalize_locale_name(alias, localebuf))
            PG_CMD_PRINTF3("INSERT INTO tmp_pg_collation VALUES (E'%s', E'%s', %d);\n",
                           escape_quotes(alias), quoted_locale, enc);
    }

    /* Add an SQL-standard name */
    PG_CMD_PRINTF1("INSERT INTO tmp_pg_collation VALUES ('ucs_basic', 'C', %d);\n", PG_UTF8);

    /*
     * When copying collations to the final location, eliminate aliases that
     * conflict with an existing locale name for the same encoding.  For
     * example, "br_FR.iso88591" is normalized to "br_FR", both for encoding
     * LATIN1.    But the unnormalized locale "br_FR" already exists for LATIN1.
     * Prefer the alias that matches the OS locale name, else the first locale
     * name by sort order (arbitrary choice to be deterministic).
     *
     * Also, eliminate any aliases that conflict with pg_collation's
     * hard-wired entries for "C" etc.
     */
    PG_CMD_PUTS("INSERT INTO pg_collation (collname, collnamespace, collowner, collencoding, collcollate, collctype) "
                " SELECT DISTINCT ON (collname, encoding)"
                "   collname, "
                "   (SELECT oid FROM pg_namespace WHERE nspname = 'pg_catalog') AS collnamespace, "
                "   (SELECT relowner FROM pg_class WHERE relname = 'pg_collation') AS collowner, "
                "   encoding, locale, locale "
                "  FROM tmp_pg_collation"
                "  WHERE NOT EXISTS (SELECT 1 FROM pg_collation WHERE collname = tmp_pg_collation.collname)"
       "  ORDER BY collname, encoding, (collname = locale) DESC, locale;\n");

    pclose(locale_a_handle);
    PG_CMD_CLOSE;

    check_ok();
    if (count == 0 && !debug)
    {
        printf(_("No usable system locales were found.\n"));
        printf(_("Use the option \"--debug\" to see details.\n"));
    }
#else                            /* not HAVE_LOCALE_T && not WIN32 */
    printf(_("not supported on this platform\n"));
    fflush(stdout);
#endif   /* not HAVE_LOCALE_T  && not WIN32 */
}

In essence, setup_collation() populates the pg_collation catalog: it reads the output of "locale -a", skips locales it cannot use (names that are too long, contain non-ASCII characters, or map to client-only or SQL_ASCII encodings), stages the remaining locales plus normalized aliases such as "en_US" for "en_US.utf8" in a temporary table, and finally copies the de-duplicated rows into pg_collation.
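
To see how the final INSERT ... SELECT DISTINCT ON statement keeps exactly one row per (collname, encoding), here is a minimal sketch that can be run directly in psql. The sample rows are made up for illustration (the br_FR case is the one mentioned in the source comment); the temporary table and the ORDER BY logic mirror the statement in the code above.

-- Stage a raw locale, the alias produced by normalize_locale_name(),
-- and a raw locale that happens to collide with that alias.
CREATE TEMP TABLE tmp_pg_collation (collname name, locale name, encoding int);

INSERT INTO tmp_pg_collation VALUES
    ('br_FR.iso88591', 'br_FR.iso88591', 8),  -- raw OS locale, LATIN1 (encoding id 8)
    ('br_FR',          'br_FR.iso88591', 8),  -- alias generated from it
    ('br_FR',          'br_FR',          8);  -- raw OS locale that collides with the alias

-- Keep one row per (collname, encoding); (collname = locale) DESC prefers the
-- row whose collation name matches the OS locale name, so ('br_FR', 'br_FR')
-- wins over the alias row, while 'br_FR.iso88591' is kept unchanged.
SELECT DISTINCT ON (collname, encoding)
       collname, locale, encoding
  FROM tmp_pg_collation
 ORDER BY collname, encoding, (collname = locale) DESC, locale;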

As a side note, the contents of pg_collation look roughly like this:

pgsql=# \x
Expanded display is on.
pgsql=# select * from pg_collation limit 10;
-[ RECORD 1 ]-+-----------------
collname      | default
collnamespace | 11
collowner     | 10
collencoding  | -1
collcollate   | 
collctype     | 
-[ RECORD 2 ]-+-----------------
collname      | C
collnamespace | 11
collowner     | 10
collencoding  | -1
collcollate   | C
collctype     | C
-[ RECORD 3 ]-+-----------------
collname      | POSIX
collnamespace | 11
collowner     | 10
collencoding  | -1
collcollate   | POSIX
collctype     | POSIX
-[ RECORD 4 ]-+-----------------
collname      | aa_DJ
collnamespace | 11
collowner     | 10
collencoding  | 6
collcollate   | aa_DJ.utf8
collctype     | aa_DJ.utf8
-[ RECORD 5 ]-+-----------------
collname      | aa_DJ
collnamespace | 11
collowner     | 10
collencoding  | 8
collcollate   | aa_DJ
collctype     | aa_DJ
-[ RECORD 6 ]-+-----------------
collname      | aa_DJ.iso88591
collnamespace | 11
collowner     | 10
collencoding  | 8
collcollate   | aa_DJ.iso88591
collctype     | aa_DJ.iso88591
-[ RECORD 7 ]-+-----------------
collname      | aa_DJ.utf8
collnamespace | 11
collowner     | 10
collencoding  | 6
collcollate   | aa_DJ.utf8
collctype     | aa_DJ.utf8
-[ RECORD 8 ]-+-----------------
collname      | aa_ER
collnamespace | 11
collowner     | 10
collencoding  | 6
collcollate   | aa_ER
collctype     | aa_ER
-[ RECORD 9 ]-+-----------------
collname      | aa_ER.utf8
collnamespace | 11
collowner     | 10
collencoding  | 6
collcollate   | aa_ER.utf8
collctype     | aa_ER.utf8
-[ RECORD 10 ]+-----------------
collname      | aa_ER.utf8@saaho
collnamespace | 11
collowner     | 10
collencoding  | 6
collcollate   | aa_ER.utf8@saaho
collctype     | aa_ER.utf8@saaho

pgsql=#  
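
In this catalog, collencoding stores PostgreSQL's internal encoding IDs (6 is UTF8 and 8 is LATIN1, corresponding to PG_UTF8 and PG_LATIN1 in the source), while -1 marks collations such as "default", "C" and "POSIX" that are valid for any encoding. As a small convenience sketch, the built-in function pg_encoding_to_char() can translate the numeric IDs back into encoding names:

-- Show each collation together with the name of its encoding,
-- skipping the encoding-independent entries (collencoding = -1).
SELECT collname, collcollate, pg_encoding_to_char(collencoding) AS encoding
  FROM pg_collation
 WHERE collencoding >= 0
 ORDER BY collname
 LIMIT 10;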