
Greenplum bug

The error message:

 Failed to kill processes for segment /home/gpuser/gpdb/data/gpseg-1: ([Errno 3] No such process)

This error appears when I initialize Greenplum or run gpstop -a, but everything still works correctly.
I can also connect with psql postgres, create tables, insert rows, and so on.
Please help me!

I am describing this simple question in English for practice. Thank you!

jason张 2016-01-25 10:39:46 8238 0
4 answers
  • This is normal, not a bug.

    2019-07-17 18:26:22
  • Some things really are infuriating.

    jczhang@jczhang:~/gpdb$ gpstart -a
    20160126:11:23:07:002880 gpstart:jczhang:jczhang-[INFO]:-Starting gpstart with args: -a
    20160126:11:23:07:002880 gpstart:jczhang:jczhang-[INFO]:-Gathering information and validating the environment...
    20160126:11:23:07:002880 gpstart:jczhang:jczhang-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 4.3.99.00 build dev'
    20160126:11:23:07:002880 gpstart:jczhang:jczhang-[INFO]:-Greenplum Catalog Version: '301601051'
    20160126:11:23:07:002880 gpstart:jczhang:jczhang-[INFO]:-Starting Master instance in admin mode
    20160126:11:23:09:002880 gpstart:jczhang:jczhang-[INFO]:-Obtaining Greenplum Master catalog information
    20160126:11:23:09:002880 gpstart:jczhang:jczhang-[INFO]:-Obtaining Segment details from master...
    20160126:11:23:10:002880 gpstart:jczhang:jczhang-[INFO]:-Setting new master era
    20160126:11:23:10:002880 gpstart:jczhang:jczhang-[INFO]:-Master Started...
    20160126:11:23:10:002880 gpstart:jczhang:jczhang-[INFO]:-Shutting down master
    20160126:11:23:12:002880 gpstart:jczhang:jczhang-[INFO]:-Commencing parallel segment instance startup, please wait...
    ........
    20160126:11:23:20:002880 gpstart:jczhang:jczhang-[INFO]:-Process results...
    20160126:11:23:20:002880 gpstart:jczhang:jczhang-[INFO]:-----------------------------------------------------
    20160126:11:23:20:002880 gpstart:jczhang:jczhang-[INFO]:- Successful segment starts = 2
    20160126:11:23:20:002880 gpstart:jczhang:jczhang-[INFO]:- Failed segment starts = 0
    20160126:11:23:20:002880 gpstart:jczhang:jczhang-[INFO]:- Skipped segment starts (segments are marked down in configuration) = 0
    20160126:11:23:20:002880 gpstart:jczhang:jczhang-[INFO]:-----------------------------------------------------
    20160126:11:23:20:002880 gpstart:jczhang:jczhang-[INFO]:-
    20160126:11:23:20:002880 gpstart:jczhang:jczhang-[INFO]:-Successfully started 2 of 2 segment instances
    20160126:11:23:20:002880 gpstart:jczhang:jczhang-[INFO]:-----------------------------------------------------
    20160126:11:23:20:002880 gpstart:jczhang:jczhang-[INFO]:-Starting Master instance jczhang directory /home/jczhang/gpdb/data/gpseg-1
    20160126:11:23:21:002880 gpstart:jczhang:jczhang-[INFO]:-Command pg_ctl reports Master jczhang instance active
    20160126:11:23:21:002880 gpstart:jczhang:jczhang-[INFO]:-No standby master configured. skipping...
    20160126:11:23:21:002880 gpstart:jczhang:jczhang-[INFO]:-Database successfully started
    Above is my startup output.
    Below are the processes shown by ps x:
    2986 ? Ss 0:00 /home/jczhang/gpdb/build/gpdb.master/bin/postgres -D /home/jczhang/gpdb/data/gpseg0 -p 40000 -b 2 -z 2 --silent-mode=true -i -M mirrorless -C 0
    2987 ? Ss 0:00 /home/jczhang/gpdb/build/gpdb.master/bin/postgres -D /home/jczhang/gpdb/data/gpseg1 -p 40001 -b 3 -z 2 --silent-mode=true -i -M mirrorless -C 1
    2988 ? Ss 0:00 postgres: port 40001, logger process
    2989 ? Ss 0:00 postgres: port 40000, logger process
    2996 ? Ss 0:00 postgres: port 40001, stats collector process
    2997 ? Ss 0:00 postgres: port 40001, writer process
    2998 ? Ss 0:00 postgres: port 40001, checkpoint process
    2999 ? S 0:00 postgres: port 40001, sweeper process
    3000 ? Ss 0:00 postgres: port 40000, stats collector process
    3001 ? Ss 0:00 postgres: port 40000, writer process
    3002 ? Ss 0:00 postgres: port 40000, checkpoint process
    3003 ? S 0:00 postgres: port 40000, sweeper process
    3015 ? Ss 0:00 /home/jczhang/gpdb/build/gpdb.master/bin/postgres -D /home/jczhang/gpdb/data/gpseg-1 -p 5432 -b 1 -z 2 --silent-mode=true -i -M master -C -1 -x 0 -E
    3016 ? Ss 0:00 postgres: port 5432, master logger process
    3019 ? Ss 0:00 postgres: port 5432, stats collector process
    3020 ? Ss 0:00 postgres: port 5432, writer process
    3021 ? Ss 0:00 postgres: port 5432, checkpoint process
    3022 ? S 0:00 postgres: port 5432, seqserver process
    3023 ? S 0:00 postgres: port 5432, ftsprobe process
    3024 ? S 0:00 postgres: port 5432, sweeper process
    As you can see, the master process (PID 3015) is running.
    Next, I shut down the cluster:
    20160126:11:24:17:003062 gpstop:jczhang:jczhang-[INFO]:-Starting gpstop with args: -a
    20160126:11:24:17:003062 gpstop:jczhang:jczhang-[INFO]:-Gathering information and validating the environment...
    20160126:11:24:17:003062 gpstop:jczhang:jczhang-[INFO]:-Obtaining Greenplum Master catalog information
    20160126:11:24:17:003062 gpstop:jczhang:jczhang-[INFO]:-Obtaining Segment details from master...
    20160126:11:24:17:003062 gpstop:jczhang:jczhang-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 4.3.99.00 build dev'
    20160126:11:24:17:003062 gpstop:jczhang:jczhang-[INFO]:-There are 0 connections to the database
    20160126:11:24:17:003062 gpstop:jczhang:jczhang-[INFO]:-Commencing Master instance shutdown with mode='smart'
    20160126:11:24:17:003062 gpstop:jczhang:jczhang-[INFO]:-Master host=jczhang
    20160126:11:24:17:003062 gpstop:jczhang:jczhang-[INFO]:-Commencing Master instance shutdown with mode=smart
    20160126:11:24:17:003062 gpstop:jczhang:jczhang-[INFO]:-Master segment instance directory=/home/jczhang/gpdb/data/gpseg-1
    20160126:11:24:18:003062 gpstop:jczhang:jczhang-[INFO]:-Attempting forceful termination of any leftover master process
    20160126:11:24:18:003062 gpstop:jczhang:jczhang-[INFO]:-Terminating processes for segment /home/jczhang/gpdb/data/gpseg-1
    20160126:11:24:18:003062 gpstop:jczhang:jczhang-[ERROR]:-Failed to kill processes for segment /home/jczhang/gpdb/data/gpseg-1: ([Errno 3] No such process)
    20160126:11:24:18:003062 gpstop:jczhang:jczhang-[INFO]:-No standby master host configured
    20160126:11:24:18:003062 gpstop:jczhang:jczhang-[INFO]:-Commencing parallel segment instance shutdown, please wait...
    20160126:11:24:18:003062 gpstop:jczhang:jczhang-[INFO]:-0.00% of jobs completed
    20160126:11:24:28:003062 gpstop:jczhang:jczhang-[INFO]:-100.00% of jobs completed
    20160126:11:24:28:003062 gpstop:jczhang:jczhang-[INFO]:-----------------------------------------------------
    20160126:11:24:28:003062 gpstop:jczhang:jczhang-[INFO]:- Segments stopped successfully = 2
    20160126:11:24:28:003062 gpstop:jczhang:jczhang-[INFO]:- Segments with errors during stop = 0
    20160126:11:24:28:003062 gpstop:jczhang:jczhang-[INFO]:-----------------------------------------------------
    20160126:11:24:28:003062 gpstop:jczhang:jczhang-[INFO]:-Successfully shutdown 2 of 2 segment instances
    20160126:11:24:28:003062 gpstop:jczhang:jczhang-[INFO]:-Database successfully shutdown with no errors reported
    20160126:11:24:28:003062 gpstop:jczhang:jczhang-[INFO]:-Cleaning up leftover gpmmon process
    20160126:11:24:28:003062 gpstop:jczhang:jczhang-[INFO]:-No leftover gpmmon process found
    20160126:11:24:28:003062 gpstop:jczhang:jczhang-[INFO]:-Cleaning up leftover gpsmon processes
    20160126:11:24:28:003062 gpstop:jczhang:jczhang-[INFO]:-No leftover gpsmon processes on some hosts. not attempting forceful termination on these hosts
    20160126:11:24:28:003062 gpstop:jczhang:jczhang-[INFO]:-Cleaning up leftover shared memory
    Note that an ERROR is reported!



    2019-07-17 18:26:22
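  • Charity is a lifelong commitment. I am digoal, just do it. Alibaba Cloud database team; works on PolarDB, PostgreSQL, DuckDB, ADB, and others; long committed to promoting open-source database technology and its ecosystem in China and to training open-source talent. Recipient of Alibaba's Qilin Evangelist title and named a 2018 OSCAR Open Source Pioneer.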

    Could the process have already been gone before gpstop tried to kill it?
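
One way to check this, as a minimal sketch: on POSIX systems, sending signal 0 with `os.kill` runs the existence and permission checks without delivering any signal. The helper name `pid_exists` is an assumption made for illustration; it is not taken from gpstop.

```python
import os
import subprocess
import sys


def pid_exists(pid: int) -> bool:
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0: error checking only, nothing is delivered
    except ProcessLookupError:
        return False  # errno ESRCH, reported as "No such process"
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


# Usage: a child that has exited and been reaped should no longer be found.
child = subprocess.Popen([sys.executable, "-c", "pass"])
child.wait()
print(pid_exists(os.getpid()))
print(pid_exists(child.pid))
```

The first call checks the current interpreter, which is alive; the second checks a PID whose process has already exited, which is the situation suspected here.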


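    I looked at the gpstop script. This error comes from trying to kill a process that no longer exists, which means the database had already shut down normally.
    gpstop first tries to stop the database, and then issues the kill regardless of whether the stop succeeded.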

        ######
        def _stop_master(self,masterOnly=False):
            ''' shutsdown the master '''
            
            self.conn = dbconn.connect(self.dburl, utility=True)        
            self._stop_master_checks()
                
            self.conn.close()
        
            e = GpEraFile(self.master_datadir, logger=get_logger_if_verbose())
            e.end_era()
    
            logger.info("Commencing Master instance shutdown with mode=%s" % self.mode)
            logger.info("Master segment instance directory=%s" % self.master_datadir)
            
            cmd=gp.MasterStop("stopping master", self.master_datadir, mode=self.mode, timeout=self.timeout)
            try:
                cmd.run(validateAfter=True)
            except:
                # Didn't stop in timeout or pg_ctl failed.  So try kill
                (succeeded,mypid,file_datadir)=pg.ReadPostmasterTempFile.local("Read master tmp file", self.dburl.pgport).getResults()
                if succeeded and file_datadir == self.master_datadir:
                    if unix.check_pid(mypid):
                        logger.info("Failed to shutdown master with pg_ctl.")
                        logger.info("Sending SIGQUIT signal...")
                        os.kill(mypid,signal.SIGQUIT)
                        time.sleep(5)
                        
                        # Still not gone... try SIGABRT
                        if unix.check_pid(mypid):
                            logger.info("Sending SIGABRT signal...")
                            os.kill(mypid,signal.SIGABRT)                      
                            time.sleep(5)
                        
                        if not unix.check_pid(mypid):
                            # Clean up files
                            lockfile="/tmp/.s.PGSQL.%s" % self.dburl.pgport    
                            if os.path.exists(lockfile):
                                logger.info("Clearing segment instance lock files")        
                                os.remove(lockfile)
            logger.info('Attempting forceful termination of any leftover master process')
            (succeeded,mypid,file_datadir)=pg.ReadPostmasterTempFile.local("Read master tmp file", self.dburl.pgport).getResults()
            unix.kill_9_segment_processes(self.master_datadir, self.dburl.pgport, mypid)
                
            logger.debug("Successfully shutdown the Master instance in admin mode")

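    The error is caused by gpstop's own logic: gpstop first shuts down the master normally and then tries to kill the master process.
    Even when the normal shutdown succeeds, gpstop still attempts the kill, so this ERROR is always reported whenever the master shuts down cleanly.
    You can see this in the gpstop code.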
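
The "stop first, then kill anyway" pattern can be reproduced outside Greenplum. The sketch below is not gpstop's implementation; `kill_if_running` is a hypothetical helper that shows how a second kill on an already-stopped process fails with ESRCH, and how a caller could treat that case as a non-error instead of logging it.

```python
import errno
import os
import signal
import subprocess
import sys


def kill_if_running(pid: int, sig: int = signal.SIGTERM) -> bool:
    """Send sig to pid; return False instead of raising if it is already gone."""
    try:
        os.kill(pid, sig)
    except OSError as exc:
        if exc.errno == errno.ESRCH:  # "No such process"
            return False
        raise  # any other failure (e.g. EPERM) is still a real error
    return True


# Start a long-running child to stand in for the master process.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# First stop attempt: the process is alive, so the signal is delivered.
first = kill_if_running(child.pid)
child.wait()  # reap the child so its PID no longer refers to a process

# Second attempt mirrors the unconditional kill: the process is already gone.
second = kill_if_running(child.pid)
print(first, second)
```

Here the second call returns False rather than raising, which corresponds to the case where the master has already shut down cleanly.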

    2019-07-17 18:26:22
  • https://github.com/ideal

    It seems that the process with the PID you are trying to kill no longer exists. This is probably not a bug.

    2019-07-17 18:26:22