Postgres-XC Usage Guide and Maintenance

References:

http://postgres-xc.sourceforge.net/docs/1_2_1/

https://www.postgres-xl.org/documentation/index.html

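Creating Tables in Detail

Depending on how the data is distributed, Postgres-XC (PXC) can create the following two types of table: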

Replicated Tables:

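The table holds exactly the same data on every underlying datanode. An insert writes the same rows to each of those datanodes, so a read only needs to fetch the data from any one node.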

Distributed Tables:

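The table's data is split across the underlying datanodes according to a partitioning rule, which is what distributed databases call sharding. Each datanode stores only part of the table's data.

PXC can place different tables on different subsets of the datanodes; a table does not have to be spread across every datanode.

When creating a table, specify which nodes its data is distributed to:

CREATE TABLE table_name ( ... )

[ DISTRIBUTE BY { REPLICATION | ROUNDROBIN | { [ HASH | MODULO ] ( column_name ) } } ]

[ TO { GROUP groupname | NODE ( nodename [, ... ] ) } ]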

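REPLICATION: every datanode holds the same data for the table, i.e. the data is replicated.

ROUNDROBIN: rows are placed on the backend datanodes in turn, in insertion order. HASH: rows are distributed according to the result of a hash computed on the distribution column. MODULO: rows are distributed according to the modulo of the distribution column's value.

Notes

With ROUNDROBIN, rows go to the datanodes in turn in insertion order, so any given row can end up on any node. Such a table cannot have a unique key, because PXC cannot guarantee that the key values are unique.

For a table created with HASH or MODULO, every unique constraint (including the primary key) must include the distribution key. If it does not, PXC cannot guarantee uniqueness across multiple nodes.

Test results:

In testing, the primary key had to be a single column; composite primary keys were not supported. The workaround is to declare a single-column primary key when creating the table and then create a unique index over the columns of the intended "multi-column primary key":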

create table t_011 (id int,qty int,name varchar(32),primary key(id)) distribute by hash(id) to node (dnode1,dnode2);

create unique index idx_id_qty on t_011(id,qty);
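
As a counter-example to the uniqueness rule, the sketch below declares a unique constraint on a column that is not part of the distribution key. The table name t_012 and the column code are made up for illustration. Postgres-XC is expected to reject the statement; the exact error text is not reproduced here.

```sql
-- Hypothetical table: id is the distribution key, but the unique
-- constraint is on code, which is not part of the distribution key.
-- Postgres-XC is expected to refuse this, because two rows with the
-- same code could be routed to different datanodes.
create table t_012 (id int, code int unique, name varchar(32))
    distribute by hash(id) to node (dnode1,dnode2);
```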

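The TO clause specifies which datanodes hold the table's data. "NODE nodename" lists the datanodes directly; "GROUP groupname" names the node group to place the table in. A node group is created with the CREATE NODE GROUP groupname command.

Run the following SQL on all Coordinators: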

testdb=# create node group cgname1 with (dnode1,dnode2);
testdb=# create node group cgall with (dnode1,dnode2,dnode3);
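
To confirm that the groups were created, the pgxc_group catalog can be queried on each Coordinator. This is a minimal sketch; group_members holds the OIDs of the member nodes, which can be matched against pgxc_node by hand.

```sql
-- List the node groups known to this Coordinator.
select group_name, group_members from pgxc_group;

-- Look up node OIDs so the group_members values can be resolved to names.
select oid, node_name, node_type from pgxc_node;
```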

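If the DISTRIBUTE BY clause is omitted, the primary key is used as the distribution key. If there is no primary key but there is a unique key, PXC uses the unique key. If there is no unique constraint either, the first column that can serve as a distribution key is used.

For example: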

create table test01( id int primary key,city varchar(32)) distribute by hash(id) to group cgname1;

create table test02(id int primary key,addr varchar(32)) distribute by hash(id) to group cgall;

create table test03( id int primary key, remarks varchar(32))distribute by hash(id) to node (dnode1,dnode3);

create table test04(id int,qty int,name varchar(32)) distribute by ROUNDROBIN to node(dnode1,dnode2,dnode3);
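
The examples above cover HASH and ROUNDROBIN. For completeness, here is a sketch of the other two distribution types. The table names test_rep and test_mod are hypothetical; the node group and node names are the ones created earlier.

```sql
-- Replicated table: every datanode in group cgall keeps a full copy.
create table test_rep (id int primary key, label varchar(32))
    distribute by replication to group cgall;

-- Modulo-distributed table: rows are placed by the modulo of id.
create table test_mod (id int primary key, qty int)
    distribute by modulo(id) to node (dnode1,dnode2);
```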

postgres=# create user zhaowz superuser;

ERROR: Failed to get pooled connections

CONTEXT: SQL statement "EXECUTE DIRECT ON (cdtor1) 'SELECT pg_catalog.pg_try_advisory_xact_lock_shared(65535, 0)'"

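Cause and fix

Cause: pg_hba.conf on the Coordinators has no entries allowing connections from the datanodes (the datanode IPs). Log in to every Coordinator instance and add them: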

[pgxc@vlnx113052001 coordinator]$ vim pg_hba.conf

host all all 172.31.107.1/24 trust

host all all 0.0.0.0/0 trust

[pgxc@vlnx113052001 coordinator]$ psql -p 5435 postgres

postgres=# select pg_reload_conf();

create table test05( id integer,name varchar(32)) distribute by  HASH(id) to group cgname1;
create table test06( id integer,name varchar(32)) distribute by  HASH(id) to node(dnode1,dnode3);


insert into test06 select generate_series(1,30),'test-pgxc';
insert into test05 select generate_series(1,60),'test-pgxc01';

The data held on an underlying datanode can be queried through a Coordinator:
testdb=# execute direct on (dnode1) 'select * from test05';
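
To see how the inserted rows were spread, the row counts per node can be compared. This is a sketch that assumes test05 was loaded as above. xc_node_id is the system column Postgres-XC exposes on tables; its values are node identifiers, not node names.

```sql
-- Row count on each datanode of group cgname1, one node at a time.
execute direct on (dnode1) 'select count(*) from test05';
execute direct on (dnode2) 'select count(*) from test05';

-- Row count per datanode in a single query issued on the Coordinator.
select xc_node_id, count(*) from test05 group by xc_node_id;
```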

EXECUTE DIRECT

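Postgres-XC: Creating Tables and Node Maintenance

Redistributing Data
PXC can change a table's distribution properties, i.e. redistribute the table's data across a different set of nodes.
Case 1:
        The default redistribution method. First, all the data is saved to the coordinator with COPY TO, and the data on each datanode is then cleared with TRUNCATE.
        Finally, the COPY FROM mechanism redistributes the data to the underlying datanodes. A REINDEX may be run afterwards if needed.

Case 2:
    Replicated table to replicated table: typically used when adding or removing datanodes for a replicated table.
    To remove a replica, simply run TRUNCATE on that datanode.
    To add a replica, save the data from any one node to the coordinator with COPY TO, then copy it to the new node with the COPY FROM mechanism. A REINDEX may be run afterwards if needed.

Case 3:
    Replicated table to distributed table:
    If the node list after the conversion differs from the one before it, or new nodes are added, the default method (case 1) is used.
    If the nodes after the conversion are the same as before, or fewer, no data has to move between nodes; each datanode only deletes the rows it no longer needs to keep. A REINDEX may be run afterwards if needed.

Case 4:
    Distributed table to replicated table: the default redistribution method (case 1) is used.

Postgres-XC extends the ALTER TABLE command to redistribute table data, adding the following clauses: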

DISTRIBUTE BY { REPLICATION | ROUNDROBIN | { [ HASH | MODULO ] ( column_name ) } }
    TO { GROUP groupname | NODE ( nodename [, ... ] ) }
    ADD NODE ( nodename [, ... ] )
    DELETE NODE ( nodename [, ... ] )
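
A short sketch of how these clauses might be used, reusing the test05 and test06 tables and the node names from the earlier examples. Each statement triggers one of the redistribution cases described above, so on large tables it can take a while.

```sql
-- Convert a hash-distributed table into a replicated one.
alter table test05 distribute by replication;

-- Change the node list and the distribution type in one statement.
alter table test06 to node (dnode1,dnode2,dnode3), distribute by modulo(id);
```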


Adding a Coordinator Node

    1. Initialize a new coordinator node.
    2. Configure its postgresql.conf file.
    3. Connect to any existing coordinator and lock the cluster in preparation for the backup. Run the lock command but do not exit the session, otherwise the backup will be inconsistent:
postgres=# select pgxc_lock_for_backup();
INFO:  please do not close this session until you are done adding the new node

     After this, do not run any DDL statements anywhere in the cluster. The database metadata then stays unchanged, which keeps the backup consistent.
    4. From another session, connect to any existing coordinator and take the backup:
    /usr/local/pgxc1.2/bin/pg_dumpall -p 5432 -s --include-nodes --dump-nodes --file=meta.sql
    5. Start the new coordinator in restoremode:
         /usr/local/pgxc1.2/bin/pg_ctl start -Z restoremode -D /opt/pgxc/coordinator/
        Copy the backup file to the new node and restore it:
        psql -d postgres -p 5432 -f /tmp/meta.sql
        Then stop the new coordinator:
        /usr/local/pgxc1.2/bin/pg_ctl stop -D /opt/pgxc/coordinator
    6. Start the new coordinator normally:
        /usr/local/pgxc1.2/bin/pg_ctl start -Z coordinator -D /opt/pgxc/coordinator
    7. On every coordinator, run CREATE NODE to add the new node, then call pgxc_pool_reload() to refresh the node information cached in the connection pool:
        create node cdtor5 with (host= 'vlnx107001.firstshare.cn',type = 'coordinator',port=5432);

        select pgxc_pool_reload();
    8. Exit the session from step 3 to release the cluster lock.
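
Once the steps are done, a quick check on each coordinator can confirm that the new node is registered. This is a minimal sketch that reads the pgxc_node catalog; cdtor5 is the node name used in step 7.

```sql
-- cdtor5 should appear with node_type 'C' (coordinator) on every coordinator.
select node_name, node_type, node_host, node_port
from pgxc_node
order by node_name;
```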

Removing a Coordinator Node

1. Stop the node to be removed:
    pg_ctl stop -Z coordinator -D /opt/pgxc/coordinators -m fast
2. Connect to any coordinator and run DROP NODE:
    drop node cdtor3;
3. Refresh the cache:
    select  pgxc_pool_reload();


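Adding a Datanode

1. Initialize the new datanode and edit its postgresql.conf file.
2. Connect to any existing coordinator and lock the cluster in preparation for the backup. Run the lock command but do not exit the session, otherwise the backup will be inconsistent:
    select pgxc_lock_for_backup();
    After this, do not run any DDL statements anywhere in the cluster. The database metadata then stays unchanged, which keeps the backup consistent.
3. Connect to any datanode and back up the metadata:
    pg_dumpall -p 5432 -s --file=datanode_meta.sql
4. Start the new datanode in restoremode, restore the metadata, then stop it (the port is passed to the server with -o, because pg_ctl's own -p option is the path to the postgres executable):
    pg_ctl start -Z restoremode -D /var/lib/pgsql/9.6/dnode -o "-p 5439"
    psql -d postgres -f datanode_meta.sql -p 5439
    pg_ctl stop -D /var/lib/pgsql/9.6/dnode -m fast
5. Start the new datanode normally:
    pg_ctl start -Z datanode -D /var/lib/pgsql/9.6/dnode
6. On every coordinator, run CREATE NODE to add the new node, then call pgxc_pool_reload():
    create node dnode4 with(host='hostname',type ='datanode',port= 5439);
    select pgxc_pool_reload();
7. Exit the session from step 2 to release the cluster lock.
8. Optionally run ALTER TABLE to redistribute the data of existing tables onto the new node. If this step is skipped, the data of existing tables stays on the original nodes only.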
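
Step 8 might look like the sketch below, which assumes the test05 table from the earlier examples and the new node dnode4 from step 6.

```sql
-- Extend an existing table onto the new datanode; its data is redistributed.
alter table test05 add node (dnode4);

-- Confirm that the new node now holds part of the table.
execute direct on (dnode4) 'select count(*) from test05';
```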



Removing a Datanode

1. Before a datanode can be removed, the data on it must be redistributed to the other nodes:
    alter table tbl_name1 delete node (dnode4);
    alter table tbl_name2 delete node (dnode4);
    alter table tbl_name3 delete node (dnode4);
    .....
Check whether any table still keeps data on this node:
select * from pgxc_class c,pgxc_node n where n.node_name='dnode4' and n.oid=any(c.nodeoids);
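
A variant of the same check that prints table names instead of raw catalog rows may be easier to read. This is a sketch under the same assumptions as the query above; pclocatortype is a one-letter code for the table's distribution type.

```sql
-- Tables that still list dnode4 among their nodes; an empty result
-- means the node no longer holds any table data.
select c.pcrelid::regclass as table_name, c.pclocatortype
from pgxc_class c, pgxc_node n
where n.node_name = 'dnode4'
  and n.oid = any(c.nodeoids);
```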

2. Stop the node:
    pg_ctl stop -Z datanode -D /var/lib/pgsql/9.6/dnode -m fast
3. Connect to every coordinator and drop the node:
    drop node  dnode4;
4. Refresh the cache:
    select pgxc_pool_reload();




postgres=# select pgxc_lock_for_backup();
ERROR:  cannot lock cluster for backup in presence of 1 uncommitted prepared transactions

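postgres=# SELECT * FROM pg_prepared_xacts;
postgres=# SELECT * FROM pg_prepared_xact();
postgres=# SELECT * FROM pgxc_prepared_xact();

Roll back or commit each outstanding prepared transaction on the datanode that holds it (transaction_id stands for the transaction's ID), then retry the lock:

postgres=# EXECUTE DIRECT ON (dnode1) 'rollback prepared ''transaction_id''';
or
postgres=# EXECUTE DIRECT ON (dnode1) 'commit prepared ''transaction_id''';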

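Postgres-XC Limitations

1. The distribution key cannot be updated.

2. The SERIALIZABLE and REPEATABLE READ transaction isolation levels are not supported.

3. Constraints are enforced only within individual datanodes; cross-node constraints are not supported.

4. Some complex SQL is not supported in PREPARE statements.

5. Privileges on views may not work correctly.

6. Row triggers do not fire for the COPY command.

7. COPY TO is not supported on replicated tables.

8. DML statements cannot be used in PL/pgSQL functions.

9. CREATE TABLE AS EXECUTE is not supported.

10. WHERE CURRENT OF is not supported.

11. Foreign data wrappers are not supported.

12. SAVEPOINT is not supported.

13. LISTEN, UNLISTEN and NOTIFY only work on the local coordinator.

14. Statistics are not global; each node maintains its own.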
