PostgreSQL hash index errors on a streaming replication standby

Tags: postgresql hash index | Published: 2014-06-04 13:19 | Author: xmarker

Source: http://xmarker.blog.163.com

Today I tested PostgreSQL hash indexes and found that they cause problems under streaming replication. The test went as follows:

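1. First, set up a PostgreSQL 9.3.4 streaming replication environment (steps omitted). In my environment, db3 is the primary and db4 is the standby.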
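Since the setup itself is skipped here, the following is a minimal sketch of how the two roles could be confirmed before running the test. It is not part of the original test. It assumes a 9.3-era server, where the `pg_stat_replication` columns are named `sent_location` and `replay_location` (PostgreSQL 10 renamed them to `sent_lsn` and `replay_lsn`).

```sql
-- Run on each node: returns false on the primary (db3) and true on a standby (db4).
select pg_is_in_recovery();

-- Run on the primary: one row per connected standby.
-- A healthy standby normally shows state = 'streaming'.
select client_addr, state, sent_location, replay_location
  from pg_stat_replication;
```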

2. Create the test table and a hash index on it:

create table t_test(id int,name varchar(512),age int,time timestamp);
postgres=# create index idx_t_test on t_test using hash (name);
CREATE INDEX

Insert a few rows:

postgres=# insert into t_test values (1,'mcl',28,now());

INSERT 0 1

postgres=# insert into t_test values (2,'afas',22,now());

INSERT 0 1

postgres=# insert into t_test values(3,'aaa',32,now());

INSERT 0 1

Then bulk-insert some more rows:

postgres=# insert into t_test select a,md5(a::text),a,clock_timestamp() from generate_series(10,1000) a;

INSERT 0 991

Compare the row counts on the primary and the standby:

db3 (primary):

postgres=# select count(*) from t_test ;
 count
-------
   994
(1 row)

Table definition:

postgres=# \d t_test
              Table "public.t_test"
 Column |            Type             | Modifiers
--------+-----------------------------+-----------
 id     | integer                     |
 name   | character varying(512)      |
 age    | integer                     |
 time   | timestamp without time zone |
Indexes:
    "idx_t_test" hash (name)

db4 (standby):

postgres=# select count(*) from t_test ;
 count
-------
   994
(1 row)

Table definition:

postgres=# \d t_test
              Table "public.t_test"
 Column |            Type             | Modifiers
--------+-----------------------------+-----------
 id     | integer                     |
 name   | character varying(512)      |
 age    | integer                     |
 time   | timestamp without time zone |
Indexes:
    "idx_t_test" hash (name)

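The data is consistent between the primary and the standby, and the table definitions are identical. Now for the actual test.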

3. Test a lookup through the hash index:

db3 (primary):

postgres=# explain( analyze,verbose,buffers,timing) select * from t_test where name='mcl';
                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Index Scan using idx_t_test on public.t_test  (cost=0.00..8.02 rows=1 width=48) (actual time=0.032..0.038 rows=1 loops=1)
   Output: id, name, age, "time"
   Index Cond: ((t_test.name)::text = 'mcl'::text)
   Buffers: shared hit=3
 Total runtime: 0.106 ms
(5 rows)

Time: 0.832 ms

The plan shows that the query is served by a scan of the hash index. Now the same query on the standby:

db4 (standby):

postgres=# explain( analyze,verbose,buffers,timing) select * from t_test where name='mcl';

ERROR: could not read block 0 in file "base/12896/16388": read only 0 of 8192 bytes

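The same query fails here. My guess is that the hash index was not carried over to the standby by streaming replication, so a scan through the index errors out. Let's force a sequential scan instead: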

postgres=# set enable_bitmapscan =off;

SET

postgres=# set enable_indexscan =off;

SET

postgres=# explain( analyze,verbose,buffers,timing) select * from t_test where name='mcl';

                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on public.t_test  (cost=0.00..23.43 rows=1 width=48) (actual time=0.050..0.252 rows=1 loops=1)
   Output: id, name, age, "time"
   Filter: ((t_test.name)::text = 'mcl'::text)
   Rows Removed by Filter: 993
   Buffers: shared hit=11
 Total runtime: 0.416 ms
(6 rows)

Sure enough, the query succeeds once it uses a sequential scan.
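The guess that the unreadable file belongs to the hash index can be checked from the catalogs. The queries below are a sketch that was not run in the original test. They assume that file node 16388 from the error message is the index `idx_t_test`, which the post does not confirm.

```sql
-- Run on the standby (db4). The system catalogs are replicated normally,
-- so these lookups work even if the index contents are missing.

-- Which relation owns file node 16388 from the error message?
select relname, relkind from pg_class where relfilenode = 16388;

-- Path of the index file, relative to the data directory.
select pg_relation_filepath('idx_t_test');

-- On-disk size of the index in bytes.
select pg_relation_size('idx_t_test');
```

If the guess is right, the first query should name `idx_t_test` with relkind `i`, and the path should match `base/12896/16388`. A reported size of 0 would be consistent with the "read only 0 of 8192 bytes" message.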

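Replacing the hash index with a btree index (done on the primary, since the standby is read-only) also makes the query work: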

postgres=# drop index idx_t_test ;

DROP INDEX

postgres=# create index ind_t_test on t_test (name);

CREATE INDEX

Query again on db4 (standby):

postgres=# select * from t_test where name='mcl';

 id | name | age |            time
----+------+-----+----------------------------
  1 | mcl  |  28 | 2014-06-04 10:17:00.405492
(1 row)

postgres=# explain( analyze,verbose,buffers,timing) select * from t_test where name='mcl';

                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Index Scan using ind_t_test on public.t_test  (cost=0.28..8.29 rows=1 width=48) (actual time=0.028..0.032 rows=1 loops=1)
   Output: id, name, age, "time"
   Index Cond: ((t_test.name)::text = 'mcl'::text)
   Buffers: shared hit=3
 Total runtime: 0.126 ms
(5 rows)

Finally, here is what the official documentation says about hash indexes:

Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash if there were unwritten changes. Also, changes to hash indexes are not replicated over streaming or file-based replication after the initial base backup, so they give wrong answers to queries that subsequently use them. For these reasons, hash index use is presently discouraged.

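In short: in PostgreSQL 9.3, hash index operations are not WAL-logged, so a hash index may have to be rebuilt with REINDEX after a database crash. Changes to it are also not shipped to the standby by streaming replication, so a standby query that goes through the hash index returns wrong results or, as in this test, fails with an error. For these reasons hash indexes are currently discouraged.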
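To find out whether a database is exposed to this problem, the hash indexes it contains can be listed from the system catalogs. The query below is a sketch. The column aliases are my own, and the concern only applies to releases before PostgreSQL 10, which made hash indexes WAL-logged.

```sql
-- List every hash index in the current database.
select n.nspname as schema_name,
       t.relname as table_name,
       c.relname as index_name
  from pg_index i
  join pg_class c     on c.oid = i.indexrelid   -- the index itself
  join pg_class t     on t.oid = i.indrelid     -- the table it belongs to
  join pg_namespace n on n.oid = c.relnamespace
  join pg_am a        on a.oid = c.relam        -- index access method
 where a.amname = 'hash'
 order by 1, 2, 3;
```

Each index it returns is a candidate for replacement with a btree index, as was done for `idx_t_test` above.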