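Today I tested PostgreSQL hash indexes and found that they cause problems under streaming replication. The test procedure follows:
1. First, set up a streaming replication environment on PostgreSQL 9.3.4 (steps omitted). In my environment, db3 is the primary and db4 is the standby.
2. Create the test table and index (on the primary)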
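Since the replication setup is omitted here, a quick sanity check before starting may be useful. This is a minimal sketch that assumes the standby is already connected; it uses only the standard `pg_stat_replication` view and the `pg_is_in_recovery()` function.

```sql
-- On the primary (db3): one row per connected standby.
-- state is expected to be 'streaming' once the standby has caught up.
SELECT client_addr, state, sync_state
FROM pg_stat_replication;

-- On the standby (db4): returns true while the server is running in recovery,
-- i.e. acting as a standby.
SELECT pg_is_in_recovery();
```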
create table t_test(id int,name varchar(512),age int,time timestamp);
postgres=# create index idx_t_test on t_test using hash (name);
CREATE INDEX
Insert some data:
postgres=# insert into t_test values (1,'mcl',28,now());
INSERT 0 1
postgres=# insert into t_test values (2,'afas',22,now());
INSERT 0 1
postgres=# insert into t_test values(3,'aaa',32,now());
INSERT 0 1
Then bulk-insert more rows:
postgres=# insert into t_test select a,md5(a::text),a,clock_timestamp() from generate_series(10,1000) a;
INSERT 0 991
Check the row counts on the primary and the standby:
db3 (primary):
postgres=# select count(*) from t_test ;
count
-------
994
(1 row)
Table structure:
postgres=# \d t_test
Table "public.t_test"
Column | Type | Modifiers
--------+-----------------------------+-----------
id | integer |
name | character varying(512) |
age | integer |
time | timestamp without time zone |
Indexes:
"idx_t_test" hash (name)
db4 (standby):
postgres=# select count(*) from t_test ;
count
-------
994
(1 row)
Table structure:
postgres=# \d t_test
Table "public.t_test"
Column | Type | Modifiers
--------+-----------------------------+-----------
id | integer |
name | character varying(512) |
age | integer |
time | timestamp without time zone |
Indexes:
"idx_t_test" hash (name)
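The data on the primary and the standby match, and the table definitions are identical. Now for the actual test.
3. Test lookups through the hash index:
db3 (primary):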
postgres=# explain( analyze,verbose,buffers,timing) select * from t_test where name='mcl';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_t_test on public.t_test (cost=0.00..8.02 rows=1 width=48) (actual time=0.032..0.038 rows=1 loops=1)
Output: id, name, age, "time"
Index Cond: ((t_test.name)::text = 'mcl'::text)
Buffers: shared hit=3
Total runtime: 0.106 ms
(5 rows)
Time: 0.832 ms
The plan shows that the hash index is used for the scan. Now the standby:
db4 (standby):
postgres=# explain( analyze,verbose,buffers,timing) select * from t_test where name='mcl';
ERROR: could not read block 0 in file "base/12896/16388": read only 0 of 8192 bytes
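The same query fails on the standby. My guess is that the contents of the hash index were never shipped over by streaming replication, so the index scan has nothing to read. Let's force a sequential scan and try again: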
postgres=# set enable_bitmapscan =off;
SET
postgres=# set enable_indexscan =off;
SET
postgres=# explain( analyze,verbose,buffers,timing) select * from t_test where name='mcl';
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Seq Scan on public.t_test (cost=0.00..23.43 rows=1 width=48) (actual time=0.050..0.252 rows=1 loops=1)
Output: id, name, age, "time"
Filter: ((t_test.name)::text = 'mcl'::text)
Rows Removed by Filter: 993
Buffers: shared hit=11
Total runtime: 0.416 ms
(6 rows)
Sure enough, the sequential scan works.
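The guess that the unreadable file belongs to the hash index can be checked from the catalogs. The sketch below assumes you are connected to the same database the error came from; `16388` is the file name taken from the error message, and `relfilenode` is the catalog column that maps a relation to its file on disk.

```sql
-- Run on the standby (db4).
-- Which relation owns file 16388, and which access method does it use?
SELECT c.relname, c.relkind, am.amname
FROM pg_class c
LEFT JOIN pg_am am ON am.oid = c.relam
WHERE c.relfilenode = 16388;

-- Or start from the index name and look up its file path.
SELECT pg_relation_filepath('idx_t_test');
```

If the guess is right, the first query should name `idx_t_test` with `relkind = 'i'` and `amname = 'hash'`, and the second should return the same path that appears in the error.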
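Replacing the hash index with a btree index also works. On the primary (db3), drop the hash index and create a btree index in its place: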
postgres=# drop index idx_t_test ;
DROP INDEX
postgres=# create index ind_t_test on t_test (name);
CREATE INDEX
Query again on db4 (standby):
postgres=# select * from t_test where name='mcl';
id | name | age | time
----+------+-----+----------------------------
1 | mcl | 28 | 2014-06-04 10:17:00.405492
(1 row)
postgres=# explain( analyze,verbose,buffers,timing) select * from t_test where name='mcl';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Index Scan using ind_t_test on public.t_test (cost=0.28..8.29 rows=1 width=48) (actual time=0.028..0.032 rows=1 loops=1)
Output: id, name, age, "time"
Index Cond: ((t_test.name)::text = 'mcl'::text)
Buffers: shared hit=3
Total runtime: 0.126 ms
(5 rows)
Finally, here is what the official documentation says about hash indexes:
Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash if there were unwritten changes. Also, changes to hash indexes are not replicated over streaming or file-based replication after the initial base backup, so they give wrong answers to queries that subsequently use them. For these reasons, hash index use is presently discouraged.
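In short: in this version, hash index operations are not WAL-logged, so a hash index may need a REINDEX after a crash, and changes to it are not carried to the standby by streaming replication. Queries on the standby that use the hash index can therefore fail, as in the test above, or return wrong results, which is why hash indexes are currently discouraged. (This describes the 9.3 behavior tested here; hash indexes have been WAL-logged since PostgreSQL 10.)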
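To see whether a database is exposed to this problem, one option is to list every hash index from the system catalogs. This is a sketch using only standard catalog tables (`pg_index`, `pg_class`, `pg_namespace`, `pg_am`); the column aliases are my own choice.

```sql
-- Run on the primary: list all hash indexes in the current database.
SELECT n.nspname AS schema_name,
       t.relname AS table_name,
       c.relname AS index_name
FROM pg_index i
JOIN pg_class c     ON c.oid = i.indexrelid   -- the index itself
JOIN pg_class t     ON t.oid = i.indrelid     -- the table it belongs to
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_am am       ON am.oid = c.relam       -- access method of the index
WHERE am.amname = 'hash';

-- After a crash on the primary, a hash index can be rebuilt like this
-- (idx_t_test is the index from the test above):
REINDEX INDEX idx_t_test;
```

REINDEX has to be run on the primary, since a hot standby is read-only. It only repairs the primary's copy: the rebuilt contents still do not reach the standby in this version, so for replicated setups the btree replacement shown earlier is the safer choice.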