<<上篇 | 首页 | 下篇>>

hbase ( key 设计 ) 条件查询排序分页

Paging is a very common use-case for web sites and many other applications.  In relational databases, this is easily implemented with LIMIT and OFFSET, or by selecting the row number in the query and adding conditionals based on it’s value.  HBase 0.19.x, on the other hand, does not provide any queries or filters that support paging directly. After a quick example using SQL, I will show how to implement the same functionality in HBase.

Let’s assume that we have a large number of users.  Each user has performed a number of actions.  Each action has a unique identifier, a timestamp, and a name.

This is how you might get the third page of an individual users’ actions using SQL:

SELECT id, name, stamp FROM actions WHERE userid = 1
ORDER BY stamp DESC LIMIT 10 OFFSET 20;

This utilizes secondary indexes on both userid and stamp, meaning to accomplish this query you need at least three indexes on this table as id is the primary key.  Though a simple query to write, you will run into problems as the actions table grows to millions of rows and beyond.  Insertions would look like:

INSERT INTO actions (id, userid, name, stamp) VALUES (newid(), 1, 'Joe User', epoch());

HBase has no real indexes.  Rows are stored in sorted order, and columns in a family are sorted.  For more information, read the HBase Architecture page on the HBase Wiki.

Very conscious of the primary queries we will run on user-actions, we will design an HBase table to support paging queries on per-user, time-ordered lists of actions.

We will use the Java Client API for HBase, specifically the HTable class.  What we are looking for are two methods:

public static List<Action> getUserActions(int userid, int offset, int limit)
public static void putUserAction(Action action)

Please note, I am using a custom object, Action, for simplicity.  It is a client-side holder for the four action fields (id, userid, name, stamp).

There are a number of ways to store your data in HBase that will allow the getUserActions query, but in this case we will go with a very tall table design (lots of rows with few columns in them) rather thanwide (lots of columns in each row).  Specifically, the difference here would be whether you have a row-per-action or a row-per-user.  We will do a row-per-action, but will be designing our row key (the primary key) to be a composite key to allow for grouping and sorting of actions, rather than just the action id.  This means we will not have random-access to an action by it’s id, so rather than defining this as the actions table (which might also exist if you needed actionid random access) we will define it as the useractions table, and we will only store a single column in a single family,content:name.

The row key that we will use in our HBase useractions table is:

<userid><reverse_order_stamp><actionid>

It’s very important that each of these fields is fixed-length and binary so that the lexicographical/ascending byte-ordering of HBase will properly sort our rows.

The userid field will be a 4 byte, big endian integer.  reverse_order_stamp is an 8 byte, big endian long with a value of (Long.MAX_VALUE - epoch).  This is so the most recent stamp is at the top rather than the bottom.  actionid is another 4 byte, big endian integer.  Thankfully, HBase provides helpful utilties in the org.apache.hadoop.hbase.util.Bytes class to deal with this (unfortunately it lacked some key features in 0.19, so the code below makes use of the Bytes class available in 0.20/TRUNK).  Before we get into HBase code, let’s define the helper methods makeActionRow and readActionRow to deal with the composite key:

public static byte [] makeActionRow(int userid, long stamp, int actionid)
throws Exception {
  byte [] useridBytes = Bytes.toBytes(userid);
  byte [] stampBytes = Bytes.toBytes(stamp);
  byte [] actionidBytes = Bytes.toBytes(actionid);
  return Bytes.add(useridBytes, stampBytes, actionidBytes);
}
 
public static Action readActionRow(byte [] row)
throws Exception {
  // Bytes.toInt(byte [] buf, int offset, int length)
  int userid = Bytes.toInt(row,0,4);
  long stamp = Long.MAX_VALUE - Bytes.toLong(row,4,8);
  int actionid = Bytes.toInt(row,12,4);
  return new Action(userid,stamp,actionid);
}

Now that we can deal with the composite keys, insertion is very straightforward:

public static void putUserAction(Action action) throws Exception {
  // Get the fields from the Action object
  int userid = action.getUserID();
  long stamp = Long.MAX_VALUE - action.getStamp();
  int actionid = action.getID();
  String name = action.getName();
 
  // Build the composite row, column, and value
  byte [] row = makeActionRow(userid,stamp,actionid);
  byte [] column = Bytes.toBytes("content:name");
  byte [] value = Bytes.toBytes(name);
 
  // Insert to HBase
  HTable ht = new HTable("useractions");
  BatchUpdate bu = new BatchUpdate(row);
  bu.put(column,value)
  ht.commit(bu);
}

We just serialize the fields into the composite row, and write the single column to HBase in a BatchUpdate.  Reading will deal with unserializing the fields and Scanners.  In addition to matching for the content:name column, we will also specify a startRow and stopRow so that the Scanner only returns results from the user we are looking at.  This way we do not have to worry about jumping to the next user in our code, the Scanner will just stop.

public static List<Action> getUserActions(int userid, int offset, int limit)
throws Exception {
  // Initialize counter and List to return
  int count = 0;
  List<Action> actions = new ArrayList<Action>(limit);
 
  // Initialize startRow, stopRow, and columns to match
  byte [] startRow = makeActionRow(userid,0,0);
  byte [] stopRow = makeActionRow(userid,Long.MAX_VALUE,Integer.MAX_VALUE);
  byte [][] columns = {Bytes.toBytes("content:name")};
 
  // Open Scanner
  HTable ht = new HTable("useractions");
  Scanner s = ht.getScanner(columns,startRow,stopRow);
  RowResult res = null;
 
  // Iterate over Scanner
  while((res = s.next()) != null) {
    // Check if past offset
    if(++count <= offset) continue;
 
    // Get data from RowResult
    byte [] row = res.getRow();
    byte [] value = res.get(columns[0]).getValue();
 
    // Build Action
    Action action = readActionRow(row);
    String name = Bytes.toString(value);
    action.setName(name);
    actions.add(action);
 
    // Check limit
    if(count == offset + limit) break;
  }
  // Cleanup and return
  s.close();
  return actions;
}

The storage of your data must be tied to how you need to query it.  Without a sophisticated query engine or indexing capabilities, you must design to take advantage of sorted rows and columns, potentially designing a table per query type.  Denormalization is okay!

阅读全文……

标签 : ,

Ken Wu's Blog » HBase二级索引与Join

1,按索引建表
每一个索引建立一个表,然后依靠表的row key来实现范围检索。row key在HBase中是以B+ tree结构化有序存储的,所以scan起来会比较效率。
单表以row key存储索引,column value存储id值或其他数据 ,这就是Hbase索引表的结构。

如何Join?
多索引(多表)的join场景中,主要有两种参考方案:

1,按索引的种类扫描各自独立的单索引表,最后将扫描结果merge。
这个方案的特点是简单,但是如果多个索引扫描结果数据量比较大的话,merge就会遇到瓶颈。

比如,现在有一张1亿的用户信息表,建有出生地和年龄两个索引,我想得到一个条件是在杭州出生,年龄为20岁的按用户id正序排列前10个的用户列表。
有一种方案是,系统先扫描出生地为杭州的索引,得到一个用户id结果集,这个集合的规模假设是10万。
然后扫描年龄,规模是5万,最后merge这些用户id,去重,排序得到结果。

这明显有问题,如何改良?
保证出生地和年龄的结果是排过序的,可以减少merge的数据量?但Hbase是按row key排序,value是不能排序的。
变通一下 – 将用户id冗余到row key里?OK,这是一种解决方案了,这个方案的图示如下:

merge时提取交集就是所需要的列表,顺序是靠索引增加了_id,以字典序保证的。

2, 按索引查询种类建立组合索引。
在方案1的场景中,想象一下,如果单索引数量多达10个会怎么样?10个索引,就要merge 10次,性能可想而知。


解决这个问题需要参考RDBMS的组合索引实现。
比如出生地和年龄需要同时查询,此时如果建立一个出生地和年龄的组合索引,查询时效率会高出merge很多。
当然,这个索引也需要冗余用户id,目的是让结果自然有序。结构图示如下:

这个方案的优点是查询速度非常快,根据查询条件,只需要到一张表中检索即可得到结果list。缺点是如果有多个索引,就要建立多个与查询条件一一对应的组合索引,存储压力会增大。

在制定Schema设计方案时,设计人员需要充分考虑场景的特点,结合方案一和二来使用。下面是一个简单的对比:

  单索引 组合索引
检索性能 优异 优异
存储 数据不冗余,节省存储。 数据冗余,存储比较浪费。
事务性 多个索引保证事务性比较困难。 多个索引保证事务性比较困难。
join 性能较差 性能优异
count,sum,avg,etc 符合条件的结果集全表扫描 符合条件的结果集全表扫描

从上表中可以得知,方案1,2都存在更新时事务性保证比较困难的问题。如果业务系统可以接受最终一致性的话,事务性会稍微好做一些。否则只能借助于复杂的分布式事务,比如JTA,Chubby等技术。
count, sum, avg, max, min等聚合功能,Hbase只能通过硬扫的方式,并且很悲剧,你可能需要做一些hack操作(比如加一个CF,value为null),否则你在扫描时可能需要往客户端传回所有数据。
当然你可以在这个场景上做一些优化,比如增加状态表等,但复杂性带来的风险会更高。
还有一种终极解决方案就是在业务上只提供上一页和下一页,这或许是最简单有效的方案了。

阅读全文……

标签 : , ,

HBase存储时间相关多列数据的两种方案 - 大圆那些事 - 博客园

多行单列

表结构设计

Row Key:用户标识ID + (Long.MAX_VALUE - timestamp)

Column Family:’cf’

Column Qualifier:’’

Value:宝贝、URL等

其中,使用(Long.MAX_VALUE – timestamp)作为Row Key的后半部分是为了便于获取最近插入的数据,一个用户标识ID下的数据存储在多个Row Key下,每个Row Key下仅有一个Column Qualifier,表示该用户的一次时间相关的访问数据(访问宝贝、URL等)。

查询方式

1)查询某个特定timestamp下的记录,则使用用户标识ID + (Long.MAX_VALUE - timestamp)进行Get查找;

2)查询某个用户标识ID下所有的记录,则通过Scan.setStartRow(uid)和Scan.setStopRow(uid+1)进行Scan查找;

3)查询某个用户标识ID下最近时间内的N条记录,则通过Scan.setStartRow(uid)和Scan.setStopRow(uid+1)进行Scan查找,但是为了只获取N条记录,可以设置Scan.setCaching(N)优化查询,同时做一次ResultScanner.next(N)得到结果。

单行多列

表结构设计

Row Key:用户标识ID

Column Family:’cf’

Column Qualifier:(Long.MAX_VALUE - timestamp)

Value:宝贝、URL等

其中,使用(Long.MAX_VALUE – timestamp)作为Column Qualifier是为了便于获取最近插入的数据,一个用户标识ID下的数据存储在一个Row Key下,每个Row Key下会有多个Column Qualifier,表示该用户的所有时间相关的访问数据(访问宝贝、URL等)。

查询方式

1)查询某个特定timestamp下的记录,则使用用户标识ID进行Get查找,同时通过Get.addColumn(‘cf’, (Long.MAX_VALUE – timestamp))方法限定要查询的Column Qualifier;

2)查询某个用户标识ID下所有的记录,则直接使用用户标识ID进行Get查找,通过Get.addFamily(‘cf’)方法添加整个Column Family;

3)查询某个用户标识ID下最近时间内的N条记录,则直接使用用户标识ID进行Get查找,通过Get.addFamily(‘cf’)方法添加整个Column Family,通过ColumnCountGetFilter(int N)限制最多要查询返回N条记录。

阅读全文……

标签 : , ,