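Source: http://blog.nosqlfan.com/html/1845.html
The title of this article is admittedly a bit of clickbait. With NoSQL at the height of its popularity, NoSQL products are flourishing in great variety, but each has its own characteristics: strengths, and also scenarios it does not suit. This article compares Cassandra, MongoDB, CouchDB, Redis, Riak, and HBase from several angles, in the hope that after reading it you will have a sense of what each of these NoSQL products offers.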

CouchDB
- Written in: Erlang
- Main point: DB consistency, ease of use
- License: Apache
- Protocol: HTTP/REST
- Bi-directional (!) replication, continuous or ad-hoc, with conflict detection; thus, master-master replication (!)
- MVCC: write operations do not block reads
- Previous versions of documents are available
- Crash-only (reliable) design
- Needs compacting from time to time
- Views: embedded map/reduce
- Formatting views: lists & shows
- Server-side document validation possible
- Authentication possible
- Real-time updates via _changes (!)
- Attachment handling; thus, CouchApps (standalone JS apps)
- jQuery library included
Best used: For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.
For example: CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.
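
The MVCC and conflict-detection points can be illustrated with a toy in-memory store. This is only a sketch and not CouchDB's HTTP API: the `ToyStore` class, its method names, and the integer revision numbers are all hypothetical. It assumes only the general idea that an update must name the revision it was based on, and that older revisions stay readable until something like compaction removes them.

```python
class ConflictError(Exception):
    """Raised when an update is based on a stale revision."""


class ToyStore:
    """Hypothetical in-memory document store; NOT the CouchDB API."""

    def __init__(self):
        # doc_id -> list of (rev, body); older revisions are kept around
        self._history = {}

    def get(self, doc_id):
        """Return (rev, body) of the latest revision."""
        rev, body = self._history[doc_id][-1]
        return rev, dict(body)

    def put(self, doc_id, body, rev=None):
        """Store a new revision; `rev` must match the current revision."""
        revisions = self._history.get(doc_id, [])
        current = revisions[-1][0] if revisions else None
        if rev != current:
            raise ConflictError(f"expected rev {current}, got {rev}")
        new_rev = len(revisions) + 1
        self._history[doc_id] = revisions + [(new_rev, dict(body))]
        return new_rev

    def old_revision(self, doc_id, rev):
        """Fetch an earlier revision of a document."""
        for stored_rev, body in self._history[doc_id]:
            if stored_rev == rev:
                return dict(body)
        raise KeyError(rev)


store = ToyStore()
rev1 = store.put("contact", {"name": "Ann"})
rev2 = store.put("contact", {"name": "Ann B."}, rev=rev1)

# A second writer still holding rev1 is rejected instead of silently winning.
try:
    store.put("contact", {"name": "stale edit"}, rev=rev1)
    conflict_detected = False
except ConflictError:
    conflict_detected = True
```

Rejecting the stale write, rather than locking the document, is what lets readers carry on undisturbed while writers sort out who goes next.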

Redis
- Written in: C/C++
- Main point: Blazing fast
- License: BSD
- Protocol: Telnet-like
- Disk-backed in-memory database, but since 2.0, it can swap to disk
- Master-slave replication
- Simple keys and values, but complex operations like ZREVRANGEBYSCORE
- INCR & co (good for rate limiting or statistics)
- Has sets (also union/diff/inter)
- Has lists (also a queue; blocking pop)
- Has hashes (objects of multiple fields)
- Of all these databases, only Redis does transactions (!)
- Values can be set to expire (as in a cache)
- Sorted sets (high score table, good for range queries)
- Pub/Sub and WATCH on data changes (!)
Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory).
For example: Stock prices. Analytics. Real-time data collection. Real-time communication.
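
To make the "INCR & co (good for rate limiting)" point concrete, here is a minimal sketch of a fixed-window rate limiter. It does not talk to Redis: `ToyCounterStore` is a hypothetical stand-in that imitates only the increment-and-expire behaviour the bullets describe, and the `rate:<client>` key scheme and the `allow` helper are assumptions made for illustration.

```python
import time


class ToyCounterStore:
    """Hypothetical stand-in for a Redis client (INCR-like and EXPIRE-like only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None)

    def _purge(self, key):
        # Drop the key if its time-to-live has elapsed.
        entry = self._data.get(key)
        if entry is not None and entry[1] is not None and self._clock() >= entry[1]:
            del self._data[key]

    def incr(self, key):
        """Increment a counter, creating it at 1 if absent; return the new value."""
        self._purge(key)
        value, expires_at = self._data.get(key, (0, None))
        self._data[key] = (value + 1, expires_at)
        return value + 1

    def expire(self, key, seconds):
        """Give an existing key a time-to-live."""
        self._purge(key)
        if key in self._data:
            value, _ = self._data[key]
            self._data[key] = (value, self._clock() + seconds)


def allow(store, client_id, limit, window_seconds):
    """Return True while the client is within `limit` calls per window."""
    key = f"rate:{client_id}"  # hypothetical key naming scheme
    count = store.incr(key)
    if count == 1:
        # First hit in this window: start the countdown.
        store.expire(key, window_seconds)
    return count <= limit


# A fake clock keeps the example deterministic.
now = [0.0]
store = ToyCounterStore(clock=lambda: now[0])
first_three = [allow(store, "alice", limit=2, window_seconds=10) for _ in range(3)]
now[0] = 10.0  # the window has elapsed, so the counter starts over
after_window = allow(store, "alice", limit=2, window_seconds=10)
```

The pattern relies on the increment being a single atomic step, so concurrent callers never read the same count.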

MongoDB
- Written in: C++
- Main point: Retains some friendly properties of SQL (query, index)
- License: AGPL (Drivers: Apache)
- Protocol: Custom, binary (BSON)
- Master/slave replication
- Queries are JavaScript expressions
- Run arbitrary JavaScript functions server-side
- Better update-in-place than CouchDB
- Sharding built-in
- Uses memory-mapped files for data storage
- Performance over features
- After a crash, it needs to repair tables
- Better durability coming in V1.8
Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.
For example: For all things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.
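
"Dynamic queries" here means the query is itself data, composed at run time, rather than a view defined in advance. The sketch below imitates that idea over a plain Python list. It is not a MongoDB driver: `matches` and `find` are hypothetical helpers, only a handful of comparison operators are handled, and every dict-valued condition is assumed to be an operator document.

```python
import operator

# A small subset of comparison operators, for illustration only.
_OPERATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
}

_MISSING = object()  # marks a field that the document does not have


def matches(document, query):
    """Return True if `document` satisfies every condition in `query`."""
    for field, condition in query.items():
        value = document.get(field, _MISSING)
        if isinstance(condition, dict):
            # Operator document, e.g. {"$gt": 30}
            for op, operand in condition.items():
                if value is _MISSING or not _OPERATORS[op](value, operand):
                    return False
        elif value is _MISSING or value != condition:
            # Plain value: exact equality
            return False
    return True


def find(collection, query):
    """Return the documents in `collection` that match `query`."""
    return [doc for doc in collection if matches(doc, query)]


# Documents need not share the same fields (no predefined columns).
people = [
    {"name": "Ann", "age": 34, "city": "Oslo"},
    {"name": "Bob", "age": 27, "city": "Lima"},
    {"name": "Cy", "age": 41},
]
over_30 = find(people, {"age": {"$gt": 30}})
young_in_cities = find(people, {"city": {"$in": ["Oslo", "Lima"]}, "age": {"$lt": 30}})
```

A real server would consult an index instead of scanning every document, which is where "define indexes, not map/reduce functions" comes in.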

Cassandra
- Written in: Java
- Main point: Best of BigTable and Dynamo
- License: Apache
- Protocol: Custom, binary (Thrift)
- Tunable trade-offs for distribution and replication (N, R, W)
- Querying by column, range of keys
- BigTable-like features: columns, column families
- Writes are much faster than reads (!)
- Map/reduce possible with Apache Hadoop
- I admit to being a bit biased against it, because of the bloat and complexity it has, partly because of Java (configuration, seeing exceptions, etc.)
Best used: When you write more than you read (logging). If every component of the system must be in Java. (“No one gets fired for choosing Apache’s stuff.”)
For example: Banking and the financial industry (not necessarily for financial transactions; these industries are much bigger than that). Because writes are faster than reads, one natural niche is real-time data analysis.
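
The (N, R, W) trade-off can be stated in one line of arithmetic. In Dynamo-style systems, N is the number of replicas of each item, W the number of replicas that must acknowledge a write, and R the number that must answer a read. When R + W > N, every read set overlaps every write set in at least one replica. The sketch below only checks that inequality. The function names are hypothetical, and it ignores real-world details such as failed nodes or repair mechanisms.

```python
def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1


def read_overlaps_write(n, r, w):
    """True if any r replicas read must include one of the w replicas written."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must each be between 1 and n")
    return r + w > n


# With three replicas:
majority = quorum(3)                           # 2
balanced = read_overlaps_write(3, 2, 2)        # quorum reads and writes overlap
fast_but_loose = read_overlaps_write(3, 1, 1)  # a read may miss the latest write
cheap_reads = read_overlaps_write(3, 1, 3)     # slow writes, single-replica reads
```

Lowering W makes writes cheaper at the cost of weaker read guarantees, which fits the write-heavy workloads suggested above.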

Riak
- Written in: Erlang & C, some JavaScript
- Main point: Fault tolerance
- License: Apache
- Protocol: HTTP/REST
- Tunable trade-offs for distribution and replication (N, R, W)
- Pre- and post-commit hooks, for validation and security
- Built-in full-text search
- Map/reduce in JavaScript or Erlang
- Comes in “open source” and “enterprise” editions
Best used: If you want something Cassandra-like (Dynamo-like), but no way you’re gonna deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you’re ready to pay for multi-site replication.
For example: Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt.

HBase
- Written in: Java
- Main point: Billions of rows X millions of columns
- License: Apache
- Protocol: HTTP/REST (also Thrift)
- Modeled after BigTable
- Map/reduce with Hadoop
- Query predicate push down via server-side scan and get filters
- Optimizations for real-time queries
- A high-performance Thrift gateway
- HTTP supports XML, Protobuf, and binary
- Cascading, Hive, and Pig source and sink modules
- JRuby-based (JIRB) shell
- No single point of failure
- Rolling restart for configuration changes and minor upgrades
- Random access performance is like MySQL
Best used: If you’re in love with BigTable, and when you need random, real-time read/write access to your Big Data.
For example: Facebook Messaging Database (more general example coming soon)
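
The BigTable model behind "random, real-time read/write access" keeps rows sorted by key, so a point lookup and a range scan are both cheap, and a filter can be applied where the data lives instead of in the client. The toy table below sketches that idea in memory. It is not the HBase client API: `ToyTable`, its methods, and the `user#date` row-key layout are all hypothetical.

```python
import bisect


class ToyTable:
    """Hypothetical sorted-row table; NOT the HBase client API."""

    def __init__(self):
        self._keys = []  # row keys, kept sorted
        self._rows = {}  # row key -> {column: value}

    def put(self, row_key, column, value):
        """Write one cell, creating the row if needed."""
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def get(self, row_key):
        """Random access to a single row."""
        return dict(self._rows.get(row_key, {}))

    def scan(self, start_row, stop_row, row_filter=None):
        """Yield rows with start_row <= key < stop_row that pass the filter."""
        lo = bisect.bisect_left(self._keys, start_row)
        hi = bisect.bisect_left(self._keys, stop_row)
        for key in self._keys[lo:hi]:
            row = self._rows[key]
            # The predicate runs inside the table, before rows are returned.
            if row_filter is None or row_filter(key, row):
                yield key, dict(row)


table = ToyTable()
table.put("user1#2011-12-01", "msg:body", "hi")
table.put("user1#2011-12-02", "msg:body", "lunch?")
table.put("user2#2011-12-01", "msg:body", "hello")

# "$" sorts immediately after "#", so this range covers every "user1#..." key.
user1_keys = [key for key, _ in table.scan("user1#", "user1$")]
lunch_keys = [
    key
    for key, _ in table.scan(
        "user1#", "user1$",
        row_filter=lambda key, row: "lunch" in row.get("msg:body", ""),
    )
]
```

Putting the user first in the row key keeps one user's messages adjacent, so fetching them is a single contiguous scan.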

Original article: Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase comparison
Author: heiyeshuwu, posted 2011-12-20 14:37:28