[Repost] Consul vs. ZooKeeper, etcd, doozerd

Tags: | Published: 2015-01-28 18:47 | Author: Xiao_Qiang_
Source: http://blog.csdn.net/xiao_qiang_

  ZooKeeper, doozerd, and etcd all share a similar architecture: each runs a quorum of server nodes, is strongly consistent, and exposes various primitive operations. Applications build distributed systems on top of them through client libraries. Within a single datacenter, Consul's server nodes work in much the same way: Consul servers require a quorum and provide strong consistency. Beyond that, Consul natively supports multiple datacenters, with gossip pools linking server nodes and clients.

  If these systems are used purely as a K/V store, they all provide roughly the same semantics: reads are strongly consistent, and in the face of a network partition they sacrifice availability to preserve consistency. The differences become apparent, however, once the systems' advanced features come into play.

  The semantics these systems provide matter greatly when building service discovery. ZooKeeper offers only a primitive K/V store and requires application developers to build their own service-discovery layer on top. Consul ships with a solid service-discovery framework, which saves development effort: clients simply register services and then discover them via the DNS or HTTP interface. The other systems leave you to assemble your own solution.

  A compelling service-discovery framework must include health checking and account for the possibility of failure. Naive systems rely on heartbeats, periodic updates, and TTLs to detect failed services. This scheme requires work proportional to the number of nodes and concentrates the load on a fixed set of servers; moreover, the failure-detection window is at least as long as the TTL. ZooKeeper instead provides ephemeral nodes: K/V entries that are removed when the client disconnects. These are more sophisticated than heartbeating, but they push complexity onto clients: every client must maintain an active connection to the ZooKeeper servers and perform keep-alives. Such thick clients are hard to write and prone to bugs.

  Consul uses a fundamentally different architecture for health checking. Rather than running only on server nodes, Consul clients run on every node in the cluster. These clients form a gossip pool which provides, among other things, distributed health checking. The gossip protocol implements an efficient failure detector that scales to clusters of any size without concentrating the work on any single server. Clients also support far richer local health checks than ZooKeeper's ephemeral nodes, which amount to a very crude liveness check: a Consul client can check a web server's HTTP status codes, memory utilization, disk usage, and so on. And Consul clients expose a simple HTTP interface, instead of exposing the system's complexity to clients the way ZooKeeper does.

  Consul thus provides first-class support for service discovery, health checking, K/V storage, and multiple datacenters. To go beyond simple K/V storage, all of the other systems require additional tools and libraries. Through its client nodes, Consul offers a simple API.
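The register-then-discover workflow described above can be sketched against Consul's HTTP API. The snippet below only builds the JSON body for the documented `PUT /v1/agent/service/register` endpoint; the service name, port, and health-check URL are hypothetical examples, not anything from this article.

```python
import json

def registration_payload(name, port, health_url, interval="10s"):
    """Build the JSON body for Consul's PUT /v1/agent/service/register.

    The embedded check tells the local Consul agent to poll the
    service's health endpoint itself -- the agent, not the service,
    does the monitoring, which is the design difference noted above.
    """
    return {
        "Name": name,
        "Port": port,
        "Check": {
            "HTTP": health_url,
            "Interval": interval,
        },
    }

payload = registration_payload("web", 8080, "http://localhost:8080/health")
body = json.dumps(payload)
# A real registration would then be something like:
#   curl -X PUT -d @payload.json http://localhost:8500/v1/agent/service/register
```

Once registered, the service is discoverable via the agent's DNS and HTTP interfaces with no extra client library.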

What do Etcd, Consul, and Zookeeper do?
  - Service Registration:
    - Host, port number, and sometimes authentication credentials, protocols, version
      numbers, and/or environment details.
  - Service Discovery:
    - Ability for client application to query the central registry to learn of service location.
  - Consistent and durable general-purpose K/V store across distributed system.
    - Some solutions support this better than others.
    - Based on Paxos or some derivative (i.e. Raft) algorithm to quickly converge to a consistent state.
    - Centralized locking can be based on this K/V store.
  - Leader Election:
    - Not to be confused with leader election within the quorum of Etcd/Consul nodes. This is an
      implementation detail that is transparent to the user. What we are talking about here is leader
      election among the services that are registered against Etcd/Consul.
    - Etcd tabled their leader election module until the API stabilizes.
  - Other non-standard use cases:
    - Distributed locking
    - Atomic broadcast
    - Sequence numbers
    - Pointers to data in eventually consistent stores.
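Leader election among registered services, as described above, typically reduces to an atomic create on a well-known key: whichever candidate's write succeeds is the leader. The sketch below uses an in-memory mock with a check-and-set operation; the real stores expose the same idea via Etcd's compare-and-swap or Consul's ModifyIndex, and the key name here is made up.

```python
class MockKV:
    """In-memory stand-in for an Etcd/Consul K/V store with check-and-set."""
    def __init__(self):
        self.data = {}      # key -> (value, modify_index)
        self.index = 0

    def cas(self, key, expected_index, value):
        # The write succeeds only if the key's index matches what the
        # caller last observed; 0 means "key must not exist yet".
        current = self.data.get(key, (None, 0))
        if current[1] != expected_index:
            return False
        self.index += 1
        self.data[key] = (value, self.index)
        return True

def elect_leader(kv, candidates, key="service/web/leader"):
    # Every candidate races to create the key; exactly one CAS with
    # expected_index=0 can win, and that candidate becomes leader.
    return [c for c in candidates if kv.cas(key, 0, c)]

kv = MockKV()
winners = elect_leader(kv, ["node-a", "node-b", "node-c"])
```

The losers would normally watch the key and retry the election when the leader's entry expires or is deleted.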
 
- How do they behave in a distributed system?
  - All of the solutions under consideration are primarily CP systems in the CAP context.
    That is, they favor consistency over availability. This means that all nodes have a
    consistent view of written data but at the expense of availability in the event that
    a network partition occurs (e.g. loss of a node).
    - Some of these solutions will support "stale reads" in the event of node loss.
  - Each solution can work with only one node. It is generally advised that we have one etcd/
    consul per VM/physical host. We do not want to have an etcd/consul per container!
 
- Immediate problems that we are trying to solve:
  - Get and set dynamic configuration across a distributed system (e.g. things in moc.config.json):
    - This is perhaps the most pressing problem that we need to solve.
    - SCM tools like Puppet/Ansible are great for managing static configuration, but
      they are too heavyweight for dynamic changes.
  - Service registration:
    - We need to be able to spin up a track and have services make themselves visible
      via DNS.
    - This would be useful primarily outside of production where we would want to regularly
      spin up and destroy tracks.
    - That said, we don't have a highly-distributed and elastic architecture so we could get
      by without this for a while.
  - Service discovery:
    - Services must be able to determine which host to talk to for a particular service.
    - This may not be as important for production if we have a load balancer. In fact, a
      load balancer would be more transparent to our existing apps since it works at the IP level.
    - That said, we don't have a highly-distributed and elastic architecture so we could get
      by without this for a while.
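The discovery pattern above — query a registry, then pick a healthy instance — can be sketched with a plain dict standing in for the catalog; real code would hit Consul's DNS interface or the HTTP catalog endpoint instead. The service name and addresses are invented for illustration.

```python
import random

# Hypothetical snapshot of what a catalog query for "web" might return.
registry = {
    "web": [("10.0.0.5", 8080), ("10.0.0.6", 8080)],
}

def discover(service, registry):
    """Return one (host, port) for a service, chosen at random.

    Random choice gives crude client-side load spreading; a real
    load balancer would make this transparent at the IP level.
    """
    instances = registry.get(service)
    if not instances:
        raise LookupError("no healthy instances for %s" % service)
    return random.choice(instances)

host, port = discover("web", registry)
```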
 
- Features that we don't need for now:
  - Leader election. Many of our apps are currently not designed to scale horizontally.
    However, it should be noted that Consul has the ability to select a leader based on
    health checks.
 
- Problems that these tools are not designed to solve:
  - Load-balancing.
 
- Things that I've explored:
  - Etcd:
    - Basic info:
      - Service registration relies on using a key TTL along with heartbeating from the service
        to ensure the key remains available. If a service fails to update the key's TTL, Etcd
        will expire it. If a service becomes unavailable, clients will need to handle the
        connection failure and try another service instance.
      - There would be a compelling reason to favor Etcd if we ever planned to use CoreOS
        but I don't see this happening anytime soon.
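The TTL-plus-heartbeat scheme described above can be modeled with an injected clock; the class and key names below are illustrative, not Etcd's actual API.

```python
class TTLRegistry:
    """Toy model of Etcd-style key TTLs: a service sets a key with a
    TTL and must keep refreshing it, or the store expires the entry."""
    def __init__(self, clock):
        self.clock = clock          # callable returning the current time
        self.entries = {}           # key -> (value, expires_at)

    def put(self, key, value, ttl):
        self.entries[key] = (value, self.clock() + ttl)

    def refresh(self, key, ttl):
        # The heartbeat: push the expiry forward without rewriting the value.
        value, _ = self.entries[key]
        self.entries[key] = (value, self.clock() + ttl)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None or entry[1] <= self.clock():
            return None             # expired or never registered
        return entry[0]

now = [0.0]
reg = TTLRegistry(lambda: now[0])
reg.put("services/web/1", "10.0.0.5:8080", ttl=10)
now[0] = 8.0
reg.refresh("services/web/1", ttl=10)   # heartbeat lands before expiry
now[0] = 15.0
alive = reg.get("services/web/1")       # still registered
now[0] = 30.0
gone = reg.get("services/web/1")        # no further heartbeat: TTL lapsed
```

The failure-detection window is exactly the TTL: a crashed service remains visible until its last refresh ages out, which is the latency trade-off noted above.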
    - Pros:
      - Service discovery involves listing the keys under a directory and then waiting for
        changes on the directory. Since the API is HTTP based, the client application keeps a 
        long-polling connection open with the Etcd cluster.
      - Has been around longer than Consul. 150% more GitHub watches/stars.
      - 3 times as many contributors (i.e. more eyes) and forks on GitHub.
    - Cons:
      - There are claims that the Raft implementation used by Etcd (go-raft) is not quite right (unverified).
      - Immature, but by the time its use is under consideration in production, it should
        have reached 1.0.
      - Serving DNS records from Etcd may require a separate service/process (verify):
        - http://probablyfine.co.uk/2014/03/02/serving-dns-records-from-etcd/
        - SkyDNS is essentially DNS on top of Etcd
 
  - Consul:
    - Pros:
      - Has more high-level features like service monitoring.
      - There is another project out of HashiCorp that will read/set environment variables
        for processes from Consul.
        - https://github.com/hashicorp/envconsul
      - Better documentation.
        - I had an easier time installing and configuring this over Etcd, not that Etcd was
          particularly hard. Docs make all the difference.
        - Stuff like this makes me want to shed a tear. I commend the folks at HashiCorp.
          - http://www.consul.io/docs/internals/index.html
      - You can make DNS queries directly against the Consul agent! Nice! No need for SkyDNS or Helix.
      - We can add arbitrary checks! Nice, if we are into that sort of thing.
      - Understands the notion of a datacenter. Each cluster is confined to a datacenter, but
        clusters are able to communicate with other datacenters/clusters.
        - At Skybox, we might use this feature to separate docker tracks, even if they live on same host.
      - It has a rudimentary web UI:
        - http://demo.consul.io/ui/
    - Cons:
      - There are claims that Consul's implementation of Raft is better (unverified).
      - Immature. Even younger than Etcd (though there is no reason to believe there are problems with it).
 
- Etcd and Consul similarities:
  - HTTP+JSON based API. Curl-able.
  - Docker containers can talk directly with Etcd/Consul over the docker0 interface (i.e. default gateway).
  - Atomic look-before-you-set:
    - Etcd: Compare-and-set by both value and version index.
    - Consul: Check-and-set by sequence number (ModifyIndex)
  - DNS TTLs can be set to something VERY low.
    - Etcd: supports TTL (time-to-live) on both keys and directories, which is honoured:
      once a key has existed beyond its TTL, it is expired and removed.
    - Consul: By default, serves all DNS results with a 0 TTL value
  - Both have been tested with Jepsen (a tool that simulates network partitions in distributed databases).
    - Results were not 100% for either but still generally promising.
    - https://news.ycombinator.com/item?id=7884640
  - Both work with Confd by Kelsey Hightower.
    - A tool that watches Etcd/Consul and modifies config files on disk.
    - https://github.com/kelseyhightower/confd
  - Long polling for changes:
    - Etcd: Easily listen for changes to a prefix via HTTP long-polling.
    - Consul: A blocking query against some endpoints will wait for a change to potentially
      take place using long polling.
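The blocking-query pattern in the last bullet — a reader waits until the store's modify index advances past the index it last saw — can be sketched in-process with a condition variable. This mirrors the shape of Consul's `?index=` queries and Etcd's `?wait=true`, but the class below is a toy, not either API.

```python
import threading

class WatchableKV:
    """Minimal model of a blocking query: readers wait until the
    store's modify index advances past the index they last saw."""
    def __init__(self):
        self.cond = threading.Condition()
        self.index = 0
        self.data = {}

    def put(self, key, value):
        with self.cond:
            self.index += 1
            self.data[key] = value
            self.cond.notify_all()   # wake any blocked watchers

    def block_until(self, seen_index, timeout=5.0):
        # Return only once something has changed after `seen_index`
        # (or the timeout lapses, like a long-poll request timing out).
        with self.cond:
            self.cond.wait_for(lambda: self.index > seen_index, timeout)
            return self.index, dict(self.data)

kv = WatchableKV()
results = []

def watcher():
    idx, snapshot = kv.block_until(seen_index=0)
    results.append((idx, snapshot))

t = threading.Thread(target=watcher)
t.start()
kv.put("config/feature_flag", "on")   # the write unblocks the watcher
t.join()
```

Over HTTP the same idea keeps one long-polling connection open per watcher instead of a thread, which is why both systems can push config changes with near-zero delay.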
 
- Things that I have not explored:
  - SkyDNS: Anyone have good input on this one?
  - Zookeeper: It seems mature but it would take a lot more work to make it work for us.
    - We would have to configure and use it without high-level features.
    - Provides only a primitive K/V store.
    - Requires that application developers build their own system to provide service discovery.
    - Java dependency (and Dan Streit hates Java)
    - All clients must maintain active connections to the ZooKeeper servers, and perform keep-alives.
    - Zookeeper not recommended for virtual environments? Why? I just read this somewhere.
  - Corosync/Pacemaker (not sure if this is a viable solution, actually)
  - Redis is not viable! It is an in-memory K/V that does not persist data. Nope.
  - Smartstack + Synapse + Nerve from Airbnb (not viable as it only does TCP through HAProxy).
    - Ruby dependencies and many moving parts.
 
- References:
  http://www.hashicorp.com/blog/twelve-factor-consul.html   (Heroku's excellent 12-factor methodology applied to Consul).
  http://12factor.net/
  http://www.consul.io/intro/vs/index.html
  http://www.consul.io/docs/internals/index.html
  https://news.ycombinator.com/item?id=7604787
  https://news.ycombinator.com/item?id=7623317
  https://news.ycombinator.com/item?id=7884640
  http://www.activestate.com/blog/2014/03/brandon-philips-explains-etcd
  http://jpmens.net/2013/10/24/a-key-value-store-for-shared-configuration-etcd-confd/
  http://igor.moomers.org/smartstack-vs-consul/
  http://jasonwilder.com/blog/2014/02/04/service-discovery-in-the-cloud/
  http://nerds.airbnb.com/smartstack-service-discovery-cloud/

