Facebook’s architecture (repost)

Tags: underlying architecture, architecture, facebook | Published: 2011-05-08 13:46 | Author: 千石 JingSQ

Based on various readings and conversations, my understanding of Facebook’s current architecture is:

* Web front-end written in PHP. Facebook’s HipHop [1] then converts it to C++ and compiles it using g++, thus providing a high performance templating and Web logic execution layer
* Business logic is exposed as services using Thrift [2]. These services are implemented in PHP, C++ or Java depending on service requirements (other languages are probably used as well)
* Services implemented in Java don’t use a conventional enterprise application server but rather Facebook’s own custom application server. At first this can look like reinventing the wheel, but as these services are exposed and consumed only (or mostly) through Thrift, the overhead of Tomcat, or even Jetty, was probably too high with no significant added value for their needs.
* Persistence is done using MySQL, Memcached [3], Facebook’s Cassandra [4], Hadoop’s HBase [5]. Memcached is used as a cache for MySQL as well as a general purpose cache. Facebook engineers admit that their use of Cassandra is currently decreasing as they now prefer HBase for its simpler consistency model and its MapReduce ability.
* Offline processing is done using Hadoop and Hive
* Data such as logs, clicks and feeds is transported using Scribe [6], then aggregated and stored in HDFS using Scribe-HDFS [7], thus allowing extended analysis using MapReduce
* BigPipe [8] is their custom technology to accelerate page rendering using a pipelining logic
* Varnish Cache [9] is used for HTTP proxying. They preferred it for its high performance and efficiency [10].
* The storage of the billions of photos posted by the users is handled by Haystack, an ad-hoc storage solution developed by Facebook which brings low level optimizations and append-only writes [11].
* Facebook Messages uses its own architecture, which is notably based on infrastructure sharding and dynamic cluster management. Business logic and persistence are encapsulated in so-called ‘Cells’. Each Cell handles a subset of users; new Cells can be added as popularity grows [12]. Persistence is achieved using HBase [13].
* Facebook Messages’ search engine is built with an inverted index stored in HBase [14]
* Facebook Search Engine’s implementation details are unknown as far as I know
* The typeahead search uses a custom storage and retrieval logic [15]
* Chat is based on an Epoll server developed in Erlang and accessed using Thrift [16]
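
The list above says Memcached sits in front of MySQL as a cache. One common way to arrange this is the look-aside (cache-aside) pattern, sketched below in Python. This is only an illustration under stated assumptions: the source does not describe Facebook's actual client code, and the `LookAsideCache` class, its `db_reads` counter, the in-memory dict standing in for Memcached and the `fake_table` dict standing in for MySQL are all hypothetical.

```python
from typing import Callable, Dict, Optional


class LookAsideCache:
    """Check the cache first; on a miss, read the database and fill the cache."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]) -> None:
        self._cache: Dict[str, str] = {}    # hypothetical stand-in for Memcached
        self._load_from_db = load_from_db   # hypothetical stand-in for a MySQL query
        self.db_reads = 0                   # counts how often the database is hit

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:
            return self._cache[key]         # cache hit: no database access
        self.db_reads += 1
        value = self._load_from_db(key)     # cache miss: fall back to the database
        if value is not None:
            self._cache[key] = value        # populate the cache for later reads
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after the underlying row changes, so the next read reloads it."""
        self._cache.pop(key, None)


# Hypothetical rows standing in for a MySQL table.
fake_table = {"user:1": "alice", "user:2": "bob"}

cache = LookAsideCache(fake_table.get)
first = cache.get("user:1")    # miss: reads the "database"
second = cache.get("user:1")   # hit: served from the cache
```

After the two reads, `cache.db_reads` is 1, since only the first call reaches the backing store. Calling `cache.invalidate("user:1")` after a write forces the next read to go back to the database.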

About the resources provisioned for each of these components, some information and numbers are known:

* Facebook is estimated to own more than 60,000 servers [17]. Their recent datacenter in Prineville, Oregon is based entirely on self-designed hardware [18], which was recently unveiled as the Open Compute Project [19].
* 300 TB of data is stored in Memcached processes [20]
* Their Hadoop and Hive cluster is made of 3,000 servers, each with 8 cores, 32 GB of RAM and 12 TB of disk, for a total of 24,000 cores, 96 TB of RAM and 36 PB of disk [20]
* 100 billion hits per day, 50 billion photos, 3 trillion objects cached, 130 TB of logs per day as of July 2010 [21]
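
The Hadoop and Hive cluster totals quoted above follow directly from the per-server figures. The short Python check below reproduces them, assuming decimal units (1 TB = 1,000 GB, 1 PB = 1,000 TB); the variable names are only illustrative.

```python
# Per-server figures as quoted for the Hadoop and Hive cluster.
SERVERS = 3000
CORES_PER_SERVER = 8
RAM_GB_PER_SERVER = 32
DISK_TB_PER_SERVER = 12

# Assumption: decimal units, i.e. 1 TB = 1000 GB and 1 PB = 1000 TB.
total_cores = SERVERS * CORES_PER_SERVER             # 24000 cores
total_ram_tb = SERVERS * RAM_GB_PER_SERVER / 1000    # 96.0 TB of RAM
total_disk_pb = SERVERS * DISK_TB_PER_SERVER / 1000  # 36.0 PB of disk

print(total_cores, total_ram_tb, total_disk_pb)
```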

[1] HipHop for PHP: http://developers.facebook.com/blog/post/358
[2] Thrift: http://thrift.apache.org/
[3] Memcached: http://memcached.org/
[4] Cassandra: http://cassandra.apache.org/
[5] HBase: http://hbase.apache.org/
[6] Scribe: https://github.com/facebook/scribe
[7] Scribe-HDFS: http://hadoopblog.blogspot.com/2009/06/hdfs-scribe-integration.html
[8] BigPipe: http://www.facebook.com/notes/facebook-engineering/bigpipe-pipelining-web-pages-for-high-performance/389414033919
[9] Varnish Cache: http://www.varnish-cache.org/
[10] Facebook goes for Varnish: http://www.varnish-software.com/customers/facebook
[11] Needle in a haystack: efficient storage of billions of photos: http://www.facebook.com/note.php?note_id=76191543919
[12] Scaling the Messages Application Back End: http://www.facebook.com/note.php?note_id=10150148835363920
[13] The Underlying Technology of Messages: https://www.facebook.com/note.php?note_id=454991608919
[14] The Underlying Technology of Messages Tech Talk: http://www.facebook.com/video/video.php?v=690851516105
[15] Facebook’s typeahead search architecture: http://www.facebook.com/video/video.php?v=432864835468
[16] Facebook Chat: http://www.facebook.com/note.php?note_id=14218138919
[17] Who has the most Web Servers?: http://www.datacenterknowledge.com/archives/2009/05/14/whos-got-the-most-web-servers/
[18] Building Efficient Data Centers with the Open Compute Project: http://www.facebook.com/note.php?note_id=10150144039563920
[19] Open Compute Project: http://opencompute.org/
[20] Facebook’s architecture presentation at Devoxx 2010: http://www.devoxx.com
[21] Scaling Facebook to 500 million users and beyond: http://www.facebook.com/note.php?note_id=409881258919
