Source: http://royal.pingdom.com/2010/06/18/the-software-behind-facebook/
A rough translation
Memcached, http://memcached.org/
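Memcached is arguably one of the best-known pieces of software on the Internet today. It is a distributed caching system that Facebook, along with thousands of other sites, places as a layer between its web servers and its MySQL servers. Over the past few years Facebook has made many improvements to Memcached and has also released some of the software it built around it.
Facebook runs thousands of Memcached servers, which together cache over ten terabytes of data. This is probably the largest Memcached cluster in the world.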
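A minimal sketch of what "a cache layer between the web servers and the database" usually means in practice (the cache-aside pattern). This is not Facebook's code: `DictCache` is a hypothetical in-memory stand-in for a Memcached client, `sqlite3` stands in for MySQL, and the `users` table and key format are made up for illustration.

```python
import sqlite3


class DictCache:
    """Hypothetical in-memory stand-in for a Memcached client."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


class UserStore:
    """Reads go to the cache first and fall back to the database on a miss."""

    def __init__(self, db, cache):
        self.db = db
        self.cache = cache
        self.db_reads = 0  # how many reads actually reached the database

    def get_name(self, user_id):
        key = f"user:{user_id}:name"
        cached = self.cache.get(key)
        if cached is not None:
            return cached  # cache hit: the database is not touched
        self.db_reads += 1
        row = self.db.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        if row is None:
            return None
        self.cache.set(key, row[0])  # populate the cache for later readers
        return row[0]

    def rename(self, user_id, new_name):
        self.db.execute(
            "UPDATE users SET name = ? WHERE id = ?", (new_name, user_id)
        )
        self.db.commit()
        # Invalidate rather than update, so the next read reloads fresh data.
        self.cache.delete(f"user:{user_id}:name")


def make_demo_store():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute("INSERT INTO users VALUES (1, 'alice')")
    db.commit()
    return UserStore(db, DictCache())
```

Calling `get_name(1)` twice on a store from `make_demo_store()` reaches the database only once; after `rename`, the cached entry is dropped and the next read goes back to the database.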
HipHop for PHP, http://wiki.github.com/facebook/hiphop-php/
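As a scripting language, PHP is simply not fast compared with natively compiled code. HipHop converts PHP code into C++ code, which can then be compiled for better performance. For a site as heavily dependent on PHP as Facebook, this means getting more performance out of its web servers than an ordinary PHP setup would provide.
A handful of Facebook engineers (three, to be exact) spent 18 months developing HipHop, and it is now ready for production use.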
Haystack, http://www.facebook.com/note.php?note_id=76191543919
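Haystack is Facebook's high-performance photo storage and retrieval system (strictly speaking, Haystack is an object store, so it is not limited to photos). It has a lot of work to do: there are at least 20 billion user-uploaded photos on Facebook, and each one is stored in 4 different resolutions, which works out to more than 80 billion photos.
It is not just a matter of being able to store tens of billions of photos; performance is what really counts. As mentioned earlier, Facebook serves around 1.2 million photo requests per second, and that does not include the images served from Facebook's CDN. That is a staggering number.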
BigPipe, http://www.facebook.com/notes/facebook-engineering/bigpipe-pipelining-web-pages-for-high-performance/389414033919
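BigPipe is a dynamic web page serving system developed by Facebook. With it, Facebook serves each web page request in sections (called "pagelets") for better performance.
For example, the chat window is handled separately, as are the news feed and the other parts of the page. These pagelets can be processed in parallel, which improves performance further. It also means that even if some part of the site is inactive or down altogether, users can still reach the rest of the site.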
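The two properties described above, parallel processing and isolation of failures, can be sketched as follows. This is only an illustration under assumptions, not BigPipe itself: the render functions and pagelet names are hypothetical, and the sketch collects all results before returning rather than sending each pagelet to the browser separately.

```python
from concurrent.futures import ThreadPoolExecutor


def render_chat():
    return "<div id='chat'>chat</div>"


def render_news_feed():
    return "<div id='feed'>news feed</div>"


def render_broken():
    # Simulates a pagelet whose backend is down.
    raise RuntimeError("backend unavailable")


def render_page(pagelets):
    """Render every pagelet in parallel; a failing one becomes a placeholder."""
    with ThreadPoolExecutor(max_workers=len(pagelets) or 1) as pool:
        futures = {name: pool.submit(fn) for name, fn in pagelets.items()}
    # Leaving the "with" block waits for all pagelets to finish.
    results = {}
    for name, future in futures.items():
        try:
            results[name] = future.result()
        except Exception:
            # One broken pagelet must not take the whole page down.
            results[name] = f"<div id='{name}'>unavailable</div>"
    return results
```

Passing `{"chat": render_chat, "feed": render_news_feed, "ads": render_broken}` to `render_page` returns the chat and feed markup unchanged and a placeholder for the failing `ads` pagelet.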
Cassandra, http://cassandra.apache.org/
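Cassandra is a distributed storage system with no single point of failure. It is a product of the NoSQL movement and has been open sourced. Facebook uses it for Inbox search.
Besides Facebook, many other sites use Cassandra, for example Digg.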
Not finished yet…
Scribe
Scribe is a flexible logging system that Facebook uses for a multitude of purposes internally. It was built to handle logging at Facebook's scale, and it automatically handles new logging categories as they show up (Facebook has hundreds).
Hadoop and Hive, http://hadoop.apache.org/
Hadoop is an open source map-reduce implementation that makes it possible to perform calculations on massive amounts of data. Facebook uses this for data analysis (and as we all know, Facebook has massive amounts of data). Hive originated from within Facebook, and makes it possible to use SQL queries against Hadoop, making it easier for non-programmers to use.
Both Hadoop and Hive are open source (Apache projects) and are used by a number of big services, for example Yahoo and Twitter.
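To make the map-reduce idea concrete, here is a toy word count in plain Python. It is a single-process sketch of the programming model only and uses no Hadoop API; the function names are made up for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

In a real cluster the map and reduce calls run on many machines and the shuffle moves data between them. A SQL layer such as Hive lets the same computation be written roughly as a `GROUP BY` with `COUNT(*)` over a (hypothetical) table of words.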
Thrift
Facebook uses several different languages for its different services. PHP is used for the front-end, Erlang is used for Chat, Java and C++ are also used in several places (and perhaps other languages as well). Thrift is an internally developed cross-language framework that ties all of these different languages together, making it possible for them to talk to each other. This has made it much easier for Facebook to keep up its cross-language development.
Facebook has made Thrift open source and support for even more languages has been added.
Varnish
Varnish is an HTTP accelerator which can act as a load balancer and also cache content which can then be served lightning-fast.
Facebook uses Varnish to serve photos and profile pictures, handling billions of requests every day. Like almost everything Facebook uses, Varnish is open source.