Post by Nancy Moore Jerrious with Bay City Cloud Providers
and Coders
Social networking is the art of connecting with those who share common interests.
Your ‘network’ is a community that helps keep you united with others and offers many benefits.
Networking via social media sites has revolutionized how we use the Internet and is at the forefront of what we now call
Web 2.0.
Facebook is social networking. People have been “facebooking” each other for about six years now, making Facebook the most used social network, with over 350 million users worldwide. But how does Facebook work?
In this article, I will discuss Facebook’s inner workings, covering its architecture
and frontend/backend infrastructure: the nuts and bolts that hold Facebook together.
How Does Facebook Work? The Front End
Facebook uses a variety of services, tools, and programming languages to make up its core infrastructure. At the
front end, their servers run a LAMP (Linux, Apache, MySQL, and PHP) stack with Memcache.
Not a computer science expert? Let’s take a look at exactly what that means.
Linux & Apache

This part is pretty self-explanatory. Linux is a Unix-like computer operating system kernel. It’s open source, very customizable, and good for security. Facebook runs the Apache HTTP Server on top of the Linux operating system. Apache is also free and is the most popular open source web server in use.
MySQL
For the database, Facebook uses MySQL because of its speed and reliability. MySQL is used primarily as a key-value store, with data distributed randomly across a large set of logical instances. These logical instances are spread out across physical nodes, and load balancing is done at the physical node level.
As far as customizations are concerned, Facebook has developed a custom partitioning scheme in which a global ID is assigned to all data. They also have a custom archiving scheme based on how frequently and how recently data is accessed on a per-user basis.
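To make the partitioning idea concrete, here is a minimal Python sketch of data spread randomly over logical instances that are in turn pinned to physical nodes. It is an illustration only, not Facebook’s actual scheme: the shard count, the node names, the ShardedStore class, and the choice to pack the shard number into the global ID are all assumptions made for this example.

```python
import random

# Assumed values for illustration; the article gives no real counts or names.
NUM_LOGICAL_SHARDS = 16
PHYSICAL_NODES = ["db-node-a", "db-node-b", "db-node-c", "db-node-d"]  # hypothetical

# Each logical shard is pinned to one physical node. Moving a shard to
# another node only requires changing this map, not the stored IDs.
SHARD_TO_NODE = {
    shard: PHYSICAL_NODES[shard % len(PHYSICAL_NODES)]
    for shard in range(NUM_LOGICAL_SHARDS)
}


def make_global_id(shard, local_id):
    """Pack the shard number and a per-shard counter into one global ID (assumed layout)."""
    return (shard << 48) | local_id


def shard_of(global_id):
    """Recover the logical shard number from a global ID."""
    return global_id >> 48


class ShardedStore:
    """Toy key-value store; plain dicts stand in for MySQL instances."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.tables = {shard: {} for shard in range(NUM_LOGICAL_SHARDS)}
        self.counters = {shard: 0 for shard in range(NUM_LOGICAL_SHARDS)}

    def put(self, value):
        # New data goes to a randomly chosen logical shard.
        shard = self.rng.randrange(NUM_LOGICAL_SHARDS)
        self.counters[shard] += 1
        global_id = make_global_id(shard, self.counters[shard])
        self.tables[shard][global_id] = value
        return global_id

    def get(self, global_id):
        return self.tables[shard_of(global_id)].get(global_id)

    def node_for(self, global_id):
        # The physical node is found through the logical shard.
        return SHARD_TO_NODE[shard_of(global_id)]
```

Because the ID alone identifies the logical shard in this sketch, a lookup never has to search every instance; only the shard-to-node map needs to change when a logical instance is moved to different hardware.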
PHP

Facebook uses PHP because it is a good web programming language with extensive support and an active developer community, and it is good for rapid iteration. PHP is a dynamically typed, interpreted scripting language.
Memcache

Memcache is a memory caching system that speeds up dynamic, database-driven websites (like Facebook) by caching data and objects in RAM to reduce read times. Memcache is Facebook’s primary form of caching and helps alleviate the database load.
Having a caching system allows Facebook to be as fast as it is at recalling your data. If your data is already in the cache, Facebook doesn’t have to go to the database; it just fetches the data from the cache based on your user ID.
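The lookup described above is commonly called the cache-aside pattern. The following Python sketch shows the idea under simplifying assumptions: plain dictionaries stand in for both Memcache and MySQL, and the UserStore class, its method names, and the "user:<id>" key format are hypothetical, not Facebook’s code.

```python
class UserStore:
    """Cache-aside lookup keyed by user ID (illustrative only)."""

    def __init__(self, database):
        self.database = database  # dict standing in for the MySQL tier
        self.cache = {}           # dict standing in for Memcache (data held in RAM)
        self.db_reads = 0         # counts how often we had to hit the database

    def get_user(self, user_id):
        key = f"user:{user_id}"
        if key in self.cache:
            # Cache hit: no database round trip needed.
            return self.cache[key]
        # Cache miss: read from the database, then remember the result.
        self.db_reads += 1
        record = self.database.get(user_id)
        if record is not None:
            self.cache[key] = record
        return record

    def update_user(self, user_id, record):
        # Write to the database and drop the stale cache entry so the
        # next read repopulates it with fresh data.
        self.database[user_id] = record
        self.cache.pop(f"user:{user_id}", None)


store = UserStore({42: {"name": "Ada"}})
store.get_user(42)  # first read goes to the database
store.get_user(42)  # second read is served from the cache
```

After these two reads, store.db_reads is 1, which is the database load that the cache saves on every repeated lookup.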
Downsides to Using LAMP
Facebook has realized that there are downsides to using the LAMP stack. Notably, PHP is not necessarily optimized for large websites and is therefore hard to scale. It is also not the fastest-executing language, and its extension framework is difficult to use.
Mike Schroepfer, Facebook’s Vice President of Engineering, recently gave an interview at EmTech@MIT on this subject. “Scaling any website is a challenge,” Schroepfer said, “but scaling a social network has unique challenges.”
He went on to say that, unlike with other websites, you can’t just add more servers to solve the problem because of Facebook’s “huge interconnected dataset.” New connections are created all the time due to user activity.
Facebook has grown so quickly that they are often faced with issues regarding database queries, caching, and data storage. Their database is huge and highly complex. To address this, Facebook has started many open source projects and backend services.
How Does Facebook Work? The Back End
Facebook’s backend services are written in a variety of programming languages, including C++, Java, Python, and Erlang. Their philosophy for the creation of services is as follows:
1. Create a service if needed
2. Create a framework/toolset for easier creation of services
3. Use the right programming language for the task
A full list of Facebook’s open source developments is available online. I will discuss a few of the essential tools that Facebook has developed.
Thrift (protocol)
Thrift is a lightweight remote procedure call framework for scalable cross-language services development. Thrift supports C++, PHP, Python, Perl, Java, Ruby, Erlang, and others. It’s quick, saves development time, and allows a division of labor between work on high-performance servers and work on applications.
Scribe (log server)
Scribe is a server for aggregating log data streamed in real time from many other servers. It is a scalable framework useful for logging a wide array of data, and it is built on top of Thrift.
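As a rough illustration of what “aggregating log data from many servers” means, here is a toy Python sketch of a central collector that groups incoming messages by category. It does not use Scribe or Thrift; the LogAggregator class, its methods, and the host and category names are all hypothetical, and direct method calls stand in for network requests.

```python
from collections import defaultdict


class LogAggregator:
    """Toy central log collector (not the real Scribe API)."""

    def __init__(self):
        # category -> list of (host, message) in arrival order
        self.by_category = defaultdict(list)

    def log(self, host, category, message):
        # In a real deployment this call would arrive over the network.
        self.by_category[category].append((host, message))

    def flush(self, category):
        # Hand back everything buffered for one category and clear it,
        # as a collector might do when writing a batch to storage.
        entries = self.by_category.pop(category, [])
        return [f"{host} {message}" for host, message in entries]


aggregator = LogAggregator()
aggregator.log("web-01", "access", "GET /home 200")
aggregator.log("web-02", "access", "GET /profile 200")
aggregator.log("web-01", "error", "timeout talking to cache")
access_lines = aggregator.flush("access")
```

Grouping by category means that logs from many machines end up in one stream per kind of event, which is the main convenience an aggregation server offers over reading each server’s files separately.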
Cassandra (database)

Cassandra is a database management system
designed to handle large amounts of data spread out across many servers. It powers Facebook’s Inbox Search feature and
provides a structured key-value store with eventual consistency.
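“Eventual consistency” can be pictured with a small Python sketch: two replicas accept writes independently, may briefly disagree, and converge once they exchange updates. This is a teaching toy under stated assumptions, not Cassandra’s implementation; the Replica class, the sync function, and the rule that the write with the larger timestamp wins are simplifications chosen for the example.

```python
class Replica:
    """One copy of a key-value store; each value carries a write timestamp."""

    def __init__(self):
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        current = self.data.get(key)
        # Keep the incoming write only if it is newer than what we hold.
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Exchange all entries in both directions so the replicas converge."""
    for key, (timestamp, value) in list(a.data.items()):
        b.write(key, value, timestamp)
    for key, (timestamp, value) in list(b.data.items()):
        a.write(key, value, timestamp)


r1, r2 = Replica(), Replica()
r1.write("inbox:1", "hello", timestamp=1)
before_sync = r2.read("inbox:1")  # r2 has not heard about the write yet
sync(r1, r2)
after_sync = r2.read("inbox:1")   # both replicas now agree
```

The short window in which r2 returns nothing is the trade-off: reads can be briefly stale, but no replica has to wait for the others before accepting a write.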
HipHop for PHP
HipHop for PHP is a source code transformer for PHP script code, created to save server resources. HipHop transforms PHP source code into optimized C++ and then uses g++ to compile it to machine code.
Conclusion
In a nutshell, that’s Facebook. This article could easily be 37 pages longer if I were to go into more detail, but to answer the question “How does Facebook work?” I think this will suffice. If you look past all of the features and innovations, the main idea behind Facebook is really very basic: keeping people connected.
Facebook realizes the power of social networking and is constantly innovating to keep their service the best in the business.