Jun 05 2008

LinkedIn Architecture

Category: Scalability
Oren Hurvitz @ 12:20 am

At JavaOne 2008, LinkedIn employees presented two sessions about the LinkedIn architecture. The slides are available online at SlideShare; if you register, you can download them as PDFs.

This post summarizes the key parts of the LinkedIn architecture. It’s based on the presentations above, and on additional comments made during the presentation at JavaOne.

Site Statistics

  • 22 million members
  • 4+ million unique visitors/month
  • 40 million page views/day
  • 2 million searches/day
  • 250K invitations sent/day
  • 1 million answers posted
  • 2 million email messages/day


Software Stack

  • Solaris (running on Sun x86 platform and Sparc)
  • Tomcat and Jetty as application servers
  • Oracle and MySQL as DBs
  • No ORM (such as Hibernate); they use straight JDBC
  • ActiveMQ for JMS. (It’s partitioned by message type and backed by MySQL.)
  • Lucene as a foundation for search
  • Spring as glue

Server Architecture


  • One monolithic web application
  • One database: the Core Database
  • The network graph is cached in memory in The Cloud
  • Member Search is implemented using Lucene. It runs on the same server as The Cloud, because member searches must be filtered according to the searching user’s network, so it’s convenient to have them co-located.
  • WebApp updates the Core Database directly. The Core Database updates The Cloud.


  • Added Replica DBs to reduce the load on the Core Database. They contain read-only data. A RepDB server manages updates of the Replica DBs.
  • Moved Search out of The Cloud and into its own server.
  • Changed the way updates are handled, by adding the Databus. This is a central component that distributes updates to any component that needs them. This is the new updates flow:
    • Changes originate in the WebApp
    • The WebApp updates the Core Database
    • The Core Database sends updates to the Databus
    • The Databus sends the updates to the Replica DBs, The Cloud, and Search
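The update flow above can be sketched as a simple publish/subscribe dispatcher. This is a minimal, synchronous illustration, not LinkedIn’s implementation; the `Databus` class and `UpdateListener` interface are invented for the example.

```java
import java.util.*;

// Minimal sketch of a Databus-style dispatcher: the Core Database pushes
// each committed change to the bus, and the bus fans it out to every
// registered consumer (Replica DBs, The Cloud, Search). Names illustrative.
public class Databus {
    public interface UpdateListener {
        void onUpdate(String update);
    }

    private final List<UpdateListener> listeners = new ArrayList<>();

    public void register(UpdateListener l) { listeners.add(l); }

    /** Called by the Core Database whenever a change is committed. */
    public void publish(String update) {
        for (UpdateListener l : listeners) {
            l.onUpdate(update);   // fan out to Replica DBs, The Cloud, Search
        }
    }
}
```

A production bus would be asynchronous and persistent; the sketch is synchronous purely to show the fan-out shape.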


  • The WebApp doesn’t do everything itself anymore: they split parts of its business logic into Services.
    The WebApp still presents the GUI to the user, but now it calls Services to manipulate the Profile, Groups, etc.
  • Each Service has its own domain-specific database (i.e., vertical partitioning).
  • This architecture allows other applications (besides the main WebApp) to access LinkedIn. They’ve added applications for Recruiters, Ads, etc.

The Cloud

  • The Cloud is a server that caches the entire LinkedIn network graph in memory.
  • Network size: 22M nodes, 120M edges.
  • Requires 12 GB RAM.
  • There are 40 instances in production
  • Rebuilding an instance of The Cloud from disk takes 8 hours.
  • The Cloud is updated in real-time using the Databus.
  • Persisted to disk on shutdown.
  • The cache is implemented in C++, accessed via JNI. They chose C++ instead of Java for two reasons:
    • To use as little RAM as possible.
    • Garbage Collection pauses were killing them. [LinkedIn said they were using advanced GCs, but GCs have improved since 2003; is this still a problem today?]
  • Having to keep everything in RAM is a limitation, but as LinkedIn have pointed out, partitioning graphs is hard.
  • [Sun offers servers with up to 2 TB of RAM (Sun SPARC Enterprise M9000 Server), so LinkedIn could support up to 1.1 billion users before they run out of memory. (This calculation is based only on the number of nodes, not edges). Price is another matter: Sun say only “contact us for price”, which is ominous considering that the prices they do list go up to $30,000.]
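To illustrate why caching the whole graph in RAM pays off, here is a minimal Java sketch of an in-memory network graph with a BFS degree-of-separation query. LinkedIn’s actual cache is C++ behind JNI; the class and method names here are assumptions, not their API.

```java
import java.util.*;

// Sketch of an in-memory network graph like The Cloud: adjacency sets per
// member, plus a BFS that computes the degree of separation between two
// members. Holding the full graph in RAM is what makes this query cheap.
public class NetworkGraph {
    private final Map<Long, Set<Long>> adjacency = new HashMap<>();

    public void addConnection(long a, long b) {
        adjacency.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    /** Degree of separation via breadth-first search, or -1 if not connected. */
    public int degree(long from, long to) {
        if (from == to) return 0;
        Set<Long> visited = new HashSet<>(List.of(from));
        Queue<Long> queue = new ArrayDeque<>(List.of(from));
        for (int depth = 1; !queue.isEmpty(); depth++) {
            for (int n = queue.size(); n > 0; n--) {       // one BFS level
                for (long next : adjacency.getOrDefault(queue.poll(), Set.of())) {
                    if (next == to) return depth;
                    if (visited.add(next)) queue.add(next);
                }
            }
        }
        return -1;
    }
}
```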

The Cloud caches the entire LinkedIn Network, but each user needs to see the network from his own point of view. It’s computationally expensive to calculate that, so they do it just once when a user session begins, and keep it cached. That takes up to 2 MB of RAM per user. This cached network is not updated during the session. (It is updated if the user himself adds/removes a link, but not if any of the user’s contacts make changes. LinkedIn says users won’t notice this.)

As an aside, they use Ehcache to cache members’ profiles. They cache up to 2 million profiles (out of 22 million members). They tried caching using LFU algorithm (Least Frequently Used), but found that Ehcache would sometimes block for 30 seconds while recalculating LFU, so they switched to LRU (Least Recently Used).
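The LRU behavior they switched to can be sketched in a few lines with `LinkedHashMap` in access order. This only illustrates why LRU eviction is cheap (constant time per insert, no periodic recalculation like the LFU pass that caused the 30-second blocks); it is not how Ehcache implements it.

```java
import java.util.*;

// Sketch of an LRU profile cache: LinkedHashMap in access order evicts the
// least recently used entry in O(1) whenever the cache exceeds maxEntries.
public class ProfileCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public ProfileCache(int maxEntries) {
        super(16, 0.75f, true);  // true = access order, i.e. LRU
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;  // evict when over capacity
    }
}
```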

Communication Architecture

Communication Service

The Communication Service is responsible for permanent messages, e.g. InBox messages and emails.

  • The entire system is asynchronous and uses JMS heavily
  • Clients post messages via JMS
  • Messages are then routed via a routing service to the appropriate mailbox or directly for email processing
  • Message delivery: either Pull (clients request their messages), or Push (e.g., sending emails)
  • They use Spring, with proprietary LinkedIn Spring extensions, and HTTP-RPC.
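A minimal sketch of this asynchronous flow, using an in-memory `BlockingQueue` as a stand-in for the JMS broker (ActiveMQ in LinkedIn’s case). The routing rule and class names are invented for the example.

```java
import java.util.concurrent.*;

// Sketch of the asynchronous flow: clients post messages and return
// immediately; a router consumes them and delivers each one to a mailbox
// or to email processing. BlockingQueue stands in for the JMS broker.
public class MessageRouter {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    /** Client side: fire and forget. */
    public void post(String message) { queue.offer(message); }

    /** Router side: deliver the next queued message to its destination. */
    public String routeNext() {
        try {
            String message = queue.take();
            return message.startsWith("email:") ? "email-processor" : "mailbox";
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }
}
```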

Scaling Techniques

  • Functional partitioning: sent, received, archived, etc. [a.k.a. vertical partitioning]
  • Class partitioning: Member mailboxes, guest mailboxes, corporate mailboxes
  • Range partitioning: Member ID range; Email lexicographical range. [a.k.a. horizontal partitioning]
  • Everything is asynchronous
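Range partitioning by member ID can be sketched as a pure routing function. The range size and class name are made up; the point is that a mailbox’s home partition is a function of the member ID alone, so no lookup table is needed on the request path.

```java
// Sketch of member-ID range partitioning for mailboxes: members
// 0..rangeSize-1 live on partition 0, the next range on partition 1, etc.
public class MailboxRouter {
    private final long rangeSize;

    public MailboxRouter(long rangeSize) { this.rangeSize = rangeSize; }

    /** The partition holding this member's mailbox. */
    public int partitionFor(long memberId) {
        return (int) (memberId / rangeSize);
    }
}
```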

Network Updates Service

The Network Updates Service is responsible for short-lived notifications, e.g. status updates from your contacts.

Initial Architecture (up to 2007)

  • There are many services that can contain updates.
  • Clients make separate requests to each service that can have updates: Questions, Profile Updates, etc.
  • It took a long time to gather all the data.

In 2008 they created the Network Updates Service. The implementation went through several iterations:

Iteration 1

  • Client makes just one request, to the NetworkUpdateService.
  • NetworkUpdateService makes multiple requests to gather the data from all the services. These requests are made in parallel.
  • The results are aggregated and returned to the client together.
  • Pull-based architecture.
  • They rolled out this new system to everyone at LinkedIn at once, which caused problems while the system was stabilizing. In hindsight, they should have tried it out on a small subset of users first.
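The parallel fan-out of Iteration 1 can be sketched with an `ExecutorService`. The service list and names are hypothetical; the point is that the aggregator waits only as long as the slowest service, rather than the sum of all of them.

```java
import java.util.*;
import java.util.concurrent.*;

// Sketch of Iteration 1: the client makes one request to
// NetworkUpdateService, which fans out to the underlying services in
// parallel (modeled as Callables) and aggregates the results.
public class NetworkUpdateService {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    /** Gathers updates from all services in parallel, in a stable order. */
    public List<String> gatherUpdates(List<Callable<String>> services) {
        try {
            List<String> updates = new ArrayList<>();
            for (Future<String> f : pool.invokeAll(services)) {
                updates.add(f.get());  // invokeAll waits for all to finish
            }
            return updates;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
    }

    public void shutdown() { pool.shutdown(); }
}
```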

Iteration 2

  • Push-based architecture: whenever events occur in the system, add them to the user’s "mailbox". When a client asks for updates, return the data that’s already waiting in the mailbox.
  • Pros: reads are much quicker since the data is already available.
  • Cons: might waste effort on moving around update data that will never be read. Requires more storage space.
  • There is still post-processing of updates before returning them to the user. E.g.: collapse 10 updates from a user to 1.
  • The updates are stored in CLOBs: one CLOB per update-type per user (for a total of 15 CLOBs per user).
  • Incoming updates must be appended to the CLOB. They use optimistic locking to avoid lock contention.
  • They had set the CLOB size to 8 KB, which was too large and led to a lot of wasted space.
  • Design note: instead of CLOBs, LinkedIn could have created additional tables, one for each type of update. They didn’t do this because of what happens when updates expire: with additional tables they would have had to delete rows, and that’s very expensive.
  • They used JMX to monitor and change the configuration in real-time. This was very helpful.
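The optimistic-locking pattern for appending to a per-user CLOB can be sketched as a retry loop. Here an `AtomicReference` stands in for the database row and its version column; the real thing would be an `UPDATE ... WHERE version = ?` statement that retries on a zero row count. All names are invented for the example.

```java
import java.util.concurrent.atomic.*;

// Sketch of optimistic locking: read the current value and its version,
// build the new value, and write only if the version is unchanged;
// otherwise re-read and retry. No lock is ever held.
public class OptimisticMailbox {
    /** Value plus version, standing in for a row with a version column. */
    static final class Row {
        final String clob; final long version;
        Row(String clob, long version) { this.clob = clob; this.version = version; }
    }

    private final AtomicReference<Row> row = new AtomicReference<>(new Row("", 0));

    public void append(String update) {
        while (true) {                       // retry loop instead of locking
            Row current = row.get();
            Row next = new Row(current.clob + update + ";", current.version + 1);
            if (row.compareAndSet(current, next)) return;  // write succeeded
            // else: a concurrent writer got there first; try again
        }
    }

    public String contents() { return row.get().clob; }
}
```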

Iteration 3

  • Goal: improve speed by reducing the number of CLOB updates, because CLOB updates are expensive.
  • Added an overflow buffer: a VARCHAR(4000) column where data is added initially. When this column is full, dump it to the CLOB. This eliminated 90% of CLOB updates.
  • Reduced the size of the updates.
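The overflow-buffer idea can be sketched as follows. The 4000-character threshold matches the VARCHAR(4000) column described above; everything else, including the write counter, is illustrative.

```java
// Sketch of Iteration 3's overflow buffer: new updates land in a small
// VARCHAR-style buffer, and only when that buffer is full is it flushed
// into the expensive-to-update CLOB, so most writes never touch the CLOB.
public class UpdateStore {
    private final StringBuilder buffer = new StringBuilder(); // VARCHAR(4000) stand-in
    private final StringBuilder clob = new StringBuilder();   // CLOB stand-in
    private int clobWrites = 0;

    public void add(String update) {
        if (buffer.length() + update.length() > 4000) {
            clob.append(buffer);   // the one expensive write
            clobWrites++;
            buffer.setLength(0);
        }
        buffer.append(update);
    }

    public int clobWrites() { return clobWrites; }
}
```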

[LinkedIn have had success in moving from a Pull architecture to a Push architecture. However, don’t discount Pull architectures. Amazon, for example, use a Pull architecture. In A Conversation with Werner Vogels, Amazon’s CTO, he said that when you visit the front page of Amazon they typically call more than 100 services in order to construct the page.]

The presentation ends with some tips about scaling. These are oldies but goodies:

  • Can’t use just one database. Use many databases, partitioned horizontally and vertically.
  • Because of partitioning, forget about referential integrity or cross-domain JOINs.
  • Forget about 100% data integrity.
  • At large scale, cost is a problem: hardware, databases, licenses, storage, power.
  • Once you’re large, spammers and data-scrapers come a-knocking.
  • Cache!
  • Use asynchronous flows.
  • Reporting and analytics are challenging; consider them up-front when designing the system.
  • Expect the system to fail.
  • Don’t underestimate your growth trajectory.

47 Responses to “LinkedIn Architecture”

  1. TheBull says:

    It is nice to see another big java implementation, not to mention one that uses a lot of good open source tools.

  2. Xceptance Blog » Blog Archiv » LinkedIn-Architektur says:

    […] anyone who wants to know how modern large websites run can look at and read through the LinkedIn Architecture here. And once again, Lucene is the search engine of choice. Interesting is the […]

  3. ero says:

    Java rules. We’re using the same set of tools, but lack a good design…
    Well done, LinkedIn team!!!

  4. Arquitectura de LinkedIn says:

    […] LinkedIn Architecture hurvitz.org/blog/2008/06/linkedin-architecture by jccpUEM a few seconds ago […]

  5. links for 2008-06-05 « Brent Sordyl’s Blog says:

    […] LinkedIn Architecture At JavaOne 2008, LinkedIn employees presented two sessions about the LinkedIn architecture. The slides are available online: […]

  6. Do you use ActiveMQ? at Thinking Outloud says:

    […] I read today where LinkedIN uses ActiveMQ as part of their architecture. I haven’t used ActiveMQ personally, but we use it for a project where I work and the guy […]

  7. Daniel Molina says:

    I agree, it’s nice to look at a large-scale Java application. I will take a look at the presentations to see if they can be used as examples for building similar applications. Thanks…

  8. Another next big thing says:

    […] used memcached and just accepted that data served would be stale, old and wrong. So it was nice to read about LinkedIn’s architecture – where the headlines are: * Mostly Java. Java might be hysterically slow, but in this space it’s […]

  9. Tudor Galos's blog : Arhitectura LinkedIn says:

    […] found HERE a very interesting post about the LinkedIn architecture. Also interesting are the […]

  10. cherouvim says:

    Superb info! Thanks.

  11. links for 2008-06-05 « 個人的な雑記 says:

    […] Cookies are for Closers » LinkedIn Architecture (tags: linkedin scalability architecture performance java programming) […]

  12. links for 2008-06-06 « Simply… A User says:

    […] Cookies are for Closers » LinkedIn Architecture (tags: architecture scalability java linkedin performance programming hardware reference code **) […]

  13. The Real-Time Cloud Architecture of LinkedIn’s Massive Netwo | WhiteSandsDigital.com says:

    […] entire LinkedIn network graph in memory. Network size: 22M nodes, 120M edges. Requires 12 GB RAM… read more | digg […]

  14. asdf says:

    Lots of technology. Wish it were used to support a service that actually provided something of value.

  15. links for 2008-06-06 | Bieber Labs says:

    […] Cookies are for Closers » LinkedIn Architecture Interesting post describing the architecture of the LinkedIn application. (tags: architecture apps scalability) […]

  16. links for 2008-06-06 « memor.ia blog says:

    […] Cookies are for Closers » LinkedIn Architecture (tags: architecture memori.us Startup scalability) […]

  17. Weekly linkdump #129 - max - блог разработчиков says:

    […] Interesting facts and details about the architecture of the LinkedIn service: Cookies are for Closers » LinkedIn Architecture […]

  18. Visa Kopu » LinkedInin teknisestä toteutuksesta says:

    […] are for Closers: LinkedIn Architecture. LinkedIn uses Java with Tomcat and Jetty, ActiveMQ for JMS traffic, straight […]

  19. Markus Kohler (Java Performance blog) says:

    12 GB should not be a problem anymore today. The CMS (concurrent mark-and-sweep) collector available with the Sun JVM should avoid the pauses almost always.

  20. Francesco Biacca blog » J2EE e LinkedIn says:

    […] hurvitz.org) Tags: j2ee, jetty, linkedin, spring, […]

  21. Igor says:

    Unfortunately there is no information about the number of servers used for each service (only one number, the ’40 instances of the graph’). It would be rather interesting to get at least some rough estimates of the cluster size and the geographical distribution of the servers (if there is any).

  22. Global Nerdy | LinkedIn’s Architecture says:

    […] are for Closers (great blog name!) has a post about LinkedIn’s architecture, featuring links to slide decks for two JavaOne 2008 presentations by people from LinkedIn and an […]

  23. Dare Obasanjo aka Carnage4Life - Velocity: A Distributed In-Memory Cache from Microsoft says:

    […] any modern stories of the architectures of popular Web sites today such as the recently published overview of the LinkedIn social networking site’s Web architecture, you will notice a heavy usage of in-memory caching to improve performance. Popular web sites built […]

  24. Evolutionary Goo » Blog Archive » Its OK to code that Web 2.0 app in Java says:

    […] For more information on LinkedIn’s implementation, see this excellent article. […]

  25. Evgeny Rippi says:

    However, I’ve run into some problems on LinkedIn. For instance, once my user image didn’t show up.

  26. American Jeff says:

    I can attest that Java 5/6 GC can be a real problem once you’ve got more than a few GB of data structures in memory. It took a lot of experimentation to arrive at a mix of configs that worked tolerably. The key was finding the right balance of MaxNewSize and SurvivorRatio to minimize the tenuring of garbage.

  27. Neuronus says:

    What?!? Does somebody really not use Hibernate?

  28. Java the right way | Denis’ Blog says:

    […] for Football it is over for us, let us dream that we are designing the right way ). Have a look at their presentations. No Hibernate, Agile methodology and scalability in mind. Great […]

  29. Slava Imeshev says:

    LinkedIn could look at our Cacheonix. Cacheonix is a Data Grid solution: it lets you split cached data across multiple JVMs with smaller heap sizes, so you can have a very large cache without the disadvantage of long GC pauses.

  30. Maverixk says:

    Looked at iterations 1 thru 3. Another alternative would be as follows:
    – Have an exclusive table for all updates for each day
    – Have a column in that table to indicate the update type (insert/delete/update etc.)
    – Use the column to affect the insert/update/delete etc.
    – Truncate the table the next day

  31. Paul Murphy mobile edition says:

    […] to one of those by pointing to an interesting discussion of the linkedin.com architecture on Oren Hurvitz’s “Cookies are for closers” blog […]

  32. AmBAr Amarelo says:

    Good (and hard) job :)

  33. L’architecture LinkedIn | episode 2 says:

    […] you will find (in English) an article on Oren Hurvitz’s blog that covers the main points of the presentation. For the slides of the […]

  34. Greg says:

    There is a lot to digest in this article from a technical perspective; if we could adopt only 20% of it in our design, that would be awesome.
    Now if only we had other large websites sharing their architectures, we could all benefit.

  35. nvrijn says:

    Garbage Collection pauses are not a problem if you go with the Azul Systems Java compute appliance. We’ve seen > 100 GB heaps without a significant GC pause. (Yeah … surprised the heck outta me too.) It takes an integrated solution though.

  36. Архитектура LinkedIn | Insight IT says:

    […] news of the publication of two presentations from JavaOne 2008 about LinkedIn, and their summary by Oren Hurvitz, swept through the Russian-language news […]

  38. Are Cloud Based Memory Architectures the Next Big Thing? | Unix Stuff says:

    […] isn’t the only one getting a performance bang from moving data into memory. Both LinkedIn and Digg keep the graph of their network social network in memory. Facebook has northwards of 800 […]

  39. Daily Find #120 | TechToolBlog says:

    […] LinkedIn Architecture […]

  40. Dare Obasanjo aka Carnage4Life - Some thoughts on memory based architectures (aka why memcached isn't good enough) says:

    […] isn’t the only one getting a performance bang from moving data into memory. Both LinkedIn and Digg keep the graph of their network social network in memory. Facebook has northwards of 800 […]

  41. GPS Humano » Blog Archive » Este fim de semana… says:

    […] and Google knows it better than anyone. From that rant I ended up at an overview of the LinkedIn architecture and, sadness of sadnesses, realized I had missed a great opportunity to get the […]

  42. AmrD says:

    I think it’s time to introduce a “Community/Social Application Framework” by LinkedIn team.

  43. How do sites like LinkedIn efficiently display 1st/2nd/3rd-level relationship next to each persons name?(Resolved) - Tech Forum Network says:

    […] that the question was about an optimal solution, regardless of how LinkedIn actually does it today, which I looked up after I wrote my own answer […]