Drupal: MASTER-MASTER replication - architecture

Submitted by Jochus on Thu, 13/01/2011 - 23:59 | Posted in: Drupal
This blogpost is outdated. Please check this page for a new and better architecture.



One of the nice projects I have done at PIPA, is the setup of a MASTER-MASTER replication architecture for Drupal.

What the ... is a MASTER-MASTER replication architecture?

Imagine you have 1 webhost, with 1 database. If that webhost, or if that database, goes down, your total site is down. Our company cannot afford it to have a website going down for longer then 5 minutes. So that's why we have setup 2 webhosts, with 2 databases. If one of these webhosts, or one of theses databases goes down, the other host/database will take over. In order to notice which node is down, we needed a loadbalancer to check if the nodes were up or down. If one of them was down, the loadbalancer would stop sending requests to a node until it was up again.

Seems pretty easy, but it isn't. Why not? Because you have 2 databases that should be constantly in sync. So if you add a node on host1, it has to be immediately on host2 (and vice versa). If you create a file on host1, it has to be immediately on host2 (and vice versa).

How do you do that?

MySQL MASTER-MASTER replication

So the first thing we did, was the database replication. We configured the MASTER-MASTER replication. We configured host1 to only create new lines with even id's, host2 would only create lines with odd id's. In this way, when 2 users add a new line to the same table, on the same moment, there will never be a conflict because of the difference in id's.
Next, we disabled replication for some tables.

semaphore

Why? Because:

  • semaphore is used a locking mechanism, so multiple threads will not be able to execute same operation
  • when both of the system are trying to get a lock on the same time, a duplicate entry will rise
  • there are 2 modules which use semaphore
  • : locale and menu. Each of the actions in those modules are performed by an administrator (installing modules, themes, ...). As we only have 1 admin, operation will only performed on 1 node, so it's safe to not replicate the semaphore table

all cache tables

Why? Because:

  • ... each host would have his own caching data
  • ... the primary keys of caching tables are strings, not integers. So the trick with even/odd numbers doesn't work here :-( ... You cannot determine which id's to create on host1, and which id's to create on host2.

But each system having it's own database, introduces a new issue. If WWW1 is flushing his cache, you want the WWW2 also flushing his cache. Therefore we ...

  • ... made sure that every "flush operation" is executed through the Cache API
  • ... register all flush operations in a custom MySQL table (hacking the Cache API ... :-( ...)
  • ... use the MySQL replication to copy the flush operations to the second system
  • ... touch the second system, telling him the look into the custom MySQL table and flush his own cache

So both systems will have flushed their cache.

So, that's it for database. This setup works perfectly and bloody fast. We have done a lot of tests on this, and we were absolutely sure this would work :-) ...

Files MASTER-MASTER replication

Another issue is the synchronization of files on both the systems. First of all, I want to mention you have 2 kinds of files. You have the PHP files, which are related to Drupal core, contrib and custom modules/themes. Second of all, you have the files/ directory which holds all node images, aggregated javascripts, css files, etc, etc, ...

Source code replication

Source code replication was rather easy. We have a custom Drupal module (Release Management Module) which uploads files from our STAG server to our PROD server. This module checks the filetree on te STAG server, and compares it with host1 and host2. So if a file is missing on the production servers, it will try to upload it. If it changed, it will override the files on production. And if it was deleted, it will delete it on PROD too. This Release Management Module can be be compared with the diff program on a UNIX/LINUX system. The Release Management module extends diff by adding an option to add/change/delete.

Files replication

In order to sync the files, we started using rsync to sync files from host 1 to host 2 (and vice versa). This operation runs every even minute on the first machine, and every odd minute on the second machine.

However, this solution isn't perfect. Imagine following scenario:

  • a user deletes a file (foo.bar) on host 1
  • rsync gets started from host 2 to host 1
  • rsync notices foo.bar is missing on host 1, so it copies the file from host 2 to host 1

As the deletion of files is very rare at our website, we are currently not tracking this problem. However, I'm willing to write a "consistency" page in the backend of Drupal which checks if a file on the file system, is also in the files table of Drupal. If not, delete the file on the file system. Although, we still have to check this

Duplicate session ID's

So as we started with the above setup, we soon had some failures with the MySQL replication sync. We were suffering duplicate sid's (session id's). So a sid was generated on WWW1, and exactly the same sid was generated too on WWW2. After some reseach, MD5 wasn't so unique as we were expecting. MD5 uses the current timestamp and the remote address to calculate the hash value. But the remote address was always the IP address of the loadbalancer.
Because we didn't want to hack core (to change the remote address), we tried something else. On WWW1, we use MD5 to generate a hash value with 4 bits / character. On WWW2, we use SHA-1 to generate a hash value with 5 bits / character. In this way, we cannot have duplicate sid' as each system has it own session mechanism.

At least, that were we thinking: after a while, we received duplicate session id's (again!) on Google bots, which were browsing the site for "Google Web Preview". Those crawlers work in the cloud and they share cookies. As the session id is stored in a cookie, different IP's were using the same cookie and so the same session id. The same session id on different nodes ... that doesn't sound good.
So we ended up in hacking the core (and Dries killed a kitten :-( ...)

Index: includes/session.inc
===================================================================
--- includes/session.inc	(revision 906)
+++ includes/session.inc	(revision 911)
@@ -78,7 +78,7 @@
   else {
     // If this query fails, another parallel request probably got here first.
     // In that case, any session data generated in this request is discarded.
-    @db_query("INSERT INTO {sessions} (sid, uid, cache, hostname, session, timestamp) VALUES ('%s', %d, %d, '%s', '%s', %d)", $key, $user->uid, isset($user->cache) ? $user->cache : '', ip_address(), $value, time());
+    @db_query("INSERT IGNORE INTO {sessions} (sid, uid, cache, hostname, session, timestamp) VALUES ('%s', %d, %d, '%s', '%s', %d)", $key, $user->uid, isset($user->cache) ? $user->cache : '', ip_address(), $value, time());
   }
 
   return TRUE;

Comments

Submitted by Ilmari Oranen (not verified) on Tue, 28/02/2012 - 22:16
 

Very interesting write up. Seems very hard to find good master-master setup instructions for Drupal. A few things made me wonder though.

Why keep cache_* in the database at all? It would be easy to push them to memcache ( http://drupal.org/project/memcache ). You can do the same for sessions (either memcache or mongodb).
The Drupal mongodb module ( http://drupal.org/project/mongodb ) also supports a locking mechanism (D6 patch though), so that would be taken case aswell. And about syncing /files/ .. what about mounting the same harddrive?

Was this writeup made for Drupal 6 or Drupal 7?

Submitted by Jochus on Tue, 28/02/2012 - 22:29
 

Hi,

Thanks for your comment. I must say we are currently in the process to moving to a master - slave architecture. We are having too much trouble with the master - master setup.

About the cache_* tables: yes, at the moment I've written this blogpost, we didn't have a notice of memcache. Since a month, we are running memcache for all cache tables and sessions.

About mongodb: thanks, interesting

About syncing files: yes, you could do that, but the problem is we really wanted to seperate the whole system. We don't want to have a single-point-of-failure. If the "same harddrive" fails, your site will not be available.

The writeup is written for Drupal 6. But after reading your comment, I consider writing a second blogpost with contra's about master - master and ... most important ... our idea of the new setup. We are also considering the use of a common NAS drive where all files are stored. So we do have a single-point-of-failure on that one, but ... the chances the NAS falls down are so little ... It's better to keep the files directory clean

Jochen

"I must say we are currently in the process to moving to a master - slave architecture. We are having too much trouble with the master - master setup."

This is valuable information. We still might consider MMM (mysql master-master)-setup, but with a loadbalancer in between Drupal and MySQL-servers, which means that to Drupal it will seem like a single db. On top of that, there might be a slave for some selects, and one dedicated mysql server for a few very busy custom tables.

"The writeup is written for Drupal 6. But after reading your comment, I consider writing a second blogpost with contra's about master - master and ... most important ... our idea of the new setup."

I'd very much like to see that article also :)

"We are also considering the use of a common NAS drive where all files are stored. So we do have a single-point-of-failure on that one, but ... the chances the NAS falls down are so little ... It's better to keep the files directory clean"

At the moment I'd consider a NAS RAID-setup reliable enough for /files. Most of those files will be cached in Varnish anyway.

Hi,

Thank you for your great article, I am in the midst of a deployment which is for an organization and this article would be of immense help. I would like to know how you update from staging to production the database values, as you have mentioned in the detailed diagram that all new files can easily be updated using rsync I understood that it is easy, however I am yet to understand how we can go about with a complete upgrade of new modules from staging to production.

Submitted by Jochus on Wed, 25/04/2012 - 22:05
 

Hi Frank,

Well, that's stated in the article, no?

---
Source code replication was rather easy. We have a custom Drupal module (Release Management Module) which uploads files from our STAG server to our PROD server. This module checks the filetree on te STAG server, and compares it with host1 and host2. So if a file is missing on the production servers, it will try to upload it. If it changed, it will override the files on production. And if it was deleted, it will delete it on PROD too. This Release Management Module can be be compared with the diff program on a UNIX/LINUX system. The Release Management module extends diff by adding an option to add/change/delete.
---

This module is a custom module and not contributed to the community. If you want the source code, I'm afraid you will have to ask it to helpdesk [ A T ] pipa.be

Jochen

Add new comment

The content of this field is kept private and will not be shown publicly.

Full HTML

  • Lines and paragraphs break automatically.
  • You can caption images (data-caption="Text"), but also videos, blockquotes, and so on.
  • Web page addresses and email addresses turn into links automatically.
  • You can enable syntax highlighting of source code with the following tags: <code>, <blockcode>, <bash>, <cpp>, <css>, <html5>, <java>, <javascript>, <php>, <sql>, <xml>. The supported tag styles are: <foo>, [foo].
CAPTCHA
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.