Adding Google repository on Ubuntu Lucid Lynx (10.04)

Submitted by Jochus on Sun, 16/01/2011 - 12:59

If you want to add the Google repository to your Ubuntu system (e.g.: you want to install google-chrome), you should add:

deb http://dl.google.com/linux/deb/ stable main

... to the list of repositories. Next, you need to get a public key to fetch software from the repository. Execute the following command:

$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys A040830F7FAC5991

... where A040830F7FAC5991 is the key of the Google repository.



To install Google Chrome:
$ sudo aptitude install google-chrome-stable

Drupal: MASTER-MASTER replication - architecture

Submitted by Jochus on Thu, 13/01/2011 - 23:59 | Posted in: Drupal
This blogpost is outdated. Please check this page for a new and better architecture.



One of the nice projects I did at PIPA was the setup of a MASTER-MASTER replication architecture for Drupal.

What the ... is a MASTER-MASTER replication architecture?

Imagine you have 1 webhost, with 1 database. If that webhost, or that database, goes down, your whole site is down. Our company cannot afford to have a website down for longer than 5 minutes. So that's why we set up 2 webhosts, with 2 databases. If one of these webhosts, or one of these databases, goes down, the other host/database will take over. In order to notice which node is down, we needed a loadbalancer to check if the nodes were up or down. If one of them was down, the loadbalancer would stop sending requests to that node until it was up again.

Seems pretty easy, but it isn't. Why not? Because you have 2 databases that should be constantly in sync. So if you add a node on host1, it has to appear immediately on host2 (and vice versa). If you create a file on host1, it has to appear immediately on host2 (and vice versa).

How do you do that?

MySQL MASTER-MASTER replication

So the first thing we did was the database replication. We configured the MASTER-MASTER replication. We configured host1 to only create new rows with even ids; host2 would only create rows with odd ids. In this way, when 2 users add a new row to the same table at the same moment, there will never be a conflict, because the ids always differ.
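The even/odd trick maps directly onto two MySQL server settings; a minimal my.cnf sketch (the server-ids are illustrative, and the real config of course contained more replication options):

```ini
# host1: hand out ids 2, 4, 6, ...
[mysqld]
server-id                = 1
auto_increment_increment = 2   # step between successive auto ids
auto_increment_offset    = 2   # first id this server generates

# host2 would use the mirror image:
#   server-id                = 2
#   auto_increment_increment = 2
#   auto_increment_offset    = 1   # ids 1, 3, 5, ...
```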
Next, we disabled replication for some tables.

semaphore

Why? Because:

  • semaphore is used as a locking mechanism, so multiple threads will not be able to execute the same operation
  • when both systems try to get a lock at the same time, a duplicate entry error will arise
  • there are 2 modules which use semaphore: locale and menu. Each of the actions in those modules is performed by an administrator (installing modules, themes, ...). As we only have 1 admin, operations will only be performed on 1 node, so it's safe not to replicate the semaphore table

all cache tables

Why? Because:

  • ... each host would have its own caching data
  • ... the primary keys of caching tables are strings, not integers. So the trick with even/odd numbers doesn't work here :-( ... You cannot determine which ids to create on host1, and which ids to create on host2.
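These exclusions can be expressed with MySQL's replication filters; a sketch, assuming the Drupal database is simply named drupal (the real database name and the exact list of cache tables will differ per site):

```ini
[mysqld]
# do not replicate the semaphore table
replicate-ignore-table      = drupal.semaphore
# do not replicate any of the cache tables (cache, cache_menu, ...)
replicate-wild-ignore-table = drupal.cache%
```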

But each system having its own cache tables introduces a new issue. If WWW1 is flushing its cache, you want WWW2 to flush its cache as well. Therefore we ...

  • ... made sure that every "flush operation" is executed through the Cache API
  • ... registered all flush operations in a custom MySQL table (hacking the Cache API ... :-( ...)
  • ... used the MySQL replication to copy the flush operations to the second system
  • ... touched the second system, telling it to look into the custom MySQL table and flush its own cache

So both systems will have flushed their cache.
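The flush-operation log can be pictured as a small replicated table; a hypothetical sketch (the table and column names are illustrative, not the ones we actually used):

```sql
-- One row per flush operation. This table IS replicated, so the
-- second host can replay each flush against its own cache tables.
CREATE TABLE cache_flush_log (
  flushed INT UNSIGNED NOT NULL,   -- timestamp of the flush
  tbl     VARCHAR(64)  NOT NULL,   -- which cache table was flushed
  cid     VARCHAR(255) NOT NULL,   -- cache id, or '*' for a wildcard flush
  PRIMARY KEY (flushed, tbl, cid)
);
```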

So, that's it for the database. This setup works perfectly and is bloody fast. We have done a lot of tests on this, and we were absolutely sure this would work :-) ...

Files MASTER-MASTER replication

Another issue is the synchronization of files on both systems. First of all, I want to mention that there are 2 kinds of files. You have the PHP files, which are related to Drupal core, contrib and custom modules/themes. Second of all, you have the files/ directory, which holds all node images, aggregated javascripts, css files, etc, etc, ...

Source code replication

Source code replication was rather easy. We have a custom Drupal module (Release Management Module) which uploads files from our STAG server to our PROD server. This module checks the filetree on the STAG server and compares it with host1 and host2. So if a file is missing on the production servers, it will try to upload it. If it changed, it will overwrite the files on production. And if it was deleted, it will delete it on PROD too. This Release Management Module can be compared with the diff program on a UNIX/Linux system. The Release Management module extends diff by adding an option to add/change/delete.

Files replication

In order to sync the files, we started using rsync to sync files from host 1 to host 2 (and vice versa). This operation runs every even minute on the first machine, and every odd minute on the second machine.

However, this solution isn't perfect. Imagine following scenario:

  • a user deletes a file (foo.bar) on host 1
  • rsync gets started from host 2 to host 1
  • rsync notices foo.bar is missing on host 1, so it copies the file from host 2 to host 1

As the deletion of files is very rare on our website, we are currently not handling this problem. However, I'm willing to write a "consistency" page in the backend of Drupal which checks if a file on the file system is also in the files table of Drupal. If not, delete the file on the file system. Although, we still have to check this.

Duplicate session ID's

So as we started with the above setup, we soon had some failures with the MySQL replication sync. We were suffering from duplicate sids (session ids). So a sid was generated on WWW1, and exactly the same sid was generated on WWW2. After some research, the session ids weren't as unique as we were expecting: PHP's MD5-based session id is calculated from the current timestamp and the remote address. But the remote address was always the IP address of the loadbalancer.
Because we didn't want to hack core (to change the remote address), we tried something else. On WWW1, we use MD5 to generate a hash value with 4 bits / character. On WWW2, we use SHA-1 to generate a hash value with 5 bits / character. In this way, we cannot have duplicate sids, as each system has its own session mechanism.
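The two per-host schemes map onto PHP 5's stock session directives; a php.ini sketch (the surrounding configuration is omitted):

```ini
; WWW1: MD5 session ids, 4 bits per character
session.hash_function           = 0
session.hash_bits_per_character = 4

; WWW2 would instead use SHA-1, 5 bits per character:
;   session.hash_function           = 1
;   session.hash_bits_per_character = 5
```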

At least, that's what we thought: after a while, we received duplicate session ids (again!) from Google bots, which were browsing the site for "Google Web Preview". Those crawlers work in the cloud and they share cookies. As the session id is stored in a cookie, different IPs were using the same cookie and so the same session id. The same session id on different nodes ... that doesn't sound good.
So we ended up hacking core (and Dries killed a kitten :-( ...)

Index: includes/session.inc
===================================================================
--- includes/session.inc	(revision 906)
+++ includes/session.inc	(revision 911)
@@ -78,7 +78,7 @@
   else {
     // If this query fails, another parallel request probably got here first.
     // In that case, any session data generated in this request is discarded.
-    @db_query("INSERT INTO {sessions} (sid, uid, cache, hostname, session, timestamp) VALUES ('%s', %d, %d, '%s', '%s', %d)", $key, $user->uid, isset($user->cache) ? $user->cache : '', ip_address(), $value, time());
+    @db_query("INSERT IGNORE INTO {sessions} (sid, uid, cache, hostname, session, timestamp) VALUES ('%s', %d, %d, '%s', '%s', %d)", $key, $user->uid, isset($user->cache) ? $user->cache : '', ip_address(), $value, time());
   }
 
   return TRUE;

mod_deflate breaks streaming of audio/video in Firefox (in combination with the QuickTime plugin)

Submitted by Jochus on Tue, 11/01/2011 - 09:54 | Posted in: Linux


I recently had a problem with Firefox 3.6.12 and the QuickTime plugin. When I wanted to play an audio file, the audio was cut off after 3 seconds, even if the audio file was longer than 3 seconds.
The problem was related to the mod_deflate compression of Apache HTTPD on our files. You need to disable this option for audio/video files.

We changed this detail in our .htaccess file:



# Insert filter
SetOutputFilter DEFLATE

# Netscape 4.x has some problems...
BrowserMatch ^Mozilla/4 gzip-only-text/html

# Netscape 4.06-4.08 have some more problems
BrowserMatch ^Mozilla/4\.0[678] no-gzip

# MSIE masquerades as Netscape, but it is fine
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

# Don't compress images/audio/video
SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png|mp3|mp4|wav|avi|mpe?g)$ no-gzip dont-vary

# Make sure proxies don't deliver the wrong content

Header append Vary User-Agent env=!dont-vary


Installing adobe flashplugin for Firefox in Ubuntu Lucid 10.04 (64 bit)

Submitted by Jochus on Thu, 30/12/2010 - 16:14 | Posted in: Linux

I had some trouble installing the Adobe Flash plugin in Firefox. "The normal way" (working through the Firefox browser) failed with an error: adobe-flashplugin is virtual. The solution is:

  • $ sudo add-apt-repository ppa:sevenmachines/flash
    $ sudo aptitude update
    $ sudo aptitude install flashplugin64-installer
  • Restart Firefox and enjoy your new Adobe Flash Player! :-)

Virtualbox: shared folders between host (Ubuntu) and guest (Ubuntu)

Submitted by Jochus on Thu, 30/12/2010 - 15:11 | Posted in: Linux


If you want to create a shared folder between your host and guest machine (using VirtualBox), you should execute following steps:

  • Make sure you have the virtualbox-guest-additions installed on your system. If not, install it:
    $ sudo aptitude install virtualbox-guest-additions
  • Now install those guest additions in the guest domain: Devices > Install guest additions ... [Host + D]
  • On the host (Ubuntu) computer, run
    $ mkdir ~/VirtualBoxShare
    $ VBoxManage sharedfolder add "Ubuntu 10.10" -name "myshare" -hostpath /home/your/shared/folder/VirtualBoxShare/

    Where "Ubuntu 10.10" is the name of the virtual machine in VirtualBox, and "myshare" is the name of the share as the guest machine will see it. The hostpath must be a fully-qualified path.

  • If the guest is Linux, you have to mount the share onto a folder. Run:
    $ sudo mkdir -p /mnt/mountpoint
    $ sudo mount -t vboxsf myshare /mnt/mountpoint
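To make the mount survive reboots, the share can also go into /etc/fstab; a sketch, assuming the same share name and mount point as above:

```
myshare  /mnt/mountpoint  vboxsf  defaults  0  0
```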

Franks Carwash in Aalter: apparently not worth repeating ...

Submitted by Jochus on Tue, 14/12/2010 - 09:09 | Posted in: Lifetime

On Saturday I couldn't stand the sight of my car anymore: man oh man, all that salt, all that salt. It was even so bad that I could barely see in/through my own mirrors/windows :p. I try to wash my car by hand as much as possible (in my opinion, the cleanest way to clean a car), but when it has to be quick I go to one of those self-service car washes. I'm not always that happy with them (see also here). It only helps if you bring a sponge yourself :-).

Now, last Saturday I wanted to try one of those car washes with brushes; unfortunately you pay a bit more for it, but the car does seem cleaner to me.
I tried Franks Carwash, located along the Knokkebaan (Aalter -> Knesselare). Friendly guy, although a certain sign caught my eye right away: "We are not responsible for damage to vehicles". Suspicious :p ... Oh well, I figured the sign was only there for very special cases or something.
Right, through the car wash, and I have to say: very pleased with the result. Much better than those self-spray systems. However, towards the evening (it was actually Sofie who noticed it), we saw a lot of small scratches in the right mirror. When I looked further around the bodywork, I saw very tiny scratches everywhere (you do have to look very closely, but still ...). So the car may be clean, but such a brush machine clearly leaves its marks on the car. An annoying business, especially because there's nothing I can do about it. First: how can I prove the scratches came from the car wash machine? Second: the sign hanging up right before you drive into the car wash covers him completely ... Damn!

Advanced statistics for Drupal: most viewed nodes in a week, month, ...

Submitted by Jochus on Wed, 08/12/2010 - 10:13 | Posted in: Drupal


I think most Drupal developers know the Statistics module, which is part of the core of D6. It basically logs access statistics for your site. Every time a node is viewed, a counter gets updated. This counter value is stored in the node_counter table and gets reset each day at midnight.

The most important thing is the hook_cron() implementation:

/**
 * Implementation of hook_cron().
 */
function statistics_cron() {
  $statistics_timestamp = variable_get('statistics_day_timestamp', '');
 
  if ((time() - $statistics_timestamp) >= 86400) {
    // Reset day counts.
    db_query('UPDATE {node_counter} SET daycount = 0');
    variable_set('statistics_day_timestamp', time());
  }
 
  // Clean up expired access logs.
  if (variable_get('statistics_flush_accesslog_timer', 259200) > 0) {
    db_query('DELETE FROM {accesslog} WHERE timestamp < %d', time() - variable_get('statistics_flush_accesslog_timer', 259200));
  }
}

This function checks if we are 24 hours past the last statistics check. If we are, it will reset all node_counter day counts. But for us, this was pretty annoying, as we wanted to keep track of older statistics. We wanted to know the "most viewed" nodes in the last week, month, etc, etc, ...

We started looking around for a contrib module, but really, there was no contrib module that could satisfy our requirements. Most of the contrib modules just bring too much sh*t :-).

So what did we do? We created our own custom Statistics module!

  • We created a custom variable which keeps track of when we last ran our custom statistics system
  • If it is midnight (or later), copy all values from node_counter to a custom table, and index them with a timestamp
  • The structure of this table is defined as:
  • +----------+------------------+------+-----+---------+-------+
    | Field    | Type             | Null | Key | Default | Extra |
    +----------+------------------+------+-----+---------+-------+
    | date     | int(10) unsigned | NO   | PRI | NULL    |       |
    | nid      | int(10) unsigned | NO   | PRI | NULL    |       |
    | daycount | int(10) unsigned | NO   |     | NULL    |       |
    +----------+------------------+------+-----+---------+-------+

  • Reset the node_counter table
  • Adjust the cron timestamp variable of the Statistics module, so the "normal" Statistics cron will never run. We couldn't disable the Statistics module, as we were still keeping track of the views through the node_counter table :-)
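The nightly copy-and-reset can be sketched in two statements (node_counter_archive is a hypothetical name for the custom table described above):

```sql
-- Snapshot today's counts, keyed by date + nid ...
INSERT INTO node_counter_archive (date, nid, daycount)
SELECT UNIX_TIMESTAMP(CURDATE()), nid, daycount
FROM node_counter
WHERE daycount > 0;

-- ... then reset the live counters for the new day.
UPDATE node_counter SET daycount = 0;
```

"Most viewed in the last week" then becomes a simple SELECT nid, SUM(daycount) ... GROUP BY nid over the archived rows.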

The result

Performance testing with Drupal and JMeter (with login!)

Submitted by Jochus on Mon, 06/12/2010 - 09:42 | Posted in: Drupal


Last week, I had to do some performance testing on a Drupal site. Performance testing can be done with JMeter (http://jakarta.apache.org/jmeter/), but the difficult thing was to create a login session from JMeter to Drupal.

Installation

The first step: install JMeter

$ sudo aptitude install jmeter

Just for the record, I'm using this version:

$ dpkg -l | grep jmeter
ii  jmeter                               2.3.4-2ubuntu1                                  Load testing and performance measurement app
ii  jmeter-help                          2.3.4-2ubuntu1                                  Load testing and performance measurement app
ii  jmeter-http                          2.3.4-2ubuntu1                                  Load testing and performance measurement app

Configuration

  • Right-click the "Test Plan" icon and "Add" a "Thread Group"
    • In "Thread properties", define the "total number of users" accessing your site + a "Ramp up Period", and how many times this action should run
  • Next, install an "HTTP Cookie Manager" in the "Thread Group". This cookie manager will send a cookie along with the HTTP request, to be able to log in on the site
    • Set "Cookie Policy" to "Compatibility"
    • Log in to Drupal in your favorite browser (such as Firefox)
    • Open the list of cookies and search for a name like: "SESScc6c90d4c6f532ab4343b8b404cdf01d"
    • Add a line in JMeter with: name, value, domain, path, exactly as in the cookie which can be viewed in Firefox
  • Next, install an "HTTP Request defaults" element
    • Specify "domain" and "path"
  • Finally, install a "View Results Tree" element, which can debug all HTTP requests