Saturday, June 19, 2010

Faster compilation with distcc

Often you have more than one machine at your disposal but no obvious way to spread a compilation workload across them. They might even run different operating systems, which makes it look harder still. In my case, I have a laptop (2 cores) and a desktop (4 cores) connected over WiFi. The laptop runs Linux (Fedora 13, 64-bit) while the desktop runs Windows 7 (64-bit). I wanted to offload Linux kernel compilation to the more powerful desktop and keep my laptop cool :)

distcc comes to the rescue! distcc is a program that distributes builds of C and C++ code across several machines on a network. It's fairly well known but slightly tricky to set up. It took me a few hours to get everything right, but now it all works like a charm. I hope this short tutorial gets you running in minutes :)

So, here is what we need to do:
  • Install distcc on both client and server(s) 
  • Configure distcc on both sides
  • Configure Firewall on server(s) to allow incoming distcc traffic
  • Build, build, build!
  • Monitoring

Before we can go ahead with the above, we need Linux VMs on the desktop (I used VirtualBox), since it runs Windows. I created two Linux (Fedora 13, 64-bit) VMs to which kernel compilation can be offloaded. Each VM was assigned 1 GB of memory and 2 vCPUs (a single 4-vCPU VM was quite unstable). In general, the client and the servers need the same platform (32/64-bit) and the same compiler version; otherwise you can run into weird compiler/linker errors or, even worse, errors that go undetected!
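One quick way to check that two machines agree is to compare the target triplet and version that their compilers report. Here is a minimal sketch; toolchain_id is just a helper name I made up, and it assumes the compiler you pass (gcc by default) understands -dumpmachine and -dumpversion, which GCC does.

```shell
# toolchain_id [COMPILER]
# Prints "<target triplet> <version>" for the given compiler (default: gcc).
# Hypothetical helper, not part of distcc.
# Note: some distro builds of GCC print only the major version for -dumpversion.
toolchain_id() {
    cc=${1:-gcc}
    printf '%s %s\n' "$("$cc" -dumpmachine)" "$("$cc" -dumpversion)"
}

# Example (not run here): compare the output on the client and on each server.
#   toolchain_id gcc
```

Run it on the client and on each server; if the lines differ, sort that out before distributing any builds.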

Install distcc

First, install distcc on both client and server. distcc calls the machines where compilation actually happens servers, so in my case the VMs on the desktop are the servers. Almost all Linux distributions provide distcc in their standard repositories. On Fedora, all you need to do is:
sudo yum install distcc distcc-server 
Now create these symlinks in a separate folder (e.g. ~/distcc/):
mkdir ~/distcc; cd ~/distcc
ln -s /usr/bin/distcc gcc
ln -s /usr/bin/distcc g++
ln -s /usr/bin/distcc c++

NOTE: do not place these symlinks where they permanently take precedence over your actual compilers, otherwise distcc will try to offload every compilation you do on the client. In general, it's not useful to offload very small compilations.
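If you set up more than one client, the symlink step is easy to script. The sketch below is only an illustration: setup_distcc_links is a name I made up, and both arguments default to the paths used in this post. It also warns if the link folder is already on your PATH, which is the situation the note above tells you to avoid.

```shell
# setup_distcc_links [LINK_DIR] [DISTCC_BIN]
# Creates gcc, g++ and c++ symlinks pointing at distcc in a private folder.
# Hypothetical helper; defaults match the paths used in this post.
setup_distcc_links() {
    link_dir=${1:-"$HOME/distcc"}
    distcc_bin=${2:-/usr/bin/distcc}
    mkdir -p "$link_dir" || return 1
    for name in gcc g++ c++; do
        # -f replaces a stale link so the function can be re-run safely
        ln -sfn "$distcc_bin" "$link_dir/$name" || return 1
    done
    # Warn if the folder would shadow the real compilers all the time
    case ":$PATH:" in
        *":$link_dir:"*)
            echo "warning: $link_dir is already on PATH" >&2 ;;
    esac
}

# Example (not run here): setup_distcc_links
```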

Configure distcc

Now we need to configure distcc on both client and server. On the client side, we list the servers to which we want to offload compilation. On the server side, we give the “authorized” client IP address(es) and the port on which the distcc daemon listens for client requests.

Client side configuration:
Available servers need to be listed in the ~/.distcc/hosts file. On my laptop, it looks like this:
192.168.1.10,lzo 192.168.1.11,lzo

Here 192.168.1.{10,11} are the IPs of the Linux VMs running on my desktop. The ‘lzo’ option tells distcc to compress the files it transfers over the network. This slightly increases CPU usage on both client and server but is useful on a low-bandwidth network. This configuration offloads compilation completely to the servers (preprocessing and linking still happen on the client). If you want the local machine to take part in compilation as well, change the above to:
localhost 192.168.1.10,lzo 192.168.1.11,lzo

NOTE: do not use the local machine's IP address in place of ‘localhost’ in this file, otherwise distcc will incur network overhead even for the local part of the compilation. With ‘localhost’, distcc adds negligible overhead to local compiles.

As another example, you may want to limit use of the local machine so that it stays cool and most of the work is done by the other servers:
localhost/1 192.168.1.10,lzo 192.168.1.11,lzo
This limits the number of compilation jobs on the local machine to 1. The remaining jobs (as set by the make -j parameter) go to the other server(s).
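If you add or remove servers often, a tiny helper can generate the host list for you. This is only a sketch: make_hosts is a name I made up, it assumes every server should get the lzo option, and it relies on distcc's HOST/LIMIT form to cap local jobs, with hosts separated by whitespace.

```shell
# make_hosts LOCAL_SLOTS SERVER...
# Prints a distcc host list on one line.
# LOCAL_SLOTS=0 leaves localhost out; otherwise it becomes localhost/LOCAL_SLOTS.
# Hypothetical helper, not part of distcc.
make_hosts() {
    local_slots=$1
    shift
    line=""
    if [ "$local_slots" -gt 0 ]; then
        line="localhost/$local_slots"
    fi
    for server in "$@"; do
        # Append each server with lzo compression, separated by a space
        line="${line:+$line }$server,lzo"
    done
    printf '%s\n' "$line"
}

# Example (not run here):
#   make_hosts 1 192.168.1.10 192.168.1.11 > ~/.distcc/hosts
```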

Server side configuration:
Among other things, we need to provide the list of allowed client IP addresses (by default, all IPs are blocked) and the port on which the distcc daemon listens for client requests (default: 3632). On Fedora, the configuration file is /etc/sysconfig/distccd (the exact location may differ on your distro). In my case, the two Fedora VMs on the desktop are the distcc servers, so I entered the following configuration on both of them:

USER=ngupta
OPTIONS="--jobs 4 --allow 192.168.1.0/24 --port 3632 --log-file=/tmp/distccd.log"
This specifies the upper limit on the number of parallel jobs on the server, the range of allowed client IPs, the port to listen on and the log file (by default distccd spams the system log, /var/log/messages). The USER option is useful if the distccd daemon is started as root, in which case it switches to user USER. See the distccd man page for more details.

Now, start the distcc server with:
service distccd start

Or, manually with:
distccd --daemon --user ngupta --jobs 4 --allow 192.168.1.0/24 --port 3632 --log-file=/tmp/distccd.log

Of course, you need to change the user name. Now verify that it started successfully with:
ps awwx | grep distcc
 

Configure Firewall

We need to open TCP port 3632 (or whichever port you specified in the distccd configuration). To do this, insert the following iptables rule in /etc/sysconfig/iptables:
-A INPUT -m state --state NEW -m tcp -p tcp --dport 3632 -j ACCEPT

This must be inserted before any REJECT rules. Alternatively, you can use a GUI such as system-config-firewall to open TCP port 3632. In fact, this is what I used, and the rule above was generated by the GUI.
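Before kicking off a long build, it can save time to confirm from the client that the port is actually reachable. Here is a small sketch, assuming bash and the coreutils timeout command are available on the client; port_open is a made-up helper name.

```shell
# port_open HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection to HOST:PORT succeeds within the timeout.
# Hypothetical helper; relies on bash's /dev/tcp redirection and coreutils timeout.
port_open() {
    host=$1
    port=$2
    secs=${3:-3}
    # Inside bash -c, $0 is the host and $1 is the port
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example (not run here):
#   port_open 192.168.1.10 3632 && echo "distccd port reachable"
```

If this fails while distccd is running, the firewall rule or the --allow range on the server is the first thing I would look at.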

Build, Build, Build!

All set, it's time to build! For any compilation you want to distribute with distcc, start the build like this:
PATH=$HOME/distcc:$PATH make -j8

This PATH prefix makes sure that the distcc symlinks take priority over the real compiler. It also makes it easy to choose whether to use distcc: leave out the PATH prefix and you fall back to the local compiler.
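To save some typing, the prefix can be wrapped in a small shell function. This is just a sketch: with_distcc and the DISTCC_LINKS variable are names I made up, and the default folder is the ~/distcc one from the install step.

```shell
# with_distcc COMMAND [ARGS...]
# Runs one command with the distcc symlink folder first on PATH.
# Hypothetical helper; DISTCC_LINKS overrides the default ~/distcc folder.
with_distcc() {
    (
        # The subshell keeps the PATH change away from the calling shell
        PATH="${DISTCC_LINKS:-$HOME/distcc}:$PATH"
        export PATH
        "$@"
    )
}

# Example (not run here):
#   with_distcc make -j8
```

Because the change is confined to a subshell, a plain make afterwards still uses the local compiler.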

The distcc man page recommends that the number of parallel jobs (the make -j parameter) normally be set to about twice the number of available CPUs, to cover for jobs blocked on network I/O.
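That rule of thumb is simple enough to script. A minimal sketch, where distcc_jobs is a hypothetical helper that takes the core count of each participating machine:

```shell
# distcc_jobs CORES...
# Prints twice the sum of all core counts given as arguments.
# Hypothetical helper implementing the "2 x CPUs" rule of thumb.
distcc_jobs() {
    total=0
    for cores in "$@"; do
        total=$((total + cores))
    done
    echo $((total * 2))
}

# Two servers with 2 vCPUs each:
#   distcc_jobs 2 2    # prints 8
```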

In my case, I have 2 VMs with 2 vCPUs each, so 4 CPUs in total. Sometimes I also add ‘localhost’ to the distcc server list so I can use the 2 cores on my laptop too. With a total of 6 cores, my Linux kernel build time (with the default Fedora 13 config) came down from over an hour to just 20 minutes!
I used the Linux kernel just as an example, but you can distribute the build of any C/C++ code with distcc. Throw in the power of virtualization and you can even use a mix of Linux/Windows and 32/64-bit machines.

Monitoring

You can easily monitor how your build is being distributed among the servers with either 'distccmon-gnome' or 'distccmon-text'.

Figure: distccmon-gnome continuously showing distcc status during Linux kernel compile.

Happy Building! :)

3 comments:

  1. Hello,

    Very nice and interesting your Grid computing. But, you know any library to distribute any type of processing?

    I need to build a type of grid, of 5 or 6 computers, to distribute the processing of a 3d rendering engine. But, the only one lib I found are really weird and does not work.

    What is your opinion?

    Thankyou. See you later.

  2. Hi,

    > Very nice and interesting your Grid computing. But, you know any library to distribute any type of processing?

    Unfortunately, no. I don't have much experience with such distributed programming. However, UC Berkeley has excellent resources in this area. See 2009 seminar on parallel computing here:
    http://parlab.eecs.berkeley.edu/bootcampagenda
    All video lectures publicly available!

    And there's more -- they are again having seminar on the same track. Registrations are open:
    http://parlab.eecs.berkeley.edu/
