Ubuntu Infiniband IPoIB Installation

04-26-2009

I recently had the opportunity to deploy a blade server architecture, specifically a Supermicro Bladeform 8100. This is a nice architecture at a very reasonable price point. The only problem I had with it was the lack of network cards. This is not unique to Supermicro - nearly all blades seem to be standardized on a two-NIC design. That's fine if two NICs are all you need, which probably covers 90% of the market. But I needed three: one for internal communication between the blades, one for connectivity to the data center provider's network, and a third for a direct connection to a market data feed. Tying any of these networks together with switches was against security regulations and strictly forbidden. Firewalls or other devices were too slow, and they added cost, complexity, and failure points.

The solution I came up with was to use an expensive, but very useful, optional accessory for this chassis - an Infiniband switch. While normally used for storage or clustering, Infiniband can carry IP traffic with the right drivers (IP over Infiniband, or IPoIB), and it's VERY fast - about 3 times the bandwidth and a quarter of the latency of a Gigabit Ethernet switch. This was perfect for my box-to-box communications, and it freed the two Ethernet cards in each blade for external communications.

The only trouble is that this isn't really a mainstream application, so information on how to set it up is hard to come by. I've written this guide as a reference for others who might be in the same situation.

To begin with, older versions of Linux (I'm using Ubuntu) don't have the right drivers. Up-to-date drivers are critical to a successful connection, so make sure you're using a recent distro. I had good luck with Ubuntu Intrepid Ibex - anything older didn't work so well.

Assuming you're using Ubuntu, you then need to install all of the base packages required to support the tools you'll be building, plus the rpm tool used below to unpack the OFED source packages. Do the following:

apt-get install libipathverbs1 libcxgb3-1 librdmacm1 libibverbs1 libmthca1 libopenmpi-dev libopenmpi1 openmpi-bin openmpi-common openmpi-doc libmlx4-1 rdmacm-utils ibverbs-utils build-essential byacc bison flex rpm

Once you have these packages installed, you need to download the latest Infiniband driver package, called OFED. I used version 1.3 for this guide - that version or anything later should work, provided nothing significant changes. Download and unpack this archive on your server, then change into the directory where you unpacked it. In the SRPMS directory you'll find packages with the source code for all the drivers and tools. Install them as follows:

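cd SRPMS
for i in *.rpm; do rpm -i "$i"; done
cd /usr/src/rpm/SOURCES
for i in *.tar.gz; do tar -zxvf "$i"; done
for i in *.tar.bz2; do tar -jxvf "$i"; done
# Build in dependency order. The trailing slash makes each glob match the
# unpacked directory rather than the tarball, and the ../ steps back out of
# the previous package's directory.
cd libibcommon*/
./configure && make && make install
cd ../libibumad*/
./configure && make && make install
cd ../libibmad*/
./configure && make && make install
cd ../opensm*/
./configure && make && make install
cd ../infiniband-diags*/
./configure && make && make install
ldconfig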
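
The same build sequence can also be written as a loop, which makes the dependency order explicit and stops at the first failure. This is only a sketch: the function name build_in_order and the SRC_DIR variable are my own inventions, not part of OFED, and it assumes each unpacked directory name starts with the package name.

```shell
#!/bin/bash
# Sketch: build the OFED userspace packages in dependency order.
# SRC_DIR and build_in_order are illustrative names, not part of OFED.
SRC_DIR="${SRC_DIR:-/usr/src/rpm/SOURCES}"

build_in_order() {
    local pkg dir
    for pkg in "$@"; do
        # The trailing slash makes the glob match the unpacked directory
        # only, not the tarball sitting next to it.
        dir=$(ls -d "$SRC_DIR/$pkg"*/ 2>/dev/null | head -n 1)
        if [ -z "$dir" ]; then
            echo "no source directory for $pkg" >&2
            return 1
        fi
        echo "building $pkg in $dir"
        # Subshell so the cd does not leak into the next iteration.
        ( cd "$dir" && ./configure && make && make install ) || return 1
    done
}

# Usage (as root):
#   build_in_order libibcommon libibumad libibmad opensm infiniband-diags && ldconfig
```

Running each build in a subshell means a failed package leaves you in your original directory, which makes it easier to see where things stopped.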

The diagnostic scripts use the wrong shell. If you run one and get errors, edit the top line of the file to use /bin/bash instead of its default. Be aware that many scripts call others, so sometimes you need to edit more than one.

Todo: Add the sed commands to translate these to the above code block.
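
As a starting point for those sed commands, here is a rough sketch. It assumes the scripts' default top line is #!/bin/sh, which I have not verified for every script, and the function name fix_shebang is just something I made up for illustration. It only rewrites line 1 and leaves a .orig backup next to each file it changes.

```shell
#!/bin/bash
# Sketch: point scripts that start with "#!/bin/sh" at bash instead.
# fix_shebang is an illustrative name; pass it the script files to patch.
# Assumption: the scripts' default interpreter line is /bin/sh.
fix_shebang() {
    local f
    for f in "$@"; do
        # Only touch regular files whose first line is a /bin/sh shebang.
        if [ -f "$f" ] && head -n 1 "$f" | grep -qE '^#! ?/bin/sh( |$)'; then
            # "1s" limits the substitution to line 1; -i.orig keeps a backup.
            sed -i.orig '1s|^#! \{0,1\}/bin/sh|#!/bin/bash|' "$f"
        fi
    done
}

# Usage (as root), with the path to wherever your scripts were installed:
#   fix_shebang /path/to/diag-scripts/*
```

Because files with any other first line are skipped, it should be safe to point this at a whole directory of mixed scripts and binaries.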

To load the drivers, run the following from the command line. You should also add the module names to /etc/modules, one per line and without the modprobe command, so they're loaded on your next reboot:

modprobe ib_ipoib
modprobe ib_addr
modprobe ib_mad
modprobe ib_sa
modprobe ib_cm
modprobe ib_uverbs
modprobe ib_ucm
modprobe ib_umad
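
To get the same list into /etc/modules without creating duplicates if you run it twice, something like the following should do. It's a minimal sketch, and add_modules is a name I picked for illustration. The target file is passed as an argument so you can try it on a scratch file first.

```shell
#!/bin/bash
# Sketch: append module names to a modules file, skipping ones already listed.
# add_modules is an illustrative name; the first argument is the file to edit.
add_modules() {
    local file="$1" mod
    shift
    touch "$file"
    for mod in "$@"; do
        # -x matches whole lines only, -F treats the name as a literal string.
        grep -qxF "$mod" "$file" || echo "$mod" >> "$file"
    done
}

# Usage (as root):
#   add_modules /etc/modules ib_ipoib ib_addr ib_mad ib_sa ib_cm ib_uverbs ib_ucm ib_umad
```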

Now you need to bring up your interface. Try the following:

opensm -o
ifconfig ib0 10.1.1.1 netmask 255.255.255.0
ping 10.1.1.2    # or some other host already configured on your Infiniband switch
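
If the ping fails, it helps to know whether ib0 exists at all and whether its link is up before digging further. Here is a small sketch that reads the interface state from sysfs. The function name link_state is my own, and the optional second argument (an alternate sysfs root) is only there so it can be tried against a fake directory.

```shell
#!/bin/bash
# Sketch: report the kernel's view of an interface's link state.
# link_state is an illustrative name. The second argument overrides the
# sysfs directory and exists only to make the function easy to test.
link_state() {
    local ifname="$1" sysroot="${2:-/sys/class/net}"
    if [ -r "$sysroot/$ifname/operstate" ]; then
        cat "$sysroot/$ifname/operstate"
    else
        # No such interface: the ib_ipoib module is probably not loaded.
        echo "missing"
    fi
}

# Usage:
#   [ "$(link_state ib0)" = "up" ] && ping -c 3 10.1.1.2
```

A result of "missing" points at the drivers, while "down" suggests the subnet manager (opensm) has not brought the port up yet.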

I use the following block in /etc/network/interfaces to bring up my card(s):

auto ib0
iface ib0 inet static
          pre-up opensm -B
          address 10.200.1.1
          netmask 255.255.255.0

If you want to support bonding, try the following:

ib-bond --bond-addr 10.1.1.1/255.255.255.0

That's it!