February 2, 2007

Linux on an Intel 965 motherboard

(originally posted at http://kbullock.ringworld.org/2007/2/2/linux-on-an-intel-965-motherboard)

For VAWOR, I recently purchased a new firewall box to replace the Soekris box currently serving that function. The Soekris has been a wonderful little firewall for me, but it unfortunately can't handle gigabit throughput, so I purchased a 2U rack-mount box to replace it.

The new box has a Core 2 Duo E6300 on a DQ965GF mainboard, which is built around Intel's 965 Express chipset. (There, now Google should pick this up with enough different keywords. :) Since this particular combination makes installing Linux difficult, and apparently not much information is available on the Web yet, I've written up my install process here. Hopefully I can save others from some of the pain I endured getting it running.

The components

Aside from the mainboard and CPU, the machine has 1GB RAM, a 160GB SATA hard drive, a PATA DVD-ROM, and an Intel Pro/1000 NIC in a PCI slot. There's also a NIC built into the mainboard, along with audio and USB and such.

The 965 chipset doesn't natively support PATA; Intel wants manufacturers to just use SATA and be done with it, but they also threw a Marvell 88SE6101 IDE controller onto the mainboard to support legacy devices. This controller is not supported by Linux (yet, but see this post to lkml by Alan Cox), so installing from the built-in DVD drive isn't an option.

Beginning the install

Ubuntu's CD installers just plain won't boot. The Live CD tells me "/bin/sh: can't access tty; job control turned off", drops me to an "(initramfs)" prompt, and sits there blinking at me. The Server CD starts the install process, asks me about language and keyboard, and then abruptly fails to detect a CD-ROM drive. Time for another strategy.

I decided to try installing from my USB thumb drive, per https://help.ubuntu.com/community/Installation/FromUSBStick. I set up the new firewall's BIOS to boot to USB, then used my (Ubuntu) workstation to copy the contents of the server CD to the thumb drive and make the thumb drive bootable. To wit:

# mount -o loop /path/to/ubuntu-6.10-server-amd64.iso /mnt

Ubuntu likes to auto-mount a USB mass storage device when you insert it, but be careful: edgy (6.10) seems to think a VFAT filesystem should default to UTF-8, which caused the installation to fail because it wasn't able to read /pool. So I used p[u]mount to remount it with ISO 8859-1 as its charset (/dev/sda1 is where hotplug put my thumb drive):

# pumount /dev/sda1
# pmount -c iso8859-1 /dev/sda1 USBKEY

Now I could actually copy the files over and rearrange the configuration as described on Ubuntu's help page, ignoring warnings about not being able to copy the symlinks:

# cp -a /mnt/* /media/USBKEY/
# cd /media/USBKEY
# mv install/* isolinux/* .
# mv isolinux.cfg syslinux.cfg
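Those symlink warnings are expected: VFAT has no way to store symbolic links, so cp simply skips them. To see ahead of time which entries won't make it across, something like the following sketch would list them. (list_symlinks is a helper name made up for illustration; /mnt is the loop-mounted ISO from above.)

```shell
# list_symlinks DIR: print every symbolic link under DIR, one per line.
# (Hypothetical helper, not part of Ubuntu's instructions.)
list_symlinks() {
    find "$1" -type l
}

# Example: list_symlinks /mnt
```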

Per the instructions, I edited syslinux.cfg to remove all instances of /install/. Basically, I did this: nano syslinux.cfg, Ctrl-w, Ctrl-r, /install/, Enter, then a lot of y until it was finished replacing, then Ctrl-x, y, Enter. I skipped any machinations in the dists/ directory, continuing by making the flash drive bootable (unmounting it first!):

# pumount /dev/sda1
# syslinux -s /dev/sda1
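The nano search-and-replace on syslinux.cfg described above can also be done non-interactively. Here is a minimal sketch using GNU sed on a scratch file; the two sample lines only illustrate the general shape of the stock config and are not a verbatim copy of it. On the real thumb drive, the edit would have to happen before the pumount step.

```shell
# Build a scratch config resembling the installer's (illustrative lines only).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
DEFAULT /install/vmlinuz
APPEND initrd=/install/initrd.gz ramdisk_size=16384 root=/dev/ram rw quiet --
EOF

# Delete every occurrence of "/install/", which is what the nano replace did.
# Using "|" as the sed delimiter avoids having to escape the slashes.
sed -i 's|/install/||g' "$cfg"

cat "$cfg"
```

Against the thumb drive itself, the whole edit would come down to the single sed line, pointed at /media/USBKEY/syslinux.cfg instead of the scratch file.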

Installer issues, and therapy

If you're following along trying to duplicate this, be sure to set the BIOS to configure the SATA drives as AHCI before you boot the installer. More on that later.

So now I tried booting to the USB thumb drive, and immediately hit a problem: the box wouldn't boot from it. It just wouldn't recognize it. So I hit Ctrl-Alt-Del, and it worked. It turns out that for some reason, the motherboard will boot from my thumb drive (a 1GB Crucial Gizmo) only on a warm boot.

So I booted up into the installer and let it fail to find an install CD. I told it not to look for a driver floppy, and not to let me manually select, so it dumped me at the main installer menu. I switched to the shell console (Alt-F2) and mounted the thumb drive (which showed up as /dev/sdb on the firewall, since its HD is SATA) on /cdrom:

# mount -t vfat /dev/sdb1 /cdrom

...and then switched back to the installer console (Alt-F1) and hit Enter to let it continue the install. It loaded the additional components, let me configure the network (manually), and let me set up the clock and the first user.

All good so far, until it "failed to determine the codename of the release". I spent some time unmounting the flash drive, moving it back to my workstation, tweaking things, and moving it back to the new firewall, trying to make it work, but it just wouldn't. *sigh* I decided to try a different tack.

Back in the installer main menu, I went back to the "Load installer components from CD" option, which now prompted me to select more installer components to activate. I added the choose-mirror component and hit Continue. Now it still failed to install the base system. I flipped back to the shell console and:

# umount /cdrom

and then selected "Choose a mirror of the Ubuntu archive", then "Install the base system" again. It worked! I followed the rest of the process as prompted, installing GRUB along the way, and rebooted.

And it failed.

The system just wasn't reading the MBR off the drive. So no GRUB prompt, nothing--just a BIOS message that there was no bootable drive.

Remember when I said before that you should switch the BIOS to set up the SATA drives as AHCI? Yeah, that's why it wouldn't boot. (I think I had actually changed the BIOS setting, installed, changed it to something different, and then tried to boot, and that's why it failed. In any case, keep the setting consistent; and I've seen others mention that AHCI is the way to make Linux work properly on it.)

Okay, so I set the BIOS setting to AHCI, rebooted from the flash drive, reinstalled, and booted into the new system. Success!
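When a freshly installed disk refuses to boot like this, it can help to rule out a missing boot sector before blaming the BIOS. A valid MBR ends with the bytes 55 aa at offsets 510 and 511, and the sketch below checks for them. (has_boot_signature is a hypothetical helper, and /dev/sda is only an assumed device name for the SATA disk; reading it requires root.)

```shell
# has_boot_signature FILE: succeed if bytes 510-511 of FILE are 55 aa,
# the signature that marks the end of a valid boot sector.
has_boot_signature() {
    sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Example (as root; /dev/sda is an assumed device name):
#   has_boot_signature /dev/sda && echo "MBR signature present"
```

If the signature is there and the BIOS still reports no bootable drive, the SATA mode setting is a more likely culprit than GRUB.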

Further problems

Well, mostly success. First I discovered that the module for the on-board NIC wasn't being loaded properly, so I created the file /etc/modprobe.d/ethernet with the following contents:

alias eth0 e1000
alias eth1 e1000

That is, both the on-board NIC (eth0) and the PCI NIC (eth1) use the e1000 module, so I told modprobe(8) that. After that, an ifup -a worked as expected, and the interfaces come up on reboot properly.

I got some error messages about 'fake start-stop-daemon', and a bunch of errors when I logged in about not having access to /dev/null. I figured they were probably related, and I was right. udev didn't start properly because somehow dpkg didn't get installed properly. Functionally (thank goodness), but not properly. To track down the problem I looked at /sbin/start-stop-daemon and found that it was a short, do-nothing shell script. I figured that wasn't right, so I found the package it was in and reinstalled it:

# dpkg -S /sbin/start-stop-daemon
dpkg: /sbin/start-stop-daemon
# apt-get install --reinstall dpkg

...and then rebooted, and things worked fine. I should note that it took a couple boots for me to believe it was working, because Ubuntu normally boots with the quiet option; so I rebooted, hit Esc to enter the GRUB menu, edited the kernel command line to remove the quiet option, then booted that way, and saw what it was doing, which was fine.
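A quick way to spot a stand-in start-stop-daemon without opening the file is to check whether it starts with a #! line, since the real one is a compiled binary. The sketch below does that. (is_script is a hypothetical helper name.)

```shell
# is_script PATH: succeed if the file's first two bytes are "#!",
# i.e. it is an interpreter script rather than a compiled binary.
# tr strips NUL bytes so the shell does not complain about them.
is_script() {
    [ "$(head -c 2 "$1" 2>/dev/null | tr -d '\0')" = '#!' ]
}

if is_script /sbin/start-stop-daemon; then
    echo "start-stop-daemon is a script; dpkg probably needs reinstalling"
fi
```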

Compiling a newer kernel

I decided to install the newest stable kernel (2.6.19.2 as of this writing), which was a relatively easy matter if one is familiar with the Debian/Ubuntu way of building kernels. I grabbed the vanilla 2.6.19.2 source and grabbed the source to the most current Ubuntu default kernel to get the default config, and installed kernel-package (to build the kernel), ncurses-dev (which actually installs libncurses5-dev, for make menuconfig), and fakeroot (as a helper for kernel-package, to build as a normal user).

$ wget http://www.us.kernel.org/pub/linux/kernel/v2.6/linux-2.6.19.2.tar.bz2
$ tar xvjf linux-2.6.19.2.tar.bz2
$ sudo apt-get install kernel-package ncurses-dev fakeroot
$ apt-get source linux-image-2.6.17-10-server
$ cd linux-source-2.6.17-2.6.17.1
$ make menuconfig
<then just exit, and save config>
$ cd ../linux-2.6.19.2
$ make menuconfig

Here in the 2.6.19.2 menuconfig, I first went to "Load an Alternate Configuration File", fed it ../linux-source-2.6.17-2.6.17.1/.config, and ignored the errors that went flying by on the screen (all to do with rearranged or missing configuration options between versions). Then all I did was go into "Device Drivers > Serial ATA (prod) and Parallel ATA (experimental) drivers" and enable (as modules): "ATA device support", "AHCI SATA support", and "Intel PIIX/ICH SATA support". Then:

$ fakeroot make-kpkg clean
$ fakeroot make-kpkg --append-to-version local1 --initrd --stem linux binary-arch
...lots of building...
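Since the build takes a while, it may be worth confirming beforehand that the three menuconfig options actually landed in .config. The sketch below greps for them. The symbol names (CONFIG_ATA, CONFIG_SATA_AHCI, CONFIG_ATA_PIIX) are what I believe those menu entries map to in 2.6.19, so treat them as an assumption and check against your own .config; report_kconfig is a hypothetical helper name.

```shell
# report_kconfig FILE: print the setting of each libata option and
# fail if any of them is neither built in (=y) nor a module (=m).
# The symbol names are assumed to match the 2.6.19 menu entries.
report_kconfig() {
    cfg=$1
    status=0
    for sym in CONFIG_ATA CONFIG_SATA_AHCI CONFIG_ATA_PIIX; do
        if grep -Eq "^${sym}=[ym]\$" "$cfg"; then
            grep -E "^${sym}=" "$cfg"
        else
            echo "$sym is not enabled" >&2
            status=1
        fi
    done
    return $status
}

# Usage, from the kernel source directory:
#   report_kconfig .config
```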

Then, as root (or, as usual, with sudo):

# dpkg -i linux-headers-2.6.19.2local1_2.6.19.2local1-10.00.Custom_amd64.deb \
    linux-image-2.6.19.2local1_2.6.19.2local1-10.00.Custom_amd64.deb

GRUB was automatically updated with the new kernel image, the initrd got built properly, and I rebooted into the new kernel. It booted much faster, and everything seems to be working properly now.

Remaining issues

Everything, that is, except the DVD-ROM drive. 2.6.19.2 still doesn't support the Marvell PATA controller, and I haven't yet tried to apply Alan Cox's patch mentioned above. I'll post an update if I manage to get it working.

I also haven't tried enabling the onboard audio, because it's a firewall, not a desktop box, so I don't know whether that works.

Other cool things

Now that I'm configuring this box as a firewall, with two network cards that use the same driver (both the on-board and PCI NICs use the e1000 driver), I need a way to make sure that each card gets assigned the same device name every boot. I've used ifrename(8) for that purpose before, but I discovered another way that's now apparently preferred in Ubuntu: /etc/iftab. udev uses this file to name network interfaces, and you can configure it to match a device based on MAC address, PCI ID, or a number of other criteria. I set it up to name the onboard device wan0 and the PCI one lan0, to make my firewall rules easier to deal with. Here's my iftab:

wan* businfo 0000:00:19.0
lan* businfo 0000:06:00.0

(The '*' means to use the next available number, starting from 0, so the devices end up being named wan0 and lan0.) After creating that file and updating my /etc/modprobe.d/ethernet appropriately (changed eth0 and eth1 to wan0 and lan0, respectively), I reloaded the network config:

# ifdown eth0 eth1
# rmmod e1000
# ifup -a

...and everything worked. The ifrename method worked well, and is apparently still a bit more flexible than iftab, but I like the tighter integration with udev.
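For anyone wondering where businfo values like the ones in my iftab come from: one option, assuming ethtool is installed, is the bus-info line that ethtool -i prints for an interface. The sketch below turns that output into an iftab entry. (iftab_line is a hypothetical helper name.)

```shell
# iftab_line PREFIX: read `ethtool -i IFACE` output on stdin and print
# an iftab entry that matches the interface by its bus address.
iftab_line() {
    awk -v prefix="$1" '$1 == "bus-info:" { print prefix "* businfo " $2 }'
}

# Usage (before renaming, while the cards are still eth0 and eth1):
#   ethtool -i eth0 | iftab_line wan
#   ethtool -i eth1 | iftab_line lan
```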

Posted by kbullock at 3:54 PM