The Advisory Boar

By Abhijit Menon-Sen <ams@toroid.org>

Resetting the Lexmark E323N network configuration

2010-09-12

Many years ago, I bought a Lexmark E323N laser printer (600dpi, 19ppm) because it was the cheapest printer I could find that came with Ethernet and PostScript support. I used it for a long time and was happy with it. When I moved away from home, I left it connected to my switch—along with a DSL modem and a wireless access point—so that my mother could use it.

Fast forward a few years. The DSL modem had died and been replaced. The switch had died and been replaced. The WAP died, and the Netgear WGR614 bought to replace it had four Ethernet ports, and could thus replace the switch as well. But it was a router, not a bridge, and so it wanted its internal and external networks numbered differently. The upshot was that the printer's IP address needed to change from 10.0.0.4 to 192.168.1.4.

No problem. I added a 10.0.0.0/8 address and host route to my netbook's eth2, which let me connect to the printer's administrative interface and change its address in the network settings menu. Alas, I forgot all about the separate "access control" menu, which was set to deny requests from outside 10.0.0.0/8. When the printer came back up, it would respond to ping from 192.168.1.x but discard TCP packets because of the access filter. If I used a 10.0.0.x address, it threw away all packets, because their source was no longer on the printer's own network.
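The access filter's effect can be pictured as a plain network-membership test. What follows is a minimal sketch in POSIX shell, not anything taken from the printer's firmware; the function names ip_to_int and in_network are made up for illustration.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if address $1 lies inside network $2 with prefix length $3.
in_network() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}
```

With these, in_network 10.0.0.4 10.0.0.0 8 succeeds and in_network 192.168.1.4 10.0.0.0 8 fails, which is the situation the filter created once the printer had moved to 192.168.1.4. On a current Linux system, a temporary address in the old network can be added with something like ip addr add 10.0.0.5/8 dev eth2 (as root; 10.0.0.5 is a made-up address).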

(I can't decide which is more stupid: that I chose to enable an IP-based packet filter on the printer in the first place, or that the printer did not protest at a configuration that rendered it unusable. I have a sinking feeling that it was the former.)

No problem. I went to the Lexmark web site and downloaded a user manual. I followed its description of the occult ritual to reset the printer's configuration settings, which involved opening the printer, switching it on, hopping in a clockwise half-circle on one leg, holding some buttons, staring at blinkenlights, and so on. I did it once, then twice. Nothing changed. Then I found this web page, which explained that the "reset to factory defaults" procedure didn't actually reset the network settings. I tried the NVRAM tweaking procedure described on that page (which involved pressing the continue and cancel buttons a gazillion times while watching blinking LED patterns for feedback), and it didn't seem to work either.

Despair set in. I tried the configuration tweaks again. So did Hassath. Nothing changed. My mother was muttering in the background about buying a new printer. After two or three more attempts, Hassath also gave up, and they both went downstairs to make coffee. I sat down to repeat the process. With tcpdump running, I went through the sequence once, twice, then ten times, then fifteen, then I lost count. Suddenly, just as the page said might happen, the printer emitted "several BOOTP packets and a burst of ARP probe packets". The Netgear answered its DHCP request, and just like that, the network settings were reset and everything worked again.

I can only guess that changing the network settings so many times so quickly triggered some bug in the firmware; perhaps the settings were saved incorrectly, leading to a checksum error when they were loaded, and thus forcing the printer to discard the saved settings. (I used a similar trick to fix my WAP54G.)

Remember: You need to continue tweaking the printer until a sensible IP address appears.

IPv6 support for authbind

2010-09-08

A few days ago, my friend Aaron was trying to add IPv6 support to authbind, a program by Ian Jackson which allows unprivileged processes to bind reserved ports through LD_PRELOAD-interception of bind(2) and a setuid-root helper program.
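For context, stock authbind decides whether to allow a bind by checking whether a per-port file is executable by the calling user. The sketch below sets up such a file; it covers only that long-standing byport convention and says nothing about how IPv6 addresses are handled. The function name allow_port and its arguments are assumptions for illustration.

```shell
# Let user $2 bind to port $1 through authbind.
# $3 is the byport directory (default: /etc/authbind/byport).
allow_port() {
    dir=${3:-/etc/authbind/byport}
    touch "$dir/$1" &&
        chown "$2" "$dir/$1" &&
        chmod 500 "$dir/$1"     # authbind tests execute permission
}
```

After something like allow_port 80 someuser (run as root), that user can start a server with authbind in front of the command, for example authbind --deep followed by the program's usual command line.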

Yesterday, after returning from a long train journey, I took a few hours to decompress and hack the necessary changes together. It turned out to be quite simple. Here's the patch.

The changes have received only light testing, but everything seemed to work in the test cases I contrived. I'll send the patch upstream after a couple of other people confirm that I didn't overlook anything.

Testing and feedback are very welcome.

Update: a week later, at least one site runs the patched authbind in production, and I have sent the patch to the author (with no response yet). The patch is also now cited in a bug report filed against the Debian package.

Update: a year later, Ian Jackson responded to the bug report and said the patch was unacceptable, because it changed the internal calling convention for a helper program. I wanted to redo and resubmit the patch, but couldn't drum up the motivation to actually do so.

Update (2012-06-02): a year and a half after I wrote the patch, Ian Jackson has released authbind 2.0.0 with IPv6 support. A quick glance suggests that he didn't use any of my code.

Incomprehensible upstart error messages

2010-09-06

I ran "service ssh restart", and got the following error message:

restart: Rejected send message, 1 matched rules; type="method_call", sender=":1.75" (uid=1000 pid=3409 comm="restart) interface="com.ubuntu.Upstart0_6.Job" member="Restart" error name="(unset)" requested_reply=0 destination="com.ubuntu.Upstart" (uid=0 pid=1 comm="/sbin/init"))

It turns out this is how Upstart (Ubuntu's init(8) replacement) says "you're not root".
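A small wrapper can turn that into a readable complaint before Upstart gets involved. This is only a sketch; is_root and restart_service are hypothetical names, not part of Upstart or of the service command.

```shell
# Succeed if the given uid (default: the caller's) is root's.
is_root() {
    [ "${1:-$(id -u)}" -eq 0 ]
}

# Restart a service, refusing early with a plain message when not root.
restart_service() {
    if ! is_root; then
        echo "restart_service: must be run as root (try sudo)" >&2
        return 1
    fi
    service "$1" restart
}
```

In practice, running sudo service ssh restart is the shorter fix.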

SATA errors with flaky power cable

2010-06-15

Just for the record—a bad SATA power cable is capable of provoking intermittent errors such as the following:

ata4: exception Emask 0x10 SAct 0x0 SErr 0x4050002 action 0xe frozen
ata4: irq_stat 0x00400040, connection status changed
ata4: SError: { RecovComm PHYRdyChg CommWake DevExch }
ata4: hard resetting link
ata4: SATA link down (SStatus 0 SControl 300)
ata4: EH complete

Two chassis fans and a disk were sharing a single 4-pin molex connector from the power supply. The problem was that the pin sockets on the fan connectors were loose and misaligned, and one of them became dislodged when the disk connector was plugged in. Pushing the pins into place and reattaching the connectors solved the problem.
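To see whether a port keeps dropping its link, the kernel log can be tallied per ATA port. A minimal sketch, assuming messages in the format shown above, with or without a leading timestamp; count_link_resets is a made-up name.

```shell
# Read kernel log lines on stdin and print "<port> <count>" for every
# ATA port that logged a hard link reset.
count_link_resets() {
    awk 'match($0, /ata[0-9]+(\.[0-9]+)?: hard resetting link/) {
             port = substr($0, RSTART, RLENGTH)
             sub(/:.*/, "", port)      # keep only the "ataN" part
             n[port]++
         }
         END { for (p in n) print p, n[p] }' | sort
}
```

Typical use would be dmesg | count_link_resets. A count that keeps growing on one port is a hint to check that port's cabling before suspecting the disk.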

Too many filesystems

2010-05-17

While trying to explain something about filesystems the other day, I realised that there are too many different (but related) things that can be reasonably described by that term.

First, there's the general idea of a filesystem, discussed in every operating systems textbook, as an organisation of data into a hierarchy of named directories and files for persistent storage on disk. This is what people mean when they say "Store data in the filesystem".

Second, there's the specific protocol that defines the UNIX filesystem, with characteristics such as files being just a series of bytes, having case-sensitive names and certain kinds of metadata, using '/' as a path separator, and supporting various operations (open, read, close, …).

Third, there are the many different filesystem implementations, such as UFS, ext3, XFS—all programs that implement UNIX filesystem semantics but have their own features, characteristics, extensions, and on-disk layout of data. This is the level at which one may decide to use, say, a journalling filesystem for a certain purpose.

Next, the layout of data on a disk, as distinct from whatever program is used to read or write that data, is also called a filesystem. This is the sense in which people might say "The filesystem on /dev/sda1 is corrupted": the problem is (one hopes) not with the implementation, but with its instantiation on disk.

Finally, on a UNIX system, the filesystem may refer to the hierarchy of directories rooted at / and built up by mounting specific filesystems (in the "data on disk" sense) at various points on the tree. Thus, it is the union of the contents of its constituent filesystems.
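On Linux, the kernel's mount table shows several of these senses side by side: each line ties a device (the data on disk) to an implementation (the type) at a point in the single tree. A minimal sketch, assuming the usual /proc/mounts field order of device, mount point, and type; list_mounts is a made-up name.

```shell
# Print "<mount point> <type> <device>" for each mounted filesystem.
# $1 is the mount table to read (default: /proc/mounts).
list_mounts() {
    awk '{ print $2, $3, $1 }' "${1:-/proc/mounts}"
}
```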

These layers are usually taken for granted, but it is necessary to peel them away one by one to explain things properly.

It's a toroid, not a primate!

2010-05-12

My server acts as a secondary nameserver for primate.net, in which zone it is named ns.de.primate.net. I set that up long ago for a friend, and forgot all about it. Until now.

Imagine my surprise when I discovered the other day that Google crawled various pages on my site as "http://ns.de.primate.net/whatever", and was happily presenting (some of) those results in preference to their proper toroid.org versions.

It's true that the site is reachable as ns.de.primate.net, and Apache will—since it doesn't recognise that name—serve the default VirtualHost, which is toroid.org. But I can't imagine why Google ever decided to use that name. I've never used it in a URL, public or otherwise. As far as I know, it's never been used for anything but name service for primate.net (and certainly not in a PTR record for my server's address).

I hope Google doesn't take it upon itself to use any of the other names by which my server happens to be accessible. Just in case, I added the following as the first (i.e. default) VirtualHost in my httpd.conf. Now any request to a name that isn't explicitly configured will be redirected to toroid.org.

<VirtualHost *:80>
    Redirect permanent / http://toroid.org/
</VirtualHost>
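One way to confirm the redirect is to request the site under the unwanted name and inspect the response headers. The helper below is a sketch that only parses headers fed to it on stdin; redirects_to is a made-up name.

```shell
# Read HTTP response headers on stdin; succeed if they describe a 301
# redirect whose Location is exactly $1.
redirects_to() {
    awk -v want="$1" '
        { sub(/\r$/, "") }                       # strip CR from CRLF
        NR == 1 && $2 == "301" { moved = 1 }
        tolower($1) == "location:" && $2 == want { found = 1 }
        END { exit !(moved && found) }'
}
```

It could be fed by something like curl -sI -H 'Host: ns.de.primate.net' http://toroid.org/ piped into redirects_to http://toroid.org/.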

We can't have evil primates running around, after all.

Delivering mail to Hotmail

2010-04-22

I just helped a friend move mail service for a few domains from his old server to a new one running Postfix, Archiveopteryx, and Roundcube. The move went well, but for one thing: mail sent through the new server to Hotmail was accepted, but never delivered to the recipient's inbox, no matter how permissive the anti-spam settings. (Mail sent to Gmail and Yahoo worked fine.)

It took us a while to figure out what was going wrong. I had installed SpamAssassin (for the first time; I don't use it myself) as an SMTP content filter in Postfix, and it was adding spam evaluation fields to all mail, including mail relayed to the outside world for the system's users (who submit mail to Archiveopteryx, which forwards it to Postfix for delivery). SA added fields that said "this is not spam":

X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on example.com
X-Spam-Level: 
X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED,AWL
    autolearn=ham version=3.2.5

We had eliminated other possible causes (bad hostname/PTR configuration, DNS blacklists, bad SPF records), so disabling SpamAssassin for outgoing mail was the next thing to try (and I wanted to do it anyway, as soon as I found out it was happening). It was easy to do. I applied the content filter only to mail received over the external network interface, and added a new SMTP service definition without any filters for mail from localhost. So in /etc/postfix/master.cf, I have:

127.0.0.1:smtp inet n - - - - smtpd
A.B.C.D:smtp inet n - - - - smtpd -o content_filter=spamassassin

spamassassin unix - n n - - pipe
    user=spamd argv=/usr/bin/spamc -f -e
    /usr/sbin/sendmail -oi -f ${sender} ${recipient}

Fortunately, Hotmail started delivering mail from the new server to the recipient's inbox after this change.
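To check that outgoing mail no longer carries SpamAssassin's fields, a delivered test message can be inspected for X-Spam-* lines in its header section. A minimal sketch; has_spam_headers is a made-up name, and the message is expected on stdin.

```shell
# Read a mail message on stdin; succeed if its header section (everything
# before the first blank line) contains an X-Spam-* field.
has_spam_headers() {
    awk '/^\r?$/ { exit }                         # end of headers
         tolower($0) ~ /^x-spam-/ { found = 1; exit }
         END { exit !found }'
}
```

Only the headers are examined, so a message that merely quotes an X-Spam-Status line in its body does not count.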

I wonder if spammers forge negative SpamAssassin reports in their mail. And… does it ever fool anyone?

Git disaster recovery

2010-04-12

I typed git commit and git push, and a few seconds later, the mains power died. Normally, I wouldn't have noticed, but my trusty UPS is broken, so for the first time in many years, every power glitch makes its presence felt; and now, I can fully experience the joy of being bitten in the rear by Ext4's delayed allocation.

When my machine came up again, the newly-created commit object and some associated tree objects were corrupted. refs/heads/master pointed to that corrupted commit, so most git commands died with this error message:

$ git log
fatal: object 54590b644cb542d30ec962c138a763dddc26aac0 is corrupted

To my great good fortune, my git push had completed before the power failed, so I knew I could recover everything from the remote repository. I flailed around a little before finding out how, but here's what ultimately worked for me.

First, I kept running git fsck and deleting the objects it complained about:

$ git fsck
fatal: object 54590b644cb542d30ec962c138a763dddc26aac0 is corrupted
$ rm -f .git/objects/54/590b644cb542d30ec962c138a763dddc26aac0

Then I copied the corrupted objects back from the remote repository one by one, using a trick Sam Vilain showed me on IRC:

$ ssh remote.ho.st \
    "git cat-file commit 54590b644cb542d30ec962c138a763dddc26aac0" | \
    git hash-object -w -t commit --stdin

If I had deleted the corrupted objects and reset my HEAD to point to an older commit, a plain old git fetch should have retrieved the missing objects. I didn't think of that soon enough, and recovered the missing commit first, so git fetch thought everything was up to date. But fetching the objects one by one worked fine, and git fsck stopped complaining.
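The two steps above can be wrapped up as small helpers. This is a sketch under assumptions: the remote is a bare repository reachable over ssh, and the names object_path and copy_object are made up for illustration.

```shell
# Map an object name to the path of its loose object file.
object_path() {
    echo ".git/objects/$(echo "$1" | cut -c1-2)/$(echo "$1" | cut -c3-)"
}

# Copy one object from a remote repository into the local one.
# $1 = ssh host, $2 = path of the (bare) repository there, $3 = object name.
copy_object() {
    kind=$(ssh "$1" "git --git-dir='$2' cat-file -t $3") || return 1
    ssh "$1" "git --git-dir='$2' cat-file $kind $3" |
        git hash-object -w -t "$kind" --stdin
}
```

Asking the remote for the object's type first means the same helper works for commits, trees, and blobs. git hash-object prints the name of the object it wrote, which should match the name that was requested.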

I'm not sure what I would have been able to do if the remote repository had not been updated in time. I would almost certainly have lost the most recent commit, and perhaps also its immediate parent.

I really hope my UPS gets fixed soon.

Nokia 7210 as a GPRS modem under Linux

2010-02-21

I needed a GPRS-capable phone to use as a modem with our Lenovo S10 on a trip out of town, and after some research, Hassath and I bought the Nokia 7210 Supernova, which does GPRS and Bluetooth well enough (and has a host of features that we didn't care about). Here's a very brief report.

Our S10 runs Ubuntu 9.10, which detected a new "mobile broadband" connection when I plugged in the phone using the (absurdly short) included USB cable. To my surprise, it let me select my country and provider (Vodafone), and I was online in a few seconds with no fuss. Disappointed at the lack of an opportunity for heroic action, I tried Bluetooth next. Following some advice on the Ubuntu forum, I installed blueman, and… that just worked, too. I could detect the phone, pair with it, browse its filesystem; and if I activated dialup access, I could use the same mobile broadband connection as above. All of this took barely more than a minute.

While travelling, I noticed that the connection via Bluetooth sometimes had trouble with flaky GSM connectivity. If the phone lost coverage, the connection would die, and both devices would need to be rebooted to make them talk to each other again. But that happened only when we were on a train, hopping between towers. Other than that, things worked very well (at least, if I tried not to think about the INR5/MB usage charges).

One little quirk: when I activate dialup access in blueman, it pops up a window that says "The device Nokia 7210 Supernova does not appear to support GSM/CDMA. This connection will not work". But it does.

Ubuntu 9.10 on the Lenovo Ideapad S10

2009-11-05

I installed Ubuntu 9.10 from scratch on our Lenovo Ideapad S10 (which was running 8.10 earlier) some days ago, and I also had the opportunity to install it on a friend's new S10-2. There's very little to report in either case. The installation itself was perfectly ordinary.

When I booted up the first time, the Broadcom BCM4312 wireless interface didn't work. I knew it used wl.o under 8.10, but that file was nowhere to be found. A bit of research (which I really should have done before I reinstalled) showed that I needed to install the dkms and bcmwl-kernel-source packages, and the wireless interface worked fine thereafter. I was lucky to have no other hardware problems.
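A quick way to tell whether the Broadcom driver is present is to look for the wl module in the list of loaded modules. The helper below is a sketch that reads lsmod output on stdin; module_loaded is a made-up name.

```shell
# Read `lsmod` output on stdin; succeed if module $1 is loaded.
module_loaded() {
    awk -v m="$1" 'NR > 1 && $1 == m { found = 1 }   # NR > 1 skips the header
                   END { exit !found }'
}
```

For example, if lsmod piped into module_loaded wl fails, installing the two packages with sudo apt-get install dkms bcmwl-kernel-source is the next step.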

That apart, my first impressions are all positive. 9.10 really does boot up faster (35s vs. 55s for 8.10). The interface is also noticeably more responsive, but too many things have changed for me to try to isolate a cause. Everything seems to work nicely, without any need for tweaking. Suspend and resume continue to work correctly.

Upgrading also fixed the few niggling hardware-related problems we had. Tapping and scrolling with the touchpad works much better, and the audio problems are gone, including the excessive feedback (which I thought was due to a faulty microphone). Recording through the internal microphone works fine, and the built-in speakers are no longer inaudible. I don't know yet what effect (if any) the upgrade has had on battery life.

I'm very happy so far.