A photo album with nice URLs

November 14, 2007

I've always uploaded my photographs into an "img" or a "misc" directory, and handed out URLs like http://toroid.org/ams/misc/foo.jpeg to people. This has worked pretty well.

Occasionally, however, I would upload a group of related photographs, and, tiring quickly of cutting and pasting many URLs, end up writing a little HTML page with links to the images in question.

Eventually, I tired of that too.

Update 2015-10-31: I don't actually use this code any more. I've replaced it with something that works very similarly, and this version is no longer maintained.

What I wanted

I wanted to keep the best parts of the old system: having all the data in one place, being able to add things easily, and controlling access.

I like to keep all of these photographs in the same place; and, by not auto-indexing that directory, to limit access to those who have been given (or can guess) the URL of a specific image.

In addition, and with a minimum of effort, I wanted to be able to define albums, and I wanted to be able to put one photograph into more than one album, without needing to copy the image. I wanted to be able to add an optional comment to each album or photograph.

Above all, I wanted nice URLs for everything.

What I wrote

I wrote a small CGI program that can produce three kinds of pages based on the value of PATH_INFO: a list of albums if it is empty, a list of (links to) photographs if it names a known album, or a page to display a single photograph if it looks like "album/photo-name". If it is none of these, it returns a 404 status.

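$_ = $ENV{PATH_INFO} || "";
# Accept "", "/album", or "/album/photo-name"; anything else is a 404.
unless ( $_ eq "" ||
         (($album) = m{^/([\w\d.-]+)$}) ||
         (($album,$photo) = m{^/([\w\d.-]+)/([\w\d.-]+)$}) )
{
# The label lets other code jump here to send the same response.
NOTFOUND:
    print "Status: 404 File Not Found\r\n";
    # ...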
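
As a self-contained illustration of the same dispatch, here is a minimal sketch that classifies a PATH_INFO value by its shape alone. The function name classify_path and its return values are hypothetical, not part of the original program, and the check that an album is actually known is left out.

```perl
use strict;
use warnings;

# Classify a PATH_INFO value by its shape (hypothetical helper).
# Returns (kind), (kind, album) or (kind, album, photo), where kind
# is "index", "album", "photo" or "notfound".
sub classify_path {
    my ($path) = @_;
    $path = "" unless defined $path;

    return ("index") if $path eq "";
    if ( $path =~ m{^/([\w.-]+)$} ) {
        return ( "album", $1 );
    }
    if ( $path =~ m{^/([\w.-]+)/([\w.-]+)$} ) {
        return ( "photo", $1, $2 );
    }
    return ("notfound");
}

for my $path ( "", "/birds", "/birds/elanus-caeruleus", "/a/b/c" ) {
    my @result = classify_path($path);
    print "PATH_INFO <$path> gives: @result\n";
}
```

A real program would still have to look the album and photograph up in the data files, and fall back to the 404 response if either is missing.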

To render these pages, the program must find answers to three questions: What albums are defined? What photographs does each one contain? What do we know about a given photograph?

The answers are derived from albums.txt, which contains a list of albums and their contents, and photographs.txt, which lists optional extra information about each photograph, and the file name of the photograph itself.

The two data files are in an easily parsed format similar to that of the Linux kernel's CREDITS file. This is the code that parses photographs.txt and returns a list of name/record pairs, which the caller can assign to a hash that maps an image name to a title and description:

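sub fetch_photographs {
    my @photos;

    open( my $fh, "<", "$data/photographs.txt" )
        || die "Can't open $data/photographs.txt: $!";
    while ( <$fh> ) {
        chomp;
        # Skip comments and blank lines.
        next if /^#/ || /^\s*$/;
        # "N:" starts a new entry, with a default title from the name.
        if ( /^N: ([\w\d.-]+)$/ ) {
            push @photos,
                { name => $1, title => photo_title( $1 ),
                  description => "" };
        }
        # "T:" overrides the title of the current entry.
        elsif ( @photos && /^T: (.*)$/ ) {
            $photos[-1]->{title} = $1;
        }
        # "D:" lines accumulate into the description.
        elsif ( @photos && /^D: (.*)$/ ) {
            $photos[-1]->{description} .= "$1\n";
        }
    }
    close( $fh );

    # Key each record by its name, ready to assign to a hash.
    return map { $_->{name} => $_ } @photos;
}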

Yes, the program may need to parse both files to generate a page. That doesn't bother me, because I won't have too many albums or photographs, and more importantly, I can always store the data differently without changing the URLs. In the meantime, the two files are easy to manage.

Now, http://toroid.org/ams/img/img.cgi generates a list of albums, with links like http://toroid.org/ams/img/img.cgi/birds, which in turn generates a list of photographs, with links like http://toroid.org/ams/img/img.cgi/birds/elanus-caeruleus, which refers to the image http://toroid.org/ams/img/elanus-caeruleus.jpeg.

Nicer URLs

With a little hideous mod_rewrite magic (in .htaccess), we can eliminate img.cgi altogether from visible URLs.

Options FollowSymLinks ExecCGI
RewriteEngine on
RewriteBase /ams/img
RewriteRule img.cgi - [L]
RewriteRule ^$ img.cgi [L]
RewriteRule /img$ /ams/img/img.cgi [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*)$ img.cgi/$1

This arrangement allows me to pretend that albums are directories under img, and photographs are files in each directory (and that's how I want them to behave, even if I don't want to create the directories).

Thus, http://toroid.org/ams/img (with or without a trailing slash) is internally redirected to http://toroid.org/ams/img/img.cgi. Similarly, http://toroid.org/ams/img/birds and http://toroid.org/ams/img/birds/elanus-caeruleus are internally redirected to http://toroid.org/ams/img/img.cgi/birds and http://toroid.org/ams/img/img.cgi/birds/elanus-caeruleus, and any URL with img.cgi in it is processed without further ado. At the same time, if a real file named img/elanus-caeruleus.jpeg exists, a request for http://toroid.org/ams/img/elanus-caeruleus.jpeg is handled as a normal file, with no reference to img.cgi.
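
The mapping described above can be modelled in a few lines of Perl. This is only a sketch of the intended outcome, not of how mod_rewrite actually evaluates its rules; the function name internal_target and the $is_real_file flag (standing in for the !-f test) are assumptions made for illustration.

```perl
use strict;
use warnings;

# Model of the intended URL mapping (hypothetical helper).
# $path is the request path; $is_real_file says whether it names
# an existing file on disk.
sub internal_target {
    my ( $path, $is_real_file ) = @_;
    my $base = "/ams/img";

    # Anything already addressed to img.cgi is left alone.
    return $path if $path =~ m{^\Q$base\E/img\.cgi(?:/.*)?$};
    # The directory itself, with or without a trailing slash.
    return "$base/img.cgi" if $path =~ m{^\Q$base\E/?$};
    # Real files (such as the .jpeg images) are served as they are.
    return $path if $is_real_file;
    # Everything else under the base becomes PATH_INFO for img.cgi.
    if ( $path =~ m{^\Q$base\E/(.+)$} ) {
        return "$base/img.cgi/$1";
    }
    return $path;
}

for my $case ( [ "/ams/img/", 0 ], [ "/ams/img/birds", 0 ],
               [ "/ams/img/elanus-caeruleus.jpeg", 1 ] ) {
    my ( $path, $is_file ) = @$case;
    printf "%s => %s\n", $path, internal_target( $path, $is_file );
}
```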

What I can do now

I can still scp photographs into my image directory and give people a direct .jpeg URL. If I don't want to do any more work, I can leave it at that.

I can define an album by adding the following paragraph to albums.txt:

N: birds
D: See my <a href="http://toroid.org/ams/birds">page about birds</a>
D: for more details.
I: accipiter-badius.jpeg
I: anthus-rufulus.jpeg
I: elanus-caeruleus.jpeg
A: For much better photographs of birds, see
A: <a href="http://www.orientalbirdimages.org">Oriental Bird Images</a>.

This specifies the name of the album, an optional description and afterword in HTML, and lists the photographs which belong to the album. When I upload a new image, I can just add an I: line to this (or any other) album definition. Without it, that image will not be accessible through this album.
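
A parser for this format could look much like fetch_photographs. The following is a minimal sketch under assumptions: the name parse_albums, the choice to read from a string rather than from albums.txt, and the layout of the returned records are all hypothetical, since the original album-parsing code is not shown here.

```perl
use strict;
use warnings;

# Parse album definitions from a string (hypothetical helper).
# Returns a list of hash references, one per album, in file order.
sub parse_albums {
    my ($text) = @_;
    my @albums;

    for ( split /\n/, $text ) {
        # Skip comments and blank lines.
        next if /^#/ || /^\s*$/;
        if ( /^N: ([\w.-]+)$/ ) {
            push @albums,
                { name => $1, description => "", afterword => "",
                  images => [] };
        }
        elsif ( @albums && /^D: (.*)$/ ) {
            $albums[-1]{description} .= "$1\n";
        }
        elsif ( @albums && /^I: ([\w.-]+)$/ ) {
            push @{ $albums[-1]{images} }, $1;
        }
        elsif ( @albums && /^A: (.*)$/ ) {
            $albums[-1]{afterword} .= "$1\n";
        }
    }
    return @albums;
}

my @albums = parse_albums(<<'END');
N: birds
D: Some birds.
I: accipiter-badius.jpeg
I: elanus-caeruleus.jpeg
A: An afterword.
END

for my $album (@albums) {
    printf "%s: %d photographs\n", $album->{name},
        scalar @{ $album->{images} };
}
```

Keeping the I: lines in an array preserves their order, which is what a next/previous navigation would need.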

Once this is done, a URL like http://toroid.org/ams/img/birds will lead to a page that displays my comments, along with links to the individual photographs. The index page, http://toroid.org/ams/img, will contain a list of links to any albums I've defined.

At my leisure, I can add entries like the following to photographs.txt:

N: elanus-caeruleus.jpeg
T: Black-Shouldered Kite (Elanus caeruleus)
D: Photographed at Sultanpur, India.

This specifies the file name, the title of the page it should be displayed on, and an optional comment (again in HTML); but even if I never do this, a sensible default title ("Elanus caeruleus") will be inferred from the file name.
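
The photo_title function that infers the default is not shown in this article. One plausible way to do it, offered only as a guess, is to strip the extension, turn hyphens and underscores into spaces, and capitalise the first letter; the name default_title below is hypothetical.

```perl
use strict;
use warnings;

# Guess a page title from a file name (hypothetical helper; the
# original photo_title may well work differently).
sub default_title {
    my ($file) = @_;
    ( my $title = $file ) =~ s/\.[^.]+$//;    # drop the extension
    $title =~ tr/-_/ /;                       # separators become spaces
    return ucfirst $title;
}

my $title = default_title("elanus-caeruleus.jpeg");
print "$title\n";    # prints "Elanus caeruleus"
```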

Thus, a URL like http://toroid.org/ams/img/birds/elanus-caeruleus will lead to a sensibly titled page that displays http://toroid.org/ams/img/elanus-caeruleus.jpeg, and includes the comment, along with links to the next and previous images in the album, if applicable.
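
Finding the next and previous images only requires the album's ordered list of names. Here is a minimal sketch, assuming the album's images are available as an array reference; the function name neighbours is hypothetical.

```perl
use strict;
use warnings;

# Given an album's ordered image names and the current one, return
# (previous, next). Either may be undef at the ends of the album.
# Returns an empty list if the image is not in the album.
sub neighbours {
    my ( $images, $current ) = @_;

    for my $i ( 0 .. $#$images ) {
        next unless $images->[$i] eq $current;
        my $prev = $i > 0         ? $images->[ $i - 1 ] : undef;
        my $next = $i < $#$images ? $images->[ $i + 1 ] : undef;
        return ( $prev, $next );
    }
    return;
}

my @birds = qw(accipiter-badius anthus-rufulus elanus-caeruleus);
my ( $prev, $next ) = neighbours( \@birds, "anthus-rufulus" );
print "previous: $prev, next: $next\n";
```

Because the lookup is relative to one album's list, the same photograph can have different neighbours in each album it belongs to.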