Easy user authentication with HTTP

December 12, 2007

Many small CGI programs need little in the way of access control: anyone may read, but only certain users are allowed to write. Here's an unobtrusive way to implement this scheme.

This method is useful only for self-contained CGI programs that generate forms with links to themselves to implement operations like searching or editing a database. I use it successfully for my personal book database.

What I didn't want

I didn't want to complicate my simple program with authentication and cookie-based session management. The extra code needed would have given me many new opportunities to make mistakes and exposed me to the chore of managing a password database, all for something only tangentially relevant to my program.

A simpler option was to set up HTTP authentication to require a password for access to the CGI program. Then I could allow anyone to log in as "guest/guest", but give trusted editors their own username and password. I did not like this option, principally because it restricted read access unnecessarily, made it necessary to advertise the guest account, and forced everyone to interact with a browser popup for authentication.

What I did

Let's assume that my CGI program was named foo.cgi.

I made every form that could modify the database POST its contents to foo.cgi/edit. This separated the handling of reads and writes conceptually, yet writes were still handled by foo.cgi itself, which would be run with “PATH_INFO=/edit” in its environment.
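A form of this kind might be generated along the following lines. This is only a sketch: the helper names edit_action and edit_form and the "title" field are made up for illustration, and it assumes the web server sets the standard CGI variable SCRIPT_NAME to the script's own URL path (for example /foo.cgi).

```perl
use strict;
use warnings;

# Hypothetical helper: the URL that write forms should POST to, i.e. the
# script's own URL path (SCRIPT_NAME, set by the server) plus "/edit".
sub edit_action {
    my ($env) = @_;
    my $script = defined $env->{SCRIPT_NAME} ? $env->{SCRIPT_NAME} : "";
    return "$script/edit";
}

# Hypothetical form with one made-up field. The action URL is not
# HTML-escaped here, for brevity.
sub edit_form {
    my ($env) = @_;
    my $action = edit_action($env);
    return <<"HTML";
<form method="post" action="$action">
  <input type="text" name="title">
  <input type="submit" value="Save">
</form>
HTML
}
```

In the CGI program itself, the form would be emitted with something like print edit_form(\%ENV), while read-only forms such as a search form keep pointing at foo.cgi.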

To enforce this separation, I also added a few lines of code to reject any attempted writes unless they were posted to /edit:

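# $readonly is set elsewhere in the program; it is true when the
# request only reads from the database.
# PATH_INFO may be absent from the environment, so default it to "".
my $path_info = defined $ENV{PATH_INFO} ? $ENV{PATH_INFO} : "";
unless ( $readonly || $path_info eq "/edit" ) {
    print "Status: 403 Forbidden\r\n";
    print "Content-Type: text/plain\r\n\r\n";
    print "Write access is available only through foo.cgi/edit\r\n";
    exit;
}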

Once the URL served to distinguish reads from writes, I could require HTTP authentication for foo.cgi/edit, while allowing access to foo.cgi as usual. For example, with Apache:

<Location /foo.cgi/edit>
    AuthType basic
    AuthName foo
    AuthUserFile /usr/local/web/foo-passwords
    Require valid-user
</Location>

Now foo.cgi provides unrestricted read-only access, but anyone who tries to change something (by filling in and submitting a form) must authenticate successfully before the program is invoked (with the AUTH_TYPE and REMOTE_USER environment variables set).

As a precaution, I extended the test above to reject all unauthenticated writes: a write is accepted only if PATH_INFO is /edit and AUTH_TYPE is also set for the request. This makes the program read-only by default, for example if it is installed and made executable before HTTP authentication has been set up.
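The extended test could be written roughly as follows. This is a sketch rather than the exact code in my program: the helper name write_allowed is invented for illustration, and it treats a missing or empty AUTH_TYPE as "not authenticated".

```perl
use strict;
use warnings;

# Hypothetical helper: true only if the request was sent to /edit AND
# the web server performed HTTP authentication (AUTH_TYPE is non-empty).
sub write_allowed {
    my ($env) = @_;
    my $path_info = defined $env->{PATH_INFO} ? $env->{PATH_INFO} : "";
    my $auth_type = defined $env->{AUTH_TYPE} ? $env->{AUTH_TYPE} : "";
    return ( $path_info eq "/edit" && $auth_type ne "" ) ? 1 : 0;
}
```

In foo.cgi, the condition guarding the 403 response would then read unless ( $readonly || write_allowed(\%ENV) ), with $readonly as in the earlier test. If the Apache configuration is missing, AUTH_TYPE is never set, so every write is refused.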

With this scheme, foo.cgi needs few changes, and in particular, has no knowledge of users, passwords, or HTTP authentication. (Of course, it could use REMOTE_USER to perform more fine-grained authorization checks, but such complications have rapidly diminishing returns.)
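For completeness, a finer-grained check based on REMOTE_USER might look like the sketch below. Everything in it is hypothetical: the policy (any authenticated user may edit, only listed users may delete), the usernames, and the helper name user_may.

```perl
use strict;
use warnings;

# Hypothetical policy table: users who may delete records.
my %may_delete = map { $_ => 1 } qw(alice bob);

# Hypothetical helper: decide whether $user may perform $action.
# $user would come from $ENV{REMOTE_USER}, which the server sets
# after successful HTTP authentication.
sub user_may {
    my ($user, $action) = @_;
    return 0 unless defined $user && length $user;   # not authenticated
    return 1 unless $action eq "delete";             # edits open to all users
    return exists( $may_delete{$user} ) ? 1 : 0;     # deletes are restricted
}
```

A call such as user_may($ENV{REMOTE_USER}, "delete") would go alongside the write test. Even this small table is a second thing to maintain, which is the kind of complication I chose to avoid.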

Disadvantages

The principal disadvantage of this scheme is that it forces people to interact with the varied and dismal browser implementations of HTTP authentication. In this case, it is tolerable only because of its extreme simplicity, and because the number of people who need to authenticate is limited (e.g. just myself).

It is unfortunate that, with Apache, location-based authentication can't be configured in an .htaccess file near the CGI script; it must be in httpd.conf. (I understand why it isn't possible, but it's still unfortunate, especially since people are used to configuring HTTP authentication with .htaccess alone.)

This simple scheme doesn't scale beyond a few privileged editors. Once you have enough users that password management and recovery need to be automated, you have to maintain your own user database anyway. If you can foresee that situation, just use a cookie-based scheme from the start, and don't bother with HTTP authentication.