Tackling Apache zombies

Myths and legends laid to rest

The net is full of Apache knowledge. Tips and tricks, discussion fora, experts of all kinds, and innumerable "how-tos" and tutorials on a range of subjects. Some of these are worth reading; others may be otherwise.

Among all this wisdom are some myths and half-truths that are so well known as to be "common knowledge". They regularly need to be "un-explained" in the support fora. They typically have origins in the very early days of Apache and its predecessor the NCSA server, and were either true in the distant past, or appeared in tutorial examples which gathered momentum to become, in cargo cult terms, the One True Way. Now, zombie-like, they refuse to die.

So, let's tackle some of them here. I've selected four that can lead you into trouble if misunderstood. But before launching into them, here's a public service announcement, particularly for those who missed ApacheCon in Dublin for reasons of geography. Two more ApacheCon conferences are coming up shortly:

  1. Asia: Colombo, Sri Lanka, 14-17 August
  2. USA: Dallas, Texas, 9-13 October

Password Protection and .htaccess

One of the most pervasive myths around Apache is that password protection means .htaccess, and vice versa. While not entirely false, this belief is misleading, and often leads to configurations that are unnecessarily complex and slow. Apache itself may be guilty of confusing users by virtue of the naming scheme: the similarly-named htpasswd and htdigest programs are indeed concerned with password protection.

To understand the role of .htaccess, we need to take a look at Apache configuration. The majority of Apache's configuration can be set per-directory, so that http://www.example.com/example/ may behave entirely differently to http://www.example.com/, etc. This is based on the <Directory> container in httpd.conf, and its cousins <Location>, <Files>, etc. Password protection is just one of many aspects of configuration that can be determined per-directory.
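
As a rough sketch of how these containers scope configuration (the document root /var/www/example and the file name README are illustrative assumptions, and the server-status handler assumes mod_status is loaded):

  # Applies to a filesystem directory and everything beneath it
  <Directory "/var/www/example">
      Options Indexes
  </Directory>

  # Applies to a URL path, whether or not it maps onto the filesystem
  <Location "/server-status">
      SetHandler server-status
  </Location>

  # Applies to files by name, wherever they are found
  <Files "README">
      ForceType text/plain
  </Files>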

Now, the true role of .htaccess files is to enable some aspects of Apache configuration to be determined outside httpd.conf, so it can be delegated to users who have no privilege to mess with the core of the system. The AllowOverride directive determines what aspects of configuration can be set in .htaccess files, and AllowOverride is itself something that can be set per-directory.

Now, you may sometimes encounter instructions to create a .htaccess file, containing something like:

  AuthType Basic
  AuthName "My Application"
  AuthUserFile /path/to/my/userfile
  Require valid-user

together with setting

  AllowOverride AuthConfig

in httpd.conf to enable it.

This is harmful for two reasons. Firstly, it's complex: it introduces new and unnecessary issues concerning management of, permissions on, and (not least) security of, the .htaccess files themselves. Secondly, it's slow: as soon as you set AllowOverride to anything other than None, Apache has to look for and read the .htaccess files (possibly more than one, where subdirectories are involved) for every hit! Note that None only became the default in Apache 2.3.9; older versions default to All, so there you need to set AllowOverride None explicitly to avoid the overhead.

Just say no to .htaccess! Whatever goes in a .htaccess file can always go directly into a corresponding <Directory> stanza. The only people to use .htaccess should be end-users who want control without having to bug the server admin.
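
For instance, the .htaccess example above might be moved into httpd.conf along these lines. This is a sketch only: /path/to/my/application is a placeholder for wherever the protected content lives, in the same spirit as the userfile path.

  <Directory "/path/to/my/application">
      # No .htaccess lookups for this tree
      AllowOverride None
      AuthType Basic
      AuthName "My Application"
      AuthUserFile /path/to/my/userfile
      Require valid-user
  </Directory>

The password protection is the same, but the configuration is read when the server starts rather than on every hit.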

AddType Abuse

The AddType directive serves to associate a filename "extension" with a MIME type, so that the server sets the correct Content-Type header and the client knows how to handle the contents.

In the NCSA server and in Apache 1.0, this was extended in a rather dirty hack to create "magic" types that would invoke special processing on the server. Common instances were CGI scripts and server-side includes, and later PHP.

This hack became redundant with the introduction of the AddHandler directive in Apache 1.1 (1996). Now we have a clean distinction between handlers (server-side processing) and MIME types (which describe contents so that browsers and other clients can handle them correctly). But ten years on, the abuse of AddType to determine server-side processing continues to muddy the waters. This, perhaps more than any other Apache zombie, has advocates who really should know better.
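
To make the distinction concrete, here is a sketch of the zombie form next to its replacement, using CGI scripts as the example. The .cgi and .svg extensions are illustrative choices, the first two directives are alternatives rather than a pair, and CGI execution also needs ExecCGI enabled in Options for the directory concerned.

  # The zombie: a "magic" MIME type used to trigger processing on the server
  AddType application/x-httpd-cgi .cgi

  # The clean form, used instead: a handler for processing on the server ...
  AddHandler cgi-script .cgi

  # ... leaving AddType to describe contents to the client
  AddType image/svg+xml .svg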

Don't limit the limits!

Another password-protection myth that is mercifully near-dead concerns the use of <Limit ...> sections. These appeared in some early example documentation, from where they were widely copied out of context and grew into a myth.

The <Limit> container serves to say "this restriction applies only to the HTTP methods listed". So any method you don't list is left unrestricted. This is rarely what you want, although it does have valid uses. The most common valid case is where you use Limit alongside a matching LimitExcept to determine different access policies for different operations. WebDAV and Subversion commonly use this form, and provide good examples. For other password protection policies, you should normally just ignore it.
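
A sketch of the difference, assuming a hypothetical directory /path/to/dav and reusing the userfile from the earlier example. The widely copied form protects only the methods it names:

  # The zombie: GET and POST need a password;
  # PUT, DELETE and every other method are left unrestricted
  <Directory "/path/to/dav">
      AuthType Basic
      AuthName "My Application"
      AuthUserFile /path/to/my/userfile
      <Limit GET POST>
          Require valid-user
      </Limit>
  </Directory>

A WebDAV-style setup that really does want different policies for different operations might pair the two containers, here letting any authenticated user read while only a hypothetical user named editor may use the remaining methods:

  <Directory "/path/to/dav">
      AuthType Basic
      AuthName "My Application"
      AuthUserFile /path/to/my/userfile
      <Limit GET HEAD OPTIONS PROPFIND>
          Require valid-user
      </Limit>
      <LimitExcept GET HEAD OPTIONS PROPFIND>
          Require user editor
      </LimitExcept>
  </Directory>

If every method should get the same treatment, leave both containers out and put the Require line directly in the <Directory> section.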

The case-insensitive URL

The case-insensitivity myth goes something like "Apache on Windows is case-insensitive; on *X it's case-sensitive". And so it appears, though Apache's mod_speling (sic), or an AliasMatch or RewriteRule, can change this behaviour.

Underlying this is a half-truth: that URLs correspond to the filesystem. They may indeed do that, but this is no more than a convenient convention. URLs are in fact defined unambiguously by RFC2396 to be case-sensitive (in the "file" part). A server that treats them otherwise is in fact serving the same contents at multiple different URLs.

Servers that give the illusion of being case-insensitive are actively harmful to the 'net infrastructure. That's because agents such as caches and search engines MUST treat each URL as different. So creating multiple versions is in effect spamming them (and may get a site penalised in search engines that watch out for spamming).

To see the potential extent of this spam, note that each "case-insensitive" letter doubles the number of spammed URLs. So for example, an eight-letter "case-insensitive" URL is in fact 2^8 = 256 URLs, and at 20 letters it rises to 2^20, which is over a million!

And now the light is failing and I'm pulling the covers over my head – before the zombies get me... ®
