A look at Apache modules

Now for some real Apache code

As I write this, I'm on the Eurostar train, just returning from O'Reilly's OSCon (Open Source Conference) in Brussels. There were some fascinating insights there, and even my own talk generated some interesting discussion. Some of the delegates, including O'Reilly himself, are promoting open source ideas that go beyond software and into society more generally. I've touched on that in this very column before now, but a related argument that's new to me is that the open/closed debate in software could become largely sidelined as the industry focuses on software as a service (such as Google's offerings) more than as a product.

Anyway, I've been meaning for some time to give you an article on writing Apache modules. But first things first. If you're going to write modules, you'll need a proper Apache installation. A safe option is to download the latest release version (currently 2.2.3) from httpd.apache.org and install from that download. The adventurous could instead install the current development version from svn.apache.org. If you use a packaged build such as an rpm or deb, you'll probably need an "apache-dev" package as well as Apache itself.

What is a module?

As I'm sure I've said, Apache has a diverse developer and user profile, and we all have very different uses for it. Its approach to serving such a wide range of needs is based on a small core, together with a large number of modules. Most of what you get when you install Apache as a package, from apache.org or elsewhere, comprises the modules that perform Apache's standard functions. Even such a simple task as serving an "index.html" file involves various modules: for example, mod_dir to resolve the URL http://www.example.com/ to the file index.html, and mod_mime to determine its MIME type as "text/html" so the browser knows how to render it.
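
To make the mod_mime step a little more concrete, here is a minimal standalone sketch of an extension-to-MIME-type lookup. It is not Apache's code: the table contents, the fallback type, and the name lookup_mime_type are all assumptions made for illustration, and the real module is driven by configuration rather than a hard-coded table.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical lookup table, hard-coded here purely for illustration. */
struct mime_entry {
    const char *ext;
    const char *type;
};

static const struct mime_entry mime_table[] = {
    { "html", "text/html" },
    { "txt",  "text/plain" },
    { "png",  "image/png" },
};

/* Return the MIME type for a filename based on its last extension,
 * or a generic fallback when the extension is missing or unknown. */
const char *lookup_mime_type(const char *filename)
{
    assert(filename != NULL);

    const char *dot = strrchr(filename, '.');
    if (dot != NULL) {
        size_t n = sizeof mime_table / sizeof mime_table[0];
        for (size_t i = 0; i < n; i++) {
            if (strcmp(dot + 1, mime_table[i].ext) == 0) {
                return mime_table[i].type;
            }
        }
    }
    return "application/octet-stream";
}
```

With this table, lookup_mime_type("index.html") yields "text/html", while a name with no recognised extension falls through to the generic binary type.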

Just as Apache's standard functions are driven by modules, so we can write new modules to change its behaviour, or introduce entirely new capabilities. Some examples of the kind of things we can do with modules include:

  • A content generator module takes an HTTP request and generates a response, in the manner of, for example, a CGI or PHP script. The default handler simply serves up a static file from local disc, while others may implement a service such as XML-RPC, do custom processing, or (like mod_cgi) delegate the work to an external script.
  • A mapper module runs before content generation, and determines how a request will be processed. For example, mod_negotiation selects amongst different versions of a document (e.g. different languages) according to browser preferences, while mod_alias and mod_rewrite perform rule-based URL manipulation.
  • An authentication module ascertains the identity of a user. When used, it is usually accompanied by an authorization module, which determines whether the user is permitted the attempted operation.
  • A filter module transforms incoming and/or outgoing data. Filters may be chained arbitrarily, and are the building blocks for sophisticated processing and aggregation applications. They range from simple content manipulation such as server side includes, through compression, to SSL encryption, and include many of the most exciting third-party applications.
  • A service module may export an entirely new API and/or service for other modules. For example, mod_dbd manages SQL database connections, and mod_xmlns exports an API for namespace-based processing of XML.

A HelloWorld Module

Conceptually, the simplest type of module is the content generator or handler, whose role in Apache is directly equivalent to a CGI or PHP script. That is to say, it processes a request in whatever manner is required, and generates a response to return to the client. It is not required to deal with the details of the HTTP protocol, though (as with a script) it may do that, or any number of other things. Usually it's good to keep the content generator simple, and use other types of module for different tasks.

So in the spirit of simplicity, let's take a look at a minimal HelloWorld module. Note that, unlike a script, this doesn't live amongst our web documents, so we can't run it straight from the filesystem. We'll need to configure it using a directive such as SetHandler instead:

LoadModule helloworld_module modules/mod_helloworld.so
<Location /helloworld>
        SetHandler helloworld
</Location>

Here's a function to return a HelloWorld page to the client. The prototype is typical: it takes the request_rec (HTTP Request) object as a single argument, and returns an integer status code. The request_rec provides access to everything a handler might need (such as the variables available to a script) and also serves as an I/O descriptor, among other things:

/* Apache headers: request_rec, the response functions, and the hook API */
#include <httpd.h>
#include <http_protocol.h>
#include <http_config.h>

static int helloworld_handler(request_rec *r) {
  /* First, some housekeeping. */
  if (!r->handler || strcasecmp(r->handler, "helloworld") != 0) {
    /* r->handler wasn't "helloworld", so it's none of our business */
    return DECLINED;
  }

  if (r->method_number != M_GET) {
    /* We only accept GET and HEAD requests.
     * They are identical for the purposes of a content generator.
     * Returning an HTTP error code causes Apache to return an
     * error page (ErrorDocument) to the client.
     */
    return HTTP_METHOD_NOT_ALLOWED;
  }

  /* OK, we're happy with this request, so we'll return the response. */

  ap_set_content_type(r, "text/html");
  ap_rputs("<title>Hello World!</title> .... etc", r);

  /* We return OK to indicate that we have successfully processed
   * the request.  No further processing is required.
   */
  return OK;
}
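
If you want to convince yourself of the handler's decision logic without building against Apache, the same three-way outcome can be sketched standalone. Everything here is a stand-in: struct mock_request and mock_helloworld_status are hypothetical names, and the constants are defined locally to mirror the values I believe httpd.h uses, so treat this as an illustration rather than module code.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Local stand-ins, assumed to mirror the values in Apache's httpd.h. */
#define OK                      0
#define DECLINED                (-1)
#define HTTP_METHOD_NOT_ALLOWED 405
#define M_GET                   0
#define M_POST                  2

/* Hypothetical cut-down request: the real request_rec has many more fields. */
struct mock_request {
    const char *handler;   /* name set by SetHandler, or NULL */
    int method_number;     /* M_GET also covers HEAD in Apache */
};

/* Same checks as helloworld_handler, minus the actual output. */
int mock_helloworld_status(const struct mock_request *r)
{
    assert(r != NULL);

    if (!r->handler || strcasecmp(r->handler, "helloworld") != 0) {
        return DECLINED;                 /* not ours: let another handler try */
    }
    if (r->method_number != M_GET) {
        return HTTP_METHOD_NOT_ALLOWED;  /* ours, but the wrong method */
    }
    return OK;                           /* ours, and we would serve it */
}
```

A request with handler "helloworld" and method M_GET gives OK, the same handler with M_POST gives 405, and any other handler name (or none at all) gives DECLINED.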

So, that's our handler function. Now we need to hook it in to Apache's processing, so it will be run when we get a request for /helloworld. We use a special function that runs at server startup to register our handler with Apache:

static void helloworld_hooks(apr_pool_t *pool) {
  /* hook helloworld_handler in to Apache */
  ap_hook_handler(helloworld_handler, NULL, NULL, APR_HOOK_MIDDLE);
}
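
To see why returning DECLINED matters, it may help to picture what happens to the registered handlers at request time. The sketch below is a toy model, not Apache's hook implementation: register_handler, run_handlers, the fixed-size table and the two sample handlers are all hypothetical, and the real hook API also supports predecessor and successor lists, which I ignore here. It assumes, as I understand the handler hook to work, that handlers run in order until one returns something other than DECLINED.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OK             0
#define DECLINED       (-1)
#define HTTP_NOT_FOUND 404
#define HOOK_MIDDLE    10   /* illustrative ordering values */
#define HOOK_LAST      20
#define MAX_HOOKS      8

typedef int (*handler_fn)(const char *handler_name);

struct hook_entry {
    handler_fn fn;
    int order;
};

static struct hook_entry hooks[MAX_HOOKS];
static size_t hook_count = 0;

/* Insert a handler, keeping the table sorted by ascending order value. */
void register_handler(handler_fn fn, int order)
{
    assert(fn != NULL && hook_count < MAX_HOOKS);

    size_t i = hook_count;
    while (i > 0 && hooks[i - 1].order > order) {
        hooks[i] = hooks[i - 1];
        i--;
    }
    hooks[i].fn = fn;
    hooks[i].order = order;
    hook_count++;
}

/* Call each handler in turn; the first non-DECLINED status wins. */
int run_handlers(const char *handler_name)
{
    for (size_t i = 0; i < hook_count; i++) {
        int status = hooks[i].fn(handler_name);
        if (status != DECLINED) {
            return status;
        }
    }
    return DECLINED;
}

/* Sample handler: only claims requests configured for "helloworld". */
int hello_handler(const char *handler_name)
{
    return strcmp(handler_name, "helloworld") == 0 ? OK : DECLINED;
}

/* Sample catch-all that runs last and rejects whatever is left. */
int fallback_handler(const char *handler_name)
{
    (void)handler_name;
    return HTTP_NOT_FOUND;
}
```

Registering fallback_handler with HOOK_LAST and hello_handler with HOOK_MIDDLE, in either order, means a "helloworld" request is answered by hello_handler, while anything else is declined by it and falls through to the catch-all.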

This hooks function is itself part of the module object. For most modules, this is the only symbol exported and visible to other modules or the core:

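module AP_MODULE_DECLARE_DATA helloworld_module = {
  STANDARD20_MODULE_STUFF,
  NULL,              /* create per-directory config: not needed */
  NULL,              /* merge per-directory config: not needed */
  NULL,              /* create per-server config: not needed */
  NULL,              /* merge per-server config: not needed */
  NULL,              /* configuration directives: none */
  helloworld_hooks   /* register our hooks at startup */
};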

And that's all! We have a complete HelloWorld module. We can use apxs, a compiler-wrapper that is part of the Apache installation, to compile and (as root) install it:

$ apxs -c mod_helloworld.c
# apxs -ie mod_helloworld.la

Well, as usual I'm over the 1000 words, so I'll bring this first look at modules to a close. For further information, stay tuned. If you're seriously interested, my book is now in production with the publisher, so for the first time you have more than just the source code and a handful of ad-hoc materials to help upgrade your LAMP and application server skills!
