A look at Apache modules

Now for some real Apache code

Column As I write this, I'm on the Eurostar train, returning from O'Reilly's OSCon (Open Source Conference) in Brussels. There were some fascinating insights there, and even my own talk generated some interesting discussion. Some of the delegates, including O'Reilly himself, are promoting open source ideas that go beyond software and into society more generally. I've touched on that in this very column before now, but a related argument that's new to me is that the open/closed debate in software could become largely sidelined, as the industry focuses on software as a service (such as Google's offerings) more than as a product.

Anyway, I've been meaning for some time to give you an article on writing Apache modules. But first things first. If you're going to write modules, you'll need a proper Apache installation. A safe option is to download the latest release version (currently 2.2.3) from httpd.apache.org, and build and install it from that source. If you're feeling adventurous, you could install the current development version from svn.apache.org. If you use a different package such as an rpm or deb, you'll probably need an "apache-dev" package as well as Apache itself.

What is a module?

As I'm sure I've said, Apache has a diverse developer and user profile, and we all have very different uses for it. Its approach to serving this wide range of needs is a small core, together with a large number of modules. Most of what you get when you install Apache as a package, from apache.org or elsewhere, comprises the modules that perform Apache's standard functions. Even such a simple task as serving an "index.html" file involves various modules: for example, mod_dir to resolve the URL http://www.example.com/ to the file index.html, and mod_mime to determine its MIME type as "text/html" so the browser knows how to render it.

Just as Apache's standard functions are driven by modules, so we can write new modules to change its behaviour, or introduce entirely new capabilities. Some examples of the kind of things we can do with modules include:

  • A content generator module takes an HTTP request and generates a response, in the manner of, for example, a CGI or PHP script. The default handler simply serves up a static file from local disc, while others may implement a service such as XML-RPC, do custom processing, or (like mod_cgi) delegate the work to an external script.
  • A mapper module runs before content generation, and determines how a request will be processed. For example, mod_negotiation selects amongst different versions of a document (e.g. different languages) according to browser preferences, while mod_alias and mod_rewrite perform rule-based URL manipulation.
  • An authentication module ascertains the identity of a user. When used, it is usually accompanied by an authorization module, which determines whether the user is permitted the attempted operation.
  • A filter module transforms incoming and/or outgoing data. Filters may be chained arbitrarily, and are the building blocks for sophisticated processing and aggregation applications. They range from simple content manipulation such as server side includes, through compression, to SSL encryption, and include many of the most exciting third-party applications.
  • A service module may export an entirely new API and/or service for other modules. For example, mod_dbd manages SQL database connections, and mod_xmlns exports an API for namespace-based processing of XML.
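To make the "filters may be chained arbitrarily" point concrete, here is a toy model in plain C. It is only a sketch of the idea: real Apache filters work on bucket brigades through the filter API, not on plain strings, and every name below (toy_filter_fn, toy_upcase, toy_trailer, toy_run_chain) is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a filter: transforms a buffer in place.
 * Real Apache filters operate on bucket brigades, not plain strings. */
typedef void (*toy_filter_fn)(char *buf, size_t cap);

/* Toy filter: upper-case every character. */
static void toy_upcase(char *buf, size_t cap) {
  (void)cap; /* the length never changes, so the capacity is unused */
  for (char *p = buf; *p != '\0'; ++p) {
    *p = (char)toupper((unsigned char)*p);
  }
}

/* Toy filter: append a trailer, but only if it fits in the buffer. */
static void toy_trailer(char *buf, size_t cap) {
  static const char trailer[] = " [filtered]";
  size_t len = strlen(buf);
  if (len + sizeof(trailer) <= cap) {
    memcpy(buf + len, trailer, sizeof(trailer)); /* copies the NUL too */
  }
}

/* Run each filter in turn over the same data. */
static void toy_run_chain(const toy_filter_fn *chain, size_t n,
                          char *buf, size_t cap) {
  for (size_t i = 0; i < n; ++i) {
    chain[i](buf, cap);
  }
}
```

Because each filter sees only the output of the one before it, the order of the chain matters: running toy_upcase before toy_trailer leaves the trailer in lower case, while the reverse order upper-cases the trailer as well.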

A HelloWorld Module

Conceptually, the simplest type of module is the content generator or handler, whose role in Apache is directly equivalent to a CGI or PHP script. That is to say, it processes a request in whatever manner is required, and generates a response to return to the client. It is not required to deal with the details of the HTTP protocol, though (as with a script) it may do that, or any number of other things. Usually it's good to keep the content generator simple, and use other types of module for different tasks.

So in the spirit of simplicity, let's take a look at a minimal HelloWorld module. Note that, unlike a script, this doesn't live amongst our web documents, so we can't run it straight from the filesystem. We'll need to configure it using a directive such as SetHandler instead:

LoadModule helloworld_module modules/mod_helloworld.so
<Location /helloworld>
        SetHandler helloworld
</Location>

Here's a function to return a HelloWorld page to the client. The prototype is typical: it takes the request_rec (HTTP Request) object as a single argument, and returns an integer status code. The request_rec provides access to everything a handler might need (such as the variables available to a script) and also serves as an I/O descriptor, among other things:

#include <httpd.h>
#include <http_protocol.h>
#include <http_config.h>

static int helloworld_handler(request_rec *r) {
  /* First, some housekeeping. */
  if (!r->handler || strcasecmp(r->handler, "helloworld") != 0) {
    /* r->handler wasn't "helloworld", so it's none of our business */
    return DECLINED;
  }

  if (r->method_number != M_GET) {
    /* We only accept GET and HEAD requests.
     * They are identical for the purposes of a content generator:
     * Apache gives a HEAD request the method number M_GET too.
     * Returning an HTTP error code causes Apache to return an
     * error page (ErrorDocument) to the client.
     */
    return HTTP_METHOD_NOT_ALLOWED;
  }

  /* OK, we're happy with this request, so we'll return the response. */

  ap_set_content_type(r, "text/html");
  ap_rputs("<title>Hello World!</title> .... etc", r);

  /* we return OK to indicate that we have successfully processed
   * the request.  No further processing is required.
   */
  return OK;
}
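The three possible outcomes of the handler's housekeeping can be tried out away from the server. The following is a sketch only: toy_request is a hypothetical cut-down stand-in for request_rec holding just the two fields the checks read, and the TOY_ constants are local stand-ins rather than the definitions from httpd.h.

```c
#include <stddef.h>
#include <strings.h> /* strcasecmp (POSIX) */

/* Local stand-ins for the status codes the handler returns. */
enum { TOY_OK = 0, TOY_DECLINED = -1, TOY_HTTP_METHOD_NOT_ALLOWED = 405 };

/* Illustrative method numbers; only "GET or not GET" matters here. */
enum { TOY_M_GET = 0, TOY_M_POST = 2 };

/* Hypothetical cut-down request: only the fields the checks read. */
typedef struct {
  const char *handler;   /* name set by SetHandler, or NULL */
  int method_number;     /* numeric request method */
} toy_request;

/* Same decision logic as the housekeeping in helloworld_handler. */
static int toy_helloworld_status(const toy_request *r) {
  if (!r->handler || strcasecmp(r->handler, "helloworld") != 0) {
    return TOY_DECLINED;                /* not ours: let others try */
  }
  if (r->method_number != TOY_M_GET) {
    return TOY_HTTP_METHOD_NOT_ALLOWED; /* ours, but wrong method */
  }
  return TOY_OK;                        /* ours, and we will answer */
}
```

A request whose handler name is "helloworld" (in any letter case) with the GET method number yields TOY_OK; the same name with a POST yields the 405 status; any other handler name, or none at all, yields TOY_DECLINED so that some other module can deal with the request.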

So, that's our handler function. Now we need to hook it in to Apache's processing, so it will be run when we get a request for /helloworld. We use a special function that runs at server startup to register our handler with Apache:

static void helloworld_hooks(apr_pool_t *pool) {
  /* hook helloworld_handler in to Apache */
  ap_hook_handler(helloworld_handler, NULL, NULL, APR_HOOK_MIDDLE);
}
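To see why a handler returns DECLINED for requests it doesn't want, and what an ordering hint such as APR_HOOK_MIDDLE is for, here is a toy model of a hook table. It is a sketch under loud assumptions: Apache's real hook machinery is considerably more elaborate, and all the toy_ names and numeric ordering values below are invented for illustration.

```c
#include <stddef.h>
#include <string.h>

enum { TOY_OK = 0, TOY_DECLINED = -1, TOY_HTTP_NOT_FOUND = 404 };

/* Ordering hints. The names echo APR_HOOK_FIRST/MIDDLE/LAST, but the
 * numeric values here are arbitrary. */
enum { TOY_HOOK_FIRST = 0, TOY_HOOK_MIDDLE = 10, TOY_HOOK_LAST = 20 };

typedef int (*toy_handler_fn)(const char *handler_name);

typedef struct {
  toy_handler_fn fn;
  int order;
} toy_hook;

#define TOY_MAX_HOOKS 8
static toy_hook toy_hooks[TOY_MAX_HOOKS];
static size_t toy_n_hooks = 0;

/* Register a handler, keeping the table sorted by ascending order.
 * Returns 0 on success, -1 if the table is full. */
static int toy_hook_handler(toy_handler_fn fn, int order) {
  if (toy_n_hooks == TOY_MAX_HOOKS) {
    return -1;
  }
  size_t i = toy_n_hooks;
  while (i > 0 && toy_hooks[i - 1].order > order) {
    toy_hooks[i] = toy_hooks[i - 1]; /* shift later entries up */
    --i;
  }
  toy_hooks[i].fn = fn;
  toy_hooks[i].order = order;
  ++toy_n_hooks;
  return 0;
}

/* Call handlers in order until one returns something other than
 * TOY_DECLINED; that value is the outcome for the request. */
static int toy_run_handlers(const char *handler_name) {
  for (size_t i = 0; i < toy_n_hooks; ++i) {
    int rv = toy_hooks[i].fn(handler_name);
    if (rv != TOY_DECLINED) {
      return rv;
    }
  }
  return TOY_DECLINED;
}

/* Accepts only requests configured for "helloworld". */
static int toy_hello(const char *name) {
  return (name && strcmp(name, "helloworld") == 0) ? TOY_OK : TOY_DECLINED;
}

/* Catch-all of last resort: never declines. */
static int toy_catch_all(const char *name) {
  (void)name;
  return TOY_HTTP_NOT_FOUND;
}
```

If toy_catch_all is registered with TOY_HOOK_LAST and toy_hello with TOY_HOOK_MIDDLE, then toy_hello is consulted first whatever the order of registration: a "helloworld" request gets TOY_OK from it, and anything else falls through to the catch-all.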

This hooks function is itself referenced from the module object. For most modules, the module object is the only symbol exported and visible to other modules or the core:

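module AP_MODULE_DECLARE_DATA helloworld_module = {
  STANDARD20_MODULE_STUFF,
  NULL,             /* create per-directory config */
  NULL,             /* merge per-directory config */
  NULL,             /* create per-server config */
  NULL,             /* merge per-server config */
  NULL,             /* configuration directives (command table) */
  helloworld_hooks  /* register hooks */
};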

And that's all! We have a complete HelloWorld module. We can use apxs, a compiler-wrapper that is part of the Apache installation, to compile and (as root) install it:

$ apxs -c mod_helloworld.c
# apxs -ie mod_helloworld.la

Well, as usual I'm over the 1000 words, so I'll bring this first look at modules to a close. For further information, stay tuned. If you're seriously interested, my book is now in production with the publisher, so for the first time you have more than just the source code and a handful of ad-hoc materials to help upgrade your LAMP and application server skills!
