The Register® — Biting the hand that feeds IT

Feeds

A look at Apache modules

Now for some real Apache code

Regcast training : Hyper-V 3.0, VM high availability and disaster recovery

Column As I write this, I'm on the Eurostar train, just returning from O'Reilly's OSCon (Open Source Conference) in Brussels. Some fascinating insights there; and even my own talk generated some interesting discussion. Some of the delegates, including O'Reilly himself, are promoting opensource ideas going beyond software and into society more generally. I've touched on that in this very column before now, but a related argument that's new to me is that the open/closed debate in software could become largely sidelined, as the industry focuses on software as a service (such as Google's offerings) more than as a product.

Anyway, I've been meaning for some time to give you an article on writing Apache modules. But first things first. If you're going to write modules, you'll need a proper Apache installation. A safe option is to download the latest release version (currently 2.2.3) from httpd.apache.org, and install it from that. For the adventurous, you could install the current development version from svn.apache.org. If you use a different package such as an rpm or deb, you'll probably need an "apache-dev" package as well as Apache itself.

What is a module?

As I'm sure I've said, Apache has a diverse developer and user profile, and we all have very different uses for it. This approach to serving a wide range of needs is based on a small core, together with a large number of modules. Most of what you get when you install Apache as a package, from apache.org or elsewhere, comprises the modules that perform Apache's standard functions. Even such a simple task as serving an "index.html" file involves various modules: for example, mod_dir to resolve the URL http://www.example.com/ to the file index.html, and mod_mime to determine its MIME type as "text/html" so the browser knows how to render it.

Just as Apache's standard functions are driven by modules, so we can write new modules to change its behaviour, or introduce entirely new capabilities. Some examples of the kind of things we can do with modules include:

  • A content generator module takes an HTTP request and generates a response, in the manner of, for example, a CGI or PHP script. The default handler simply serves up a static file from local disc, while others may implement a service such as XML-RPC, do custom processing, or (like mod_cgi) delegate the work to an external script.
  • A mapper module runs before content generation, and determines how a request will be processed. For example, mod_negotiation selects amongst different versions of a document (e.g. different languages) according to browser preferences, while mod_alias and mod_rewrite perform rule-based URL manipulation.
  • An authentication module ascertains the identity of a user. When used, it is usually accompanied by an authorization module, which determines whether the user is permitted the attempted operation.
  • A filter module transforms incoming and/or outgoing data. Filters may be chained arbitrarily, and are the building blocks for sophisticated processing and aggregation applications. They range from simple content manipulation such as server side includes, through compression, to SSL encryption, and include many of the most exciting third-party applications.
  • A service module may export an entirely new API and/or service for other modules. For example, mod_dbd manages SQL database connections, and mod_xmlns exports an API for namespace-based processing of XML.

A HelloWorld Module

Conceptually, the simplest type of module is the content generator or handler, whose role in Apache is directly equivalent to a CGI or PHP script. That is to say, it processes a request in whatever manner is required, and generates a response to return to the Client. It is not required to deal with the details of the HTTP protocol, though (as with a script) it may do that, or any number of other things. Usually it's good to keep the content generator simple, and use other types of module for different tasks.

So in the spirit of simplicity, let's take a look at a minimal HelloWorld module. Note that, unlike a script, this doesn't live amongst our web documents, so we can't run it straight from the filesystem. We'll need to configure it using a directive such as SetHandler instead:

LoadModule helloworld_module modules/mod_helloworld.so
<Location /helloworld>
        SetHandler helloworld
</Location>

Here's a function to return a HelloWorld page to the client. The prototype is typical: it takes the request_rec (HTTP Request) object as a single argument, and returns an integer status code. The request_rec provides access to everything a handler might need (such as the variables available to a script) and also serves as an I/O descriptor, among other things:

static int helloworld_handler(request_rec *r) {
  /* First, some housekeeping. */
  if (!r->handler || strcasecmp(r->handler, "helloworld") != 0) {
    /* r->handler wasn't "helloworld", so it's none of our business */
    return DECLINED;
  }

  if (r->method_number != M_GET) {
    /* We only accept GET and HEAD requests.
     * They are identical for the purposes of a content generator
     * Returning an HTTP error code causes Apache to return an
     * error page (ErrorDocument) to the client.
     */
    return HTTP_METHOD_NOT_ALLOWED;
  }

  /* OK, we're happy with this request, so we'll return the response. */

  ap_set_content_type(r, "text/html");
  ap_rputs("<title>Hello World!</title> .... etc", r);

  /* we return OK to indicate that we have successfully processed
   * the request.  No further processing is required.
   */
  return OK;
}

So, that's our handler function. Now we need to hook it in to Apache's processing, so it will be run when we get a request for /helloworld. We use a special function that runs at server startup to register our handler with Apache:

static void helloworld_hooks(apr_pool_t *pool) {
  /* hook helloworld_handler in to Apache */
  ap_hook_handler(helloworld_handler, NULL, NULL, APR_HOOK_MIDDLE);
}

This hooks function is itself part of the module object. For most modules, this is the only symbol exported and visible to other modules or the core:

module AP_MODULE_DECLARE_DATA helloworld_module = {
  STANDARD20_MODULE_STUFF,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  helloworld_hooks
};

And that's all! We have a complete HelloWorld module. We can use apxs, a compiler-wrapper that is part of the Apache installation, to compile and (as root) install it:

$ apxs -c mod_helloworld.c
# apxs -ie mod_helloworld.la

Well, as usual I'm over the 1000 words, so I'll bring this first look at modules to a close. For further information, stay tuned. If you're seriously interested, my book is now in production with the publisher, so for the first time you have a more than just the source code and a handful of ad-hoc materials to help upgrade your LAMP and application server skills!

Agentless Backup is Not a Myth

Latest Comments

PHP 5 Module

We demonstrated the loading of the PHP 5 module in the PHP/DB2 article.

# For PHP 5

LoadModule php5_module "C:/PHP/php5apache2.dll"

AddType application/x-httpd-php .php

http://www.regdeveloper.co.uk/2006/08/09/db2_udb_part2/

0
0

Documentation Severely Lacking

"So for the first time you have a more than just the source code and a handful of ad-hoc materials to help upgrade your LAMP and application server skills!"

Very true! The Apache web server is very powerful and extensible however when I first came to implement a module I found the documentation seriously lacking for anything but the basic cases.

0
0

More from The Register

Bjarne Again: Hallelujah for C++
Plus: Now officially OK to admit you never used STL algorithms
Nuke plants to rely on PDP-11 code UNTIL 2050!
Programmers and their walking sticks converge in Canada
Interwebs taunt Sir Jony over Apple eye candy makeover
Hey Ive, Ive... add more unicorns, willya?
SCO vs. IBM battle resumes over ownership of Unix
Zombie lawsuit back and wants to suck the brains out of Linux
Red Hat to ditch MySQL for MariaDB in RHEL 7
So long, Oracle! Don't let the door hit you on the way out
Shy? Socially inadequate? Fiddling with your phone could help
App 'tells the brutal truth' about social inadequates' chatup lines
Java EE 7 melds HTML5 with enterprise apps
New release arrives with GlassFish, NetBeans support
 breaking news
'Office Facebook' firm Tibbr wants you to PAY for mobe-meetings app
Great idea. Punters won't cough for it though
 breaking news
The only Waze is Google: Ad giant tipped to gobble map app 'for $1.3bn'
Pac-Man-satnav-ish upstart in bidding war with Apple, Facebook
 breaking news
PM Cameron calls for modern, programmable computers! (We think)
IT education musings to G8 chiefs to mystify IT industry