Apache 2.2: new goodies from an old friend
Our Apache wiz looks at what is new in version 2...
Of course, you know this anyway. But just for the benefit of any long-term residents of Planet Amnesia, Apache is the software that powers most servers on the web - including, naturally, El Reg. And in December, Apache marked its tenth birthday with its first major new release in a little over three and a half years.
So, what's changed in 2.2? Well, the good news is a bunch of improvements. And the good news is a painless upgrade path from the previous version. It's an incremental update from Apache 2.0: it adds new features and consolidates existing capabilities, but preserves the underlying architecture and API. And after over a year in public alpha and beta testing, users can rest assured of its stability. Applications written for Apache 2.0 will need to be recompiled, but source code changes (if required at all) might amount to an hour's work. Or ten minutes, for modules after your first.
Scaling it up and up
Some of the most exciting changes serve to make Apache altogether more scalable for the most demanding users. That's not to say it wasn't already scalable: administrators of busy sites, such as news sites and big download sites, report a single server sustaining well over 20,000 concurrent connections at full performance.
The Event MPM decouples the client connection from the server's request processing (worker) threads. That means worker threads are not tied up idle while connections sit in the HTTP Keepalive state, which eliminates a conflict between the server's internal efficiency and the efficiency of client connections. Users most likely to benefit are those serving static or near-static content (so the server can comfortably service huge numbers of requests) and pages with lots of included content (images, scripts, etc.), where keepalive is a major gain.
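To make the keepalive point concrete, here is a toy model in Python. It is not how the Event MPM is implemented: it only counts how many worker threads are held at once under two policies. The function name, the event names and the three-connection timeline are all invented for illustration.

```python
def peak_workers(events, hold_during_keepalive):
    """Return the largest number of workers held at any one moment.

    events is an ordered list of (connection_id, event) pairs, where event
    is one of "open", "request", "done" or "close" (hypothetical names).
    """
    if hold_during_keepalive:
        # One worker per connection, held for the connection's whole life.
        start, end = "open", "close"
    else:
        # A worker is held only while a request is actually being processed.
        start, end = "request", "done"

    busy = set()
    peak = 0
    for conn, event in events:
        if event == start:
            busy.add(conn)
        elif event == end:
            busy.discard(conn)
        peak = max(peak, len(busy))
    return peak


# Three keepalive connections whose requests never overlap.
TIMELINE = [
    ("a", "open"), ("b", "open"), ("c", "open"),
    ("a", "request"), ("a", "done"),
    ("b", "request"), ("b", "done"),
    ("c", "request"), ("c", "done"),
    ("a", "request"), ("a", "done"),
    ("a", "close"), ("b", "close"), ("c", "close"),
]
```

With this timeline, holding a worker for the life of each connection needs three workers at peak, while holding one only during request processing needs one. The gap widens as the proportion of idle keepalive time grows.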
Readers of a certain age will recollect a time around the late 1980s, when every PC software application came on a huge pile of floppy discs, 95 per cent of which was taken up by drivers for every possible printer. Eventually the situation was resolved when a single printing interface relieved applications of an unnecessary burden. The DBD framework does the same for database applications in Apache. Instead of every module - Perl, Python, PHP, Authentication, Logging, MyCustomApplication - maintaining its own database connection, Apache maintains a pool of persistent connections, available to whatever needs them. Even when there's only one database application, connection pooling is a major improvement in efficiency compared to maintaining one connection per process or thread, as in old-fashioned LAMP. Databases currently supported are PostgreSQL, MySQL, SQLite and Oracle.
Back in August, El Reg asked of LAMP: "But, what happens if the 'P' part of the stack is losing developers and evaporates?" With or without scripting languages, DBD provides the foundations for a new generation of Web/Database applications.
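The connection pooling idea can be sketched in a few lines of Python. This is a minimal illustration of a shared pool of persistent connections, not the DBD API: the ConnectionPool class, the demo function and the hits table are all hypothetical, and SQLite from the standard library stands in for whichever database you use.

```python
import os
import queue
import sqlite3
import tempfile
from contextlib import contextmanager


class ConnectionPool:
    """A fixed-size pool of persistent connections shared by all callers."""

    def __init__(self, connect, size):
        self.opened = 0
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())  # open each connection once, up front
            self.opened += 1

    @contextmanager
    def acquire(self, timeout=5.0):
        conn = self._idle.get(timeout=timeout)  # wait for a free connection
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand it back instead of closing it


def demo():
    """Run five operations through a pool of two connections."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    pool = ConnectionPool(lambda: sqlite3.connect(path), size=2)

    with pool.acquire() as conn:
        conn.execute("CREATE TABLE hits (path TEXT)")
        conn.commit()

    for url in ["/", "/news", "/about"]:
        with pool.acquire() as conn:
            conn.execute("INSERT INTO hits (path) VALUES (?)", (url,))
            conn.commit()

    with pool.acquire() as conn:
        rows = conn.execute("SELECT COUNT(*) FROM hits").fetchone()[0]
    return rows, pool.opened
```

Every caller borrows a connection and returns it, so the number of connections opened is set by the pool size rather than by the number of modules or requests.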
Yet More Applications Support
Two of the interesting new features in Apache 2.0 were the proxy framework and the filter chain. The proxy is interesting because it is itself a modular, extensible architecture. The filter chain is even more interesting, because it enables highly efficient chaining of different data processing and transformations in the server's I/O. Putting the two together, we have a whole new class of applications that has flourished since 2002: the content-transforming proxy. Both these areas have seen exciting new developments in Apache 2.2.
Let's start with filters. The fundamentals are all there in Apache 2.0, but configuring them could be a pain with proxied (or even local dynamic) content. The basic issue is that the proxy doesn't know what the response from the backend is going to be until that response arrives. By the time that information is available, it's too late to configure the filter chain to deal with that particular content. For example, if we have an XML filter, we don't want to process images through it, and if we get compressed XML content we'll need to uncompress it before we can apply the XML filter. A new module, mod_filter, configures the filter chain dynamically according to the actual content seen.
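The idea of choosing filters only once the response headers are known can be sketched as follows. This is a Python illustration of the principle, not of mod_filter's configuration or internals: build_chain, run and the two filter functions are hypothetical, and the "XML filter" is a trivial stand-in transformation.

```python
import gzip


def gunzip(body):
    """Undo gzip compression so later filters see plain content."""
    return gzip.decompress(body)


def xml_filter(body):
    """Stand-in for a real XML transformation: rename one element."""
    return body.replace(b"<old>", b"<new>").replace(b"</old>", b"</new>")


def build_chain(headers):
    """Pick the filters to run from the response headers actually seen."""
    media_type = headers.get("Content-Type", "").split(";")[0].strip()
    if media_type not in ("text/xml", "application/xml"):
        return []  # images and everything else pass through untouched
    chain = []
    if headers.get("Content-Encoding") == "gzip":
        chain.append(gunzip)  # must come before the XML filter
    chain.append(xml_filter)
    return chain


def run(headers, body):
    """Apply the dynamically chosen chain to a response body."""
    for content_filter in build_chain(headers):
        body = content_filter(body)
    return body
```

The decision is deferred until the headers are in hand: an image gets an empty chain, plain XML gets the XML filter alone, and compressed XML gets decompression inserted ahead of it.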
Regarding the proxy itself, development has been driven largely by the Java (Tomcat) community. A new protocol module, mod_proxy_ajp, supports the AJP protocol and obsoletes the old mod_jk. Of wider relevance, load-balancing, connection pooling and failover capabilities in mod_proxy_balancer help to make proxying with Apache an attractive option for high-availability enterprise-grade applications, including but not limited to Java.
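Load balancing with failover boils down to something like the Python sketch below. It is a simplification under stated assumptions: plain round-robin stands in for mod_proxy_balancer's own scheduling, a failed backend is never retried, and the Balancer class, the send callback and the backend names are invented for the example.

```python
import itertools


class Balancer:
    """Round-robin over backends, skipping any that have failed."""

    def __init__(self, members):
        self.members = list(members)
        self._next = itertools.cycle(range(len(self.members)))
        self.down = set()

    def forward(self, request, send):
        """Try each backend at most once; return (member, response)."""
        for _ in range(len(self.members)):
            member = self.members[next(self._next)]
            if member in self.down:
                continue
            try:
                return member, send(member, request)
            except ConnectionError:
                # Fail over: remember the failure and try the next member.
                self.down.add(member)
        raise RuntimeError("no backend available")


def flaky_send(member, request):
    """Hypothetical transport in which one backend refuses connections."""
    if member == "app2:8009":
        raise ConnectionError(member)
    return "%s handled %s" % (member, request)
```

Calling forward four times against three backends, with app2:8009 refusing connections, spreads the requests over the two healthy members, and the client never sees the failure.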
Another enhancement is the refactoring of Access, Authentication and Authorization control (AAA). Setting aside IP-based access (Allow/Deny From, which is unchanged), this has been split into three tasks: (1) the communications protocol (Basic or Digest Authentication), (2) password lookup, and (3) confirmation that the user is authorized for the attempted operation.
The benefit of this is that system administrators can mix and match. For example, a site with a million users but just a couple of privileged groups might use DBM for fast password lookup, a plain-text groupfile for easy maintenance, and Digest Authentication for security without the overhead of SSL. Combined with DBD, this also obsoletes the multitude of earlier MySQL and other database authentication modules.
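The three-way split can be pictured as three interchangeable pieces plugged into one check. The Python sketch below illustrates that separation only and is not Apache's provider API: every function name is hypothetical, Basic Authentication is used because it is the simpler protocol, passwords are compared in plain text purely for brevity, and malformed headers are not handled.

```python
import base64


def basic_credentials(headers):
    """(1) Protocol: pull (user, password) out of a Basic header."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "Basic":
        return None
    decoded = base64.b64decode(token).decode("utf-8")
    user, _, password = decoded.partition(":")
    return user, password


def dict_password_provider(table):
    """(2) Password lookup: could equally be DBM, a flat file or SQL."""
    return lambda user: table.get(user)


def group_authorizer(groups, required):
    """(3) Authorization: is the user in the required group?"""
    return lambda user: user in groups.get(required, ())


def check(headers, get_credentials, lookup_password, is_authorized):
    """Run the three stages in order, whichever implementations are given."""
    credentials = get_credentials(headers)
    if credentials is None:
        return "unauthenticated"
    user, password = credentials
    stored = lookup_password(user)
    if stored is None or stored != password:
        return "unauthenticated"
    return "ok" if is_authorized(user) else "forbidden"
```

Because check only sees three callables, swapping the password store or the group source means passing a different function, which is the mix-and-match property in miniature.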
And more ...
Looking at Apache's own What's New list, you'll see several things I haven't mentioned above. That's not because they're unimportant; rather, it's that I don't really have anything to say about them in a broadly-techie general overview:
- The APR (Apache Portable Runtime) version 1.2 or higher is required.
- Caching is promoted from experimental to stable.
- Configuration has been refactored, though that may have little effect on what end-users see.
- Graceful stop enables shutdown without dropping any requests.
- Enabling use of a system regexp library helps avoid the problem of version conflicts with applications using regexps.
- Large file support is enabled as standard.
- LDAP directory support is much improved.
- Additional introspection options and monitoring/debugging hooks are available.
Oh, and did I mention the good news? After ten years, the development community is as vibrant and active as ever, and working hard on a bunch of new goodies for the next release. But that'll be a while yet.
Copyright © 2006 Nick Kew