God's gift to C

Crossing platforms with Apache Portable Runtime

OK, here it is: a proper developer article. After all, this is Reg Developer, so it's about time. Today's subject is an Apache spinoff project, the Apache Portable Runtime (APR). It lies at the heart of the webserver, but it is also a standalone library, used extensively in separate projects, most famously Subversion.

But before diving in, here are a couple of things for your diary:

  • ApacheCon, the main Apache conference, will take place in Dublin, 26-30 June (two days tutorials + three days conference). Earlybird registration is available until 29 May. So, start working on the boss!
  • If you are a student, Google's Summer of Code may offer an opportunity to undertake a really interesting open source project, and get paid for it. Apache is one of many open source organisations participating.

The need for APR

Despite its age and patchy heritage, C has a stronger claim than any other programming language to be the industry standard. Yet it has many shortcomings, and typically requires quite a lot more programming effort than higher-level and more modern languages. For example, C lacks dynamically resizing strings and arrays, and data types such as hash tables that are standard in scripting languages. More fundamentally, C leaves all resource management in the hands of the programmer, and avoiding memory (or other resource) leaks can be a major chore. Finally, it is harder to write portable, cross-platform code in C than in scripting languages.

APR serves to deal with these issues, bridging the gap between C and a scripting language in terms of programmer productivity. The basic areas dealt with by APR include resource management, cross-platform programming, and a range of utility classes.

Resource Management

Perhaps the most fundamental barrier to productivity in C is the problem of resource management. C programmers are responsible for all resource allocation, and (generally much harder) ensuring that resources are always cleaned up after use, but never used after cleanup! Dealing with resource management can consume an utterly disproportionate amount of programmer effort, and generate very difficult bugs.

APR's solution to this is "pools", which lie at the heart of APR and Apache. Pools serve to allocate memory (faster than malloc on most platforms), and to ensure it is cleaned up at the appropriate time. They can also register cleanups for other resources, for example, to close a filehandle or socket, or release a lock. Typical usage of a pool is to tie it to an object with a well-defined lifetime (such as, in the webserver, a TCP connection or HTTP request), so objects can be allocated with that same lifetime and then just left.
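
To make the idea concrete, here is a toy sketch of a pool in plain C. It is emphatically not APR's implementation or API: every toy_ name is hypothetical, and it simply wraps malloc rather than managing blocks of its own, so it shows the lifetime semantics and none of the speed. What it does show is the pattern described above: allocate against the pool, register cleanups for other resources, and destroy everything in one call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy illustration only: hypothetical names, not APR's API or implementation. */

typedef void (*toy_cleanup_fn)(void *data);

/* One tracked item: a cleanup function and the data it applies to. */
typedef struct toy_node {
    struct toy_node *next;
    toy_cleanup_fn   fn;    /* called with data when the pool is destroyed */
    void            *data;
} toy_node;

typedef struct toy_pool {
    toy_node *head;         /* most recently registered item first */
} toy_pool;

toy_pool *toy_pool_create(void)
{
    return calloc(1, sizeof(toy_pool));
}

/* Register fn(data) to run when the pool is destroyed. Returns 0 on success. */
int toy_pool_cleanup_register(toy_pool *p, void *data, toy_cleanup_fn fn)
{
    toy_node *n;

    assert(p != NULL && fn != NULL);
    n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->fn = fn;
    n->data = data;
    n->next = p->head;
    p->head = n;
    return 0;
}

/* Allocate memory owned by the pool; the caller never frees it directly. */
void *toy_palloc(toy_pool *p, size_t size)
{
    void *mem = malloc(size);

    if (mem == NULL)
        return NULL;
    if (toy_pool_cleanup_register(p, mem, free) != 0) {
        free(mem);
        return NULL;
    }
    return mem;
}

/* Run every cleanup in reverse order of registration, then free the pool. */
void toy_pool_destroy(toy_pool *p)
{
    toy_node *n;

    assert(p != NULL);
    n = p->head;
    while (n != NULL) {
        toy_node *next = n->next;
        n->fn(n->data);
        free(n);
        n = next;
    }
    free(p);
}

/* Demo cleanup standing in for "close a file" or "release a lock":
 * it just counts how many times it has run. */
void toy_count_cleanup(void *data)
{
    ++*(int *)data;
}
```

In use, a request handler would create the pool when the request arrives, call toy_palloc and toy_pool_cleanup_register as often as it likes, and call toy_pool_destroy once when the request is finished. Nothing allocated in between needs an individual free.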

Dynamic Resources

A secondary shortcoming in C is the lack of built-in support for dynamically-resizing resources such as strings and arrays. APR provides this too, built on top of dynamic memory allocation with pools. For example, whereas in C's stdio, we have:

sprintf(buf, fmt, varargs);

APR's strings module gives us instead:

buf = apr_psprintf(pool, fmt, varargs);

freeing the programmer from the need to compute the size of buf and allocate in advance, or make a guess and leave the code vulnerable to buffer overflows.
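
For a feel of what "no need to compute the size" means in practice, the sketch below does the sizing by hand in standard C99: it asks vsnprintf how long the result will be, allocates exactly that much, and formats again. The name toy_asprintf is hypothetical, and I am not claiming APR works this way internally; the other difference is that here the caller must free the result, whereas with a pool that duty disappears too.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of APR: format into a buffer sized to fit.
 * The caller owns the result and must free() it. */
char *toy_asprintf(const char *fmt, ...)
{
    va_list ap, ap2;
    int len;
    char *buf;

    assert(fmt != NULL);

    va_start(ap, fmt);
    va_copy(ap2, ap);   /* the arguments are consumed twice, so keep a copy */

    /* Pass 1: with a NULL buffer of size 0, vsnprintf only reports the
     * number of characters the formatted string would need. */
    len = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (len < 0) {
        va_end(ap2);
        return NULL;
    }

    /* Pass 2: allocate len + 1 bytes (for the terminator) and format. */
    buf = malloc((size_t)len + 1);
    if (buf != NULL)
        vsnprintf(buf, (size_t)len + 1, fmt, ap2);
    va_end(ap2);
    return buf;
}
```

A call such as toy_asprintf("%s-%d", "req", 42) returns a freshly allocated string holding "req-42", however long the arguments turn out to be.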

Basic Class Library

Higher-level and scripting languages have at least dynamic arrays and hashes as native datatypes. APR provides these for C programmers:

  • Array is implemented as a stack, and can be used as an automatically-resizing array or queue.
  • Hash is a hash table, in which keys and entries are (pointers to) arbitrary data types.
  • Table is a table indexed by character strings. As such, it is less general than the hash, but it supports a number of additional operations, such as merging multiple values for a key into a comma-separated list. It is used extensively in Apache to represent tables such as the HTTP headers in a request/response, where these operations are required.
  • Ring is a doubly-linked list. It is implemented in macros, and resembles a C++ template.
  • Queue is a thread-safe FIFO queue.
  • Bucket is an arbitrary data container or source that lies at the heart of Apache's I/O. Buckets are contained in bucket brigades, which are an instantiation of the ring.
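
Of the operations mentioned above, the table's merge is perhaps the least familiar, so here is a toy sketch of the idea in plain C. All names are hypothetical and the design is deliberately naive (a fixed-size array, exact-match string keys, malloc instead of a pool); it is meant only to show what "merging multiple values for a key into a comma-separated list" looks like, not how APR's table is built.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy illustration only: hypothetical names, not APR's table. */
#define TOY_TABLE_MAX 16

typedef struct {
    const char *key;   /* borrowed: must outlive the table */
    char       *val;   /* owned: malloc'd copy */
} toy_entry;

typedef struct {
    toy_entry entries[TOY_TABLE_MAX];
    int       count;
} toy_table;

/* Linear search by exact string match; NULL if the key is absent. */
static toy_entry *toy_find(toy_table *t, const char *key)
{
    int i;

    for (i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].key, key) == 0)
            return &t->entries[i];
    return NULL;
}

const char *toy_table_get(toy_table *t, const char *key)
{
    toy_entry *e = toy_find(t, key);
    return e != NULL ? e->val : NULL;
}

/* Merge: append ", val" to an existing entry, or add a new entry.
 * Returns 0 on success, -1 if out of memory or the table is full. */
int toy_table_merge(toy_table *t, const char *key, const char *val)
{
    toy_entry *e;

    assert(t != NULL && key != NULL && val != NULL);
    e = toy_find(t, key);
    if (e != NULL) {
        size_t old = strlen(e->val);
        char *joined = realloc(e->val, old + 2 + strlen(val) + 1);
        if (joined == NULL)
            return -1;
        memcpy(joined + old, ", ", 2);
        strcpy(joined + old + 2, val);
        e->val = joined;
        return 0;
    }
    if (t->count == TOY_TABLE_MAX)
        return -1;
    e = &t->entries[t->count];
    e->val = malloc(strlen(val) + 1);
    if (e->val == NULL)
        return -1;
    strcpy(e->val, val);
    e->key = key;
    t->count++;
    return 0;
}

/* Free every stored value and empty the table. */
void toy_table_clear(toy_table *t)
{
    while (t->count > 0)
        free(t->entries[--t->count].val);
}
```

Merging "text/html" and then "text/plain" under the key "Accept" leaves a single entry whose value is "text/html, text/plain", which is how a repeated HTTP header is conventionally folded into one.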

Dynamic Resource Pools

Another level of resource management is the apr_reslist, which manages a dynamically-resizable pool of typically complex resources. An example is Apache's DBD architecture, which uses a reslist to implement a pool of connections to a backend SQL database.
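
The general shape of such a resource list can be sketched in a few lines of C. This is a toy under stated assumptions, not apr_reslist: the toy_ names are hypothetical, there is no locking, no idle timeout, and at the limit it simply returns NULL. The point is only the pattern: resources are expensive to construct, so they are built on demand, handed out, and kept for reuse when handed back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy illustration only: hypothetical names, single-threaded. */
#define TOY_RESLIST_MAX 8

typedef void *(*toy_ctor_fn)(void *params);
typedef void  (*toy_dtor_fn)(void *resource);

typedef struct {
    toy_ctor_fn ctor;                /* builds a new resource on demand */
    toy_dtor_fn dtor;                /* tears one down */
    void *params;                    /* passed to ctor */
    void *idle[TOY_RESLIST_MAX];     /* constructed but not in use */
    int   n_idle;
    int   n_total;                   /* idle plus currently lent out */
    int   hard_max;                  /* never construct more than this */
} toy_reslist;

void toy_reslist_init(toy_reslist *rl, int hard_max,
                      toy_ctor_fn ctor, toy_dtor_fn dtor, void *params)
{
    assert(rl != NULL && ctor != NULL && dtor != NULL);
    assert(hard_max > 0 && hard_max <= TOY_RESLIST_MAX);
    rl->ctor = ctor;
    rl->dtor = dtor;
    rl->params = params;
    rl->n_idle = 0;
    rl->n_total = 0;
    rl->hard_max = hard_max;
}

/* Reuse an idle resource if there is one, otherwise construct a new one.
 * Returns NULL when the hard limit is reached or construction fails. */
void *toy_reslist_acquire(toy_reslist *rl)
{
    void *res;

    assert(rl != NULL);
    if (rl->n_idle > 0)
        return rl->idle[--rl->n_idle];
    if (rl->n_total >= rl->hard_max)
        return NULL;
    res = rl->ctor(rl->params);
    if (res != NULL)
        rl->n_total++;
    return res;
}

/* Hand a resource back so a later acquire can reuse it. */
void toy_reslist_release(toy_reslist *rl, void *res)
{
    assert(rl != NULL && rl->n_idle < rl->n_total);
    rl->idle[rl->n_idle++] = res;
}

/* Destroy all idle resources; callers must release theirs first. */
void toy_reslist_destroy(toy_reslist *rl)
{
    assert(rl != NULL);
    while (rl->n_idle > 0) {
        rl->dtor(rl->idle[--rl->n_idle]);
        rl->n_total--;
    }
}

/* Demo constructor/destructor standing in for "open/close a database
 * connection". params points at an int counting how many were opened. */
void *toy_demo_open(void *params)
{
    int *opens = params;
    int *conn = malloc(sizeof *conn);

    if (conn != NULL)
        *conn = ++*opens;   /* use the running count as a connection id */
    return conn;
}

void toy_demo_close(void *resource)
{
    free(resource);
}
```

With a limit of two, a third simultaneous acquire fails, but releasing one connection and acquiring again returns that same connection without the constructor being called a third time.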

Portability Layer

Aside from resource management, the other fundamental purpose of the APR is to provide a common cross-platform API for operations falling outside standard C and involving a platform-specific library. This aims to encompass the platform-dependent operations likely to be used by Apache and its modules, such as:

  • I/O, including network and filesystem operations.
  • Process and thread management, conditions, mutexes.
  • Dynamic loading of code.
  • Identity management and system security.
  • Shared memory and memory mapping.
  • Signals, Events, Environment.

In addition to specific modules, there are general portability aids, such as pre-processor macros that expand to platform-specific declarations where necessary (Windows' dllimport/dllexport being prime examples where vendor-defined information that should belong to the build flags has to go in the source code).
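
A declaration macro of this kind might look like the sketch below. The TOY_ names are hypothetical, chosen for illustration; APR's own headers use macros in the same spirit (APR_DECLARE, for one). The source is written once, and the pre-processor supplies whatever decoration the platform demands.

```c
#include <assert.h>

/* Normally set by the build system when compiling the library itself
 * (for example -DTOY_BUILDING_LIBRARY); defined here so that this
 * single-file sketch stands alone. */
#define TOY_BUILDING_LIBRARY

/* Hypothetical macro, for illustration only. */
#if defined(_WIN32)
#  if defined(TOY_BUILDING_LIBRARY)
#    define TOY_DECLARE(type) __declspec(dllexport) type
#  else
#    define TOY_DECLARE(type) __declspec(dllimport) type
#  endif
#else
#  define TOY_DECLARE(type) type    /* nothing special needed elsewhere */
#endif

/* Public API: the same declaration compiles on every platform. */
TOY_DECLARE(int) toy_clamp(int value, int lo, int hi);

/* Restrict value to the range [lo, hi]. */
TOY_DECLARE(int) toy_clamp(int value, int lo, int hi)
{
    assert(lo <= hi);
    return value < lo ? lo : (value > hi ? hi : value);
}
```

On Unix-like systems the macro expands to the bare return type, so toy_clamp is an ordinary function; on Windows the same line exports it from the DLL, or imports it in code that merely uses the library.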

Your humble scribe has worked mostly in environments where portability is not an issue. Before getting seriously involved with Apache, I'd ported Perl and Java reasonably painlessly, but found C and C++ more trouble than they were worth for nontrivial jobs. Writing Apache modules, I've found I can develop on Linux or BSD and later compile on Solaris, Mac OS X, or even Windows with no extra effort, or at worst half an hour's worth of simple hacks. Of course, the APR portability layer is the key to this. As proof of the pudding, Site Valet (for which I am responsible) is moving from C++ STL-based classes to APR-based classes to make it maintainable across platforms.

More ...

In addition to these core functions, APR provides a range of utilities. On the one hand, basic but important things such as time/date and cryptographic APIs. On the other hand, high-level abstractions such as apr_dbm (DBM databases), apr_dbd (SQL databases), apr_ldap, and apr_memcache, in the tradition of scripting language abstractions such as Perl's Tie/AnyDBM and DBI/DBD. Last but not least, APR makes use of the pre-processor to implement the infrastructure for hooks that form the basis for the Apache module API and related constructs.

Using APR

As usual, I've passed the thousand words while still having a lot to say. So rather than show usage examples and the shape of an APR-based program here, I'll give you some URLs for further reading. If this article has aroused your interest in using the APR, INOUE Seiichiro's tutorial is a great introduction. Tutorials on selected APR topics in the context of Apache exist at Apache Tutor. My forthcoming book Applications Development with Apache devotes a complete chapter to a more extensive introduction to the APR. Finally, of course, the APR project site includes general information, downloads and API documentation. ®
