Speed is the essence of WAR

Data location, location, location

3 Big data security analytics techniques

WAR on the cloud, Part 4 In part 3 I tested some different semi-cloudy solutions for mirrors of my site and I am in the process of replacing one dedicated WebVisions Linux machine with two virtual private system (VPS)es in separate AsiaPac countries, for less money in total. Ker-ching!

Cloud files: public or private?

But this still isn't really getting the full cloud religion, and in particular I haven't dealt with storage and the cloud elegantly. I have simply been letting my existing code manage local mirror caching, with no special cloud storage support at all.

At the moment, when running on a bare system (or VPS) as a mirror, when a request is made for one of the multimedia exhibits in the catalogue, the mirror feeds as much of it (if any) as it has locally from cache. Then the mirror streams the rest over from the master server while saving to local cache.

Thus for popular exhibits, after the first load from a given mirror, every other user of that mirror gets it streamed from the mirror's local cache for speed, and the first download is only a little slower than going to the master direct. This works, but mirrors don't all have the same outgoing bandwidth available, and bandwidth is most important for larger downloads.

(On Amazon's Elastic Beanstalk, if for some reason it decides that my mirror is taking too much CPU time, my mirror gets restarted and its cache discarded, which is wasteful. In part I mitigated that by not having such lightweight cloud mirrors do any significant pre-caching other than of the most popular content. If I could persist the cache through a restart I could reconsider this fix.)

Amazon and Rackspace both support (private) cloud-based storage and a public content delivery network (CDN), either of which might perform better than my existing solution, especially in conjunction with my lower-bandwidth mirrors.

The private persistent cloud storage could be used "behind" my mirrors with the mirrors as a shared concurrency-safe cache with the mirror front-ends protecting against excessive use of bandwidth, and/or the public cloud/CDN could serve appropriate files directly to end users.

Ambush bills again

With neither Rackspace nor Amazon is it possible to restrict download bandwidth and bills. Even if not feeling paranoid about DoS/DDoS attacks, pilfered bandwidth through hotlinking can be a significant nuisance. I checked with Rackspace and it's not currently possible to restrict downloads (say) to requests with a Referer header that matches a supplied regex, which would stop most casual misuse.

The lack of such first-line defences and any cost cap would imply that regular active monitoring is needed, and possibly only selectively making available via the CDN material less likely to be hotlinked (such as site static furniture) and a sampling of requests for popular items not currently setting off hotlink alarms.

On Amazon's AWS the public CDN is partitioned by geographical area (with extra complication and cost to distribute content to each region) whereas the Rackspace CDN (Akamai underneath) seems to be global. The enticing prospect therefore is that one Rackspace CDN could do much of the heavy lifting on behalf of all of the mirrors leaving them just to construct and serve the relatively small Web pages that the end user sees.

Rackspace's CDN is built with OpenStack which reduces the chance of having to throw away or redo any CDN integration work if I switch to a different CDN provider or use more than one.


I tested a simple face-off between Rackspace's public CDN and my home/office Apache Web site in London serving a reasonable-size binary file (a few MB), both for latency and bandwidth.

I tried pulling down the file to the following locations (within co-lo facilities, not retail broadband connections): UK (London), US (Atlanta), SG, AU (Sydney), IN (Mumbai). (Connectivity from the IN machine was sufficiently erratic and poor at the times I was testing that I have excluded it from the results.)

In general, when downloading the file from my office Apache, I could max out my link outbound at at little over 128KB/s, and the round-trip time (ie, latency) varied from 24ms to the UK machine up to 360ms for AU.

As I can construct and serve a page from one of my mirrors in typically 50ms or (much) less, these latencies to serve up page furniture are significant and annoying to end users, especially when exceeding 100ms, ie: for all but the UK.

When downloading a large file, latency is less significant than bandwidth.

With the Rackspace CDN, with content nominally uploaded to the "UK" cloud, latency to download to UK and US machines was under 1ms and the worst was SG at under 70ms which would have beaten all but the UK-to-UK serving from my own office Apache.

The bandwidth available from the Rackspace CDN was good everywhere too, maxing out my SG connection (at about 240kBytes/s) and getting as high as 22MBytes/s downloading to a UK host.


Latency and bandwidth

Note that the SG and AU servers were probably maxing out their in-bound links during the CDN tests (2Mbps and 10Mbps burstable) rather than indicating the maximum CDN bandwidth available.

Conclusion: Fit or fad?

The upshot of this simple experiment is that it is worthwhile in both latency and bandwidth terms to improve the user experience for small and large files (with perceived performance dominated by latency and bandwidth respectively) to consider taking advantage of a commodity CDN, such as that of Rackspace.

Indeed, even with my own faster mirrors I'd have difficulty matching the better CDN numbers, so if hotlinking and so on were not a worry and I wanted to maximise performance, then I should probably only serve the dynamic page content from my own mirrors, with all other material served by the CDN.

So come on Rackspace, Amazon, et al, gimme a way to control bandwidth and risks and you'll have one more customer and widen your appeal to other SMEs too... ®

SANS - Survey on application security programs

More from The Register

next story
This time it's 'Personal': new Office 365 sub covers just two devices
Redmond also brings Office into Google's back yard
Kingston DataTraveler MicroDuo: Turn your phone into a 72GB beast
USB-usiness in the front, micro-USB party in the back
Dropbox defends fantastically badly timed Condoleezza Rice appointment
'Nothing is going to change with Dr. Rice's appointment,' file sharer promises
BOFH: Oh DO tell us what you think. *CLICK*
$%%&amp Oh dear, we've been cut *CLICK* Well hello *CLICK* You're breaking up...
Bored with trading oil and gold? Why not flog some CLOUD servers?
Chicago Mercantile Exchange plans cloud spot exchange
Just what could be inside Dropbox's new 'Home For Life'?
Biz apps, messaging, photos, email, more storage – sorry, did you think there would be cake?
IT bods: How long does it take YOU to train up on new tech?
I'll leave my arrays to do the hard work, if you don't mind
prev story


Designing a defence for mobile apps
In this whitepaper learn the various considerations for defending mobile applications; from the mobile application architecture itself to the myriad testing technologies needed to properly assess mobile applications risk.
3 Big data security analytics techniques
Applying these Big Data security analytics techniques can help you make your business safer by detecting attacks early, before significant damage is done.
Five 3D headsets to be won!
We were so impressed by the Durovis Dive headset we’ve asked the company to give some away to Reg readers.
The benefits of software based PBX
Why you should break free from your proprietary PBX and how to leverage your existing server hardware.
Securing web applications made simple and scalable
In this whitepaper learn how automated security testing can provide a simple and scalable way to protect your web applications.