Death of batch – long live real-time

So why is it taking so long to die?

Remember batch processing? All your vital business reports and reconciliations ran overnight when everyone had gone home, and finished, with a bit of luck, just before everyone arrived back for work in the morning.

Well, it's been clear for 20 years that the day of batch processing is over:

  • People work flexibly, 24x7, these days – a "batch slot" isn't available, and if you mix up batch and online processing you can corrupt databases.
  • Global ecommerce is where we're all going; and when we stop work here someone is starting work on the far side of the world – and not everyone stops work on Sundays, either.
  • Volumes are increasing, and the hard processing-time limit implied by the "batch window" will be a pain in the proverbial. At the very least, it may push you into expensive over-provisioning to make sure you can cope with month-end or year-end peaks (implying capacity lying idle the rest of the time); or, perhaps, into the effort required to build a "capacity on demand" grid architecture.

But batch processing is taking an awfully long time to die! If you've built a batch-oriented architecture, no matter what the long-term benefits of modernisation, change is expensive and risky in the short term, and people are ruled by short-term market factors these days.

Batch processing also seems to make sense for data warehousing. You want business intelligence (BI) based on end-of-day or end-of-month consolidated positions, and that sounds like batch processing to populate the warehouse.

But this is misleading. Your BI must be based on appropriate latencies – there is nothing pleasant about trying to deal with an immediate stock crisis or some such if you have to wait until tomorrow for the information you need to make decisions.

If you build a real-time system, you can easily arrange for end-of-month consolidations if you need them; but getting real-time information out of a batch-oriented system can be tricky. You can do it – computers can do anything – but it is likely to be expensive and unreliable.
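To make the point concrete, here is a minimal, hypothetical sketch (plain Python; no real BI product's API is implied): an event-driven system keeps running aggregates as transactions arrive, so an end-of-period consolidation is just a snapshot of state the system already holds – the "batch" report falls out for free.

```python
from collections import defaultdict
from datetime import date

class RealTimeLedger:
    """Keeps running per-account totals as events arrive (the real-time view)."""
    def __init__(self):
        self.totals = defaultdict(float)
        self.snapshots = {}  # point-in-time consolidations, keyed by date

    def post(self, account, amount):
        # Each event updates the live position immediately -- no batch window needed.
        self.totals[account] += amount

    def consolidate(self, as_of: date):
        # An end-of-period consolidation is just a copy of current state.
        self.snapshots[as_of] = dict(self.totals)
        return self.snapshots[as_of]

ledger = RealTimeLedger()
ledger.post("sales", 100.0)
ledger.post("sales", 250.0)
ledger.post("returns", -40.0)
eom = ledger.consolidate(date(2007, 6, 30))
print(eom)  # {'sales': 350.0, 'returns': -40.0}
```

Going the other way – deriving up-to-the-minute positions from a system that only materialises data at period end – has no such easy inverse, which is the asymmetry the paragraph above describes.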

Charles Nichols, founder and CEO of SeeWhy Software, advises that real-time BI capabilities should be built into the business process at design time – as what are usually called "non-functional requirements" (if BI is as useful as it is hyped, these will, in fact, be very "functional" requirements).

This makes a lot of sense to me, even if Charles's enthusiasm carries him away a bit – not all existing BI systems are batch oriented and what he calls his "vision for the new BI 2.0" sounds not dissimilar to Information Builders' "BI for the masses" mantra, which is also very much about BI-enabling operational systems.

Nevertheless, SeeWhy's event-driven view of BI is a bit out of the ordinary and, if you do want to build real-time BI into your application, SeeWhy offers a free community edition. So, you can build an event-driven BI-enabled proof-of-concept relatively easily – and cheaply.

Competitors to SeeWhy in the real-time space include Applix. By coincidence, just before I met Charles, Martin Richmond-Coggan, VP EMEA at Applix, told me about a possible use of its TM1 product as, in effect, a real-time cache for operational information feedback, embedded in a business system.

Martin worries about scalability and performance – which TM1 deals with by processing multi-dimensional data in-memory, in (if you need it) a 64-bit address space, perhaps copying the data out to a relational database for persistence in the background (if that's how you want to design it).
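The pattern Martin describes can be sketched generically – this is illustrative Python, not TM1's actual API, and the persistence callback is a stand-in for writing to a relational database: aggregates live in memory for fast reads, while a background thread drains writes to durable storage.

```python
import queue
import threading

class InMemoryAggregateCache:
    """In-memory cell store with asynchronous background persistence."""
    def __init__(self, persist_fn):
        self.cells = {}                 # (dimension tuple) -> value, held in memory
        self._pending = queue.Queue()   # writes waiting to be persisted
        self._persist = persist_fn      # e.g. an INSERT into a relational table
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def write(self, coords, value):
        self.cells[coords] = value      # readers see the new value immediately
        self._pending.put((coords, value))

    def read(self, coords):
        return self.cells.get(coords)

    def _drain(self):
        # Background thread: persistence lags, but never blocks the operational path.
        while True:
            coords, value = self._pending.get()
            self._persist(coords, value)
            self._pending.task_done()

    def flush(self):
        self._pending.join()            # wait until all queued writes are durable

# Usage: a plain dict stands in for the relational database.
db = {}
cache = InMemoryAggregateCache(lambda k, v: db.__setitem__(k, v))
cache.write(("UK", "2007-06", "sales"), 1250.0)
print(cache.read(("UK", "2007-06", "sales")))  # 1250.0 -- served from memory
cache.flush()
print(db[("UK", "2007-06", "sales")])          # 1250.0 -- now in "the database" too
```

The design choice here is the one in the paragraph: the operational system reads and writes at memory speed, and durability is decoupled into a background task, rather than the database sitting on the critical path.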

So there are several ways to skin this particular cat. But bear in mind four things:

  1. You should design operational business information feedback into the application architecture from the start, not bolt it on afterwards, especially not after the application has gone live.
  2. Not all BI applications need real-time information – but there's a spectrum between historical point-in-time information and zero-latency information feedback. It's not black and white.
  3. Zero latency is an impossible goal anyway, because of (at bottom) speed-of-light limitations. This affects geographically distributed systems (and aren't they all) in practice.
  4. It is easier to add point-in-time capabilities to a real-time architecture than vice-versa. ®
