Death of batch – long live real-time

So why is it taking so long to die?

Remember batch processing? All your vital business reports and reconciliations ran overnight, after everyone had gone home, and finished, with a bit of luck, just before everyone arrived back for work in the morning.

Well, it's been clear for 20 years that the day of batch processing is over:

  • People work flexibly, 24x7, these days – a dedicated "batch slot" simply isn't available, and if you mix batch and online processing against the same data you can corrupt databases.
  • Global ecommerce is where we're all going: when we stop work here, someone is starting work on the far side of the world – and not everyone stops work on Sundays, either.
  • Volumes are increasing, and the hard processing-time limit implied by the "batch window" becomes a pain in the proverbial. At the very least, it may push you into expensive over-provisioning to make sure you can cope with month-end or year-end peaks (implying capacity lying idle the rest of the time); alternatively, you face the effort of building a "capacity on demand" grid architecture.

But batch processing is taking an awfully long time to die! If you've built a batch-oriented architecture then, no matter what the long-term benefits of modernisation, change is expensive and risky in the short term – and people are ruled by short-term market factors these days.

Batch processing also seems to make sense for data warehousing. You want business intelligence (BI) based on end-of-day or end-of-month consolidated positions, and that sounds like batch processing to populate the warehouse.

But this is misleading. Your BI must be based on appropriate latencies – there is nothing pleasant about trying to deal with an immediate stock crisis, or some such, if you have to wait until tomorrow for the information you need to make decisions.

If you build a real-time system, you can easily arrange for end-of-month consolidations if you need them; but getting real-time information out of a batch-oriented system can be tricky. You can do it – computers can do anything – but it is likely to be expensive and unreliable.
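To make that asymmetry concrete, here's a minimal sketch in Python (the names are invented for illustration, not taken from any product): an event-driven ledger keeps positions current as each transaction arrives, and an end-of-period consolidation then amounts to snapshotting state you already hold.

```python
from collections import defaultdict
from datetime import datetime

class RealTimeLedger:
    """Toy event-driven ledger: positions update as each event arrives."""

    def __init__(self) -> None:
        self.positions = defaultdict(float)   # account -> current balance
        self.snapshots = {}                   # period label -> frozen positions

    def on_event(self, account: str, amount: float) -> None:
        # Real-time path: state is always current, so no batch window is needed.
        self.positions[account] += amount

    def consolidate(self, label: str) -> dict:
        # Point-in-time reporting is just a snapshot of live state.
        self.snapshots[label] = dict(self.positions)
        return self.snapshots[label]

ledger = RealTimeLedger()
ledger.on_event("acme", 1200.0)
ledger.on_event("acme", -300.0)
print(ledger.positions["acme"])     # 900.0, available the moment it happens
print(ledger.consolidate(datetime.now().strftime("%Y-%m")))  # end-of-month view for free
```

The reverse direction – squeezing live positions out of a system that only materialises data in a nightly run – has no equivalently cheap move.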

Charles Nichols, founder and CEO of SeeWhy Software, advises that real-time BI capabilities should be built into the business process at design time – as what are usually called "non-functional requirements" (although, if BI is as useful as the hype suggests, these will in fact be very "functional" requirements).

This makes a lot of sense to me, even if Charles's enthusiasm carries him away a bit – not all existing BI systems are batch oriented and what he calls his "vision for the new BI 2.0" sounds not dissimilar to Information Builders' "BI for the masses" mantra, which is also very much about BI-enabling operational systems.

Nevertheless, SeeWhy's event-driven view of BI is a bit out of the ordinary and, if you do want to build real-time BI into your application, SeeWhy's free community edition is available here. So you can build an event-driven, BI-enabled proof-of-concept relatively easily – and cheaply.
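What might such a proof-of-concept look like in outline? The sketch below is a generic event-driven BI hook, not SeeWhy's actual API (EventBus, BusinessEvent and bi_listener are all hypothetical names): the operational code publishes business events as part of the process, and a BI subscriber reacts to them with near-zero latency.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class BusinessEvent:
    """A business-level event emitted by the operational system itself."""
    name: str
    payload: dict

class EventBus:
    """Hypothetical in-process bus: BI subscribers see events as they happen."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[BusinessEvent], None]] = []

    def subscribe(self, handler: Callable[[BusinessEvent], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: BusinessEvent) -> None:
        for handler in self._subscribers:
            handler(event)

bus = EventBus()

# The BI concern is wired in at design time, alongside the functional logic.
def bi_listener(event: BusinessEvent) -> None:
    if event.name == "order.placed" and event.payload["value"] > 10_000:
        print("ALERT: unusually large order", event.payload)

bus.subscribe(bi_listener)

# The operational code path publishes events as part of the business process.
bus.publish(BusinessEvent("order.placed", {"order_id": 42, "value": 12_500}))
```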

Competitors to SeeWhy in the real-time space include Applix. By coincidence, just before I met Charles, Martin Richmond-Coggan, VP EMEA at Applix, told me about a possible use of its TM1 product as, in effect, a real-time cache for operational information feedback, embedded in a business system.

Martin worries about scalability and performance – which TM1 addresses by processing multi-dimensional data in-memory, in (if you need it) a 64-bit address space, perhaps copying the data out to a relational database for persistence in the background (if that's how you want to design it).
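In miniature, that's the classic write-behind pattern: serve reads and writes from memory, and let a background thread copy changes out to a relational store. The sketch below illustrates the general technique under my own assumptions – it is not TM1's actual mechanism:

```python
import queue
import sqlite3
import threading

class WriteBehindCache:
    """Illustrative write-behind cache: reads and writes hit memory;
    a background thread persists changes to a relational store."""

    def __init__(self, db_path: str = "positions.db") -> None:
        self._data = {}                 # in-memory store: key -> value
        self._pending = queue.Queue()   # changes awaiting persistence
        self._db_path = db_path
        threading.Thread(target=self._persist_loop, daemon=True).start()

    def put(self, key: str, value: float) -> None:
        self._data[key] = value          # fast, in-memory write
        self._pending.put((key, value))  # persistence happens asynchronously

    def get(self, key: str) -> float:
        return self._data[key]           # reads never touch the database

    def _persist_loop(self) -> None:
        # The background thread owns its own connection (sqlite3 connections
        # are per-thread by default) and flushes changes as they arrive.
        db = sqlite3.connect(self._db_path)
        db.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v REAL)")
        while True:
            key, value = self._pending.get()
            db.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))
            db.commit()

cache = WriteBehindCache()
cache.put("GBP/USD", 1.27)
print(cache.get("GBP/USD"))  # served from memory; the disk write is in flight
```

The design point is that the database exists purely for persistence – no read ever waits on it, which is exactly why the in-memory approach can keep up with interactive feedback.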

So there are several ways to skin this particular cat. But bear in mind four things:

  1. You should design operational business information feedback into the application architecture from the start, not bolt it on afterwards, especially not after the application has gone live.
  2. Not all BI applications need real-time information – but there's a spectrum between historical point-in-time information and zero-latency information feedback. It's not black and white.
  3. Zero latency is an impossible goal anyway, because of (at bottom) speed-of-light limitations. This affects geographically distributed systems (and aren't they all?) in practice – see the back-of-envelope calculation after this list.
  4. It is easier to add point-in-time capabilities to a real-time architecture than vice versa.
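On point three, a back-of-envelope calculation makes the physical limit concrete. The figures are rough assumptions (great-circle distance from London to Sydney, light in optical fibre at about two-thirds of c), not measurements:

```python
# Rough physics, not a benchmark: London to Sydney is roughly 17,000 km
# by great circle, and light in optical fibre travels at about
# two-thirds of c, i.e. ~200,000 km/s.
DISTANCE_KM = 17_000
FIBRE_SPEED_KM_PER_S = 200_000

one_way_ms = DISTANCE_KM / FIBRE_SPEED_KM_PER_S * 1000
print(f"one way: {one_way_ms:.0f} ms, round trip: {2 * one_way_ms:.0f} ms")
# one way: 85 ms, round trip: 170 ms – before routers, queues or retries add more
```

So even an idealised intercontinental link carries tens of milliseconds of latency before any software gets involved. ®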
