Data analysis isn't dead
It ain't no good if it ain't got good data
Call me old-fashioned, but data is still pretty important. In most systems, if you feed bad data in you get bad data out (Garbage In, Garbage Out - GIGO).
And if you analyse data structures and relationships, you can eliminate a lot of poor thinking before it goes live. If I know that one of these things is always (or never) associated with one of those things; or that these things here can have no possible use or meaning once I delete that thing there; then at least some cases of incorrect processing can be identified easily, because they produce results incompatible with this "logical data model", which documents the information associated with things and the relations between them.
Or, on the other hand, if you generate your database from a complete and internally-consistent data model, some kinds of incorrect processing simply won't be possible.
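As a minimal sketch of the idea (the customer/order schema here is purely illustrative, not anything from Embarcadero's tools), a declared relationship in the database can reject garbage at the door rather than letting it surface later as a wrong report:

```python
import sqlite3

# Illustrative model: every order must reference an existing customer,
# and an order has no meaning once its customer is deleted - so the
# database forbids both kinds of incorrect processing outright.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE customer_order (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id)
    )
""")
conn.execute("INSERT INTO customer VALUES (1, 'Acme Ltd')")
conn.execute("INSERT INTO customer_order VALUES (10, 1)")  # fine: customer 1 exists

# Garbage in is rejected immediately, not discovered downstream:
try:
    conn.execute("INSERT INTO customer_order VALUES (11, 99)")  # no customer 99
except sqlite3.IntegrityError as err:
    print("rejected:", err)

try:
    conn.execute("DELETE FROM customer WHERE id = 1")  # orders still reference it
except sqlite3.IntegrityError as err:
    print("rejected:", err)
```

The point is not the SQL itself but that the constraints are a machine-checkable statement of the logical data model: code that violates them fails loudly instead of silently producing bad data.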
Data analysis is especially useful because it is usually an independent check on systems development - the data analysts are usually a separate team to the coders and make different errors and assumptions. If the data model doesn't match the code then one or the other, or both, are wrong.
Data analysis was big in the 1980s, when the curious idea was practised that it might be good if all your customer information, say, was stored once and only once, in a single database - a single source of the "truth".
Then Objects came along and data didn't matter much for a while. Objects were always right even if their associated data was rather undefined. Then, powered by some nasty things like Y2K (when you suddenly wanted to know where dates were and how they were used) and company directors signing off on financial reports (on pain of going to jail), data started to get important again...
So I was a little saddened when Donna Burbank, the director of enterprise modelling and architecture solutions at Embarcadero, told me that one of her reasons for leaving CA and moving to Embarcadero (one of only a few vendors of effective data analysis and DBA tools - BMC is another) was that CA's new focus on ITIL was putting data analysis in the shade.
What sense does this make? Surely ITIL doesn't say that data isn't important? Good data is at the core of IT governance - and IT governance (as part of corporate governance generally) is why firms should be implementing ITIL. Or is ITIL simply an end in itself, a tickbox in a magic scroll which you wave to keep auditors away? I hope not; it is worth more than that (it would also make for a very expensive magic scroll).
Anyway, Embarcadero is certainly not abandoning data. It sees data as the core of a business - and control of data quality is vital to SOX (Sarbanes Oxley) and Basel II compliance and the like. In fact, I think this has probably been a nice little earner for Embarcadero.
Now, Donna claims, it is moving on to the next stage, having done a pretty good job of assisting the DBA team with its automated tools. The "next stage" is adding a Business Process Modelling capability to the metadata repository which describes a company's data and their relationships. It's really a visualisation exercise for the business, based on the repository - and the repository keeps it honest because it can be validated for consistency and completeness, and it manages "real" operational data.
Expect new Eclipse-based tools from Embarcadero, based on a new process-modelling framework, in October. These will bridge both logical and physical viewpoints and provide a conceptual mapping from the business into the repository. You should be able to reuse analysis at the object level, without necessarily having the whole model in place (early attempts at this sort of thing failed because they expected a complete and validated "Corporate Data Model", and no one ever finished building one). In fact, you can probably import an existing high-level conceptual model and use it, with its shortcomings (missing objects and information) highlighted.
Oh, and if you're a DBA who's pretty happy with Embarcadero's ER Studio, don't worry. According to Donna, "we are very protective of our ER Studio customers, they're already happy". So the development team has split, and Embarcadero's new framework is a fork: no one will be forced to migrate. An ER Studio v7.1 product is promised.
This will apply data security classification schemes to document information security and introduce a Model Validation wizard, which can help you check model completeness and review it against appropriate standards and best practices. It also includes workflow and productivity improvements (and N-level Undo/Redo) as well as many detailed technical updates. Database support is also enhanced (for example, foreign keys in MySQL 5 are now supported, as are SQL Server 2005 schemas).
But, whether you are a DBA managing database implementations or a company auditor managing SOX compliance, just remember this: data really is important.
David Norfolk is the author of IT Governance, published by Thorogood.