
Product data quality in the information supply chain

As important as a physical supply chain


Readers may recall that I wrote last year about Silver Creek Systems®. At that time I extolled its capabilities for product data quality (matching, cleansing and classification) through the use of semantically-based content profiling and attribute identification. This is particularly relevant when information comes into the organisation in unstructured formats, since most conventional data quality tools have historically supported only structured information.

I went on to say that "if you have a complex matching problem that goes beyond conventional name and address matching (not necessarily for products) then you must talk to Silver Creek Systems". Well, I have been getting an update from the company about what it has been doing over the last year or so. And it turns out to be quite a lot.

Before I discuss what Silver Creek has done with its software, however, it is worth reporting on some research that Ventana Research has conducted on Silver Creek's behalf. It found that "80 per cent of companies are not confident about the quality of their product data" (which I don't find entirely surprising considering the widespread use of Excel spreadsheets) and that "73 per cent find it 'difficult' or 'impractical' to standardise product data". In other words, product data quality is a big problem, although it is only relatively recently that this has become obvious. Silver Creek was one of the first, if not the first, data quality companies to recognise this problem and to develop a specialised offering to address the issues involved.

The first point that I want to raise concerns what Silver Creek refers to as Information Supply Chains. By this the company means the data equivalent of a physical supply chain, and the concept matters in the context of product data because it highlights where problems with such information arise. Just as a physical chain can have many participants, so too can an Information Supply Chain. Specifically, Information Supply Chains carry product (or other) information between systems, users and organisations. And just as a physical supply chain is intended to maximise efficiency, so in an Information Supply Chain you want to enable efficient operations that range from eCommerce to product design, and from inventory to business intelligence and customer service. Also as in a physical supply chain, whenever the chain is broken, the business is likely to suffer.

Now, the point about Information Supply Chains is that there are a number of touch points where data quality can become a major issue and break the chain, particularly where data is coming from external sources. In particular (and this is what makes product data fundamentally different from other types of data) there are no universal standards for either the format or the content of the data. Different systems and users need to see the data in very different ways within very different contexts, and each product category differs from every other, with its own schemas, validation and business rules, and vocabulary. Moreover, MDM (master data management) systems need the data differently from the inventory system; the website needs it differently from the ERP system; and every system needs it in a consistent, complete and correct form, which is almost never what external systems provide!
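
To make the schema problem concrete, here is a minimal sketch in Python. The category names, attributes and rules are my own invention for illustration, not Silver Creek's data model; the point is simply that two product categories can demand entirely different required attributes and validation logic:

```python
# Hypothetical illustration: two product categories with unrelated
# attribute sets, units and validation rules -- there is no shared standard.
CATEGORY_SCHEMAS = {
    "resistor": {
        "required": ["resistance_ohms", "tolerance_pct", "power_rating_w"],
        "valid": lambda rec: 0 < rec.get("tolerance_pct", 100) <= 20,
    },
    "machine_screw": {
        "required": ["thread_size", "length_mm", "head_type", "material"],
        "valid": lambda rec: rec.get("length_mm", 0) > 0,
    },
}

def missing_attributes(category: str, record: dict) -> list:
    """List the required attributes absent from a supplier record."""
    return [a for a in CATEGORY_SCHEMAS[category]["required"] if a not in record]

# A record that would sail through a name-and-address tool still fails here:
print(missing_attributes("resistor", {"resistance_ohms": 4700}))
# -> ['tolerance_pct', 'power_rating_w']
```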

To summarise on Information Supply Chains, then: the data requires a lot of work, and the more you have to rely on manual methods to do that work, the more expensive and unreliable the solution is going to be. This is where Silver Creek Systems comes in.

What Silver Creek has done is to introduce the concept of Data Service Applications that you can place at each of the touch points in the Information Supply Chain. These are web service-enabled data quality applications, each of which is based around one or more Data Lenses. I probably need to explain that a bit further.

Each Data Lens applies semantic rules to interpret, standardise, validate and apply a quality metric to any product (or other) data, regardless of the format or domain to which that data belongs. Data Service Applications then take one or more Data Lenses and combine them with business rules to manage exceptions and to take appropriate (business) actions, so that only complete, consistent and reliable data is passed on to downstream systems. For example, suppose that you are selling a particular type of widget for which you require five specific attributes. What do you do if a description of a widget comes in from a supplier with only three of those attributes? The short answer is that you define a business rule for handling the exception: filling in the gaps by scraping a website or other data source, flagging the record for manual review, or some other form of resolution. These rules are built into the Data Service Application, which then 'handles' the exception, as sketched below.
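
Here is a minimal Python sketch of the kind of exception rule a Data Service Application encapsulates. The attribute names and the gap-filling helper are hypothetical, and Silver Creek's actual rules are defined graphically rather than in code:

```python
REQUIRED_WIDGET_ATTRIBUTES = {"type", "diameter", "material", "finish", "grade"}

def enrich_from_secondary_source(record: dict, missing: set) -> dict:
    """Stub for gap-filling from another source (e.g. a scraped catalogue);
    a real implementation would fetch and merge the missing attributes."""
    return dict(record)

def handle_supplier_record(record: dict) -> dict:
    """Accept complete records; route incomplete ones through the
    exception rule rather than passing them downstream."""
    missing = REQUIRED_WIDGET_ATTRIBUTES - record.keys()
    if not missing:
        return {"status": "accepted", "record": record}
    # Business rule: try to fill the gaps first, then fall back to review.
    enriched = enrich_from_secondary_source(record, missing)
    still_missing = REQUIRED_WIDGET_ATTRIBUTES - enriched.keys()
    if still_missing:
        return {"status": "manual_review", "missing": sorted(still_missing)}
    return {"status": "accepted", "record": enriched}

# A supplier record with only three of the five required attributes:
print(handle_supplier_record(
    {"type": "flange", "diameter": "50mm", "material": "steel"}))
# -> {'status': 'manual_review', 'missing': ['finish', 'grade']}
```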

Going back to the Information Supply Chain, the idea is that you embed Data Service Applications at each relevant point in the chain in order to ensure a smooth flow of accurate data. Note that these applications do not need to know about the source or target: they are simply applications that are called, or invoked, via a web service as required. Further, there is no actual programming involved, as the DataLens™ System has been built from the ground up to be used by business people: you create Data Service Applications simply by using graphical, drag-and-drop techniques.
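
From the caller's side, then, a Data Service Application is just another web service. The sketch below shows roughly what an invocation might look like; the endpoint URL and JSON payload shape are invented for illustration, since the actual service contract would come from your own deployment:

```python
import json
from urllib.request import Request, urlopen

# Hypothetical endpoint; not a documented Silver Creek URL.
SERVICE_URL = "http://dq.example.com/services/standardise-product"

def cleanse(record: dict) -> dict:
    """POST a raw supplier record to the data quality service and
    return the standardised result."""
    req = Request(
        SERVICE_URL,
        data=json.dumps(record).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return json.load(resp)
```

The attraction of this arrangement is that any system in the chain can call the service without knowing anything about the Data Lens definitions behind it.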

There are two really important points here. The first is that a Data Lens is better than traditional approaches when it comes to standardising, cleansing and matching unstructured product data. For example, Silver Creek cites stories from its customer base such as an electronics company that improved its quote fulfilment rate by 20 per cent in three months, worth "millions of dollars in additional quotations", and a distributor that cut manual services costs by 75 per cent while, of course, improving data quality. The real point, however, is that these companies could not have achieved anything close to these results using traditional approaches.

The second important point is this notion of a Data Service Application that can be built for the specific task in hand (Data Lenses can be reused where appropriate) and which can be initiated on demand. It is not so much that you couldn't implement this sort of solution with other products, but that I have not spoken to any other company that has such a clear vision of how these sorts of solutions should be created and deployed. This is particularly important because there are other companies trying to grab at Silver Creek's coat-tails with respect to product data quality and it is this broader vision as much as its capabilities that will keep Silver Creek ahead of the chasing pack.

Copyright © 2007, IT-Analysis.com
