Clouds mass over data warehousing

DAAStardly goings-on

Comment Suddenly the data warehousing sector seems to be hotting up. There's EMC's new competency centre and now Kognitio's in-memory data warehouse, which threatens to brush server vendors aside if the idea gets adopted big time. How does that one work?

The story goes like this: cluster lots of servers together in a shared-nothing architecture and use parallelising data-warehouse software - WX2 in this case - to treat them as a single, highly parallel resource. The servers all execute different threads of queries against data that is held in the servers' DRAM as an in-memory database. All other data, such as query results or any fraction of the data warehouse that is not in memory, is stored on disk - the servers' directly-attached disk and not a networked disk resource such as a SAN or a NAS box.
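The scatter-and-gather idea behind a shared-nothing set-up can be sketched in a few lines of Python. This is an illustration only and not WX2's actual interface, which the article does not describe: the rows, the `partition`, `local_sum` and `parallel_total` names, and the use of threads to stand in for separate servers are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sales rows as (store_id, amount); not data from the article.
ROWS = [(i % 7, float(i)) for i in range(1, 1001)]


def partition(rows, node_count):
    """Hash-partition rows so each node owns a disjoint slice of the data."""
    parts = [[] for _ in range(node_count)]
    for row in rows:
        parts[hash(row[0]) % node_count].append(row)
    return parts


def local_sum(node_rows):
    """Runs on one node, touching only the rows held in that node's memory."""
    count = len(node_rows)
    total = sum(amount for _, amount in node_rows)
    return count, total


def parallel_total(rows, node_count=4):
    """Scatter the query to every node, then merge the partial results."""
    parts = partition(rows, node_count)
    # Threads stand in for servers here; a real grid would use the network.
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(local_sum, parts))
    count = sum(c for c, _ in partials)
    total = sum(t for _, t in partials)
    return count, total


if __name__ == "__main__":
    print(parallel_total(ROWS))
```

No node reads another node's partition, so the only shared step is the final merge of the small partial results.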

Generally, with a disk-based data warehouse, only a fraction of the data is stored in memory, and queries executed against that fraction are looking at a data sample and not the full warehouse. Results from a full-warehouse query are statistically much more likely to be correct.

Roger Gaskell, the chief technology officer of Kognitio, says the firm is currently bidding for a 40TB data warehouse and its bid is less expensive than the installed DW system based on storage arrays and many servers. But how can a 40TB memory-based system be cheaper?

It's cheaper in memory than on disk

The prospective customer, a large US business with a retail interest, currently has a 600TB data warehouse stored on a Fibre Channel-accessed modular drive-array resource, with queries processed by high-cost servers. Kognitio's bid is for 600 servers in a cluster - or, more accurately, a grid set-up - which collectively have 40TB of DRAM and 600TB of disk, the disk being server direct-attached drives rather than modular arrays.
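A quick back-of-the-envelope check shows what those totals mean per server. The sketch below assumes decimal units (1TB = 1,000GB) and an even spread of memory and disk across the 600 servers; the article states neither, and the variable names are ours.

```python
# Figures quoted in the article.
SERVERS = 600
DRAM_TB = 40
DISK_TB = 600

# Assumption: decimal units, 1TB = 1,000GB.
GB_PER_TB = 1000

# Assumption: DRAM and disk are spread evenly across the grid.
dram_per_server_gb = DRAM_TB * GB_PER_TB / SERVERS
disk_per_server_tb = DISK_TB / SERVERS

# Ratio of total DRAM to total disk capacity in the proposed grid.
dram_to_disk_ratio = DRAM_TB / DISK_TB

print(f"DRAM per server: {dram_per_server_gb:.1f}GB")
print(f"Disk per server: {disk_per_server_tb:.1f}TB")
print(f"DRAM as a share of disk capacity: {dram_to_disk_ratio:.1%}")
```

On those assumptions each box carries roughly 67GB of DRAM and 1TB of disk.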

The servers are low-cost Dell or HP x86 servers and the cost of this set-up will be around $4,000,000, whereas the cost of the installed system was $5,000,000. Gaskell said that because the servers are so cheap, "The disk storage is almost free."

Gaskell told The Reg that the Kognitio system will be radically faster in answering queries - up to 80 times faster - than the disk-based system. The customer is looking to replace or augment the existing DW array-based system because complex queries can now take four hours or more, and they would be answered in three to six minutes on the in-memory Kognitio warehouse.
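The quoted times can be turned into a speed-up range. This sketch takes four hours as the disk-based baseline, although the article says such queries can run longer than that, so the real ratios could be higher; the variable names are ours.

```python
# Baseline from the article: complex queries take about four hours today.
disk_minutes = 4 * 60

# Quoted in-memory response times, in minutes.
fastest_minutes = 3
slowest_minutes = 6

best_speedup = disk_minutes / fastest_minutes
worst_speedup = disk_minutes / slowest_minutes

print(f"Speed-up range: {worst_speedup:.0f}x to {best_speedup:.0f}x")
```

So the three-minute figure matches the "up to 80 times" claim, while a six-minute answer would be a 40-fold improvement over a four-hour query.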

If this is true - that is, if the proposed system really is 80 times faster and a fifth less expensive - then it's a steal. Gaskell wouldn't identify the prospective customer because that company didn't want to upset its incumbent vendors. You can see why: Kognitio's technology renders DW use of storage arrays redundant. This customer still gets 600TB of disk but it will be paying a much lower server-drive price rather than storage-array prices. Gaskell says, "You can get a terabyte of disk for about $400 on an HP rack-mount server."
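The money side can be checked the same way. The sketch below assumes Gaskell's $400-per-terabyte figure applies to all 600TB of disk in the bid, which the article does not state outright; the variable names are ours.

```python
# Prices quoted in the article, in US dollars.
kognitio_bid = 4_000_000
installed_system = 5_000_000

# Fractional saving of the bid against the installed system.
saving = (installed_system - kognitio_bid) / installed_system

# Assumption: the quoted $400 per TB holds for every terabyte in the bid.
price_per_tb = 400
disk_tb = 600
disk_cost = price_per_tb * disk_tb

# Share of the total bid that the disk would account for.
disk_share_of_bid = disk_cost / kognitio_bid

print(f"Saving: {saving:.0%}")
print(f"Disk cost: ${disk_cost:,} ({disk_share_of_bid:.0%} of the bid)")
```

On that assumption the disk comes to $240,000, about 6 per cent of the bid, which is the sense in which the storage is "almost free".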

Why not use flash storage instead of DRAM? Wouldn't it be cheaper still? Yes, it would, said Gaskell, but as a drive-array substitute it would only be two to three times faster than disk instead of 80 times faster, and the whole reason for going in-memory is to achieve the speed needed to get real-time response to queries.

Why not use a single big chunk of DRAM, like a TMS RamSan? "We have a shared-nothing architecture for reliability," said Gaskell. "If a server goes down we can work around that." The implication is that if the links to the RamSan, or the RamSan itself, go down then, oops, your real-time response just went dead.
