Adrift on a sea of data: Architecting for GDPR
The European regulator cometh
I’ve spent many hundreds of hours listening to sales pitches from technology vendors but it’s only during the last year I’ve started to find them rather depressing. That’s been thanks to the arrival in 2018 of the European Union’s General Data Protection Regulation.
For example, I was recently pitched to about one particular company’s new GDPR “solution”. To be fair, it was a funky security package that actually looked pretty cool, but this unnamed vendor went and spoilt everything by putting a GDPR label on it. The bit where they tried to tell me that I wouldn’t have to redevelop any of my existing systems in order to integrate their software only increased my sense of desperation not to be in this meeting.
Memo to the tech industry: stop trying to tell me that there’s such a thing as a piece of kit, or a suite of applications, or a cloud service, that will make my business GDPR compliant.
The way you make your organisation GDPR compliant is by employing strong controls around your data. By running strong training and awareness campaigns for all your staff. By ensuring that your access controls give users access to the data they need to work with, and no more. By defining rigid data retention policies and destroying data when the policy says you should. The policies, the processes, the procedures – the controls – are what you need.
Yes, systems have their place - it’s just that this particular place presents itself once you’ve figured out the processes to stay secure. This comes from implementing a culture around GDPR. As a colleague of mine put it rather eloquently: “[The] systems come at the very end of the process.”
But what does such a culture look like?
Security information and event management
SIEM has been growing as a concept in recent years: Gartner cited 15.8 per cent growth in the sector in 2015, and Cybersecurity Ventures sees the wider threat intelligence security market as growing from $3bn in 2015 to around $6bn by 2020. Why do we care about SIEM for GDPR? Simple: if you get hacked (and you should assume you will) an intelligence system can not only alert you to this but can help you with that most difficult of tasks: identifying which systems were compromised and hence deducing what data was accessed.
SIEM offerings come in a variety of shapes and sizes: you can spend six figures on an appliance-based product, or at the other extreme you can do it yourself using open-source software. The main requirement of a SIEM installation is that it’s physically and logically separate from the systems whose event logs and log streams it’s consuming: the idea is that if someone’s hacked the enterprise systems, they can’t jump off onto the SIEM system and cover their tracks by deleting the evidence in the log files.
So if you’re taking a software approach to your SIEM offering, you don’t want to host it on your core virtual server farm and SAN: a better approach is to put it in a separate little world, heavily firewalled and managed separately from the enterprise network – check out the micro-server offerings if you have a small setup, or take an HCI approach with a handful of nodes if your log volumes would swamp a micro-server.
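To make that separation concrete, here is a minimal sketch of an application shipping its audit events off the box to a separate collector as they happen, using the syslog handler in Python's standard library. The function name build_audit_logger, the logger name and the message layout are assumptions for illustration and are not tied to any particular SIEM product; a real deployment would more likely use TCP with TLS or a dedicated log shipper.

```python
import logging
import logging.handlers


def build_audit_logger(siem_host: str, siem_port: int = 514,
                       name: str = "audit") -> logging.Logger:
    """Return a logger that forwards every record to a remote syslog collector.

    Call this once at start-up: each call attaches a new handler.
    siem_host and siem_port are hypothetical and should point at the
    separately managed log collector, not at the application's own host.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep audit events out of local root handlers

    # SysLogHandler sends over UDP by default; events leave the machine
    # immediately, so wiping local files later does not remove them.
    handler = logging.handlers.SysLogHandler(address=(siem_host, siem_port))
    handler.setFormatter(
        logging.Formatter("%(name)s: %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

An application would call something like build_audit_logger("siem.example.internal") at start-up (the host name is made up) and then log events such as "user=alice action=read resource=customers.db". The point of the design is simply that the evidence lands somewhere an intruder on the enterprise systems cannot easily reach.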
Data loss prevention
I like to think of DLP as Data Leakage Prevention, but there’s no need to get hung up on the expansion of the abbreviation. DLP is all about having the technology watch your data and alert you if something’s leaving your organisation when it shouldn’t.
The desire to move from a purely on-premises setup to hybrid infrastructure is strong: whatever your reasons for maintaining in-house servers and apps, by not going hybrid you miss out on some pretty cool features and capabilities in the public cloud. One such area is DLP.
For example, one particular public email service will perform, out-of-the-box, heuristic analysis of outgoing emails and spot (for instance) credit card information or bank details being sent via email. You can build on top of these basic features, too.
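The sort of check involved can be sketched in a few lines. What follows is not how any particular email service does it, just a minimal illustration under stated assumptions: find runs of 13 to 19 digits in a message body (allowing spaces or hyphens between them) and keep only those that pass the Luhn checksum used by payment card numbers. The names CARD_CANDIDATE, luhn_valid and find_card_numbers are made up for this example.

```python
import re

# 13 to 19 digits, each optionally followed by a single space or hyphen.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return candidate card numbers in text that pass the Luhn check."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

Run against an outgoing message containing the well-known test number 4111 1111 1111 1111, find_card_numbers flags it, while an arbitrary 16-digit order reference will usually fail the checksum and be ignored. A production DLP tool layers much more on top (context, issuer prefixes, attachments), so treat this as the basic idea only.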
Which brings us to another concept – a concept that I consider to be part of a broader trend in the industry – that of “convergence”. What I’m talking about here in particular is network admission. This is about making your servers, storage and network (but mostly the servers and the storage) integrate properly with each other so that people can’t stroll in, connect their laptop to your LAN, and hoover out your Personally Identifiable Information.
Servers and storage, however, have been converging for years through various combinations of hardware, interfaces and software. Convergence has of late, though, become “hyper” convergence. The key attributes of this hyper convergence are combined server, storage and network appliances running features or workloads increasingly written in the software layer, that employ virtualisation and microservices, and that are managed centrally via a single interface – or a “single pane of glass.”
It’s these centralised, managed stacks, which run mission-critical applications and increasingly act as conduits for data, that provide the opportunity to peer deep into your infrastructure and to manage data to the level demanded by GDPR.
All of which brings us onto the actual data itself – and knowing what you have and where it is. These are the two biggest problems I see in any organisation that’s dealing with implementation of GDPR. And that, in some ways, is a bit worrying given that much of what the new legislation requires is already part of existing data protection laws that we’ve all had for 20 plus years. It’s far from trivial to know precisely what data you have, where it is, who’s responsible for it and how old it is. It’s here that the right systems and tools can help. Yes, part of this exercise is manual, meaning you’ll need to sit down with people and ask questions, but you can make the compliance task itself easier using the “right” tools.
- Where you have a mix of on-prem and cloud storage, appliances or software that provide a unified, single virtual view of the storage estate make it easier to manage your data.
- Encrypting your data at rest and in transit makes it far easier to be compliant in the event of a breach: if strongly encrypted data is stolen your liabilities are minimised. Ticking that box on your SAN’s GUI to turn on the encryption function is a good move, then, and it’s so easy to do.
- If you have central control over all your tiers of storage – including the live, backup and archive repositories, both on-premises and off – the chances are that compliance with the disposal dates on your data retention policy will be made easier.
- And if you can go a step further and virtualise all the distributed storage into a single view too – servers and user devices – that’s another step toward proper control over your data and data protection compliance.
- Even if you don’t initially know what data you have or – more commonly – who uses it, central logging of who’s accessing what will provide you with the clues you need to track down the owners of the data you’ve not yet been able to attribute. It will also enable you to respond effectively when you get hacked.
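The last of those points, using central access logs to find owners and to scope a breach, can be sketched as follows. This is a minimal illustration that assumes the logs have already been reduced to (user, resource) pairs; the function names likely_owners and resources_accessed_by, and that record shape, are hypothetical rather than the output of any specific logging product.

```python
from collections import Counter, defaultdict


def likely_owners(access_events, top_n=1):
    """Map each resource to the users who access it most often.

    access_events is an iterable of (user, resource) pairs. The heaviest
    users of a data set are only candidates to ask about ownership,
    not proof of it.
    """
    counts = defaultdict(Counter)
    for user, resource in access_events:
        counts[resource][user] += 1
    return {
        resource: [user for user, _ in per_user.most_common(top_n)]
        for resource, per_user in counts.items()
    }


def resources_accessed_by(access_events, user):
    """List every resource a given account touched, e.g. a compromised one."""
    return sorted({res for who, res in access_events if who == user})
```

Given events in which one account reads a customer export far more often than anyone else, likely_owners points you at the person to go and talk to, and resources_accessed_by gives a first answer to "what could this account have seen?" after a compromise. Neither replaces the manual conversations; they just tell you where to start.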
GDPR comes into force in May 2018. Complying with the EU’s new rules is a combination of policies and procedures and of understanding your data.
Systems can help you do the latter and implement the former. Hybrid systems in particular give you the best pick of tools: some on premises and others running in the public cloud.
In all of this, hyperconvergence is key: it provides a centralised system of management and control that lets you see as much of your world as possible and act accordingly – to police data, protect data, remove data as needed. Only this visibility and control will give you the power to act and to demonstrate to the authorities that you have acted – or can act if called upon.
And that at the end of the day is perhaps most important. Because, if you can’t reach deep into your infrastructure, to control and manage the data, you’ll be looking at some pretty hefty fines.
Supported by HPE