From production to development databases (and back again)
Checklist for Oracle DBAs
Workshop In many organisations, it isn't just the production database that database administrators (DBAs) have to look after. There are several non-production versions, as well.
Project teams may need one to develop on, so that they can work in isolation without affecting production. A QA team may need its own version for testing patches, and general change management. Training teams may want a copy, as might the folks in helpdesk and support.
According to a 2013 enterprise data performance survey by the Independent Oracle Users' Group, 43 per cent of respondents had between three and five copies of their data held in non-production databases. Another 16 per cent had more than five, and a sorry four per cent didn't even know how many they had.
Managing this proliferation can quickly become a complex task. DBAs must keep track of the different databases, while also ensuring that the storage requirements don't get out of control. Development and test databases must sometimes be refreshed, to ensure that project teams are working with an accurate baseline set of data. And then, there are the security issues to consider.
Here are some pointers on how to manage this mixture of production and non-production databases simply and safely.
Build a policy
Managing requests is a first step for getting these things under control. Different groups will want non-production databases for different reasons. Those databases will have different requirements, including:
- Particular subsets of data that must be provided
- How regularly the database should be refreshed
- Whether incremental data is needed from the production database.
Having a spreadsheet or some other policy document will help to keep the requests - and your responses to them - straight.
Cloning your production database
The simplest way to generate a copy of your Oracle database for non-production, test and development purposes is to use the DUPLICATE command in the backup and recovery manager (RMAN). This will create an independent copy of the database being cloned, with its own database ID.
RMAN's duplication feature can also be used to select subsets of data for duplication, selectively omitting data that you don't need to copy. Use the TABLESPACE command as an option to strip out certain tablespaces. Using SKIP READONLY will avoid duplicating read-only tablespaces when cloning.
Toad for Oracle, which is one of the most popular third party tools for managing Oracle databases, can help to shed some light on RMAN functions. It has a backup explorer that can be used to monitor RMAN jobs, for example.
Selecting subsets of data
This issue of data subset selection is important. In many cases, teams requesting non-production versions of the database won't need access to all of the data. Replicating it would be a waste of space, and computing resource. Focus on tables with large volumes, and analyse the needs of the team requesting the non-production clone, to see what you can eliminate. Also, explore geographical or time-based approaches to data (does the QA team really need any customer orders made more than a year ago?).
But it's important here to know your database structure inside out. You may choose a particular subset of data (product SKUs for example) to duplicate from a database when producing a clone for non-production purposes. But if you don't spot relationships that this table has with other entities in the database, then you risk omitting key data sets during the duplication process. That could make the data inaccurate for testing and development purposes.
There are pre-packaged accelerators available from third parties that can help DBAs to subset their data quickly, by helping to spot these relationships.
Toad for Oracle has a data export feature, which allows DBAs to select which data tables they want to export for a variety of purposes. This can be one way to export data to a non-production database after it has been created. This includes features such as including only the data for a specific schema/owner, or for exporting the GRANTs on all tables for a selected user.
You can also use the Oracle Data Masking Pack, which is a management pack for Oracle Enterprise Manager, to look for some of this information.
Mask sensitive data
The Data Masking Pack can also help to deal with another key problem when generating non-production data: protecting sensitive information.
In your production database, you should hopefully have properly-defined security roles that stop people without the proper privileges getting access to sensitive data. When you clone this data into non-production databases, you often need to remove these security models altogether, so that you can test properly. That represents a security and compliance risk so it's a sensible idea to mask the sensitive data during the cloning process.
You ideally want it to be replaced with data that is still usable in a test and development environment. The masked data should still look like the formatted production data so as to tie in with the application logic that it will be tested against, but it should be different, to avoid giving away any information.
Apply changes from non-production to production servers
Some of your non-production databases will be development and test versions, on which developers will be making and validating schema changes. Eventually, you may want to merge those schema changes back into the production version. Take regular schema baselines, which you can generate using a tool such as Dell Toad. Toad also lets you compare them, so that you can see what changed.
You can then generate the SQL to make the changes between the first and second baselines and implement them manually using scripts.
Audit all non-production copies
DBAs who want to protect themselves and their organisations from any compliance blowback should take care to audit the non-production databases that they create and assign to other teams. This audit should extend across the entire life of the database, including the date and time of creation, and who created it, alongside similar details when it is destroyed.
Make a record of which sensitive data was masked. Document how the subset of the database's contents was decided, and include information about where the database resides, and who has access to it.
You may not be able to stop the requests for non-production copies of your Oracle database, but you can at least gain some control over how they're served. Proper governance and due process is the key to keeping track of this digital inventory, and folding non-production changes back into the production database. ®