Original URL: http://www.theregister.co.uk/2010/05/19/user_data_everywhere/

User Data: Here, there, everywhere

But rarely where you need it

By Trevor Pott

Posted in Servers, 19th May 2010 13:39 GMT

Every computer user in the world has heard tales of “the computer that ate my files”. Perhaps the magical write-limit fairy arrived and turned your SSD back into a pumpkin. The infamous “someone, definitely not me” could have opened an infected email or Facebooked up an infected flash ad, corrupting the OS and causing all sorts of merriment.

The reasons all vary, but the result is always the same: there are no backups, and you had just put some incredible amount of time into a project on that computer which is absolutely crucial to your personal survival in the modern world. Since computer components eventually die, what can possibly be done to prevent this?

The single most important measure of preventative digital medicine (ironically also that which is most frequently ignored) is to back up your data. The second most important item is to make your data fault tolerant. In a business IT environment both of these can usually be accomplished in one fell swoop: put all of the user data on the servers, not the local computers. The servers are regularly backed up, have RAID and other goodies for fault tolerance, and are looked after by a cadre of good looking, witty, intelligent, well paid systems administrators. (Well, I hope the servers at least have RAID and scheduled backups.)

How then do we get the user data onto the servers? User data is generally stored in two locations: the user’s profile, and their homefolder. In Windows, these are most often separate concepts. In Linux, Unix or OSX the profile is most often a part of the homefolder. When talking about a scenario where all of your users are always connected to the same network (ie no roaming or VPN users) then Linux, Unix and Mac admins have it easy. They configure their server to export an NFS share, mount /home (or its equivalent) from the server, and this whole mess of trying to get user data to live on the server is done and dusted.

Windows doesn’t offer anything quite so simple. Windows has at its disposal “roaming profiles” and “folder redirection”, neither of which is a complete solution. Folder redirection is simple enough: you can tell certain pre-determined folders that they exist on the server, not on the local computer.

In a Windows environment, homefolder remapping is the canonical example of this practice. The limitation being that you can not arbitrarily remap any folder you choose; Microsoft has determined that there are a limited set of folders to which you may do this.

Most importantly, you can not redirect the entirety of a user’s profile to the server, thus preventing a Unix-like easy approach to this problem. Both the Unix approach of mapping the homefolder/profile set entirely to the server and the Windows approach of very selectively synchronising or redirecting folders to the server completely break down when you start talking about remote (VPN or laptop) users or individuals who log onto multiple machines.

If you look at the traditional Unix approach to keeping the user’s data on the servers, the instant the connection between the user’s computer and the server is severed, the user loses all access to that information. A laptop user can then do nothing without a VPN connection; there is no local copy of the data on his laptop. There are some attempts to get around this, but the really short answer is that there is no nice and clean solution. If you are using Unix, then either your users' information lives on their system, or it lives on the server. Attempts to bridge the gap in Unix are…generally very complicated.

This is where the concept of “roaming profiles” starts to take hold. Roaming profiles synchronise the local copy of the user profile to the server, in essence making a backup copy of all that information every time a user logs off. If this sounds like the ideal solution to the remote user problem then be warned that roaming profiles do have their flaws. Some applications leave temp files, caches, buffers or other large files in folders that get replicated to the server.

There is nothing quite so frustrating as waiting patiently for your computer to log off over the VPN, only to realise Acrobat decided that Application Data (a folder that gets replicated to the server) was the best possible place for 4GB worth of temp files. Conversely, roaming profiles don’t actually copy the complete user profile to the server. Many programs that don’t play by Microsoft’s rulebook won’t have their information stored on the server.

Firefox is one such; if the copy of the profile on your local computer is lost or corrupted for any reason, there go all your Firefox bookmarks and settings. Some of the more severely misbehaved programs can actually refuse to release locks on the user’s portion of the registry upon log-off, which in some cases has the effect of preventing the profile from being copied to the server at all. Microsoft has created the profile hive cleanup service to help administrators deal with this issue, but the local computer can often still require a reboot after log-off to complete unlocking of the user’s profile.
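The runaway temp-file problem described above is at least easy to look for before it bites at log-off. What follows is only a minimal sketch in Python, assuming the profile is an ordinary directory tree you can read; the helper names folder_sizes and oversized, the folder names in the demo and the size limit are all illustrative assumptions, not anything Windows or Active Directory provides.

```python
import os
import tempfile


def folder_sizes(profile_root):
    """Return {top-level folder name: total bytes of the files beneath it}."""
    sizes = {}
    for entry in os.scandir(profile_root):
        if not entry.is_dir(follow_symlinks=False):
            continue  # only top-level folders are measured in this sketch
        total = 0
        for dirpath, _dirnames, filenames in os.walk(entry.path):
            for name in filenames:
                try:
                    total += os.path.getsize(os.path.join(dirpath, name))
                except OSError:
                    pass  # file vanished or is locked; skip it
        sizes[entry.name] = total
    return sizes


def oversized(sizes, limit_bytes):
    """List (folder, bytes) pairs above limit_bytes, largest first."""
    big = [(name, size) for name, size in sizes.items() if size > limit_bytes]
    return sorted(big, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Build a throwaway fake profile so the demo touches no real user data.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "Application Data"))
        os.makedirs(os.path.join(root, "Desktop"))
        with open(os.path.join(root, "Application Data", "cache.tmp"), "wb") as f:
            f.write(b"\0" * 4096)
        with open(os.path.join(root, "Desktop", "note.txt"), "wb") as f:
            f.write(b"hello")
        for name, size in oversized(folder_sizes(root), limit_bytes=1024):
            print(f"{name}: {size} bytes")
```

Pointed at a real profile path with a sensible limit, something along these lines could run from a logon or logoff script and flag the offending folder before it is dragged across the VPN.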

An issue that is mostly one of user training, but still quite a common problem, is the reality that roaming profiles are only as current as the last time that user logged off their local computer. (Assuming they were even connected to the corporate network at the time.) With laptop hibernation, users can sometimes go months before logging off or rebooting their laptops, again with no guarantees that when they do so they are connected to the corporate network so that profile synchronisation can occur. Systems administrators must also take into account that the user generally feels secure in the knowledge that their information is “backed up on the server” regardless of the procedures they personally follow. This can result in a policy and training issue which has proven very hard for some organisations to overcome.

The other scenarios that break roaming profiles are those in which the same user logs onto multiple machines; the results are generally not pretty. To illustrate, I will detail for you the very common “desktop folder” scenario:

Alice logs onto Computer A, creates a file and saves it to the desktop. Alice goes for lunch and logs Computer A off; the file on her desktop synchronises with the rest of her profile to the server. She returns from lunch, logs back onto Computer A, finishes her day but forgets to log off of Computer A. When she returns in the morning, she logs onto Computer B, and the server supplies her the last copy of the profile it had (circa lunchtime the previous day).

At some point, Alice notices the file on the desktop. Having no further need of it, she deletes the file. When Alice goes to lunch, she logs off Computer B. Noticing she left computer A logged in from yesterday (oops) she logs it off as well. When she returns from lunch and logs back on to Computer B, the file she deleted off the desktop is back again!

When Alice deleted the file on Computer B, there were two other copies of that file in existence, one on the server, and one on Computer A. While the copy on the server would have been deleted during the synchronisation process of Computer B’s logoff, it was Computer A that was logged out last. Computer A did exactly what it was designed to do: it copied all the files it had over to the server, including the copy of the file that Alice had deleted. The synchronisation process can rapidly lose track of what it should or should not delete in such scenarios and defaults to “just don’t delete it.”
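Alice's two days can be replayed in a few lines of Python. This is a toy model under loudly stated assumptions, not Microsoft's actual synchronisation algorithm: a logon copies the server profile down, and a logoff removes from the server anything deleted locally since logon, then copies every local file up. The Computer and Server classes and the file name report.doc are hypothetical.

```python
class Server:
    """Holds the central copy of the profile: filename -> contents."""

    def __init__(self):
        self.profile = {}


class Computer:
    """One machine's local copy of the profile (toy model, see assumptions)."""

    def __init__(self, name):
        self.name = name
        self.local = {}
        self.at_logon = set()

    def logon(self, server):
        # Assumption: logon hands the user whatever the server holds right now.
        self.local = dict(server.profile)
        self.at_logon = set(self.local)

    def logoff(self, server):
        # Files this machine saw at logon but no longer has were deleted here.
        for name in self.at_logon - set(self.local):
            server.profile.pop(name, None)
        # Everything still held locally is copied up, whether stale or not.
        server.profile.update(self.local)


def replay_alice_scenario():
    server, a, b = Server(), Computer("A"), Computer("B")

    a.logon(server)
    a.local["report.doc"] = "draft"   # saved to the desktop
    a.logoff(server)                  # lunch: file reaches the server
    a.logon(server)                   # back from lunch, never logged off

    b.logon(server)                   # next morning on Computer B
    del b.local["report.doc"]         # Alice deletes the file
    b.logoff(server)                  # lunch: deletion reaches the server
    after_b = set(server.profile)

    a.logoff(server)                  # Computer A finally logs off
    after_a = set(server.profile)

    b.logon(server)                   # back from lunch on Computer B
    return after_b, after_a, set(b.local)


if __name__ == "__main__":
    after_b, after_a, b_desktop = replay_alice_scenario()
    print("server after B logs off:", after_b)
    print("server after A logs off:", after_a)
    print("Computer B after re-logon:", b_desktop)
```

In this model the server copy is empty once Computer B logs off, and the file returns the moment Computer A pushes its stale copy up. Computer A has no record of the deletion, so from its point of view the file is simply one the server is missing.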

The long story short is that the only way you are ever getting rid of that file is to kill it out of every copy of the profile that exists on the server and all local machines Alice had logged on to. To complicate the scenario a bit, try widening the number of potential computers that Alice may log onto by five. Let’s also assume that one or two of these computers she only logs onto every few months.

The strangled scream you hear in the background is that of administrators who have dealt with this one before. It is generous to say that getting rid of the detritus that can accumulate in a profile under these conditions is exceptionally tedious.

Overall, the task of ensuring that user data is stored on (or at least regularly backed up to) the server is generally a nightmare, with actual implementation having to be tailored to your individual usage cases. Logon and logoff scripts can help a great deal, and if your usage scenarios allow for it, judicious use of a combination of both folder redirection and roaming profiles can really help.

Roaming profiles are a huge step forward in this, and this feature is one of the many reasons why Windows servers running Active Directory are so popular. Roaming profiles do however have a very long way to go before they are easy, efficient and bug free. For some of these issues, it is tempting to lay all complaints at the feet of third party application developers; roaming profiles for remote user usage cases would probably work great if everyone just played along.

Even with that in mind, I can’t find it within myself to absolve Microsoft of all guilt or blame. Roaming profiles have been around for a long time, long enough for Microsoft to have given us much finer grained controls over what gets replicated, how, and what can trigger it. Considering the plethora of usage cases, what is available from either Microsoft or its competitors is just too clumsy and primitive.

My next article will delve into the details of making the best use of the tools available to us; to overcome the inherent limitations of these technologies by combining them with each other and throwing in some simple scripting.