How do you copy 60m files?
Apart from telling someone else to do it, that is
Sysadmin blog Recently I copied 60 million files from one Windows file server to another. Tools used to move files from system to another are integrated into every operating system, and there are third party options too. The tasks they perform are so common that we tend to ignore their limits.
Many systems administrators are guilty of not knowing exactly where the limits of their file management tools lie - at least until they run up against them.
But I know that using Windows Explorer in Windows XP/Server 2003 would be lunacy. The first permissions, filename or path length problem and the copy grinds to a halt.
If I were moving files from one server to another, a halt would just be a minor annoyance. The moved files are on the destination server. If the transfer halts, restart with the faulting file omitted. Irritating and time consuming, but not difficult.
Copying is another matter entirely. The file manager must handle exceptions. If it doesn't, then you have to do a lot of checking to find out what has been copied and what hasn’t.
FTP is one of my favourite ways to handle this problem. A good FTP client is designed with all sorts of abnormal situations in mind. It has bandwidth and treading controls, a transfer queue, the ability to resume failed transfers, and it reconnects to the target server if the connection is lost.
Decades after it was created, FTP still remains one of the best ways to move files. But no graphical FTP client I could find would cope with 60 million files. Filezilla blew up somewhere around one million. WS-FTP managed a few more. None of them were capable of more than about four million files.
My next attempt was to package them into a ball on the originating server and unpack them on the destination. No good. Neither Windows Server 2003’s native zip utilities, WinZip, 7Zip or WinRAR were up to it. Somewhere between four and ten million files all of them threw an exception and died.
Knowing that Windows 7/Server 2008 R2’s Windows Explorer is more advanced than the Server 2003 version, I tried using a third server to move the files from A to B via C (which was Server 2008 R2). It's much better at handling exceptions, but it too fell apart at four million files too.
Consulting my flash keyfob, I started trying my sysadmin tools. XXCopy, FastCopy, TeraCopy and Beyond Compare all made valiant, but ultimately ineffective, attempts. Of the GUI tools I tried, only Richcopy was able to handle the load. Richcopy is a free multi-threaded file management application written by Ken Tamaru at Microsoft. Increasingly it is my fallback for handling odd or exceptional file transfer scenarios.
I wanted to give several command-line tools a go as well. XCopy and Robocopy most likely would have been able to handle the file volume but - like Windows Explorer - they are bound by the fact that NTFS can store files with longer names and greater path than CMD can handle. I tried ever more complicated batch files, with various loops in them, in an attempt to deal with the path depth issues. I failed.
What worked brilliantly was using a Linux virtual machine. A simple default CentOS 5.5 install was able to mount SMB shares on both the originator and destination machines. From there, the command-line tool cp was able to succeed where every tool except Richcopy failed.
The Linux command line tool cp, while slower than Richcopy, copies in a linear fashion. It copies each file in sequence, leading to no fragmentation on the destination server.
Richcopy can handle large quantities of files, but can multi-thread the copy, and so is several hours faster than using a Linux server as an intermediary. The disadvantage is that the resulting file system on the destination server is heavily fragmented. You could restrict Richcopy to a single thread, but then it is no faster than cp.
Richcopy is not so fast that there is time to defragment an NTFS partition with 60 million files on it before CP would have finished. So the best way to move 60 million files from one Windows server to another turns out to be: use Linux.