Exchange 2010 dumps single instance storage
Say hello again to attachment duplication
Single instance storage (SIS) has been part of Exchange since its early versions, and Microsoft has now removed it from Exchange 2010, ensuring that duplicated attachments will be stored in all their redundant, space-gobbling glory on Exchange servers' disks. Why has Microsoft taken this apparently retrograde step?
A Microsoft Social Technet entry reads:
Header data for all mailbox items is stored in a single database table — this change makes the database more efficient because it can process a single table for a mailbox during a client session instead of accessing different tables for different mailbox folders. A side effect of this schema change is that Exchange no longer uses Single Instance Storage (SIS) to keep just one copy of message content per database. Most servers support multiple databases, so the efficiency gained from SIS is less and less as time goes on.
This talks of Exchange server efficiency and doesn't mention storage array efficiency, in which the storage of gigabytes of duplicate data is to be abhorred.
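For readers who have not met the idea, the storage saving at stake can be sketched in a few lines of Python. This is a toy illustration only: the class name `SingleInstanceStore`, the hash-based lookup and the 50-recipient example are assumptions made for the sketch, not a description of how Exchange implemented SIS internally.

```python
import hashlib


class SingleInstanceStore:
    """Toy store (hypothetical): identical attachments are kept only once."""

    def __init__(self):
        self.blobs = {}      # content digest -> attachment bytes
        self.mailboxes = {}  # mailbox name -> list of content digests

    def deliver(self, mailbox, attachment):
        # Identify the attachment by its content, store it once,
        # and give each mailbox a reference to the shared copy.
        digest = hashlib.sha256(attachment).hexdigest()
        self.blobs.setdefault(digest, attachment)
        self.mailboxes.setdefault(mailbox, []).append(digest)

    def stored_bytes(self):
        return sum(len(blob) for blob in self.blobs.values())


def duplicated_bytes(deliveries):
    """Space used when every mailbox keeps its own copy (no SIS)."""
    return sum(len(data) for _, data in deliveries)


# Assumed example: one 1,000,000-byte attachment sent to 50 mailboxes.
attachment = b"x" * 1_000_000
deliveries = [(f"user{i}", attachment) for i in range(50)]

store = SingleInstanceStore()
for mailbox, data in deliveries:
    store.deliver(mailbox, data)

print("with single instancing:", store.stored_bytes(), "bytes")
print("with one copy per mailbox:", duplicated_bytes(deliveries), "bytes")
```

The saving only applies to copies that land in the same store, which is the point the TechNet entry makes about servers hosting multiple databases.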
A change in the opposite direction concerns compression:
The Store compresses attachments — Microsoft calculates that the CPU time spent compressing and decompressing attachments is less than the work required to manage the storage of very large uncompressed data within the database. This change also reduces the overall size of Exchange databases, which speeds up operations such as backups.
Again the focus of optimisation is the server, with storage efficiency a byproduct.
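The trade-off Microsoft describes, spending CPU cycles to store fewer bytes, can be illustrated with a minimal sketch. Python's standard `zlib` module is used here purely as a stand-in, since the source does not say which algorithm the Store uses; the function name `compress_attachment` and the sample data are likewise assumptions.

```python
import zlib


def compress_attachment(data: bytes) -> bytes:
    """Compress an attachment before it is written to the (hypothetical) store."""
    return zlib.compress(data, 6)


# Assumed sample: highly repetitive text, which compresses well.
# Already-compressed formats such as JPEG or ZIP would shrink far less.
attachment = b"Quarterly report, same boilerplate on every page.\n" * 20_000

stored = compress_attachment(attachment)
restored = zlib.decompress(stored)

print("original size:", len(attachment), "bytes")
print("stored size:  ", len(stored), "bytes")
print("round trip intact:", restored == attachment)
```

How much space this recovers depends entirely on what users attach, so it offsets the loss of SIS only in part.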
Previously we have learnt that Exchange 2010's I/O improvements meant that its database could use cheap high-capacity SATA drives instead of faster, more expensive Fibre Channel or SAS disk drives. Now we see that high capacity SATA drives may well be needed anyway because of attachment duplication.
Across a large enterprise with tens of thousands of Exchange users, there must be the potential for multiple tens of gigabytes of wasted storage space - if not terabytes. ®
Backup is cheap.
Deal with it by buying more. Oh, oops.
Exchange Footprint is Too Big
The cost of hosting Exchange is rising exponentially. SIS was at least a step towards shrinking storage requirements for Exchange. Exchange storage requirements are becoming huge and the processing requirements are having to increase to cope.
The trouble is that the space saved with SIS is nothing compared to the huge emails created by Exchange/Outlook/Word. Have you ever seen the amount of wasted space in typical Exchange-based emails? Couple that with the way users tend to just reply to emails, and the size of the typical Exchange/Outlook email is becoming huge. If I check a typical folder I can easily spot 20 Meg messages containing the one-word response "Thanks".
Just because the typical cheap SATA disk is now 500GB does not mean that you can grow your storage indefinitely. The cost of storage is still high, because you now have to make the SANs that much faster in order to search the huge amounts of data, which increases the processing requirements and makes the whole thing slower.
hard disk size
"And I shall also point out that the smallest hard disk you can buy now is over 500 Gb (!)"
It's ironic that you talk about not basing Exchange on desktop PCs in one breath, then talk about low-end / desktop storage as if it's the only option out there in the next. (And you'd still be wrong, I think.)
500GB is actually quite a large size when you're talking about high-performance, high-reliability enterprise disk storage based on interfaces other than SATA.