The Register® — Biting the hand that feeds IT


Raising the roof on the shingled write problem

Read and write track width asymmetry


Shingled writing has a serious problem: rewriting data in place takes much longer than on current disks, probably ruling out enterprise use unless complex, flash-memory-style firmware techniques are used.

The problem stems from the layout of heavily overlapped tracks in shingled write recording (SWR) and the direction of writes. Together they mean that a random, in-place replacement of data on a single track cannot be done without rewriting the data on adjacent (overlapped) tracks as well, and this takes time, a lot of time in disk access terms.

Perpendicular magnetic recording (PMR) is heading towards 1Tbit/in2, the looming superparamagnetic limit and a world of randomly changing magnetic states. The two main successor technologies, BPM (Bit-Patterned Media) and HAMR (Heat-Assisted Magnetic Recording), plus the third and theoretical alternative, MAMR (Microwave-Assisted Magnetic Recording), all pose intensive, complex and very costly development and production challenges. Thus the HDD industry is suffering a collective bout of pre-production financial indigestion and is searching for a common way forward via the Storage Technology Alliance (STA).

SWR has been proposed as an interim way forward - one that could be applied to current PMR media and extend its life towards 2-5Tbit/in2. Use of 2D readback and signal-processing could extend it again, such that the combination of this and SWR in Two Dimensional Magnetic Recording (TDMR) could perhaps attain 10Tbit/in2.

(This information comes from IEEE Transactions on Magnetics, Vol. 45, No. 10, October 2009, in an article entitled "Future Options on HDD Storage", which is only available behind an IEEE paywall. In our view such papers should be openly available and free of charge. It is a disgrace that they are not.)

Overlaid tracks

In SWR a hard disk drive gets more tracks because each track, except the innermost ones, is overlaid by the next track. The picture shows this effect, with each resulting track having only about a third or less of its width along one edge not overlapped by the adjacent track. The result is much narrower tracks.

The authors of the paper say: "Readback (with a suitably narrow reader) is unconstrained and random-access reads work just as in a conventional drive … Reading a sector from a shingled written surface may have conventional performance."

Shingled writing

Shingled disk writing diagram.

The next sentence reveals the problem: "A huge disadvantage is that 'update-in-place' is no longer possible."

What do the authors mean? They write: "Tracks are written sequentially in one direction cross-track. Therefore, a single track or portion of a track cannot be altered without first recovering many tracks of subsequently written data."

It's obvious, isn't it? If you replace and re-write the data already on a track A, which has 75 per cent or more of its width overlaid by two or three other tracks (B and C), then a write to part of track A obliterates the data on the overlapping parts of tracks B and C as well, because you write in the full track width while reading in the narrow track width. This asymmetry is the issue.

What the shingled HDD has to do is first copy the data that is going to be overwritten into a buffer, taking, let's say, three disk track accesses; write the new data over the old data in the buffer; and then copy the buffer data back onto all the tracks involved on the disk surface, taking another three disk track accesses. So we have six disk track accesses and three buffer data operations, taking much longer than the single track access needed in today's non-shingled PMR drives.
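The read-modify-write sequence described above can be sketched as a toy model. This is only an illustration, not how any drive firmware works: the function name `update_track`, the list-of-strings representation of a band and the default `span` of three tracks (taken from the "let's say three" figure in the text) are all assumptions.

```python
def update_track(band, index, new_data, span=3):
    """Rewrite band[index] via a buffer and return the disk track accesses used.

    `band` is a list with one entry per track, in the order the tracks
    were shingled. `span` is the assumed number of tracks that must be
    read out and written back for one in-place update.
    """
    stop = min(index + span, len(band))

    # Step 1: copy the tracks that would be clobbered into a buffer.
    buffer = band[index:stop]
    reads = len(buffer)

    # Step 2: overwrite the old data in the buffer with the new data.
    buffer[0] = new_data

    # Step 3: write the buffer back over all the tracks involved.
    band[index:stop] = buffer
    writes = len(buffer)

    return reads + writes


band = ["A", "B", "C", "D"]
accesses = update_track(band, 0, "A2")
print(accesses)  # 6 track accesses, versus 1 on a non-shingled drive
print(band)      # ['A2', 'B', 'C', 'D']
```

Under these assumptions the update costs three reads plus three writes, which is where the six disk track accesses in the text come from.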

Put another way: will enterprises buy shingled HDDs if rewriting data takes seven or eight times longer than with today's drives? Of course they won't. SWR drives won't find a place in applications where fast write I/O is a requirement. Some commentators think the only place they will find an application is inside the slow write world of personal video recorders (PVRs).

What can be done to fix this? The authors of the paper suggest adding in two-dimensional readback and signal processing technology, and using firmware techniques such as those needed to overcome the block erase-write cycle delays in solid state drives.

TDMR

The authors state: "For 2-D readback, powerful signal processing can be applied that takes advantage of knowing the waveforms on the adjacent tracks. A highly accurate track following system will also be applied to deliver the required spatial resolution."

They add: "By applying firmware similar to [the SSD firmware above] TDMR should be able to offer a standard HDD interface supporting small random writes and deliver write performance comparable to today's HDD by dynamically remapping written sectors to append to the most convenient band of shingles. If multiple adjacent tracks must be read to recover a sector, very effective buffer caching will be needed to mask the large latency of such reads."
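The remapping idea in that quote, appending rewritten sectors to a band instead of updating them in place, might look roughly like the sketch below. The paper's actual scheme is not described here, so everything in the code is hypothetical: the class name `ShingledRemapper`, the "first band with free space" policy standing in for "the most convenient band", and the absence of any garbage collection.

```python
class ShingledRemapper:
    """Toy append-only remapping layer over a set of shingle bands."""

    def __init__(self, band_count, sectors_per_band):
        self.bands = [[] for _ in range(band_count)]
        self.sectors_per_band = sectors_per_band
        self.mapping = {}   # logical sector -> (band index, offset in band)
        self.stale = set()  # physical locations holding superseded data

    def _pick_band(self):
        # Assumed policy: use the first band that still has free space.
        for i, band in enumerate(self.bands):
            if len(band) < self.sectors_per_band:
                return i
        raise RuntimeError("no free space; stale sectors would need reclaiming")

    def write(self, logical, data):
        band_index = self._pick_band()
        band = self.bands[band_index]
        if logical in self.mapping:
            # The old copy is never overwritten, only marked stale.
            self.stale.add(self.mapping[logical])
        band.append(data)  # sequential append, so no neighbouring track is disturbed
        self.mapping[logical] = (band_index, len(band) - 1)

    def read(self, logical):
        band_index, offset = self.mapping[logical]
        return self.bands[band_index][offset]


drive = ShingledRemapper(band_count=2, sectors_per_band=2)
drive.write(7, "old")
drive.write(7, "new")   # a rewrite becomes an append plus a map update
print(drive.read(7))    # new
```

The trade-off is the same one SSD firmware faces: rewrites become cheap appends, but stale sectors pile up and have to be reclaimed at some point.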

In more detail, TDMR needs readback signals to be available from several adjacent tracks so that a 2-dimensional waveform or image can be constructed. This waveform has powerful 2-D coding and signal-processing applied to it, which takes account of, if not advantage of, inter-track interference. Gaining the information from adjacent tracks means either a multi-track reader or progressive scans with a single head, and the latter needs extra time for the three to five additional disk rotations involved. A lot of research is needed to see if this approach is viable.
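To put a rough number on those extra rotations, here is a back-of-the-envelope calculation. The 7,200rpm spindle speed is an assumption chosen purely for illustration; the paper does not specify one, and the function name `extra_latency_ms` is ours.

```python
def extra_latency_ms(extra_rotations, rpm):
    """Time in milliseconds spent on additional full disk rotations."""
    ms_per_rotation = 60_000 / rpm  # one minute is 60,000 ms
    return extra_rotations * ms_per_rotation


ASSUMED_RPM = 7200  # illustrative spindle speed, not taken from the paper

for rotations in (3, 5):
    print(rotations, round(extra_latency_ms(rotations, ASSUMED_RPM), 1))
```

At that assumed speed a rotation takes about 8.3ms, so three to five extra rotations add roughly 25ms to 42ms to a read.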

If this approach is used then additional system memory will be needed, meaning extra cost, and more cost still will come from using a full 2-D readback process.

It's apparent that although shingling is less expensive to develop than BPM or HAMR, it is not an easy option. But shingling and full TDMR can be applied to both BPM and HAMR media as well as the current media, so once suppliers know how to do it they can continue using the technology. ®


It should not be such a problem

The difference between a disk and flash is that a disk can contain non-shingled and shingled areas which can be read and written by the same head.

If the physical writes are optimised to the current free shingled location, and the journal is written to a non-shingled area, a shingled disk can fly.

This concept can be taken even further. All writes can go to a non-shingled buffer and be moved to a shingled area in the background once the disk is idle. Allocating, let's say, an 8GB non-shingled write buffer on an 8TB drive is no big deal, and there are very few applications which will produce more than 8GB of data at a speed which is capable of saturating a drive.

This may even be done dynamically as needed and when needed.


I think you missed the boat

I've already filled up a 2TB drive and two 1TB drives at home, and at the office, we have a brand spanking new SAN with 10 1TB drives (and 20 300GB) that are rapidly being filled up from all of the clunky old SANs.

A 2TB drive takes about 30-40 hours to format or copy.


shingled slow-write very useful..

>will enterprises buy shingled HDDs if rewriting data takes seven or eight times longer than with today's drives? Of course they won't. SWR drives won't find a place in applications where fast write I/O is a requirement.

Bad generalisation. It is the re-write that is slow, not the write. Back-up and archive disks can benefit from this technology right now. I reckon we'll see more and more specialised disks: expensive and fast SSD for OLTP (fast access time), cheap and slower SSD for 24/7 low-load servers (no wear, no heat), shingled for one-off writes (huge capacity, fast first write), etc.

Marketers will love this.

