Solid state revolution: in-depth on how SSDs really work

malor · Jun 4, 2012

I've only gotten through page 1, so far, but I think the author probably uses a Mac.

Why? Because SSDs are terribly important on a Mac for good system performance. They matter so, so much, more than any other single component. If the author uses a Mac, then his description of the night-and-day difference is precisely accurate. It really is that big a deal.

Windows, however, especially 64-bit Windows 7 with a lot of RAM, does an excellent job of insulating you from a slow hard disk. It has very smart caching, and it's able to make a big-RAM machine feel much faster than it actually is, and way, WAY faster than OS X on the same hardware.

So, an SSD still matters on Win7, but it has much less of an impact, especially if you have 16 gigs (which is super cheap these days, under $100). The biggest thing an SSD helps with is boot speed, while the BIOS is still running the disk. Once the kernel takes over, and starts using all that copious RAM as a drive cache, the difference between SSD and rotating storage is reduced very sharply.

I presume that OS X is so absolutely dominated by drive seek time as a component of overall performance because of the ancient and creaky HFS+. With a replacement filesystem, or with performance tuning, that bottleneck could disappear. But at the moment, that is THE upgrade you want on a Mac, if you don't already have it. A Mac Mini with an SSD will feel faster, in routine use, than a monster Mac Pro on a rotating drive.

malor · Jun 6, 2012

koolraap said:
But you youngsters. Why I remember (where did everybody go?) when we got the lowercase mod chip installed on the TRS-80 clone. That's right, we had both kinds of case then, upper, and lower.

Yeah, that was very similar to my family's first computer upgrade, from the TI 99/4 to the 99/4A. The 4A added that same exciting feature, lower case letters.

The first computer I actually bought myself was an Amiga 500, with a meg of memory, two floppy drives, and a 2400 baud modem. That machine definitely had lower case.

malor · Jun 7, 2012

That just seems specious. A normal hard drive has a sector size of 512 bytes, so unless we are going to say that normal hard drives have a write amplification factor of up to 4096 we can't say that SSDs have a write amplification factor of 16.8 million. In fact, modern hard drives have 4096 byte sectors, resulting in a "hard drive write amplification factor" of up to 32,768.

They absolutely do. This is absolutely correct. And on a magnetic drive, it doesn't matter, because writing bits doesn't wear out the medium. If they have an amplification factor of 0.5 or five million, it makes no real difference to the overall lifespan of the drive.

With SSDs, every time a cell is written, it takes damage. Eventually, it stops working completely. So, there, write amplification matters a very great deal, and those numbers are accurate.
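
To make the arithmetic in that exchange concrete, here's a rough sketch. It assumes the worst case, where changing a single bit forces the device to rewrite its whole smallest writable unit. The 2 MiB erase block is my assumption: it's the size that reproduces the 16.8 million figure (2 MiB is 16,777,216 bits), and real drives vary. The function name is made up for illustration.

```python
# Worst-case write amplification for a single-bit write: the device has to
# rewrite its whole smallest unit, so the factor is that unit's size in bits.

def worst_case_amplification(unit_bytes: int, requested_bits: int = 1) -> float:
    """Bits physically written divided by bits the caller asked to change."""
    return (unit_bytes * 8) / requested_bits

# Sector sizes come from the discussion; the 2 MiB erase block is an assumed
# size, picked because it reproduces the 16.8 million figure.
units = {
    "512-byte HDD sector": 512,
    "4096-byte HDD sector": 4096,
    "2 MiB SSD erase block (assumed)": 2 * 1024 * 1024,
}

for name, size in units.items():
    print(f"{name}: {worst_case_amplification(size):,.0f}x")
```

The numbers have the same shape for both technologies. The difference is that on flash every one of those extra bits costs wear, and on a platter it doesn't.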

malor · Jun 7, 2012

brucedawson2 said:
As a programmer I never think "I'm gonna write a zero to that bit on the hard drive". I think in terms of much larger writes because I know that writing a single bit is meaningless on any kind of mass storage device.

Usually, no, but single-byte writes are very common. Unix does this all the time, for instance. The storage layer is often, but not always, smart enough to batch these writes up into larger sectors, but even then, there will usually be a large write amplification factor if you're using an SSD. It won't be 16 million or whatever, but it'll probably be 8 to 10.

More concretely, if a hard drive has a write bandwidth of 100 MB/s I do not expect to be able to do 800 million single-bit writes per second. Or, if my memory has a write bandwidth of 10 GB/s I do not expect to be able to do 80 billion single-bit writes per second.

That's just good old IOPS, and it seems rather irrelevant here. Everyone who's reasonably expert knows that each write takes a certain amount of time, and that large writes can move a lot more bytes in a given time period. This is why SSDs are often so much faster, but it's not at all relevant to the disk-wearout problem.

Mathematically the 16.8 million write amplification factor is correct, but I maintain that it is misleading. It is more sensible and meaningful to describe how much larger the write-amplification factor is for SSDs compared to other devices, in my opinion.

No, it just isn't, because other media don't wear out from being written to. Write amplification matters enormously on an SSD, because it's a huge factor in determining the useful life of the drive. Magnetic technologies don't have this problem. They're slower to start reading and writing, because of seek and rotational latencies, but the media does not inherently wear out from use. As long as the mechanism above it keeps working, the platter will keep recording data nearly indefinitely.

Therefore, for disks, it would be more meaningful to talk about the write-amplification factor for 512-byte or 4096-byte writes.

Well, it depends on what write patterns you're talking about. But write amplification is only important for SSDs, because high write amplification levels break them. This is not particularly true of hard drives... it might be, a little tiny bit, because each seek causes the tiniest bit of wear on the actuators, but what matters on a hard drive is the total number of head movements, not how many bytes are read from or written to the disk.

So, when comparing the two media types, write amplification is a useless figure to talk about in reference to hard disks. It doesn't matter. Not even a little bit. The technologies are different. It's like you're yelling about how paddle wear on a rice cooker is terribly important, when we're talking about bread machines.

The fact that single-bit writes to an SSD are inefficient is, by and large, unrelated to SSDs, and therefore a bad example.

WRONG. Wrong, wrong, wrong, wrong. Single-bit writes to an SSD cause wildly disproportionate wear. That's why write amplification matters on that technology, and doesn't on hard drives. Any write at all damages an SSD, be it one bit or 50 megabits. So measuring actual, in-use write amplification is very important. It doesn't matter on hard drives, and can't be compared in any meaningful way.

Accurate and useful/meaningful are not the same thing.

Well, I'm sorry to be so blunt, but your objections are neither. Because only flash storage actively wears out from being written to, write amplification is a problem that exists only with that technology. It's useful/meaningful only in that context. Even if a hard drive had amplification factors ten times worse than SSD, it wouldn't matter, because writing the extra bits doesn't cause any wear.

Overall: it's head movement that matters on a hard drive. It's number of erase cycles that matters on an SSD. Write amplification is barely related to head movement at all.

malor · Jun 7, 2012

If you do single-bit writes to a hard drive that are not coalesced then your bandwidth will be tens of thousands of times lower. If you do single-bit writes to an SSD that are not coalesced then your bandwidth will be tens of thousands of times lower, and you will wear out your SSD.

Right, you're getting the 'not wearing out' part. One is just a performance issue. The other is actually destroying the media. They're not at all equivalent.
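
As a rough sketch of where "tens of thousands of times lower" comes from: assume every uncoalesced single-bit write costs one full 4096-byte sector of raw bandwidth, and ignore seek time and per-command overhead, which would only make it worse. The 100 MB/s figure is the one quoted earlier in the thread; the function name is hypothetical.

```python
# Useful bandwidth when each tiny write still costs a whole sector of raw
# bandwidth. Simplified model: no seek time, no per-command overhead.

def useful_bits_per_sec(raw_bytes_per_sec: float, unit_bytes: int,
                        useful_bits: int = 1) -> float:
    """Useful bits delivered per second if each write of `useful_bits`
    consumes one whole unit of raw write bandwidth."""
    units_per_sec = raw_bytes_per_sec / unit_bytes
    return units_per_sec * useful_bits

RAW = 100e6      # 100 MB/s, the figure used earlier in the thread
SECTOR = 4096    # bytes per sector, as discussed above

coalesced = RAW * 8                           # every bit written is useful
uncoalesced = useful_bits_per_sec(RAW, SECTOR)
slowdown = coalesced / uncoalesced

print(f"coalesced:   {coalesced:,.0f} useful bits/s")
print(f"uncoalesced: {uncoalesced:,.0f} useful bits/s")
print(f"slowdown:    {slowdown:,.0f}x")
```

On a hard drive that ratio is only lost performance. On an SSD the same ratio is also extra wear.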

malor · Jun 24, 2012

Not really. What actually matters is how many pages of data you write to the drive, and how many times each cell has to be erased. It doesn't matter how you partition it; it matters how much data you write and rewrite onto the drive. Partitioned at 90 gigs, it will still take almost exactly the same total number of block erases before failing.

You can think about it like a massively overprovisioned 30 gig drive, but you can think that way independently of what partitioning scheme you use.

If you think, for some reason, that you will write less total data to the drive if you partition it smaller, then do so. But loading the drive once is no big deal, it's the constant rewriting that eats the cells. And you can rewrite data just as easily and just as quickly on a 30 gig partition as on a 90.
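
A back-of-the-envelope sketch of that point, under some loud assumptions: wear leveling spreads erases evenly across all the physical flash, write amplification is the same for either layout, and every number below is hypothetical, not taken from any real drive. Note that partition size never appears in the formula.

```python
# Rough endurance model: the erase budget belongs to the physical flash,
# not to the partition table. All figures below are hypothetical.

def total_host_writes_gb(physical_gb: float, pe_cycles: int,
                         write_amplification: float) -> float:
    """Approximate host data (GB) that can be written before the cells'
    program/erase budget is used up, assuming perfect wear leveling."""
    return physical_gb * pe_cycles / write_amplification

PHYSICAL_GB = 120       # hypothetical physical capacity
PE_CYCLES = 3000        # hypothetical erase cycles per cell
WRITE_AMP = 2.0         # hypothetical, assumed equal for both layouts
DAILY_WRITES_GB = 20    # hypothetical workload

budget = total_host_writes_gb(PHYSICAL_GB, PE_CYCLES, WRITE_AMP)
for partition_gb in (30, 90):
    # The partition size changes nothing here: same workload, same budget.
    days = budget / DAILY_WRITES_GB
    print(f"{partition_gb} GB partition: ~{budget:,.0f} GB of writes, "
          f"~{days:,.0f} days")
```

The only inputs that change the answer are how much you write and how badly those writes are amplified.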

malor · Jun 30, 2012

('independent' makes much more logical sense, regardless of the history of the thing - RAID0 is an argument against 'redundant' more than it is against 'independent')

The original meaning of the term is absolutely 'Redundant Array of Inexpensive Disks' -- it was invented in an era when permanent digital storage was extremely expensive. The whole point of the technology was using a bunch of small and cheap drives, instead of the big, expensive ones.

RAID changed the game so much that really large, expensive drives have almost completely gone away. The total price delta between the cheapest per-byte volumes and the physically largest volumes isn't tens of thousands of dollars, anymore, it's typically hundreds. And this is because of RAID, as it's really hard to sell a $10,000 drive, when ten $250 drives will be both larger and faster. FusionIO is sort of in that space, but they're selling a completely different technology that can be abstracted similarly to a hard drive.

In a sense, you're wanting to redefine what RAID means, because the original definition has been so wildly successful, has so dominated the market, that it seems kind of silly and pointless to say 'inexpensive'. RAID works so well that it's hard to imagine a world where RAID doesn't exist. Drives are cheap, but they're cheap because of RAID, so leaving that as part of the definition is important. If, somehow, RAID as a technology disappeared tomorrow, you'd see ridiculously huge single drives come back into the market, priced, once again, at nosebleed levels. The Inexpensive part of the acronym is directly responsible for why drives are inexpensive.

RAID-0 is a weak argument for redefining 'redundant', as it's kind of a butchered variation. There are many many flavors of RAID, and all of them but RAID-0 have at least some redundancy. In fact, I would argue that RAID-0 itself is a misnomer -- it's just striping two disks together. RAID-0 not being redundant doesn't make the definition wrong, it makes its inclusion in the term wrong.

malor · Jul 5, 2012

error404 said:
The fact that the disks are inexpensive is completely irrelevant to the technology, which is why I prefer the 'independent' variant.

Only RAID1 is (somewhat) independent; all the other types are clusters of disks that all rely on one another. The disks are meaningless on their own.

What do you call a RAID of expensive disks?

A RAID. At higher capacities, it will always be cheaper to gang together N devices of X capacity than to buy a single device of N*X capacity and speed. RAID devices are almost always far cheaper than an equivalent single-drive alternative.

Down in the super low end of the market, the mechanisms cost so much that the price stops moving downward. You just can't get it below a certain point, so if you were to buy a bunch of the absolute cheapest disks you could find, you might be able to do better in terms of bytes/dollar by buying a single midrange drive instead. So RAID doesn't make sense there in terms of storage size for money spent. But even in this slightly odd case, the array is going to be a lot faster than the single mechanism. You'd have to spend a LOT more money to match the transfer rate you'd get off an array of the slowest and cheapest IDE drives you could find. Even in that segment, RAID remains inexpensive in terms of price/performance, even though price/storage kind of breaks down due to the physical minimum price for hard drives.

And, in the much more normal case, that of ganging up a bunch of midrange or high-end drives, the resulting RAID volume is vastly cheaper and faster than any single unit. This is such a powerful effect that they don't even make super high-end magnetic drives anymore, because nobody would buy them.

RAID has been so successful, in other words, that there are really only Inexpensive disks left. You just don't see $50,000 drives anymore. If you buy a $50K storage unit, it will have a bunch of inexpensive disks in it -- inexpensive, again, compared to the alternative. Even if they cost $1K each, that's still a tiny fraction of what it would cost to buy a single drive that was as fast and large as the whole array.

I will grant that the initial impetus for the development and research of RAID was the increasing cost of high-capacity disks, however I don't agree that it's why the technology has stuck around. Today's RAID is typically used with expensive, low-capacity and high performance disks to achieve performance and redundancy.

Again, unless you're at the absolute bottom of the market, ganging together N drives, even pricey ones, is a bargain compared to what a single unit of similar performance and capacity would cost you. RAID lets you take advantage of the miracle of mass production, using the same drives that smaller outfits are using, instead of needing bespoke storage solutions.

Some users do use the cheap, PC-oriented disks spoken of in the original literature, but I think this is the exception, not the rule, and certainly hasn't been the primary selling point in most settings for at least a decade.

Dude, RAID is everywhere with cheap IDE-class disks. Again, that's the whole point to having it -- you can use inexpensive, relatively unreliable storage, and avoid downtime from your penury. These aren't big-dollar installations, that being the entire point, and because of that, they don't command very much attention, but they are freaking ubiquitous, especially on Linux.

Yes, they do also use RAID with pricey drives, but they do this because it brings high performance and high capacity into financial reach... i.e., it makes ridiculous amounts of storage at ridiculous speeds very Inexpensive, compared to any other alternative. Being able to buy 5 300-gig SSDs is vastly cheaper than a custom 1500-gig SSD would be, because you can buy the same hardware that everyone else is using. You just gang them up together to make the storage size you need; other people or companies buy fewer (or even singles) and make smaller volumes. But you're all buying similar drives, so the drive manufacturers can scale. It's the miracle of mass production of Inexpensive Disks.

If RAID0's inclusion in the term RAID is wrong (which I'd agree with), then RAID0 can't be an argument against 'independent' as it was used in the article.

True, but of all the forms of RAID, only RAID1 could actually be considered independent. You cannot take a single disk out of any other RAID type and get any kind of useful data off it... pull a singleton drive, and it's garbage. For the great majority of installations, you will need a large fraction of the total number of disks to extract usable data -- all-but-one for RAID5, all-but-two for RAID 6, and at least half for RAID 10.
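
That rule of thumb can be sketched as a small lookup. This is a simplification, not how any particular controller behaves: it assumes RAID 10 is built from two-way mirror pairs and reports the best case, where one disk survives from every pair. The function name is made up for illustration.

```python
# Fewest member disks needed to get usable data off an n-disk array,
# following the rule of thumb in the post. Simplified: RAID 10 is assumed
# to use two-way mirrors, and the figure given is the best case.

def min_disks_to_read(level: str, n: int) -> int:
    """Minimum surviving disks needed to recover data from an n-disk array."""
    if level == "RAID0":
        return n            # plain striping: every disk is required
    if level == "RAID1":
        return 1            # each disk is a full copy
    if level == "RAID5":
        return n - 1        # tolerates one failed disk
    if level == "RAID6":
        return n - 2        # tolerates two failed disks
    if level == "RAID10":
        return n // 2       # one survivor from every mirror pair
    raise ValueError(f"unknown RAID level: {level}")

for level, n in [("RAID0", 2), ("RAID1", 2), ("RAID5", 5),
                 ("RAID6", 6), ("RAID10", 8)]:
    print(f"{level} with {n} disks: need at least "
          f"{min_disks_to_read(level, n)}")
```

Only RAID 1 comes back with 1, which is the sense in which it's the only level where a single disk is useful by itself.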

Anyway, in the context of a 'Redundant Array of Independent Disks' I don't see the problem anyway. The disks are independent; they fail independently and have independent interfaces and storage, and they are what we have an array of.

They aren't independent, they're interdependent. You can only recover when Disk 1 fails if enough other disks survive. And arrays usually don't have truly independent interfaces, unless it's a very expensive setup -- the vast majority of RAIDs run through one hardware controller or motherboard. And, as discussed, for everything but RAID1, they're not independent storage either, because the disks are garbage data by themselves.

Anyway, I think either term is acceptable, just didn't think the forcefulness was appropriate given that both are generally accepted and 'inexpensive' in the context of modern RAID doesn't make sense.

In my view, it's only because RAID has had such a gigantic impact on the upper echelons of the drive market, making it so Inexpensive, that you can really seriously think that Independent is a better word choice. And it's had an equally large impact in the cheaper seats, too, if only because the technologies trickle down from the enterprise to the common folks.

The initial paper seems to have been the 1988 paper by Patterson, Gibson and Katz, using the inexpensive term. Independent appeared in the literature only a few years later in the early 1990s.

Well, they were wrong too. I didn't realize that the misnomer went back that far, though. I thought you were just backing up a misunderstanding, rather than echoing things people had actually been saying for real. So I owe you an apology for that much... I was a little more aggressive than I should have been, because I thought the idea originated with you. I'm sorry for being too harsh. I spoke more strongly than I should have. I seem to have come down with a case of Someone Is Wrong On The Internet, which I suppose is worthy of a scolding.

That said, even back in the beginning, the disks were not independent in the great, great majority of installations. Only RAID1 qualifies in that sense, and that's far too narrow a slice of the overall market to be the definition of the term.

Inexpensive always works, anywhere from the bleeding edge down to the cheap seats, because it's usually cheap compared to any possible alternative storage solution. Midrange or higher, I think it's probably ALWAYS cheaper, but the absolute bottom of the market kind of screws that up a little.

Independent, however, only applies to a very limited subset of the solution space.

malor · Jul 5, 2012

I mean, from another angle, realize that the world we came from was where IBM did your storage. I dunno if you saw that picture of an early 5MB drive getting loaded on an airplane, but it was the size of a washing machine, and probably a lot heavier.

In a world without RAID, maybe where it was locked up forever under patent or something, if you wanted big storage, you'd have to call a big company like IBM, and they'd quote you some obscene figure, and then show up a few days later with the forklifts. You'd end up with a huge wall of hardware that would hold, say, 2TB, and set you back a million bucks or so.

In a world with RAID, basically all disks are inexpensive. Even the scariest, nosebleediest prices you can think of for drives are a tiny fraction of what custom solutions would cost, and without RAID or something like it, custom solutions would be the only option. The very fact that you think about large drives as an assembly of units, costing a few hundred to maybe a thousand dollars each, as opposed to million-dollar monolithic units, is because of RAID.

In the RAID world, all drives are Inexpensive, even the pricey ones.
