[zfs-discuss] The latest from Backblaze: 2016 Hard Drive Review

Edward Ned Harvey (zfsonlinux) zfsonlinux at nedharvey.com
Thu May 19 17:06:58 EDT 2016

I have always found their data not to be useful, although I'm glad they're trying. For example, you might be tempted to just look at Annual Failure Rate %, and buy whichever drive models have the lowest failure rate in the latest year. But you would be misled.

As an example, look at HGST HMS5C4040BLE640. The data is cumulative from 2013 until the period shown in the chart, so in 2014 you would have seen these drives with a whopping 20% annualized failure rate, in a quantity of 494 drives. That means nearly 100 drives failed in one year, out of 494. We're talking unbelievably horrible results. Did that stop them from buying more? No. They bought another 2,600 drives of that model that year. So for 2015 the chart says 3,100 drives deployed with a 0.48% failure rate. Now I'm not a rocket scientist, but if the data is cumulative, and 100 drives failed out of 3,100, that's more than 3% of the drives, so the 0.48% doesn't seem to make sense.

Further, we now know that 2,600 out of 3,100 of those drives are less than a year old, so we *expect* a low rate of failure. Some of the other drive models (the Seagate 1.5TB models) started with high drive counts in 2014 that slowly shrank over the next couple of years. That means these are old drives that have already been in service for a longer time, and they're not buying any more of that model, so we expect to see a higher rate of failure. And we do.
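To make the arithmetic above easy to check, here is a minimal Python sketch. It uses only the figures quoted in this post, and it assumes (as I did above) that the failure rate is simply failures divided by drive count, which may not be how the chart is actually computed. The variable names are mine.

```python
# Figures quoted above for HGST HMS5C4040BLE640.
drives_2014 = 494
rate_2014 = 0.20      # 20% annualized failure rate
drives_2015 = 3100
rate_2015 = 0.0048    # 0.48% annualized failure rate

# Assumption: rate = failures / drive count (the simple reading used above).
failures_2014 = drives_2014 * rate_2014

# If the data is cumulative, the 2014 failures alone set a floor
# on the failure fraction of the 2015 population.
cumulative_floor = failures_2014 / drives_2015

# Failures implied by the published 2015 rate under the same reading.
failures_implied_2015 = drives_2015 * rate_2015

print(f"2014 failures:             {failures_2014:.1f}")
print(f"floor on 2015 fraction:    {cumulative_floor:.2%}")
print(f"failures implied for 2015: {failures_implied_2015:.1f}")
```

Under that reading the 2015 rate implies fewer total failures than 2014 alone already produced, which is the inconsistency I'm pointing at.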

Now maybe I'm misinterpreting the data. Maybe "drive count" doesn't indicate the number of drives deployed, but the number of drives failed. If we look at the same drives I just mentioned under that assumption, we reach dramatically different results: the HGST drives in 2014 had 494 failures, which was 20% of the number deployed. But in 2015 there were 3,100 failures, which was 0.48%, which means roughly 645,000 drives were deployed that year. If that's the case, it hardly seems fair to compare the failure rate of more than half a million brand new drives against a few thousand Seagate drives that are several years old.
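The same figures can be run through this alternative reading. This is only a sketch: it assumes "drive count" means failures and that the rate is failures divided by drives deployed, and the names are made up for illustration.

```python
# Alternative reading: "drive count" is the number of failed drives.
failed_2014 = 494
rate_2014 = 0.20
failed_2015 = 3100
rate_2015 = 0.0048

# Assumption: rate = failures / deployed, so deployed = failures / rate.
deployed_2014 = failed_2014 / rate_2014
deployed_2015 = failed_2015 / rate_2015

print(f"implied deployed in 2014: {deployed_2014:,.0f}")
print(f"implied deployed in 2015: {deployed_2015:,.0f}")
```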

If they want to give us meaningful data, they should give us Survival Rate graphs, or General Predicted Failure Rate graphs, for each individual model of drive. Ironically, when you search for these types of graphs, the best examples actually come from Backblaze, but not for individual drive models (at least, not that I can find).
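For what I mean by a survival rate graph, here is a rough sketch of a Kaplan-Meier style estimate, which is one common way to build such a curve and not necessarily what Backblaze uses. The function name and the sample numbers are hypothetical, not real drive data. Each drive contributes its days in service and whether it failed or was still running when observation ended.

```python
def kaplan_meier(days_in_service, failed):
    """Return [(day, fraction surviving)] at each day with a failure."""
    events = sorted(zip(days_in_service, failed))
    at_risk = len(events)   # drives still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        day = events[i][0]
        failures = 0
        removed = 0
        # Group all drives that leave observation on the same day.
        while i < len(events) and events[i][0] == day:
            failures += int(events[i][1])
            removed += 1
            i += 1
        if failures:
            survival *= 1 - failures / at_risk
            curve.append((day, survival))
        at_risk -= removed   # failed and still-running drives both drop out
    return curve


# Hypothetical sample for one drive model: five drives, two failures.
days = [100, 200, 200, 300, 400]
did_fail = [True, False, True, False, False]
curve = kaplan_meier(days, did_fail)
for day, surviving in curve:
    print(f"day {day}: {surviving:.0%} surviving")
```

Run per drive model, a curve like this accounts for how long each drive has been in service, so a batch of brand new drives can't make a model look better than it is.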
