
Backblaze: SSDs might be as unreliable as disk drives

Lightwreather
Solved by LAwLz:
59 minutes ago, jagdtigger said:

Look, when I have a WD Green from 2011 that survived running torrents and being in a RAID, versus three HDDs from 2017 that died at something like 30k hours each in their intended use case, that's way more than just bad luck.

(I even have a 200 GB WD somewhere that still works with fewer than 10 bad sectors... and even those are ancient; they appeared when the drive was only one or two years old, I think.)

/EDIT

Oh, and did I mention that for not much more I could get WD DC HC drives instead of crappy IronWolfs? Seagate can go bust for all I care.

Seagate has lower RMA rates than Western Digital.

It was 0.93% vs 1.26% in 2017 (no more up-to-date data is available).

 

Failure rate of the 4TB WD Red - 2.95%

Failure rate of 4TB IronWolf - 2.81%

 

 

Source: https://www.hardware.fr/articles/962-6/disques-durs.html

 

These are RMA rates from a very large French retailer.

 

I don't doubt your experience, but the fact of the matter is that your experience is a very small sample, and with a bit of bad luck a small sample ends up heavily skewed compared to the generalized real-world numbers.

 

 

Edit: 

For those interested, here are the RMA statistics for HDDs and SSDs according to the French retailer, which I think are far more representative of what consumers doing consumer things can expect.

 

HDDs:

  • HGST 0.82%
  • Seagate 0.93%
  • Toshiba 1.06%
  • Western Digital 1.26%

 

SSDs:

  • Samsung 0.17%
  • Intel 0.19%
  • Crucial 0.31%
  • Sandisk 0.31%
  • Corsair 0.36%
  • Kingston 0.44%
On 10/7/2021 at 10:46 AM, 26astr00 said:

Am I the only one who thought OP meant floppy disk drives when he said “disk drives”?

Are you one of those people who thought the 3.5" floppy was a "hard disk" because of the rigid plastic cover?

 

Realistically, I think pretty much everyone has said what they needed to say on this topic. Backblaze used consumer (likely SATA) SSDs just like they used consumer 3.5" mechanical HDDs, and didn't give us the failure conditions.

Link to comment
Share on other sites

Link to post
Share on other sites

44 minutes ago, Kisai said:

didn't give us the failure conditions.

Well, their failure conditions for the SSDs were stated as follows:

Quote

In the meantime, all of the SSDs which have failed to date are reactive failures, that is: They just stopped working.

This was under the section titled "What Does Drive Failure Look Like for SSDs and HDDs?" While it's trickier to analyze, they published all of the data they used to make that assessment; you just have to sift through 30 GB of raw data.



2 hours ago, Kisai said:

Are you one of those people who thought the 3.5" floppy was a "hard disk" because of the rigid plastic cover?

 

 

Most of us just called them disks and disk drives ("floppy" was largely redundant terminology). Hard drives were solid steel monstrosities that took up half the case, and the term "disk" often wasn't used for them. So when people say "disk drive," the first thing we think of is a floppy disk drive, and occasionally a CD drive, not a hard drive. It's just how some people assimilate information and communicate (especially those of us who are visual thinkers).

Grammar and spelling are not indicative of intelligence or knowledge. Not having the same opinion does not always mean a lack of understanding.


15 hours ago, wanderingfool2 said:

This was under the section titled "What Does Drive Failure Look Like for SSDs and HDDs?" While it's trickier to analyze, they published all of the data they used to make that assessment; you just have to sift through 30 GB of raw data.

It would have been a hell of a lot more informative if they had listed how many failed by going into firmware read-only mode, how many failed outright and no longer detect at all, and how many still operate but are worn to the point that the usable NAND flash has shrunk below the size of the partition, and so have effectively failed.

 

I suspect all of them would fall into the first two conditions. When my SanDisk SSDs failed they just stopped showing up at all: BIOS, Windows, anything. When my Samsung EVO stopped working it showed up just fine, but you could no longer create a partition on it.


15 hours ago, leadeater said:

It would have been a hell of a lot more informative if they had listed how many failed by going into firmware read-only mode, how many failed outright and no longer detect at all, and how many still operate but are worn to the point that the usable NAND flash has shrunk below the size of the partition, and so have effectively failed.

 

I suspect all of them would fall into the first two conditions. When my SanDisk SSDs failed they just stopped showing up at all: BIOS, Windows, anything. When my Samsung EVO stopped working it showed up just fine, but you could no longer create a partition on it.

What would have really been handy is if they had listed the dates of the failures. The dataset is about 30 GB, since each day has SMART data for every drive that hasn't failed yet, so the only practical way to find the failed drives is to write a script to scrub through it all. They store everything as CSV files, and maybe it's just me, but Excel has become terrible at the amount of processing it takes to open a 65 MB CSV; my script literally processed 10 files in the time it took Excel just to open one.
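
For anyone curious, the script doesn't need to be anything elaborate. Here's a minimal Python sketch, assuming the published drive-stats layout (one CSV per day with date, serial_number, model and failure columns); the directory name is just a placeholder:

# Minimal sketch: scan a directory of Backblaze drive-stats daily CSVs and
# collect the rows where a drive is flagged as failed. Column names are
# assumed to match the published layout; adjust them if your copy differs.
import csv
import glob
import os

def find_failed_drives(data_dir: str):
    """Return (date, serial_number, model) for every row where failure == 1."""
    failed = []
    for path in sorted(glob.glob(os.path.join(data_dir, "*.csv"))):
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                if row.get("failure") == "1":  # 1 marks the day a drive failed
                    failed.append((row["date"], row["serial_number"], row["model"]))
    return failed

if __name__ == "__main__":
    for date, serial, model in find_failed_drives("drive_stats"):  # placeholder path
        print(date, serial, model)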

 

I mean, their data has one of the SSDs lasting less than 5 days, and with an average failure at around 80 days, my guess would be dead chips leading to outright failure (so similar to what you said). The 228-day failure is the only outlier that I could believe possibly ran into NAND degradation, although even then the SMART data didn't show the reserve sectors as being used.
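
If anyone wants to reproduce those rough lifetime numbers, here's a hedged sketch of the calculation I mean. It assumes the dataset exposes SMART attribute 9 (power-on hours) as a smart_9_raw column, as the published files do, and that the raw value is in hours (true for most models):

# Sketch: for each failure row, convert the SMART 9 raw value (power-on hours)
# into days of operation, giving a rough days-to-failure figure per drive.
import csv
import glob
import os

def days_to_failure(data_dir: str):
    lifetimes = {}
    for path in glob.glob(os.path.join(data_dir, "*.csv")):
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                if row.get("failure") == "1" and row.get("smart_9_raw"):
                    hours = float(row["smart_9_raw"])
                    lifetimes[row["serial_number"]] = hours / 24.0  # hours -> days
    return lifetimes

if __name__ == "__main__":
    lifetimes = days_to_failure("drive_stats")  # placeholder path
    if lifetimes:
        print("mean days to failure:", sum(lifetimes.values()) / len(lifetimes))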



4 hours ago, wanderingfool2 said:

although even then the SMART data didn't show the reserve sectors as being used.

I've seen some SSDs that won't do that while they're powered on, as in they won't update those particular SMART values; then you fully power them off and back on again and they start reporting all kinds of issues. Weird; I think it's just badly written firmware or something. SD cards too: happily "working" fine, then you power off the system and the SD card never comes back. That's probably why VMware just announced they will no longer support USB and SD card installs.
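
One way to catch that behaviour is to snapshot the SMART table before and after a power cycle and diff the two. A rough Python sketch, assuming smartmontools is installed, the device path is right, and you're looking at a SATA drive (attribute names vary by vendor):

# Rough sketch: dump a drive's SMART attribute table via smartctl so snapshots
# taken before and after a power cycle can be compared. Requires smartmontools
# (7.0+ for JSON output) and enough privileges to query the device.
import json
import subprocess

def smart_snapshot(device: str) -> dict:
    """Return {attribute_name: raw_value} from smartctl's JSON output."""
    out = subprocess.run(
        ["smartctl", "-A", "--json", device],
        capture_output=True, text=True, check=False,  # smartctl exit codes are bit flags
    )
    data = json.loads(out.stdout)
    table = data.get("ata_smart_attributes", {}).get("table", [])
    return {attr["name"]: attr["raw"]["value"] for attr in table}

if __name__ == "__main__":
    before = smart_snapshot("/dev/sda")  # placeholder device path
    # ...power cycle the drive, take another snapshot, and compare the two...
    for name, raw in sorted(before.items()):
        print(f"{name}: {raw}")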


3 hours ago, leadeater said:

I've seen some SSDs that won't do that while they're powered on, as in they won't update those particular SMART values; then you fully power them off and back on again and they start reporting all kinds of issues. Weird; I think it's just badly written firmware or something. SD cards too: happily "working" fine, then you power off the system and the SD card never comes back. That's probably why VMware just announced they will no longer support USB and SD card installs.

The ADATA drive I posted info from earlier in the thread literally did this. None of the SMART values were useful.

 

USB installs are unreliable, and you should never run software or virtual machines on them. Likewise most SD-card readers are attached via USB, not PCIe, except for some recent Dell laptops.


I don't actually think someone like Backblaze existing and reporting their experiences is bad in any way, shape, or form. The "problem" is that while Backblaze's model is explicitly about just trying things and seeing what works for them or doesn't (and let's be 100% clear, having someone in the industry just trying this stuff is good for everyone), the far more controlled and consistent environments simply don't publish any of their data at all. So it looks as if Backblaze is the only game in town, and their particular business model (again, asking the question of whether you can take whatever random excess-deal hardware you find and make it work long enough to profit on it) makes for sloppy comparisons when it stands alone.

 

Obviously, recent events have brutally proven that the specs straight from manufacturers are commonly misleading at best and are used to mask hugely significant changes, particularly for consumer drives (both SSDs and HDDs), so a third party looking into this stuff is sorely needed. And yes, I would love it if Backblaze were a bit more self-aware about its messaging, but it is still far better than nothing and, to be frank, better on the whole than anything else easily accessible out there, even with the limited set of models and drive counts being investigated. I look forward to the next few reports for comparative analysis of how the failure numbers progress over time (I don't treat the specific numbers themselves as very relevant).

 

 

EDIT: When a site as reputable as AnandTech keeps publishing the unsubstantiated claim that Seagate Exos drives run too loud and draw too much power to recommend as consumer drives, despite their being cheaper, more reliable, and faster than those consumer drives, and lower in power and noise than the best consumer drives from only a few years ago, I think it points to an overall lack of competency when it comes to storage in the industry.



9 hours ago, Kisai said:

USB installs are unreliable, and you should never run software or virtual machines on them. Likewise most SD-card readers are attached via USB, not PCIe, except for some recent Dell laptops.

There are server-rated, high-endurance USB drives and SD cards; HPE, Dell, and Lenovo all sell them, and until VMware dropped support, SD cards were the de facto standard and the vendors' official ESXi boot option when buying servers.

 

In general there is no need to put in HDDs or SSDs for an ESXi install, which is only a few GB in size and after boot runs entirely from memory; you can pull the boot device out live and ESXi will keep running. What kills the SD cards and USB drives is the log writing; over time VMware has increased the log load, with more things getting logged and in more detail.

 

Nobody runs VMs on SD cards or USB drives; that's not what they are for.

 

Quote

HPE 32GB microSD RAID 1 USB Boot Drive

 


https://support.hpe.com/hpesc/public/docDisplay?docId=a00093294en_us&docLocale=en_US

 


 

Servers aren't laptops or desktops, and we don't go into Walmart or wherever to buy SD cards or USB drives and hang them off the back of the server; these are designed solutions.

