Newsgroups: sci.electronics.repair,comp.sys.ibm.pc.hardware.storage
From: Arno
Subject: Oxidisation of Seagate & WDC PCBs

In comp.sys.ibm.pc.hardware.storage Jeff Liebermann wrote:
> On Sat, 10 Apr 2010 22:33:49 +0000 (UTC), Sergey Kubushyn wrote:
>
>> I'm right here in the US and I had 3 of 3 WD 1TB drives fail at the
>> same time in RAID1, which made the entire array dead.


> That's the real problem with RAID using identical drives. When one
> drive dies, the others are highly likely to follow. I had that
> experience in about 2003 with a Compaq something Unix server running
> SCSI RAID 1+0 (4 drives). One drive failed, and I replaced it with a
> backup drive, which worked. The drive failure was repeated a week
> later when a 2nd drive failed. When I realized what was happening, I
> ran a complete tape backup, replaced ALL the drives, and restored from
> the backup. That was just in time, as both remaining drives were
> dead when I tested them a few weeks later. I've experienced similar
> failures since then, and have always recommended replacing all the
> drives, if possible (which is impractical for large arrays).


For high reliability requirements it is also a good idea to use
drives of different brands, to get a better distribution of times
between failures. Some people have reported the effect you see.

A second thing that can cause this effect is when the disks are not
regularly surface scanned. For this reason I run a long SMART selftest
on all disks, including the RAIDed ones, every 14 days. The remaining
disks are under more stress during an array rebuild, especially if they
have weak sectors. This additional load can cause the remaining drives
to fail a lot faster, in the worst case during the rebuild itself.
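
For illustration, a minimal sketch of how such a periodic scan could be
automated with smartmontools. The device names are placeholders, and it
assumes smartctl is installed and the script is run as root, e.g. from
cron every 14 days:

#!/usr/bin/env python3
# Sketch: start a long (extended) SMART self-test on each listed disk.
# Device names below are examples only; adjust for the actual system.
import subprocess

DISKS = ["/dev/sda", "/dev/sdb", "/dev/sdc"]

for disk in DISKS:
    # 'smartctl -t long' starts the extended self-test in the background
    subprocess.run(["smartctl", "-t", "long", disk], check=False)
    # Results can be read later with: smartctl -l selftest /dev/sdX

The self-test runs on the drive itself, so the script only has to kick
it off; checking the selftest log afterwards tells you whether weak
sectors were found before a rebuild ever depends on them.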

Arno
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email:
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans