[Dirvish] Backup Drive Failure

Dale Amon amon at vnl.com
Thu Nov 4 21:12:46 UTC 2010

On Thu, Nov 04, 2010 at 02:34:30PM -0400, Przemek Klosowski wrote:
> > 'Self-healing' bad sector errors... but maybe I'm just paranoid. Do SATA drives do bad sector remapping, and do they do it well? Who knows...
> Certainly they do. I recommend checking the SMART firmware status by
> running smartctl -a /dev/sdd or skdump /dev/sdd
> On more recent kernels they can interrogate most disks across the USB
> bus as well as the standard ATA-attached disks.
> Look for Reallocated_Sector_Ct and especially Current_Pending_Sector:
> the former shows hard errors that were recovered by replacing the bad
> sector with a spare; the latter shows presently occurring errors that
> could not yet be repaired.
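
For anyone who wants to automate the check described in the quoted
text, here is a rough sketch in Python. It assumes the usual
ten-column attribute table that smartctl -A prints on ATA/SATA drives.
The names WATCHED, parse_smart_attributes and read_drive are made up
for this example and are not part of smartmontools, and the layout
may differ on other drive types.

```python
import subprocess

# The two attributes named in the quoted text above.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector")


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} for the watched attributes.

    Assumes the ten-column table layout of `smartctl -A`:
    ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID; the header row does not.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            try:
                # Column 10 is the raw count; ignore rows where it is not a plain integer.
                found[fields[1]] = int(fields[9])
            except ValueError:
                pass
    return found


def read_drive(device):
    """Run `smartctl -A` on a device (usually needs root) and parse its output."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return parse_smart_attributes(result.stdout)
```

Calling read_drive("/dev/sdd") as root should return a small dict of
raw counts. A nonzero Current_Pending_Sector is the one to worry about
first, as described above.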

Absolutely. And in my particular case, there were not
even any errors showing from smartd after the
drive was brought back into its normal operating
temperature range and a full badblocks check was run on the
drive before reusing it. It just makes sense to me that if
a device goes a bit outside its thermal range it might
show temporary bad results... now if one were to keep
it running well over temp, you would eventually see
permanent damage. At least that's my story and I'm
sticking to it. :-^

I guess it comes from having actually worked with discrete
transistors, or maybe due to the mental trauma from the
time I picked up a soldering iron by the wrong end at
5am one morning...

