[Dirvish] Hard Drive Died, and I was shocked to discover...

Jason Boxman jasonb at edseek.com
Mon Feb 13 10:30:17 EST 2006


Steve Ramage said:
<snip>
> I gasped... but wait the backups run every night and it had been a week
> or so. I went back to last thursday, rsync_error. Then Wednesday, then
> Tuesday. I was fed up of clicking on each so I highlighted all the
> January folders, and checked there size 100 KB.

Been there.  I don't recall the circumstances now, but I wasn't getting
`rsync` errors and still ended up with a bunch of empty directories.  I
only noticed the same way you did.  Another time the original 'core'
exclude was hiding usbcore.ko and a core/ kernel module directory --
_ouch_.

Actually, it may have been that I stopped backups during a known problem
-- I think it was a compromise -- and dirvish-expire happily nuked every
single good image I still had outstanding.

Or maybe I have a few different events confused.  I don't really recall.

I do get errors emailed via cron when there are `rsync` errors, though.

I never seem to have time to mess with Dirvish these days as I'm always
recovering from some kind of disk failure instead.  (Sometimes of the backup
server itself -- it's RAID 0, as 300GB drives were way too expensive at the
time for me to buy more than two...  Obviously, never put something as
important as backups on a RAID 0 array, which cannot survive even a single
disk failure.)

Either way, I just finished a comprehensive setup of syslog-ng and a
central loghost, so running something nightly that reports to syslog via
`logger` would be pretty helpful in the event of some catastrophic failure.
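
Something along these lines is what I have in mind for the nightly check.
It is only a rough sketch, not anything that ships with Dirvish.  It
assumes each image is a directory directly under the vault, with a tree/
subdirectory, an rsync_error file when rsync complained, and the vault's
own config in dirvish/.  The "dirvish-check" tag, the function names and
the 1 MB threshold are all made up for illustration, and it uses Python's
syslog module rather than shelling out to `logger`.

```python
import os
import syslog


def tree_size(path):
    """Total apparent size in bytes of regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):  # skip symlinks, they may dangle
                total += os.path.getsize(full)
    return total


def find_suspect_images(vault, min_bytes=1024 * 1024):
    """Return sorted (image, reason) pairs for images that look broken.

    Assumed layout (hypothetical, adjust to your setup):
    vault/<image>/tree plus an optional vault/<image>/rsync_error.
    """
    suspects = []
    for image in sorted(os.listdir(vault)):
        image_dir = os.path.join(vault, image)
        if image == "dirvish" or not os.path.isdir(image_dir):
            continue  # vault config directory or a stray file
        tree = os.path.join(image_dir, "tree")
        if not os.path.isdir(tree):
            suspects.append((image, "no tree directory"))
        elif os.path.exists(os.path.join(image_dir, "rsync_error")):
            suspects.append((image, "rsync_error present"))
        elif tree_size(tree) < min_bytes:
            suspects.append((image, "tree smaller than %d bytes" % min_bytes))
    return suspects


def report(vault):
    """Send one syslog line per suspect image, or a single all-clear line."""
    syslog.openlog(ident="dirvish-check", facility=syslog.LOG_DAEMON)
    suspects = find_suspect_images(vault)
    for image, reason in suspects:
        syslog.syslog(syslog.LOG_ERR, "%s/%s: %s" % (vault, image, reason))
    if not suspects:
        syslog.syslog(syslog.LOG_INFO, "%s: all images look sane" % vault)
```

Calling report() on each vault path from a nightly cron job, after the
Dirvish run, would put the result on the loghost even when nothing reaches
cron's mail.  The size check would have caught the empty images even
without an rsync_error file, though the threshold is a guess and needs
tuning per vault.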

Are there any known failures that aren't currently reported via
STDOUT / STDERR to cron, but only show up in an image's summary or log
file (possibly compressed as log.gz or log.bz2)?

Thanks.




