[Dirvish] Help debugging RC=139

Brian brian_dorling at t-online.de
Fri Jun 16 07:48:31 UTC 2006


I need some help debugging the above return code.
I have an NSLU2 that runs a dirvish script every evening. Sometimes the 
dirvish calls fail with RC=139, and so far I have no idea why. As the 
three clients that are backed up are not always on, I created the 
following script:


set -x

/bin/logger "/opt/sbin/my-dirvish started"

ping -c 1 > /dev/null
if [ $? -eq 0 ]
then
	/bin/logger "Brian is active"
	/opt/sbin/dirvish --vault brian
	/bin/logger "dirvish RC = $?"
else
	/bin/logger "Brian is inactive"
fi

ping -c 1 > /dev/null
if [ $? -eq 0 ]
then
	/bin/logger "VDR is active"
	/opt/sbin/dirvish --vault vdr-src
	/bin/logger "dirvish RC = $?"
	/opt/sbin/dirvish --vault vdr-etc
	/bin/logger "dirvish RC = $?"
else
	/bin/logger "VDR is inactive"
fi

ping -c 1 > /dev/null
if [ $? -eq 0 ]
then
	/bin/logger "IBM-TP is active"
	/opt/sbin/dirvish --vault IBM-tp
	/bin/logger "dirvish RC = $?"
else
	/bin/logger "IBM-TP is inactive"
fi

/bin/logger "/opt/sbin/my-dirvish done"

There is a lot of debugging output at the moment because of the problems. 
The main point is that I only try the dirvish command if I can ping the 
client.
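
The same ping-then-backup pattern is repeated for every client, so it could be folded into one function. This is only a sketch, not the script as it runs on the NSLU2: the function name backup_if_up, the LOGGER and DIRVISH variables, and the host names in the usage comment are made up for illustration. The paths and vault names are the ones from the script above.

```shell
# Paths taken from the script above; the variables themselves are an
# assumption, added so the commands can be swapped out when testing.
LOGGER=${LOGGER:-/bin/logger}
DIRVISH=${DIRVISH:-/opt/sbin/dirvish}

# Hypothetical helper: run dirvish for each vault only if the host
# answers a single ping.
# Usage: backup_if_up LABEL HOST VAULT [VAULT...]
backup_if_up() {
    label=$1
    host=$2
    shift 2
    if ping -c 1 "$host" > /dev/null 2>&1; then
        "$LOGGER" "$label is active"
        for vault in "$@"; do
            rc=0
            "$DIRVISH" --vault "$vault" || rc=$?
            "$LOGGER" "dirvish RC = $rc (vault $vault)"
        done
    else
        "$LOGGER" "$label is inactive"
    fi
}

# Example calls (host names are placeholders, not the real clients):
#   backup_if_up "Brian" brian.example brian
#   backup_if_up "VDR"   vdr.example   vdr-src vdr-etc
```

Logging the vault name next to the return code makes it easier to see in syslog which of several vaults on one client failed.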

When the error occurs I get RC=139, and the vault contains a zero-length 
summary file, a log that contains only the rsync command, and an empty 
tree.
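
For what it's worth, if the 139 is read as an ordinary shell exit status, values above 128 conventionally mean the process was terminated by a signal, with the signal number being the status minus 128. That would make 139 signal 11, which is SIGSEGV on Linux. Whether that really applies here (a crash in dirvish itself, or a status passed on from a child process) is still an open question. The helper name describe_rc below is made up for illustration.

```shell
# Decode a shell exit status. By shell convention, a status above 128
# means "terminated by signal (status - 128)".
describe_rc() {
    rc=$1
    if [ "$rc" -gt 128 ]; then
        echo "killed by signal $((rc - 128))"
    else
        echo "exited with status $rc"
    fi
}

describe_rc 139
```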

Here is an example from the log of a failed run:

ACTION: rsync -vrltH --delete -pgo --stats -D --numeric-ids 
--link-dest=/public/backup/brian/20060601-1802/tree /public/backup/brian/20060605-0125/tree

Here is a working log example:

ACTION: rsync -vrltH --delete -pgo --stats -D --numeric-ids 
--link-dest=/public/backup/brian/20060601-1802/tree /public/backup/brian/20060610-1354/tree

receiving file list ... done
..... <--------- deleted some file names here

Number of files: 1728
Number of files transferred: 100
Total file size: 476511005 bytes
Total transferred file size: 148812162 bytes
Literal data: 13779629 bytes
Matched data: 135032533 bytes
File list size: 35618
File list generation time: 16.192 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 306362
Total bytes received: 14003419

sent 306362 bytes  received 14003419 bytes  140983.06 bytes/sec
total size is 476511005  speedup is 33.30

So far rsyncd.log on a Debian box shows nothing, even when the rsync 
command fails. I have just increased the verbosity of that log.

I am wondering what else I can do to track this down. It's totally 
random, but when it fails it seems to fail for all the active clients.
Maybe rsync is not responding when called by dirvish?
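
Since a failed run leaves a zero-length summary file behind, one way to see when the failures happened might be to list the images where that is the case. A rough sketch, assuming each image directory under the vault holds its summary file next to the tree directory; the function name list_failed_images is made up.

```shell
# Print every image directory under a vault whose summary file exists
# but is empty (the symptom seen after an RC=139 run).
# Usage: list_failed_images VAULT_DIR
list_failed_images() {
    vault_dir=$1
    for summary in "$vault_dir"/*/summary; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$summary" ] || continue
        # -s is true for a non-empty file, so print only the empty ones.
        [ -s "$summary" ] || dirname "$summary"
    done
}

# Example call, using the vault path from the logs above:
#   list_failed_images /public/backup/brian
```

Comparing the timestamps in the listed directory names across the vaults should show whether all clients really fail on the same evenings.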

Cheers Brian
