[Dirvish] Dirvish Improvements

Dominik Schulz lkml at ds.gauner.org
Mon Feb 9 19:31:16 UTC 2009


On Monday, 9 February 2009 14:38:43, Paul Slootman wrote:
> Good initiative, a couple of comments / questions:
>
> On Mon 09 Feb 2009, Dominik Schulz wrote:
> > Changes:
> > - Concurrent Backups with dirvish-runall (using Threads)
> Without having looked at the code, how is this implemented?
> Can you define how many to run in parallel? Does it try to connect to
> different clients, instead of hammering one client many times in
> parallel?
The current snapshot uses 5005threads. However, I just found out that those 
are somewhat outdated, and I have already switched my code to ithreads. I may 
use fork() instead, but I'm not yet sure what the advantages and disadvantages 
would be. I added a configuration option that lets the user specify how many 
threads run in parallel. I was also thinking about implementing some more 
advanced options, like those you mentioned.
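
To make the two ideas above concrete (a configurable number of parallel 
workers, and spreading jobs over different clients), here is a minimal sketch. 
It is written in Python purely for illustration and is not the dirvish-runall 
code; the names interleave_by_client, run_all and max_parallel are 
hypothetical, and the backup callable stands in for whatever starts one 
dirvish run.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from itertools import zip_longest


def interleave_by_client(jobs):
    """Reorder (client, vault) jobs round-robin across clients."""
    by_client = defaultdict(list)
    for client, vault in jobs:
        by_client[client].append((client, vault))
    ordered = []
    # Each round takes at most one job from every client.
    for round_ in zip_longest(*by_client.values()):
        ordered.extend(job for job in round_ if job is not None)
    return ordered


def run_all(jobs, backup, max_parallel=2):
    """Call backup(client, vault) for every job, max_parallel at a time."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [pool.submit(backup, c, v)
                   for c, v in interleave_by_client(jobs)]
        # Results come back in submission (interleaved) order.
        return [f.result() for f in futures]
```

Note that the interleaving only makes it less likely that two workers hit the 
same client at once; it does not guarantee it. A strict per-client limit would 
need an additional lock or semaphore per client.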

> > - Handling of SIGnals. A "kill <PID of dirvish>" will (try to) properly
> > shut down dirvish and remove the unfinished backup
> I'd like a way of continuing interrupted backups, which would mean
> leaving unfinished backups in place.  There would have to be some sort
> of status kept somewhere so that a future run knows how to continue.
The problem is that the "state", i.e. the files, changes in the meantime. But 
I'll put this on my todo list. I think that if you don't care about the files 
changing, or if you use some kind of (LVM) snapshot, it should be no problem.
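
For reference, the shutdown behaviour from the changelog item quoted above 
(catch the signal, then remove the unfinished image) can be sketched roughly 
as follows. This is a Python illustration under my own assumptions, not the 
actual Perl code; BackupInterrupted, run_with_cleanup and do_backup are 
hypothetical names. A resumable variant would keep the directory and write 
some status file instead of deleting it.

```python
import shutil
import signal


class BackupInterrupted(Exception):
    """Raised in the main thread when a termination signal arrives."""


def _raise_interrupted(signum, frame):
    raise BackupInterrupted(signum)


def run_with_cleanup(image_dir, do_backup):
    """Run do_backup(image_dir); on SIGTERM/SIGINT remove the partial image.

    Returns True if the backup finished, False if it was interrupted.
    Must be called from the main thread (a signal.signal requirement).
    """
    signals = (signal.SIGTERM, signal.SIGINT)
    # Install our handler and remember the previous ones.
    previous = {s: signal.signal(s, _raise_interrupted) for s in signals}
    try:
        do_backup(image_dir)
        return True
    except BackupInterrupted:
        # Drop the unfinished image so it is never mistaken for a good one.
        shutil.rmtree(image_dir, ignore_errors=True)
        return False
    finally:
        # Restore whatever handlers were active before.
        for s, handler in previous.items():
            signal.signal(s, handler)
```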

> > - Locking for dirvish-runall (default: /var/run/dirvish.pid), dirvish
> > ($vault/dirvish.pid) and dirvish-expire (/var/run/dirvish-expire.pid)
> Hmm, not too sure about dirvish itself, as I have been known to --init a
> new backup while dirvish-runall (and hence dirvish also) is running.
I see no problem with that, since dirvish-runall and dirvish each have their 
own lockfile. The rationale was that you probably don't want dirvish-runall to 
run twice; I've read about cases on this list and/or the wiki where users had 
problems with long-running instances over slow links. Locking dirvish-runall 
would have prevented those problems. You can still run dirvish multiple times 
concurrently, as long as each instance runs on its own vault, because the 
lockfile is placed inside the vault.
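
A small sketch of why a per-vault lockfile does not get in the way of a 
manual --init on a different vault. Again this is illustrative Python, not the 
real implementation; acquire_lock, release_lock and vault_lock_path are 
hypothetical helpers, and stale-lock detection is left out on purpose.

```python
import os


def vault_lock_path(vault_dir):
    """Lockfile location inside the vault, as described above."""
    return os.path.join(vault_dir, "dirvish.pid")


def acquire_lock(path):
    """Atomically create a PID file; return False if it already exists."""
    try:
        # O_EXCL makes creation fail if another process got there first.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as fh:
        fh.write(f"{os.getpid()}\n")
    return True


def release_lock(path):
    """Remove the PID file once the run is finished."""
    os.unlink(path)
```

Two runs on the same vault collide on the same path, while runs on different 
vaults use different paths and never see each other's lock.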

> > - Place a symlink to the most-recent image inside the vault
> I use "image-temp: latest" for that...
I'm not sure if I'm happy with the image-temp option. I'll have to look into 
that ...
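
For comparison, the symlink approach boils down to something like the 
following sketch (Python for illustration only; point_latest_at is a 
hypothetical name, and the link name "latest" is just borrowed from the 
example above). The link is created under a temporary name and renamed into 
place, so readers never see the vault without a valid link.

```python
import os


def point_latest_at(vault_dir, image_name, link_name="latest"):
    """Make vault_dir/link_name a symlink to the given image directory."""
    link = os.path.join(vault_dir, link_name)
    tmp = link + ".tmp"
    # Clear a leftover temporary link from an earlier, aborted run.
    if os.path.lexists(tmp):
        os.unlink(tmp)
    # Relative target, so the link survives moving or remounting the bank.
    os.symlink(image_name, tmp)
    # rename() replaces the old symlink itself; it does not follow it.
    os.replace(tmp, link)
```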

> > - Removed dependency on File::Find
> Why? That's part of standard perl, so should always be available;
> I see no point in rewriting things yourself just for the sake of it.
You're right, that point was unclear. I was trying to fix the problem that 
Dirvish looked for summary files in every subdirectory of the bank. While I 
was working on that, File::Find became unnecessary.
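
The idea, sketched in Python rather than Perl: instead of walking the whole 
tree, look only at the one level where summary files can live. The layout 
bank/vault/image/summary is my assumption here, and find_summaries is a 
hypothetical name.

```python
import glob
import os


def find_summaries(bank_dir):
    """Return summary files exactly two levels below the bank."""
    # Assumed layout: <bank>/<vault>/<image>/summary
    pattern = os.path.join(glob.escape(bank_dir), "*", "*", "summary")
    # Fixed-depth glob: nothing below <image>/ is ever descended into.
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))
```

With a fixed-depth pattern the backed-up trees themselves are never scanned, 
which is the part that made the recursive search slow.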

And last but not least: Please remember that this was meant as a kind of 
development snapshot. If something is wrong with it, I'm happy to fix it if 
anybody can explain to me why I should do so, i.e. why it was wrong before.

-- 
Best Regards
Dominik