[Dirvish] dirvish and weeding out duplicates

Carel Fellinger cf-2003-nn at xs4all.nl
Tue Nov 15 11:44:10 EST 2005

On Tue, Nov 15, 2005 at 03:19:38PM +0000, Dave Howorth wrote:
> On Tue, 2005-11-15 at 15:42 +0100, Carel Fellinger wrote:
> <snip complicated sounding solution>

hm, I thought it simple:), maybe I should rephrase it to make it sound
simple too.

> The simplest solution would just be to back up everything every day and
> see how much space it really takes. It might not be as much as you fear
> since everything is hardlinked.

that's how I do it now, time will tell how much is wasted.

> To achieve two separate backups as described, just create two vaults.
> One includes just the limited part of the filesystem - run dirvish every
> day on that in the normal way. The other includes the whole filesystem
> but excludes the limited part - just run dirvish once a week on that
> directly from a crontab.

that's what I planned if the current approach failed.

> Or have I missed something?

yep, I was not trying to solve an actual problem:)  I reacted to what I
read on the wiki and thought I'd come up with a simple and efficient
solution to achieve what was discussed there, i.e. have all files with
the same content be hardlinked in the dirvish vaults, whether or not
they were hardlinked originally.  I can imagine that e.g. backing up
several vservers might benefit from such an approach.  And though I do
not need it right now, I'm curious whether my idea would fail, mainly to
better understand dirvish.
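
To make the goal concrete, here is a rough sketch of a post-processing
pass that hardlinks files with identical content under one directory
tree.  It is only an illustration under assumptions, not something
dirvish does itself and not necessarily the approach from my earlier
mail: the function names are made up, everything is assumed to live on
one filesystem, and ownership, permissions and timestamps are ignored
even though a real backup would have to compare those too.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def hardlink_duplicates(root):
    """Hardlink regular files under root that have identical content.

    Hypothetical helper: returns the number of files that were replaced
    by a hardlink.  Metadata (owner, mode, mtime) is NOT compared.
    """
    # Group by size first, so only plausible duplicates get hashed.
    by_size = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            by_size.setdefault(os.path.getsize(path), []).append(path)

    relinked = 0
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        first_seen = {}  # digest -> first path seen with that content
        for path in sorted(paths):
            keeper = first_seen.setdefault(file_digest(path), path)
            if keeper == path or os.path.samefile(keeper, path):
                continue  # same name, or already hardlinked together
            # Link under a temporary name, then rename over the
            # duplicate, so the path never goes missing.
            tmp = path + ".dedup-tmp"
            os.link(keeper, tmp)
            os.replace(tmp, path)
            relinked += 1
    return relinked
```

Calling hardlink_duplicates() on the top of a vault would be the crude
version; whether that plays well with how dirvish and rsync decide what
to link against the previous image is exactly the open question.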

groetjes, carel

