[Dirvish] A few small bugs
paul at debian.org
Wed Nov 23 06:27:41 EST 2005
On Tue 22 Nov 2005, foner-dirvish at media.mit.edu wrote:
> (2) When requesting logfile and index compression, it looks like
> dirvish creates both in their uncompressed form, and -then-
> compresses them. This means, for example, that in my runs over a
> filesystem of about two million files, dirvish first has to write
> out a half-gig file, and -then- compresses it. This (a) makes it
> more likely that the vault might run out of space, and (b) is very
> slow, because the disk must thrash all over the place doing the
> find while simultaneously writing this enormous file, and must
> then thrash some more while reading this enormous (hence uncached)
> file and then writing out its compressed version. If compression
> is requested, it should happen in a pipe in between the find
> that's generating the data and the disk; this is presumably a
> one-line change. Doing so means that most of this data never hits
> the disk in the first place and thus speeds things up enormously,
> since the actual file compresses by about 95%.
For the log file I don't think that compressing on the fly is the smart
thing to do; if something goes wrong and the process gets killed, you
lose a lot of logging.
For the index it's certainly worth doing.
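
To illustrate the idea for the index, here is a minimal sketch of streaming
a file listing straight into a gzip file, so the uncompressed data never
hits the disk. This is not dirvish's code (dirvish is written in Perl); the
function name write_compressed_index, the os.walk traversal standing in for
find, and the "size<TAB>path" line format are all assumptions made up for
the example.

```python
import gzip
import os


def write_compressed_index(root, index_path):
    """Stream one line per file under root directly into a gzip file.

    Hypothetical helper: the line format (size, tab, path) is an assumption
    for illustration, not the format dirvish uses for its index.
    Returns the number of entries written.
    """
    count = 0
    # Text-mode gzip handle: each write is compressed as it goes, so no
    # uncompressed intermediate file is ever created.
    with gzip.open(index_path, "wt", encoding="utf-8",
                   errors="surrogateescape") as out:
        for dirpath, dirnames, filenames in os.walk(root):
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                st = os.lstat(path)  # do not follow symlinks
                out.write(f"{st.st_size}\t{path}\n")
                count += 1
    return count
```

The same reasoning as above applies to the trade-off: if the process is
killed partway through, the tail of a gzip stream may be unreadable, which
matters little for an index that can be regenerated but is a real cost for
a log.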