On Wed, 2005-03-02 at 22:55 -0500, der Mouse wrote:
Don't suppose anyone's come across anything that'll attempt to fix a
corrupt .Z (Unix compress) file, have they?
Not me, but it may be doable manually, depending. What kind of error
do you have? Wrong data, dropout, insertion, you don't know, what?
It was an unknown - uncompress just died with "corrupt input", which
could be any of the above.
As Al (I think it was Al) suggested, I started playing around with doing
multiple reads from the tape and after several passes I got enough to
reconstruct the archive - so the specific problem's gone.
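For anyone wanting to try the same trick, here is a minimal Python sketch of one way to combine several passes: a byte-wise majority vote. It assumes every read has the same length and that errors land in different places on each pass. That is only an assumption, not a description of how the reads above were actually combined, and `merge_reads` is a made-up name.

```python
from collections import Counter

def merge_reads(reads):
    """Combine equal-length reads of the same data by byte-wise majority vote.

    Returns (merged_bytes, doubtful_offsets). An offset is doubtful when no
    value was seen in more than half of the reads.
    """
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("reads differ in length; align them first")
    merged = bytearray(length)
    doubtful = []
    for i in range(length):
        # Most common byte value at this offset, and how many reads agree.
        value, votes = Counter(r[i] for r in reads).most_common(1)[0]
        merged[i] = value
        if votes * 2 <= len(reads):
            doubtful.append(i)
    return bytes(merged), doubtful

if __name__ == "__main__":
    passes = [b"hello world", b"hellX world", b"hello wYrld"]
    print(merge_reads(passes))
```

Three or more passes are needed before a vote can settle anything; with two, any disagreement just ends up in the doubtful list.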
However the question about fixing .Z files still stands - I'm surprised
that given how long the format's been around, nobody ever published a
utility to attempt to fix corrupt files.
In theory it may be possible. compress (.Z) uses Lempel-Ziv-Welch. An
insertion or dropout is relatively hard to fix, in large part because
it means you have trouble telling where compressed values' boundaries
are. Wrong data is comparatively easy; it will give you a corrupt
decoding table (or an outright error if the coded value is out of range
and the decoder bothers to check), but if the decompressed data is
highly redundant it's often fixable.
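To make the wrong-data case concrete, here is a minimal Python sketch of textbook LZW working on a plain list of codes. It is not the real .Z format (no header, no bit packing, no variable code width, no clear code), and `lzw_encode` and `lzw_decode` are names made up for illustration. It shows that one wrong code corrupts the decoding table, so the damage spreads, and that an out-of-range code can be caught if the decoder checks.

```python
def lzw_encode(data):
    """Textbook LZW: turn bytes into a list of integer codes."""
    table = {bytes([i]): i for i in range(256)}
    codes = []
    w = b""
    for b in data:
        wc = w + bytes([b])
        if wc in table:
            w = wc
        else:
            codes.append(table[w])
            table[wc] = len(table)   # next free code
            w = bytes([b])
    if w:
        codes.append(table[w])
    return codes

def lzw_decode(codes):
    """Textbook LZW decoder that rejects out-of-range codes."""
    table = [bytes([i]) for i in range(256)]
    out = bytearray()
    prev = None
    for code in codes:
        if code < len(table):
            entry = table[code]
        elif code == len(table) and prev is not None:
            # The one legal "not yet in table" case: previous string
            # plus its own first byte.
            entry = prev + prev[:1]
        else:
            raise ValueError("corrupt input: code %d out of range" % code)
        out += entry
        if prev is not None:
            table.append(prev + entry[:1])
        prev = entry
    return bytes(out)

if __name__ == "__main__":
    data = b"abababab"
    codes = lzw_encode(data)
    print(lzw_decode(codes))               # round trip
    print(lzw_decode([99] + codes[1:]))    # first code damaged: 'a' -> 'c'
```

With this input, changing just the first code from 'a' to 'c' gives b"cbcbcbcb": one bad code, four wrong bytes, because every later table entry built on the bad one is wrong too.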
*Likely* this was corrupt data, but I'm not certain what reading from a
tape under Linux does when a bad block is found. I'd expect it to still
spit out a corrupt block but flag the error. Maybe it doesn't and just
truncates the output, which isn't too helpful at all (particularly as
there's no way in GNU tar to print out block numbers of faulty blocks,
despite what the documentation says).
There is another possibility: ignore the garbling and keep
decompressing. If you're lucky, the encoder will emit a clear code
soon and you'll start over with a clean table, at which point you will
suddenly start getting non-garbled decompression. (If you're unlucky,
the encoder will hold off on the clear code.)
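The keep-going idea can be sketched in Python as well. This is a simplified model, not a working .Z repair tool: it takes a list of already-unpacked codes, treats 256 as the clear code with 257 as the first free code (as compress does in block mode), and on an out-of-range code writes a "?" marker and carries on. The name `decode_tolerant` and the "?" marker are assumptions for illustration only.

```python
CLEAR = 256  # clear code, as in compress block mode

def fresh_table():
    # Codes 0..255 are literals; slot 256 is reserved for CLEAR.
    return [bytes([i]) for i in range(256)] + [b""]

def decode_tolerant(codes):
    """LZW decode that keeps going past bad codes and resets on CLEAR."""
    table = fresh_table()
    out = bytearray()
    prev = None
    for code in codes:
        if code == CLEAR:
            # Clean slate: whatever was garbled in the table is gone.
            table, prev = fresh_table(), None
            continue
        if code < len(table):
            entry = table[code]
        elif code == len(table) and prev is not None:
            entry = prev + prev[:1]
        else:
            # Out of range: mark the spot and carry on rather than abort.
            out += b"?"
            prev = None
            continue
        out += entry
        if prev is not None:
            table.append(prev + entry[:1])
        prev = entry
    return bytes(out)

if __name__ == "__main__":
    damaged = [97, 98, 257, 9999, 98, CLEAR, 120, 121, 257]
    print(decode_tolerant(damaged))
```

Everything between the bad code and the clear code is suspect, but the part after the clear code decodes cleanly. A real tool would also have to unpack the variable-width codes from the byte stream, which is where an insertion or dropout would hurt.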
I'd assume that's what uncompress does anyway, but possibly not. It'd
seem more logical to me for it to restore as much as it could but still
flag an error, rather than just barfing at the first sign of trouble.
But if that were the case, then the file I was decompressing really was
rather broken.
Still, all sorted now in this instance. A file listing of the archive
suggests that I might have the only source code copy to bits of Acorn's
ARX and Brazil operating systems (or possibly the whole lot), which is
why I was keen to get this one particular tape read. Time to extract the
archive and see exactly what there is there...
cheers
Jules