There's plenty of filesystem metadata out there to confound
this process. Blindly archiving and unarchiving will destroy
data that isn't stored inside the file itself. There's a lot to be said for
archiving images of entire filesystems. What, timestamps
aren't important? Creation dates as well as last-modified dates?
Archive bits? At least 'tar' preserves Unix's groups and
permissions to a reasonable degree.
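To make the round-trip loss concrete, here's a rough Python sketch (not from
the original post; the two paths are just whatever original/extracted pair you
have on hand) that diffs the stat metadata of a file against its copy pulled
back out of an archive:

import os, stat, sys

FIELDS = [
    ("mode",  lambda st: stat.filemode(st.st_mode)),
    ("uid",   lambda st: st.st_uid),
    ("gid",   lambda st: st.st_gid),
    ("mtime", lambda st: st.st_mtime),
    ("atime", lambda st: st.st_atime),
    ("ctime", lambda st: st.st_ctime),  # inode change time on Unix, not a creation date
]

def compare(original, extracted):
    # Anything that differs here is metadata the archive round trip dropped.
    a, b = os.stat(original), os.stat(extracted)
    for name, get in FIELDS:
        va, vb = get(a), get(b)
        print(f"{name:6} {va!s:>26} {vb!s:>26}  {'ok' if va == vb else 'CHANGED'}")

if __name__ == "__main__":
    compare(sys.argv[1], sys.argv[2])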
--
There is also 'hidden' but valuable data you may not know about.
I recovered the sources to some stand-alone machine utilities off
the end of a 9-track data set on a tape that had been reused.
It's also been mentioned that DEC 'distributed' unsupported code
on distribution discs, where the unsupported code was 'deleted'
(marked as deleted in the directory).
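In the same spirit, a quick-and-dirty way to look for that kind of residue in
a raw image is to scan everything past the logical end of the recorded data
for printable text, roughly what 'strings' does. A rough Python sketch, where
the image path and end offset are placeholders you'd supply yourself:

import re, sys

def residual_strings(image_path, logical_end):
    # Read everything past the point the directory / data set accounts for.
    with open(image_path, "rb") as f:
        f.seek(logical_end)
        tail = f.read()
    # Runs of at least 8 printable-ASCII characters, much like 'strings'.
    for m in re.finditer(rb"[\x20-\x7e\t]{8,}", tail):
        print(f"{logical_end + m.start():#010x}  {m.group().decode('ascii')}")

if __name__ == "__main__":
    # usage: residual_strings.py image.bin 0x12345  (offset of the logical end)
    residual_strings(sys.argv[1], int(sys.argv[2], 0))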
--
Another problem with bits in the wild is that you have no idea whether they've
been patched or corrupted in some way before you get them.
This is why it's necessary to read as many copies of the same program
as you can find, even if the program has already been 'archived'.
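One workable way to do that comparison is simply to hash every copy you can
lay hands on and see which digests agree; a lone outlier is the copy to be
suspicious of. A rough Python sketch (the paths are whatever copies you've
gathered):

import hashlib, sys
from collections import defaultdict

def group_by_digest(paths):
    # Map each distinct SHA-256 digest to the copies that produced it.
    groups = defaultdict(list)
    for p in paths:
        with open(p, "rb") as f:
            groups[hashlib.sha256(f.read()).hexdigest()].append(p)
    return groups

if __name__ == "__main__":
    # usage: compare_copies.py copy1.bin copy2.bin copy3.bin ...
    for digest, copies in sorted(group_by_digest(sys.argv[1:]).items(),
                                 key=lambda kv: -len(kv[1])):
        print(f"{len(copies):3d} copies  {digest[:16]}...  {', '.join(copies)}")

Agreement only shows the copies match each other, of course, not that any one
of them is pristine.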