How do UNIX files work? Is there a header of some sort?
They are just a stream of bytes. Period. Some types of files can
start with certain special byte pairs (a.out, Perl scripts, PostScript,
shell scripts and the like), but that's a convenience for the
segment loader (program loader). You can feed any byte stream,
valid or invalid, to a UNIX app and it will do its best to process
the data. Now... programmers from a limited background have been
known to code in explicit checks for certain file extensions and
make assumptions based on the spelling of the file name, but that's
the programmer's idea, not UNIX's idea of how to do things. Extensions
are optional. Formatted data is optional. As a result, every
configuration file ever invented is a different format. There are
no real standards, just occasional similarities.
cf. termcap entries vs. passwd entries vs. any .*rc file for any
app, etc.
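
To make the "occasional similarities" point concrete, here is a rough
sketch in Python (my choice of language, nothing to do with the tools
themselves). Both passwd and termcap entries are colon-separated, but
the fields mean completely different things. The function names
parse_passwd_line and parse_termcap_entry are made up for this
illustration, and the termcap parser assumes a single-line entry with
no backslash continuations.

```python
def parse_passwd_line(line):
    """Split one passwd entry into its seven colon-separated fields."""
    name, password, uid, gid, gecos, home, shell = line.rstrip("\n").split(":")
    return {
        "name": name,
        "uid": int(uid),
        "gid": int(gid),
        "gecos": gecos,
        "home": home,
        "shell": shell,
    }


def parse_termcap_entry(entry):
    """Split a single-line termcap entry into (names, capabilities).

    Assumption: the entry is already joined onto one line.
    """
    fields = [f for f in entry.strip().split(":") if f]
    names = fields[0].split("|")        # terminal names, separated by '|'
    caps = {}
    for field in fields[1:]:
        if "=" in field:                # string capability, e.g. cl=\E[H\E[J
            key, value = field.split("=", 1)
            caps[key] = value
        elif "#" in field:              # numeric capability, e.g. co#80
            key, value = field.split("#", 1)
            caps[key] = int(value)
        else:                           # boolean capability, e.g. am
            caps[field] = True
    return names, caps


if __name__ == "__main__":
    print(parse_passwd_line("root:x:0:0:root:/root:/bin/sh"))
    print(parse_termcap_entry(r"vt100|dec vt100:co#80:li#24:am:cl=\E[H\E[J:"))
```

Same delimiter, two unrelated parsers. That is about as much
standardization as you get.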
BTW, I think it's an incredible pain that the Mac has no built in
way to change file types. If they get lost, I have to use DiskEdit
or some such thing to restore them.
What about ResEdit?
John writes:
>One of my latest three-great-ideas-before-breakfast ideas is
>to write a program for Windows that sniffs and identifies files
>in the manner of Unix's "file". That's the problem with files as
>files: you can easily lose track of what's in them, especially
>if you lose that three-char extension, or it gets wrapped in
>an archive format or attachment, etc.
If I were you, I'd start with the source to the UNIX 'file' program
and use its associated list of file types. It has a list (in its
own format, of course ;-) of various kinds of files and whatever
signature bytes are found in whatever offsets to make a best guess
at the nature of the file. Fortunately, this table is in printable
ASCII (kinda universal within UNIX) and fairly easy to extend, making
it suitable for the Windoze world.
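
For a feel of how the signature-at-offset idea works, here is a
minimal sketch in Python. It is not the real 'file' source, and the
table below is not in file's own format. SIGNATURES, sniff and
sniff_file are names I made up for illustration, and the handful of
signatures is only a small sample of well-known ones.

```python
# Hypothetical signature table: (offset, signature bytes, description).
# This is NOT the format of file's real list, just an illustration.
SIGNATURES = [
    (0, b"\x7fELF", "ELF executable or object"),
    (0, b"%!", "PostScript document"),
    (0, b"#!", "script with interpreter line"),
    (0, b"\x1f\x8b", "gzip compressed data"),
    (0, b"PK\x03\x04", "ZIP archive"),
    (257, b"ustar", "tar archive"),
]

# Bytes treated as plain text in the fallback check (an assumption:
# printable ASCII plus tab, newline and carriage return).
TEXT_BYTES = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}


def sniff(data):
    """Make a best guess at the kind of data from its leading bytes."""
    if not data:
        return "empty"
    for offset, magic, description in SIGNATURES:
        # Compare the bytes found at the given offset with the signature.
        if data[offset:offset + len(magic)] == magic:
            return description
    if all(byte in TEXT_BYTES for byte in data):
        return "ASCII text"
    return "data"


def sniff_file(path, limit=512):
    """Read the first `limit` bytes of a file and sniff them."""
    with open(path, "rb") as handle:
        return sniff(handle.read(limit))


if __name__ == "__main__":
    print(sniff(b"#!/bin/sh\necho hello\n"))
    print(sniff(b"\x00" * 257 + b"ustar\x00"))
```

Note that the file name never enters into it. Extending the guesser
means adding rows to the table, which is the property that makes the
real list easy to carry over to another platform.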
Enjoy,
-ethan