It was thus said that the Great Vintage Computer Festival once stated:
On Sun, 27 Jul 2003, Megan wrote:
There also
needs to be a map somewhere specifying which
sectors are free/used.
That can be a lot of overhead, and can take up a fair amount of disk space
depending on how it is implemented, and with each directory entry having
start and size, it isn't needed.
Using one bit per allocation unit, you can store the status of up to 2048
allocation units (sectors, blocks, whatever) in 256 (8-bit) bytes of disk
space. So assuming your allocation units--let's call them "sectors"--are
256 bytes, a map taking up one sector can give the free/used status for a
2048 sector disk, or a disk with a half-megabyte capacity. Each
additional sector assigned for use in your map gives you another
half-megabyte of allocation information. Not bad.
But at some of the sizes being used, the use of a bit map to track
free space becomes problematic---with an 8G drive and 512-byte sectors, that's
2M of space just for the bitmap, which takes time to search. And after reading
http://www.acmqueue.org/modules.php?name=Content&pa=showpage&pid=43
which discusses the problems of ever-increasing disk space, I came up with
this rather "wasteful" idea which is easy, moderate in speed, and while it
wastes disk space, does so in a rather unusual fashion.
At the location where you can freely use space (past the fixed areas, such
as boot records, etc) you set aside a few tracks worth of space for the
"master directory." It contains a pointer to the beginning of free space on
the disk (block number, cylinder, head, sector, however you store geometry
data) and then (taking an idea from Unix) a name and pointer to where the
"file" starts on the disk (realizing that a directory is just a specially
formatted file), a layout with something like:
disklocation end_of_fixed_portion;
disklocation free_space;

/* repeats to fill rest of "master directory" */
char         filename[FILENAME];
disklocation fileloc;
char         filename[FILENAME];
disklocation fileloc;
/* ... */
Then, for each file, the first sector contains information about the file,
including the name (again, for reasons that will become apparent), size of
the file, timestamp, version (more on this in a bit) and anything else you
want to fill a sector, then sequentially, the rest of the file. When you
open a file for modification (or appending data) then a "new" file is
created, with the same name, starting with the next available free space on
the disk and any data is copied. The entry in the "master directory" is
updated to the new version (along with the free_space being recalculated as
needed).
Files are always created at the end of the used space, slowly
(depending upon how much data you are saving) eating into free space. You
also get versioning (hmmm, might want to save a pointer to previous versions
in the file metadata area) and if you construct the file metadata correctly,
you can even detect where files start and end if the "master directory" is
blown away. And every file is contiguous so you avoid having to seek around
on the disk. To delete a file, just remove the entry from the "master
directory"---you don't have to bother with actually removing the data from
the disk until you absolutely need it, and if you are using an 8G drive for
an older system, it will take quite a while (if ever) to fill it.
Now, you do lose the ability to have multiple files open for writing
(appending) but is that really a consideration?
I'm not saying you *have* to do it this way, but given the size of
modern disks (and it seems that modern disks can be used with older
hardware) it might be worth using a scheme like this.
-spc (Just tossing out ideas ... )