Defining Disk Image Dump Standard

1 Jun 2000

On Jun 1, 11:10, Sellam Ismail wrote:
...
  On Thu, 1 Jun 2000, Pete Turnbull wrote: 
...
 But a lot more voluminous.  But this is just my prejudice speaking.  Even
 though I find HTML useful, I hate it.
It needn't be a whole lot more voluminous.  The tags should be concise,
there's no need to write an essay for each part.  Keywords might be a good
idea.  Tags would be omitted if irrelevant (as many would be for a "raw"
archive, or for a common format with no "funnies").  So a disk descriptor
might look something like this:
{Apple ][<00>soft<00>trks:<40><00>rpm:<15><255><00>
{trk:<00><00>logical<00>length:<12><34><00>sectors:<10>
{sector:<00><00>
{sync{bytes:<16><00>value:<255><00>}
{header:GCR<00>trk:<00><00>sec:<00><00>physsec:<00><00>head:<00><00>size:<00><01><00>}
{data:<---256 binary bytes---->crc:<xx><xx><00>}}
sector: [repeat as reqd] }}
{track: [repeat as reqd] }}
I can't remember some details like the size of a DOS 3.3 track or what the
sync bytes are, so that's just a stylistic example.
The opening "{" marks the start of an object and is matched by a closing
"}"; braces are nested because objects are nested.
Variable-length strings like "Apple ][" are terminated by some agreed
control character (I used ASCII NUL, <00>).  Numeric values are stored in
binary (actually it might make more sense to store them in ASCII where they
follow a string description, but probably not for a block of sector data).
 So "rpm" is stored as a 2-byte representation of 360.  Hmm, we'd need to
decide if it's little-endian or big-endian -- or add another tag!
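
To make the byte-order question concrete, here is a minimal Python sketch of
how such fields might be written.  The helper names (put_string, put_u16,
make_object, get_string) are hypothetical and not part of any agreed format;
the sketch only assumes what is described above: NUL-terminated strings,
"tag:" followed by a binary value, and braces around each object.

```python
import struct

NUL = b"\x00"  # the agreed string terminator (ASCII NUL in the example)


def put_string(text):
    """Encode a variable-length string followed by the NUL terminator."""
    return text.encode("ascii") + NUL


def put_u16(tag, value, big_endian=True):
    """Encode 'tag:' followed by a 2-byte binary value and a NUL.

    The byte order is exactly the open question: it has to be fixed by
    convention (or recorded in a tag of its own).
    """
    fmt = ">H" if big_endian else "<H"
    return tag.encode("ascii") + b":" + struct.pack(fmt, value) + NUL


def make_object(*parts):
    """Wrap already-encoded fields in the opening and closing braces."""
    return b"{" + b"".join(parts) + b"}"


def get_string(buf, pos):
    """Read a NUL-terminated string at pos; return it and the next offset."""
    end = buf.index(NUL, pos)
    return buf[pos:end].decode("ascii"), end + 1


# The same value, 360, gives different bytes depending on byte order.
rpm_big = put_u16("rpm", 360)                       # b"rpm:\x01\x68\x00"
rpm_little = put_u16("rpm", 360, big_endian=False)  # b"rpm:\x68\x01\x00"

disk = make_object(put_string("Apple ]["), put_string("soft"), rpm_big)
name, next_pos = get_string(disk, 1)  # skip the opening "{"
```

A reader has to take binary values by their known length rather than scan
for the terminator, since a value byte can itself be <00>.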
...
 > a problem?  The tags don't all need to be ASCII text, things like the
 > sector size could be integers, and field lengths could be limited.  I'd
 > envisage something like nested objects (borrowing from Sellam's slightly
 > later mail):

 I don't like the idea of storing the actual sector data as text though. 
I hadn't meant to imply that; I mean you could hexify it if you wanted, but
I don't see any need.  Actually one of the things I was thinking of earlier
today, was Acorn's "DrawFile" format, which uses similar objects, but the
data is still binary (it's a computer program that reads the data, not a
human).  If a human really did need to read it, you could always use a hex
editor.
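
For anyone without a hex editor to hand, a few lines of Python give the same
view of a binary archive.  This is only an illustrative sketch; the function
name hexdump and the 16-bytes-per-line layout are my own choices, not part
of any proposal.

```python
def hexdump(data, width=16):
    """Return a classic offset / hex / printable-text dump of data."""
    pad = width * 3 - 1  # width of the hex column when a line is full
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join("%02x" % b for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append("%08x  %s  %s" % (offset, hex_part.ljust(pad), text_part))
    return "\n".join(lines)


dump = hexdump(b"trk:\x00\x00logical\x00")
print(dump)
```

The tag names stay readable in the right-hand column while the binary values
appear as hex on the left, so the data never has to be stored as text.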
...
 I guess in this day and age it doesn't matter much anymore but when I was
 growing up you had to make every byte count, and I know more than 95% of us
 here can relate to that.
Yup, I was too, but I think here the benefits greatly outweigh the
disadvantage of extra storage requirement.  We want this to be as useful as
possible, and the easier it is to use for unexpected formats (to create
*and* to read), the more it will get used.
...
 > It also
 > means that if the database is lost, damaged, incomplete or otherwise
 > inaccessible, an archive can still be understood, and there's no chance of
...
 > inconsistency because two people tried to add new formats at about the same
 > time, or someone rolled their own.
 I agree with that.  Human readability is definitely a compelling advantage
...
 as is the elimination of the need for a centralized database of system
 descriptions.
It would still be good to have a central repository.  At the very least, it
would allow those who know where to look, to see what has already been
dealt with, and save a lot of design effort if the format they want is
already there.  It would be the place to store the explanation of the tag
system.  Plus, the bigger it gets, the more it will encourage others to
archive their treasures, too.
--
Pete                                            Peter Turnbull
                                                Dept. of Computer Science
                                                University of York
