-----Original Message-----
From: cctalk <cctalk-bounces at classiccmp.org> On Behalf Of Dennis Boone via cctalk
Sent: 17 September 2020 06:53
To: cctalk at classiccmp.org
Subject: Re: 9 track tapes and block sizes

> What I know is that tape is subdivided in files by means of marks,
> and each file is subdivided in blocks of equal size.

Er, no. The blocks aren't necessarily of equal size. Unix people who
are used to tar often seem to have this mindset, but the general case
is that records can be of varying size.

On labelled tapes the blocks are almost certainly different sizes:
usually one or more 80-character labels, followed by a tape mark,
followed by the data blocks.

> Now suppose you find an unknown tape you want to preserve: using dd
> you could easily 1:1 copy tape files to hard disk files using a SCSI
> drive and Linux.

DON'T DO THAT. If you use dd, you're throwing away information.
Specifically, you're throwing away knowledge of the block size. Most
of the conventional unix utilities don't care. Many other things do.
In many cases, it's difficult or impossible to reconstruct the block
sizes from the content, but even if it were possible, it's terrible
archival practice.

There are file formats for containing tape image data. The most common
one is probably the simh .tap format. These all preserve block
lengths, tape marks, indications of errors in reading the original,
etc. Many fail to provide a means to embed metadata, but you can put
that in separate adjacent files.

The docs for SIMH .TAP files are here:
http://simh.trailing-edge.com/docs/simh_magtape.pdf
Be careful, as there are also non-SIMH .tap formats.

In the IBM mainframe emulation world there is also .AWS, an IBM format
introduced with its P390 Microchannel mainframe card, and .HET, a
Hercules extension to .AWS which allows compression.
Note that all the above formats contain info that allows the file to
be read block by block, both forwards and backwards, from any position
on the tape.
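
To make the "forwards and backwards" point concrete, here is a minimal
Python sketch of a SIMH-style .tap container. It assumes the basic
layout as I understand it from the SIMH document linked above: each
block is a 32-bit little-endian length, the data (padded to an even
byte count), then the same length again; a zero word is a tape mark.
The function names are made up for illustration, the flag mask is an
assumption, and erase gaps and other special markers are not handled.

```python
import io
import struct

TAPE_MARK = 0x00000000       # zero word separates tape files
END_OF_MEDIUM = 0xFFFFFFFF   # optional marker for logical end of tape


def write_record(out, data):
    """Append one tape block: length, data (padded to even), length again."""
    header = struct.pack("<I", len(data))
    pad = b"\x00" if len(data) % 2 else b""
    out.write(header + data + pad + header)


def write_tape_mark(out):
    """Append a tape mark."""
    out.write(struct.pack("<I", TAPE_MARK))


def read_items(inp):
    """Yield each block as bytes and each tape mark as None."""
    while True:
        raw = inp.read(4)
        if len(raw) < 4:
            return                    # ran off the end of the image file
        (word,) = struct.unpack("<I", raw)
        if word == END_OF_MEDIUM:
            return
        if word == TAPE_MARK:
            yield None
            continue
        # Assumption: the top byte carries flags (e.g. "read error"),
        # so only the low 24 bits are treated as the length.
        length = word & 0x00FFFFFF
        data = inp.read(length)
        inp.read(length % 2)          # skip the pad byte after odd-length data
        inp.read(4)                   # trailing copy of the length
        yield data


# Build a tiny two-file image in memory and list what is in it.
image = io.BytesIO()
write_record(image, b"A" * 80)
write_tape_mark(image)
write_record(image, b"B" * 801)
write_tape_mark(image)
image.seek(0)
sizes = [None if item is None else len(item) for item in read_items(image)]
```

The trailing copy of the length is what lets a reader step backwards:
from any block boundary it can read the preceding four bytes and skip
back over the block.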

> But: how do you know which block size is on the tape?

Generally speaking, do a read with a block size as large as or larger
than the max on the tape, and the system will hand you the full
record and the actual number of bytes read. If you're writing C or
scripting code, the unix read() call does this. From the command line,
you can do it with dd: specify a large block size and a count of 1,
and it'll tell you what it actually got as it exits.
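
A minimal Python sketch of the read() approach, for illustration only.
os.read() is a thin wrapper around the unix read() call. The 64k
default is the guess discussed for 9 track, and treating an empty read
as a tape mark is an assumption about how the tape driver reports
file marks (it is how I understand the Linux st driver to behave).

```python
import os

MAX_BLOCK = 65536  # assumed upper bound; raise it for DLT and later media


def read_record(fd, max_block=MAX_BLOCK):
    """Read one record; an empty result is taken to mean a tape mark."""
    return os.read(fd, max_block)


def survey_file(fd, max_block=MAX_BLOCK):
    """Return the length of every record up to the next tape mark."""
    sizes = []
    while True:
        record = read_record(fd, max_block)
        if not record:
            return sizes
        sizes.append(len(record))
```

To try it, open the non-rewinding tape device read-only, for example
fd = os.open("/dev/nst0", os.O_RDONLY) (the device name is an
assumption about your setup), and call survey_file(fd) once per tape
file. The list it returns is exactly the information a plain dd copy
to disk loses.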

For 9 track, few systems could write blocks larger than 32k or 64k, so
those are decent guesses for "large" there. If it's DLT or something
more modern, then the largest possible block might be a lot larger.
The system reading the tape may impose a limit based on available
buffer space. You should be able to iteratively determine the largest
size it will accept.

Long blocks mean fewer inter-block gaps, so they are often chosen for
archives, especially on 6250 BPI 9-track tape.

If you are lucky and the tape contains labels, these usually have info
about the block and record size:
http://www.setgetweb.com/p/i5/volum.htm
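
As a rough Python sketch of pulling those sizes out of a label: this
assumes the commonly documented HDR2 layout (record format in column
5, block length in columns 6-10, record length in columns 11-15),
which I believe is shared by IBM standard labels and ANSI labels.
Check it against the documentation for the system that wrote the
tape. The function names are invented for illustration, and cp037 is
just one common EBCDIC code page.

```python
def decode_label(block):
    """Return an 80-character label as text, or None if it is not a label."""
    if len(block) != 80:
        return None
    for codec in ("ascii", "cp037"):   # try ASCII first, then EBCDIC
        try:
            text = block.decode(codec)
        except UnicodeDecodeError:
            continue
        if text[:3] in ("VOL", "HDR", "EOF", "EOV", "UHL", "UTL"):
            return text
    return None


def _number(field):
    """Convert a numeric label field, or return None if it is not digits."""
    field = field.strip()
    return int(field) if field.isdigit() else None


def parse_hdr2(text):
    """Pull record format, block length and record length out of HDR2."""
    if not text.startswith("HDR2"):
        return None
    return {
        "record_format": text[4],             # column 5
        "block_length": _number(text[5:10]),  # columns 6-10
        "record_length": _number(text[10:15]),  # columns 11-15
    }
```

Feed decode_label() each 80-byte block read before the first tape
mark, and pass anything that comes back starting with "HDR2" to
parse_hdr2().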

Oh, and yes: tape data often has records embedded in the block without
terminating <lf>, <cr> or <crlf> characters.
So it was common to use tapes with 800-byte blocks, with ten card
images in each block, or 1200-byte blocks with ten 120-byte printer
lines in each block.
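
A short Python sketch of undoing that kind of fixed-length blocking.
The record length has to come from somewhere else (a label, or
knowledge of the application), and the cp037 code page used for the
text conversion is only an assumption; substitute whatever character
set the originating system used.

```python
def deblock_fixed(block, record_length):
    """Split one tape block into equal-length records."""
    if len(block) % record_length:
        raise ValueError("block is not a whole number of records")
    return [block[i:i + record_length]
            for i in range(0, len(block), record_length)]


def to_text_lines(records, codec="cp037"):
    """Decode records and add the line ends the tape never had."""
    return [record.decode(codec).rstrip() + "\n" for record in records]


# Ten 80-column card images packed into one 800-byte block.
cards = ["CARD %02d" % n for n in range(10)]
block = b"".join(card.ljust(80).encode("cp037") for card in cards)
lines = to_text_lines(deblock_fixed(block, 80))
```

The last block of a file may be shorter than the rest, which is one
more reason the per-block lengths need to be preserved.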

For commercial data you may even find multiple record types; when I
worked in insurance, we would have a customer record with one or more
policy records.
You will also find variable-length records and blocks, where there are
length fields in the blocks.
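
The exact layout of those length fields depends on the system. As one
example, here is a Python sketch for the IBM-style "VB" convention,
which is an assumption on my part and not something the tape itself
will tell you: a 4-byte block descriptor (2-byte big-endian length
that includes itself, then two zero bytes), followed by records that
each start with a 4-byte record descriptor of the same shape.
build_block() exists only to make test data.

```python
import struct


def deblock_variable(block):
    """Split an IBM-style VB block into records using its length fields."""
    if len(block) < 4:
        raise ValueError("block too short for a block descriptor")
    (block_length,) = struct.unpack(">H", block[:2])
    if block_length < 4 or block_length > len(block):
        raise ValueError("block descriptor does not match block size")
    records = []
    offset = 4                          # skip the block descriptor word
    while offset + 4 <= block_length:
        (record_length,) = struct.unpack(">H", block[offset:offset + 2])
        if record_length < 4 or offset + record_length > block_length:
            raise ValueError("bad record length at offset %d" % offset)
        records.append(block[offset + 4:offset + record_length])
        offset += record_length
    return records


def build_block(records):
    """Pack records into one VB-style block (for test data only)."""
    body = b"".join(struct.pack(">HH", len(r) + 4, 0) + r for r in records)
    return struct.pack(">HH", len(body) + 4, 0) + body


# One customer record followed by its policy records, all in one block.
sample = [b"CUSTOMER 0001", b"POLICY A", b"POLICY B-LONGER"]
unpacked = deblock_variable(build_block(sample))
```

If the descriptor in a block disagrees with the block length the
drive reported, that is a strong hint the tape uses some other
convention.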

Of course that was a long time ago...
... also, on 9-track tapes it's possible to read off the end of the
tape.

Many of the quarter-inch cartridge formats actually don't support
block sizes other than 512 bytes. If they were used on systems that
expected to be able to write larger and/or variable records, the
system hardware or software may have implemented a logical blocking
layer on top of the 512-byte hardware layer. If you're reading one of
these and don't have the original hardware/software to decode it,
you'll have to figure out how to decipher it yourself.
De
Hope this helps
Dave