Thoughts on manual database design?

Fri Oct 2 20:08:22 CDT 2015

On 10/2/2015 7:25 PM, Mike Stein wrote:
> Ah, OK; not really relational then, but as long as it works reliably...

Not *fully* relational.  It fits at least one of the lower echelon
orders, I think.

> 
> Yeah, you may not remarry or repaint a cabinet but it's quite possible
> that you'll move one...
> 
> Doesn't your software provide for hiding the auto keys except during
> maintenance?

Actually, I will need them to put in the references and for the
application to follow to get details (particularly for the update
application) and to use in the 1 to many situations where an auto
increment key exists.

> 
> No provision for multiple (changed) publishers so a given manual could
> show up under both Compaq or HP, for example?

Yes, there is.  That is why an *artifact* - a given *hardcopy* of a
manual - has a *publisher*.   As I mentioned in an earlier reply at some
point to someone, a given manual applies to a single machine
*manufacturer* (e.g., Apollo is considered the manufacturer of the DN
series, even though, technically, later HP made them for a year or two),
but a given manual with the same number the applies to them might have
Apollo as a publisher (older copy) and HP (newer copy - with the same
contents).

A given hardcopy of a manual could not change publishers.  The publisher
is permanent characteristic of that copy.

I could conceivably add a Publisher column to the Manual_File table, too
(indicating the publisher of that scan of the manual), but I didn't see
the point.  Most textual searches in the online application will be on a
join of Manual and Manual_Artifact (where the Manual table data gets
replicated in the result set for each artifact), so the publisher can be
picked up that way.

JRJ

> 
> m
> 
> ----- Original Message ----- From: "Jay Jaeger" <cube1 at charter.net>Ah, OK
> To: <cctalk at classiccmp.org>
> Sent: Friday, October 02, 2015 6:53 PM
> Subject: Re: Thoughts on manual database design?
> 
> 
>> The descriptions are not the keys to any table, but the names sometimes
>> are, if there hasn't been a reason to use a different key.
>>
>> For example:
>>
>> Location is the key to the Location table.
>>
>> Location . Cabinet is the key to the Cabinet table - cabinets are not
>> necessarily unique unto themselves but are unique within a location.
>> (Well, actually, I plan to make them globally unique, but didn't want to
>> design the database to require it).
>>
>> ArtifactID is the key to the Manual_Artifact table.
>>   Location . Cabinet are columns in the Manual_Artifact table.
>>   They are also foriegn keys (i.e., keys to the Cabinet table).
>>
>> If I cared about the color of a cabinet, that would be a column in the
>> cabinet table.  The artifact would not care what color the cabinet was.
>>  Color would not be a key.
>>
>> If a cabinet got renamed, then the Artifacts would have to change as
>> well.  (This is why purists would suggest adding a separate numeric key
>> to the cabinet table.  I am not worried about that happening - there
>> isn't any reason, really, to rename a cabinet.).
>>
>> HOWEVER, if a cabinet got moved then the Location in both the Cabinet
>> and the Manual_Artifact table would have to change.  That is actually
>> plausible (as a "Box" is one kind of cabinet - and those *do* move), so
>> this gives me some reason to add such an invented key that I had not
>> thought of before, so I think I will make that change to add a
>> autoincrement key to the cabinet table.
>>
>> I don't like to invent these separate autoincrement keys without a good
>> reason - but am perfectly happy to if I find such a reason.
>>
>> No, there is no magic with respect to the database doing updates.
>>
>> JRJ
>>
>> On 10/2/2015 5:36 PM, Mike Stein wrote:
>>> I'm just surprised that it looks like the names/descriptions are the
>>> key; what if you repaint the BLUE cabinet in your wife MARY'S BEDROOM
>>> red, or for that matter if you remarry and MARY'S BEDROOM becomes
>>> LINDA'S BEDROOM?
>>>
>>> Does the software update all occurrences automatically, or am I
>>> misunderstanding?
>>>
>>> m
>>>
>>> ----- Original Message ----- From: "Jay Jaeger" <cube1 at charter.net>
>>> To: <cctalk at classiccmp.org>
>>> Sent: Friday, October 02, 2015 5:06 PM
>>> Subject: Re: Thoughts on manual database design?
>>>
>>>
>>>> There are three columns named Location for a reason.
>>>>
>>>> There is the column Location in a table all by itself.  That is a list
>>>> of locations - not just for manuals, but, eventually, for all of my
>>>> artifacts.
>>>>
>>>> Then there is a table of Cabinets.  Each Cabinet has a single location
>>>> at any given time.  But Cabinet by itself may not be unique, so
>>>> Location
>>>> comes along for the ride as part of the key for the table of cabinets.
>>>>
>>>> Manual artifacts (copies) are stored in Cabinets.  Yes, I *could* have
>>>> created a separate key for each Cabinet, and stored that in the Manual
>>>> table and the Cabinet table (like I did to relate Manuals to the other
>>>> tables), but that would have actually complicated the design, so
>>>> instead
>>>> I used the same concatenation which is the key to the Cabinet table.
>>>>
>>>> The database is defined such that Location is a foreign key in Cabinet,
>>>> and the keys to Cabinet (tee hee) are a foreign key in Manual_Artifact.
>>>> This allows the database to *guarantee* that there is not any Manual
>>>> whose Cabinet does not exist or a Cabinet whose Location does not
>>>> exist.
>>>> It also allows the web applications  to easily populate pull down lists
>>>> without having to read through the entire artifacts table.  With a
>>>> database this small that probably doesn't matter much, but if the table
>>>> had millions of rows it certainly would.
>>>>
>>>> Type and manufacturer are handled the same way for the same reason.
>>>>
>>>> I could have made a more relationally pure design by creating a
>>>> separate
>>>> table of Artifacts and Cabinets, Cabinets and Locations and so on.  But
>>>> because each of those relationships is just one to many and never many
>>>> to many, there was no point in my mind (purists would probably
>>>> disagree).  Not so for manuals and machines - many manuals may apply to
>>>> a given machine, and a given manual may apply to many machines, so that
>>>> had relation to be stored in a separate table.
>>>>
>>>> Historial digression....
>>>>
>>>> I have been doing database design essentially like this from *before*
>>>> relational databases were well known and commercially available, at
>>>> Wisconsin DOT, which developed its own database system called File
>>>> Handler in the early 1970's, starting on an IBM 360/65 MP with 2MB of
>>>> core.  It was written because the other DBMS's at the time were either
>>>> too slow, to big or required taking too much of the database too often
>>>> for reorganizations.  (IBM's IMS, in particular).  DB2 did not yet
>>>> exist.  Huge gamble management took on the programmers that wrote it,
>>>> which paid off in millions of dollars saved in computer capacity alone.
>>>> For a while, for performance reasons, we had a "cheat" that could store
>>>> a one to many relation in a single column (which we called a "repeating
>>>> group") that was done for our drivers database. We broke them out into
>>>> separate tables when we upgraded to an Amdahl 470/V6 in 1976.
>>>>
>>>> File Handler production before I started there in 1975, I was the
>>>> primary DBA for it for about 7 years before I moved on to other things.
>>>> It had features like row-level blank compression, elimination of nulls
>>>> at the column level (a bitmap indicated which columns were present), an
>>>> API which had a LALR compiler which parsed queries (though for online
>>>> production we required them to be pre-compiled), full (single phase)
>>>> commit with preempt detection including redundant log and checkpoint
>>>> files in case the machine went down mid-commit and so on.  It used
>>>> techniques for indexing that would be recognizable today.  It was
>>>> almost
>>>> its own OS: by the time we were done it had (in order of development)
>>>> its own memory management ("KORMAN" aka "Harvey"), task management and
>>>> program loading/content management ("CONMAN"). It supported the SMP
>>>> fully, though applications were usually written to be
>>>> single-threaded in
>>>> a given serially-reusable application instance - though you could have
>>>> multiple instances of the same application running.  A man named Robert
>>>> Tomlinson wrote the query compiler, and  used some of his work on File
>>>> Handler for his advanced degree theses (at least his PhD, for certain).
>>>> I was an EE student at the same time he was at U. Wisconsin, though our
>>>> paths never crossed.
>>>>
>>>> Florida DOT acquired the code from us in the late 1970's, and
>>>> established it, with some assistance from Wisconsin DOT staff, as their
>>>> motor vehicle and/or driver database system for many years as well.  I
>>>> have a copy of that instance of the code, and have run the thing under
>>>> Hercules, just for giggles.  Wisconsin DOT retired the last vestige of
>>>> it just last year - it had a 40 year run, all told.
>>>>
>>>> JRJ
>>>>
>>>> On 10/2/2015 1:38 PM, Mike Stein wrote:
>>>>> Is that the way it's done these days, e.g. the contents of the
>>>>> Location
>>>>> field in three places, Location and Manual_Type only containing one
>>>>> field, no keys other than Manual_Key etc.?
>>>>>
>>>>> Looks like I'll have to brush up on database design... ;-)
>>>>>
>>>>> m
>>>>>
>>>>>
>>>>> ----- Original Message ----- From: "Jay Jaeger" <cube1 at charter.net>
>>>>> To: <cctalk at classiccmp.org>
>>>>> Sent: Friday, October 02, 2015 11:59 AM
>>>>> Subject: Re: Thoughts on manual database design?
>>>>>
>>>>>
>>>>>> On 10/2/2015 12:04 AM, william degnan wrote:
>>>>>>> Coming up with a schema that works with multiple manufacturers is
>>>>>>> the
>>>>>>> big
>>>>>>> challenge.
>>>>>>>
>>>>>>
>>>>>> Not sure it is that big a challenge. Perfection is not required. 
>>>>>> Just
>>>>>> the ability to find stuff later.  My schema currently has manual
>>>>>> manufacturer - the original manufacturer of the machine, and then
>>>>>> each
>>>>>> artifact (copy of a manual) has a publisher.
>>>>>>
>>>>>> Consider the case of Apollo which got bought by HP.
>>>>>>
>>>>>> For a DNxxxx machine, the machine manufacturer is always Apollo. 
>>>>>> For a
>>>>>> 400 or 700 series, the manufacturer is always HP.  However a given
>>>>>> copy
>>>>>> of a manual may have been published by Apollo (older) or HP (newer) -
>>>>>> with the very same number.  The schema supports that.
>>>>>>
>>>>>> (New schema posted at
>>>>>> http://webpages.charter.net/thecomputercollection/misc/manualmodel.pdf
>>>>>>
>>>>>> ) .
>>>>>>
>>>>>> JRJ
>>>>>>
>>>>>
>>>>>
>>>
>>>
> 
>