Ouch, what was I thinking? Mentioning a project I fundamentally can't talk in detail
about yet; not very smart.
Thus spawning a thread guaranteed to go chaotic. Sorrrry!
Also I've changed the title, since it's disrespectful to drag a deceased
person's name along with this.
I've been busy for a couple of days and didn't have time to follow the thread. Still
busy, but here are brief replies to some extracts:
@ Keelan Lightfoot
Our problem isn't ASCII or Unicode, our problem
is how we use computers.
Markup languages are a kludge, relying on plain text to describe higher level concepts.
[snip lots]
Nice post, and I agree with all of it. This is the type of thinking needed, and in general
much like my approach. Except I'm a software and hardware designer, synthesist, and
pursue practical results. Or at least _try_ to.
Funny you mention keyboards, as that's one of the project's bootstrapping steps.
First a simulated keyboard (html & js initially) to allow free experimentation, later
an open hardware design suitable for makers, 3D printing, etc. The crappiness of
commercial keyboards is a bugbear of mine. Keyboards should be MUCH better than they are.
And last forever.
@ Grant Taylor & Toby Thain
- bold
- italic
- overline
- strike through
- underline
- superscript exclusive or subscript
- uppercase exclusive or lowercase
- opposing case
- normal (none of the above)
This covers only a small fraction of the
Latin-centric typographic
palette - much of which has existed for 500 years in print (non-Latin
much older). Computerisation has only impoverished that palette, and
this is how it happens: Checklists instead of research.
Work with typographers when trying to represent typography in a
computer. The late Hermann Zapf was Knuth's close friend. That's the
kind of expertise you need on your team.
More generally, an encoding standard needs to allow for ANY kind of present and future
characters, fonts and modifiers.
But even more critically, it has to allow for such things without reference to
'central standards groups'. Enforced centralism is poison. For instance Unicode,
and that vast table of symbols - which still doesn't include decent arrows (among many
other gaps). What's required is a way for any bunch of people to be able to define
their own character sets, fonts, adornments, etc, create definition files for them, and
use those among themselves. Either embedded in documents or used as referenced defaults -
both must be possible. It is easy enough to define a base encoding that allows this. And
in which legacy coding (ASCII, Unicode, etc) is one of the available defaults.
The point with embedding such capabilities in the base coding scheme, and then building
the superstructure of computing language and OS on top of that, is to achieve a scheme in
which human language and typesetting freedom is available through the entire structure.
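Just to make the lookup-indirection idea concrete, here's a toy in Python (a made-up
illustration only - emphatically NOT the actual scheme, whose details I'm keeping quiet
about). One legacy control code (ESC here) introduces a reference to a user-defined
table, and subsequent bytes are decoded through that table until it's switched again:

    # Toy only: ESC <table-id> selects a user-published definition table.
    # Table contents and IDs here are invented for illustration.
    TABLES = {
        0x00: {b: chr(b) for b in range(0x20, 0x7F)},  # legacy ASCII default
        0x01: {0x41: '\u2192', 0x42: '\u21D2'},        # someone's arrow set
    }
    ESC = 0x1B

    def decode(data: bytes) -> str:
        """Decode bytes, switching lookup tables on ESC <table-id>."""
        table, out, i = TABLES[0x00], [], 0
        while i < len(data):
            if data[i] == ESC and i + 1 < len(data):
                table = TABLES[data[i + 1]]    # switch to the referenced table
                i += 2
            else:
                out.append(table.get(data[i], '\ufffd'))  # undefined -> U+FFFD
                i += 1
        return ''.join(out)

    print(decode(b'x \x1b\x01A\x1b\x00 y'))    # -> 'x \u2192 y'

The point is only that the lookup indirection keeps the base coding tiny while leaving
the character repertoire open-ended; the tables themselves can be published, embedded in
documents, or referenced as defaults.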
@ Cameron Kaiser
> Surely a Chinese or Japanese based programming language could be
> developed.
The Tomy Pyuuta has a very limited BASIC variant called G-BASIC which has
Japanese keywords and is programmed with katakana characters (such as "kake" ...
Exactly, except it should be possible for any group (e.g. speakers of any given language)
to modify an existing computer language to their own human dialect. With compilers and
assemblers this is not trivial, but with dictionary-based interpreters it's much
easier. The keywords and operators are all just looked up in tables to achieve effects,
and what characters or ideograms serve as the keywords are entirely flexible.
Then imagine one interpreted scripting language that serves multiple functions: document
layout, user apps and OS scripting. And that scripting language can be phrased in any
human language, AND includes full typesetting of itself.
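To show how cheap the keyword-table trick is, a toy in Python (all names and spellings
invented for illustration; the katakana keyword just echoes the G-BASIC example above):

    # Toy dictionary-based interpreter: swap the keyword table and the same
    # core speaks a different human language.
    def do_print(arg):
        print(arg)

    ACTIONS = {'print': do_print}          # canonical name -> effect

    KEYWORDS_EN = {'print': 'print'}       # surface spelling -> canonical name
    KEYWORDS_JA = {'カケ': 'print'}        # katakana "kake", a la G-BASIC

    def run(line, keywords):
        word, _, rest = line.partition(' ')
        ACTIONS[keywords[word]](rest)      # pure table lookup, core unchanged

    run('print hello', KEYWORDS_EN)        # -> hello
    run('カケ hello', KEYWORDS_JA)         # -> hello

Re-skinning the language for another dialect means shipping a new keyword table, not a
new interpreter.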
@ Liam Proven
There are a wider panoply of options to consider.
...
Try to collapse all these into one and you're doomed.
Lots of great references, thanks! As for doomed... well, we'll see. I think the trick
is merely to provide a mechanism for including extensible classes of 'stuff' in
the base coding. Being rigid about the mechanics of the higher level capabilities
really is fatal. Fortunately, 'flexible extensibility' isn't so hard to do.
Especially when you have a bunch of disused legacy control codes to work with.
At 02:34 PM 28/11/2018 -0700, Jim Manley wrote:
Some computing economics history:
I'm an engineer and scientist by both education and experience,
[snip]
A theoretically "superior" encoding may
not see practical use by a significant number of people because of legacy
inertia that often makes no sense, but is rooted in cultural, sociological,
emotional, and other factors, including economics.
Yep. I'm intensely aware of the economics and inertia factors. Points:
1. The ASCII-replacement coding is just a part of a wider project.
2. It's all a private project, for fun.
3. And yet there's a convergence of developments suggesting an opportunity in the near
future.
MS/Intel are bastardizing, backdooring and box-closing the Wintel platform into something
so evil even non-technical people are getting sick of it. This will continue, due to the
political agendas of MS/Intel.
Simultaneously the competing Linux world is fragmenting into churn-chaos. (Complex but
irreversible reasons.)
Apple is... Apple. Becoming a platform based mostly on virtue signalling, and
increasingly as bad as Wintel.
4. If it ever is released, it will be freeware, open hardware and copylefted. DRM
specifically banned from the platform. With many quite appealing wow-factors, several of
which will be totally killer. It is not politically possible for MS/Intel/Apple to follow
this path.
[snip]
Logic and reasoning are
simply nowhere near enough to create the conditions necessary for
widespread adoption - sometimes it's just good luck in timing (or, bad
luck, as the case may be).
Absolutely. It's mostly about politics and meme-crafting. Ref: Marx, L Ron Hubbard,
Mao, various religions, etc. Odd, isn't it - so few instances of memetic weavers who
used their skills for the benefit of humankind. As opposed to those guys above, who
were all arseholes with pretty twisted objectives. Did you know L Ron Hubbard created
Scientology to win a drunken bet in a bar? Someone said "I bet you can't create a
religion!" And L Ron said "I bet I can!"
ASCII was developed in an age when Teletypes ...
Yep.
You can't blame the ASCII developers for lack of foresight when no one in
their right mind back then would have ever predicted we could have upwards
of a trillion bytes of memory in our pockets ...
Absolutely. ASCII was a godsend at the time and I take pains to make this clear in the
proposal docs. This is a _hindsight_ refactoring.
Someone thinking that they're going to make oodles of money from some
supposedly new-and-improved proprietary encoding "standard" that discards
five-plus decades of legacy intellectual and economic investment, is
pursuing a fool's errand.
Ha ha, I don't intend to even try to make any money from this. Other objectives.
Though, I'd probably set up a donations channel. Just in case people like it.
Even companies with resources at the level of Apple, Google, Microsoft,
etc., aren't that arrogant, and they've demonstrated some pretty heavy-duty
chutzpah over time. BTW, you won't be able to patent what apparently
amounts to a lookup table, and even if you copyright it,
Patents and copyright are poisons that are crippling intellectual and technological
progress. The original concepts were OK, but they were over-extended by greed (and it's
still getting worse.) Patents in particular have become a tool for big corporations to
suppress any potential competition, while copyright is used to destroy free expression.
The entire DRM/copyright legal framework should be nullified.
This project will intentionally exclude copyright and patents. Freeware, published,
open source, open hardware, etc. Just a conformance symbol, which certifies (among other
things) that _nothing_ in the systems & software is under any kind of DRM restriction.
People buy or build such a system, they own it entirely.
This is why I can't mention details or coined terminology now.
True standards are open nowadays - the days of
proprietary "standards" are
Except that by 'open' they usually mean you can pay a lot of money for a copy of
the standard doc.
That's not what I call 'open.'
a couple of decades behind us - even Microsoft has been publishing the
binary structure of their Office document file formats. The specification
for Word, that includes everything going back to v 1.0, is humongous, and
even they were having fits trying to maintain the total spec, which is
reportedly why they went with XML to create the .docx, .xlsx, .pptx, etc.,
formats. That also happened to make it possible to placate governments
(not to mention customers) that are looking for any hint of
anti-competitive behavior, and thus also made it easier for projects such
as OpenOffice and LibreOffice to flourish.
Typographical bigots, who are more interested in style than content, were
safely fenced off in the back rooms of publishing houses and printing
plants until Apple released the hounds on an unsuspecting public. I'm
actually surprised that the style purists haven't forced Smell-o-Vision
technology on The Rest of Us to ensure that the musty smell of old books is
part of every reading "experience" (I can't stand the current common use of
that word). At least I have the software chops to transform the visual
trash that passes for "style" these days into something pleasing to _my_
eyes (see what I did there with "severely-flawed" ASCII? Here's how you
can do /italics/ and !bold! BTW.).
Oh yes, tell me about it. 'Do it this way' bigots of all kinds. Pick any possible
thing that can be done more than one way, and there will be camps of fanatics insisting
their one way is the true way and all others are crazy.
Finding such artificial dichotomies (or n-way splits) has been a very rich source of
inspiration for holistic rethinking.
Btw, again I'll emphasize that when I say ASCII is severely flawed, I mean this in the
context of what we know now about information coding requirements and creating extensible
systems. It wasn't 'severely flawed' back when it was created.
Nothing frosts me more than reading text that can't be resized and
auto-reflowed, especially on mobile devices with extremely limited display
real estate. I'm fully able-bodied and I'm perturbed by such bad design,
real estate. I'm fully able-bodied and I'm perturbed by such bad design,
so, I'm pretty sure that pages that prevent pinch-zooming, and that don't
allow for direct on-display text resizing/auto-reflow, violate the spirit
completely, if not virtually all of the letters, of the Americans with
Disabilities Act (and similar legislation outside the U.S., I imagine).
Well, there's more than that one requirement. If one wants to capture a historical
document, the absolute image of the page(s) is a core aspect, and can't be
'reflowed'. But otoh, the text content should be accessible as a searchable and
reflowing character stream. A decent coding scheme will support both objectives
simultaneously.
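Roughly, the dual representation amounts to something like this (a sketch with invented
field names, not a spec):

    # Each captured page pairs an exact facsimile with a reflowable text layer.
    from dataclasses import dataclass, field

    @dataclass
    class CapturedPage:
        image_ref: str      # the absolute page image, e.g. a scan file
        text: str           # searchable, reflowable character stream
        layout: dict = field(default_factory=dict)  # word -> bounding box

    page = CapturedPage(image_ref='scan_0042.png',
                        text='Fold-out schematic, sheet 3 of 7 ...',
                        layout={'schematic': (120, 80, 310, 110)})
    assert 'schematic' in page.text   # searchable despite the fixed image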
Btw I'm constantly amazed by how badly tech docs are being 'digitized' even
now. Service manuals with fold-out schematics, screened tonal multi-colour illustrations,
etc... just endless awful digital copy fails. Meanwhile the original paper copies get
rarer and rarer, because idiots think 'those are all online now, paper copies are
obsolete', and throw them out.
@ Keelan Lightfoot
from a usability standpoint, control codes are
problematic. Either the user needs to memorize them, or software needs
to inject them at the appropriate times.
You're thinking of 'control codes' as something you type by holding down CTRL
and some other key. Yes, those are a pain, and I personally hate UIs that depend on
memorising lots of them.
But strictly speaking, 'control codes' are the byte codes 0x00 to 0x1F in the
ASCII table, most of which are now little used apart from in hardware protocols. How those
would be brought into use in an ASCII-replacement and new UI is another topic. Sadly,
part of the area I won't talk about. Just bear in mind that this system includes new
keyboard designs, and 'things that have to be memorised' are fine for some people
but not for others (including me.)
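For reference, the range in question (plain ASCII facts; this snippet implies nothing
about the new scheme):

    # The C0 control codes occupy 0x00..0x1F; DEL (0x7F) is also a control.
    def is_control(b: int) -> bool:
        return b <= 0x1F or b == 0x7F

    assert is_control(0x1B)       # ESC
    assert not is_control(0x41)   # 'A'
    print([hex(b) for b in range(0x80) if is_control(b)])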
Ha ha, even ctrl-C and ctrl-V for copy and paste are a pain, not because they must be
memorised, but because the ergonomics of distorting the fingers to type them are horrible
for such a common action. Stuff like this...
Oh, and if you are wondering whether I'm imagining some huge keyboard with even more keys:
no. Personally I use a short ('tenkeyless') keyboard, and don't want to ever
have to go back to stupidly big keyboards.
In addition to crusty old computers, I also enjoy the company of three
crusty old Linotypes. In fact, that's what got me thinking about this
stuff in the first place.
Ah, I am intensely jealous! I wish I could find an old but working Linotype. And someone
to teach me how to use it. Hot lead, yeah! (I used to cast things in lead as a child, have
done bronze casting and intend to do more.)
I have some exposure to typesetting & printing; enough to know how much I don't
know. Some articles on related topics are in-progress, but not yet posted.
Anyway, back on topic (classic computing.) Here's an ASCII chart with some control
codes highlighted.
http://everist.org/ASCII/ascii_reuse_legend.png
I'm collecting all I can find on past (and present) uses of the control codes,
especially the ones highlighted in orange. I'm not having a lot of success in finding
detailed explanations, beyond very brief summaries in old textbooks.
Note that I'm mostly interested in code interpretations in communications protocols.
Their use in local file encodings matters less, since those are the domain of legacy
application software and wouldn't clash with redefinition of what the codes do in
future applications.
And now, back to machining a lock pick for a PDP-8/S front panel cylinder lock.
http://everist.org/NobLog/20181104_PDP-8S.htm#locks
Guy