Text encoding Babel. Was Re: George Keremedjiev

Keelan Lightfoot keelanlightfoot at gmail.com
Fri Nov 30 12:34:21 CST 2018


> Welcome.  :-)

Thanks!

> Do you think that we stopped enhancing the user input experience more
> because we were content with what we had or because we didn't see a
> better way to do what we wanted to do?

Both. In the beginning we were content, because the keyboard was well
suited to the capabilities of the technology available at the time it
was invented. We didn't see a better way, because when compared to
using a pen and paper (for writing) or using toggle switches (to
control a computer), a keyboard was a significant improvement. It's
the explosive growth and universal adoption of computers that has
locked us into the keyboard as the standard.

> I agree that markup languages are a kludge.  But I don't know that they
> require plain text to describe higher level concepts.
>
> I see no reason that we can't have new control codes to convey new
> concepts if they are needed.

I disagree with this; from a usability standpoint, control codes are
problematic. Either the user needs to memorize them, or software needs
to inject them at the appropriate times. There are technical problems
too; when it comes to playing back a stream of characters, control
characters mean that it is impossible to just start listening partway
through. It is difficult to fast forward and rewind in a file, because
the only way to determine the current state is to replay the file up
to that point.
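
A rough sketch of the replay problem, in Python. The two control code
values are made up for illustration and do not come from any standard;
the function names are hypothetical too. In the modal stream, finding
out whether a given character is bold means scanning everything before
it. If each character carries its own style, no replay is needed.

```python
# Hypothetical modal control codes, chosen only for this illustration.
BOLD_ON = "\x01"
BOLD_OFF = "\x02"


def bold_state_at(stream, index):
    """Return True if the character at `index` of a modal stream is bold.

    The state is not stored anywhere, so every control code before
    `index` has to be replayed to find out.
    """
    bold = False
    for ch in stream[:index]:
        if ch == BOLD_ON:
            bold = True
        elif ch == BOLD_OFF:
            bold = False
    return bold


def to_self_describing(stream):
    """Convert a modal stream into a list of (character, is_bold) pairs."""
    pairs = []
    bold = False
    for ch in stream:
        if ch == BOLD_ON:
            bold = True
        elif ch == BOLD_OFF:
            bold = False
        else:
            pairs.append((ch, bold))
    return pairs


modal = "plain " + BOLD_ON + "bold" + BOLD_OFF + " plain"
pairs = to_self_describing(modal)

# Modal stream: must scan from the start of the stream.
print(bold_state_at(modal, 8))
# Self-describing pairs: the answer is stored with the character.
print(pairs[7])
```

With the pairs you can start reading at any position, which is the
property the modal stream lacks.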

> Aside:  ASCII did what it needed to do at the time.  Times are different
> now.  We may need more / new / different control codes.
>
> By control codes, I'm meaning a specific binary sequence that means a
> specific thing.  I think it needs to be standardized to be compatible
> with other things -or- it needs to be considered local and proprietary
> to an application.

Do you mean modal control codes? As in "everything after here is bold"
and "the bold stops here"?

> I actually wonder how much need there is for /all/ of those utilities.
> I expect that things should have streamlined and simplified, at least
> some, in the last 30 years.

We've gone backwards, sadly. For a brief while, this kind of rich user
interface stuff was provided by the OS. A text box, regardless of the
application, would use the OS's text box control, and would have a
universal interface for rich text. But the growth of the web has
resulted in an atavism. We're back to plain text, and using markup to
style our text. If I want bold text in Slack, I have to use markup.
Facebook Messages and YouTube comments also support markup, but the
syntax is slightly different between them. Back in 1991, if I wanted
bold text in any application that supported rich text on my SE/30, I
hit command-B and I got bold text. Sure, there are JavaScript rich
text editors that can be bolted on, but they all have their own UI
concepts, and they're all a trainwreck.

> What would you like to do or see done differently?  Even if it turns out
> to be worse, it would still be something different and likely worth
> trying at least once.

In addition to crusty old computers, I also enjoy the company of three
crusty old Linotypes. In fact, that's what got me thinking about this
stuff in the first place. The Linotype keyboard has 90 keys, which
directly map to the 90 glyphs a Linotype can "render". The keyboard is
laid out in three equal-sized sections: lowercase letters on the left,
uppercase on the right, with numbers and punctuation in the middle.
Push the button, and what's marked on the button is what ultimately
ends up on the page. Each Linotype mat (matrix; letter mold) has two
positions, which can be selected by flipping a little lever when
they're being assembled into a line. The two positions are almost
always used to select between two versions of a font; roman/bold or
roman/italic are the most common pairings.

But what it means is that you can walk up to a machine with a
half-typed line in the assembler and immediately determine its state.
Any mats set in the bold position are in a physically different
position in the assembler. The position of the switch tells you if
you're typing in bold or roman. When you push the 'A' key, you know an
uppercase 'A' in bold will be added to the line. Additionally, the
position of that switch can be verified without taking your eyes off
of the copy. There is no black magic, no spooky action at a distance.
The capabilities of the machine are immediately apparent.

> I don't think of bold or italic or underline as second class concepts.
> I tend to think of the following attributes that can be applied to text:
>
>   · bold
> [snip]
>
> I don't think that normal is superior to the other four (five) in any
> way.  I do think that normal does occur VASTLY more frequently than the
> any combination of the others.  As such normal is what things default to
> as an optimization.  IMHO that optimization does not relegate the other
> styles to second class.

I agree. I think that they're normal enough that they should exist as
their own code points in Unicode. Our 'standard' coding treats
'formatting' as optional. IOW, I agree more!

> I will say that some people probably decided what a minimum viable
> product is when selling typewriters, and consciously chose to omit the
> other options.

Consciously omitted in the beginning, yes, otherwise typewriters would
have never been affordable enough to become mainstream. But that has
led to "plain text" becoming the de facto standard. It's 2018, and I
can't type italic text in this e-mail without potentially causing some
people problems, 𝘣𝘶𝘵 𝘐'𝘮 𝘸𝘪𝘭𝘭𝘪𝘯𝘨 𝘵𝘰 𝘨𝘪𝘷𝘦 𝘪𝘵 𝘢
𝘵𝘳𝘺.
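
The italic text above is not markup: each letter is a separate code
point from the Unicode Mathematical Alphanumeric Symbols block. Here is
a minimal Python sketch of the mapping, assuming only ASCII letters
need converting. The function name to_sans_italic is made up for
illustration.

```python
import unicodedata

# Start of the sans-serif italic alphabets in the
# Mathematical Alphanumeric Symbols block.
SANS_ITALIC_UPPER_A = 0x1D608  # MATHEMATICAL SANS-SERIF ITALIC CAPITAL A
SANS_ITALIC_LOWER_A = 0x1D622  # MATHEMATICAL SANS-SERIF ITALIC SMALL A


def to_sans_italic(text):
    """Replace ASCII letters with their sans-serif italic code points.

    Digits, punctuation and spaces are passed through unchanged.
    """
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(SANS_ITALIC_UPPER_A + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(SANS_ITALIC_LOWER_A + ord(ch) - ord("a")))
        else:
            out.append(ch)
    return "".join(out)


original = "but I'm willing to give it a try."
styled = to_sans_italic(original)

# NFKC normalization folds the styled letters back to plain ASCII.
plain = unicodedata.normalize("NFKC", styled)

print(styled)
print(plain)
```

The NFKC step is how software that only wants plain text can recover
it; whether a given mail client or font renders the styled letters at
all is a separate matter.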

> I see no reason that the keyboard can't have keys / glyphs added to it.

I agree. But if they're added as a touch screen, shoot me now.
"Haptics" has mutated into "shakes when you touch it", instead of "you
can feel the button".

> I'm personally contemplating adding additional keys (via an add on
> keyboard) that are programmed to produce additional symbols.

I have the little DigiStump Arduino thingy somewhere that I bought to
use for exactly that purpose! My goal is to create a Linotype-style
keyboard, with the middle bank of 30 keys tailored to my application,
as was often done with the Linotype. I have one Linotype with dedicated
E13B keys for setting the magnetic ink characters at the bottom of
cheques.

> When I say frequently, I mean that I use some of them daily, many of
> them weekly, and others monthly.  I've written tiny shell scripts that I
> can run that insert the character into the clipboard so that I can
> easily paste the character where I need them.  (The order of the symbols
> above comes from the alphabetical nature of their names.)

I find that having the extra glyphs readily available means that I use
them more in everyday communication; I use the trademark symbol (which
is only slightly buried on a Mac keyboard) quite often to convey a
sense of sarcasm (i.e. "sure, we could do that, but there is no One
True Way™, regardless of what the sales people told you...").

> So … I'm in favor of extending the keyboard.  :-)

When my Linotype 2000 keyboard enters production, I'll let you know ;)

> We need a way to tell the computer that something is a comment.  One
> method is to use markup / control key sequences.  Another method is to
> use what Google Docs uses, namely highlight text to comment, click the
> comment balloon, enter the comment in the comment box.  There's no key
> sequence.  But there is an explicit indication that something is a
> comment.  Said comment is displayed in the right hand margin, so I know
> it's a comment.

What if comment characters had their own Unicode code points? A bit
silly, yes, but those are the lines I'm thinking along. It would allow me
to put comments right inside my code if I found myself stricken with
such a desire to produce prodigiously incomprehensible programming!

> I believe that some language treat strings, arrays, hashes, etc. as
> separate namespaces.  Thus you must indicate which namespace you are
> meaning.  I don't see any reason that we can't have a common namespace
> and allow the named entity to include information about what type of
> entity it is.  —  This seems like a programming language design issue,
> not a human keyboard issue.
>
> Or did I completely misinterpret what you meant by strings?

I'm going to lavish on the Unicode for this example, so those of you
properly unequipped may not see this example:

foo := 𝑡ℎ𝑖𝑠 𝑖𝑠 𝑎 𝑠𝑡𝑟𝑖𝑛𝑔 𝘁𝗵𝗶𝘀 𝗶𝘀 𝗮 𝗰𝗼𝗺𝗺𝗲𝗻𝘁
printf(𝑡ℎ𝑒 𝑠𝑡𝑟𝑖𝑛𝑔 𝑖𝑠 ① 𝑖𝑠𝑛𝑡 𝑡ℎ𝑎𝑡 𝑒𝑥𝑐𝑖𝑡𝑖𝑛𝑔, foo)
if 𝘁𝗵𝗶𝘀 𝗶𝘀 𝗮 𝗽𝗼𝗼𝗿𝗹𝘆 𝗽𝗹𝗮𝗰𝗲𝗱 𝗰𝗼𝗺𝗺𝗲𝗻𝘁 foo ==
𝑡ℎ𝑖𝑠 𝑖𝑠 𝑎𝑙𝑠𝑜 𝑎 𝑠𝑡𝑟𝑖𝑛𝑔, 𝑏𝑢𝑡 𝑛𝑜𝑡 𝑡ℎ𝑒 𝑠𝑎𝑚𝑒
𝑜𝑛𝑒 { 𝘁𝗵𝗶𝘀 𝗶𝘀 𝗮𝗹𝘀𝗼 𝗮 𝗰𝗼𝗺𝗺𝗲𝗻𝘁
...

An atrocious example, but a good demonstration of my point. If I had a
toggle switch on my keyboard to switch between code, comment and
string, it would have been much simpler to construct too!
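
For what it's worth, a lexer for this scheme would need no quote or
comment delimiters at all. It only has to look at which alphabet each
character comes from. This Python sketch makes some assumptions:
strings use the Mathematical Italic letters (plus U+210E, which Unicode
uses for italic small h), comments use the Mathematical Sans-Serif Bold
letters, and everything else is code. The helper names kind, tokenize
and style are invented for this example.

```python
import unicodedata

ITALIC_LOWER_A = 0x1D44E     # MATHEMATICAL ITALIC SMALL A
SANS_BOLD_LOWER_A = 0x1D5EE  # MATHEMATICAL SANS-SERIF BOLD SMALL A


def kind(ch):
    """Classify one character as 'space', 'string', 'comment' or 'code'."""
    cp = ord(ch)
    if ch.isspace():
        return "space"
    # Mathematical Italic A..z, plus the Planck constant standing in for h.
    if 0x1D434 <= cp <= 0x1D467 or cp == 0x210E:
        return "string"
    # Mathematical Sans-Serif Bold A..z.
    if 0x1D5D4 <= cp <= 0x1D607:
        return "comment"
    return "code"


def tokenize(source):
    """Split source into (kind, text) runs based only on code points."""
    tokens = []
    current_kind, current_text = None, ""
    for ch in source:
        k = kind(ch)
        if k == "space":
            # A plain space has no style, so keep it with the current run.
            current_text += ch
            continue
        if k != current_kind:
            if current_text.strip():
                tokens.append((current_kind, current_text.strip()))
            current_kind, current_text = k, ""
        current_text += ch
    if current_text.strip():
        tokens.append((current_kind, current_text.strip()))
    return tokens


def style(text, lower_a):
    """Shift ASCII lowercase letters into the alphabet starting at lower_a."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            cp = lower_a + ord(ch) - ord("a")
            # U+1D455 is a reserved gap; italic small h lives at U+210E.
            out.append("\u210E" if cp == 0x1D455 else chr(cp))
        else:
            out.append(ch)
    return "".join(out)


line = ("foo := " + style("this is a string", ITALIC_LOWER_A)
        + " " + style("this is a comment", SANS_BOLD_LOWER_A))
tokens = tokenize(line)

# Fold the styled text back to ASCII so the result is easy to read.
readable = [(k, unicodedata.normalize("NFKC", t)) for k, t in tokens]
print(readable)
```

The space handling shows a weak spot in the idea: an unstyled space
between two words of a string belongs to neither alphabet, so the
sketch has to guess which run it goes with.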

> I will concede that many computers and / or programming languages do
> behave based on text.  But I am fairly confident that there are some
> programming languages (I don't know about computers) that work
> differently.  Specifically, simple objects are included as part of the
> language and then more complex objects are built using the simpler
> objects.  Dia and (what I understand of) Minecraft come to mind.

I don't deny that they exist, but there are no significant
applications being developed with them.

> Visual Basic keeps coming to mind as I read your paragraph.  (From what
> I understand) VB is inherently visually / graphically oriented.  You
> start with the visual components, and then assign actions to various
> pieces.  Those actions are likely text.  But I see no reason that you
> couldn't extend the model to execute other things.

I'm not sure how VB puts things together behind the scenes. The
example I am more familiar with is HyperCard, where instead of the UI
existing in the code, the code exists in the UI as you describe. This
of course violates all the religious tenets of Model-View-Controller
design. (Dogmatic adherence to that pattern has mainly served to give
us government software projects that die as multi-trillion dollar
failures before seeing the light of day. I am of course being a bit
facetious, but not entirely.) Back on topic, the
tools exist, but they are often seen as toys and not serious software
development tools. Are we at the point where the compiler for a visual
programming language is written in the visual programming language?

- Keelan


More information about the cctalk mailing list