It was thus said that the Great Keelan Lightfoot via cctalk once stated:
I see no reason that we can't have new control codes to convey new
concepts if they are needed.
I disagree with this; from a usability standpoint, control codes are
problematic. Either the user needs to memorize them, or software needs
to inject them at the appropriate times. There are technical problems
too; when it comes to playing back a stream of characters, control
characters mean that it is impossible to just start listening. It is
difficult to fast forward and rewind in a file, because the only way
to determine the current state is to replay the file up to that point.
[ and further down the message ... ]
I'm going to lavish on the unicode for this example, so those of you
properly unequipped may not see this example:
foo := ???? ?? ? ?????? ???? ?? ? ???????
printf(??? ?????? ?? ? ???? ???? ????????, foo)
if ???? ?? ? ?????? ?????? ??????? foo ==
???? ?? ???? ? ??????, ??? ??? ??? ????
??? { ???? ?? ???? ? ???????
...
An atrocious example, but a good demonstration of my point. If I had a
toggle switch on my keyboard to switch between code, comment and
string, it would have been much simpler to construct too!
Somehow, the compiler will have to know that "???? ?? ? ??????" is a
string while "???? ?? ? ???????" is a comment to be ignored. You lamented
the lack of a toggle switch for the two, but existing languages, like C,
already have them: '"' is the "toggle" for strings, while '/*' and '*/' are
the toggles for comments (and now '//' if you are using C99). It's still
something you have to "type" (or "toggle" or "switch" or somehow indicate
the mode).
The other issue is how such information is stored, and there, I only see
two solutions---in-band and out-of-band. In-band would be included with the
text. Something along the lines of (where <ESC> is the ASCII ESC character
27, and this is an example only):
foo := <ESC>_this is a string<ESC>\ <ESC>^this is a comment<ESC>\
printf(<ESC>_the string is <ESC>[1p isn't that exciting<ESC>\,foo)
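To make the in-band encoding concrete, here is a minimal Python sketch that reports which mode a given character offset is in. The name mode_at is made up for illustration, and treating <ESC>[...p as a CSI-style sequence that is simply skipped is my assumption about the example, not an established convention:

```python
ESC = "\x1b"

def mode_at(text, offset):
    """Return 'code', 'string' or 'comment' for the character at `offset`.

    Scans from the start of the text, tracking every mode change on the way.
    """
    mode = "code"
    i = 0
    while i < offset and i < len(text):
        if text[i] == ESC and i + 1 < len(text):
            nxt = text[i + 1]
            if nxt == "[":
                # Assumed CSI-style sequence: skip parameters up to and
                # including the final byte (0x40-0x7E); mode is unchanged.
                i += 2
                while i < len(text) and not ("@" <= text[i] <= "~"):
                    i += 1
                i += 1
                continue
            if nxt == "_":
                mode = "string"      # <ESC>_ opens a string
            elif nxt == "^":
                mode = "comment"     # <ESC>^ opens a comment
            elif nxt == "\\":
                mode = "code"        # <ESC>\ closes either one
            i += 2
            continue
        i += 1
    return mode

line = ("foo := " + ESC + "_this is a string" + ESC + "\\ "
        + ESC + "^this is a comment" + ESC + "\\")
print(mode_at(line, line.index("comment")))  # comment
```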
But this has a problem you noted above---it's a lot harder to seek through
the file to arbitrary positions. Grant Taylor stated another way of doing
this:
What if there were (functionally) additional bits that indicated various
other (what I was calling) stylings?
I think that something along those lines could help avoid a concern I
have. Namely, how do I search for an A, whatever "style" it's in. I
think I could hypothetically search for bytes ~> words (characters)
containing (xxxxxxxx xxxxxxxx) (xxxxxxxx) 01x00001 (assuming that the
preceding don't cares are set appropriately) and find any format of A,
upper case, lower case, bold, italic, underline, strike through, etc.
There are several problems with this. One, how many bits do you set aside
per character? 8? 16? There is potentially an open-ended set of stylings
that one might use. Second problem---where do you store such bits? Not to
imply this is a bad idea, just that there are issues that need to be
resolved with how things are done today (how does this interact with UTF-8
for instance? Or UCS-4?).
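The masked search Grant describes can be sketched in a few lines of Python. The layout here is purely hypothetical (ASCII in the low byte, one style flag per bit from bit 8 upward), and the names BOLD, ITALIC, UNDERLINE, styled and find_letter_a are invented for the example; it sidesteps the UTF-8 question entirely:

```python
# Hypothetical style flags stored above the low (character) byte.
BOLD = 1 << 8
ITALIC = 1 << 9
UNDERLINE = 1 << 10

def styled(ch, style=0):
    """Pack an ASCII character and its style bits into one integer cell."""
    return style | ord(ch)

def find_letter_a(cells):
    """Return the indexes of every 'A' or 'a', whatever its style bits.

    The mask 0xDF keeps the low byte except bit 5 (the ASCII case bit)
    and discards every style bit, so the pattern 01x00001 becomes 0x41.
    """
    return [i for i, cell in enumerate(cells) if cell & 0xDF == 0x41]

cells = [
    styled("b"),
    styled("A", BOLD),
    styled("a", ITALIC | UNDERLINE),
    styled("A"),
    styled("c"),
]
print(find_letter_a(cells))  # [1, 2, 3]
```

The search itself is cheap; the open questions above (how many bits, and where they live) are what the sketch assumes away.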
Then there's out-of-band storage, which stores such information outside the
text (an example---I'm not saying this is the only way to store such
information out-of-band):
foo := this is a string this is a comment
printf(the string is 1 isn't that exciting,foo)
---
string 8-23
string 50-63
string 65-84
replacement 64
comment 25-41
This has its own problems---namely, how do you keep the two together. It
will either be a separate file, which could get separated, or part of the
text file but then you run into the problem of reading Microsoft Word files
circa 1986 with today's tools.
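As a rough Python sketch of reading such an out-of-band table: I am assuming the offsets in the example are 1-based, inclusive, and count the newline as one character (that reading does line up with the text as given). The names TEXT, ANNOTATIONS, span and kind_at are invented for illustration:

```python
TEXT = ("foo := this is a string this is a comment\n"
        "printf(the string is 1 isn't that exciting,foo)\n")

# (kind, first, last) with 1-based inclusive character offsets (assumed).
ANNOTATIONS = [
    ("string", 8, 23),
    ("string", 50, 63),
    ("string", 65, 84),
    ("replacement", 64, 64),
    ("comment", 25, 41),
]

def span(text, first, last):
    """Return the characters covered by a 1-based inclusive range."""
    return text[first - 1:last]

def kind_at(pos, annotations=ANNOTATIONS):
    """Return the annotation kind covering 1-based position `pos`.

    Only the annotation table is consulted; the text itself is never
    scanned. Anything not covered by a range is plain code.
    """
    for kind, first, last in annotations:
        if first <= pos <= last:
            return kind
    return "code"

print(span(TEXT, 25, 41))  # this is a comment
print(kind_at(30))         # comment
```

Any edit to the text shifts every later offset, so the table has to be rewritten along with it, which is one more way the two can drift apart.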
-spc (I like the ideas, but the implementations are harder than they first
appear ... )