Text encoding Babel. Was Re: George Keremedjiev

Grant Taylor cctalk at gtaylor.tnetconsulting.net
Sun Nov 25 18:00:27 CST 2018

On 11/25/18 3:53 PM, Liam Proven wrote:
> It's been enlightening!


> Some I was ready for.
> E.g. In French or Spanish, both of which I can speak to some extent, 
> letters  like á or ó are not seen as separate letters: French would call 
> them a-acute, an a with an acute accent. Ç is a c with a cedilla.  Etc.

If they are not seen as separate letters, then do their meanings 
change?  Or is the different accent more for pronunciation?

> But in Swedish/Norwegian/Danish -- I speak basic Norwegian and rudimentary 
> Swedish -- ø and å and ä and so on are not a or o with accents on: 
> they are _different letters_ that come at the end of the alphabet.

I assume that they have different meanings (if that applies to letters) 
and are used as differently as "A" and "q".
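
To make the "end of the alphabet" point concrete, here is a minimal 
Python sketch.  It assumes the Norwegian ordering a-z followed by æ, ø, 
å, and it uses a hand-rolled sort key (norwegian_key is a made-up helper 
name, not a library function).  A real program would more likely use a 
locale-aware collation library.

```python
# Assumed Norwegian alphabet order: a-z, then æ, ø, å at the very end.
NORWEGIAN_ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"
RANK = {ch: i for i, ch in enumerate(NORWEGIAN_ALPHABET)}


def norwegian_key(word):
    """Hypothetical helper: rank each letter by its alphabet position.

    Characters outside the alphabet sort after everything else.
    """
    return [RANK.get(ch, len(RANK)) for ch in word.lower()]


words = ["åre", "zebra", "øl", "ape", "ære"]

# Plain sorted() compares Unicode code points, where å < æ < ø.
by_code_point = sorted(words)

# The custom key follows the alphabet, where æ < ø < å.
by_alphabet = sorted(words, key=norwegian_key)

print(by_code_point)
print(by_alphabet)
```

The two orders disagree on where "åre" goes, which is the kind of thing 
that bites when software treats å as "a with a ring on it".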

> Czech is like that. Š and Č and Ž and many more that my Mac can't 
> readily type are _extra letters_ which come after the unmodified form 
> in the alphabet.


I don't even know how to properly describe these.  They visually look 
like letters (glyphs?) to me, but that may be an imprecise 
simplification on my part.

> Without them, you can't write correct Czech. It's worse than writing 
> English without the letter E.
> Usually you can guess but not always.
> Byt means flat, apartment; b y-acute t means the verb "to be".
> You can probably work that out, but you can't always. A restaurant 
> menu would be hopelessly corrupted as both "raw" and "with cheese" 
> are quite likely.
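
A small Python sketch of why dropping the accents loses information. 
It assumes the text is Unicode and uses the standard unicodedata module; 
strip_accents is a made-up helper name for illustration, and this is 
only one common way of removing diacritics.

```python
import unicodedata


def strip_accents(text):
    """Hypothetical helper: drop combining marks after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


flat = "byt"    # "flat, apartment" per the quoted message
to_be = "být"   # the verb "to be" per the quoted message

# The two words are distinct as written...
print(flat == to_be)

# ...but become indistinguishable once the acute accent is stripped.
print(strip_accents(flat) == strip_accents(to_be))
```

Once the accent is gone there is no way to get it back from the text 
alone, so the reader is left guessing from context.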


> Sure, my office street name:  Křižíkova
> K, r haček, i, z haček, i acute, k o v a.

I had to zoom my font to see enough detail in Křižíkova, but it does 
look like things came through just like you describe.  (They even made 
it through my shell script that I use to re-flow text in replies.)
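
For anyone who would rather not zoom their font, here is a rough Python 
sketch that spells the street name out character by character.  It 
assumes the text arrived as Unicode in precomposed (NFC) form and uses 
the standard unicodedata module.  Note that Unicode calls the haček a 
"caron".

```python
import unicodedata

# Normalize to NFC so each accented letter is a single code point.
street = unicodedata.normalize("NFC", "Křižíkova")

# Official Unicode name of every character in the street name.
names = [unicodedata.name(ch) for ch in street]

for ch, name in zip(street, names):
    print(f"U+{ord(ch):04X}  {ch}  {name}")

# The three accented letters take two bytes each in UTF-8.
utf8_length = len(street.encode("utf-8"))
print(len(street), "characters,", utf8_length, "bytes in UTF-8")
```

The names that come back match the description in the quoted text: r 
with caron, z with caron, and i with acute.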

> A hacek is like an upside down circumflex: ^
> Also known as a caron.


> Oh yes. It's quite a minefield.

/me blinks and shakes his head.

> Czech keyboards have so many extra letters, the *numbers* are on shift 
> combinations!


> Well yes.
> I believe Mr Corlett here rejects all mail from gmail.com -- except 
> mine... ;-)


Grant. . . .
unix || die
