Text encoding Babel. Was Re: George Keremedjiev

Liam Proven lproven at gmail.com
Tue Nov 27 06:44:23 CST 2018

On Mon, 26 Nov 2018 at 23:39, Christian Gauger-Cosgrove
<captainkirk359 at gmail.com> wrote:
> On Mon, 26 Nov 2018 at 03:44, Liam Proven via cctalk
> <cctalk at classiccmp.org> wrote:
> > If it's in Roman, Cyrillic, or Greek, they're alphabets, so it's a letter.
> >
> Correct, Latin, Greek, and Cyrillic are alphabets, so each
> letter/character can be a consonant or vowel.
> > I can't read Arabic or Hebrew but I believe they're alphabets too.
> >
> Hebrew, Arabic, Syriac, Punic, Aramaic, Ugaritic, et cetera are
> abjads, meaning that each character represents a consonant sound;
> vowel sounds are either derived from context and knowledge of the
> language, or can be added in via diacritics.
> Devanagari and Thai (and Tibetan, Khmer, Sundanese, Balinese...) are
> abugidas, where each character is a consonant-vowel pair, with the
> "base" character being one particular vowel sound, and alternates
> being indicated by modifications (example in Devanagari: "क" is "ka",
> while "कि" is "ki"; another example using Canadian Aboriginal
> Syllabics "ᕓ" is "vai" whereas "ᕗ" is "vu").
> > I don't know anything about any Asian scripts except a tiny bit of
> > Japanese and Chinese, and they get called different things, but
> > "character" is probably most common.
> >
> Japanese actually uses three different scripts. Chinese characters
> (the kanji script of Japanese, and the hanja script of Korean) are
> logograms.
> Japanese also has two syllabic scripts, katakana and hiragana, where
> each character represents a specific consonant-vowel pair.
> Korean hangul (or if you happen to be from the DPRK, chosŏn'gŭl) is a
> mix of alphabet and syllabary, where individual characters consist of
> sub parts stacked in a specific pattern. Stealing Wikipedia's example,
> "kkulbeol" is written as "꿀벌", not the individual parts "ㄲㅜㄹㅂㅓㄹ".
> And now for even more fun, Egyptian hieroglyphics and cuneiform (which
> started with Sumerian, and was then used by the Assyrians/Babylonians
> and others) are a delightful mix of logographic, syllabic and alphabetic
> characters. Because while China loathes you, Babylon has a truly deep
> hatred of you and wishes to revel in your suffering.

Um. Yes. Thank you for that. Very informative and interesting; I did
actually know most of it already, but maybe others didn't.

The thing is that it's not actually very germane to the question I was
addressing, which was "what do you call the individual units in
different scripts?" I.e. "letter" vs "glyph" vs "character" vs
"ideogram" vs "grapheme", etc... :-)
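
To tie the naming question back to text encoding: a rough sketch, in
Python, of how the unit a reader sees differs from the units Unicode
stores. It uses only the standard unicodedata module. The helper names
(code_points, decompose, compose) are made up for illustration and are
not from the discussion above. It reuses the quoted examples: the
Hangul word for "kkulbeol" and the Devanagari "ki".

```python
import unicodedata


def code_points(text):
    """Return the code points of text as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


def decompose(text):
    """Canonical decomposition (NFD): syllable blocks become conjoining jamo."""
    return unicodedata.normalize("NFD", text)


def compose(text):
    """Canonical composition (NFC): conjoining jamo become syllable blocks."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    word = "꿀벌"             # two Hangul syllable blocks
    parts = decompose(word)   # the same word as conjoining jamo

    print(len(word), code_points(word))
    print(len(parts), code_points(parts))
    print(compose(parts) == word)

    # Devanagari "ki" reads as one unit but is stored as two code
    # points: the consonant plus a dependent vowel sign.
    print(code_points("कि"))
```

So "꿀벌" is two code points when precomposed and six when decomposed,
yet a reader sees two units either way. Note that the decomposed form
uses the conjoining jamo (U+1100 block), which are different code
points from the standalone compatibility jamo typed out in the quoted
example.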

Liam Proven - Profile: https://about.me/liamproven
Email: lproven at cix.co.uk - Google Mail/Hangouts/Plus: lproven at gmail.com
Twitter/Facebook/Flickr: lproven - Skype/LinkedIn: liamproven
UK: +44 7939-087884 - ČR (+ WhatsApp/Telegram/Signal): +420 702 829 053
