Text encoding Babel. Was Re: George Keremedjiev

Grant Taylor cctalk at gtaylor.tnetconsulting.net
Sun Nov 25 16:42:37 CST 2018


On 11/23/18 5:52 AM, Peter Corlett via cctalk wrote:
> Worse than that, it's *American* ignorance and cultural snobbery which 
> also affects various English-speaking countries.

Please do not ascribe such ignorance so broadly, at least not without 
qualifiers that account for people who do try to respect other 
people's cultures.

> The pound sign is not in US-ASCII, and the euro sign is not in ISO-8859-1, 
> for example.

Well, seeing as how ASCII, the /American/ Standard Code for Information 
Interchange, is inherently /American/, I don't personally fault it for 
not having currency symbols for other languages / regions.

Instead, I consider ASCII to be a limited standard, which is why so 
much effort has gone into other standards to overcome this and other 
limitations.
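
To make the pound / euro point concrete, here is a minimal Python sketch. 
The helper name can_encode is my own (hypothetical); the codec names are 
the ones that ship with Python's standard library. It only checks 
whether a character has a code point in a given character set.

```python
def can_encode(ch: str, codec: str) -> bool:
    """Return True if the character exists in the named character set."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# The pound sign is missing from US-ASCII but present in ISO-8859-1.
pound_in_ascii = can_encode("\u00a3", "ascii")          # False
pound_in_latin1 = can_encode("\u00a3", "iso-8859-1")    # True (byte 0xA3)

# The euro sign is missing from ISO-8859-1; ISO-8859-15 added it.
euro_in_latin1 = can_encode("\u20ac", "iso-8859-1")     # False
euro_in_latin9 = can_encode("\u20ac", "iso-8859-15")    # True (byte 0xA4)

# UTF-8 can represent both, at the cost of multi-byte sequences.
euro_utf8 = "\u20ac".encode("utf-8")                    # b'\xe2\x82\xac'
```

The later standards each widen the repertoire a little, which is the 
"effort to overcome the limitation" I had in mind.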

I do not know for sure, but I suspect that other national character 
sets likewise lack characters / glyphs from other languages.

I'm sure that there is room for a discussion of why ASCII is used as the 
underlying character set for network services and the burden that it 
imposes on international friends and colleagues.

> Amusingly, peering through my inbox in which I have mail in both Dutch 
> and English, the only one with a UTF-8 subject line is in English. It 
> was probably composed on a Windows box which "helpfully" turned a hyphen 
> into an en-dash.

I'm trying to NOT search my mailbox.

I'd be more curious about the number of message bodies that use UTF-8 
or UTF-16, which can encode far more characters / glyphs.  It's my 
understanding that raw non-ASCII is not allowed in headers, including 
the Subject: header, unless it is wrapped in MIME encoded-words 
(RFC 2047) or the much more recent internationalized email extensions 
are in use.
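
As a rough illustration of how a non-ASCII Subject: can still travel in 
an ASCII-only header, here is a small sketch using Python's standard 
email.header module, which implements MIME encoded-words (RFC 2047). 
The subject text is made up for the example; I am not claiming this is 
what any particular mail client does internally.

```python
from email.header import Header, decode_header, make_header

# Hypothetical subject containing an en-dash (U+2013), which is not ASCII.
subject = "Text encoding \u2013 Babel"

# Encode as an RFC 2047 encoded-word; the result contains only ASCII bytes.
encoded = Header(subject, charset="utf-8").encode()
is_pure_ascii = encoded.isascii()

# A receiving client reverses the process to recover the original text.
decoded = str(make_header(decode_header(encoded)))
```

The encoded form looks like =?utf-8?...?= on the wire, so the header 
itself stays 7-bit clean even though the displayed subject is not.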

P.S.  Resending from the correct email address.  —  A recent Thunderbird 
update broke the Correct-Identity add-on.  :-(



-- 
Grant. . . .
unix || die

