Text encoding Babel. Was Re: George Keremedjiev

Maciej W. Rozycki macro at linux-mips.org
Fri Nov 30 18:59:43 CST 2018

On Sun, 25 Nov 2018, Liam Proven via cctalk wrote:

> > > For example, right now, I am in my office in Křižíkova. I can't
> > > type that name correctly without Unicode characters, because the ANSI
> > > character set doesn't contain enough letters for Czech.
> >
> > Intriguing.  Is there an old MS-DOS Code Page (or comparable technique)
> > that does encompass the necessary characters?
> Don't know. But I suspect there weren't many PCs here before the
> Velvet Revolution in 1989. Democracy came around the time of Windows
> 3.0 so there may not have been much of a commercial drive.

 Be assured there were enough IBM PC clones running DOS around from 1989 
onwards for this stuff to matter, and hardly anyone switched to MS Windows 
before version 95 (running Windows 3.0 with the ubiquitous HGC-compatible 
graphics adapters was sort of fun anyway, and I am not sure if Windows 3.1 
even supported them; maybe with extra drivers).

 Anyway MS-DOS 5.0 onwards had a complete set of code pages for various 
regions of the world.  For Czechia, Hungary, Poland, and other European 
countries located towards the east whose languages are written in the 
Latin script, code page 852 was provided.  For France, Germany, 
Spain, Nordic countries, etc. code page 850 was provided.  There were 
other pages included as well, beyond IBM's original code page 437, 
including Greek and Cyrillic, but I don't know the details.  It's quite 
likely Wikipedia has them.
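
 A minimal sketch of the difference between these pages, assuming 
Python's bundled cp437, cp850 and cp852 codecs mirror the MS-DOS tables 
closely enough.  The helper names `fits' and `supported_pages' are made 
up for illustration only:

```python
# Check which DOS code pages can represent a given string.
# Assumption: Python's standard cp437/cp850/cp852 codecs match the
# MS-DOS code page tables discussed above.

CODE_PAGES = ("cp437", "cp850", "cp852")


def fits(text, code_page):
    """Return True if every character of text exists in code_page."""
    try:
        text.encode(code_page)
    except UnicodeEncodeError:
        return False
    return True


def supported_pages(text):
    """List the code pages from CODE_PAGES that can encode text."""
    return [cp for cp in CODE_PAGES if fits(text, cp)]


if __name__ == "__main__":
    for word in ("Křižíkova", "Różycki", "Rozycki"):
        print(word, supported_pages(word))
```

 Under that assumption only cp852 should be reported for the two names 
with diacritics, while the plain ASCII spelling fits all three pages.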

 Of course the HGC didn't support switching the text-mode character set; 
however, ISA VGA clones started trickling in at one point too.  I still 
have my ISA Trident TVGA 8900C adapter from 1993 working in one of my 
machines, though I have since switched to Linux.

 NB my last name is also correctly spelled Różycki rather than Rozycki, 
and the two letters with diacritics are completely different letters 
whose sounds bear no resemblance to those of the corresponding plain 
ones, i.e. these are not merely accents, which we don't have in Polish 
at all.  Polish complicates this further in that the sound of `ó' is the 
same as the sound of `u' and the sound of `ż' is the same as the sound of 
`rz' (which is BTW different from where the two letters are written 
separately); however, the alternatives are not interchangeable and are 
either invalid or change the meaning of a word, and many native Polish 
speakers get them wrong anyway.
