On 15 Oct 2011 at 19:39, Eric Smith wrote:
However, the standard also requires that the character type occupy at
least 8 bits, that the minimum range for unsigned char is 0 to 255,
and that the minimum range for signed char is -127 to +127 (section
5.2.4.2.1).
This rules out the use of 6-bit and 7-bit characters, so the native
PDP-10 text representation cannot be used as the C standard character
type at all.
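
For reference, the minimums quoted above can be checked against
<limits.h> on any C implementation. This is only a minimal sketch; the
function names char_limits_ok and wide_enough_for_char are made up for
illustration and are not part of the standard.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if this implementation meets the minimums quoted above
   (section 5.2.4.2.1), 0 otherwise. A conforming implementation
   should always return 1. */
int char_limits_ok(void)
{
    return CHAR_BIT >= 8
        && UCHAR_MAX >= 255
        && SCHAR_MIN <= -127
        && SCHAR_MAX >= 127;
}

/* Hypothetical helper: could a character cell of `bits` bits serve as
   the C char type? Only the 8-bit minimum is tested here. */
int wide_enough_for_char(int bits)
{
    assert(bits > 0);
    return bits >= 8;
}
```

Under that test the PDP-10's 6-bit and 7-bit cells fail, while a 9-bit
cell would pass.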
Must an int contain an integral number of chars? Back in the 70s,
there were several proposals for supporting 8-bit ANSI and EBCDIC
character sets on the Cyber 70/170 series of machines with 60-bit words.
One such was that a word would contain 7.5 characters.
Another proposal was to use only the low-order 48 bits (certain
instructions made this very attractive).
Another proposal was to go to 10- or 12-bit characters (the latter
attractive from an efficient I/O standpoint).
What was finally chosen, IIRC, was a mix of 6- and 12-bit character
codes, with a 00 code acting as an "escape" for the lowercase set.
It was very messy.
The Cray-1 COS approach was just to unpack character data into a
1-per-word format, wasting 56 bits out of every word. "Pack" and
"unpack" system read/write calls were available.
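
To make the 1-per-word idea concrete, here is a rough sketch in C. It
assumes 64-bit words holding eight 8-bit characters with the first
character in the high-order bits; that ordering is my assumption. The
names unpack_chars and pack_chars are hypothetical and are not the
actual COS calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CHARS_PER_WORD = 8 };   /* eight 8-bit characters per 64-bit word */

/* Spread n_chars packed characters out to one character per word.
   Each output word keeps the character in its low 8 bits; the other
   56 bits are zero (the "wasted" bits). */
void unpack_chars(const uint64_t *packed, uint64_t *unpacked, size_t n_chars)
{
    assert(packed != NULL && unpacked != NULL);
    for (size_t i = 0; i < n_chars; i++) {
        unsigned shift = 56u - 8u * (unsigned)(i % CHARS_PER_WORD);
        unpacked[i] = (packed[i / CHARS_PER_WORD] >> shift) & 0xFF;
    }
}

/* Inverse operation. The caller supplies (n_chars + 7) / 8 words of
   output, which are cleared before the characters are merged in. */
void pack_chars(const uint64_t *unpacked, uint64_t *packed, size_t n_chars)
{
    assert(packed != NULL && unpacked != NULL);
    size_t n_words = (n_chars + CHARS_PER_WORD - 1) / CHARS_PER_WORD;
    for (size_t w = 0; w < n_words; w++)
        packed[w] = 0;
    for (size_t i = 0; i < n_chars; i++) {
        unsigned shift = 56u - 8u * (unsigned)(i % CHARS_PER_WORD);
        packed[i / CHARS_PER_WORD] |= (unpacked[i] & 0xFF) << shift;
    }
}
```

Ten characters occupy two words packed and ten words unpacked, which is
where the 56 wasted bits per word come from.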
--Chuck