On Fri, Feb 20, 2015 at 6:42 PM, Johnny Billquist <bqt at update.uu.se> wrote:
> Misunderstanding 1: a char does not need to be 8 bits.
> Also, an int is defined to be whatever size makes most sense on a machine.
> It can be any number of bits. The only thing guaranteed is that an int is
> between a short and a long in size. (They can all be equal.)
[...]
> And since sizeof returns an integer, you pretty much also need to make
> one a multiple of the other as far as number of bits goes.

That's a directly stated requirement, not just a consequence of
sizeof, though likely sizeof(), memcpy(), and many other library
functions provide some of the rationale for the requirement.
All C data types except bit fields must have a size that is an integer
multiple of CHAR_BIT bits, which is the number of bits in the char
type.
ISO/IEC 9899 2nd Ed. §6.2.6.1 ¶2, 4
C also requires that the size of a character be at least 8 bits.
ISO/IEC 9899 2nd Ed. §5.2.4.2.1 ¶1
memcpy() is specifically defined to copy characters.
ISO/IEC 9899 2nd Ed. §7.21.2.1 ¶2
Many other standard library functions similarly treat any type of C
object as being composed of characters.
These rules are why it's not possible for a conformant C
implementation on a PDP-10 to have 6-bit characters, nor for it to
have 7-bit or 8-bit characters but 18-bit or 36-bit integers. A
conformant implementation of C on a PDP-10 could have 9-bit, 12-bit,
18-bit, or 36-bit characters, with 9-bit probably providing the
greatest utility.
Alternatively, a conformant implementation on the PDP-10 could have
8-bit characters and all other types (including pointers!) being a
multiple of that, with four bits per native 36-bit word going to
waste. That would better suit portability of code from platforms with
8-bit characters, but would be rather less efficient since arithmetic
code would have to include extra steps to mask the native 36-bit
values to the size of the relevant C types, even for intermediate
results within expressions.
One can argue that a not-quite-C compiler for a PDP-10 that supported
6-, 7-, and 8-bit character types as well as 18-bit and 36-bit
integers would be useful, but by definition it wouldn't actually be C.