On 31 Jan 2012 at 21:59, Toby Thain wrote:
This is essentially how Professor Knuth achieved portability to
non-ASCII systems for TeX, METAFONT and his other tools.
Essentially, the rule is "Don't do arithmetic on characters".
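
A minimal sketch of that rule in C, assuming we want the position of a
lowercase letter in the alphabet. This is not Knuth's actual code; the
names alphabet and letter_index are made up for illustration. The point
is that c - 'a' is only right when the letters are contiguous in the
execution character set, which they are in ASCII but not in EBCDIC, so
the lookup goes through a table instead:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table: the letters in the order we care about,
   independent of their numeric codes on the host system. */
static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz";

/* Hypothetical helper: returns 0..25 for 'a'..'z', or -1 otherwise.
   No arithmetic is done on the character value itself; only the
   position inside the table is computed. */
int letter_index(char c)
{
    const char *p;

    if (c == '\0')
        return -1;              /* strchr would match the terminator */
    p = strchr(alphabet, c);
    return p ? (int)(p - alphabet) : -1;
}
```

The subtraction here is on pointers into the table, not on character
codes, so the result is the same whatever encoding the machine uses.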
But I've had a lot of questions about what the specs actually mean.
The "smallest addressable unit" in C being type char apparently
doesn't mean that the machine itself has to be char-addressable.
For example, a machine with 128-bit words that is addressable only by
word address doesn't need type char to be 128 bits wide; the compiler
and run-time only need to make provision for some means of
addressing chars, even if that means a separate system of addressing;
e.g. "C" addresses are machine addresses shifted left by 4 bits.
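
To make the shifted-address idea concrete, here is a rough sketch. It
assumes 8-bit chars, so a 128-bit word holds 16 of them and the low 4
bits of a "C" address select the char within the word. The types
word_addr and c_addr and the three helpers are hypothetical, not taken
from any real compiler:

```c
#include <assert.h>
#include <stdint.h>

#define CHAR_INDEX_BITS 4u                      /* 16 chars per word */
#define CHARS_PER_WORD  (1u << CHAR_INDEX_BITS)

typedef uint32_t word_addr;  /* hypothetical machine (word) address */
typedef uint32_t c_addr;     /* hypothetical C-level char address   */

/* Build a C address from a word address and a char index 0..15. */
c_addr make_c_addr(word_addr w, unsigned char_index)
{
    assert(char_index < CHARS_PER_WORD);
    return (w << CHAR_INDEX_BITS) | char_index;
}

/* Recover the word the hardware would actually fetch. */
word_addr word_of(c_addr a)
{
    return a >> CHAR_INDEX_BITS;
}

/* Recover which char inside that word is meant. */
unsigned char_of(c_addr a)
{
    return a & (CHARS_PER_WORD - 1u);
}
```

With this layout, adding 1 to the C address of the last char in a word
carries into the word address, so char pointer arithmetic behaves as
if memory were a flat array of chars.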
I suppose it's even possible to create a C where word addresses ==
char addresses; the char being aligned in a word, one char per word,
with the remainder of the word unused.
So does the difference between two void* pointers necessarily equate
to a count of chars between those addresses? Take the case of one
char per word above, for example.
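
For what it's worth, ISO C doesn't define subtraction on void* at all
(some compilers accept it as an extension), so the question only has a
portable form once the pointers are converted to a character type. A
small sketch, with char_distance as a made-up helper name, assuming
both pointers point into the same array object:

```c
#include <stddef.h>

/* Hypothetical helper: distance between two addresses, measured in
   chars. Both pointers must point into (or one past the end of) the
   same array object, otherwise the subtraction is undefined. */
ptrdiff_t char_distance(const void *from, const void *to)
{
    return (const char *)to - (const char *)from;
}
```

For an int array a, char_distance(&a[0], &a[3]) comes out as
3 * sizeof(int) on a conforming implementation. In the one-char-per-word
case, sizeof(int) could simply be 1, and the distance is still a count
of chars, whatever a char happens to occupy in hardware.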
Do char and int addresses have to share the same space? Or can chars
and ints enjoy separate addressing spaces? Do addressing spaces need
to be compatible? (I'm thinking of low-end 8-bit PIC and AVR, where
data stored in code space as constants doesn't have the same
granularity.)
--Chuck