>> the assumption of everything being made up of bits is wired very
>> deeply into modern C - and to a nontrivial extent into even K&R C.
> I dunno, as I recall K&R C was - and is - just as bit oriented as
> current gc$
> So, what bits -:) of modern C are more wired into the language?
First, let me dispose of the easy ones:
Auto-increment and -decrement are not bit-oriented; they make just as
much sense for PICTURE 999999 variables as for int variables - indeed,
any type for which addition and subtraction of small integers make
sense can support ++ and --. (Note that standard C supports ++ and --
applied to pointers and floating-point types.)
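To make the parenthetical concrete, here is a minimal sketch of ++ applied to a pointer and to a double. The function names (count_until_zero, bump, check_increment) are made up for illustration; nothing here depends on how the operands are represented.

```c
#include <assert.h>
#include <stddef.h>

/* Count elements before the first zero, stepping a pointer with ++. */
size_t count_until_zero(const int *p)
{
    size_t n = 0;
    while (*p != 0) {
        p++;            /* advances by one int, not by one byte */
        n++;
    }
    return n;
}

/* ++ on a floating-point object adds 1 to its value. */
double bump(double x)
{
    x++;
    return x;
}

/* Hypothetical self-check exercising both helpers. */
void check_increment(void)
{
    int a[] = { 3, 1, 4, 0 };
    assert(count_until_zero(a) == 3);
    assert(bump(2.5) == 3.5);
}
```

Both uses are defined purely in terms of "add one", which is why the same operators would carry over to a decimal type.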
No C variant I'm aware of has rotates (though with the variety of
experiments out there, I'd be surprised if nobody had created one).
gcc's internal representation (in at least some versions) has rotates,
it's true, but (a) gcc's "C" is not actually C (it's C plus various
extensions, at a minimum), and (b) gcc also supports a bunch of other
languages, at least some of which may well have rotates.
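For what it's worth, the usual way to get a rotate in standard C is to build it from two shifts and an OR. The following is a sketch of that idiom for a 32-bit value, not a language feature; rotl32 and check_rotate are names I've made up, and the masking is there so that neither shift count ever reaches the operand width.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n positions using only shifts and OR. */
uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;                                   /* reduce the count to 0..31 */
    return (x << n) | (x >> ((32 - n) & 31));  /* second count is also 0..31 */
}

/* Hypothetical self-check. */
void check_rotate(void)
{
    assert(rotl32(0x80000001u, 1) == 0x00000003u);
    assert(rotl32(0x12345678u, 8) == 0x34567812u);
    assert(rotl32(0x12345678u, 0) == 0x12345678u);
}
```

Note that this is itself thoroughly bit-oriented: it only makes sense because << and >> are specified in terms of binary positions.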
Bitwise operators and shifts are the interesting ones.
Modern C requires them to operate on unsigned integral types, and
signed integral types when the values in question are nonnegative, as
if they were represented in binary:
6.5 Expressions
...
[#4] Some operators (the unary operator ~, and the binary
operators <<, >>, &, ^, and |, collectively described as
bitwise operators) are required to have operands that have
integer type. These operators return values that depend on
the internal representations of integers, and have
implementation-defined and undefined aspects for signed
types.
...
6.5.7 Bitwise shift operators
...
[#3] The integer promotions are performed on each of the
operands. The type of the result is that of the promoted
left operand. If the value of the right operand is negative
or is greater than or equal to the width of the promoted
left operand, the behavior is undefined.
(in this next paragraph, note that the textification I'm quoting from
lost superscripts and garbled the multiplication sign; "E1 * 2^E2"
below stands for E1 times 2 to the power E2.)
[#4] The result of E1 << E2 is E1 left-shifted E2 bit
positions; vacated bits are filled with zeros. If E1 has an
unsigned type, the value of the result is E1 * 2^E2, reduced
modulo one more than the maximum value representable in the
result type. If E1 has a signed type and nonnegative value,
and E1 * 2^E2 is representable in the result type, then that is
the resulting value; otherwise, the behavior is undefined.
[#5] The result of E1 >> E2 is E1 right-shifted E2 bit
positions. If E1 has an unsigned type or if E1 has a signed
type and a nonnegative value, the value of the result is the
integral part of the quotient of E1 divided by the quantity,
2 raised to the power E2. If E1 has a signed type and a
negative value, the resulting value is implementation-
defined.
(Interesting. I hadn't previously noticed that << was undefined but >>
was implementation-defined in the negative case. I wonder why....)
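For the unsigned case, the arithmetic wording of 6.5.7 #4 and #5 can be checked directly. This is a sketch under the assumption that repeated doubling and halving are an acceptable stand-in for "times 2 to the power E2" and "divided by 2 to the power E2"; the helper names are mine, and shift counts are kept to 0..15 because unsigned is only guaranteed to be at least 16 bits wide.

```c
#include <assert.h>
#include <limits.h>

/* E1 << E2 for unsigned E1: E1 times 2^E2, reduced modulo UINT_MAX + 1. */
unsigned shl_by_arith(unsigned e1, unsigned e2)
{
    while (e2-- > 0)
        e1 *= 2u;       /* unsigned multiplication wraps modulo UINT_MAX + 1 */
    return e1;
}

/* E1 >> E2 for unsigned E1: integral part of E1 divided by 2^E2. */
unsigned shr_by_arith(unsigned e1, unsigned e2)
{
    while (e2-- > 0)
        e1 /= 2u;       /* repeated truncating halving equals one division */
    return e1;
}

/* Hypothetical self-check: compare against the real shift operators. */
void check_shifts(void)
{
    const unsigned vals[] = { 0u, 1u, 3u, 12345u, UINT_MAX };
    for (unsigned i = 0; i < sizeof vals / sizeof vals[0]; i++) {
        for (unsigned s = 0; s < 16; s++) {
            assert(shl_by_arith(vals[i], s) == vals[i] << s);
            assert(shr_by_arith(vals[i], s) == vals[i] >> s);
        }
    }
}
```

The signed cases are deliberately left out, since those are exactly the ones the quoted text leaves undefined or implementation-defined.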
~ (6.5.3.3 #4), & (6.5.10 #4), ^ (6.5.11 #4), and | (6.5.12 #4) all
simply speak of the bits making up the operands, so it's not clear what
the spec could mean if the machine is non-binary or the operands are
otherwise not made up of well-defined bits.
But K&R is much laxer in its definitions. I'd have to go read it to be
sure, but you might be able to implement &, |, ^, and ~ as per-digit
operations rather than per-bit operations (my first cut would be & as
digit-by-digit minimum, | as digit-by-digit maximum, ^ as
digit-by-digit addition modulo the base, and ~ as ~x = MAXINT-1-x
(loosely speaking)), with shifts shifting by digits instead of bits,
and still conform to K&R. That certainly feels like a reasonable
approach to me.
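Here is a rough sketch of that first cut, written in ordinary C so it can be tried out. It assumes a hypothetical 4-digit decimal word (BASE, NDIGITS, and WORD are my own choices, as are all the function names). For ~ I have taken the digit-wise complement, 9999 - x, which is my reading of the "loosely speaking" formula rather than a literal transcription of it.

```c
#include <assert.h>

#define BASE    10u      /* hypothetical decimal machine */
#define NDIGITS 4        /* digits per word (assumption) */
#define WORD    10000u   /* BASE raised to NDIGITS */

typedef unsigned (*digit_op)(unsigned, unsigned);

static unsigned dmin(unsigned a, unsigned b) { return a < b ? a : b; }
static unsigned dmax(unsigned a, unsigned b) { return a > b ? a : b; }
static unsigned dadd(unsigned a, unsigned b) { return (a + b) % BASE; }

/* Apply op to corresponding digits of x and y, least significant first. */
static unsigned per_digit(unsigned x, unsigned y, digit_op op)
{
    unsigned result = 0, place = 1;
    for (int i = 0; i < NDIGITS; i++) {
        result += op(x % BASE, y % BASE) * place;
        x /= BASE;
        y /= BASE;
        place *= BASE;
    }
    return result;
}

unsigned dec_and(unsigned x, unsigned y) { return per_digit(x, y, dmin); }
unsigned dec_or(unsigned x, unsigned y)  { return per_digit(x, y, dmax); }
unsigned dec_xor(unsigned x, unsigned y) { return per_digit(x, y, dadd); }

/* Complement: every digit d becomes BASE-1-d, i.e. 9999 - x. */
unsigned dec_not(unsigned x) { return (WORD - 1u) - x % WORD; }

/* Shifts move whole digits; digits pushed off either end are dropped. */
unsigned dec_shl(unsigned x, unsigned n)
{
    while (n-- > 0)
        x = (x % (WORD / BASE)) * BASE;   /* drop top digit, then shift up */
    return x;
}

unsigned dec_shr(unsigned x, unsigned n)
{
    while (n-- > 0)
        x /= BASE;
    return x;
}

/* Hypothetical self-check. */
void check_decimal_ops(void)
{
    assert(dec_and(1234u, 4321u) == 1221u);
    assert(dec_or(1234u, 4321u) == 4334u);
    assert(dec_xor(1234u, 4321u) == 5555u);
    assert(dec_not(1234u) == 8765u);
    assert(dec_shl(1234u, 1u) == 2340u);
    assert(dec_shr(1234u, 1u) == 123u);
}
```

With BASE set to 2, minimum, maximum, and addition modulo the base reduce to the familiar AND, OR, and XOR, which is what makes the generalization feel natural. One thing that does not carry over: in base 10, x ^ x is no longer zero under this definition.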
K&R C was designed as an OS implementation language. As such, it is
expected that the coder knows the machine, with operations doing things
that are unsurprising in view of that. Modern C is a tricky balancing
act, on the one hand pulled towards that original stance by the desire
that it still be a useful OS implementation language, on the other hand
pulled towards precisely-specified and machine-independent semantics by
the desire that it be usable for cross-OS-portable programming, such as
for application-level code and utility libraries. The current ubiquity
of binary machines has meant that C could get away with mandating
binary for things like & and << without crippling it enough to bother
anyone with significant clout, in contrast to things like int size
(which they found it necessary to leave considerable leeway in).
/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse at rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B