Pascal not considered harmful - was Re: Rich kids are into COBOL

Mouse mouse at Rodents-Montreal.ORG
Sat Feb 21 14:26:56 CST 2015


> Consider a machine with a word length of 64 bits.  This machine
> represents floating point numbers with a 16 bit exponent and a 48 bit
> mantissa (nothing unusual so far).

Okay.

> So we have the length of short = int = long = 64 bits.  So far so
> good.

Well, we _can_ have short/int/long all having 64 bits.  Even char, too,
I think.

> However, given such a generous word length, the designers of this
> machine decide not to dedicate special hardware toward handling 64
> bit integers, but have said that 48 bits should be long enough for
> anyone, and so treat integer arithmetic as a subset of floating point
> (straightforward enough).   So not all values of a 64 bit word
> reflect valid integers--the exponent must be a certain
> value--anything else is floating point.

Well, this makes it impossible to implement long long, which (loosely
put) must have at least 64 bits of range.  But let's pretend that long
long doesn't exist, or perhaps that all the bit counts you give are
doubled, or some such.

> Now, here's where we get into sticky territory.  Since C draws no
> data type distinctions between bitwise logical operations on ints and
> arithmetic operations, is it possible to implement C on this machine?

I'm not sure.  The C99 draft I have says (6.2.6.1)

       [#5] Certain object representations  need  not  represent  a
       value  of the object type.  If the stored value of an object
       has  such  a  representation  and  is  read  by  an   lvalue
       expression  that  does not have character type, the behavior
       is undefined.  If such a representation  is  produced  by  a
       side  effect  that modifies all or any part of the object by
       an lvalue expression that does not have character type,  the
       behavior is undefined.41)  Such a representation is called a
       trap representation.

Footnote 41 says

       41)Thus,  an automatic variable can be initialized to a trap
          representation without causing  undefined  behavior,  but
          the  value  of the variable cannot be used until a proper
          value is stored in it.

The boolean operations on unsigned integer types would have to be
implemented in a way that's careful to avoid messing up the magic
exponent-field value (they couldn't just be 64-bit boolean operations,
in general).

The wording about "character type" above appears to be intended to
support idioms like

	for (i=0;i<sizeof(thing1);i++)
		((char *)&thing2)[i] = ((char *)&thing1)[i];

However, there are constraints on how integers are represented,
specifically on how signed and unsigned integers are related, that
might break the above.  6.2.6.2 (remember the missing superscripting):

       [#1] For unsigned integer types other  than  unsigned  char,
       the  bits of the object representation shall be divided into
       two groups: value bits and padding bits (there need  not  be
       any  of  the  latter).   If there are N value bits, each bit
       shall represent a different power of 2 between 1  and  2N-1,
       so   that   objects   of  that  type  shall  be  capable  of
       representing values from 0  to  2N-1  using  a  pure  binary
       representation;   this   shall   be   known   as  the  value
       representation.   The  values  of  any  padding   bits   are
       unspecified.44)

       44)Some  combinations  of  padding  bits might generate trap
          representations, for example, if one  padding  bit  is  a
          parity bit.  Regardless, no arithmetic operation on valid
          values can generate a trap representation other  than  as
          part of an exceptional condition such as an overflow, and
          this  cannot  occur  with  unsigned  types.   All   other
          combinations  of  padding  bits  are  alternative  object
          representations of the value specified by the value bits.

       [#2]  For  signed  integer  types,  the  bits  of the object
       representation shall be divided  into  three  groups:  value
       bits, padding bits, and the sign bit.  There need not be any
       padding bits; there shall be exactly one sign bit.  Each bit
       that  is  a  value bit shall have the same value as the same
       bit  in  the  object  representation  of  the  corresponding
       unsigned  type (if there are M value bits in the signed type
       and N in the unsigned type, then M<=N).  If the sign bit  is
       zero,  it shall not affect the resulting value.  If the sign
       bit is one, the value  shall  be  modified  in  one  of  the
       following ways:

         -- the  corresponding  value  with  sign  bit 0 is negated
            (sign and magnitude);

         -- the sign bit has the value -(2N) (two's complement);

         -- the sign bit has the value -(2N-1) (one's complement).

       Which of these  applies  is  implementation-defined,  as  is
       whether  the  value  with sign bit 1 and all value bits zero
       (for the first two), or with sign bit and all value  bits  1
       (for one's complement), is a trap representation or a normal
       value.   In  the  case  of  sign  and  magnitude  and  one's
       complement,  if  this representation is a normal value it is
       called a negative zero.
...
       [#5]  The  values  of any padding bits are unspecified.45) A
       valid (non-trap) object representation of a  signed  integer
       type   where  the  sign  bit  is  zero  is  a  valid  object
       representation of the corresponding unsigned type, and shall
       represent the same value.

       45)[...basically a repeat of footnote 44, above...]

I think this permits what you sketch, but your sketch is brief enough
I'm not entirely sure.  It's also possible I've missed a constraint
somewhere else that's relevant.

I'll ask my go-to C guy about this.

> As far as "bytes" on systems, perhaps the attribute of byte
> addressability makes sense on short word-length machines, but I don't
> believe that it's necessary for longer word length machines.  [...]
> I think byte addressing is more a matter of functional fixedness more
> than anything else.

FSVO "byte", I agree.  However, since we've been discussing C, there's
a _lot_ of (C) code that assumes that char * has the same size and
representation as other pointer types, even though there's no
justification in the C spec for such an assumption.  (POSIX, on the
other hand, may impose such a restriction; I don't have even a draft of
POSIX handy to check.)

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse at rodents-montreal.org
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

