> Indeed, intel segmented memory model was weird.
> Far pointers were insanity-inducing, though. Since there were
> multiple ways to represent the same address as a far pointer, [...]
> Thankfully, huge pointers behaved exactly as one would expect, [...]
> There we have the issue. Often when people speak of what they
> "expect" [...]
(Please don't use paragraph-length lines.) Yes and no. A lot of C is
undefined, or implementation-defined, to allow different disagreeing
implementations to coexist. Someone writing under and for one
particular implementation may indeed reasonably expect certain
behaviour that the standard does not promise - for example, if I'm
writing for an SS-20, I consider it reasonable to expect ints to be
32-bit two's-complement, even though C qua C does not promise either
part of that, and, while a new compiler may in principle break either
part or both, it would have to violate the "int is the `natural'
integer type for the architecture" principle to do so.
Another part of this is that C originated as, and is still used as, an
OS implementation language. In such use, it is not unreasonable to
treat it as the "high-level assembly language" some people have called
it - apparently intending it to be a criticism while not understanding
that, in some senses, that's what C is _supposed_ to be. And, from
that point of view, compilers that take advantage of formally-undefined
behaviour to optimize things as sketched here and in Mr. Regehr's
writings are not clever; they are broken.
I don't think either position is unreasonable, either. Which I suppose
really means that I think there are places both for compilers that act
like high-level assemblers, doing the unsurprising thing from the POV
of someone familiar with the architecture being compiled for, and for
compilers that take advantage of all the liberty the language spec
allows to optimize the hell out of the code.
I'm not sure there is any fix for the problems arising when people try
to satisfy both desires with the same compiler (or the same set of
configuration switches to a single compiler, or some such). It's
basically the "is this language right for this task?" problem in
slightly different dress.
> When a better compiler (with more powerful optimization) breaks the
> program, the compiler is blamed rather than the programmer who made
> the incorrect assumption.
Or, to see it from the "high-level assembly" position, when a less
appropriate compiler (with more aggressive optimization) is used, it
is, correctly, blamed (for not being appropriate to the task at hand).
> Ideally compilers would flag all undefined programs, but in practice
> they do [...]
It's not possible in general, because sometimes the undefined behaviour
depends on something not known until run time. Consider
This is perfectly well-defined - until and unless someone feeds it (a
suitable textual representation of) INT_MAX. There might be a place
for a compiler that flagged every instance of undefined behaviour, even
if it means otherwise unnecessary run-time costs, but for most purposes
that would be a Bad Thing. (I've often contemplated building a
`checkout' compiler that deliberately went out of its way to break
various assumptions people tend to make that aren't promised, things
like "all pointers are really just memory addresses, with pointer casts
being no-ops" or "all signed arithmetic is two's-complement" or "the
stack grows down" or "pointers into different objects are comparable"
or "shims are inserted into structs only when necessary to avoid
placing objects at unusual alignments" or "there are no padding bits in
integer representations" or "nil pointers are all-bits-zero"....)
A pity pdos.csail.mit.edu
is willing to impair its accessibility for
the sake of..I'm not sure what..by refusing to serve it over HTTP.
/~\ The ASCII Mouse
\ / Ribbon Campaign
X Against HTML mouse at rodents-montreal.org
/ \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B