It was thus said that the Great Swift Griggs once stated:
> On Fri, 20 May 2016, Sean Conner wrote:
> > By the late 80s, C was available on many different systems and was
> > not yet standardized.
> There were lots of standards, but folks typically gravitated toward K&R
> or ANSI at the time. Though I was a pre-teen, I was a coder at the time.
> Those are pretty raw and primitive compared to C99 or C11, but still
> quite helpful, for me at least. Most of the other "standards" were
> pretty much a velvet glove around vendor-based "standards", IMHO.
In 1988, C had yet to be standardized.
In 1989, ANSI released the first C standard, commonly called ANSI C or
C89.
I started C programming in 1990, so I was using ANSI C pretty much from
the start. I prefer ANSI C over K&R (pre-ANSI) C because the compiler can
catch more errors.
> > The standards committee was convened in an attempt to make sense of
> > all the various C implementations and bring some form of sanity to
> > the market.
> I'm pretty negative on committees, in general. However, ISO and ANSI
> standards have worked pretty well, so I suppose they aren't totally
> useless _all_ the time.
> Remember OSI networking protocols? They had a big nasty committee for
> all their efforts, and we can see how that worked out. We got the "OSI
> model" (which basically just apes other models already well established
> at the time). That's about it (oh sure, a few other things like
> X.500-inspired protocols, but I think X.500 is garbage *shrug* YMMV).
> Things like the TPx protocols never caught on. Some would say it was
> because the world is so unenlightened it couldn't recognize the genius
> of the commissar^H^H^H committee's collective creations. I have a
> somewhat different viewpoint.
The difference between the two? ANSI codified existing practice, whereas
ISO created a standard in a vacuum and expected people to write
implementations to the standard.
> > All those "undefined" and "implementation" bits of C? Yeah, competing
> > implementations.
> Hehe, what is a long long? Yes, you are totally right. Still, I assert
> that C is the de facto most portable language on Earth. What other
> language runs on as many OSes and CPUs? None that I can think of.
A long long is at least 64-bits long.
And Lua can run on as many OSs and CPUs as C.
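For the record, C99 pins LLONG_MAX at no less than 2^63-1, so a
conforming long long is at least 64 bits. A quick way to check on a
given platform (plain C99, nothing else assumed):

  #include <stdio.h>
  #include <limits.h>

  int main(void)
  {
    /* C99 5.2.4.2.1: LLONG_MAX must be at least 2^63 - 1 */
    printf("long long is %zu bits here (LLONG_MAX = %lld)\n",
           sizeof(long long) * CHAR_BIT, LLONG_MAX);
    return 0;
  }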
> > And because of the bizarre systems C can potentially run on, pointer
> > arithmetic is ... odd as well [4].
> Yeah, it's kind of an extension of the same issue, too many undefined
> grey areas. In practice, I don't run into these types of issues much.
> However, to be fair, I typically code on only about 3 different
> platforms, and they are pretty similar and "modern" (BSD, Linux, IRIX).
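To make one of those grey areas concrete: pointer arithmetic is only
defined within a single array object, or one element past its end
(C99 6.5.6). A minimal sketch (the arrays here are made up for
illustration):

  int main(void)
  {
    int a[10], b[10];
    int *end = a + 10;   /* defined: one past the end of a */
    int *bad = a + 11;   /* undefined behavior, even if never dereferenced */
    int  cmp = (a < b);  /* undefined: relational compare across objects */
    (void)end; (void)bad; (void)cmp;
    return 0;
  }

Both undefined lines will compile without a peep on typical compilers,
which is rather the point.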
Just be thankful you never had to program C in the 80s and early 90s:
http://www.digitalmars.com/ctg/ctgMemoryModel.html
Oh, wait a second ...
http://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models
> > It also doesn't help that bounds checking arrays is a manual process,
> > but then again, it would be a manual process on most CPUs [5] anyway ...
> I'm in the "please don't do squat for me that I don't ask for" camp.
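Fair enough, but for illustration, "manual" means every access that
matters ends up going through something like this (a sketch; get_item()
is a hypothetical accessor):

  #include <stddef.h>

  /* the bounds check is on us, not the hardware */
  int get_item(const int *items, size_t len, size_t i, int *out)
  {
    if (i >= len)
      return -1;        /* out of range; the caller has to handle it */
    *out = items[i];
    return 0;
  }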
What's wrong with the following code?
p = malloc(sizeof(somestruct) * count_of_items);
Spot the bug yet?
Here's the answer: it can overflow. But that's okay, because sizeof()
returns an unsigned quantity, and count_of_items *should* be an unsigned
quantity (both size_t), and overflow on unsigned quantities *is* defined
to wrap (it's signed quantities that are undefined). But that's *still* a
problem because if "sizeof(somestruct) * count_of_items" exceeds the
range of a size_t, then the result is *smaller* than expected and you get
a valid pointer back, but to a smaller pool of memory than expected.
This may not be an issue on 64-bit systems (yet), but it can be on a
32-bit system. Correct system code (in C99) would be:
  if (count_of_items > (SIZE_MAX / sizeof(somestruct)))
    error();
  p = malloc(sizeof(somestruct) * count_of_items);
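If you'd rather not repeat that check at every call site, it can be
wrapped up once, along the lines of OpenBSD's reallocarray(). A sketch
(malloc_array() is a made-up name):

  #include <stdlib.h>
  #include <stdint.h>

  /* malloc(n * size), but with the overflow check built in */
  void *malloc_array(size_t n, size_t size)
  {
    if (size != 0 && n > SIZE_MAX / size)
      return NULL;              /* n * size would wrap a size_t */
    return malloc(n * size);
  }

The call above then becomes malloc_array(count_of_items,
sizeof(somestruct)), and a single NULL check covers both failure modes.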
Oooh ... that reminds me ... I have some code to check ...
> I know that horrifies and disgusts some folks who want GC and
> auto-bounds checking everywhere they can cram it in. Would SSA form aid
> with all kinds of fancy compiler optimizations, including some magic
> bounds checking? Sure. However, perhaps because I'm typical of an
> ignorant C coder, I would expect the cost of any such feature would be
> unacceptable to some.
Don't discount GC though---it simplifies a lot of code.
> Also, there are plenty of C variants or special compilers that can do
> such things. Also, there are a few things that can be run with
> LD_PRELOAD which can help to find issues where someone has forgotten to
> do proper bounds checking.
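Electric Fence is the classic example of that LD_PRELOAD trick, no
recompiling needed (assuming it's installed; the exact library name
varies by system):

  % LD_PRELOAD=libefence.so ./your-program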
I'm generally not a fan of shared libraries as:
1) Unless you are linking against a library like libc or libc++, a
lot of memory will be wasted because the *entire* library is loaded
up, unlike linking to a static library where only those functions
actually used are linked into the final executable
2) Because of 1, you have a huge surface area exposed that can be
exploited. If function foo() is buggy but your program doesn't call
foo(), in a static compile, foo() is not even in memory; with a
dynamically loaded library, foo() is in memory, waiting to be called
[1].
3) It's slower. Two reasons for this:
3a) It's linking at runtime instead of compile time. Yes,
there are mechanisms to mitigate this, like lazy runtime
linking (where a routine isn't resolved until it's called
for the first time) but that only helps over the long
term---it *still* has to be resolved at some point.
3b) Not all CPUs have PC (program counter) relative modes
(like the relatively obscure and little used x86---ha ha)
and because of this, extra code needs to be included to do
the indirection. So, your call to printf() is not:
call printf
but more like:
call printf@plt
printf@plt: jmp shared_lib_printf
where printf@plt is constructed at runtime, *for every call
in a shared library*. This indirection adds up. Worse,
global data in a shared library becomes a
multi-pointer-dereference mess.
To see how silly this can be on a modern Linux system, run
% ldd pick-your-executable
for each process running and see just how many of those "shared"
libraries are actually shared (in addition to the potential attack
surface).
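Relatedly, the printf@plt stubs from 3b can be inspected directly with
binutils (on Linux; section names can vary by toolchain):

  % objdump -d -j .plt pick-your-executable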
> > Because "x+1" can *never* be less than "x" (signed overflow? What's
> > that?)
> Hmm, well, people (me included in days gone by) tend to abuse signed
> scalars to simply get bigger integers. I really wish folks would
> embrace uintXX_t style ints ... problem solved, IMHO. It's right there
> for them in C99 to use.
Um, see the above malloc() example---it's not fixed there.
I use the uintXX_t types for interoperability---known file formats and
network protocols, and the plain types (or known ones like size_t)
otherwise.
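For what it's worth, the interoperability case looks something like this
(a sketch; the header layout is made up):

  #include <stdint.h>

  /* a wire-format header: fixed-width types are a necessity here */
  struct pkt_header
  {
    uint8_t  version;
    uint8_t  flags;
    uint16_t length;    /* byte order still needs handling (ntohs() etc.) */
    uint32_t sequence;
  };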
> > Except for, say, the Intel 432. Automatic bounds checking on that one.
> You can't always rely on the hardware, but perhaps that's your point.
It was a joke. Have you actually looked into the Intel 432? Granted,
there's not much about it on the Internet, but it was slow. And it was slow
even if you programmed in assembly!
> > It's not to say they can't test for it, but that's the problem---they
> > have to test after each possible operation.
> That's almost always the case when folks want rubber bumpers on C.
> That's really emblematic of my issues with that seemingly instinctual
> reaction some folks have to C.
You miss the point. I mean, an expression like:
y = 17 * x + 5
can't be optimized to:
mov edx,eax
shl eax,4
lea ecx,[eax+edx+5]
but has to be:
imul eax,17
jo error ; [2]
add eax,5
jo error
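In source form, that checked version corresponds to something like this
(a sketch using the GCC/Clang __builtin_*_overflow() extensions, not
standard C):

  /* returns 0 on success, -1 if 17 * x + 5 overflowed an int */
  int eval(int x, int *y)
  {
    int t;
    if (__builtin_mul_overflow(x, 17, &t))  /* the imul + jo pair */
      return -1;
    if (__builtin_add_overflow(t, 5, y))    /* the add + jo pair */
      return -1;
    return 0;
  }

Every intermediate result needs its own check, which is exactly the
"test after each possible operation" problem.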
For some perspective on this, there are two blogs I would recommend. The
first is "programming in the twenty-first century":
http://prog21.dadgum.com/
and the second on compiling and optimizing C code:
http://blog.regehr.org/
-spc (Sigh---called into work to debug a production problem with C and
Lua)
[1] How can a program that doesn't call foo() be enticed into calling
foo()? Return-oriented programming.
https://en.wikipedia.org/wiki/Return-oriented_programming
[2] Using INTO is slow:
http://boston.conman.org/2015/09/05.2
while JO isn't:
http://boston.conman.org/2015/09/07.1
mainly because JO can be branch-predicted and so the overhead of the
actual JO instruction is practically zero. On the other hand, the
fact that you have to use JO means more code, which could increase
pressure on the I-cache, *and* you have to use instructions which set
the O flag (LEA does not). It's this (having to use instructions that
set the O flag) that causes perhaps as much as a 5% penalty (worst
case), depending upon the code.