It was thus said that the Great Chuck Guzis once stated:
On 8/13/2006 at 11:53 PM Sean Conner wrote:
for (i = 0 ; i < MAX ; i++)
  foo[i] = 0;
when it could easily be replaced with:
memset(foo,0,sizeof(foo));
The argument some would give you is that if you had a compiler worth spit,
it would optimize away the index and reduce the above to:
memset(foo,0,sizeof foo);
i = MAX;
Then if the compiler saw that the value of i was re-initialized without
being read, it would get rid of the last statement. The explicit loop
statement is an easy way to explicitly state what's meant and does not rely
on the implementation of a library function.
Ah, but see, an ANSI C compiler is free to treat memset() "specially" if
you include <string.h>. In fact, ANSI C is free to treat any function in
the Standard C Library (assuming the appropriate header files are included)
specially, and usually anything in <string.h> is handled by inlining the
code.
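
As a minimal sketch of the two forms under discussion (the helper names
and the MAX value are made up for illustration, not taken from anyone's
real code), both functions below leave an int array in the same state.
The memset() version takes the element count explicitly, because
sizeof(foo) only yields the array size when foo is an actual array and
not a pointer parameter. All-bits-zero is a valid zero for integer types;
the standard makes no such promise for pointers or floating-point values,
so the loop is the safer choice for those.

```c
#include <stddef.h>
#include <string.h>

#define MAX 16  /* hypothetical array size, for illustration only */

/* Explicit loop: states the intent directly and uses no library call. */
void clear_with_loop(int *foo, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        foo[i] = 0;
}

/* memset() form: sets every byte of the n elements to zero.  A compiler
   that knows <string.h> is free to expand this inline. */
void clear_with_memset(int *foo, size_t n)
{
    memset(foo, 0, n * sizeof *foo);
}
```

Given "int foo[MAX];", calling clear_with_loop(foo, MAX) or
clear_with_memset(foo, MAX) should give the same result; whether either
one is faster depends entirely on the compiler and library in use.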
That's just the programmer not knowing the available functions, or
perhaps, coming from a system that doesn't have stat() available (it's not
part of the ANSI-C standard library, limiting the ways one can get the
size of a file portably, and each of them having problems). For more
horror stories, you can always check out http://thedailywtf.com/ .
Why not fseek( file, 0, SEEK_END); length = ftell( file); ?
ANSI-compatible and shouldn't involve any I/O.
First off, you still have to open the file to use fseek()/ftell(). But
okay, given that, how is the file opened? If in binary mode, yes, this will
work as expected as ftell() for a binary file will return the number of
characters from the beginning of the file [1], but for a text file, the value
returned has no defined meaning except that when passed to fseek() it will
position the file pointer back to where it was when ftell() was called.
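
For reference, here is a sketch of the suggested fseek()/ftell() approach
with the file opened in binary mode. The function name size_by_seek() is
hypothetical, and the sketch assumes a platform where seeking to the end
of a binary stream behaves the way most people expect, which is common
but is an assumption all the same.

```c
#include <stdio.h>

/* Hypothetical helper: returns the size reported by ftell() after
   seeking to the end of the file, or -1L on any failure. */
long size_by_seek(const char *path)
{
    FILE *fp = fopen(path, "rb");   /* "rb": binary mode matters here */
    long  size;

    if (fp == NULL)
        return -1L;

    if (fseek(fp, 0L, SEEK_END) != 0)
    {
        fclose(fp);
        return -1L;
    }

    size = ftell(fp);               /* ftell() itself returns -1L on error */
    fclose(fp);
    return size;
}
```

Note that the result is a long, so on systems where long is 32 bits this
cannot report sizes of 2GB or more. Opening the same file with "r" instead
of "rb" is where the trouble starts.
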
Why the distinction? Mainly, it comes down to how the end of line is
handled, and under Windows (derived from DOS), the following bit of code
will behave differently if the file is opened for text vs. binary mode:
pos = ftell(fpin);
c = fgetc(fpin);
if (c == '\n')
{
  /* pos and newpos are longs, since that is what ftell() returns */
  newpos = ftell(fpin);
  printf("End of line character is %ld characters long\n",newpos - pos);
}
If the C runtime detects a CR/LF pair in a text file, it will suck both up
and only return an LF, whereas on a binary file, it won't do that. And even
the standard says as much about ftell():
For a text stream, its file position indicator contains unspecified
information, usable by the fseek() function for returning the file
position indicator for the stream to its position at the time of
the ftell() call; the difference between two such return values is
not necessarily a meaningful measure of the number of characters
written or read.
Okay, so we need to make sure the file is opened in binary mode, but
reading the fine print on fseek():
A binary stream need not meaningfully support fseek() calls with
a whence value of SEEK_END.
So (if we're being pedantic about portability) we're stuck.
fseek(SEEK_END) is only fully defined on text streams, but the result of
ftell() is only meaningful on binary streams. Now, reading further about
fseek():
For a text stream, either offset shall be zero, or offset shall be
a value returned by an earlier call to the ftell() function on the
same stream and whence shall be SEEK_SET.
In retrospect, if the programmer *was* coding with ANSI C calls only, then
maybe it wasn't such a stupid thing to read through the entire file to get
the size (since I code under Unix, I'll use stat() and be done with it---if
the code gets ported to a system that doesn't support stat(), then maybe
I'll write my own stat() that will Do The Right Thing on whatever platform
it runs on).
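
A sketch of what "read through the entire file" might look like using
nothing but ANSI C calls follows. The name size_by_reading() is made up
for this example, and it is not the code from the original horror story.
It counts characters rather than asking the system for a size, so it
costs a full pass over the file.

```c
#include <stdio.h>

/* Hypothetical helper: counts the characters in a file by reading every
   one of them.  Returns the count, or -1L if the file cannot be opened
   or a read error occurs. */
long size_by_reading(const char *path)
{
    FILE *fp = fopen(path, "rb");
    long  count = 0;

    if (fp == NULL)
        return -1L;

    while (fgetc(fp) != EOF)
        count++;

    if (ferror(fp))     /* EOF came from an error, not end of file */
        count = -1L;

    fclose(fp);
    return count;
}
```

Even this has fine print: the standard allows an implementation to pad a
binary stream with trailing null characters, and the long counter limits
the size that can be reported. On the usual Unix and Windows systems
neither tends to bite for ordinary files, but it is not guaranteed.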
Such is the nature of writing portable code.
-spc (Granted, I'm being overly pedantic here, but if you want your code
to run predictably across platforms, you need to understand what
you can and can't do with C)
[1] Note: characters, not *bytes*. This is what the actual ISO C standard
states. What a "character" is is implementation-defined.