On 15 Oct 2011 at 11:25, Fred Cisin wrote:
howzbout:
not a COMPILER, but, . . .
function 9 of MS-DOS and CP/M, that uses '$' terminated strings!
There are many ways to store a string. Terminating character requires
a little bit of extra care, but are the others much better?
Sure--the strings that employ descriptors are particularly great--
there's no need to figure out the length of a string before doing
anything with it. On RISC systems with wide words, one can move a
string very quickly, without having to look at it byte by byte for a
terminator.
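A minimal sketch of what I mean, in C. The struct and function names
(strdesc, desc_copy) are made up for illustration and are not taken from
any particular compiler or runtime:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: a pointer plus an explicit length.
   No terminator is stored or needed. */
struct strdesc {
    const char *ptr;
    size_t      len;
};

/* Copy the described bytes into dst (capacity cap) and return the
   number of bytes copied.  The length is known before the first byte
   is touched, so memcpy is free to move whole words at a time rather
   than testing each byte for a terminator. */
size_t desc_copy(char *dst, size_t cap, struct strdesc s)
{
    assert(dst != NULL && s.ptr != NULL);
    size_t n = s.len < cap ? s.len : cap;   /* truncate to fit */
    memcpy(dst, s.ptr, n);
    return n;
}
```

Note that a string held this way can contain any byte value at all,
including '$' and NUL, since nothing is reserved as a delimiter.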
In compiling, string descriptors allow for the "pooling" of constants
having common substrings.
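For example, with the string:
"Oregon is bordered by Washington, Idaho, Nevada and California"
in the constant pool, constants such as "Washington" or "Idaho" need
no storage of their own; their descriptors simply point into the
longer string, so the substrings are shared without duplication.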
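Roughly, the pooling could look like the following C sketch. The names
(strdesc, pool, pool_lookup) are hypothetical, and a real compiler would
search a whole table of pooled constants rather than one string:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: pointer plus length, no terminator. */
struct strdesc {
    const char *ptr;
    size_t      len;
};

/* One constant already placed in the pool. */
const char pool[] =
    "Oregon is bordered by Washington, Idaho, Nevada and California";

/* If lit occurs inside the pooled constant, return a descriptor that
   points into the pool, so no new storage is used.  Otherwise return
   a descriptor with ptr == NULL, meaning the caller must add lit to
   the pool itself. */
struct strdesc pool_lookup(const char *lit)
{
    struct strdesc d = { NULL, 0 };
    const char *hit;

    assert(lit != NULL);
    hit = strstr(pool, lit);
    if (hit != NULL) {
        d.ptr = hit;
        d.len = strlen(lit);
    }
    return d;
}
```

A terminated string cannot do this for "Washington", because the byte
following it in the pool is a comma, not a terminator.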
I'm well aware of CP/M function 9--and the inability to print a "$"
using it.
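To illustrate the limitation, here is a small C emulation of a
'$'-terminated print. This is only a sketch of the convention, not the
actual BDOS code; the function name dollar_print and the output-buffer
interface are my own assumptions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical emulation of a '$'-terminated print call.  Copies
   characters from s into out until the first '$' (which is not
   copied) or until out is full, then NUL-terminates out for C's
   convenience.  Returns the number of characters "printed".
   The caller must make sure s really contains a '$'. */
size_t dollar_print(const char *s, char *out, size_t cap)
{
    size_t n = 0;

    assert(s != NULL && out != NULL && cap > 0);
    while (s[n] != '$' && n + 1 < cap) {
        out[n] = s[n];
        n++;
    }
    out[n] = '\0';
    return n;
}
```

Given "Price: $5.00$", the output stops after "Price: " because the
first '$' is taken as the terminator.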
CDC SCOPE/KRONOS/MACE used 6-bit characters. There were some
operating system functions that terminated character records on a 00
(octal) byte. Unfortunately, the 63-character set lacked a colon, so
one was introduced for octal 00 and the definition of a coded (i.e.
character) record altered to "terminated by zero in the low-order 12
bits of the final word". Then double colons created problems of
their own, for which no one had any satisfactory answers, except
"don't do it".
SCOPE 2 on the 7600 used length-prefix character records (W type) and
the problem was completely unknown to them. Furthermore, you could
query the device being read from ahead of time to obtain the length
of the current record. (I think DEC FORTRAN had that feature with
their Q format designator.)
How many new "C" programmers have made the mistake of thinking that
strlen() returns the total length of a string (including the
delimiter)?
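The usual place this bites is in allocating a copy. A short C sketch
(dup_string is a made-up name; it does what the common strdup() does):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate s on the heap.  strlen() counts the characters before the
   terminator, not the terminator itself, so the buffer needs len + 1
   bytes.  Allocating only strlen(s) bytes is the classic mistake.
   Returns NULL if malloc fails; the caller frees the result. */
char *dup_string(const char *s)
{
    size_t len;
    char  *copy;

    assert(s != NULL);
    len  = strlen(s);
    copy = malloc(len + 1);          /* +1 for the '\0' */
    if (copy != NULL)
        memcpy(copy, s, len + 1);    /* copy the terminator too */
    return copy;
}
```

For "hello", strlen() gives 5 while sizeof "hello" gives 6.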
Take the original example in C:
while( *a++=*b++);
Now execute the snippet with two pointers that create an overlapping
move, such that the null terminator of the source string is destroyed.
The string move eats its tail, clobbering everything in its path.
That would never happen with descriptor-based strings.
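Here is a C sketch of both cases. The first function is the same loop
with a safety limit added (my addition, so the demonstration stops
instead of running off the end of memory); the second is the counted
move a descriptor allows. Both names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The loop `while (*a++ = *b++);` with a limit on the number of bytes
   written.  Returns how many bytes were written, including the
   terminator if one was ever seen. */
size_t copy_to_nul(char *a, const char *b, size_t limit)
{
    size_t n = 0;

    assert(a != NULL && b != NULL);
    while (n < limit) {
        n++;
        if ((*a++ = *b++) == '\0')
            break;
    }
    return n;
}

/* Counted move: the length comes from the descriptor, so nothing
   depends on a terminator surviving, and memmove handles overlap. */
void move_counted(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    memmove(dst, src, len);
}
```

With a 16-byte buffer holding "abc", copy_to_nul(buf + 1, buf, 15)
overwrites the terminator with 'a' on the third store and then keeps
copying 'a' until the limit stops it. move_counted(buf + 1, buf, 3)
on a fresh buffer moves exactly three bytes and leaves "aabc".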
--Chuck