Re: `low slots'
- To: "Nelson H. F. Beebe" <beebe@math.utah.edu>
- Subject: Re: `low slots'
- From: Hans Aberg <haberg@matematik.su.se>
- Date: Mon, 6 Oct 1997 16:22:13 +0200
- Cc: math-font-discuss@cogs.susx.ac.uk
At 07:04 -0600 97/10/06, Nelson H. F. Beebe wrote:
>Hans Aberg writes
>>> there is nothing in the C standard demanding strings to be null
>>>terminated...
>
>That is incorrect. Section 2.2.1 of ANSI X3.159-1989 on p 11 says:
>
> A byte with all bits set to 0, called the {\em null character},
> shall exist in the basic execution character set; it is used
> to terminate a character string literal.
Strictly speaking, there are no strings in C, only types such as char*,
which can be used to point to strings. These strings may be null-terminated
(commonly called "C-strings") or not (handled by "memory operations"). The
standard library has routines such as strcpy() that support null-terminated
strings, but it also has routines such as memcpy() for handling
non-null-terminated strings.
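
As a minimal sketch of that difference (the names sample, SAMPLE_LEN,
c_string_length and copy_counted are hypothetical, chosen only for this
illustration): strlen() stops at the first zero byte, while memcpy() copies
exactly the number of bytes it is told to, zeros included.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Five bytes of data with a zero byte in the middle (hypothetical sample). */
const char sample[5] = { 'a', 'b', '\0', 'c', 'd' };
#define SAMPLE_LEN 5

/* Length as the null-terminated routines see it: scanning stops at the
   first zero byte. */
size_t c_string_length(const char *s)
{
    return strlen(s);
}

/* Copy len bytes with memcpy(), zero bytes included. The caller supplies
   the length; nothing is inferred from the contents. At most cap bytes
   are copied, and the number of bytes copied is returned. */
size_t copy_counted(char *dst, size_t cap, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    if (len > cap)
        len = cap;
    memcpy(dst, src, len);
    return len;
}
```

With this data, c_string_length(sample) sees only the part before the zero
byte, while copy_counted(buf, sizeof buf, sample, SAMPLE_LEN) transfers all
five bytes.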
>Significant portions of the C library, and virtually every non-trivial
>C program in existence, depend on this property.
So, if you program in C, there is nothing in the C standard forcing you
to use C-strings. In fact, if you do not want to use null-terminated
C-strings, an easy way around them is to write a C++ class that expands to
the C library memory operations; this is what I did. (But I think the new
C++ library <http://www.cygnus.com/misc/wp/index.html> has some string
classes in it.)
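
The class itself is not reproduced in this message. The same idea can be
sketched in plain C, under the assumption that a string is stored as a
length plus a byte buffer; the names counted_str, cs_init, cs_equal and
cs_free are hypothetical and not taken from any library.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-counted string: the length is stored explicitly,
   so the data may contain zero bytes anywhere. */
typedef struct {
    size_t len;
    unsigned char *data;
} counted_str;

/* Initialise s with a private copy of len bytes.
   Returns 0 on success, -1 if memory allocation fails. */
int cs_init(counted_str *s, const void *bytes, size_t len)
{
    s->len = 0;
    s->data = NULL;
    if (len == 0)
        return 0;
    s->data = malloc(len);
    if (s->data == NULL)
        return -1;
    memcpy(s->data, bytes, len);
    s->len = len;
    return 0;
}

/* Compare by length and contents with memcmp(); no terminator is used. */
int cs_equal(const counted_str *a, const counted_str *b)
{
    if (a->len != b->len)
        return 0;
    return a->len == 0 || memcmp(a->data, b->data, a->len) == 0;
}

/* Release the buffer and reset the string to empty. */
void cs_free(counted_str *s)
{
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

Only memcpy() and memcmp() touch the data here, so an embedded zero byte is
treated like any other byte.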
Of course, people starting to program in C often think they have to use
the C library routines for null-terminated strings, which is the reason a
lot of programs use them. Likewise, sloppily written C software can also
mishandle the byte 0xff (= -1 as a signed C char), because -1 is also the
usual value indicating end-of-file; but the end-of-file -1 is an int, on a
32-bit machine equal to 0xffffffff. So it is possible to get around that in
C alone, but I prefer using the C++ IOstreams library, which is much better
in such respects.
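
A minimal sketch of the 0xff pitfall, assuming a common implementation in
which EOF is -1 and 0xff converts to -1 as a signed char; the function names
are hypothetical. Keeping the result of getc() in an int lets the byte 0xff
(255) and end-of-file stay distinct.

```c
#include <stdio.h>

/* Count the 0xff bytes in a stream, reading until end-of-file.
   The result of getc() is kept in an int, which can hold both EOF and
   every byte value 0..255, so a 0xff byte does not end the loop. */
long count_ff(FILE *in)
{
    long n = 0;
    int c;
    while ((c = getc(in)) != EOF) {
        if (c == 0xff)
            n++;
    }
    return n;
}

/* What a sloppy loop effectively tests: the byte is first narrowed to a
   signed char and only then compared with EOF. Where 0xff converts to -1
   and EOF is -1 (an assumption about the implementation), this returns 1
   for the data byte 0xff. */
int looks_like_eof_sloppy(int byte)
{
    signed char narrowed = (signed char)byte;
    return narrowed == EOF;
}

/* The careful test compares the int as returned by getc(). */
int looks_like_eof_careful(int c)
{
    return c == EOF;
}
```

On platforms where plain char is unsigned the sloppy loop fails differently:
the comparison with EOF never succeeds, so the loop does not terminate.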
I know, for example, that the computer language Haskell
<http://haskell.org/> uses strings formed from Unicode 2-byte chars, and the
highly non-trivial implementations of it use C; so, just because you are
using C, you do not have to use null-terminated C-strings.
Hans Aberg
* AMS member: Listing <http://www.ams.org/cml/>
* Email: Hans Aberg <haberg@member.ams.org>