digitalmars.D - A solution to the keyword problem

Greg Smith <greg siliconoptix.com> writes:
What problem? IMHO, there are too many, and more to the point, there are
more than 20 of them which simply don't need to be keywords.
A couple of weeks ago I made the suggestion that all the built-in types
(int, double, wchar etc) should be predefined identifiers, like
'Object'; likewise, 'true' and 'false' do not need to be keywords.
There were some responses suggesting that there was no need for such a
change.

I just believe at a gut level that it's a bad idea to put things in the
keyword table when they can be done as predefined identifiers. I don't
know of any other well-thought-out language which does this. But, that's
not a very strong argument. So, I've been looking through the spec, and
have found some specific things which support a reduction of the number
of keywords.  Also, I have a simple suggestion which will allow a
workaround for the issues caused by the keywords.

So, weakest first:

(1) Version identifiers are in a separate namespace. If I write some
code with version(remote){...}, and later on, 'remote' becomes a
keyword, then not only will I have to change all the code, I'll need to
change the build scripts, too.

(2) the import statement, and module names, are tied to the file system.
I might create modules called 'supercomm.local' and 'supercomm.remote';
if 'local' or 'remote' are added to the language as keywords, I have
to rename the directories to fix it. If this is 3rd-party code supplied
as a library, I'm skunked.

(3) D can interface to C code. Unless the C code uses identifiers which
are D keywords. At particular risk of being C function or variable
names: align, bit, byte, delegate, export, final, interface, version.

Not all of these problems can be really solved by eliminating the
roughly 20% of keywords which stand in for predefined types or values,
of course; you just reduce the risk of a collision.
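
Point (1) can be made concrete with a minimal sketch. The identifier
'remote' comes from the example above; the function name transportName
is invented for illustration, and the command-line switch shown in the
comment is the DMD spelling.

```d
import std.stdio;

// Hypothetical helper: picks a transport based on a version identifier.
string transportName()
{
    version (remote)
    {
        return "remote";   // compiled in only with: dmd -version=remote app.d
    }
    else
    {
        return "local";    // default build, no -version switch given
    }
}

void main()
{
    writeln("transport: ", transportName());
}
```

The name appears once in the source, as version (remote), and once in
the build script, as -version=remote, so a future 'remote' keyword would
force edits in both places.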

So the suggestion is this: provide a lexical convention for quoting
identifiers, to prevent them from being recognized as keywords:

import supercomm.'local';  /* ... assuming 'local' became a keyword */

/*
* access to C function called 'final', which is a D keyword
*/
extern (C) int 'final'( struct foodle *p );

The proposed extension to the D lexer is that an 'identifier' of at
least two characters can be enclosed in single quotes; this will not
change its meaning as an identifier but will prevent it from being
recognized as a keyword. VHDL has this feature; it's very important to
VHDL since it's often necessary to give certain names to ports in VHDL
(even if those names are VHDL keywords), to interface to external
things. You can even have names starting with digits in VHDL, but they
must be quoted. The VHDL quoting convention is to use slashes:

variable /7up/: integer;  -- variable name starts with digit
variable /if/: integer;   -- variable name same as keyword
...
if /if/ > 0 then
/7up/ := /7up/ + 3;
end if;

The problem with slashes is you need to properly handle things like
"a=b/c/d;" I suspect, in VHDL, that the set of tokens which can legally
precede an identifier is disjoint from the set which can legally precede
the '/' operator, so the lexer can just look at the previous token.
However, in C/C++/D, ')', for one, can legally precede either of these.

Another possibility is to put a single letter in front of the string
quote, which is consistent with r"foo\bar" and x"0D0A", i.e.

import supercomm.w"local";

extern (C) int w"final"( struct foodle *p );

Personally I prefer the single-quote approach, but I don't think it
matters too much, as long as there's a way to do it. Another
possibility: <final>, but that presumes that expressions like a<b>c are
useless enough that you don't mind encumbering them with the need for
spaces.

Scripts which automatically translate C headers to D could detect the
reserved D words and automatically escape them; or even escape any C
identifier which is in danger of becoming a D keyword, since there's no
harm in escaping something that isn't actually a keyword.
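
A rough sketch of the escaping step such a translation script might
perform, assuming the single-quote syntax proposed above (which does not
exist in D today). The helper name escapeIfReserved and the table
riskyWords are made up for illustration; the table holds only the eight
words listed in point (3), not the full keyword list.

```d
import std.stdio;

// Only the eight words named in point (3); a real tool would use the
// complete keyword table from the D lexer specification.
immutable string[] riskyWords = [
    "align", "bit", "byte", "delegate",
    "export", "final", "interface", "version"
];

// Hypothetical helper: wrap a C identifier in the proposed single quotes
// when it collides with a reserved word; otherwise return it unchanged.
string escapeIfReserved(string ident)
{
    foreach (word; riskyWords)
    {
        if (word == ident)
            return "'" ~ ident ~ "'";
    }
    return ident;
}

void main()
{
    writeln(escapeIfReserved("final"));   // prints 'final'
    writeln(escapeIfReserved("foodle"));  // prints foodle
}
```

Escaping every identifier unconditionally would be just as valid under
the proposal; the table lookup only keeps the generated code readable.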

Another example of a separate namespace which is unnecessarily harmed by
defining built-in types as keywords: goto labels. If I have some code
where I do 'goto pchar', and pchar later becomes a new type and a new
keyword, this code will need to be changed. If 'pchar' is added as a new
pre-defined identifier, no harm will be done.

A while ago I was trying to build some old C++ code; a class had been
defined with member functions called 'and', 'or' and 'xor'. These have
been, fairly recently, added to the C++ language as keywords, presumably
for folks who don't have '^','|' and '&'. (There are some more to cover
all the operators with these characters in them). What happened was, I
got these baffling error messages, I ended up looking at the
preprocessor output to see if there was something screwy in there; I
eventually found the problem and created a header file like this

#define xor ident_xor
#define and ident_and
...

But, D doesn't have #define. Also, if the code I was working with had
been in the form of a 3rd party object module, it would have been
impossible to link to it without getting the compiler to recognize
'xor' etc as identifiers.
(turns out there's a gcc compiler option to suppress recognition of
these silly keywords, BTW).

So to summarize the suggestion:

- keywords can get you into trouble, especially when you are
interfacing with other environments. D interfaces with C, and the file
system (module names), in this manner.

- the more keywords you have, the more likely to get into this
trouble. Keywords not necessary for parsing should be removed and
implemented as predefined identifiers, like 'Object' is already. If you
don't want it to be possible to redefine them in certain contexts, then
implement that as needed (no need to prohibit 'wchar' from being used as
a goto label, for instance, which is currently impossible). If builtin
types were identifiers, I could "import mmedia.real.realaudio;", without
redefining 'real' in any harmful way.

- a lexical convention should be added to allow the construction of
identifiers whose patterns are otherwise reserved as keywords; as a
workaround when you run into a problem.

One more thing: if the lexical convention is adopted, then it makes a
big difference whether or not the built-in types become identifiers.
Because, for instance 'Object' and unquoted Object would mean the same
thing; likewise 'creal' and unquoted creal would mean the same thing if
creal were a predefined identifier; if creal were a keyword, 'creal' and
creal would of course be completely independent.
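
The difference can be sketched with a toy token classifier. This is only
an illustration of the proposed rule, not how any D compiler is written:
the names Kind, Token, isKeyword and classify are invented, the keyword
table has three entries, and the characters between the quotes are not
validated.

```d
import std.stdio;

enum Kind { keyword, identifier }

struct Token
{
    Kind kind;
    string name;
}

// Toy keyword table; creal stands in for any built-in type keyword.
bool isKeyword(string s)
{
    return s == "while" || s == "final" || s == "creal";
}

// Hypothetical lexer rule for the proposal: 'name' with two or more
// characters between the single quotes is always an identifier; a bare
// word is checked against the keyword table first.
// Does not validate the characters between the quotes.
Token classify(string text)
{
    if (text.length >= 4 && text[0] == '\'' && text[$ - 1] == '\'')
        return Token(Kind.identifier, text[1 .. $ - 1]);
    return Token(isKeyword(text) ? Kind.keyword : Kind.identifier, text);
}

void main()
{
    writeln(classify("Object"));    // identifier
    writeln(classify("'Object'"));  // the same identifier
    writeln(classify("creal"));     // keyword
    writeln(classify("'creal'"));   // identifier, unrelated to the keyword
}
```

The minimum of two characters between the quotes keeps an ordinary
character literal such as 'a' from being read as a quoted identifier.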

-- Greg

Jul 25 2005
Greg Smith wrote:
What problem? IMHO, there are too many, and more to the point, there are
more than 20 of them which simply don't need to be keywords.
A couple of weeks ago I made the suggestion that all the built-in types
(int, double, wchar etc) should be predefined identifiers, like
'Object'; likewise, 'true' and 'false' do not need to be keywords. There
were some responses suggesting that there was no need for such a change.

One more thing: if the lexical convention is adopted, then it makes a
big difference whether or not the built-in types become identifiers.
Because, for instance 'Object' and unquoted Object would mean the same
thing; likewise 'creal' and unquoted creal would mean the same thing if
creal were a predefined identifier; if creal were a keyword, 'creal' and
creal would of course be completely independent.

-- Greg

I could see doing this if it were a part of a more wide reaching
change which would turn the built-in types into Objects (well,
magic objects, that were literally present rather than pointed
to...same internal representation, but different language
mapping), resulting in such features as, say, 10.mul(7) being the
equivalent of 10 * 7, and allowing one to create classes
(NON-magic classes) descending from them.  This would also need
to somehow harmonize array so that it could be a normal class,
probably by allowing things like optAssign and optSliceAssign (do
I remember those names correctly...I always need to look them up)
to be redefined and used on classes that AREN'T strictly arrays
as we now know them.

The trick would be doing all that without paying an unacceptable
penalty in terms of either ambiguity or performance.  (Or
development time.)

In other words...I don't see the utility of what you are
proposing as an isolated change, and including it where it
appears to fit won't happen until AFTER version 1.0 is out.
(Probably not then, as I don't think that Walter likes the
"Everything is an object" school of language design.  I may like
it, but I consider it a lot less important than I do many other
features.)

Jul 25 2005
In article <dc40t4$2o2k$1 digitaldaemon.com>, Charles Hixson says...

I second this thought. Change alone: not worth it. Change with added benefit of
complete type homogeneity: very much worth it.

--AJG.

Jul 25 2005
Greg Smith <greg siliconoptix.com> writes:
Charles Hixson wrote:
I could see doing this if it were a part of a more wide reaching change
which would turn the built-in types into Objects (well, magic objects,
that were literally present rather than pointed to...same internal
representation, but different language mapping), resulting in such
features as, say, 10.mul(7) being the equivalent of 10 * 7, and allowing
one to create classes (NON-magic classes) descending from them.  This
would also need to somehow harmonize array so that it could be a normal
class, probably by allowing things like optAssign and optSliceAssign (do
I remember those names correctly...I always need to look them up) to be
redefined and used on classes that AREN'T strictly arrays as we now know
them.

The trick would be doing all that without paying an unacceptable penalty
in terms of either ambiguity or performance.  (Or development time.)

In other words...I don't see the utility of what you are proposing as an
isolated change, and including it where it appears to fit won't happen
until AFTER version 1.0 is out. (Probably not then, as I don't think
that Walter likes the "Everything is an object" school of language
design.  I may like it, but I consider it a lot less important than I do
many other features.)

I can't speak to what you are suggesting; this is a 'type/class
unification', I just don't know enough yet about D class semantics to
comment. A similar change was made to Python in versions 2.2 and 2.3,
and it was a great enhancement. It also took a lot of planning and
implementing in order to work well without breaking too much existing code.

However, I don't see how this major change is connected at all to the
minor change I am suggesting. Currently, the built-in type 'int' is
indicated by a keyword 'int'; I am suggesting that the keyword (and all
the other 'type' keywords) be removed, and pre-defined identifiers be
implemented instead. This would have *zero* effect on existing D code.

This change could be done without changing the semantics of the types.
It would also be perfectly possible and reasonable to change the
semantics of the types as you suggest, while leaving the 20 keywords in
place. This would not, however, address the issues I raised in my post.
I would appreciate it if someone could address these specifically rather
than just saying, "I don't see the utility". I could be utterly wrong
about the problems with the C interface, for instance, and I'd
appreciate knowing why.

I can see how my mention of 'Object' could cause confusion; since
'Object' can be used as a base class, and 'int' can't; and 'Object' is a
pre-defined identifier and 'int' isn't.

But they're both predefined types; I'm *not* suggesting that int, real,
etc. be changed into predefined identifiers so that they can be used as
base classes; I'm suggesting that it be done because:

(a) there's absolutely no reason whatsoever, none, anyone has given,
or that I can think of, that they should be keywords as opposed to
predefined identifiers, except for "because it's already done that way".
If there is a reason it was done that way, I'd be interested in hearing it.

(b) It would not break any code whatsoever. There is no effect on the
semantics of any currently legal D code, since the new identifier 'int'
has the same semantics as the old keyword 'int'. The only downside is
the effort to change the compiler.

(c) There are at least a few good reasons why types should *not* be
keywords; it would make more identifiers available for unrelated
namespaces such as goto labels and module (directory path) names.
It would provide a way to add new built-in types in a uniform manner,
without breaking old code. Certain error messages are much clearer with
this change. Refer to previous posts.

(d) "because that's the way it's already done", to my mind, should
not apply to 'tabula rasa' designs which are in such a preliminary
stage, especially for relatively minor changes to the compiler. I'm
really guessing at how big a change it is, it's certainly not trivial;
but the fact that the language already has a predefined identifier for
the 'Object' type suggests to me that this is mostly a matter of
deleting a bunch of code in the parser and lexer, and adding some new
code, near where 'Object' is defined, to predefine the new identifiers.
Since you can define
'alias int myint;'
and then use the identifier 'myint' in place of keyword 'int'
everywhere, it is clear that the parser can already handle identifiers
for ints, so it's largely a matter of deleting the grammar rules which
recognize the keyword tokens.
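
The alias observation in (d) can be checked with a short sketch. The
function name twice is invented for illustration; 'alias int myint;' is
the spelling used above, and newer compilers also accept
'alias myint = int;'.

```d
import std.stdio;

alias int myint;   // an ordinary identifier that names the built-in type

// Hypothetical function written entirely in terms of the alias.
myint twice(myint x)
{
    return x * 2;
}

void main()
{
    myint n = 21;
    int plain = twice(n);             // myint and int are the same type
    static assert(is(myint == int));  // checked at compile time
    writeln(plain);                   // prints 42
}
```

Nothing here shows how large the compiler change would be; it only shows
that an identifier can already stand wherever the keyword 'int' can.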

And again, the suggestion for lexically escaping identifiers is almost
completely independent of all of this stuff, and is, I feel, a decent
workaround for problems caused by keywords clashing with external names.

-- Greg

Jul 26 2005
"Regan Heath" <regan netwin.co.nz> writes:
For what it's worth I like your idea. I have nothing further to add to the
argument in either direction, however. I wonder what Walter thinks; it
appears to me that he is the only person in a position to have all the
information for a decision on this.

Regan



Jul 26 2005
Hasan Aljudy <hasan.aljudy gmail.com> writes:
I have nothing against that.

Maybe you would want to change the terminology you are using, because in

keyword: for me (and I'm an idiot, btw) means an identifier that has a
special meaning. >>> therefore it makes perfect sense that things like
"int" and "this" be keywords.

keyword: when you speak of it, you actually mean how the compiler deals
with this "special identifier"

So when you say remove keywords, you aren't suggesting to make D
dynamically typed or anything like that.

^
^
That's what confused me in the previous thread. Maybe it's just because
I don't know a lot about compilers.

Although one concern remains: syntax highlighting!
I'm thinking most editors with D syntax highlighting would color 'real' in:
#import something.real.somethingelse;



Jul 28 2005
Greg Smith <greg siliconoptix.com> writes:
Hasan Aljudy wrote:

I have nothing against that.

Maybe you would want to change the terminology you are using, because in

keyword: for me (and I'm an idiot, btw) means an identifier that has a
special meaning. >>> therefore it makes perfect sense that things like
"int" and "this" be keywords.

D. But if you look in the list of keywords (
http://www.digitalmars.com/d/lex.html#keyword ) , you won't find it there.

'Keyword' is pretty standard terminology in the compiler business, and
it is tied to the almost universal concept of analyzing source code by a
two-stage process of (a) lexical analysis (lexing) and (b) grammatical
analysis (parsing).  A 'keyword' is recognized in the lexer, prior to
parsing. Predefined identifiers like 'Object' are recognized after
parsing, and, unlike keywords, their interpretation may depend on name
scoping rules. Since name scopes are generally defined by the structure
of the program, which is inferred by the parser, it's very cumbersome to
apply scoping rules before parsing.

Because of this distinction, you can have a D local variable called
'Object' but not one called 'while'.

I suspect that if I used different terminology, the result would be
greater confusion overall.

- Greg

Aug 03 2005
J C Calvarese <technocrat7 gmail.com> writes:
In article <dcr2se$1hv2$1 digitaldaemon.com>, Greg Smith says...
Hasan Aljudy wrote:

I have nothing against that.

Maybe you would want to change the terminology you are using, because in

keyword: for me (and I'm an idiot, btw) means an identifier that has a
special meaning. >>> therefore it makes perfect sense that things like
"int" and "this" be keywords.

D. But if you look in the list of keywords (
http://www.digitalmars.com/d/lex.html#keyword ) , you won't find it there.

'Keyword' is pretty standard terminology in the compiler business, and
it is tied to the almost universal concept of analyzing source code by a
two-stage process of (a) lexical analysis (lexing) and (b) grammatical
analysis (parsing).  A 'keyword' is recognized in the lexer, prior to
parsing. Predefined identifiers like 'Object' are recognized after
parsing, and, unlike keywords, their interpretation may depend on name
scoping rules. Since name scopes are generally defined by the structure
of the program, which is inferred by the parser, it's very cumbersome to
apply scoping rules before parsing.

I'm not a compiler writer, so examples help me understand what you're suggesting.

Because of this distinction, you can have a D local variable called
'Object' but not one called 'while'.

I didn't even realize that we could do this:

import std.stdio;
int main()
{
int Object;
Object = 1;
writefln(Object);
return true;
}

It seems like a clever way of coding that I'd probably end up shooting myself in
the foot with. So now, you want to be able to do this:

import std.stdio;

int main()
{
double int;
int = 1;
writefln(int);
return true;
}

Seems like a bad idea to me. I don't see a big problem with the compiler
prohibiting this. Yes, it could be a pain if you're porting a library written in
another language that uses "int" as an identifier, but I don't expect that
happens too often.

I suspect that if I used different terminology, the result would be
greater confusion overall.

Probably. It's a confusing topic anyway.

- Greg

That's my 1.99999 cents.

jcc7

Aug 03 2005
"Regan Heath" <regan netwin.co.nz> writes:
On Wed, 3 Aug 2005 20:01:21 +0000 (UTC), J C Calvarese
<technocrat7 gmail.com> wrote:
Maybe you would want to change the terminology you are using, because
in

keyword: for me (and I'm an idiot, btw) means an identifier that has a
special meaning. Therefore it makes perfect sense that things like
"int" and "this" be keywords.

D. But if you look in the list of keywords (
http://www.digitalmars.com/d/lex.html#keyword ) , you won't find it
there.

'Keyword' is pretty standard terminology in the compiler business, and
it is tied to the almost universal concept of analyzing source code by a
two-stage process of (a) lexical analysis (lexing) and (b) grammatical
analysis (parsing).  A 'keyword' is recognized in the lexer, prior to
parsing. Predefined identifiers like 'Object' are recognized after
parsing, and, unlike keywords, their interpretation may depend on name
scoping rules. Since name scopes are generally defined by the structure
of the program, which is inferred by the parser, it's very cumbersome to
apply scoping rules before parsing.

I'm not a compiler writer, so examples help me understand what you're
suggesting.

I've written a simple C parser. I recommend it to anyone who wants an
interesting (and, assuming no experience, challenging) exercise.

Because of this distinction, you can have a D local variable called
'Object' but not one called 'while'.

I didn't even realize that we could do this:

import std.stdio;
int main()
{
int Object;
Object = 1;
writefln(Object);
return true;
}

It seems like a clever way of coding that I'd probably end up shooting
myself in the foot with. So now, you want to be able to do this:

import std.stdio;

int main()
{
double int;
int = 1;
writefln(int);
return true;
}

Seems like a bad idea to me. I don't see a big problem with the compiler
prohibiting this. Yes, it could be a pain if you're porting a library
written in another language that uses "int" as an identifier, but I
don't expect that happens too often.

One of the points made was that you can still prohibit the use above, if
you want, as part of the 'parser' (not the lexer), the advantage being
that you have more control over where you allow it and where you prohibit
it.

Personally I don't have a problem with:

double int;
int = 1;
writefln(int);

because it's clear when you read any of those lines, together or stand
alone, that 'int' is not a type but a variable. This is because, you, like
the parser can take the context of 'int' into consideration when you read
the lines.

About the worst bug I can think of would be if you meant to type "int a =
1;" and accidentally missed the "a", getting "int = 1;".  But that would
cause an error in all situations except where there was a variable called
"int" present.  So, it's really no different to any other typo except that
it can occur in a variable declaration (which can occur just about
anywhere, anyway). Can anyone think of a potential bug which becomes
'more' likely because of this change?

If the keywords were removed then yes, code like that shown above with
'int' could become legal, unless Walter decided to prohibit it in the
parser. The advantage would be a change to the D grammar: it would
get smaller and simpler.

Another point made was that as a result you get more descriptive error
messages for less work (you don't have to code special cases in the
lexer). Some examples were given in a previous thread.

The fact that conversion from another language, that allows "this" or any
other D keyword, would become easier is another advantage.

In short, more control, better errors, other advantages and no
disadvantages (except some work for Walter to implement the change) that I
can see.

Regan

Aug 03 2005
J C Calvarese <technocrat7 gmail.com> writes:
In article <opsuydw2lk23k2f5 nrage.netwin.co.nz>, Regan Heath says...
On Wed, 3 Aug 2005 20:01:21 +0000 (UTC), J C Calvarese
<technocrat7 gmail.com> wrote:

I'm not a compiler writer, so examples help me understand what you're
suggesting.

I've written a simple C parser. I recommend it to anyone who wants an
interesting (and, assuming no experience, challenging) exercise.

I'm much too lazy for that. ;)

Because of this distinction, you can have a D local variable called
'Object' but not one called 'while'.

I didn't even realize that we could do this:

import std.stdio;
int main()
{
int Object;
Object = 1;
writefln(Object);
return true;
}

It seems like a clever way of coding that I'd probably end up shooting
myself in the foot with. So now, you want to be able to do this:

import std.stdio;

int main()
{
double int;
int = 1;
writefln(int);
return true;
}

Seems like a bad idea to me. I don't see a big problem with the compiler
prohibiting this. Yes, it could be a pain if you're porting a library
written in another language that uses "int" as an identifier, but I
don't expect that happens too often.

One of the points made was that you can still prohibit the use above, if
you want, as part of the 'parser' (not the lexer), the advantage being
that you have more control over where you allow it and where you prohibit
it.

Well, if we're not prohibiting this kind of crazy, I don't understand why we'd
undertake the effort. Are we trying to speed up a compiler that's already
blazingly fast?

Personally I don't have a problem with:

double int;
int = 1;
writefln(int);

because it's clear when you read any of those lines, together or stand
alone, that 'int' is not a type but a variable. This is because, you, like
the parser can take the context of 'int' into consideration when you read
the lines.

It's clear to me that who ever wrote that example was a masochist (oops, I wrote
that). The context shows that something fishy is going on. :)

About the worst bug I can think of would be if you meant to type "int a =
1;" and accidentally missed the "a", getting "int = 1;".  But that would
cause an error in all situations except where there was a variable called
"int" present.  So, it's really no different to any other typo except that

Right. Well, I'm worried about when a variable called "int" is present.

it can occur in a variable declaration (which can occur just about
anywhere, anyway). Can anyone think of a potential bug which becomes
'more' likely because of this change?

Hey, we could make semi-colons optional, too. :)

If the keywords were removed then yes, code like that shown above with
'int' could become legal, unless Walter decided to prohibit it in the
parser. The advantage would be a change to the D grammar: it would
get smaller and simpler.

Another point made was that as a result you get more descriptive error
messages for less work (you don't have to code special cases in the
lexer). Some examples were given in a previous thread.

There's always cost/benefit issues. Are you sure that the cost of changing the
innards of the compiler outweigh the benefit of possibly better error messages?
I'm sure I don't know. Even if we do get better error messages, though, it seems
like a detour taking us farther away from D 1.0.

The fact that conversion from another language, that allows "this" or any
other D keyword, would become easier is another advantage.

So then we could call a method "this". Could we use the "this" of that method's
class? :x I think keywords serve a purpose ("This identifier is off-limits"). It
hurts my head to consider all of the possible pitfalls.

In short, more control, better errors, other advantages and no
disadvantages (except some work for Walter to implement the change) that I
can see.

Regan

Well, you might be right, but I'm unconvinced. If Walter wants to do it, he can
do it. If someone else wants to try it out, that's what GDC is for:
http://www.prowiki.org/wiki4d/wiki.cgi?GdcHacking

jcc7

Aug 03 2005
Greg Smith <greg siliconoptix.com> writes:
J C Calvarese wrote:

I think keywords serve a purpose ("This identifier is off-limits"). It
hurts my head to consider all of the possible pitfalls.

'mmedia.real.realaudio', simply because 'real' is a keyword and the
parser would balk. If 'real' were a predefined identifier, this would
work, and would not create a harmful redefinition of 'real'.

In my view, "This identifier is off-limits" is precisely the collateral
damage *caused* by keywords, *not* their purpose. They serve to provide
a set of readable punctuation marks to guide the parser, and the
downside is that they are reserved in all possible contexts, whereas
normal identifiers are scoped.

This is more of an issue when you consider the likely possibility that
new built-in types will be added to the language in the future; if they
are added as keywords, they will break existing code; but if they are
added as predefined identifiers, they won't.

Yes, it's easy to laugh at silly examples where 'int' is used as a local
variable of type 'char'.  We could sit here all day writing dangerously
misleading code, without having to propose language changes to make that
possible. "alias wchar rea1;" comes to mind... You'd feel less silly,
and more p-d off, if you had used 'rchar' as a perfectly good local
variable name, or struct member name, and it later became a new built-in
type (and a new keyword). And by the way, it's possible to redefine
'integer' in Pascal, and nobody has ever seemed to complain of their
head hurting, or even notice, for that matter. They just don't do it.
[Ada too, I think]. I have never once proposed that anybody should be
encouraged to redefine int; in fact, this can be prohibited without
making it a keyword.

There's IMHO an issue in D, arising from the fact that there are a lot
of keywords; and while I believe that more than 20 of them are
unnecessary (all the type names, and true/false), even if they are
eliminated there will still be a problem as follows:

extern(C) int delegate( foo*p, int id );	// Can't do it!

I simply can't interface to this existing C function since its name is a
D keyword. Almost 100 C identifiers are simply 'off-limits', and there
is no real solution to this unless you can modify the C code.

I've previously proposed that a lexical convention be added
to escape identifiers from keyword recognition, e.g.

extern(C) int 'delegate'( foo*p, int id );

...
if( can_delegate )  'delegate' ( foop, 0); // call C function

int 'if' = 0;   'if'++;   // now possible, if you really want to.
char x = 'c';	// still a char constant if only 1 char

import commlib.'interface'.localapi;	// interface is a keyword

Or, w"delegate".

This suggestion is separate from the suggestion that type names should
not be keywords; either could be adopted without the other.
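
As a rough illustration of how small the lexer change could be, here is a C++ sketch of the quoting rule. It is a toy under stated assumptions: `classify`, `Kind`, `Token` and the four-entry keyword table are hypothetical names, the input is assumed to be already split into lexemes, and escape handling is reduced to "anything starting with a backslash is a character constant". A quoted single character stays a char constant, and any longer quoted text becomes a plain identifier even if it spells a keyword.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

enum class Kind { Keyword, Identifier, CharLiteral };

struct Token {
    Kind kind;
    std::string text;  // spelling with any surrounding quotes removed
};

// Classify one already-isolated lexeme.
Token classify(const std::string& lexeme) {
    assert(!lexeme.empty());
    static const std::unordered_set<std::string> keywords = {
        "delegate", "if", "interface", "this"};

    const bool quoted = lexeme.size() >= 3 &&
                        lexeme.front() == '\'' && lexeme.back() == '\'';
    if (quoted) {
        const std::string inner = lexeme.substr(1, lexeme.size() - 2);
        // 'c' and escapes such as '\n' remain character constants.
        if (inner.size() == 1 || inner.front() == '\\')
            return {Kind::CharLiteral, inner};
        // Anything longer is an identifier; the keyword table is never consulted.
        return {Kind::Identifier, inner};
    }
    if (keywords.count(lexeme)) return {Kind::Keyword, lexeme};
    return {Kind::Identifier, lexeme};
}
```

Under this rule the quotes are not part of the name: the quoted and unquoted spellings of a non-keyword produce the same identifier token, so only the symbol name 'delegate' would reach the linker.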

So then we could call a method "this". Could we use the "this" of
that method's class? :x

identifier, it would not be so simple, since 'this' in the class
namespace would be the constructor. I don't know enough about all the
details of D classes to comment further.
In C++, it would be possible to remove 'this' from the keyword table and
implement as a predefined formal parameter of member funcs. And the
answer to your question would be "Yes, and with no extra trouble than is

It would work like this:

class CLS  {	// mutant C++, with 'this' not a keyword
    int f1(CLS *); int f2(); int f3();
    int this();  // this has no special meaning outside a mem func body
};
int CLS ::f1(CLS  *f3)
// note: parameter named f3 hides CLS::f3 as per normal C++
// implicit parameter 'this' hides member 'this' in the same way
{
    f2();	// member func call
    this->f3();   // can't just use f3(),  it's hidden by parameter
    this->this(); // likewise, when calling CLS::this()
    f3->this();	// call member func of other CLS object
    f3->f3();
}

The call to 'CLS::this()' has the same issue, and solves it in the same
way, as the call to CLS::f3. My rather convoluted point is that whatever
difficulty is presented here is already present in the language and can
be avoided in the same way you are already avoiding it. I.e., you
probably try to avoid giving class mem func params the same name as
class members, so don't go using 'this' as a member name just because
you can.

But if you had some  old C code that defined a struct member 'this', you
wouldn't have to change that to convert it into our mutant C++, whereas
with real C++, you do.
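
The hiding rule this argument leans on is ordinary, standard C++ and can be checked today without any mutant compiler. The sketch below uses hypothetical names (`Cls`, `id`, and the return values are made up) and shows only the real half of the analogy: a parameter named f3 hides the member function f3, and `this->f3()` still reaches the member.

```cpp
#include <cassert>

struct Cls {
    int id;
    int f2() const { return 100; }
    int f3() const { return id; }
    int f1(const Cls* f3) const;   // parameter deliberately named like a member
};

int Cls::f1(const Cls* f3) const {
    assert(f3 != nullptr);
    // Inside this body a bare 'f3' names the parameter, not Cls::f3.
    return f2()               // unhidden member: plain call works
         + 10 * this->f3()    // hidden member: must go through 'this'
         + f3->f3();          // member f3 of the *other* object
}
```

With `Cls a{1}` and `Cls b{2}`, the call `a.f1(&b)` combines 100 from f2, 10 from a's own f3, and 2 from b's f3. The mutant-C++ member called 'this' would be reached through the same qualified form.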

-- Greg

Aug 04 2005
Hi,

I think keywords serve a purpose ("This identifier is off-limits"). It
hurts my head to consider all of the possible pitfalls.

'mmedia.real.realaudio', simply because 'real' is a keyword and the
parser would balk. If 'real' were a predefined identifier, this would
work, and would not create a harmful redefinition of 'real'.

There are reasons for that. First of all, IMHO, it would be fairly confusing to
allow such a thing. Second, what happens if you declare something like min or
max in your new 'real' module:

ulong m = real.max; // Is this your module's or the primitive's?

In my view, "This identifier is off-limits" is precisely the collateral
damage *caused* by keywords, *not* their purpose. They serve to provide
a set of readable punctuation marks to guide the parser, and the
downside is that they are reserved in all possible contexts, whereas
normal identifiers are scoped.

Keywords are scoped too. Their scope is universal. Where, pray tell, would
declaring a variable called "int" be a good idea? IMO nowhere, thus the scope of
the keyword is universal for a good reason.

This is more of an issue when you consider the likely possibility that
new built-in types will be added to the language in the future; if they
are added as keywords, they will break existing code; but if they are
added as predefined identifiers, they won't.

You can't guarantee this.

Yes, it's easy to laugh at silly examples where 'int' is used as a local
variable of type 'char'.  We could sit here all day writing dangerously
misleading code, without having to propose language changes to make that
possible.

One of the goals of any good language should be to reduce the possibility of
writing such dangerous code. Just because you can use _other_ features to write
dangerous code doesn't mean we need more such features. Two wrongs don't make a
right. Declaring a variable named "int" or "char" is just plain wrong IMO. If
the language prevents it, all the better.

"alias wchar rea1;" comes to mind... You'd feel less silly,
and more p-d off, if you had used 'rchar' as a perfectly good local
variable name, or struct member name, and it later became a new built-in
type (and a new keyword).

Since this is all about identifiers, you can very easily do a global search and
replace. Normally, such replaces are not that simple if it's something like a
string, or a construct, or what have you. But since it's an identifier, it's
exceedingly simple to replace it.

And by the way, it's possible to redefine
'integer' in Pascal, and nobody has ever seemed to complain of their
head hurting, or even notice, for that matter. They just don't do it.

"Nobody" has ever complained? Is this a fact?

"They" just don't do it? Can you guarantee this hasn't happened and caused bugs?

[Ada too, I think]. I have never once proposed that anybody should be
encouraged to redefine int; in fact, this can be prohibited without
making it a keyword.

You are contradicting yourself here. Do you want "int" to be a valid variable
name or not? If yes, then that's a bad idea. If no, then what's with all the
"evidence" that this is not an abominable idea?

There's IMHO an issue in D, arising from the fact that there are a lot
of keywords; and while I believe that more than 20 of them are
unnecessary (all the type names, and true/false), even if they are
eliminated there will still be a problem as follows:

extern(C) int delegate( foo*p, int id );	// Can't do it!

I simply can't interface to this existing C function since its name is a
D keyword. Almost 100 C identifiers are simply 'off-limits', and there
is no real solution to this unless you can modify the C code.

A language has keywords. This is a fact of life. I understand your desire to
reduce the number of keywords, but at what cost? I see a minuscule benefit
(slightly higher compatibility with C), at a huge cost (ambiguity, Walter's
time, potential for dangerous code, etc.).

I've previously proposed that a lexical convention be added
to escape identifiers from keyword recognition, e.g.

This suggestion is separate from the suggestion that type names should
not be keywords; either could be adopted without the other.

This makes a _lot_ more sense than allowing people to name their functions
"delegate". You should push for this suggestion instead.

So then we could call a method "this". Could we use the "this" of
that method's class? :x

Allowing redefinition of "this" is insane IMO.

But if you had some  old C code that defined a struct member 'this', you
wouldn't have to change that to convert it into our mutant C++, whereas
with real C++, you do.

What about the simple suggestion to allow quoted identifiers?

extern (C) 'this'(int foo); // for a function called 'this'.

Cheers,
--AJG.

Aug 04 2005
Greg Smith <greg siliconoptix.com> writes:
AJG wrote:

Hi,

I think keywords serve a purpose ("This identifier is off-limits"). It
hurts my head to consider all of the possible pitfalls.

But off-limits *everywhere*? I can't have a module called
'mmedia.real.realaudio', simply because 'real' is a keyword and the
parser would balk. If 'real' were a predefined identifier, this would
work, and would not create a harmful redefinition of 'real'.

There are reasons for that. First of all, IMHO, it would be fairly confusing to
allow such a thing. Second, what happens if you declare something like min or
max in your new 'real' module:

ulong m = real.max; // Is this your module's or the primitive's?

import mmedia.real.realaudio;

ulong m = real.max; // Existing real
ulong m2 = mmedia.real.realaudio.max;	// from module

or just "m2=max;", as I understand import to work. I have not redefined
real in the global scope.

This is what scopes are for.

It's not confusing to *allow* something, what's confusing is what people
do with it. Do you think this change is going to turn D from something
in which no confusing code is possible into something which is a morass
of unavoidable confusion?

In my view, "This identifier is off-limits" is precisely the collateral
damage *caused* by keywords, *not* their purpose. They serve to provide
a set of readable punctuation marks to guide the parser, and the
downside is that they are reserved in all possible contexts, whereas
normal identifiers are scoped.

Keywords are scoped too. Their scope is universal. Where, pray tell, would
declaring a variable called "int" be a good idea? IMO nowhere, thus the scope of
the keyword is universal for a good reason.

A keyword is actually scoped more universally than a variable. You can't
do version(keyword){}, or extern(keyword) int foo();, whereas normal
identifiers can be used in those contexts without being defined or even
mentioned anywhere else; and even if they are mentioned in other
contexts there is no connection between a version tag 'xyz' and a
language identifier (type, variable, etc) 'xyz'.

I can't see how using

version(real){ }
version(emulation){ }

... should be particularly harmful. And again, it's not the existing
keywords that are the issue, it's the new ones that will be defined later.

This is more of an issue when you consider the likely possibility that
new built-in types will be added to the language in the future; if they
are added as keywords, they will break existing code; but if they are
added as predefined identifiers, they won't.

You can't guarantee this.

will hide the predefined definition. This is what scoping was designed
to do; to allow local namespaces to be protected from changes to global
namespaces.

Yes, it's easy to laugh at silly examples where 'int' is used as a local
variable of type 'char'.  We could sit here all day writing dangerously
misleading code, without having to propose language changes to make that
possible.

One of the goals of any good language should be to reduce the possibility of
writing such dangerous code. Just because you can use _other_ features to write
dangerous code doesn't mean we need more such features. Two wrongs don't make a
right. Declaring a variable named "int" or "char" is just plain wrong IMO. If
the language prevents it, all the better.

to do a few additional confusing things. Therefore it shouldn't be done,
despite the benefits. That doesn't wash. Besides, you can still make it
illegal to redefine 'int'.

"alias wchar rea1;" comes to mind... You'd feel less silly,
and more p-d off, if you had used 'rchar' as a perfectly good local
variable name, or struct member name, and it later became a new built-in
type (and a new keyword).

You are contradicting yourself here. Do you want "int" to be a valid variable
name or not? If yes, then that's a bad idea. If no, then what's with all the
"evidence" that this is not an abominable idea?

I DON'T CARE if you can redefine int or not. I DON'T CARE. I would never
do it. It's a very bad idea to redefine it. But if the language lets it
be done, the language will still work.

I have no evidence that it would be useful to redefine 'int'. I have
presented evidence that it would be useful to redefine *less* *central*
type names, which may not have even been in existence at the time you

If you want it to be illegal to redefine int, you can *still* make int a
predefined identifier, not a keyword, and make it illegal to redefine
it, in whatever contexts you think it might be harmful, including all of
them. I'm not going to repeat all the reasons why this is different from
int being a keyword, you can go check out the other thread "Why are type
names keywords".

If you insist on using 'int' as an example and fail to recognize that
the language has both a past (C code compatibility) and a future
(possible new types to be added) to deal with, then I am sure you will
never understand my point. You seem to be assuming that keywords can
never conflict with identifiers in any troublesome way. It would be
great if that were true.

Here's a worst-case scenario:

(1) I have code like this:

import commlib.rchar.api;

(2) 'rchar' becomes a new type and a new keyword.

(3) I have to change all my code, renaming 'rchar' to, say, 'rrchar';
and worse, my directory structure (possibly throwing my version control
into havoc), then modify the build scripts, and rebuild the libraries.
If the library came as object code from a 3rd party, I'm skunked, I need
to  get a new build of it with a different name, because I can't link
with the old unless I can use 'rchar' as an identifier.

If, on the other hand:

(2a) 'rchar' becomes a new type and a new predefined identifier

(3a) I don't have to do anything.

(4a) At my leisure, I could rename things to avoid confusion. It
could be a lot of work, since I'd have to rename the directories and
possibly change the build scripts for the module. If I think the effort
is worth it, I'll do it. But I'm not forced to do it.

I don't know what else I can say. Please stop talking about redefining
'int'.

I've previously proposed that a lexical convention be added
to escape identifiers from keyword recognition, e.g.

This suggestion is separate from the suggestion that type names should
not be keywords; either could be adopted without the other.

This makes a _lot_ more sense than allowing people to name their functions
"delegate". You should push for this suggestion instead.

functions "delegate". I'm just proposing a method of doing it despite
the presence of a keyword "delegate". Note that
foo()  and 'foo'()
...would be identical.

This suggestion provides a method of dealing with keyword conflicts; the
other suggestion reduces the likelihood of them happening in the first
place.

In the worst-case scenario discussed above, with
import commlib.rchar.api;
... and 'rchar' becoming a new keyword, I wouldn't be nearly as screwed
if I could use this quote thing. I would still need to edit the source,
but I could keep the directory name. I could continue to use the
3rd-party compiled library.

So then we could call a method "this". Could we use the "this" of
that method's class? :x

<snip>

Allowing redefinition of "this" is insane IMO.

C++ (not D) is a perfect example of something that doesn't need to be a
keyword.
It is *only* meaningful in contexts where a member function parameter
has scope, and has exactly the same grammar treatment as an identifier.
Nothing is gained by making it a keyword.
In D the 'this' keyword is doing more work, in constructor declarations,
etc, so the situation is different.

But if you had some  old C code that defined a struct member 'this', you
wouldn't have to change that to convert it into our mutant C++, whereas
with real C++, you do.

What about the simple suggestion to allow quoted identifiers?

extern (C) 'this'(int foo); // for a function called 'this'.

I was discussing converting old C source code into C++, not linking old
C code to D. If C++ 'this' were *not* a keyword, I wouldn't need to
change any 'this' identifiers in the C code, since none of them could
conflict with the C++ meaning. This is not a hypothetical example, by
the way - I've run into this exact case.

- Greg

Aug 04 2005
Hi,

I think keywords serve a purpose ("This identifier is off-limits"). It
hurts my head to consider all of the possible pitfalls.

But off-limits *everywhere*? I can't have a module called
'mmedia.real.realaudio', simply because 'real' is a keyword and the
parser would balk. If 'real' were a predefined identifier, this would
work, and would not create a harmful redefinition of 'real'.

There are reasons for that. First of all, IMHO, it would be fairly confusing to
allow such a thing. Second, what happens if you declare something like min or
max in your new 'real' module:

ulong m = real.max; // Is this your module's or the primitive's?

import mmedia.real.realaudio;

ulong m = real.max; // Existing real
ulong m2 = mmedia.real.realaudio.max;	// from module

My example did not use "realaudio.max," but rather "real.max." There's the
difference. Do you not see a problem with that usage?

or just "m2=max;", as I understand import to work. I have not redefined
real in the global scope.

This is what scopes are for.

It's not confusing to *allow* something, what's confusing is what people
do with it. Do you think this change is going to turn D from something
in which no confusing code is possible into something which is a morass
of unavoidable confusion?

No. Mere introduction of the "feature" would be unlikely to wreak havoc immediately.
What I mean is that it creates the potential for confusion.

In addition, the feature has _very_ _little_ _benefit_. In other words, it is
not worth introducing the great potential for confusion for such a minuscule
gain in functionality. The juice is not worth the squeeze.

On the other hand, let me present to you another such potentially-abused
feature: goto. Now, some will say goto is the source of all evil, but for all
its evil, goto is a very powerful tool. So in that respect it is different, thus
D allows goto.

Keywords are scoped too. Their scope is universal. Where, pray tell, would
declaring a variable called "int" be a good idea? IMO nowhere, thus the scope of
the keyword is universal for a good reason.

do version(keyword){}, or extern(keyword) int foo();, whereas normal
identifiers can be used in those contexts without being defined or even
mentioned anywhere else; and even if they are mentioned in other
contexts there is no connection between a version tag 'xyz' and a
language identifier (type, variable, etc) 'xyz'.

I can't see how using

version(real){ }
version(emulation){ }

... should be particularly harmful.

Well, this is all relative. Would it cause great physical harm to the user?
Unlikely. Will it create confusion? Yes.

If a user only knows the D language and sees:

version (real)

what is he supposed to think? It kinda looks like "use this version if D is
compiled with real number support." Or perhaps "use this version if you want
real number precision."

And again, it's not the existing
keywords that are the issue, it's the new ones that will be defined later.

Frankly, I don't see a flood of new primitive types and keywords. I'd speculate
maybe a couple per year, since Walter likes to reuse his keywords. That's just a
risk you're gonna have to take. Btw, perhaps you shouldn't be naming your things
so close to already existing keywords.

There's char, there's dchar, there's wchar. It doesn't take a genius to figure
out the naming convention here. Therefore it also doesn't take a genius to
figure out that rchar is particularly prone. Also, here's another quick tip:
keywords in D are all lowercase. Hint, hint.

This is more of an issue when you consider the likely possibility that
new built-in types will be added to the language in the future; if they
are added as keywords, they will break existing code; but if they are
added as predefined identifiers, they won't.

will hide the predefined definition. This is what scoping was designed
to do; to allow local namespaces to be protected from changes to global
namespaces.

Hm... actually, it's the other way around. The purpose is to protect the global
namespace from the local ones. A local namespace cannot override the global one:

int x;
{
float x; //IIRC, error.
}

If a mere variable is prevented from redefinition, I think it's safe to say a
predefined type/identifier/keyword would be too. And for good reason. I wouldn't
want subtle redefinitions to happen in my code without so much as a warning.

The *only* argument you are making is that this change adds the ability
to do a few additional confusing things. Therefore it shouldn't be done,
despite the benefits. That doesn't wash. Besides, you can still make it
illegal to redefine 'int'.

No. You seem to ignore the costs of your little operation: it would probably
take valuable time to implement. It would require complex scoping rules to be
created. In addition to the potential for abuse.

Moreover, what doesn't wash is that all that work is not worth it. The quotes
proposal already takes care of that at a fraction of the cost.

If you want it to be illegal to redefine int, you can *still* make int a
predefined identifier, not a keyword, and make it illegal to redefine
it, in whatever contexts you think it might be harmful, including all of
them. I'm not going to repeat all the reasons why this is different from
int being a keyword, you can go check out the other thread "Why are type
names keywords".

So then you need to introduce complex scoping rules for predefined identifiers
just so you can use language keywords in certain special scopes where they are
not meaningful. Does this make sense over a simple "no use" rule across the
language?

If you insist on using 'int' as an example and fail to recognize that
the language has both a past (C code compatibility)

D does not have C code compatibility. It would be hazardous to your mental
health to keep believing that. If you fail to recognize that then you should
work on it.

and a future
(possible new types to be added) to deal with, then I am sure you will
never understand my point.

I understand your point very well, I just don't happen to agree with it.

This suggestion is separate from the suggestion that type names should
not be keywords; either could be adopted without the other.

This makes a _lot_ more sense than allowing people to name their functions
"delegate". You should push for this suggestion instead.

functions "delegate". I'm just proposing a method of doing it despite
the presence of a keyword "delegate". Note that
foo()  and 'foo'()
...would be identical.

Well, no they wouldn't (if foo were a keyword), because you'd have to always use
quotes to access foo. In other words, it's almost as if the quotes became part
of the identifier and they were taken off only for C-compat.

This suggestion provides a method of dealing with keyword conflicts; the
other suggestion reduces the likelihood of them happening in the first
place.

In the worst-case scenario discussed above, with
import commlib.rchar.api;
... and 'rchar' becoming a new keyword, I wouldn't be nearly as screwed
if I could use this quote thing. I would still need to edit the source,
but I could keep the directory name. I could continue to use the
3rd-party compiled library.

I have already said that the quote suggestion is valid, and would (at little
cost) bring about the benefit you suggest. That's what makes the feature good:
It's a small cost.

Allowing redefinition of "this" is insane IMO.

I agree that it's no huge improvement. 'Insane', no. In fact, 'this' in
C++ (not D) is a perfect example of something that doesn't need to be a
keyword.
It is *only* meaningful in contexts where a member function parameter
has scope, and has exactly the same grammar treatment as an identifier.
Nothing is gained by making it a keyword.
In D the 'this' keyword is doing more work, in constructor declarations,
etc, so the situation is different.

Unfortunately, D is not C++ and in D, this has almost universal meaning. In
fact, at the base scope (module-level, I assume), 'this' is relevant, due to
static ctors/dtors.

My suggestion that redefining 'this' is insane applies to D, not C++, or I would
have said so. Since apparently you didn't like my redefining int that much, let
me propose redefining this then ;)

I suppose you would have no problem with the following:

# module this;
# this() {
#     version (this) {
#         debug (this) {
#             int this;
#         }
#     }
# }

Nope, no confusion in sight.

But if you had some  old C code that defined a struct member 'this', you
wouldn't have to change that to convert it into our mutant C++, whereas
with real C++, you do.

What about the simple suggestion to allow quoted identifiers?

extern (C) 'this'(int foo); // for a function called 'this'.

I was discussing converting old C source code into C++, not linking old
C code to D. If C++ 'this' were *not* a keyword, I wouldn't need to
change any 'this' identifiers in the C code, since none of them could
conflict with the C++ meaning.

If you are converting old C code to D chances are you are already doing some
heavy lifting. I don't think ONE additional global search and replace will be
particularly troublesome. Furthermore, with the quotes proposal the problem
would be even less relevant. Finally, what is with all this C compat
glorification? D is not C. D is not a superset of C. D already breaks all sorts
of things that prevent C-code compat. Relatively speaking, the keywords are a
very small issue compared to the other things. Creating all sorts of complex
keyword/predefined-identifier scoping rules just so you can have your almighty
'this' structure in C is horribly misguided, IMHO.

Cheers,
--AJG.

Aug 04 2005
Derek Parnell <derek psych.ward> writes:
On Thu, 4 Aug 2005 21:00:29 +0000 (UTC), AJG wrote:

[snip]

int x;
{
float x; //IIRC, error.
}

Actually, this is only an error at the module level. The code below is
quite okay ...

void main()
{
int x;
{
float x; // okay
}
}

This too is alright ...

void x()
{
int x;
{
float x; //Okay too.
}
}
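
For reference, the related case raised elsewhere in this thread, a
function-local name hiding a module-level one, can be sketched as below. This
is only an illustration: the function names localValue and globalValue are
made up for the example, and the nested-block cases shown above are
deliberately left out, since their status is exactly what is being disputed
here. The leading-dot form (.x) is D's module scope operator.

```d
import std.stdio;

int x = 1; // module-level variable

// Hypothetical helper: the local x hides the module-level x.
int localValue()
{
    int x = 2;
    return x; // refers to the local
}

// Hypothetical helper: same local, but the module-level x is still reachable.
int globalValue()
{
    int x = 2;
    return .x; // leading dot looks the name up at module scope
}

void main()
{
    writeln(localValue());
    writeln(globalValue());
}
```

The hiding is confined to the function that declares the local; code outside
it still sees the module-level x unchanged.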

[snip]

Unfortunately, D is not C++

LOL. And I think exactly the opposite. Thank Bob that D is not like C++!

My take on this issue is that if people want to write stupid and confusing
code, then they should be allowed to. They might not stay employed for very
long in any of my teams but that's natural selection at work.

--
Derek Parnell
Melbourne, Australia
5/08/2005 7:26:30 AM

Aug 04 2005
AJG <AJG_member pathlink.com> writes:
Hi,

[snip]

int x;
{
float x; //IIRC, error.
}

Actually, this is only an error at the module level. The code below is
quite okay ...

That's certainly not what the docs say...:

"A block statement introduces a new scope for local symbols. A local symbol's
name, however, must be unique within the function."

http://www.digitalmars.com/d/statement.html

The examples there contradict yours. I don't have a compiler at hand, though, so
I can't say. I prefer what the docs say, frankly.

[snip]

Unfortunately, D is not C++

LOL. And I think exactly the opposite. Thank Bob that D is not like C++!

Thank Bob indeed. You knew I was going for the sarcastic effect! ;)

My take on this issue is that if people want to write stupid and confusing
code, then they should be allowed to. They might not stay employed for very
long in any of my teams but that's natural selection at work.

I don't agree. This kind of "remedial" attitude is no good. What we need (just
like in real life) is preventative care. Take a proactive approach so that
things don't become fatal.

Having been assigned "maintenance" duties on occasion, I can say the ones burned
("skunked," to quote Greg) are not generally the original doofuses but the
subsequent maintainers.

<often quoted possibly completely wrong statistic>
<take this with a grain of salt>
Around 80% of the total cost of software is not the original development.
</take this with a grain of salt>
</often quoted possibly completely wrong statistic>

Cheers,
--AJG.

Aug 04 2005
Derek Parnell <derek psych.ward> writes:
On Fri, 5 Aug 2005 00:01:30 +0000 (UTC), AJG wrote:

Hi,

[snip]

int x;
{
float x; //IIRC, error.
}

Actually, this is only an error at the module level. The code below is
quite okay ...

That's certainly not what the docs say...:

"A block statement introduces a new scope for local symbols. A local symbol's
name, however, must be unique within the function."

http://www.digitalmars.com/d/statement.html

The examples there contradict yours. I don't have a compiler at hand, though,
so I can't say. I prefer what the docs say, frankly.

Yes, I knew about the docs, but I believe Walter has said that the docs are
wrong in this case. The compiler does compile the examples I gave and this
is the intended behaviour. I just can't find the Walter-quote right now.

My take on this issue is that if people want to write stupid and confusing
code, then they should be allowed to. They might not stay employed for very
long in any of my teams but that's natural selection at work.

I don't agree. This kind of "remedial" attitude is no good. What we need (just
like in real life) is preventative care. Take a proactive approach so that
things don't become fatal.

Having been assigned "maintenance" duties on occasion, I can say the ones
burned ("skunked," to quote Greg) are not generally the original doofuses but
the subsequent maintainers.

Yes, I can understand this. We have defined coding standards here and all
code is peer-reviewed and non-standard code is identified and corrected
before being allowed to go out into the world. But under more common
development methodologies, I can understand that you'd like to have a
mechanized method of enforcing coding decency - i.e. the compiler.

<often quoted possibly completely wrong statistic>
<take this with a grain of salt>
Around 80% of the total cost of software is not the original development.
</take this with a grain of salt>
</often quoted possibly completely wrong statistic>

I think I heard once that 94.37% of all statistics are misused, and the
remaining 84.12% are actually a bit suspect too.

--
Derek Parnell
Melbourne, Australia
http://www.dsource.org/projects/build/ v2.08 released 29/May/2005

http://www.prowiki.org/wiki4d/wiki.cgi?FrontPage

5/08/2005 11:04:09 AM

Aug 04 2005
J C Calvarese <technocrat7 gmail.com> writes:
In article <1l0oh1ebg8svr.13hqjnzi2kbus$.dlg 40tude.net>, Derek Parnell says...

On Fri, 5 Aug 2005 00:01:30 +0000 (UTC), AJG wrote:

<often quoted possibly completely wrong statistic>
<take this with a grain of salt>
Around 80% of the total cost of software is not the original development.
</take this with a grain of salt>
</often quoted possibly completely wrong statistic>

I think I heard once that 94.37% of all statistics are misused, and the
remaining 84.12% are actually a bit suspect too.

8 out of 10 statistics are completely made up. ;)

jcc7

Aug 04 2005
AJG <AJG_member pathlink.com> writes:
Hi,

The examples there contradict yours. I don't have a compiler at hand, though,
so I can't say. I prefer what the docs say, frankly.

Yes, I knew about the docs, but I believe Walter has said that the docs are
wrong in this case. The compiler does compile the examples I gave and this
is the intended behaviour. I just can't find the Walter-quote right now.

Hm... interesting. I hope Walter brings back the behaviour in the docs.

Yes, I can understand this. We have defined coding standards here and all
code is peer-reviewed and non-standard code is identified and corrected
before being allowed to go out into the world. But under more common
development methodologies, I can understand that you'd like to have a
mechanized method of enforcing coding decency - i.e. the compiler.

Exactly! Thank you. I'm glad somebody here understands.

<often quoted possibly completely wrong statistic>
<take this with a grain of salt>
Around 80% of the total cost of software is not the original development.
</take this with a grain of salt>
</often quoted possibly completely wrong statistic>

I think I heard once that 94.37% of all statistics are misused, and the
remaining 84.12% are actually a bit suspect too.

"There are three types of lies - lies, damn lies, and documentation."

Hehehe...
--AJG.

Aug 05 2005
Derek Parnell <derek psych.ward> writes:
On Fri, 5 Aug 2005 13:45:14 +0000 (UTC), AJG wrote:

[snip]

Yes, I can understand this. We have defined coding standards here and all
code is peer-reviewed and non-standard code is identified and corrected
before being allowed to go out into the world. But under more common
development methodologies, I can understand that you'd like to have a
mechanized method of enforcing coding decency - i.e. the compiler.

Exactly! Thank you. I'm glad somebody here understands.

But the problem is that we are then stuck with somebody else's
interpretation of what decency is (i.e. the language designer) and we don't
have the freedom to use another interpretation. A general purpose language
should be just that - general purpose. Inserting artificial restrictions
that are only there to support a specific coding philosophy is fine - just
don't call it a general purpose language in that case.

I actually support the idea of reducing the number of keywords in D. The
current built-in types really shouldn't be keywords. And they are
inconsistent ... 'Object' is not a keyword, 'bool' is not a keyword.

void main()
{
int Object = 2; // okay
int bool = 3; // okay
int real = 4; // error
}

The other keywords in the language are punctuation symbols that the compiler
uses, and this is the reasonable use for keywords.

--
Derek Parnell
Melbourne, Australia
6/08/2005 6:01:37 AM

Aug 05 2005
AJG <AJG_member pathlink.com> writes:
Hi,

Yes, I can understand this. We have defined coding standards here and all
code is peer-reviewed and non-standard code is identified and corrected
before being allowed to go out into the world. But under more common
development methodologies, I can understand that you'd like to have a
mechanized method of enforcing coding decency - i.e. the compiler.

Exactly! Thank you. I'm glad somebody here understands.

But the problem is that we are then stuck with somebody else's
interpretation of what decency is (i.e. the language designer) and we don't
have the freedom to use another interpretation. A general purpose language
should be just that - general purpose. Inserting artificial restrictions
that are only there to support a specific coding philosophy is fine - just
don't call it a general purpose language in that case.

_Some_ decency is better than _no_ decency. The "interpretations" are
generally linear in that there is usually "more" and "less." I'll always want
more. And of course, I wouldn't want the "zero" interpretation.

And they are inconsistent ... 'Object' is not a keyword, 'bool' is not a
keyword.

Hey, that inconsistency is a problem by itself. I'd treat it as a bug.

I actually support the idea of reducing the number of keywords in D.
The other keywords in the language are punctuation symbols that the compiler
uses, and this is the reasonable use for keywords.

Ceteris paribus, I also support the idea of fewer keywords. But not when it
reduces the "decency," if you will, and also not when it leads the syntax to
become like Perl's.

Cheers,
--AJG.

Aug 05 2005
Greg Smith <greg siliconoptix.com> writes:
AJG wrote:

ulong m = real.max; // Is this your module's or the primitive's?

My example did not use "realaudio.max," but rather "real.max." There's the
difference. Do you not see a problem with that usage?

change the fact that there are useful implications.

No. Mere introduction of the "feature" would be unlikely to wreak havoc
immediately. What I mean is that it creates the potential for confusion.

In addition, the feature has _very_ _little_ _benefit_. In other words, it is
not worth introducing the great potential for confusion for such a minuscule
gain in functionality. The juice is not worth the squeeze.

We'll have to agree to differ on that. I keep bringing up C++, not
because I'm confusing D with C++, but because, more than once, I've been
bitten by C++ adding new keywords (either relative to C, or to its own
former self).

Well, this is all relative. Would it cause great physical harm to the user?
Unlikely. Will it create confusion? Yes. If a user only knows the D language
and sees:

version (real)

what is he supposed to think? It kinda looks like "use this version if D is
compiled with real number support." Or perhaps "use this version if you want
real number precision."

clicks away :-). I had no idea at all how 'version' worked until I read
the manual.

And again, it's not the existing keywords that are the issue, it's the
new ones that will be defined later.

Frankly, I don't see a flood of new primitive types and keywords. I'd
speculate maybe a couple per year, since Walter likes to reuse his keywords.
That's just a risk you're gonna have to take. Btw, perhaps you shouldn't be
naming your things so close to already existing keywords.

I had trouble with C++, at a much lower rate of keyword growth.

There's char, there's dchar, there's wchar. It doesn't take a genius to figure
out the naming convention here. Therefore it also doesn't take a genius to
figure out that rchar is particularly prone. Also, here's another quick tip:
keywords in D are all lowercase. Hint, hint.

written. If you have full control over all of the code you work with,
none of this matters nearly as much. Lucky you.

I don't intend to stop using local variable names consisting entirely of
lowercase letters. And, the set of likely new type names is (a) not
quite as predictable as that and (b) not utterly disjoint from the set
of likely names for local vars, parameter names, and member names, all
of which are lowercase by many coding conventions.

I think you can, for practical purposes. The user definition of the name
will hide the predefined definition. This is what scoping was designed
to do; to allow local namespaces to be protected from changes to global
namespaces.

Hm... actually, it's the other way around. The purpose is to protect the global
namespace from the local ones. A local namespace cannot override the global one:

int x;
{
float x; //IIRC, error.
}

something about how a D local isn't allowed to hide a local in an
enclosing scope, which is fine; but a local variable *can* hide a global.
I just tried it.

I agree there is a risk of a global namespace being affected by
modifications to the local namespace, but the effect of that is
contained to the locality of the change, so it's reasonable to expect
that to be addressed by diligence (and via variable naming conventions).

If a mere variable is prevented from redefinition, I think it's safe to say a
predefined type/identifier/keyword would be too. And for good reason. I wouldn't
want subtle redefinitions to happen in my code without so much as a warning.

new built-in type 'rchar', you are doing it because it wasn't there when
you wrote the code. So, you want the new 'rchar' type to be hidden in
all places where the user's 'rchar' is in scope. Which is what the
existing scope rules do already.

The *only* argument you are making is that this change adds the ability
to do a few additional confusing things. Therefore it shouldn't be done,
despite the benefits. That doesn't wash. Besides, you can still make it
illegal to redefine 'int'.

No. You seem to ignore the costs of your little operation: it would probably
take valuable time to implement. It would require complex scoping rules to be
created. In addition to the potential for abuse.

than now. D already has a bunch of predefined identifiers (Object, max,
nan, infinity, sizeof...) - does that bother you?

If you want it to be illegal to redefine int, you can *still* make int a
predefined identifier, not a keyword, and make it illegal to redefine
it, in whatever contexts you think it might be harmful, including all of
them. I'm not going to repeat all the reasons why this is different from
int being a keyword, you can go check out the other thread "Why are type
names keywords".

So then you need to introduce complex scoping rules for predefined identifiers
just so you can use language keywords in certain special scopes where they are
not meaningful. Does this make sense over a simple "no use" rule across the
language?

have mentioned, it's not future-proof.

If you insist on using 'int' as an example and fail to recognize that
the language has both a past (C code compatibility)

D does not have C code compatibility. It would be hazardous to your mental
health to keep believing that. If you fail to recognize that then you should
work on it.

You can link to external C code. This is what I meant. This
compatibility is degraded by having too many keywords.

This suggestion is separate from the suggestion that type names should
not be keywords; either could be adopted without the other.

This makes a _lot_ more sense than allowing people to name their functions
"delegate". You should push for this suggestion instead.

functions "delegate". I'm just proposing a method of doing it despite
the presence of a keyword "delegate". Note that
foo()  and 'foo'()
...would be identical.

Well, no they wouldn't (if foo were a keyword), because you'd have to always use
quotes to access foo. In other words, it's almost as if the quotes became part
of the identifier and they were taken off only for C-compat.

function "delegate", plain and simple. You said 'this makes more sense
than allowing people to name their functions "delegate"'. I found that
statement very confusing and hoped to clarify things. I see now what you
were getting at; but I have never proposed removal of any keywords other
than those which are manifestly unnecessary for parsing; 'delegate'
isn't in that list.

Allowing redefinition of "this" is insane IMO.

I agree that it's no huge improvement. 'Insane', no. In fact, 'this' in
C++ (not D) is a perfect example of something that doesn't need to be a
keyword.
It is *only* meaningful in contexts where a member function parameter
has scope, and has exactly the same grammar treatment as an identifier.
Nothing is gained by making it a keyword.
In D the 'this' keyword is doing more work, in constructor declarations,
etc, so the situation is different.

Unfortunately, D is not C++ and in D, this has almost universal meaning. In
fact, at the base scope (module-level, I assume), 'this' is relevant, due to
static ctors/dtors.

My suggestion that redefining 'this' is insane applies to D, not C++, or I would
have said so. Since apparently you didn't like my redefining int that much, let
me propose redefining this then ;)

well that caused me a lot of confusion since I explicitly said I was
talking about removing keyword 'this' in C++ and not D. The parallel is
that 'wchar' in D, like 'this' in C++, and 'Object' in D, doesn't
actually need to be a keyword. Of these three, only 'Object' has the
benefit of being a predefined identifier rather than a keyword.

I suppose you would have no problem with the following:

# module this;
# this() {
#     version (this) {
#         debug (this) {
#             int this;
#         }
#     }
# }

Nope, no confusion in sight.

Do you have a problem with

#
# module wombat;
# import wiffle;
#
# wombat( wiffle.wombat w) {
#     version (wombat) {
#         debug (wombat) {
#             int wombat = w.wombat;
#         }
#     }
# }

... because I think that's perfectly legal, and isn't any less
confusing. Construct all the confusing examples you like.

But if you had some  old C code that defined a struct member 'this', you
wouldn't have to change that to convert it into our mutant C++, whereas
with real C++, you do.

What about the simple suggestion to allow quoted identifiers?

extern (C) 'this'(int foo); // for a function called 'this'.

I was discussing converting old C source code into C++, not linking old
C code to D. If C++ 'this' were *not* a keyword, I wouldn't need to
change any 'this' identifiers in the C code, since none of them could
conflict with the C++ meaning.

If you are converting old C code to D chances are you are already doing some
heavy lifting.

general effects of adding new keywords. The task of converting C source
to C++ is just a real world example of where these effects show up. They
will also show up if/when new keywords are added to D.

-Greg

Aug 04 2005
AJG <AJG_member pathlink.com> writes:
Hi,

Yes, you can construct cases where there is confusion. This does not
change the fact that there are useful implications.

The minimal useful "implications" are overshadowed. Moreover, even these
implications can be achieved thru far simpler, far cheaper solutions: Either
the quotes proposal, or a simple global search and replace.

In addition, the feature has _very_ _little_ _benefit_. In other words, it is
not worth introducing the great potential for confusion for such a minuscule
gain in functionality. The juice is not worth the squeeze.

We'll have to agree to differ on that. I keep bringing up C++, not
because I'm confusing D with C++, but because, more than once, I've been
bitten by C++ adding new keywords (either relative to C, or to its own
former self).

Yes, we'll have to differ. I haven't once run into a single keyword collision,
in any language I've used. Maybe I've just been lucky. :p

what is he supposed to think? It kinda looks like "use this version if D is
compiled with real number support." Or perhaps "use this version if you want
real number precision."

clicks away :-). I had no idea at all how 'version' worked until I read
the manual.

The more intuitive you make things, the better overall. That's just
commonsense.

And again, it's not the existing keywords that are the issue, it's the
new ones that will be defined later.

Frankly, I don't see a flood of new primitive types and keywords. I'd
speculate maybe a couple per year, since Walter likes to reuse his keywords.
That's just a risk you're gonna have to take. Btw, perhaps you shouldn't be
naming your things so close to already existing keywords.

I had trouble with C++, at a much lower rate of keyword growth.

When you say you "had" trouble, does this mean:
"Well, there was this one time where one identifier collided with a keyword;"
or
"Heck, I ran into keywords left and right. It seemed the very language evolved
against my code!"

There's char, there's dchar, there's wchar. It doesn't take a genius to figure
out the naming convention here. Therefore it also doesn't take a genius to
figure out that rchar is particularly prone. Also, here's another quick tip:
keywords in D are all lowercase. Hint, hint.

written. If you have full control over all of the code you work with,
none of this matters nearly as much. Lucky you.

As a matter of fact I don't. Despite this fact it's never been a problem.

I don't intend to stop using local variable names consisting entirely of
lowercase letters. And, the set of likely new type names is (a) not
quite as predictable as that and (b) not utterly disjoint from the set
of likely names for local vars, parameter names, and member names, all
of which are lowercase by many coding conventions.

In D, it's actually predictable to a certain degree:

stmt is unlikely to become a keyword. statement could.
expr, wouldn't. expression, could.
* The one exception I can think of is "const" which I believe was kept due to
C. Had Walter had his choice, I think he would have gone with "constant."

Verbs/prepositions: in/do/is/for/a/as/be.
Things like: while/when/where/what/why.

Moreover, if you are talking about _types_, it is a good naming convention to
capitalize those (and keywords are all lowercase):
Database; Statement; Result;

I think following those simple conventions almost guarantees not running into
a keyword. Language designers (D included) are not stupid. "i" and "j" won't
become keywords.

flawed premise, see above. The main point is, if you are redefining the
new built-in type 'rchar', you are doing it because it wasn't there when
you wrote the code. So, you want the new 'rchar' type to be hidden in
all places where the user's 'rchar' is in scope. Which is what the
existing scope rules do already.

AND I think you are mistaken, or just being evasive. I remember seeing
something about how a D local isn't allowed to hide a local in an
enclosing scope, which is fine; but a local variable *can* hide a global.
I just tried it.

I'm sorry about this. I'm confused ATM about scopes/identifiers because
apparently the D docs say one thing and Walter "said" another. I'm not sure
what I'm supposed to conclude.

I agree there is a risk of a global namespace being affected by
modifications to the local namespace, but the effect of that is
contained to the locality of the change, so it's reasonable to expect
that to be addressed by diligence (and via variable naming conventions).

The "containment" you speak of is very leaky. Throughout the code that
redefines rchar, the original rchar would be subtly hidden. So if a programmer
with no knowledge of this comes along (since, as you said, you are not the
only one involved), and uses rchar, then there will be trouble.

Whereas as it stands, simple knowledge of D itself (not of whatever
"variations" the specific code imposes) is enough to understand unambiguously
what (the hypothetical) rchar is. IMHO it is a much safer assumption to think
a programmer will know the language itself before whatever else is written
with it.

No. You seem to ignore the costs of your little operation: it would probably
take valuable time to implement. It would require complex scoping rules to be
created. In addition to the potential for abuse.

than now. D already has a bunch of predefined identifiers (Object, max,
nan, infinity, sizeof...) - does that bother you?

No, because right now I couldn't accidentally redefine those to hide the real
ones. For instance, "real.max" will always be real.max, not some user defined
thing.

If you want it to be illegal to redefine int, you can *still* make int a
predefined identifier, not a keyword, and make it illegal to redefine
it, in whatever contexts you think it might be harmful, including all of
them. I'm not going to repeat all the reasons why this is different from
int being a keyword, you can go check out the other thread "Why are type
names keywords".

So then you need to introduce complex scoping rules for predefined identifiers
just so you can use language keywords in certain special scopes where they are
not meaningful. Does this make sense over a simple "no use" rule across the
language?

have mentioned, it's not future-proof.

Are you saying the variable scoping rules could be applied _identically_ to
type redefinition? That would yield some interesting results. Moreover, the
language will _never_ be future-proof. If you name your type something that
becomes a non-type keyword, you are skunked too my friend.

If you insist on using 'int' as an example and fail to recognize that
the language has both a past (C code compatibility)

D does not have C code compatibility. It would be hazardous to your mental
health to keep believing that. If you fail to recognize that then you should
work on it.

You can link to external C code. This is what I meant.

You can't "link" to code. You link to objects, which are no longer code. Once
again, D does not have C code compatibility. I don't know why you keep
bringing this up. D has C library _link_ compatibility, which is quite a
different beast. Most languages can link to C too. D's syntax being _similar_
to C has little relevance in this respect.

This compatibility is degraded by having too many keywords.

Furthermore, clearly the definition of "too many" is quite subjective and
relative. If you want to see some languages with truly "too many" keywords,
then I can show you plenty.

This suggestion is separate from the suggestion that type names should not be keywords; either could be adopted without the other. This makes a _lot_ more sense than allowing people to name their functions "delegate". You should push for this suggestion instead. functions "delegate". I'm just proposing a method of doing it despite the presence of a keyword "delegate". Note that foo() and 'foo'() ...would be identical. Well, no they wouldn't (if foo were a keyword), because you'd have to always use quotes to accesss foo. In other words, it's almost as if the quotes became part of the identifier and they were taken off only for C-compat. function "delegate", plain and simple. You said 'this makes more sense than allowing people to name their functions "delegate"' . I found that statement very confusing and hoped to clarify things. I see now what you were getting at; but I have never proposed removal of any keywords other than those which are manifestly unnecessary for parsing; 'delegate' isn't in that list. Then why did your example have delegate expressly in it??? Now who's being evasive? ;) Allowing redefinition of "this" is insane IMO. I agree that it's no huge improvement. 'Insane', no. In fact, 'this' in C++ (not D) is a perfect example of something that doesn't need to be a keyword. It is *only* meaningful in contexts where a member function parameter has scope, and has exactly the same grammar treatment as an identifier. Nothing is gained by making it a keyword. In D the 'this' keyword is doing more work, in constructor declarations, etc, so the situation is different. Unfortunately, D is not C++ and in D, this has almost universal meaning. In fact, at the base scope (module-level, I assume), 'this' is relevant, due to static ctors/dtors. My suggestion that redefining 'this' is insane applies to D, not C++, or I would have said so. 
Since apparently you didn't like my redefining int that much, let me propose redefining this then ;)

well that caused me a lot of confusion since I explicitly said I was talking about removing keyword 'this' in C++ and not D. The parallel is that 'wchar' in D, like 'this' in C++, and 'Object' in D, doesn't actually need to be a keyword. Of these three, only 'Object' has the benefit of being a predefined identifier rather than a keyword.

Mostly because Object is not a primitive. That's a big difference.

But if you had some old C code that defined a struct member 'this', you wouldn't have to change that to convert it into our mutant C++, whereas with real C++, you do.

What about the simple suggestion to allow quoted identifiers?

extern (C) 'this'(int foo); // for a function called 'this'.

I was discussing converting old C source code into C++, not linking old C code to D. If C++ 'this' were *not* a keyword, I wouldn't need to change any 'this' identifiers in the C code, since none of them could conflict with the C++ meaning.

If you are converting old C code to D chances are you are already doing some heavy lifting.

general effects of adding new keywords.

You keep going back and forth using C code as a point and then saying you are not "talking about" it. But you _were_ talking about C code just there. Also, I'm sure converting C code to COBOL would pose some difficulties, and I'm sure adding new keywords to COBOL has its own set of problems, but I wouldn't bring it up as proof of my point.

The task of converting C source to C++ is just a real world example of where these effects show up. They will also show up if/when new keywords are added to D.

If the conversion is going to take place, then you are back to what I said about the "heavy lifting." You can't have your cake and eat it. Are you going to convert code to D or not?

Cheers,
--AJG.
Aug 05 2005

Greg Smith <greg siliconoptix.com> writes:
AJG wrote:

And again, it's not the existing keywords that are the issue, it's the new ones that will be defined later.

Frankly, I don't see a flood of new primitive types and keywords. I'd speculate maybe a couple per year, since Walter likes to reuse his keywords. That's just a risk you're gonna have to take. Btw, perhaps you shouldn't be naming your things so close to already existing keywords.

I had trouble with C++, at a much lower rate of keyword growth.

When you say you "had" trouble, does this mean: "Well, there was this one time where one identifier collided with a keyword;" or "Heck, I ran into keywords left and right. It seemed the very language evolved against my code!"

is the difference between "I have these 10K lines of code which work, out of the box" and "I have to spend a bunch of time changing things, and now I possibly have to maintain a local mod of 3rd-party source, because the other guys prefer to use the old compiler which I can't use for some other reason". The maintenance issue can easily overshadow the effort of making the change.

The "containment" you speak of is very leaky. Throughout the code that redefines rchar, the original rchar would be subtly hidden. So if a programmer with no knowledge of this comes along (since, as you said, you are not the only one involved), and uses rchar, then there will be trouble.

not when you upgrade the compiler; and it's local to the change. I prefer it to the other difficulty.

No, because right now I couldn't accidentally redefine those to hide the real ones. For instance, "real.max" will always be real.max, not some user defined thing.

object. My point is, the required scoping rules are clearly already in place.

So then you need to introduce complex scoping rules for predefined identifiers just so you can use language keywords in certain special scopes where they are not meaningful.
Does this make sense over a simple "no use" rule across the language?

The existing scope rules are fine. The 'no use' rule has the problems I have mentioned, it's not future-proof.

Are you saying the variable scoping rules could be applied _identically_ to type redefinition? That would yield some interesting results.

The scoping rules already apply to types! Every time you define an alias, or a typedef, or a class, or you import a type from a module, you are making some identifier refer to a type. There are scoping rules for that.

Moreover, the language will _never_ be future-proof. If you name your type something that becomes a non-type keyword, you are skunked too my friend.

than other kinds of keywords. There are a lot of specialized C compilers (for DSPs, etc) which define special types for the environment; the same may happen to D; the new types should be predefined identifiers, not keywords, and this will be easier if the existing ones are too.

You can link to external C code. This is what I meant.

You can't "link" to code. You link to objects, which are no longer code. Once again, D does not have C code compatibility. I don't know why you keep bringing this up. D has C library _link_ compatibility, which is quite a different beast. Most languages can link to C too. D's syntax being _similar_ to C has little relevance in this respect.

If I write "int is_it_prime(int n);" in C, and leave it in a .c file, then I can write "extern (C) int is_it_prime( int n);" in D, and link them together and call the func from D. No? That's what I mean by 'linking to external C code'. I use the linker. The external code is in C. The linker obviously cannot process C source, so yes, it needs to be compiled, but it's still 'C code' due to its origins. I know how things like this can be confusing; when I first read your 'You link to objects', I thought you meant 'class' objects rather than .o files, and was temporarily baffled. But I took the time to understand what you actually meant.
I've been linking object modules (not objects) using linkers since about 1980, so please try to give me the benefit of the doubt for any confusing phraseology.

I'm just saying that by declaring "int 'delegate'()", you are naming a function "delegate", plain and simple.

You said 'this makes more sense than allowing people to name their functions "delegate"'. I found that statement very confusing and hoped to clarify things. I see now what you were getting at; but I have never proposed removal of any keywords other than those which are manifestly unnecessary for parsing; 'delegate' isn't in that list.

Then why did your example have delegate expressly in it??? Now who's being evasive? ;)

Stay with me here. I didn't want to use a type name in the example, since I've proposed that they *not* be keywords, and I wanted it to be very clear that the word used in the example was definitely a keyword, hopefully avoiding any confusion with the other proposal. Also, 'delegate' seems fairly prone to a collision with a C function in some weird library.

Mostly because Object is not a primitive. That's a big difference.

grammar. They're all built-in types.

Again, I'm not talking about converting C to D; I'm talking about the general effects of adding new keywords.

You keep going back and forth using C code as a point and then saying you are not "talking about" it. But you _were_ talking about C code just there. Also, I'm sure converting C code to COBOL would pose some difficulties, and I'm sure adding new keywords to COBOL has its own set of problems, but I wouldn't bring it up as proof of my point.

how an unnecessary keyword could be easily removed from the Arcturian language Zlatwold-II, without degrading it in any way, and how that would make converting code from the similar language Zlatwold-I (*not* C) easier, and if everybody knew the languages enough to make sense of the example, then maybe I would have used that example to illustrate my point.
I've used C/C++ instead of Zlatwold-I/II. Any similarity between C/C++ and D is not relevant. I'm not talking about *converting* *C* *code* *to* *D*, but I was, as you say, talking about C. I don't see any "going back and forth" there. The relevance is to when you have to someday convert D-I to D-II, and D-II may have additional, unnecessary, keywords.

If the conversion is going to take place, then you are back to what I said about the "heavy lifting." You can't have your cake and eat it. Are you going to convert code to D or not?

discussed is via the extern(C) linking process.

-Greg

Aug 05 2005

J C Calvarese <technocrat7 gmail.com> writes:
In article <dctfjq$gs3$1 digitaldaemon.com>, Greg Smith says...
J C Calvarese wrote:

I think keywords serve a purpose ("This identifier is off-limits"). It
hurts my head to consider all of the possible pitfalls.

'mmedia.real.realaudio', simply because 'real' is a keyword and the
parser would balk. If 'real' were a predefined identifier, this would
work, and would not create a harmful redefinition of 'real'.

It'd be nice if module names were more flexible, but all kinds of practical
effects of such a naming would either have to be specifically prohibited by the
compiler or the programmer would have to watch out for any clashes (see my
"real.max" discussion below). I think Walter has decided it's easier just to
prohibit the name altogether and make it a keyword than come up with more
complex rules about what happens to be barely allowed and barely disallowed.

(You can compile a module "int.module.import.real.typeof.3d", but if you try to
import it, your computer will crash.)

In my view, "This identifier is off-limits" is precisely the collateral
damage *caused* by keywords, *not* their purpose. They serve to provide
a set of readable punctuation marks to guide the parser, and the
downside is that they are reserved in all possible contexts, whereas
normal identifiers are scoped.

Well, I'm not sure about the whole cause-and-effect issue, but the reality is we
have to program with the allowable syntax of the language. If we want the
benefit of having the functionality provided by a keyword, we can't use it as an
identifier, too.

This is more of an issue when you consider the likely possibility that
new built-in types will be added to the language in the future; if they
are added as keywords, they will break existing code; but if they are
added as predefined identifiers, they won't.
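
The difference between a reserved word and a predefined identifier is easier to see in a language that already makes the distinction. As a loose analogy only (it says nothing about how D itself would behave), Python treats `int` as a predefined name rather than a keyword, so a local use of that name is scoped and leaves the built-in untouched elsewhere. The function names below are hypothetical.

```python
import keyword


def is_reserved(name):
    """True if `name` is a Python keyword, i.e. unusable as an identifier."""
    return keyword.iskeyword(name)


def scale(values, int):
    # Here 'int' is just a parameter name. It hides the built-in type,
    # but only inside this function body.
    return [v * int for v in values]


def parse(text):
    # Outside scale(), 'int' still refers to the built-in type.
    return int(text)
```

Code written before a new predefined name appeared would keep working under such rules, at the cost of the shadowing hazard raised elsewhere in this thread.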

Yes, it's easy to laugh at silly examples where 'int' is used as a local
variable of type 'char'.  We could sit here all day writing dangerously
misleading code, without having to propose language changes to make that

I'm not trying to be humorous. I'm exploring the possible ramifications of your
proposal. It's not a matter to be taken lightly. I'm getting a vibe from your
many lengthy posts that you think this is definitely the way to go, but I see
many downsides and not much (if any) upside.

Throwing away all of those all-so-pesky rules would have real consequences. Even
opening up module names could allow all kinds of disturbing code. If you have a
module called real, what would happen if you had a function called max? These
days "real.max" means "largest representable value that's not infinity for a
real type". Isn't it ambiguous with a "real.max" function thrown into the mix?

possible. "alias wchar rea1;" comes to mind... You'd feel less silly,

Don't change the subject. ;)

and more p-d off, if you had used 'rchar' as a perfectly good local
variable name, or struct member name, and it later became a new built-in
type (and a new keyword). And by the way, it's possible to redefine

I think Walter would agree that adding a new keyword after 1.0 is established
would only be done after significant deliberation. That's why cent (signed 128
bits) and ucent (unsigned 128 bits) are already reserved for future use even
though it's unclear when they'll be implemented.

'integer' in Pascal, and nobody has ever seemed to complain of their
head hurting, or even notice, for that matter. They just don't do it.
[Ada too, I think]. I have never once proposed that anybody should be
encouraged to redefine int; in fact, this can be prohibited without
making it a keyword.

So why does it matter that it's not technically a "keyword" if we should prohibit
this anyway?

There's IMHO an issue in D, arising from the fact that there are a lot
of keywords; and while I believe that more than 20 of them are
unnecessary (all the type names, and true/false), even if they are
eliminated there will still be a problem as follows:

The true and false keywords are a different situation than the types, but I can
see a reason to make them keywords: to compel consistency. If they aren't
keywords, one programmer can say "true = 0" and another can say "false = 99". If
everyone got creative with defining their own true and false, it'd be that much
harder to understand someone else's code. Someone can use True and False (or
TRUE and FALSE) if a need arises.

extern(C) int delegate( foo*p, int id );	// Can't do it!

I simply can't interface to this existing C function since its name is a
D keyword. Almost 100 C identifiers are simply 'off-limits', and there
is no real solution to this unless you can modify the C code.

Are they commonly used? (No hypotheticals, how often have you seen these
actually used?)
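
One way to answer this without hypotheticals would be to scan real C headers for identifiers that D reserves. The sketch below is a rough, hypothetical helper: the keyword set contains only the eight words listed earlier in the thread as likely C names, not the full D keyword table, and the scan is naive (it does not skip comments or string literals).

```python
import re

# Partial list: only the D keywords named earlier in this thread as being
# at particular risk of appearing as C function or variable names.
D_KEYWORDS_FROM_THREAD = {
    "align", "bit", "byte", "delegate",
    "export", "final", "interface", "version",
}

# A C identifier: a letter or underscore, then letters, digits, underscores.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def colliding_identifiers(c_source):
    """Return, sorted, the identifiers in c_source that D would reject."""
    found = set(IDENT.findall(c_source))
    return sorted(found & D_KEYWORDS_FROM_THREAD)
```

Run over the declaration from the example above, `colliding_identifiers("int delegate(struct foo *p, int id);")` reports `delegate`; running it over a library's actual headers would give a count instead of a guess.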

I've previously proposed that a lexical convention be added
to escape identifiers from keyword recognition, e.g.

extern(C) int 'delegate'( foo*p, int id );

...
if( can_delegate )  'delegate' ( foop, 0); // call C function

int 'if' = 0;   'if'++;   // now possible, if you really want to.
char x = 'c';	// still a char constant if only 1 char

import commlib.'interface'.localapi;	// interface is a keyword

Or, w"delegate".

This suggestion is separate from the suggestion that type names should
not be keywords; either could be adopted without the other.

This is a whole other argument. I don't see a problem with this except that
apostrophes are already used for character literals. I guess they could be
reused here, but it might be worth it to use another symbol. Could we go to
Unicode for a purpose such as this?
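
The clash with character literals can be settled by length, as the quoted example hints ("still a char constant if only 1 char"). Below is a minimal tokenizer sketch of that rule, written in Python rather than D. The token names, the tiny keyword subset, and the choice to treat any quoted run of two or more identifier characters as an escaped identifier are assumptions made for illustration, not part of the proposal's text.

```python
import re

# Small illustrative subset of keywords, not the real D keyword table.
KEYWORDS = {"delegate", "if", "interface", "import", "extern", "int", "char"}

TOKEN = re.compile(r"""
    (?P<quoted>'[A-Za-z_][A-Za-z0-9_]+')   # 2+ identifier chars in quotes
  | (?P<charlit>'(?:\\.|[^'\\])')          # exactly one char or one escape
  | (?P<word>[A-Za-z_][A-Za-z0-9_]*)       # bare word: keyword or identifier
  | (?P<number>\d+)
  | (?P<punct>\+\+|[^\sA-Za-z0-9_])        # '++' or any single other symbol
  | (?P<ws>\s+)
""", re.VERBOSE)


def tokenize(src):
    """Split src into (kind, text) pairs, honouring quoted identifiers."""
    tokens = []
    pos = 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise ValueError("cannot tokenize at offset %d" % pos)
        pos = m.end()
        kind, text = m.lastgroup, m.group()
        if kind == "ws":
            continue
        if kind == "quoted":
            # The quotes only suppress keyword recognition; they are not
            # part of the name itself.
            tokens.append(("ident", text[1:-1]))
        elif kind == "word":
            tokens.append(("keyword" if text in KEYWORDS else "ident", text))
        else:
            tokens.append((kind, text))
    return tokens
```

Under this rule `'if'` comes out as an identifier named `if` while `'c'` stays a character literal. The price is that one-letter names can never be quoted, which is harmless as long as no keyword is a single character.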

So then we could call a method "this". Could we use the "this" of
that method's class? :x

identifier, it would not be so simple, since 'this' in the class
namespace would be the constructor. I don't know enough about all the
details of D classes to comment further.

I agree with this.

Well, it seems we still mostly disagree and I don't expect you to suddenly agree
with me on everything, so maybe we should just agree to disagree.

jcc7

Aug 04 2005