digitalmars.D - Notes IV

bearophile <bearophileHUGS lycos.com> writes:
With a bit more experience of D, here is my 4th list of notes about D v.1.x
(maybe this is the last one, because I have already said most of the things I
can think of). A few bits are repeated from the previous notes because I have
new arguments to support them.
Some of the following things are probably silly, and you can ignore them, but I
believe some of them are meaningful enough.

1) In D I often enough make mistakes like this:
foreach (i, a, myobj)
But usually not this one:
foreach (i; a; myobj)
To my eyes it's not always easy to spot the difference between "," and ";", so
I think an "in" instead of ";" (as used in Python and C# too) would be easier
to see:
foreach (i, a in myobj)


2) I think in some cases it may be possible to unify the functions of std.conv,
like:
toFloat(x)
with the casting, like:
cast(float)x
So to convert an int x to float you can use:
float(x)
To convert a string to float you can use the same syntax, with no need of the
std lib:
float(" -12.6  ")
(Note the spaces, they are ignored).
I don't know how badly this can interact with other bits of D syntax.
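
For comparison, Python's built-in float() already behaves the way point 2
proposes: one callable converts both numbers and strings, and whitespace around
the number in a string is ignored. A small sketch of that behaviour (in Python
rather than D, since the D syntax above is only a proposal):

```python
# Python's float() accepts both numbers and strings.
from_int = float(12)            # int -> float
from_str = float(" -12.6  ")    # leading/trailing whitespace is ignored

print(from_int)  # 12.0
print(from_str)  # -12.6

# Malformed text is rejected with an exception instead of being guessed at.
try:
    float("12.6abc")
except ValueError as err:
    print("rejected:", err)
```

Whether a D version should throw, return NaN, or fail at compile time for
literals is a separate design question that this sketch does not settle.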


3) In my D code I keep writing "length" all the time (currently I can find 458
"length" words inside my d libs). It's a long word; $ inside the scope of []
helps reduce the typing, but I often enough write "lenght", so I still think
a default attribute named "len" (as in Python) may be better than "length". The
attribute "dup" is an abbreviation too, probably of "duplicate", so
abbreviations seem acceptable in such a context.


4) Regarding the array/string concatenation the D docs say:

Many languages overload the + operator to mean concatenation. This confusingly
leads to, does:

"10" + 3

produce the number 13 or the string "103" as the result? It isn't obvious, and
the language designers wind up carefully writing rules to disambiguate it -
rules that get incorrectly implemented, overlooked, forgotten, and ignored.
It's much better to have + mean addition, and a separate operator to be array
concatenation.<

I don't agree with that. Maybe that comment is true for Perl, where the wild
casting is automatic, or for Java, which automatically converts values to
strings if you add them to a string, but other languages are quite strict, like
Python, which only allows you to "sum" two strings or to sum two numbers:

>>> "10" + "3"
'103'
>>> 10 + 3
13
>>> "10" + 3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot concatenate 'str' and 'int' objects
>>> 10 + "3"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'str'

Thanks to such strong (but dynamic) typing in Python I've never had problems
using "+" to join strings, and if you look in the Python newsgroup you will
find surprisingly few complaints from people making errors because of such
overloading of "+". So I think "+" and "+=" are fine to join strings/arrays if
you use them in a strongly typed way, besides being more standard, because
various languages use "+" for that purpose. (And string join is a very common
operation, and typing "~" forces you to use the numeric keypad if you don't use
an English keyboard).


5) AST macros: I think they add power to the language, and I think I'll enjoy
using them. They will increase the appeal & sexiness of the language for some
people. But they have downsides too, and not just those derived from how
much/how little hygienic they are. In Lisp macros are very useful, but a common
complaint is that "with macros every programmer reinvents his/her language,
making it difficult to understand and modify the code written by others". So I
suggest being careful in adding macros to D... Unfortunately I don't have a
better suggestion to give. Note that macros that are present inside the std lib
avoid that problem, because they are standard, everyone uses them, and most
people don't need to understand how the insides of the std lib work (as with
the source of the C++ STL, or of Blitz++, etc).


6a) Often most of the time needed to write programs is spent debugging the
code. So I encourage D to try to adopt syntax and other constructs that help
avoid bugs in the first place. Many bugs can be avoided by adding certain
runtime checks that the compiler can remove when the code is compiled in
release mode.

6b) It may be useful to create an online repository of bugs found in D code
written by everyone (ranked by experience?), which would let us know what parts
of D syntax lead to more bugs in people's code, so we can fix the language to
avoid some of them :-) For example I'd like to know if the foreach() error
caused by "," and ";" of point (1) is common among other D programmers, or if
(unlikely) it's just a problem of mine.


6c) Some of the features of MemCheck seem to be among those useful things that
can be active by default and disabled in release mode:
http://hald.dnsalias.net/projects/memcheck/
(If they are built in they are more useful, because then everyone uses them by
default, like array bounds checks).


6d) (This was present in one of my previous lists of notes, but in a less
defined way) To reduce some kinds of bugs "*" can be used for GC pointers and
the " " symbol can be used for normal pointers, so the programmer can better
specify his/her intentions, and the compiler can catch as compile-time errors
the operations that aren't allowed on GC pointers (but are allowed on normal
pointers). A cast operation can then be defined to convert between the two
classes of pointers.


7) The D syntax of is() is powerful, but I think some of its variants aren't
very readable, so there may be a better syntax (even if it requires is() to be
split into more than one syntax).


8) I think the string functions of Phobos are quite usable and powerful enough,
but I think they are a bit too many and a bit too complex. So I think it's
better to reduce their number/complexity a bit. I think it's very positive when
about 90% of the string functions can fit in a brain and be used from memory,
leaving the need to read the Phobos docs only for the few cases where you need
some subtler/less common string function.
Such a high memory recall rate is common among Python programmers (while Delphi
has tons of very fast string functions (often written in assembly) and I have
never succeeded in learning a large percentage of them), and it allows you to
speed up your programming a lot. If you put lots of string functions in a lib,
with complex usage, you obtain a more powerful string library, but then you
have to look at the manuals often, and the coding becomes slow. That's why I
suggested that the "chomp"/"chop" names are too similar and easy to confuse.


9) AA literals need a lot of improvement:

void main() {
    int[][int] aa1;
    aa1[10] = [1, 2, 3];
    aa1[20] = [10, 20];

    //auto aa2 = [10: [1, 2, 3],
    //            20: [10, 20]];
    //test.d(9): Error: cannot infer type from this array initializer
    //test.d(9): Error: array initializers as expressions are not allowed
    //test.d(9): Error: cannot use array to initialize int
    //test.d(9): Error: array initializers as expressions are not allowed

    //auto aa2 = [10: [1, 2, 3].dup,
    //            20: [10, 20]];
    // test.d(16): comma expected separating array initializers, not .
    // test.d(16): semicolon expected following auto declaration, not 'dup'
    // test.d(17): found ':' when expecting ';' following 'statement'
    // test.d(17): found ']' when expecting ';' following 'statement'

    //int[][int] aa2 = [10: [1, 2, 3].dup,
    //                  20: [10, 20]];
    // test.d(23): comma expected separating array initializers, not .
    // test.d(23): semicolon expected, not 'dup'
    // test.d(24): found ':' when expecting ';' following 'statement'
    // test.d(24): found ']' when expecting ';' following 'statement'

    //auto aa2 = [10: ([1, 2, 3].dup),
    //            20: [10, 20]];
    //test.d(30): Error: cannot infer type from this array initializer
    //test.d(30): Error: array initializers as expressions are not allowed
    //test.d(30): Error: cannot use array to initialize int
    //test.d(30): Error: array initializers as expressions are not allowed
    //test.d(31): variable test.main.aa2 is not a static and cannot have static initializer

    int[][int] aa2 = [10: ([1, 2, 3].dup),
                      20: [10, 20]]; // OK
}


10a) The new syntax for properties in C# seems nice; instead of this code:

private int myval;
public int Myval {
    get { return myval; }
    private set { myval = value; }
}

You just need:

public int Myval { get; private set; }


10b) In C# "yield" too seems to have a nice syntax:
http://en.wikipedia.org/wiki/Comparison_of_C_Sharp_and_Java


11) Regarding the way to reference variables in the global scope, D uses the
syntax:
.varname
(The Python community has discussed similar topics, but they have different
problems, because in D variables are explicitly present or absent:
http://www.python.org/dev/peps/pep-3104/ ).
But ".varname" may be too error-prone because the dot isn't very visible. So
something longer and more explicit may be more visible and harder to miss,
like:
global(x)
Note that we may think about a notation to specify how many scopes to ascend:
...x
That can be written as:
outer(outer(outer(x)))
But this capability of ascending many scopes makes the code a tangled mess, so
it's an anti-feature.


12) In Python functions are objects, so you can add attributes to them:

def foo(inc):
    foo.tot += inc
    return foo.tot
foo.tot = 10
foo(10)
print foo.tot # prints 20

So is it a silly idea to allow public static variables in D functions, to do
something similar?

int foo(int inc) {
    public static int tot;
    foo.tot += inc;
    return foo.tot;
}
void main() {
    foo.tot = 10;
    foo(10);
    printf("%d\n", foo.tot);
}


13a) Random functions in Phobos of DMD 2.x: a random function has contrasting
requirements, because you need it in very different situations. I think such
requirements can be satisfied by using two different random functions:
- One very fast RND function, with very simple and short syntax, useful for
little programs or where you need to compute lots of random numbers, like in a
little game. It may use the Kiss algorithm used by Tango.
- One slower RND generator that has to be very good, and thread safe. So this
can use the Mersenne Twister, be a class and use a longer dotted syntax.


13b) In my d libs I have added some very useful functions like
choice(sequence), randInt(a,b), randRange, shuffle(sequence), etc. I think they
are almost necessary, and very easy to implement.


14) D follows the good choice of fixing the length of all types except real. I
can accept that some compilers and CPUs can support 80-bit floating point
values while others can't, but I don't like to use "real"s leaving the
compiler the choice to use 64 or 80 bits to implement them. So "real" can be
renamed "real80", and have a fixed length. If the compiler/CPU doesn't allow
80-bit floating point numbers, then fine, you don't find real80 defined, and if
you use it you get a syntax error (or you use a static if + an alias to rename
float as real, so you can fake them by yourself. I don't like the compiler to
fake them silently for me).


15) I'd like to import modules only inside unittests, or just when I use the
-unittest flag. With the help of a static if something simple like this may
suffice:

static if (unittest) {
    import std.stdio;
    unittest foo1 { ... }
    unittest bar1 { ... }
}


16) Against C rules, in some situations I think it may be better if some
results are upcast to ulong/long:

import std.stdio;
void main() {
    uint a = 3_000_000_000;
    uint b = 3_000_000_000;
    writefln(a + b); // 1705032704

    int c = -1_600_000_000;
    int d = -1_600_000_000;
    writefln(c + d); // 1094967296
}

(The Ada language uses a different solution to avoid this class of bugs, but it
may be too far from the style of C-like languages. Delphi looks like a
compromise).


17) After using D for some time I still think that "and", "or" (and maybe
"not") as in Python are more readable than "&&" and "||" (and maybe !, but this
is less important). The only good side of keeping them in D is to make the
compiler digest C-like code better. GCC has the -foperator-names option, which
allows you to use "and", "or", etc, in C++ code.

--------------------------

The following things are sources of bugs. I don't know how frequent such bugs
are in real code, and I have no idea how to modify the syntax/grammar/compiler
to avoid/reduce their occurrence:

18a) In some situations side effects may be bad for the health of the
programmer:

import std.stdio;
void main() {
    int[5] x = 10, y = 20;
    writefln (x, " ", y); // [10,10,10,10,10] [20,20,20,20,20]
    int i = 0;
    while (i < x.length)
        y[i] = x[i++];
        //y[i++] = x[i]; // OK
    writefln (x, " ", y); // [10,10,10,10,10] [20,10,10,10,10]
}

D doesn't like warnings, but maybe some way can be invented to avoid this kind
of error (Python doesn't have ++ and -- to avoid that kind of bug. You use +=,
-= or two separate instructions. I like compact code, but not when it leads to
more bugs, so I tend to avoid putting ++ and -- in subexpressions; I use them
on their own or in instructions separated by commas, like in the for()).

18b) Here indentation doesn't follow the program meaning:

if (x == 0)
    if (y == 0)
        foo();
else
    z = x + y;

In Python this possible source of bugs is absent, because the indentation is
the only important thing, so this:

if x == 0:
    if y == 0:
        foo()
else:
    z = x + y

Is different from this other one, and you can see the difference very well:

if x == 0:
    if y == 0:
        foo()
    else:
        z = x + y

The net result is that in Python (once you have an editor that helps you avoid
mixing leading tabs and spaces) those bugs can be avoided. And I like that a
lot. To avoid that kind of bug the guidelines of good C/C++ (and probably
Java/C# too) coding often tell you to always use braces:

if (x == 0) {
    if (y == 0) {
        foo();
    }
} else {
    z = x + y;
}

That way the code becomes longer, but I think you can avoid some bugs. D
already disallows:
if (a > b);
for (...);
But making D *require* {} after for, while, if/else, etc. looks like a
"draconian" way to avoid that kind of bug. I don't have better solutions, but
maybe we/you can think some more about this.


18c) This is another silly bug, but I presume it's not common enough to justify
compiler changes:

void main() {
    void foo(int x) { printf("%d\n", x); }
    foo((1, 2)); // prints 2
}

Bye,
bearophile
Jan 22 2008
downs <default_357-line yahoo.de> writes:
bearophile wrote:
 
 3) In my D code I keep writing "length" all the time (currently I can find 458
"length" words inside my d libs). It's a long word, $ inside the scope of []
helps reducing the typing, but I often enough write "lenght", so I think still
a default attribute named "len" (as in Python) may be better than "length". The
attribute "dup" too is an abbreviation, probably of "duplicate", so
abbreviations seem acceptable in such context.
 

Agreed. That abbreviation would be useful.
 
 4) Regarding the array/string concatenation the D docs say:
 Many languages overload the + operator to mean concatenation. This confusingly
leads to, does:

 "10" + 3

 produce the number 13 or the string "103" as the result? It isn't obvious, and
the language designers wind up carefully writing rules to disambiguate it -
rules that get incorrectly implemented, overlooked, forgotten, and ignored.
It's much better to have + mean addition, and a separate operator to be array
concatenation.<
 I don't agree with that. Maybe that comment is true for Perl, where the wild
casting is automatic, or for Java, which automatically converts values to
strings if you add them to a string, but other languages are quite strict, like
Python, which only allows you to "sum" two strings or to sum two numbers:

 >>> "10" + "3"
 '103'
 >>> 10 + 3
 13
 >>> "10" + 3
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 TypeError: cannot concatenate 'str' and 'int' objects
 >>> 10 + "3"
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 TypeError: unsupported operand type(s) for +: 'int' and 'str'

 Thanks to such strong (but dynamic) typing in Python I've never had problems
using "+" to join strings, and if you look in the Python newsgroup you will
find surprisingly few complaints from people making errors because of such
overloading of "+". So I think "+" and "+=" are fine to join strings/arrays if
you use them in a strongly typed way, besides being more standard, because
various languages use "+" for that purpose. (And string join is a very common
operation, and typing "~" forces you to use the numeric keypad if you don't use
an English keyboard).

Strongly disagreed. Addition simply isn't the same as concatenation, and using the same operator for both will cause confusion. Besides, Walter is planning to eventually add array operations, i.e. [2, 3, 4] + [1, 1, 5] -> [3, 4, 9] which would be impossible if + already meant "concatenate".
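
Python itself illustrates the conflict described here: because "+" on lists
already means concatenation, element-wise addition has to be spelled some other
way. A minimal sketch with plain Python lists (elementwise_add is a
hypothetical helper written for this example, not a library function):

```python
def elementwise_add(xs, ys):
    """Add two equal-length sequences item by item (hypothetical helper)."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    return [x + y for x, y in zip(xs, ys)]

a = [2, 3, 4]
b = [1, 1, 5]

concatenated = a + b               # "+" on lists joins them
summed = elementwise_add(a, b)     # element-wise sum needs a separate spelling

print(concatenated)  # [2, 3, 4, 1, 1, 5]
print(summed)        # [3, 4, 9]

# Mixing str and int with "+" is rejected rather than silently converted.
try:
    "10" + 3
    mixed_ok = True
except TypeError:
    mixed_ok = False
print(mixed_ok)      # False
```

So strict typing removes the "13 or 103" ambiguity, but it does not free "+"
for array arithmetic once the operator has been taken for concatenation.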
 
 
 5) AST macros: I think they add power to the language, and I think I'll enjoy
using them. They will increase the appeal & sexiness of the language for some
people. But they have downsides too, and not just those derived from how
much/how little hygienic they are. In Lisp macros are very useful, but a common
complaint is that "with macros every programmer reinvents his/her language,
making it difficult to understand and modify the code written by others". So I
suggest being careful in adding macros to D... Unfortunately I don't have a
better suggestion to give. Note that macros that are present inside the std lib
avoid that problem, because they are standard, everyone uses them, and most
people don't need to understand how the insides of the std lib work (as with
the source of the C++ STL, or of Blitz++, etc).
 

 
 6a) Often most of the time needed to write programs is spent debugging the
code. So I encourage D to try to adopt syntax and other constructs that help
avoid bugs in the first place. Many bugs can be avoided by adding certain
runtime checks that the compiler can remove when the code is compiled in
release mode.

This is already done in some places (array bounds checking).
 6b) It may be useful to create an online repository of bugs present in D code
written by all people (ranked by their experience?), that may allow us to know
what parts of D syntax lead to more bugs in people code, so we can fix the
language to avoid some of them :-) For example I'd like to know if the error in
the foreach() caused by "," and ";" of point (1) is common to other D
programmers, or if (unlikely) it's just a problem of mine.

I think what you want goes more in the direction of a Wishlist with vote features. Such a thing already exists, although I don't know the URL.
 8) I think the string functions of Phobos are quite usable and powerful enough,
but I think they are a bit too many and a bit too complex. So I think it's
better to reduce their number/complexity a bit. I think it's very positive when
about 90% of the string functions can fit in a brain and be used from memory,
leaving the need to read the Phobos docs only for the few cases where you need
some subtler/less common string function. Such a high memory recall rate is
common among Python programmers (while Delphi has tons of very fast string
functions (often written in assembly) and I have never succeeded in learning a
large percentage of them), and it allows you to speed up your programming a
lot. If you put lots of string functions in a lib, with complex usage, you
obtain a more powerful string library, but then you have to look at the manuals
often, and the coding becomes slow. That's why I suggested that the
"chomp"/"chop" names are too similar and easy to confuse.

Disagreed. Code space isn't limited; I see no reason to leave out functions that might be useful. Maybe create a better sorting for the documentation, though (grep+count over some large code base for importance?)
 
 
 
 9) AA literals need lot of improvement:

Agreed. The current syntax is needlessly verbose.
 
 10b) In C# "yield" too seem to have a nice syntax:
 http://en.wikipedia.org/wiki/Comparison_of_C_Sharp_and_Java

There exist several implementations of StackThreads for D (Tango's fibers,
scrapple.tools' StackThreads, Mikola Lysenko's first StackThreads package
(http://assertfalse.com/projects.shtml )). Example using scrapple.tools:

auto GetEven = stackthread = (int delegate() read, void delegate(int) yield) {
    while (true) {
        auto n = read();
        if (n % 2 == 0) yield(n);
    }
};
writefln(GetEven([2, 3, 4, 5, 6]));

 13a) random functions in Phobos of DMD 2.x: a random function has contrasting
requirements, because you need them in very different situations. I think such
requirements can be satisfied with using two different random functions:
 - One very fast RND function, with very simple and short syntax, useful for
little programs or where you need to compute lot of randoms, like in a little
game. It may use the Kiss algorithm used by Tango.
 - One slower RND generator, it has to be very good, and thread safe. So this
can use the Mersenne Twister, be a class and use a longer dotted syntax.

Vote [+] for Mersenne Twister in the standard library.
 13b) In my d libs I have added some very useful functions like
choice(sequence), randInt(a,b), randRange, shuffle(sequence), etc. I think they
are almost necessary, and very easy to implement.

I'm not sure whether such things should be in the standard library. Very tentatively in favor.
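
For reference, the helper names in 13b closely match Python's standard random
module, whose default generator is a Mersenne Twister. A sketch using only
calls from that module, to show how small such a set can be (the seed value is
arbitrary and only makes the run repeatable):

```python
import random

rng = random.Random(42)           # private generator; 42 is an arbitrary seed

items = ["a", "b", "c", "d"]
picked = rng.choice(items)        # one random element of a sequence
n = rng.randint(1, 6)             # integer in the closed range [1, 6]
m = rng.randrange(0, 10, 2)       # one of 0, 2, 4, 6, 8

deck = list(range(10))
rng.shuffle(deck)                 # shuffles the list in place

print(picked, n, m, deck)
```

A D equivalent would need to decide whether these are free functions over a
global generator or methods of a generator object; the sketch uses the latter
only because it keeps the example deterministic.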
 
 
 14) Reals [...] I don't like the compiler to fake them silently for me).

 18c) This is another silly bug, but I presume it's not common enough to
justify compiler changes:
 void main() {
   void foo(int x) { printf("%d\n", x); }
   foo((1, 2)); // prints 2
 }

It's only in D for C compatibility, which is stupid if you think about it. Also, the existing usage of the comma expression makes it harder to implement more powerful native tuples. IMNSHO, the comma expression should be deprecated. :)
 
 Bye,
 bearophile

--downs
Jan 22 2008
Bill Baxter <dnewsgroup billbaxter.com> writes:
downs wrote:
 bearophile wrote:
 3) In my D code I keep writing "length" all the time (currently I can find 458
"length" words inside my d libs). 

Agreed. That abbreviation would be useful.

If it's gonna change, in my opinion it should be changed to "size". C++ had it right. "length" doesn't generalize well to non-linear containers. What's the "length" of a binary tree?
 It's a long word, $ inside the scope of [] helps reducing the typing, but I
often enough write "lenght", so I think still a default attribute named "len"
(as in Python) may be better than "length". The attribute "dup" too is an
abbreviation, probably of "duplicate", so abbreviations seem acceptable in such
context.

My #1 typo is writelfn(). I don't know why but my fingers type that probably
more often than the proper spelling. I vote for std.stdio to have a writelfn
alias. :-) kidding of course.
 18c) This is another silly bug, but I presume it's not common enough to
justify compiler changes:
 void main() {
   void foo(int x) { printf("%d\n", x); }
   foo((1, 2)); // prints 2
 }

It's only in D for C compatibility, which is stupid if you think about it. Also, the existing usage of the comma expression makes it harder to implement more powerful native tuples. IMNSHO, the comma expression should be deprecated. :)

Well, "," does get some real play in for loops:

for(i=0; i<N; i++,j++,k++) { }

But maybe that could be made part of the for-loop syntax rather than a general
purpose thing. Real tuples with a nice literal syntax would get used a billion
times more than any comma sequencing thing. And the comma could probably be
replaced with some sort of macro/special function syntax. Like
"progn(i++,j++,k++)" :-) [snigger]

--bb
Jan 22 2008
bearophile <bearophileHUGS lycos.com> writes:
Bill Baxter:

downs wrote:

You may comment on my original post too, besides commenting on the comments written by downs...
My #1 typo is writelfn().<

That's why I use put/putr (as in Ruby, I think) in my d libs. "r" stands for "return". Shorter, simpler to write, and clear ;-)
Real tuples with a nice literal syntax would get

See Fortress language for better sequence literals than (..., ..., ...) Bye, and thank you, bearophile
Jan 22 2008
Bill Baxter <dnewsgroup billbaxter.com> writes:
bearophile wrote:
 Bill Baxter:
 
 downs wrote:

You may comment on my original post too, besides commenting on the comments written by downs...

Sorry, didn't mean you no disrespect by that.
 
 
 My #1 typo is writelfn().<

That's why I use put/putr (as in Ruby, I think) in my d libs. "r" stands for "return". Shorter, simpler to write, and clear ;-)

Sounds good. I'd vote for that in the standard library. Then maybe putraw and putrawr for the non-formatted alternatives? What does ruby do for write and writeln? I guess those could stay as is...
 Real tuples with a nice literal syntax would get

See Fortress language for better sequence literals than (..., ..., ...)

Like what? Or is it some funky symbols that can't be posted here? --bb
Jan 22 2008
bearophile <bearophileHUGS lycos.com> writes:
Bill Baxter:

Then maybe putraw and putrawr for the non-formatted alternatives?  What does
ruby do for write and writeln?  I guess those could stay as is...<

put()/putr() (plus str() that is similar to format(), and repr() that is something new and similar to the Python repr()) are non-formatting. In the situations where I need formatting I use the normal format(), writefln(), writef() of std.string...
See Fortress language for better sequence literals than (..., ..., ...)<<


Fortress can be written both with a few (Unicode) "funky symbols" and in pure
ASCII. You can find something here at page 11:
http://research.sun.com/projects/plrg/Publications/SNU.pdf
Or here at page 22:
http://research.sun.com/projects/plrg/Publications/JapanLecture2006public.pdf
Full specification:
http://research.sun.com/projects/plrg/Publications/fortress1.0beta.pdf

Fortress has literals for lists, vectors, sets, multisets, maps, and they can
be written in ASCII or with Unicode symbols, for example when expressed in
ASCII:
lists: <| a, b, c |>
vectors (no commas): [a b c]
sets: {a, b, c}
multisets: {| a, b, c |}

Fortress has many ideas D may steal & adapt, especially if D wants to use many
CPUs in parallel, perform some number crunching, and appeal to scientific
software programmers.

Bye,
bearophile
Jan 23 2008
bearophile <bearophileHUGS lycos.com> writes:
downs:

Programmers will create miniature languages anyway.<

But doing it with Java allows you to keep the code itself more readable and simpler for less skilled programmers, and that's essential for a language to spread widely.
This is already done in some places (array bounds checking).<

I was mostly talking about pointers (and other things).
I think what you want goes more in the direction of a Wishlist with vote
features.<

No, I am talking about a list of the most common mistakes people make when writing their programs, so D syntax can be modified to avoid some of them.
Disagreed. Code space isn't limited; I see no reason to leave out functions
that might be useful.<

Brain space is limited; from experience I have seen that a language that "fits the brain" allows you to code much faster.
Example using scrapple.tools:<

I'll take a look, thank you.
Vote [+] for Mersenne Twister in the standard library.<

No need to vote, it's already there in DMD 2.x. But I think that's the only generator present (beside the crappy C one), and I have explained why I think that's not enough.
I'm not sure whether such things should be in the standard library. Very
tentatively in favor.<

Because otherwise you need to write them yourself in your personal lib :-)
14) Reals [...] I don't like the compiler to fake them silently for me).<<


Maybe you can explain me why.
This is not a bug;<

It's a bug (of mine). In that section I am not talking about DMD bugs, but about bugs written by programmers, and how they can be avoided/reduced...
Also, the existing usage of the comma expression makes it harder to implement
more powerful native tuples.<

( and ) are used for other purposes; if you use them for tuples you may end up
with the ugly syntax Python uses for singleton tuples:
x,
which you can also write like this, because it's the comma that defines the
tuple:
(x,)
I think the Fortress language uses better sequence literals.

Bye,
bearophile
Jan 22 2008
downs <default_357-line yahoo.de> writes:
bearophile wrote:
 downs:
 
 Programmers will create miniature languages anyway.<

But doing it with Java allows you to keep the code itself more readable and simpler for less skilled programmers, and that's essential for a language to spread widely.

 
 I think what you want goes more in the direction of a Wishlist with vote
features.<

No, I am talking about a list of the most common mistakes people make when writing their programs, so D syntax can be modified to avoid some of them.

Yeah but nobody is going to add every small typo/bug they make. Be realistic.
 
 Disagreed. Code space isn't limited; I see no reason to leave out functions
that might be useful.<

Brain space is limited; from experience I have seen that a language that "fits the brain" allows you to code much faster.

 
 Example using scrapple.tools:<

I'll take a look, thank you.

 
 14) Reals [...] I don't like the compiler to fake them silently for me).<<


Maybe you can explain me why.

 
 Also, the existing usage of the comma expression makes it harder to implement
more powerful native tuples.<

( and ) are used for other purposes, if you use them for tuples you may end with the ugly syntax python uses for singleton tuples:

Like,

int e, f;
blabla;
int,int test() { return 2, 3; }
e,f = test();

Since tuples are unrolled in D anyway, no () is needed.

--downs
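
For reference, this is how both points look in Python, where the comma rather
than the parentheses builds a tuple. A short sketch (the function name test
simply mirrors the D-like pseudocode above):

```python
def test():
    return 2, 3            # the comma builds the tuple; no parentheses needed

e, f = test()              # tuple unpacking into two variables
print(e, f)

single = (5,)              # one-element tuple: the trailing comma is required
not_a_tuple = (5)          # just the int 5 in parentheses

print(type(single).__name__)       # tuple
print(type(not_a_tuple).__name__)  # int
```

Multi-value return and unpacking work without parentheses, and the awkward
"(x,)" form only shows up for the one-element case.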
Jan 23 2008
Robert Fraser <fraserofthenight gmail.com> writes:
downs wrote:

 Disagreed. Code space isn't limited; I see no reason to leave out functions
that might be useful.<



True, if you're the only one coding. But if there are 20 different people working on a project, who know 20 different subsets of the language, then it becomes a huge issue. Especially if you're maintaining code written by someone else. This is a big issue with C++, since there are so many weird & unexpected rules, and it's becoming so with D as new features are added.
Jan 23 2008
bearophile <bearophileHUGS lycos.com> writes:
Robert Fraser:
True, if you're the only one coding. But if there are 20 different people
working on a project, who know 20 different subsets of the language, then it
becomes a huge issue. Especially if you're maintaining code written by someone
else.<

If you have some experience of API design you know that downs is quite probably
wrong there. The size of an API (string functions are an API) *can't* grow
without bounds; it must be the result of one (or more than one) compromise.

If you put too many things in it, you end up having functions with similar
purposes that you can confuse with each other by mistake, and you end up
looking in the manual often, slowing down your coding speed (and the
debugging/reading speed for code written by others, as you say. Experience
shows that reading code is often more common than writing it). If you take a
look at Delphi (or even Ruby) you can see that problem.

On the other hand, if you don't have enough functions (see the C std lib) you
end up rewriting (or downloading & including) your own string functions (or
string-processing code) all the time, so you end up with code written in many
different ways, with redundant code, often not efficient enough (because very
good string functions contain refined algorithms, see the substring match
problem), or even buggy.

The current Phobos is almost there; IMHO it just needs some cleaning, to remove
a few bits, to make it a bit simpler, and to have its functions written in a
more efficient way (and maybe to work with Unicode better, but I am not sure
about this). Walter has told me that I can offer more efficient string
functions, but I don't know how or where to submit them :-)

Bye,
bearophile
Jan 24 2008
prev sibling next sibling parent reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
bearophile wrote:

 16) Against C rules, in some situations I think it may be better if some
results are upcasted to ulong/long:
 import std.stdio;
 void main() {
   uint a = 3_000_000_000;
   uint b = 3_000_000_000;
   writefln(a + b); // 1705032704
 
   int c = -1_600_000_000;
   int d = -1_600_000_000;
   writefln(c + d); // 1094967296
 }
 (The Ada language uses a different solution to avoid such class of bugs, but
it may be too much far from the style of C-like languages. Delphi looks like a
compromise).
 

On a similar topic, here are two of my favorite bugs:

// Spot the bug #1
double randomDelta() {
    return (rand() % 3) - 1;
}

// Spot the bug #2
void fun(int[] arr) {
    long d = arr.length - 10;
    while (d > 0)
        arr[--d] = 0;
}

-- 
Oskar
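Both snippets appear to hinge on unsigned arithmetic wrapping around before the result is converted to the wider or floating type. The following is a minimal sketch in D2 syntax, not code from the post: it assumes rand() returns an unsigned 32-bit value (as std.random.rand did in Phobos at the time) and that arr.length is 32 bits wide, as on a 32-bit target. The function names delta, remaining, deltaFixed and remainingFixed are made up for illustration.

```d
// Hypothetical stand-in for bug #1, with the random value passed in.
double delta(uint r)
{
    return (r % 3) - 1; // uint arithmetic: 0 - 1 wraps to 4294967295
}

// Hypothetical stand-in for bug #2, with a 32-bit unsigned length passed in.
long remaining(uint length)
{
    return length - 10; // wraps in uint first, then widens to long
}

// One possible fix: convert to a signed type before subtracting.
double deltaFixed(uint r)
{
    return cast(int)(r % 3) - 1;
}

long remainingFixed(uint length)
{
    return cast(long) length - 10;
}

void main()
{
    assert(delta(0) == 4294967295.0);    // expected -1
    assert(delta(1) == 0.0);
    assert(deltaFixed(0) == -1.0);

    assert(remaining(4) == 4294967290L); // expected -6, so the loop would run
    assert(remainingFixed(4) == -6);
}
```

With a 64-bit length the wrapped value converts to a negative long, so the second bug may stay hidden there; the sketch uses uint explicitly to keep the behavior the same on every target.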
Jan 24 2008
next sibling parent Sean Kelly <sean f4.ca> writes:
Oskar Linde wrote:
 bearophile wrote:
 
 16) Against C rules, in some situations I think it may be better if
 some results are upcasted to ulong/long:
 import std.stdio;
 void main() {
   uint a = 3_000_000_000;
   uint b = 3_000_000_000;
   writefln(a + b); // 1705032704

   int c = -1_600_000_000;
   int d = -1_600_000_000;
   writefln(c + d); // 1094967296
 }
 (The Ada language uses a different solution to avoid such class of
 bugs, but it may be too much far from the style of C-like languages.
 Delphi looks like a compromise).

On a similar topic, here are two of my favorite bugs:

// Spot the bug #1
double randomDelta() {
    return (rand() % 3) - 1;
}

// Spot the bug #2
void fun(int[] arr) {
    long d = arr.length - 10;
    while (d > 0)
        arr[--d] = 0;
}

I wonder if this will be resolved by the polysemous value change in 2.0, whenever that happens?

Sean
Jan 24 2008
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Oskar Linde:
 On a similar topic, here are two of my favorite bugs:

In ObjectPascal you often restrict numbers to an interval (like 1..20) to *avoid* certain bugs, so you often use an unsigned integer if you want to be sure a number is never negative. In D I do the opposite: I usually use signed numbers (except in the very uncommon situations where I need to use the number as a bit sequence, and a few other situations) because they seem a bit safer. array.length being unsigned probably leads to some bugs, so maybe it's better to waste one bit and use a signed integer to increase code safety (this halves the max array length on 32 bit CPUs; half the range on 64 bit CPUs isn't a problem, I presume). There are other solutions to this problem, like improving the upcasting, or introducing integer overflow checks.

Bye,
bearophile
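A minimal sketch of the last two alternatives (widening before the addition, and checking for overflow), written in D2 syntax and not taken from the post. It assumes the core.checkedint module of the current D runtime; the helper names wideSum and sumOverflows are made up for illustration. The values are the ones from point 16.

```d
import core.checkedint : adds, addu;

// Hypothetical helper: widen the operands before adding, so the sum fits.
ulong wideSum(uint a, uint b)
{
    return cast(ulong) a + b;
}

// Hypothetical helper: report whether a 32-bit unsigned add would wrap.
bool sumOverflows(uint a, uint b)
{
    bool overflow = false;
    uint wrapped = addu(a, b, overflow); // sets overflow on wraparound
    return overflow;
}

// Same check for signed operands.
bool sumOverflows(int a, int b)
{
    bool overflow = false;
    int wrapped = adds(a, b, overflow);
    return overflow;
}

void main()
{
    uint a = 3_000_000_000U;
    uint b = 3_000_000_000U;
    assert(a + b == 1_705_032_704U);          // plain uint addition wraps
    assert(wideSum(a, b) == 6_000_000_000UL); // widened sum is exact
    assert(sumOverflows(a, b));

    int c = -1_600_000_000;
    int d = -1_600_000_000;
    assert(c + d == 1_094_967_296);           // signed addition wraps too
    assert(cast(long) c + d == -3_200_000_000L);
    assert(sumOverflows(c, d));
}
```

Widening costs nothing at the call site but only moves the limit to 64 bits; the checked version keeps the 32-bit type and leaves it to the caller to decide what an overflow should mean.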
Jan 24 2008
prev sibling next sibling parent reply Jarrod <qwerty ytre.wq> writes:
Okay you wrote a lot of stuff, and I generally agree for a lot of it, but 
I disagree for a few things.

 
 1) In D I often enough do mistekes like this: foreach (i, a, myobj)
 But usually not this one:
 foreach (i; a; myobj)
 There for my eyes it's not always easy to spot the difference between
 "," and ";", so I think a "in" instead of ";" (as used in Python and C#
 too) can be seen better: foreach (i, a in myobj)

I don't really see what this fixes. It seems you just prefer to use 'in' because you're used to Python.
 3) In my D code I keep writing "length" all the time (currently I can
 find 458 "length" words inside my d libs). It's a long word, $ inside
 the scope of [] helps reducing the typing, but I often enough write
 "lenght", so I think still a default attribute named "len" (as in
 Python) may be better than "length". The attribute "dup" too is an
 abbreviation, probably of "duplicate", so abbreviations seem acceptable
 in such context.

'len' is kind of ugly, something like 'size' would be nice.
 4) Use + instead of ~

No. + means add. Not concatenate. I really don't want to see + being used as both the addition and the concatenation operator. A different unused operator, maybe something like a comma, would be fine if you think a tilde is an uncommon keyboard input character, but not a +.
 12) In Python functions are objects, so you can add them attributes: def
 foo(inc):
   foo.tot += inc
   return foo.tot
 foo.tot = 10
 foo(10)
 print foo.tot # prints 20
 
 So is it a silly idea to allow public static variables in D functions,
 to do something similar?
 
 int foo(int inc) {
   public static int tot;
   foo.tot += inc;
   return foo.tot;
 }
 void main() {
   foo.tot = 10;
   foo(10);
   printf("%d\n", foo.tot);
 }

Couldn't you just make an object for this in the first place? I don't want to see functions being used as objects when they should be.. well, functions.
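One way to read that suggestion, as a sketch in D2 syntax rather than anything proposed in the thread: a struct (here given the hypothetical name Foo) holds the total as a static member, and a static opCall keeps the call syntax of the example above.

```d
struct Foo
{
    static int tot;            // one total for all calls, reachable as Foo.tot

    static int opCall(int inc) // lets Foo(10) read like a function call
    {
        tot += inc;
        return tot;
    }
}

void main()
{
    Foo.tot = 10;
    int total = Foo(10);
    assert(total == 20);
    assert(Foo.tot == 20);
}
```

The usage matches the Python example line for line, at the cost of declaring a type instead of a plain function.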
 18b)
 Here indentation doesn't follow the program meaning: if (x == 0)
     if (y == 0)
         foo();
 else
     z = x + y;
 
 In Python such possible source of bugs is absent, because the
 indentation is the only important thing, so this:
 
 if x == 0:
     if y == 0:
         foo()
 else:
     z = x + y

Most people don't like the forced indentation in Python and I'm one of those people. Enforcing 'prettier' code limits personal styles and makes for a slower implementation. No one wants to spend time worrying about how their code looks when sometimes they just want to get it done and move on. I doubt many bugs would occur from this, unless someone was incredibly messy when it came to coding.

The rest of your points I pretty much agree with.
Jan 25 2008
next sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Jarrod wrote:
 Okay you wrote a lot of stuff, and I generally agree for a lot of it, but 
 I disagree for a few things.
 
  
 1) In D I often enough do mistekes like this: foreach (i, a, myobj)
 But usually not this one:
 foreach (i; a; myobj)
 There for my eyes it's not always easy to spot the difference between
 "," and ";", so I think a "in" instead of ";" (as used in Python and C#
 too) can be seen better: foreach (i, a in myobj)

I don't really see what this fixes. It seems you just prefer to use 'in' because you're used to python.

It's funny you should say that, because just today I was staring at a couple of foreaches that were giving me compiler errors and scratching my head. Turns out I had written this:

      foreach(i,ang, angles) {
          ...
      }

Now maybe you can see right away what's wrong, but in my somewhat sleepy state and with my preferred ProggyTT programming font, I just didn't notice the missing pixel there. It's much harder to mistake a comma for an 'in'.
 Most people don't like the forced indentation in python and I'm one of 
 those people.

Hmm, that doesn't sound like a scientific study. Python has many more users than D despite the fact that apparently most people don't like it. That said, I think trying to change such a basic part of D's syntax at this point is a bad idea. --bb
Jan 25 2008
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Janice Caron:
 For that matter, it has two meanngs. So wouldn't
     foreach (a; b in c)
 parse as
     foreach (a; (b in c))
 which might actually make sense if c defines opIn_r(). So you'd still
 end up having to distinguish a comma from a semicolon anyway.

But this situation is probably 100 times less common than the normal usage of foreach, so this may produce far fewer bugs in the code, and you can add ().

Bye,
bearophile
Jan 25 2008
prev sibling parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Jarrod wrote:
 On Fri, 25 Jan 2008 18:08:46 +0900, Bill Baxter wrote:
 
 It's funny you should say that, because just today I was staring at a
 couple of foreaches that were giving me compiler errors and scratching
 my head.  Turns out I had written this:

       foreach(i,ang, angles) {
           ...
       }

 Now maybe you can see right away what's wrong, but in my somewhat sleepy
 state and with my preferred ProggyTT programming font, I just didn't
 notice the missing pixel there.   It's much harder to mistake a comma
 for an 'in'.

But there will still be a comma or a semicolon there for cases where you want the index. Wouldn't just allowing commas and semicolons work better?

Was that part of the proposal? I was just wanting the semicolon to be replaced by 'in' in all cases so you'd never see a semicolon in a foreach. Ever. Anyway this has been asked for before. Walter said "no" because the grammar would have to be special cased to treat 'in' differently inside the foreach. It's a reasonable decree. I still pine for it, though, every time I accidentally put a comma in there instead of a semicolon. :-)
 
 Hmm, that doesn't sound like a scientific study.  Python has many more
 users than D despite the fact that apparently most people don't like it.

Perhaps I should say 'many' instead of 'most' don't like it. I frequent and moderate a few programming forums, and trust me, there are a *lot* of groans and moans about it. *Especially* from the lisp/scheme fans. They seem to hate it with a passion. Some have even written some 'interesting' poems about it. True story.

Ok, that is certainly true. I myself thought it was ridiculous till I tried it. Now I merely think of it as an acceptable alternative. I think it's mostly a wash. Slightly cleaner-looking code at the cost of some slight annoyances. The biggest of those to me is that you have to be a little more careful when you copy-paste a block of code in the middle of some function. In many cases an editor cannot tell what nesting level you intend for it to be at. But it's not a big problem by any means. --bb
Jan 26 2008
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On Jan 25, 2008 9:08 AM, Bill Baxter <dnewsgroup billbaxter.com> wrote:
 It's funny you should say that, because just today I was staring at a
 couple of foreaches that were giving me compiler errors and scratching
 my head.  Turns out I had written this:

       foreach(i,ang, angles) {
           ...
       }

 Now maybe you can see right away what's wrong, but in my somewhat sleepy
 state and with my preferred ProggyTT programming font, I just didn't
 notice the missing pixel there.   It's much harder to mistake a comma
 for an 'in'.

There is a slight problem there, which is that "in" already has a meaning, and one day "foreach" might understand it. As in:

    class C { /*...*/ }
    C[] array;

    foreach(in element; array)

meaning, the body of foreach may not modify any of C's member variables. (Currently, foreach does not play nice with constancy, but one might suppose this will be fixed in the future).
Jan 25 2008
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On Jan 25, 2008 9:22 AM, Janice Caron <caron800 googlemail.com> wrote:
 There is a slight problem there, which is that "in" already has a
 meaning

For that matter, it has two meanings. So wouldn't

    foreach (a; b in c)

parse as

    foreach (a; (b in c))

which might actually make sense if c defines opIn_r(). So you'd still end up having to distinguish a comma from a semicolon anyway.
Jan 25 2008
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Jarrod:

I don't really see what this fixes.<

It improves code readability, probably reducing total bug count. Two common (and well enough designed) languages do it (Python and C#) so you have to take the idea seriously before refusing it.
It seems you just prefer to use 'in' because you're used to python.<

I am used to programming in many different languages, as probably most D programmers are :-)
'len' is kind of ugly, something like 'size' would be nice.<

size is better than length, so I am okay with it.
Couldn't you just make an object for this in the first place? I don't want to
see functions being used as objects when they should be.. well, functions.<

Static variables are already available in functions, so that's just a dotted way to access those variables. That idea may be a way to keep memory usage low and to implement singletons (the pattern) in a simple way :-)
Most people don't like the forced indentation in python and I'm one of those
people.<

Some people don't like it, most Python/Haskell programmers like it, I have liked it before knowing Python or the citation of Knuth :-) Note that in my post I haven't suggested to change D to introduce significant whitespace like in Python, I have just shown a possible source of D bugs that Python avoids... if you are right when you say that's not a common bug (and I may agree with you) then no changes are necessary.
Enforcing 'prettier' code limits personal styles<

Enforcing a common style is often positive: if you program at a professional level, with many hands on the same code, you too can see that style standards are widely used and useful. Even if you write code alone, you will probably have to hand the code to others later, or receive it from them, so a more standard way of writing it improves code sharing, readability, etc. The net result is that you are more likely to find on the internet the module you need for your code right now.
and makes for a slower implementation.<

I don't believe/understand this.
No one wants to spend time worrying about how their code looks when sometimes
they just want to get it done and move on.<

Then few people will want to read/maintain/modify the code you write, even if all you write are rarely used 20-line Perl scripts. My code is elegant, well written, and I'm proud of it ;-)

Bye and thank you for your answers,
a bear hug,
bearophile
Jan 25 2008
parent reply "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
bearophile <bearophileHUGS lycos.com> wrote:

 'len' is kind of ugly, something like 'size' would be nice.<

size is better than length, so I am okay with it.

Problem with 'size' might be confusion with 'sizeof'. 'lenght' is a common typo for me, so it might be worthwhile to change. Any other fitting names? 'count'? Simen
Jan 25 2008
parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Simen Kjaeraas wrote:
 bearophile <bearophileHUGS lycos.com> wrote:
 
 'len' is kind of ugly, something like 'size' would be nice.<

size is better than length, so I am okay with it.

Problem with 'size' might be confusion with 'sizeof'. 'lenght' is a common typo for me, so it might be worthwhile to change. Any other fitting names? 'count'? Simen

Count doesn't make sense for multidimensional arrays. Anyway, I don't think it's confusing at all. It's what C++ uses. I've never heard anyone complain STL was confusing or difficult because it used "size" everywhere when "sizeof" was already part of the language. (Of course that could be because they were too busy complaining about other things in STL that really are too confusing and difficult ;-) ) --bb
Jan 25 2008
prev sibling next sibling parent Jarrod <qwerty ytre.wq> writes:
On Fri, 25 Jan 2008 18:08:46 +0900, Bill Baxter wrote:

 It's funny you should say that, because just today I was staring at a
 couple of foreaches that were giving me compiler errors and scratching
 my head.  Turns out I had written this:
 
       foreach(i,ang, angles) {
           ...
       }
 
 Now maybe you can see right away what's wrong, but in my somewhat sleepy
 state and with my preferred ProggyTT programming font, I just didn't
 notice the missing pixel there.   It's much harder to mistake a comma
 for an 'in'.

But there will still be a comma or a semicolon there for cases where you want the index. Wouldn't just allowing commas and semicolons work better?
 Hmm, that doesn't sound like a scientific study.  Python has many more
 users than D despite the fact that apparently most people don't like it.

Perhaps I should say 'many' instead of 'most' don't like it. I frequent and moderate a few programming forums, and trust me, there are a *lot* of groans and moans about it. *Especially* from the lisp/scheme fans. They seem to hate it with a passion. Some have even written some 'interesting' poems about it. True story.
Jan 26 2008
prev sibling parent Jarrod <qwerty ytre.wq> writes:
On Fri, 25 Jan 2008 06:03:19 -0500, bearophile wrote:

 It improves code readability, probably reducing total bug count. Two
 common (and well enough designed) languages do it (Python and C#) so you
 have to take the idea seriously before refusing it.

I did. But all you did was replace a semicolon with the word 'in'. You still need a semicolon or comma there half the time. Why not just allow commas or semicolons? Better than reusing 'in'.
 Static variables are already available in functions, so that's just a
 dotted way to access those variables. That idea may be a way to keep
 memory usage low and to implment singletons (pattern) in a simple way
 :-)

Well, I guess. I'm still teeter-tottering on that one because it sort of changes the 'meaning' of a function to me. But hey I won't really mind if it's added or not.
 ...
and makes for a slower implementation.<

I don't belive/understand this.

Because I have to worry about how indented my code is before I can move on. Sometimes I want it on one line to look more compact, or because I'm just punching out code as fast as I can think it up and I don't want to worry about its indentation. When I'm done and it works, if I can't read it and need to look at it, Perltidy/Autoindent to the rescue.
No one wants to spend time worrying about how their code looks when
sometimes they just want to get it done and move on.<

Then few people will want to read/mantain/modify the code you write, even if all you write are rarely used 20-lines long Perl scripts. My code is elegant, well written, and I'm proud of it ;-)

The majority of my code is pretty readable too, but I don't want to be /forced/ to make it look good. Just like how I don't want to be forced to use braces (Perl does, and I really, really don't like it. Fortunately Perl allows reverse conditionals without braces so it rarely happens).
Jan 26 2008
prev sibling parent reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
bearophile wrote:

 18b)
 Here indentation doesn't follow the program meaning:
 if (x == 0)
     if (y == 0)
         foo();
 else
     z = x + y;
 

I've only been bitten by this maybe two times ever, but one was quite recent and took a fair amount of time to track down. (I went as far as reading the compiler assembler output before realizing what was wrong... and feeling quite stupid.)

I think it went something like starting with

if (a)
    b();
else
    c();

and later mindlessly changing several b(); into if(d) b();
 But Making D *require* {} after for, while, if/else, ecc looks like a
"draconian" way to avoid that kind of bugs.

I agree. A working compromise would be to disallow only "ambiguous" else clauses, forcing one to add {} in those cases only. -- Oskar
Jan 25 2008
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Oskar Linde:
 A working compromise would be to disallow only "ambiguous" else 
 clauses, forcing one to add {} in those cases only.

I like this idea, can you show a few examples? If it's a good compromise then it may be used in D 2.x...

Bye,
bearophile
Jan 25 2008
parent Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
bearophile wrote:
 Oskar Linde:
 A working compromise would be to disallow only "ambiguous" else 
 clauses, forcing one to add {} in those cases only.

I like this idea, can you show few examples? If it's a good compromise then it may be used in D 2.x...

D, C and many other languages have grammar rules such as

S : if ( E ) S else S
S : if ( E ) S
S : other ;

under such rules, a program such as

if(a) if(b) f1(); else f2();

could have two ambiguous interpretations:

1. if(a) { if(b) f1(); else f2(); }
2. if(a) { if(b) f1(); } else f2();

D (and many other languages) disambiguate by always picking interpretation 1. The else always matches the most recent if.

One could instead flag this as an error, requiring the addition of braces whenever an ambiguous dangling else is found. For example:

void main() {
    if (a)
        if (b)
            f1();
    else
        f2();
}

test.d:5: error: ambiguous 'else' found. Please disambiguate with braces {}.

To resolve this error, one would have to add one pair of braces. Either:

void main() {
    if (a) {
        if (b)
            f1();
    } else
        f2();
}

Or:

void main() {
    if (a) {
        if (b)
            f1();
        else
            f2();
    }
}

I hope I've made it a bit clearer.

Regards,

-- 
Oskar
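To see where the two interpretations actually diverge, here is a small sketch in D2 syntax (an illustration, not part of the proposal). Both readings are written with explicit braces, and each function returns the name of the branch it takes; the function names innerElse and outerElse are hypothetical.

```d
// Interpretation 1: the else belongs to the inner if (what D and C choose).
string innerElse(bool a, bool b)
{
    if (a) { if (b) return "f1"; else return "f2"; }
    return "none";
}

// Interpretation 2: the else belongs to the outer if.
string outerElse(bool a, bool b)
{
    if (a) { if (b) return "f1"; }
    else return "f2";
    return "none";
}

void main()
{
    // The readings agree only when a and b are both true.
    assert(innerElse(true, true) == "f1" && outerElse(true, true) == "f1");

    // a true, b false: only the inner-else reading reaches f2.
    assert(innerElse(true, false) == "f2");
    assert(outerElse(true, false) == "none");

    // a false: only the outer-else reading reaches f2.
    assert(innerElse(false, true) == "none");
    assert(outerElse(false, true) == "f2");
}
```

So three of the four input combinations behave differently depending on the reading, which is why code indented for one reading but parsed as the other can fail in ways that are hard to spot.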
Jan 27 2008
prev sibling next sibling parent Don Clugston <dac nospam.com.au> writes:
Oskar Linde wrote:
 bearophile wrote:
 
 18b)
 Here indentation doesn't follow the program meaning:
 if (x == 0)
     if (y == 0)
         foo();
 else
     z = x + y;

I've only been bitten by this maybe two times ever, but one was quite recent and took a fair amount of time to track down. (I went as far as reading the compiler assembler output before realizing what was wrong... and feeling quite stupid.)

I think it went something like starting with

if (a)
    b();
else
    c();

and later mindlessly changing several b(); into if(d) b();
 But Making D *require* {} after for, while, if/else, ecc looks like a 
 "draconian" way to avoid that kind of bugs.

I agree. A working compromise would be to disallow only "ambiguous" else clauses, forcing one to add {} in those cases only.

I like that. That would catch almost all of the bugs, with no effect on good code.
Jan 25 2008
prev sibling next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Oskar Linde Wrote:

 bearophile wrote:
 
 18b)
 Here indentation doesn't follow the program meaning:
 if (x == 0)
     if (y == 0)
         foo();
 else
     z = x + y;
 

I've only been bitten by this maybe two times ever, but one was quite recent and took a fair amount of time to track down. (I went as far as reading the compiler assembler output before realizing what was wrong... and feeling quite stupid.)

I think it went something like starting with

if (a)
    b();
else
    c();

and later mindlessly changing several b(); into if(d) b();
 But Making D *require* {} after for, while, if/else, ecc looks like a
"draconian" way to avoid that kind of bugs.

I agree. A working compromise would be to disallow only "ambiguous" else clauses, forcing one to add {} in those cases only.

My personal style is to use {} after an if/else/while/etc whenever the stuff that follows uses more than one line with normal indentation practices.

For me, I'd never write:
if (x == 0)
    if (y == 0)
        foo();

But would instead write:
if (x == 0){
    if (y == 0)
        foo();
}

Maybe I've just done too much coding/debugging/recoding/sharing of code/etc... It just seems safer to me. It also happens to solve the whole else issue.

I wonder if something like this would be an acceptable safety feature in D? Would anyone complain about the above being an error? Making it a warning could be a compromise, but Walter dislikes warnings.
Jan 25 2008
parent reply Matti Niemenmaa <see_signature for.real.address> writes:
Jason House wrote:
 Oskar Linde Wrote:
 I think it went something like starting with
 
 if (a)
     b();
 else
     c();
 
 and later mindlessly changing several b(); into if(d) b();


 My personal style is to use {} after an if/else/while/etc whenever the stuff
 that follows uses more than one line with normal indentation practices.
 
 For me, I'd never write:
 if (x == 0)
     if (y == 0)
         foo();
 
 But would instead write:
 if (x == 0){
     if (y == 0)
         foo();
 }

 Maybe I've just done too much coding/debugging/recoding/sharing of 
 code/etc... It just seems safer to me. It also happens to solve the whole
 else issue.
 
 I wonder if something like this would be an acceptable safety feature in D? 
 Would anyone complain about the above being an error?  Making it a warning 
 could be a compromise, but Walter dislikes warnings.

I would write:

if (x == 0) if (y == 0) foo();

Which makes the else-association clear.

There are many ways of dealing with this and I don't think it's necessary to make it a D or DMD feature. A compiler which keeps track of whitespace (which I doubt (m)any do) could warn about it, I suppose.

-- 
E-mail address: matti.niemenmaa+news, domain is iki (DOT) fi
Jan 25 2008
parent Robert Fraser <fraserofthenight gmail.com> writes:
Matti Niemenmaa wrote:
 There are many ways of dealing with this and I don't think it's 
 necessary to make it a D or DMD feature. A compiler which keeps track of 
 whitespace (which I doubt (m)any do) could warn about it, I suppose.

Descent's compiler does (well, can)
Jan 25 2008
prev sibling parent Jesse Phillips <jessekphillips gmail.com> writes:
On Fri, 25 Jan 2008 09:26:13 -0500, bearophile wrote:

 Oskar Linde:
 A working compromise would be to disallow only "ambiguous" else
 clauses, forcing one to add {} in those cases only.

I like this idea, can you show few examples? If it's a good compromise then it may be used in D 2.x... Bye, bearophile

Well what I would think is:

if(true) if(false) a() else b()

This would not be allowed, that is to say an if statement with a corresponding else must use braces:

if(true) if(false) { a() } else b()

In any case I like this idea and will now be using it for my own coding style.
Jan 25 2008