digitalmars.D - Top 5

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Ok, per Aarti's suggestion: without speaking officially for Walter, let 
me ask this - what do you think are the top issues you'd like to see 
fixed in D?

Andrei
Oct 08 2008
next sibling parent reply Frank Benoit <keinfarbton googlemail.com> writes:
Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let
 me ask this - what do you think are the top issues you'd like to see
 fixed in D?
 
 Andrei

class to interface compatibility
Oct 08 2008
next sibling parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Frank Benoit wrote:
 Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let
 me ask this - what do you think are the top issues you'd like to see
 fixed in D?

 Andrei

class to interface compatibility

This.
Oct 09 2008
next sibling parent Christopher Wright <dhasenan gmail.com> writes:
Robert Fraser wrote:
 Frank Benoit wrote:
 Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let
 me ask this - what do you think are the top issues you'd like to see
 fixed in D?

 Andrei

class to interface compatibility

This.

Complete agreement.
Oct 09 2008
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Robert Fraser wrote:
 Frank Benoit wrote:
 Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let
 me ask this - what do you think are the top issues you'd like to see
 fixed in D?

 Andrei

class to interface compatibility

This.

What is it? Andrei
Oct 09 2008
next sibling parent Sergey Gromov <snake.scaly gmail.com> writes:
Thu, 09 Oct 2008 07:23:30 -0500,
Andrei Alexandrescu wrote:
 Robert Fraser wrote:
 Frank Benoit wrote:
 Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let
 me ask this - what do you think are the top issues you'd like to see
 fixed in D?

 Andrei

class to interface compatibility

This.

What is it?

I think it's this:

interface I {}
class A : I {}
I i = new A;
i.classinfo.name; // "I" ?????
Oct 09 2008
prev sibling parent Frank Benoit <keinfarbton googlemail.com> writes:
Andrei Alexandrescu wrote:
 Robert Fraser wrote:
 Frank Benoit wrote:
 Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let
 me ask this - what do you think are the top issues you'd like to see
 fixed in D?

 Andrei

class to interface compatibility

This.

What is it? Andrei

interface I { }
I i1 = getInstance();
I i2 = getAnotherInstance();
bool res = (cast(Object)i1).opEquals( cast(Object) i2 );
bool res = i1 == i2; // should be this

If you really use interfaces and code that simply manages some objects, you often need those silly casts.

alias HashSet!(I) MySet; // error, interface has no .toHash

Every interface is in fact at least of type Object.
Except M$-COM<<< But this is a _special_ case and can be handled in a



in D there is no compatibility between interfaces and class Object.
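
For readers who want to try the casts described above, here is a minimal sketch in D. The helper name `sameObject` and the empty `I` and `A` types are made up for illustration; the code only shows the explicit-cast workaround the post complains about, it is not a fix.

```d
interface I {}
class A : I {}

/// Hypothetical helper: compares two interface references by going
/// through Object explicitly, because the interface type itself does
/// not expose Object's methods.
bool sameObject(I a, I b)
{
    // Both casts are written out by hand; this is the boilerplate
    // the post would like the language to make unnecessary.
    return (cast(Object) a).opEquals(cast(Object) b);
}

unittest
{
    I i1 = new A;
    I i2 = i1;
    assert(sameObject(i1, i2));      // same instance
    assert(!sameObject(i1, new A));  // default opEquals compares identity
}
```

Compiling with `-unittest -main` runs the checks. The default `Object.opEquals` compares identity, so a class that needs value comparison would still have to override it.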
Oct 09 2008
prev sibling parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Thu, Oct 9, 2008 at 8:49 AM, Sergey Gromov <snake.scaly gmail.com> wrote:
 I think it's this:

 interface I {}
 class A : I {}
 I i = new A;
 i.classinfo.name; // "I" ?????

And furthermore,

Object o = i; // should be no error
Oct 09 2008
prev sibling next sibling parent Frank Benoit <keinfarbton googlemail.com> writes:
Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let
 me ask this - what do you think are the top issues you'd like to see
 fixed in D?
 
 Andrei

Fix full closures. Give back the possibility of using them without heap allocation, as in D1.
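
As a hedged illustration of what a stack-only closure can look like, the sketch below uses features from later D2 compilers, not the 2008 ones discussed here. Marking a delegate parameter `scope` promises that the delegate does not outlive the call, so the captured variables can stay on the stack. `applyTwice` and `addOffset` are hypothetical names; `@nogc` is used only so that the compiler rejects any hidden GC allocation.

```d
/// `scope` promises that `dg` is not stored anywhere after the call.
int applyTwice(scope int delegate(int) @nogc dg, int x) @nogc
{
    return dg(dg(x));
}

/// The lambda captures `offset`; because it is only passed to a
/// `scope` parameter, no heap closure is required, which is why this
/// function can be marked @nogc.
int addOffset(int x, int offset) @nogc
{
    return applyTwice(y => y + offset, x);
}

unittest
{
    assert(addOffset(1, 10) == 21); // (1 + 10) + 10
}
```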
Oct 08 2008
prev sibling next sibling parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Wed, Oct 8, 2008 at 4:07 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let me
 ask this - what do you think are the top issues you'd like to see fixed in
 D?

 Andrei

Reasonable varargs that don't require either horrible platform-dependent machinations or massive code bloat.

A swift resolution to the standard library / runtime debacle. Iron out licensing later.

Removal of SFINAE.

Oh, and I suppose every bug in zilla ;)
Oct 08 2008
prev sibling next sibling parent reply KennyTM~ <kennytm gmail.com> writes:
Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?
 
 Andrei

1. Running speed. Especially for associative arrays. (Well that's not a problem in the D Spec, but still...)
2. In the future don't overload meaning of some keyword (enum, invariant) when better alternatives (define, immutable) are there.
3. AST Macros.
4. Return by reference.
5. Pure functions.

(Personally I'd also like to see a dmd port for Darwin (that includes Mac OS X already), but since gdc and later (should) llvmdc also works I guess this isn't very important.)
Oct 08 2008
parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
KennyTM~ wrote:
 Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, 
 let me ask this - what do you think are the top issues you'd like to 
 see fixed in D?

 Andrei

1. Running speed. Especially for associative arrays. (Well that's not a problem in the D Spec, but still...)
2. In the future don't overload meaning of some keyword (enum, invariant) when better alternatives (define, immutable) are there.
3. AST Macros.
4. Return by reference.
5. Pure functions.

"to see fixed in D" 2, 3, 4 and 5 are not broken: they don't exist yet.
Oct 08 2008
parent KennyTM~ <kennytm gmail.com> writes:
Ary Borenszweig wrote:
 KennyTM~ wrote:
 Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, 
 let me ask this - what do you think are the top issues you'd like to 
 see fixed in D?

 Andrei

1. Running speed. Especially for associative arrays. (Well that's not a problem in the D Spec, but still...)
2. In the future don't overload meaning of some keyword (enum, invariant) when better alternatives (define, immutable) are there.
3. AST Macros.
4. Return by reference.
5. Pure functions.

"to see fixed in D" 2, 3, 4 and 5 are not broken: they don't exist yet.

Then you just need to look for the highest priority bugs in bugzilla.
Oct 09 2008
prev sibling next sibling parent reply Mike <vertex gmx.at> writes:
Andrei Alexandrescu Wrote:

 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?
 
 Andrei

Mostly some things from C#:

1. Real Properties Please! Seriously ... please.

2. Attributes - and deprecate some keywords we wouldn't need anymore

// just an example; I don't really want to pick on foreach_reverse, although I wouldn't mind if it died :)

foreach_reverse(a; b) { ... }

[reversed] foreach(a; b) { ... }

3. Extension methods

4. Replace C-style switch with something modern and more D-like:

switch (x)
{
    case (0) foo();
    case (1)
    {
        bar();
        baz();
    }
    else throw new Exception("nope");
}

5. In the future: LINQ
Oct 08 2008
next sibling parent reply "Chris R. Miller" <lordsauronthegreat gmail.com> writes:
Mike wrote:
 4. Replace C-style switch with something modern and more D-like:
 
 switch (x)
 {
     case (0) foo();
     case (1)
     {
         bar();
         baz();
     }
     else throw new Exception("nope");
 }

But that would break techniques such as Duff's device. The existing syntax of a switch statement is more accurate to how it behaves, anyways. It's like a more complex series of goto statements.

http://en.wikipedia.org/wiki/Duff%27s_device
http://en.wikipedia.org/wiki/Switch_statement

Your proposal really makes it redundant with chains of if-else statements IMHO.
Oct 08 2008
parent reply "Chris R. Miller" <lordsauronthegreat gmail.com> writes:
Bruce Adams wrote:
 On Thu, 09 Oct 2008 01:08:54 +0100, Chris R. Miller 
 <lordsauronthegreat gmail.com> wrote:
 
 Mike wrote:
 4. Replace C-style switch with something modern and more D-like:
  switch (x)
 {
     case (0) foo();
     case (1)
     {
         bar();
         baz();
     }
     else throw new Exception("nope");
 }

But that would break the techniques of a duff's device. The existing syntax of a switch statement is more accurate to how it behaves, anyways. It's like a more complex series of goto statements. http://en.wikipedia.org/wiki/Duff%27s_device

Duff's device is a perversion. The compiler should be left to perform those sorts of optimisations.

I really don't trust the compiler to make those optimizations. What if someone makes a different compiler implementation and forgets to include the new implicit optimizations? I also know people who work with real-time signal analysis algorithms, and the flexibility of the switch statement really aids their bottom line of performance.
 http://en.wikipedia.org/wiki/Switch_statement

 Your proposal really makes it redundant with chains of if-else 
 statements IMHO.

The difference is a switch might be optimised into a look-up table. A chain of if/then/else's requires a different kind of optimisation though it might end up in the same form. Though the main point is nothing to do with optimisation. It's about clarity of intent. A switch selects possible values of a single item and can be expected (in certain cases) to cover all cases. That is never true for if-then-else.

I was really thinking of if-if chains, sorry for the confusion, it was a brain-language-keyboard deficiency. The switch statement as it currently is allows you a great degree of flexibility if you know how to (ab)use it. If it's so dumb, don't use it. I really am not amenable to the concept of changing the switch statement on such a fundamental level unless some clear advantage can be established that doesn't compromise existing functionality.
Oct 10 2008
parent "Nick Sabalausky" <a a.a> writes:
"Chris R. Miller" <lordsauronthegreat gmail.com> wrote in message 
news:gcorva$2uf6$1 digitalmars.com...
 Bruce Adams wrote:
 On Thu, 09 Oct 2008 01:08:54 +0100, Chris R. Miller 
 <lordsauronthegreat gmail.com> wrote:

 Mike wrote:
 4. Replace C-style switch with something modern and more D-like:
  switch (x)
 {
     case (0) foo();
     case (1)
     {
         bar();
         baz();
     }
     else throw new Exception("nope");
 }

But that would break the techniques of a duff's device. The existing syntax of a switch statement is more accurate to how it behaves, anyways. It's like a more complex series of goto statements. http://en.wikipedia.org/wiki/Duff%27s_device

Duff's device is a perversion. The compiler should be left to perform those sorts of optimisations.

I really don't trust the compiler to make those optimizations. What if someone makes a different compiler implementation and forgets to include the new implicit optimizations? I also know people who work with real-time signal analysis algorithms, and the flexibility of the switch statement really aids their bottom line of performance.
 http://en.wikipedia.org/wiki/Switch_statement

 Your proposal really makes it redundant with chains of if-else 
 statements IMHO.

The difference is a switch might be optimised into a look-up table. A chain of if/then/else's requires a different kind of optimisation though it might end up in the same form. Though the main point is nothing to do with optimisation. It's about clarity of intent. A switch selects possible values of a single item and can be expected (in certain cases) to cover all cases. That is never true for if-then-else.

I was really thinking of if-if chains, sorry for the confusion, it was a brain-language-keyboard deficiency. The switch statement as it currently is allows you a great degree of flexibility if you know how to (ab)use it. If it's so dumb, don't use it. I really am not amenable to the concept of changing the switch statement on such a fundamental level unless some clear advantage can be established that doesn't compromise existing functionality.

I don't see any reason why adding a more modern switch would necessitate getting rid of the existing switch. I have no problem keeping the existing switch in the language, but I've always wanted a switch like this:

switch(any expression)
case(7) {}
case(2, 4, 10) {}
case(>= 25) {}
case(> 17, <= -10) {}
case(in 20..22) {}
case(!in MyIntArray) {}

Yea, I *can* code that with ifs (and currently do), but there are times when all of those ifs basically boil down to branching the execution based on the value of one single expression. That's a switch. But the conditions aren't always just a simple "==".

And if someone wants to make an old-style switch to manually implement a duff's, I say fine, let them.
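
Part of this wish list can be written with an ordinary `switch` in later D2 compilers, which accept several values per `case` and inclusive case ranges. The sketch below is only an approximation: the relational forms (`>= 25`, `!in MyIntArray`) still need `if` chains, and `classify` with its return strings is a made-up example.

```d
/// Hypothetical example: maps an integer to a label using
/// multi-value cases and an inclusive case range.
string classify(int x)
{
    string result;
    switch (x)
    {
        case 7:
            result = "seven";
            break;
        case 2, 4, 10:            // several values share one branch
            result = "listed";
            break;
        case 20: .. case 22:      // inclusive range: 20, 21, 22
            result = "20 to 22";
            break;
        default:
            result = "other";
            break;
    }
    return result;
}

unittest
{
    assert(classify(4) == "listed");
    assert(classify(22) == "20 to 22");
    assert(classify(23) == "other");
}
```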
Oct 10 2008
prev sibling next sibling parent ore-sama <spam here.lot> writes:
Mike Wrote:

 4. Replace C-style switch with something modern and more D-like:
 
 switch (x)
 {
     case (0) foo();
     case (1)
     {
         bar();
         baz();
     }
     else throw new Exception("nope");
 }

looks interesting :)
Oct 09 2008
prev sibling next sibling parent reply Leandro Lucarella <llucax gmail.com> writes:
Mike, on October 8 at 16:51 you wrote to me:
 2. Attributes - and deprecate some keywords we wouldn't need anymore
 
 // just an example; I don't really want to pick on foreach_reverse, although I
wouldn't mind if it died :)
 
 foreach_reverse(a; b) { ... }
 
 [reversed] foreach(a; b) { ... }

I once saw somebody proposing template-like syntax for this:

foreach!(reversed)(a; b) { ... }

But, well, I think first all the new-template-syntax fuzz has to stop ;)

-- 
Leandro Lucarella (luca)
Oct 09 2008
parent reply Matti Niemenmaa <see_signature for.real.address> writes:
Leandro Lucarella wrote:
 Mike, on October 8 at 16:51 you wrote to me:
 2. Attributes - and deprecate some keywords we wouldn't need anymore

 // just an example; I don't really want to pick on foreach_reverse, although I
wouldn't mind if it died :)

 foreach_reverse(a; b) { ... }

 [reversed] foreach(a; b) { ... }

I once saw somebody proposing template-like syntax for this: foreach!(reversed)(a; b) { ... } But, well, I think first all the new-template-syntax fuzz has to stop ;)

We already have scope (success) and such, which need no !. Why not foreach (reversed)? Alternatively, since foreach got the ability to take a delegate, arrays should have the built-in property .reversed and then you could do foreach (a; b.reversed). -- E-mail address: matti.niemenmaa+news, domain is iki (DOT) fi
Oct 09 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
Matti Niemenmaa:
 Alternatively, since foreach got the ability to take a delegate, arrays should
 have the built-in property .reversed and then you could do foreach (a;
b.reversed).

Is that a lazy operation?

Another possibility is to add a (lazy) "view" (a struct plus function template), that reverses:

foreach (a; reversed(b))

Of course you can also write that as:

foreach (a; b.reversed())

That's a solution I use in my code.

Bye,
bearophile
Oct 09 2008
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
bearophile wrote:
 Matti Niemenmaa:
 Alternatively, since foreach got the ability to take a delegate, arrays should
 have the built-in property .reversed and then you could do foreach (a;
b.reversed).

Is that a lazy operation? Another possibility is to add a (lazy) "view" (a struct plus function template), that reverses: foreach (a; reversed(b)) Of course you can also write that as: foreach (a; b.reversed()) That's a solution I use in my code. Bye, bearophile

http://www.digitalmars.com/d/2.0/phobos/std_iterator.html#retro Andrei
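
The link above points at the 2008 Phobos layout. In present-day Phobos the same lazy reversing view is available as `std.range.retro`, so the idea can be sketched as follows; `collectReversed` is a hypothetical helper used only to make the result checkable.

```d
import std.range : retro;

/// Hypothetical helper: walks `b` back to front through the lazy
/// `retro` view and collects the elements it sees.
int[] collectReversed(int[] b)
{
    int[] seen;
    foreach (a; b.retro)   // no reversed copy of `b` is built
        seen ~= a;
    return seen;
}

unittest
{
    int[] b = [1, 2, 3];
    assert(collectReversed(b) == [3, 2, 1]);
    assert(b == [1, 2, 3]);   // the original array is untouched
}
```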
Oct 09 2008
prev sibling parent "Bruce Adams" <tortoise_74 yeah.who.co.uk> writes:
On Thu, 09 Oct 2008 01:08:54 +0100, Chris R. Miller  
<lordsauronthegreat gmail.com> wrote:

 Mike wrote:
 4. Replace C-style switch with something modern and more D-like:
  switch (x)
 {
     case (0) foo();
     case (1)
     {
         bar();
         baz();
     }
     else throw new Exception("nope");
 }

But that would break the techniques of a duff's device. The existing syntax of a switch statement is more accurate to how it behaves, anyways. It's like a more complex series of goto statements. http://en.wikipedia.org/wiki/Duff%27s_device

Duff's device is a perversion. The compiler should be left to perform those sorts of optimisations.
 http://en.wikipedia.org/wiki/Switch_statement

 Your proposal really makes it redundant with chains of if-else  
 statements IMHO.

The difference is a switch might be optimised into a look-up table. A chain of if/then/else's requires a different kind of optimisation though it might end up in the same form.

Though the main point is nothing to do with optimisation. It's about clarity of intent. A switch selects possible values of a single item and can be expected (in certain cases) to cover all cases. That is never true for if-then-else.
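
The "cover all cases" point can be made concrete with `final switch`, a construct added to D2 after this thread. In the sketch below the compiler rejects the switch if any enum member lacks a `case`; `Color` and `colorName` are made-up names.

```d
enum Color { red, green, blue }

/// Hypothetical example: every member of Color must be handled,
/// otherwise the final switch does not compile.
string colorName(Color c)
{
    string result;
    final switch (c)
    {
        case Color.red:
            result = "red";
            break;
        case Color.green:
            result = "green";
            break;
        case Color.blue:
            result = "blue";
            break;
    }
    return result;
}

unittest
{
    assert(colorName(Color.green) == "green");
}
```

An if-else chain over the same values would compile silently if a new member were added to `Color`, which is the difference in intent described above.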
Oct 09 2008
prev sibling next sibling parent reply Benji Smith <dlanguage benjismith.net> writes:
Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?

1. Merge phobos and tango (and solve whatever obstacles are currently preventing a D2 tango branch). I won't use D2 until I can use Tango with it, and I think having two semi-standard libraries is a bad state of affairs for the community.

2. Syntax tricks are usually more trouble than they're worth, especially when they create language ambiguities. Parentheses should be mandatory for function calls. Property-style function invocation should be eliminated. opCall should be fed to a crocodile and never heard from again.

3. The type-system is getting way too complex. Features like the const system, functional purity, and shared vs unshared memory are really neat ideas. But the implementation of all those features as aspects of the type-system creates a model that's hard for me to wrap my head around. In many cases, I don't know when I need to write casts or when to create duplicate function implementations (differing only in their prototype). This is the primary reason I've avoided D2.

4. String processing sucks. Writing code that works transparently with all three character types and correctly handles all unicode characters is basically impossible. Or at least it feels impossible. Trying to support both Phobos and Tango in the same string-processing routine is a definite no-go. In an ideal world, string-processing code shouldn't have to be littered with "static if" all over the place.

5. I don't really feel like D needs any new features. I'd rather have the majority of new work be focused on library development, tools, stability, optimization of the back-end, and maybe a rewrite of the linker. (I could be wrong, but I often get the impression that the status quo of the linker prevents certain language features and optimizations. We have a 21st century language, with a 1980's linker.)

HONORABLE MENTIONS:

* I'd love to be able to get a stack trace for every uncaught exception. I know that there are libraries to do this, but I'd like it to be built into the language, without any platform-dependent functionality. Stack traces on windows should be exactly the same as on linux.

* I keep hearing about how the DMC backend needs improvement in its floating-point code. Personally, I think this would be a bigger win than pure functions, whose optimization opportunities are still hypothetical.

--benji
Oct 08 2008
next sibling parent reply "Chris R. Miller" <lordsauronthegreat gmail.com> writes:
Benji Smith wrote:
 Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, 
 let me ask this - what do you think are the top issues you'd like to 
 see fixed in D?

1. Merge phobos and tango (and solve whatever obstacles are currently preventing a D2 tango branch). I won't use D2 until I can use Tango with it, and I think having two semi-standard libraries is a bad state of affairs for the community.

I get a few emails every month from people using Easy D that either run into problems with the whole Tango/Phobos problem or want to know why this (obvious) problem exists. I just try to leave it as "they have incompatible licenses" or some other non-partisan euphemism like that, explain why I distribute Tango instead of Phobos, and hope they don't go nuts and tell everyone in the ng the situation is stupid (it is) and that we should just shut up and fix it, licenses aside. While it's what I've wanted to do on several occasions, I don't think it'd really help anything to rattle cages. I feel it's probably hurting D adoption, and it's certainly made it more difficult for some people trying to learn D.
 4. String processing sucks. Writing code that works transparently with 
 all three character types and correctly handles all unicode characters 
 is basically impossible. Or at least it feels impossible. Trying to 
 support both Phobos and Tango in the same string-processing routine is a 
 definite no-go. In an ideal world, string-processing code shouldn't have 
 to be littered with "static if" all over the place.

How would this be "fixed?" Hint: don't suggest making strings an object. We tried that a while back, and it was more or less shot down.
Oct 08 2008
parent reply Benji Smith <dlanguage benjismith.net> writes:
Chris R. Miller wrote:
 Benji Smith wrote:
 4. String processing sucks. Writing code that works transparently with 
 all three character types and correctly handles all unicode characters 
 is basically impossible. Or at least it feels impossible. Trying to 
 support both Phobos and Tango in the same string-processing routine is 
 a definite no-go. In an ideal world, string-processing code shouldn't 
 have to be littered with "static if" all over the place.

How would this be "fixed?" Hint: don't suggest making strings an object. We tried that a while back, and it was more or less shot down.

Well, personally, I'd prefer it if strings were objects. But I could accept strings as character arrays if they were actually character arrays.

The current state of affairs, where strings are transparently just arrays of UTF-8 bytes, makes them impossible to work with. They're unindexable, unsliceable. You can't operate directly on those arrays. You're forced to use the phobos/tango functions (which, by the way, are incompatible with one another).

If D strings must be character arrays, I'd love for them to at least be ordinary arrays. Each element of a char[] array should be a single character. And if sizeof(char) == 1, then a char should be limited to a single byte. To represent multibyte characters, it should be necessary to use a wchar[] or dchar[] array.

--benji
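
A small sketch of the mismatch described above, written against present-day Phobos (`std.range.walkLength`, `std.conv.to`) instead of the 2008 libraries. `codePoints` is a hypothetical helper. It shows that `.length` and indexing on a `string` work on UTF-8 code units, while a `dstring` holds one code point per element.

```d
import std.conv : to;
import std.range : walkLength;

/// Hypothetical helper: number of code points in a UTF-8 string.
size_t codePoints(string s)
{
    return s.walkLength;   // decodes UTF-8 while counting
}

unittest
{
    string s = "h\u00E9llo";      // U+00E9 takes two UTF-8 code units
    assert(s.length == 6);        // .length counts code units (bytes)
    assert(codePoints(s) == 5);   // five code points

    dstring d = s.to!dstring;     // one array element per code point
    assert(d.length == 5);
    assert(d[1] == '\u00E9');     // indexing lands on a whole code point
}
```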
Oct 08 2008
next sibling parent reply "Chris R. Miller" <lordsauronthegreat gmail.com> writes:
Benji Smith wrote:
 Chris R. Miller wrote:
 Benji Smith wrote:
 4. String processing sucks. Writing code that works transparently 
 with all three character types and correctly handles all unicode 
 characters is basically impossible. Or at least it feels impossible. 
 Trying to support both Phobos and Tango in the same string-processing 
 routine is a definite no-go. In an ideal world, string-processing 
 code shouldn't have to be littered with "static if" all over the place.

How would this be "fixed?" Hint: don't suggest making strings an object. We tried that a while back, and it was more or less shot down.

Well, personally, I'd prefer it if strings were objects.

As well for me. Worked just fine in Java, I don't see why it can't work here. Rather, knowing what I know about D and Java, I would tend to think that a D implementation of a String as an object would be /more/ powerful than the Java implementation.
 But I could accept strings as character arrays if they were actually 
 characters arrays.

Amen.
Oct 08 2008
parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Chris R. Miller wrote:
 Benji Smith wrote:
 Chris R. Miller wrote:
 Benji Smith wrote:
 4. String processing sucks. Writing code that works transparently 
 with all three character types and correctly handles all unicode 
 characters is basically impossible. Or at least it feels impossible. 
 Trying to support both Phobos and Tango in the same 
 string-processing routine is a definite no-go. In an ideal world, 
 string-processing code shouldn't have to be littered with "static 
 if" all over the place.

How would this be "fixed?" Hint: don't suggest making strings an object. We tried that a while back, and it was more or less shot down.

Well, personally, I'd prefer it if strings were objects.

As well for me. Worked just fine in Java, I don't see why it can't work here. Rather, knowing what I know about D and Java, I would tend to think that a D implementation of a String as an object would be /more/ powerful than the Java implementation.
 But I could accept strings as character arrays if they were actually 
 characters arrays.

Amen.

http://www.dprogramming.com/mtext.php

Nearly as efficient as regular strings & same memory footprint as the type of the largest character within (i.e. if it contains only ascii, 8 bits/char. If it contains things representable in UTF-16, 16 bits a character. If it contains cuneiform, 32 bits per char).
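
The storage rule described above can be sketched roughly as follows. This is not mtext's actual code; `narrowestWidth` is a hypothetical function that only mirrors the description: one byte per character for pure ASCII, two when every character fits a single UTF-16 code unit, four otherwise.

```d
/// Hypothetical helper: smallest fixed width (in bytes per character)
/// able to hold every character of `s`, following the rule above.
size_t narrowestWidth(const(char)[] s)
{
    size_t width = 1;
    foreach (dchar c; s)      // foreach with dchar decodes the UTF-8
    {
        if (c > 0xFFFF)
            return 4;         // outside the BMP, e.g. cuneiform
        if (c > 0x7F)
            width = 2;        // not ASCII, but one UTF-16 code unit
    }
    return width;
}

unittest
{
    assert(narrowestWidth("foo") == 1);
    assert(narrowestWidth("h\u00E9llo") == 2);
    assert(narrowestWidth("\U00012000") == 4);  // a cuneiform sign
}
```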
Oct 09 2008
next sibling parent reply Benji Smith <dlanguage benjismith.net> writes:
Robert Fraser wrote:
 Chris R. Miller wrote:
 Benji Smith wrote:
 Chris R. Miller wrote:
 Benji Smith wrote:
 4. String processing sucks. Writing code that works transparently 
 with all three character types and correctly handles all unicode 
 characters is basically impossible. Or at least it feels 
 impossible. Trying to support both Phobos and Tango in the same 
 string-processing routine is a definite no-go. In an ideal world, 
 string-processing code shouldn't have to be littered with "static 
 if" all over the place.

How would this be "fixed?" Hint: don't suggest making strings an object. We tried that a while back, and it was more or less shot down.

Well, personally, I'd prefer it if strings were objects.

As well for me. Worked just fine in Java, I don't see why it can't work here. Rather, knowing what I know about D and Java, I would tend to think that a D implementation of a String as an object would be /more/ powerful than the Java implementation.
 But I could accept strings as character arrays if they were actually 
 characters arrays.

Amen.

http://www.dprogramming.com/mtext.php

Nearly as efficient as regular strings & same memory footprint as the type of the largest character within (i.e. if it contains only ascii, 8 bits/char. If it contains things representable in UTF-16, 16 bits a character. If it contains cuneiform, 32 bits per char).

Looks interesting. I've downloaded it. I'll give it a whirl and let you know what I think. But, upon first glance, you definitely need some example code and a tutorial. Right now, I don't really know where to start. --benji
Oct 09 2008
parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Benji Smith wrote:
 Robert Fraser wrote:
 Chris R. Miller wrote:
 Benji Smith wrote:
 Chris R. Miller wrote:
 Benji Smith wrote:
 4. String processing sucks. Writing code that works transparently 
 with all three character types and correctly handles all unicode 
 characters is basically impossible. Or at least it feels 
 impossible. Trying to support both Phobos and Tango in the same 
 string-processing routine is a definite no-go. In an ideal world, 
 string-processing code shouldn't have to be littered with "static 
 if" all over the place.

How would this be "fixed?" Hint: don't suggest making strings an object. We tried that a while back, and it was more or less shot down.

Well, personally, I'd prefer it if strings were objects.

As well for me. Worked just fine in Java, I don't see why it can't work here. Rather, knowing what I know about D and Java, I would tend to think that a D implementation of a String as an object would be /more/ powerful than the Java implementation.
 But I could accept strings as character arrays if they were actually 
 characters arrays.

Amen.

http://www.dprogramming.com/mtext.php

Nearly as efficient as regular strings & same memory footprint as the type of the largest character within (i.e. if it contains only ascii, 8 bits/char. If it contains things representable in UTF-16, 16 bits a character. If it contains cuneiform, 32 bits per char).

Looks interesting. I've downloaded it. I'll give it a whirl and let you know what I think. But, upon first glance, you definitely need some example code and a tutorial. Right now, I don't really know where to start. --benji

I totally agree it needs more docs. I didn't write it; Chris Miller gets that credit.
Oct 09 2008
parent Benji Smith <dlanguage benjismith.net> writes:
cemiller wrote:
 http://www.dprogramming.com/mtext.php

 Nearly as efficient as regular strings & same memory footprint as 
 the type of the largest character within (i.e. if it contains only 
 ascii, 8 bits/char. If it contains things representable in UTF-16, 
 16 bits a character. If it contains cuneiform, 32 bits per char).

you know what I think. But, upon first glance, you definitely need some example code and a tutorial. Right now, I don't really know where to start. --benji

I totally agree it needs more docs. I didn't write it; Chris Miller gets that credit.

I tried to make it work like arrays (strings) as much as possible. It has .length, concat operators, slicing, and other utility functions.

The documentation is available at: http://www.dprogramming.com/docs/mtext/mtext.html

Here's some older examples from the unittest:

mstring s = mstring("foo"c);
assert(s == "foo"c);
assert(s.startsWith("f"c));
assert(s.startsWith("fo"w));
assert(s.startsWith("foo"c));
assert(!s.startsWith("fooo"c));
assert(!s.startsWith("F"c));
assert(s.startsWith("F"c, true));
assert(!s.startsWith("E"c, true));
assert(s.endsWith("o"c));
assert(s.endsWith("oo"d));
assert(s.endsWith("foo"c));
assert(!s.endsWith("fooo"c));
assert(!s.endsWith("ooo"c));
assert(s.endsWith("O"c, true));
assert(!s.endsWith("P"c, true));

int i;
s = mstring("The quick brown fox jumped over the lazy dog."w);
i = s.find("Lazy"c, true);
assert(-1 != i);
assert(s[i .. i + 4] == "lazy"c);
i = s.find("Lazy"c, false);
assert(-1 == i);
i = s.find("."c, true);
assert(-1 != i);
i = s.find("dog."c, true);
assert(-1 != i);

wchar[] ws = "Arrrrr!";
assert(ws == mstring(ws));
assert(!(ws > mstring(ws)));
assert(!(ws < mstring(ws)));
assert(ws == mstring(ws)[0 .. ws.length]);
assert(!(ws > mstring(ws)[0 .. ws.length]));
assert(!(ws < mstring(ws)[0 .. ws.length]));

mstring s2;
s2 = s.dup;
assert(s.type != s2.type); // Compacted.
assert(s == s2); // Same value.

and with the newer opAssign,

import mtext;
void main()
{
    mstring hi = "hello"c;
    hi ~= ", world"c;
}

FAQ is on the main page: http://www.dprogramming.com/mtext.php

I'll try to add better examples soon.

- Chris Miller (cemiller)

Very nice. Thanks! --benji
Oct 10 2008
prev sibling parent "Chris R. Miller" <lordsauronthegreat gmail.com> writes:
Robert Fraser wrote:
 Chris R. Miller wrote:
 Benji Smith wrote:
 Chris R. Miller wrote:
 Benji Smith wrote:
 4. String processing sucks. Writing code that works transparently 
 with all three character types and correctly handles all unicode 
 characters is basically impossible. Or at least it feels 
 impossible. Trying to support both Phobos and Tango in the same 
 string-processing routine is a definite no-go. In an ideal world, 
 string-processing code shouldn't have to be littered with "static 
 if" all over the place.

How would this be "fixed?" Hint: don't suggest making strings an object. We tried that a while back, and it was more or less shot down.

Well, personally, I'd prefer it if strings were objects.

As well for me. Worked just fine in Java, I don't see why it can't work here. Rather, knowing what I know about D and Java, I would tend to think that a D implementation of a String as an object would be /more/ powerful than the Java implementation.
 But I could accept strings as character arrays if they were actually 
 characters arrays.

Amen.

http://www.dprogramming.com/mtext.php Nearly as efficient as regular strings & same memory footprint as the type of the largest character within (i.e. if it contains only ascii, 8 bits/char. If it contains things representable in UTF-16, 16 bits a character. If it contains cuneiform, 32 bits per char).

That's all fine and dandy, but the appeal of the Java String object is that it is ubiquitous. All the existing code already accepts it as the standard. Even though mtext looks good, it still won't be able to be plugged into existing code (unless I'm mistaken, which I would be delighted to be). Perhaps the two, object and native strings, could coexist? I know that in Objective-C you can create a normal "dumb" C-style string using the normal syntax "foo bar is a string", but you can create a beefier object-as-a-string (NSString) by using another syntax, @"foo bar is an NSString". I'm not directly saying "this is what we should do!" (or even "this is what we should do!") but I'm just throwing the idea out there - maybe someone will have a better iteration on the concept?
Oct 10 2008
prev sibling parent Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Benji Smith wrote:

 The current state of affairs, where strings are transparently just 
 arrays of UTF-8 bytes makes them impossible to work with. They're 
 unindexable, unsliceable. You can't operate directly on those arrays. 
 You're forced to use the phobos/tango functions (which, by the way, are 
 incompatible with one another).

I disagreed with you the last time you said this too. As far as I see it, there is nothing to gain from making strings objects. I've done quite a lot of string processing in D including non-latin text and never found a problem slicing or indexing char[]s.
 If D strings must be character arrays, I'd love for them to at least be 
 ordinary arrays. Each element of a char[] array should be a single 
 character. And if a sizeof(char) == 1, then a char should be limited to 
 a single byte. To represent multibyte characters, it should be necessary 
 to use a wchar[] or dchar[] array.

And you would be back to the horrors of the pre-Unicode world with incompatible code-pages and encodings. Not even dchars can represent single Unicode characters, since they can be composed of combining character sequences. There is nothing wrong with D's way of handling strings. You just need to lose the preconception of needing atomic characters. -- Oskar
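
An illustrative sketch (not from the original post) of the point about combining sequences. It assumes present-day Phobos names (std.range.walkLength, std.uni.byGrapheme), which postdate this thread; Counts and measure are hypothetical helpers. The same text is counted as UTF-8 code units, as code points, and as user-perceived characters:

```d
import std.range : walkLength;
import std.uni : byGrapheme;

/// Sizes of one piece of text at three levels of granularity.
struct Counts { size_t codeUnits, codePoints, graphemes; }

/// Hypothetical helper: measures `s` as bytes, code points and graphemes.
Counts measure(string s)
{
    return Counts(
        s.length,                 // UTF-8 code units (bytes)
        s.walkLength,             // code points (strings iterate as dchar)
        s.byGrapheme.walkLength); // user-perceived characters
}

void main()
{
    // "e" followed by U+0301 COMBINING ACUTE ACCENT: one visible character.
    auto c = measure("e\u0301");
    assert(c.codeUnits == 3);
    assert(c.codePoints == 2);
    assert(c.graphemes == 1);
}
```

Even a dchar[] holding this text would have two elements for one visible character, which is why no fixed-width element type gives "one element per character".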
Oct 11 2008
prev sibling next sibling parent Christopher Wright <dhasenan gmail.com> writes:
Benji Smith wrote:
 * I keep hearing about how the DMC backend needs improvement in its 
 floating-point code. Personally, I think this would be a bigger win than 
 pure functions, whose optimization opportunities are still hypothetical.
 
 --benji

Alternative: replace the DMC backend with LLVM.
Oct 09 2008
prev sibling parent cemiller <chris dprogramming.com> writes:
 http://www.dprogramming.com/mtext.php

 Nearly as efficient as regular strings & same memory footprint as the  
 type of the largest character within (i.e. if it contains only ascii, 8  
 bits/char. If it contains things representable in UTF-16, 16 bits a  
 character. If it contains cuneiform, 32 bits per char).

you know what I think. But, upon first glance, you definitely need some example code and a tutorial. Right now, I don't really know where to start. --benji

I totally agree it needs more docs. I didn't write it; Chris Miller gets that credit.

I tried to make it work like arrays (strings) as much as possible. It has .length, concat operators, slicing, and other utility functions.

The documentation is available at: http://www.dprogramming.com/docs/mtext/mtext.html

Here's some older examples from the unittest:

    mstring s = mstring("foo"c);
    assert(s == "foo"c);
    assert(s.startsWith("f"c));
    assert(s.startsWith("fo"w));
    assert(s.startsWith("foo"c));
    assert(!s.startsWith("fooo"c));
    assert(!s.startsWith("F"c));
    assert(s.startsWith("F"c, true));
    assert(!s.startsWith("E"c, true));
    assert(s.endsWith("o"c));
    assert(s.endsWith("oo"d));
    assert(s.endsWith("foo"c));
    assert(!s.endsWith("fooo"c));
    assert(!s.endsWith("ooo"c));
    assert(s.endsWith("O"c, true));
    assert(!s.endsWith("P"c, true));

    int i;
    s = mstring("The quick brown fox jumped over the lazy dog."w);
    i = s.find("Lazy"c, true);
    assert(-1 != i);
    assert(s[i .. i + 4] == "lazy"c);
    i = s.find("Lazy"c, false);
    assert(-1 == i);
    i = s.find("."c, true);
    assert(-1 != i);
    i = s.find("dog."c, true);
    assert(-1 != i);

    wchar[] ws = "Arrrrr!";
    assert(ws == mstring(ws));
    assert(!(ws > mstring(ws)));
    assert(!(ws < mstring(ws)));
    assert(ws == mstring(ws)[0 .. ws.length]);
    assert(!(ws > mstring(ws)[0 .. ws.length]));
    assert(!(ws < mstring(ws)[0 .. ws.length]));

    mstring s2;
    s2 = s.dup;
    assert(s.type != s2.type); // Compacted.
    assert(s == s2); // Same value.

and with the newer opAssign,

    import mtext;
    void main() {
        mstring hi = "hello"c;
        hi ~= ", world"c;
    }

FAQ is on the main page: http://www.dprogramming.com/mtext.php

I'll try to add better examples soon.

- Chris Miller (cemiller)
Oct 10 2008
prev sibling next sibling parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?
 
 Andrei

I think most of the previous replies talk about new features that might make D nicer to work with, but not about what's broken in D. For me, it's:

1. Errors because of forward references shouldn't exist.

... Hmmm... I have an exam in 45 minutes, I'll write down the other four later at night. :-)
Oct 08 2008
parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Wed, Oct 8, 2008 at 5:15 PM, Ary Borenszweig <ary esperanto.org.ar> wrote:

 1. Errors because of forward references shouldn't exist.

Funny how interpretation works. I was thinking the thread was about problems in the language we'd want to see fixed; that DMD is horribly buggy and does not implement the spec is pretty much a given at this point ;)
Oct 08 2008
prev sibling next sibling parent reply "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 Ok, per Aarti's suggestion: without speaking officially for Walter, let  
 me ask this - what do you think are the top issues you'd like to see  
 fixed in D?

 Andrei

Things to be fixed:

Enum for manifest constants. Let us use alias or pure - or something completely different, as long as it makes sense.

Shared Phobos/Tango runtime. I have a dream that future D programmers will not be judged by the name of their standard library but by the content of their programs.

Forward references. This has bitten me several times, and I'd really like to get rid of it.

Things that I want:

Reference return values. Think this is the top one at the moment.

Multiple return values. Would be nice, but can be made to work atm.

Real properties. As much as I love fiddling with templates to make my own properties that try to do The Right Thing™, language support would be better.

-- Simen
Oct 08 2008
parent Vincent Richomme <forumer smartmobili.com> writes:
Simen Kjaeraas wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 
 Ok, per Aarti's suggestion: without speaking officially for Walter, 
 let me ask this - what do you think are the top issues you'd like to 
 see fixed in D?

 Andrei

Things to be fixed:

Enum for manifest constants. Let us use alias or pure - or something completely different, as long as it makes sense.

Shared Phobos/Tango runtime. I have a dream that future D programmers will not be judged by the name of their standard library but by the content of their programs.

Forward references. This has bitten me several times, and I'd really like to get rid of it.

Things that I want:

Reference return values. Think this is the top one at the moment.

Multiple return values. Would be nice, but can be made to work atm.

Real properties. As much as I love fiddling with templates to make my own properties that try to do The Right Thing™, language support would be better.

1. Merge phobos and tango
2. Merge phobos and tango
3. Merge phobos and tango
4. Merge phobos and tango
5. Merge phobos and tango

Fortunately you only ask for top 5 ;-)
Oct 08 2008
prev sibling next sibling parent Jason House <jason.james.house gmail.com> writes:
Andrei Alexandrescu Wrote:

 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?
 
 Andrei

1. Tango/Phobos/D2 compatibility. This problem has been around for way too long! A fragmented community leads to many problems. Druntime integration into Phobos would solve 70% of this problem.

2. Speed of built in types. It's a sad day when a user realizes they have to rework their code to eliminate AAs and array appending.

3. GC scalability. N threads means allocating memory N times as fast and there's N times as much. There's no point using D on an 8 core machine.

4. Return by reference... a[x].property=1 should just work...
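
For readers following item 4, a minimal sketch of what return by reference looks like in present-day D2. The syntax shown is an assumption relative to the compiler discussed in this thread, and Item and Container are hypothetical names. With a ref-returning opIndex, a[x].property = 1 writes to the stored element instead of a temporary copy:

```d
struct Item { int property; }

struct Container
{
    Item[] items;

    // Returning by `ref` hands back the stored element itself, not a copy.
    ref Item opIndex(size_t i) { return items[i]; }
}

void main()
{
    auto a = Container(new Item[3]);
    a[1].property = 1;                  // writes through to the stored element
    assert(a.items[1].property == 1);
    assert(a.items[0].property == 0);   // other elements are untouched
}
```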
Oct 08 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 09 Oct 2008 00:07:27 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Ok, per Aarti's suggestion: without speaking officially for Walter, let  
 me ask this - what do you think are the top issues you'd like to see  
 fixed in D?

 Andrei

Even though I have written a large amount of templated code, I still have many problems with templates.

- I can't figure out why static if (is (condition)) works in some cases and doesn't in others (sorry, no examples)

- the (T : T), (T : T*), (T : T[]), (T : S!(T)), (T S == class), (T S == T*) etc stuff is not intuitive and is hard for me to understand

- I don't like the static if (is (T t)) { } hack to introduce a compile-time variable of type T at template level without polluting the template namespace and without introducing actual variables. Something like this:

template CanAdd(X, Y) {
    static if (is (X x)) {
        static if (is (Y y)) {
            enum CanAdd = is (x + y);
        } else {
            enum CanAdd = false;
        }
    } else {
        enum CanAdd = false;
    }
}

It is a common practice that deserves better syntax, I believe. There might exist solutions that I don't know of, still (after a year+ of D development).

- static if (is (X[0].Y == Z)) { .. } syntax doesn't work (X is a tuple and X[0] is a type that contains Y - alias, manifest constant or type name). This might be a bug, I don't know, but I often need it and it is bad that I have to make an explicit alias so that it is usable in the is expression.

- I believe conditional template matching is a bad hack that needs to be revised:

template Foo(Bar)(Bar bar) if (SomeCheck!(Bar)) { // C++0x concepts are thousand time better
    ...
}

- Lack of struct inheritance. I'm having a hard time porting and using some C++ code (namely GDI+) that relies on struct inheritance. No, I can't make them classes because they should remain POD-types. This involves code duplication and struct pointer casts (argh!).
- The following code doesn't work:

void foo(T)(T t) { }
void foo(float f) { }

I'm having a hard time with all the static ifs that complicate the code and make it ugly:

void foo(T)(T t) {
    static if (is (T == float)) {
        // implementation1 here
    } else static if (is (T == Bar)) {
        // implementation2 here
    } else {
        // implementation3 here
    }
}

or

void foo(T)(T t) if (is(T == float)) { }
void foo(T)(T t) if (is (T == Bar)) { }
void foo(T)(T t) if (!is (T == Bar) && !is (T == float)) { }

- Template function specialization. I miss this feature from C++ a lot. It allows the user to extend existing library functionality to support a user type, or to make an optimized version.

- version'ing is not smart enough (in my opinion). It doesn't support mixing different assembly styles (because DMD doesn't know some op-codes that GDC knows, for example), different language versions (D1/D2/D3?) etc. There is definitely room for improvement.

- There are also a few (confirmed) bugs that are annoying and I am looking forward to them being fixed.

[Wish] Better compile-time reflection. It is currently quite hard to work with.

[Wish] (Variable/Template/Function) Attributes. These are priceless for compile-time reflection. I can explain them in a separate topic.

[Wish] Named parameters. Compare the following:

updateSettings(true, 32, true);
Color color(255, 0, 255, 100);

with:

updateSettings(fullScreen: true, colorDepth: 32, fastSimulation: true);
Color color(red: 255, green: 0, blue: 255, alpha: 100);

[Minor one] int -> short implicit cast (signed/unsigned comparison without error belongs here)

[This one is not about D but about its development]

- I dislike some design decisions that are spontaneously made without discussion and that we all regret and suffer from afterwards

- Some good proposals are made and it is somewhat bad that no one from the "higher powers" comments on them. I know it is hard to answer everyone, but it is a shame so much time is wasted on hubbub and good proposals are ignored.
On the other hand, now that you are back I am more optimistic about D future. Thanks for the time and effort that you put into all this! Thank you!
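
A hedged sketch of how two of the template pain points above can be written in present-day D2. This describes later compilers, not the one discussed in this thread, and canAdd and describe are hypothetical names. It shows a one-line addability check using is(typeof(...)), and a template overloaded with a plain function:

```d
// Hypothetical name: true when an X can be added to a Y.
enum canAdd(X, Y) = is(typeof(X.init + Y.init));

static assert( canAdd!(int, double));
static assert(!canAdd!(int, string));

// Hypothetical overload set: a template and a plain function side by side.
string describe(T)(T value) { return "generic"; }
string describe(float value) { return "float"; }

void main()
{
    // Both overloads match a float exactly; the non-template one is preferred.
    assert(describe(1.5f) == "float");
    // A string does not convert to float, so only the template applies.
    assert(describe("text") == "generic");
}
```

The is(typeof(...)) form asks whether the expression type-checks without declaring any variables, which is the job the nested static if blocks were doing by hand.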
Oct 08 2008
prev sibling next sibling parent superdan <super dan.org> writes:
Andrei Alexandrescu Wrote:

 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?
 
 Andrei

1. phobos+tango 2. threads n stuff 3. fix them arrays n strings. strings must not be static size 4. many bugs in templates that make life impossible 5. ref returnz
Oct 08 2008
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Andrei Alexandrescu" wrote
 Ok, per Aarti's suggestion: without speaking officially for Walter, let me 
 ask this - what do you think are the top issues you'd like to see fixed in 
 D?

1. Any bugs that prevent Tango from being ported to D2 (an empty list at the moment, but I'm still working on it ;)

2. Shared library support.

3. Prevent unnecessary closure allocations (or allow specification to prevent them).

4. Tail-const class references (you may say this isn't a bug, but it's a huge hole that needs to be fixed for D const to be useful).

5. Scoped const (see bug http://d.puremagic.com/issues/show_bug.cgi?id=1961 )

#5 isn't really a fix to something that's 'broken', but it is a fix to the design that allows much easier const code.

If I can sneak in a couple more enhancements ;)

6. struct interfaces

7. true property definitions

-Steve
Oct 08 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 4. Tail-const class references (you may say this isn't a bug, but its a huge 
 hole that needs to be fixed for D const to be useful).

Did you try Rebindable in std.typecons? http://www.digitalmars.com/d/2.0/phobos/std_typecons.html Andrei
Oct 08 2008
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Andrei Alexandrescu" wrote
 Steven Schveighoffer wrote:
 4. Tail-const class references (you may say this isn't a bug, but its a 
 huge hole that needs to be fixed for D const to be useful).

Did you try Rebindable in std.typecons? http://www.digitalmars.com/d/2.0/phobos/std_typecons.html Andrei

I don't really use D2, so I forgot that opDot makes this possible. I think the last time I looked at it, maybe opDot wasn't included or implemented yet. But that should work fine.

Would be nice to have aliases like:

tailconst!(Widget)
tailinvariant!(Widget)

Which to me feel better than:

Rebindable!(const(Widget))

I'll replace my #4 with overloading based on return values, but that really is an enhancement more than a fix ;)

Thanks
-Steve
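
A small sketch of the Rebindable approach mentioned above, written against present-day std.typecons (an assumption; the 2008 version relied on opDot). The Widget class here is a hypothetical stand-in. The reference can be re-seated, but the object it refers to stays const:

```d
import std.typecons : Rebindable;

// Hypothetical class used only for illustration.
class Widget
{
    int id;
    this(int id) { this.id = id; }
}

void main()
{
    auto w = Rebindable!(const Widget)(new Widget(1));
    assert(w.id == 1);

    w = new Widget(2);      // rebinding the reference is allowed
    assert(w.id == 2);

    // Mutating through the const view is rejected at compile time.
    static assert(!__traits(compiles, w.id = 3));
}
```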
Oct 08 2008
prev sibling next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Ok, per Aarti's suggestion: without speaking officially for Walter, let
 me ask this - what do you think are the top issues you'd like to see
 fixed in D?
 Andrei

1. Phobos and Tango should be mix-and-match-able in both D1 and D2.

2. Less conservative GC. It's aggravating when a supposedly GC'd language for all practical purposes still requires manual deallocation of some stuff if you're working with large datasets, due to false pointer issues.

3. Scalable memory allocator that isn't a bottleneck on multithreaded code.

4. Array capacity field so that appending to arrays isn't slow and isn't a multithreading bottleneck.

5. *Please* fix bug 929. http://d.puremagic.com/issues/show_bug.cgi?id=929
Oct 08 2008
next sibling parent reply "Chris R. Miller" <lordsauronthegreat gmail.com> writes:
dsimcha wrote:
 2.  Less conservative GC.  It's aggravating when a supposedly GC'd language for
 all practical purposes still requires manual deallocation of some stuff if you're
 working with large datasets, due to false pointer issues.

I am curious, what is the feasibility of writing a garbage collector where you could change the conservativeness at run-time? (Perhaps stupid to ask, but I never know 'till I ask).
Oct 08 2008
parent Christopher Wright <dhasenan gmail.com> writes:
Chris R. Miller wrote:
 dsimcha wrote:
 2.  Less conservative GC.  It's aggravating when a supposedly GC'd 
 language for
 all practical purposes still requires manual deallocation of some 
 stuff if you're
 working with large datasets, due to false pointer issues.

I am curious, what is the feasibility of writing a garbage collector where you could change the conservativeness at run-time? (Perhaps stupid to ask, but I never know 'till I ask).

This is an issue of RTTI granularity. Since there is no significant speed benefit from using a more conservative GC, there is never a reason to prefer one. Therefore your GC should always be as precise as the type information allows.
Oct 09 2008
prev sibling parent "Vladimir Panteleev" <thecybershadow gmail.com> writes:
On Thu, 09 Oct 2008 01:22:18 +0300, dsimcha <dsimcha yahoo.com> wrote:

 4.  Array capacity field so that appending to arrays isn't slow and  
 isn't a
 multithreading bottleneck.

Yes please! This is very easy to implement, just bind a property to two GC calls! (capacity and realloc) -- Best regards, Vladimir mailto:thecybershadow gmail.com
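
For comparison, a sketch of the array capacity support that present-day D2 runtimes expose (an assumption relative to the runtime discussed here): the built-in reserve function and the .capacity property. After reserving, appends that fit in the reserved block should not move the data:

```d
void main()
{
    int[] a;
    a.reserve(1000);              // ask the runtime for room up front
    assert(a.capacity >= 1000);

    a ~= 0;
    auto start = a.ptr;           // remember where the block lives

    foreach (i; 1 .. 1000)
        a ~= i;                   // stays within the reserved capacity

    assert(a.length == 1000);
    assert(a.ptr is start);       // no reallocation happened
}
```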
Oct 09 2008
prev sibling next sibling parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 09 Oct 2008 00:07:27 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Ok, per Aarti's suggestion: without speaking officially for Walter, let  
 me ask this - what do you think are the top issues you'd like to see  
 fixed in D?

 Andrei

Oh, I forgot about these:

- omittable parens - bad feature. I wish a real property syntax existed (with only .foo; allowed, no .foo(); please).

[Minor|Wish] overlapping array operation. The following should be allowed, even if it makes things slightly slower. Why user should care?

void foo(T[] t1, T[] t2) {
    ...
    t1[] = t2[]; // overlap or not? who knows..
    t1[0..100] = t1[10..110]; // yes, they do, so what?
    ...
}

[Wish] Stack allocation - why allocate on heap when you can allocate on stack?

void drawString(char[] text) {
    int length = getLengthAsUtf16String(text); // for leading zero
    wchar[length + 1] buffer;                  // C++ and C++0x both have this feature
    convertToUtf16String(text, buffer); buffer[length] = 0;
    graphics.DrawString(font, color, buffer.ptr);
}

Do you write down? :)
Thanks.
Oct 08 2008
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Denis Koroskin wrote:
 On Thu, 09 Oct 2008 00:07:27 +0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Ok, per Aarti's suggestion: without speaking officially for Walter, 
 let me ask this - what do you think are the top issues you'd like to 
 see fixed in D?

 Andrei

Oh, I forgot about these: - omittable parens - bad feature. I wish a real property syntax existed (with only .foo; allowed, no .foo(); please).

This doesn't have much support from Walter, I'm afraid.
 [Minor|Wish] overlapping array operation. The following should be 
 allowed, even if it makes things slightly slower. Why user should care?
 void foo(T[] t1, T[] t2) {
     ...
     t1[] = t2[]; // overlap or not? who knows..
     t1[0..100] = t1[10..110]; // yes, they do, so what?
     ...
 }

I think this often indicates an error.
 [Wish] Stack allocation - why allocate on heap when you can allocate on 
 stack?
 
 void drawString(char[] text) {
     int length = getLengthAsUtf16String(text); // for leading zero
     wchar[length + 1] buffer;                  // C++ and C++0x both have this feature
     convertToUtf16String(text, buffer); buffer[length] = 0;
     graphics.DrawString(font, color, buffer.ptr);
 }
 
 Do you write down? :)
 Thanks.

Neither C++ nor C++0x have that feature. C99 does (VLAs). But I do think it's a useful feature. In a class I taught that entailed teaching students some C, most of them actually tried VLAs naturally without knowing about it, gcc accepted it, and they used it to great effect. Andrei
Oct 08 2008
next sibling parent Sean Kelly <sean invisibleduck.org> writes:
Andrei Alexandrescu wrote:
 Denis Koroskin wrote:
 On Thu, 09 Oct 2008 00:07:27 +0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:

 Ok, per Aarti's suggestion: without speaking officially for Walter, 
 let me ask this - what do you think are the top issues you'd like to 
 see fixed in D?

 Andrei

Oh, I forgot about these: - omittable parens - bad feature. I wish a real property syntax existed (with only .foo; allowed, no .foo(); please).

This doesn't have much support from Walter, I'm afraid.
 [Minor|Wish] overlapping array operation. The following should be 
 allowed, even if it makes things slightly slower. Why user should care?
 void foo(T[] t1, T[] t2) {
     ...
     t1[] = t2[]; // overlap or not? who knows..
     t1[0..100] = t1[10..110]; // yes, they do, so what?
     ...
 }

I think this often indicates an error.

There appears to be some rather new runtime support for overlapping array copies in D2 though. Any idea where this is headed? Sean
Oct 08 2008
prev sibling next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
Andrei Alexandrescu wrote:
 Denis Koroskin wrote:
 
 [Wish] Stack allocation - why allocate on heap when you can allocate 
 on stack?

 void drawString(char[] text) {
     int length = getLengthAsUtf16String(text); // for leading zero
     wchar[length + 1] buffer;                  // C++ and C++0x both have this feature
     convertToUtf16String(text, buffer); buffer[length] = 0;
     graphics.DrawString(font, color, buffer.ptr);
 }

 Do you write down? :)
 Thanks.

Neither C++ nor C++0x have that feature. C99 does (VLAs). But I do think it's a useful feature. In a class I taught that entailed teaching students some C, most of them actually tried VLAs naturally without knowing about it, gcc accepted it, and they used it to great effect.

We can generally use alloca for this in D, so I don't consider this a major issue. Be pretty cool to have it in-language though. Sean
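
A minimal sketch of the alloca route, assuming the present-day druntime module path core.stdc.stdlib (the thread-era path was std.c.stdlib). The function sumOfSquares is a hypothetical example. The buffer lives in the caller's stack frame and is sliced so it can be used like a normal array:

```d
import core.stdc.stdlib : alloca;

// Hypothetical example: uses a stack-allocated scratch buffer of n ints.
int sumOfSquares(size_t n)
{
    // alloca must be called in the function whose frame owns the memory.
    auto p = cast(int*) alloca(n * int.sizeof);
    int[] buf = p[0 .. n];        // bounds-checked view over the stack block

    foreach (i, ref x; buf)
        x = cast(int)(i * i);

    int total = 0;
    foreach (x; buf)
        total += x;
    return total;                 // the block is released when we return
}

void main()
{
    assert(sumOfSquares(4) == 14); // 0 + 1 + 4 + 9
}
```

The slice must never escape the function, since the memory disappears with the stack frame.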
Oct 08 2008
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Sean Kelly:
 We can generally use alloca for this in D, so I don't consider this a 
 major issue.  Be pretty cool to have it in-language though.

Some time ago I showed this bug, which I think may be in alloca():

import std.c.stdlib: alloca;
void main() {
    const int n = 8;
    for (int i; i < 2; i++)
        printf("%p\n", alloca(n));
}

Error: it prints the same address twice. (I also think DMD's alloca is currently not very fast compared to the allocation of C99 variable-length arrays in GCC, but I don't have hard data to support this at the moment).

Bye, bearophile
Oct 08 2008
parent bearophile <bearophileHUGS lycos.com> writes:
Denis Koroskin:
 Is this really an error?

I think so.
 "for" body creates a new scope and alloca'ted  
 memory is automatically released once you leave the block.

I think that's not true: http://www.archive.geschichte.mpg.de/doc/susehilf/gnu/libc/Variable_Size_Automatic.html
"...but all the blocks are freed when you exit the function that alloca was called from"

In that code if you remove the "const" the program works, giving two different addresses. Bye, bearophile
Oct 08 2008
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Sean Kelly wrote:
 Andrei Alexandrescu wrote:
 Denis Koroskin wrote:

 [Wish] Stack allocation - why allocate on heap when you can allocate 
 on stack?

 void drawString(char[] text) {
     int length = getLengthAsUtf16String(text); // for leading zero
     wchar[length + 1] buffer;                  // C++ and C++0x both have this feature
     convertToUtf16String(text, buffer); buffer[length] = 0;
     graphics.DrawString(font, color, buffer.ptr);
 }

 Do you write down? :)
 Thanks.

Neither C++ nor C++0x have that feature. C99 does (VLAs). But I do think it's a useful feature. In a class I taught that entailed teaching students some C, most of them actually tried VLAs naturally without knowing about it, gcc accepted it, and they used it to great effect.

We can generally use alloca for this in D, so I don't consider this a major issue. Be pretty cool to have it in-language though.

The problem with alloca is that it's not "encapsulatable"---Walter must love this because he thrives on "evaluatable"---because you can't squirrel the call to alloca in a function that takes care of the casting. Andrei
Oct 08 2008
prev sibling parent Jerry Quinn <jlquinn optonline.net> writes:
Andrei Alexandrescu Wrote:

 Denis Koroskin wrote:
 On Thu, 09 Oct 2008 00:07:27 +0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Ok, per Aarti's suggestion: without speaking officially for Walter, 
 let me ask this - what do you think are the top issues you'd like to 
 see fixed in D?

 Andrei

Oh, I forgot about these:

 [Minor|Wish] overlapping array operation. The following should be 
 allowed, even if it makes things slightly slower. Why user should care?
 void foo(T[] t1, T[] t2) {
     ...
     t1[] = t2[]; // overlap or not? who knows..
     t1[0..100] = t1[10..110]; // yes, they do, so what?
     ...
 }

I think this often indicates an error.

This is just overlapping memcpy. I've wanted to do this several times and get annoyed that I have to go looking into std.algorithm for something that really should be in the language. If the compiler can tell it's overlapping, it can write out the safer copying versions required. Jerry
Oct 08 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 09 Oct 2008 02:30:21 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Denis Koroskin wrote:
 On Thu, 09 Oct 2008 00:07:27 +0400, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:

it's a useful feature. In a class I taught that entailed teaching students some C, most of them actually tried VLAs naturally without knowing about it, gcc accepted it, and they used it to great effect. Andrei

Ooops, I meant C99, of course. And I could swear I've heard that C++0x incorporates this C99 extension, but I was wrong.
Oct 08 2008
prev sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 09 Oct 2008 03:14:33 +0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Sean Kelly:
 We can generally use alloca for this in D, so I don't consider this a
 major issue.  Be pretty cool to have it in-language though.

Some time ago I showed this bug, which I think may be in alloca():

import std.c.stdlib: alloca;
void main() {
    const int n = 8;
    for (int i; i < 2; i++)
        printf("%p\n", alloca(n));
}

Error: it prints the same address twice. (I also think DMD's alloca is currently not very fast compared to the allocation of C99 variable-length arrays in GCC, but I don't have hard data to support this at the moment).

Bye, bearophile

Is this really an error? "for" body creates a new scope and alloca'ted memory is automatically released once you leave the block.
Oct 08 2008
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?

It was kind of a stretch to come up with 5 off the top of my head, but here they are:

1. There needs to be a way to explicitly declare static closures. I'd be happy if this were the default and "new &fn" created a dynamic closure, but I can understand the desire to make the "safe" option the default despite its inconsistency with D1.

2. Templates need work, particularly concerning template functions. Every time I try to do serious template programming I run into weird compile errors for things that seem like they should be fine. A related issue would be the ability to overload template functions with normal functions. Template metaprogramming in D is light years ahead of C++, but everyday template programming is often worse.

3. Forwards compatibility from D1 to D2. This isn't an issue that will be a blight upon the language many years from now, but at the moment the changes in meaning for some tokens (enum, const, invariant) make writing portable code difficult. Changing 'invariant' to 'immutable' will eliminate the problem with class invariants needing to be defined differently between versions however.

4. Tighter restrictions for implicit conversions between concrete types. At the very least, narrowing conversions should be an error.

5a. Dump support for legacy features such as the pre-D1 dual meaning of 'auto'. I'd also like to see support for C-style array and function pointer declarations dropped. These aren't a huge issue to me, but I'd prefer if D didn't have any such easter eggs. Drop foreach_reverse too--it is a well-intended disaster.

5b. AAs are currently pretty quirky. It would be nice if the functionality could be reviewed and cleaned up.

5c. I'd like to see real support for properties or at least more stringent compiler rules regarding the instances where a function can and can't be called without parens. Calling delegates should always require the use of parens, for example. This isn't a show-stopper, but the current situation feels like it requires experimentation to make sure actual behavior matches expected behavior, and I think this wouldn't be an issue with in-language support for properties.

(I wasn't sure which of the 5s I considered most important so I included them all.)

Sean
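
As a hedged illustration of item 4: present-day D2 compilers (not the one discussed in this thread) reject an implicit int to short conversion of a runtime value, so the narrowing has to be spelled out. The sketch below assumes current Phobos names (std.conv.to, ConvOverflowException):

```d
import std.conv : ConvOverflowException, to;
import std.exception : assertThrown;

// The implicit narrowing does not compile.
static assert(!__traits(compiles, { int i = 1000; short s = i; }));

void main()
{
    int i = 1000;
    short viaCast = cast(short) i;  // explicit and unchecked
    short viaTo = to!short(i);      // explicit and range-checked at run time
    assert(viaCast == 1000 && viaTo == 1000);

    // A value that does not fit is reported instead of silently truncated.
    assertThrown!ConvOverflowException(to!short(70_000));
}
```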
Oct 08 2008
prev sibling next sibling parent BCS <ao pathlink.com> writes:
Reply to Andrei,

 Ok, per Aarti's suggestion: without speaking officially for Walter,
 let me ask this - what do you think are the top issues you'd like to
 see fixed in D?
 
 Andrei
 

-Tango/Phobos
-general bugs (like error messages with wrong/missing filenames and line numbers)
-tools support
Oct 08 2008
prev sibling next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wed, Oct 08, 2008 at 03:07:27PM -0500, Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?

1: There are compiler bugs I'd like to see worked out that cause my code to randomly crash or behave incorrectly on Windows, but not Linux. (Sadly, I haven't tracked this down yet, so I can't point to a specific bug report.) This is very vague, I know, but the main idea is I'd like the compiler to always generate correct code as much as realistically possible.

2: D programs are a bit fat on memory usage. Not greatly, but when I'm writing code for my old 1995 laptop, I still use straight C to make better use of the limited resources. I think a better GC and the standard lib only linking in what is needed should help this.

3: Again, linking in only what is used in the stdlib will solve this one I believe: trimming down the size of D executable files. Again, when I need small exes, I still feel that I need to write C.

4: I find D's contracts to be cool, but not terribly useful in practice. I really don't know how to fix this: maybe contracts in an interface would help.

5: I'm actually having a hard time completing this list... oh, here's one. Touching member objects in a destructor is undefined behaviour, since the garbage collector can reap them ahead of time unless the object is deleted manually.

To fix it, what I think I'd like is a type modifier to make this ok.

class A{
  SomeObject a;
  unmanaged SomeObject b; // my addon: tell the GC to not manage this object.
  ~this(){
    a.whatever();  // if the object is garbage collected, this might crash,
                   // since the garbage collector might have already
                   // deleted a. Thus, I think it should be a compiler
                   // error to access it this way.
                   // UNLESS, SomeObject is guaranteed to be
                   // deleted deterministically, such as if it is a
                   // scope class (or if it is unmanaged as well.)
    b.whatever();  // this, however, is safe in my proposed addition, since
                   // b is marked unmanaged, meaning the garbage collector
                   // leaves it alone.
  }
}

While I propose a modifier above, that could probably be done as a template (though I haven't really thought about it). I do think the current behaviour of allowing the destructor to be potentially dangerous is less than optimal. That said, I'd prefer keeping access to a.whatever() like it is now over banning it without some kind of workaround, since if I manually remember to delete the object, the above code works correctly.

Stuff I'd like to see added, in no particular order:

1) Writing to global variables in type constructors - whenever an object of a particular type is constructed, I can run a compile time function and do something with that value, like append it to a global array.

string[] allPrintedStrings;
typedef PrintedString (string value) {
    allPrintedStrings ~= value;
    return cast(PrintedString) value;
}; // this semicolon might not be desired, but is here since typedefs normally
   // end with them. (The syntax here probably sucks.)

Whenever a PrintedString is declared in the code, the function in the typedef is called at compile time only (not runtime - for that, I'd just use regular constructors) and creates a global array of all the values in the code.

Now my main() might do

foreach(s; allPrintedStrings)
    writefln("%s", s);

and list them all off, so say, a translator could see all those strings and do his job a bit more easily.

Right now, I think the only way to accomplish this is with a preprocessor. The above example is trivial to implement with a simple add-on preprocessor, but I think it might be more useful built into the language.

2) Subtype typedefs - something like inheritance, but with typedefs instead of classes, so there is no runtime overhead. It is like a weakened typedef.

So "typedef extends string PrintedString;" or some such syntax is like doing

class string {}
class PrintedString : string {}

as far as the type system is concerned. The new template constraints I believe can do this now, but this would be prettier:

PrintedString a = cast(PrintedString) "Hello, world!";
std.string.split(a, ','); // this works since we get the implicit
                          // downcast to the base type of string.
                          // At runtime, the PrintedString and
                          // string are identical anyway, so no
                          // templating is necessary.

I think that's all that's on the mind right now.

-- 
Adam D. Ruppe
http://arsdnet.net
Oct 08 2008
parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Adam D. Ruppe wrote:
 
 To fix it, what I think I'd like is a type modifier to make this ok.
 
 class A{
   SomeObject a;
   unmanaged SomeObject b; // my addon: tell the GC to not manage this object.
   ~this(){
 	a.whatever();  // if the object is garbage collected, this might crash,
 			// since the garbage collector might have already
 			// deleted a. Thus, I think it should be a compiler
 			// error to access it this way.
 			// UNLESS, SomeObject is guaranteed to be
 			// deleted deterministically, such as if it is a
 			// scope class (or if it is unmanaged as well.)
 	b.whatever(); // this, however, is safe in my proposed addition, since
 			// b is marked unmanaged, meaning the garbage collector
 			// leaves it alone.
   }
 }
 

Err, you can do this yourself by allocating b outside of the GC heap. Might not be as safe as a solution integrated with the type system, but it's possible.

-- 
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
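
For readers who want to see what "allocating b outside of the GC heap" can look like, here is a minimal sketch. It is not code from either poster, and it assumes a present-day druntime (core.lifetime and core.stdc.stdlib), which postdates this thread. The helper names makeUnmanaged and freeUnmanaged are made up for illustration, and SomeObject is a stand-in class.

```d
import core.exception : onOutOfMemoryError;
import core.lifetime : emplace;
import core.stdc.stdlib : free, malloc;

// Stand-in for the SomeObject of the example above.
class SomeObject
{
    int calls;
    void whatever() { ++calls; }
}

// Hypothetical helper: construct a class instance in malloc'd memory,
// which the GC neither scans nor frees.
T makeUnmanaged(T, Args...)(Args args) if (is(T == class))
{
    enum size = __traits(classInstanceSize, T);
    void* raw = malloc(size);
    if (raw is null)
        onOutOfMemoryError();
    return emplace!T(raw[0 .. size], args);
}

// Hypothetical helper: run the destructor, then release the C heap block.
void freeUnmanaged(T)(T obj) if (is(T == class))
{
    destroy(obj);
    free(cast(void*) obj);
}

class A
{
    SomeObject b; // owned manually, so it is still alive in ~this()

    this() { b = makeUnmanaged!SomeObject(); }

    ~this()
    {
        b.whatever();     // the GC cannot have collected b
        freeUnmanaged(b);
    }
}

void main()
{
    auto a = new A;
    a.b.whatever();
    assert(a.b.calls == 1);
    destroy(a);           // runs ~this() deterministically
}
```

As Bruno says, this is less safe than a type-system solution: if SomeObject held references into the GC heap, the malloc'd block would also have to be registered with the collector (for example via GC.addRange), which this sketch leaves out.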
Oct 15 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 09 Oct 2008 00:07:27 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Ok, per Aarti's suggestion: without speaking officially for Walter, let  
 me ask this - what do you think are the top issues you'd like to see  
 fixed in D?

 Andrei

One more, if you please:

class Foo(T)
{
}

Foo!(Foo!(Foo!(int))) bar;
pragma(msg, typeof(bar).stringof);

Always prints Foo regardless of template arguments. Giving the correct template class name (with all the dependencies) would make template debugging a lot easier.
Oct 08 2008
prev sibling next sibling parent reply nazo <lovesyao gmail.com> writes:
Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?
 
 Andrei

I'm doing metaprogramming but I have some problems, so please fix them.

1. there is no tuple literal. please please add tuple literal!
2. "int+long" is supported but "foo(int, long)" is not supported
3. there is no template literal
4. short syntax for nested function/class templates is needed.
    void foo(T)(T t)(){
    }
5. recursive function template can't deduce return type
    auto foo()(int i){
      if(i) return foo(i-1);
      return i;
    }

Others:
* can't instantiate nested template of same name
  http://d.puremagic.com/issues/show_bug.cgi?id=539
* variable template short syntax
  http://all-technology.com/eigenpolls/dwishlist/index.php?it=149
* there are no tuple properties
  writefln(TypeTuple!(int, long).init);
* can't return sarray and tuple
* can't return self type
  ??? foo(){return &foo;}
* C++ class compatibility
* stack trace on linux
* nested tuple
Oct 08 2008
parent reply nazo <lovesyao gmail.com> writes:
nazo wrote:
 2. "int+long" is supported but "foo(int, long)" is not supported

oops, this is my mistake. it works in a mysterious way.

import std.stdio;
alias int Int;
alias long Long;
typeof(Int + Long) foo(){
  return 10;
}
void baz(Int a, Long b){
}
typeof(baz(Int, Long)) bar(){
}
void main(){
  writefln(foo());
  bar();
}
Oct 08 2008
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
nazo wrote:
 nazo wrote:
 2. "int+long" is supported but "foo(int, long)" is not supported

 oops, this is my mistake. it works in a mysterious way.
 
 import std.stdio;
 alias int Int;
 alias long Long;
 typeof(Int + Long) foo(){
   return 10;
 }
 void baz(Int a, Long b){
 }
 typeof(baz(Int, Long)) bar(){
 }
 void main(){
   writefln(foo());
   bar();
 }

I think a bug report would be in order.

Andrei
Oct 08 2008
prev sibling next sibling parent reply MIURA Masahiro <echochamber gmail.com> writes:
Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?

These are the 'blockers' for me:

- gdb support on Linux
- Unicode string index/offset
Oct 08 2008
parent "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 09 Oct 2008 05:14:28 +0400, MIURA Masahiro <echochamber gmail.com>  
wrote:

 Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let
 me ask this - what do you think are the top issues you'd like to see
 fixed in D?

 These are the 'blockers' for me:
 
 - gdb support on Linux

 - Unicode string index/offset

Oct 09 2008
prev sibling next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Thu, Oct 9, 2008 at 10:50 AM, Bill Baxter <wbaxter gmail.com> wrote:
 On Thu, Oct 9, 2008 at 5:07 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let me
 ask this - what do you think are the top issues you'd like to see fixed in
 D?

 Andrei

 1. Unify tango and phobos.
 
 2. Make overloading templates with functions or other templates (in the same module at least) "just work".
 
 3. fix the #of fixups bug. (http://d.puremagic.com/issues/show_bug.cgi?id=424) This is a killer for large code bases that make extensive use of templates. It requires painful import gymnastics to ensure that intermediate types always get instantiated in some module other than "main". This may be a good motivator for trying to make LLVMDC the primary D implementation.
 
 4. fix the memory leaking in CTFE functions. There's already a GC in DMD, and enabling it fixes the problem; there are just some bugs that prevent it from being used, as the LLVMDC guys found.
 
 5. override of closure memory allocation in D2 -- This one is for Frank. As a DWT user, I'd like to see Frank's #1 DWT-related issue fixed. (I think this was your #1 DWT-related request, wasn't it, Frank?)

Ooh, I forgot reference return values. But sounds like that's already work in progress so I don't really need to list it. :-)

--bb
Oct 08 2008
prev sibling next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Thu, Oct 9, 2008 at 5:07 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let me
 ask this - what do you think are the top issues you'd like to see fixed in
 D?

 Andrei

1. Unify tango and phobos.

2. Make overloading templates with functions or other templates (in the same module at least) "just work".

3. fix the #of fixups bug. (http://d.puremagic.com/issues/show_bug.cgi?id=424) This is a killer for large code bases that make extensive use of templates. It requires painful import gymnastics to ensure that intermediate types always get instantiated in some module other than "main". This may be a good motivator for trying to make LLVMDC the primary D implementation.

4. fix the memory leaking in CTFE functions. There's already a GC in DMD, and enabling it fixes the problem; there are just some bugs that prevent it from being used, as the LLVMDC guys found.

5. override of closure memory allocation in D2 -- This one is for Frank. As a DWT user, I'd like to see Frank's #1 DWT-related issue fixed. (I think this was your #1 DWT-related request, wasn't it, Frank?)

But as a meta-wish I heartily agree with whoever it was who said the development process needs to be made more open. It seems that is happening somewhat these days, but the core compiler is still quite closed in terms of development. This might also force switching whole-hog to an open framework like LLVM. If that's what it takes, then that's what should be done. At least an official entry on a roadmap saying that eventually D aims to have a fully open compiler as the *primary* (Walter-developed) compiler.

--bb
Oct 08 2008
prev sibling next sibling parent BLS <nanali nospam-wanadoo.fr> writes:
Andrei Alexandrescu schrieb:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?
 
 Andrei

1-5 Full shared library support: classes/exceptions

Bjoern
Oct 08 2008
prev sibling next sibling parent "Bruce Adams" <tortoise_74 yeah.who.co.uk> writes:
On Wed, 08 Oct 2008 21:07:27 +0100, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Ok, per Aarti's suggestion: without speaking officially for Walter, let  
 me ask this - what do you think are the top issues you'd like to see  
 fixed in D?

 Andrei

Unquestionably we should concentrate on getting the right colour for each and every bikeshed. Only properly coloured will D rule the world. :p

Seriously though, imagine what could be achieved if all the brain power wasted on bikeshed issues was actually put to productive use.
Oct 09 2008
prev sibling next sibling parent reply ore-sama <spam here.lot> writes:
1. Properties and mandatory parens.
2. Struct/array initializers.
3. return const.
4. safe casts.
5. Don't forget that 99% of bugs in D aren't design-related.
Oct 09 2008
parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Thu, Oct 9, 2008 at 6:30 AM, ore-sama <spam here.lot> wrote:

 5. Don't forget that 99% of bugs in D aren't design-related.

Very well put.
Oct 09 2008
prev sibling next sibling parent Aarti_pl <aarti interia.pl> writes:
Andrei Alexandrescu pisze:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?
 
 Andrei

My current top 5 (+ extensions :-) ):

1. Tango/Phobos merge
minimal: merge runtime - high priority
good   : minimal + compatibility in low level modules e.g. IO, variant, core types
perfect: full merge of libraries

2. Working process of error handling / better communication with community
minimal: bugs blocking the most important libraries in D and long-standing bugs should be solved first, before other tasks.
good   : A bit more discussion when Walter and the bug/enhancement reporter have different points of view. I saw (and unfortunately also experienced) a few times that a bug kept changing status: open -> invalid -> reopen -> wontfix -> reopen. That's really discouraging for reporters: filing a good report is really time consuming, so it shouldn't be dismissed without a good understanding of the problem.
perfect: Walter drops his backend and works together with the LDC (previously LLVMDC) team on a modernized compiler. Probably one reason for the current situation is lack of manpower. A completely open frontend and open process should also solve this problem. If Walter decided to go this way, it could happen as soon as llvmdc reaches production quality on major platforms.

3. Destructors are not very usable as it's not possible to call methods on member objects in them. It makes it impossible to close e.g. a file in a destructor, so there must be a special API to do it. It's error prone and should be done automatically.
Side note: I know that this might be difficult to achieve because of the way the GC works. But it probably is doable somehow (using reference counting?)

4. Uniform syntax for templates
http://d.puremagic.com/issues/show_bug.cgi?id=1827
I will send an additional post with examples of the proposed syntax. I just need a little bit more time.

5. Anchored types
http://d.puremagic.com/issues/show_bug.cgi?id=1835
It is very useful for dupping (cloning) of objects and for call chaining. With this implemented, casts should appear really rarely in code, which would be a clear win over existing major languages like e.g. Java, C++.

----
Extensions :-)

6. Variable arguments functions.
The current way of reading arguments is really ugly. Reading hidden variables to get access to function arguments breaks all rules of good programming style. One way of fixing it would probably be to define a simple struct in Object.d which keeps a pointer to TypeInfo and a pointer to data as void*. A variant type could also be used as a container for arguments. Then the function signature would look like below:

void writefln(Args[] args ...)

As a result arguments could be read as in normal D type-safe variadic argument functions. It would also allow passing variable arguments to other functions, which is currently *very* inconvenient.
I think that variable arguments functions shouldn't be deprecated in favor of variable argument templates. At least not yet.

7. Keywords with underscores in them are ugly.
One can say that it is another example of the bicycle shed color problem. But at least I can say that this ugliness is not well established in *modern* programming languages. Java people will probably never agree that this is a good notation for keywords. I unfortunately don't know what the status of underscores is in C#, Python, Ruby.
Examples in D language: __traits, __FILE__, __LINE__, foreach_reverse()

8. foreach() should be more universal. It should have a form of foreach(Element element; Iterator iterator). Other syntaxes which allow applying different policies of iterating a collection would also be ok. If this change were introduced, foreach_reverse should just be dropped from the language.

9. Removing SFINAE.
Actually I use SFINAE in the doost serializer to discover if a templated function was defined by the user in his/her class. But as I understand it, it is possible to use __traits(compiles, ......) in 2.0 to get the same effect without all the problems of SFINAE, which occur when you make a simple syntax error in your code...

10. 3 versions of functions for const correctness
Defining 3 versions of functions just to be const correct is definitely not the right way. It should be fixed.

11. Real properties
Not a big priority for me, but it would be nice...

Best Regards
Marcin Kuszczak
(aarti_pl)
Oct 09 2008
prev sibling next sibling parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Wed, 08 Oct 2008 15:07:27 -0500,
Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?

I'd generally like less syntactic freedom in D. Excessive amounts of syntactic sugar cause troubles in many different places. Economy of syntax principle has a flip side: a single mistake can make your program mean a whole different thing instead of simply giving you a syntax error. Redundancy is required to detect an error and suggest a fix.

1. Drop the "array is a slice" idea. I liked it a lot when I saw the D specs but it bit me immediately while I implemented a text parser. The idea sounds nice but it causes too many problems.

There are two major use cases for arrays: building and parsing. When building, you want fast append, you've got a limited number of arrays, and you likely don't care if your arrays are a little fat. When parsing, you want data consistency, that is you don't want a sub-array to change if you occasionally append to another sub-array. You also want to pass those sub-arrays around a lot and you probably want them to be cheap.

An array should be a fat object optimized for fast appending. It should be a class, because you want to pass it into functions and store references to it while keeping it a distinct object. Call it Array!(T) or List or Vector or whatever. It should be implicitly convertible to a slice type T[] which can stay what it is now. It should be built-in because array new, .dup and slice ~ should return a new array instance.

It'll probably break lots of generic code.

2. Drop the "call a function however you want" idea. Simple property method syntax is nice, but excessive freedom is not. Adding one simple keyword fixes the situation. See
 http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D.announce&artnum=13471

It'll probably break lots of regular code.

3. Same concern but little experience. Never bit me personally. But probably the delegate literal syntax should be dropped in favour of a dedicated, distinct lambda expression syntax.

4. Fix the const system. It scares people. It scared me off D2 for a while even though I liked D2 and I liked the *idea* of a const system. There are two items on my wish-list:

a) const-transparent and const-inheriting methods, so that you don't have to write the same method thrice as you do in C++ to grant const-correctness:

class A {
  Object o;
  constof!(o)(Object) get() { return o; }
}
invariant(A) a;
typeof(a.get()); // invariant(Object)

b) unique return values that can be cast to any of mutable, const and invariant. The proposal is to have a "unique" keyword and a unique constness type. Only rvalue can be unique. Unique can be returned. Unique casts implicitly to any other constness type. Nothing implicitly casts to unique. You can explicitly cast to unique. New, malloc, .deepdup return unique. .dup returns unique if hasNoPointers.
Oct 09 2008
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Sergey Gromov wrote:
 I'd generally like less syntactic freedom in D.  Excessive amounts of 
 syntactic sugar cause troubles in many different places.  Economy of 
 syntax principle has a flip side: a single mistake can make your program 
 mean a whole different thing instead of simply giving you a syntax 
 error.  Redundancy is required to detect an error and suggest a fix.

Ok. I guess what you're saying is that the Hamming distance between semantically correct notation should be large.
 1.  Drop the "array is a slice" idea.  I liked it a lot when I saw the D 
 specs but it bit me immediately while I implemented a text parser.  The 
 idea sounds nice but it causes too many problems.
 
 There are two major use cases for arrays: building and parsing.  When 
 building, you want fast append, you've got a limited number of arrays, 
 and you likely don't care if your arrays are a little fat.  When 
 parsing, you want data consistency, that is you don't want a sub-array 
 to change if you occasionally append to another sub-array.  You also 
 want to pass those sub-arrays around a lot and you probably want them to 
 be cheap.
 
 An array should be a fat object optimized for fast appending.  It should 
 be a class, because you want to pass it into functions and store 
 references to it while keeping it a distinct object.  Call it Array!(T) 
 or List or Vector or whatever.  It should be implicitly convertible to a 
 slice type T[] which can stay what it is now.  It should be built-in 
 because array new, .dup and slice ~ should return a new array instance.
 
 It'll probably break lots of generic code.

I think there are very solid reasons for keeping slices the primitive array type of choice. Rock-solid in fact. I will go over them in TDPL, but in short pointers are too low-level a primitive for arrays, and more than (the equivalent of) a pair of pointers would be too non-primitive. For appending, I already put Appender in std.array. For parsing, I didn't get your argument.
 2.  Drop the "call a function however you want" idea.  Simple property 
 method syntax is nice, but excessive freedom is not.  Adding one simple 
 keyword fixes the situation.  See
 http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D.announce&artnum=13471

It'll probably break lots of regular code.

This would be an uphill battle with Walter. He'd like to implement the property calls to work only for pairs of functions, something discussed here before.
 3.  Same concern but little experience.  Never bit me personally.  But 
 probably the delegate literal syntax should be dropped in favour of a 
 dedicated, distinct lambda expression syntax.

That needs to be looked at indeed.
 4.  Fix the const system.  It scares people.  It scared me off D2 for a 
 while even though I liked D2 and I liked the *idea* of a const system.  
 There are two items on my wish-list:
 
 a) const-transparent and const-inheriting methods, so that you don't 
 have to write the same method thrice as you do in C++ to grant const-
 correctness:
 
 class A {
   Object o;
   constof!(o)(Object) get() { return o; }
 }
 invariant(A) a;
 typeof(a.get()); // invariant(Object)

We definitely must implement what I call qualifier-equivariance.
 b) unique return values that can be cast to any of mutable, const and 
 invariant.  The proposal is to have a "unique" keyword and a unique 
 constness type.  Only rvalue can be unique.  Unique can be returned.  
 Unique casts implicitly to any other constness type.  Nothing implicitly 
 casts to unique.  You can explicitly cast to unique.  New, malloc, 
 .deepdup return unique. .dup returns unique if hasNoPointers.

I think Unique is important and can be implemented as a library type (after Walter operates a few changes in the way copy construction works). More on that later.

Andrei
Oct 09 2008
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Andrei Alexandrescu" wrote
 Sergey Gromov wrote:
 2.  Drop the "call a function however you want" idea.  Simple property 
 method syntax is nice, but excessive freedom is not.  Adding one simple 
 keyword fixes the situation.  See
 http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D.announce&artnum=13471

It'll probably break lots of regular code.

This would be an uphill battle with Walter. He'd like to implement the property calls to work only for pairs of functions, something discussed here before.

Might I ask the reasoning behind the hilliness of the battle? You can look at a formal property syntax exactly as pairing (or grouping) functions together:

int x
{
   get() { return _x;}
   set(int y) {return _x = y;}
   set(string s) {return _x = to!(int)(s);}
}

Note, int return type implied on every function as x is an int. Also note that get/set do not need to be keywords.

What am I doing here? Grouping related functions together... Isn't this what Walter is looking for? Walter's primary objection I would be most interested in.

I think this is one item that has been on a fair amount of top-5 posts in this thread. It could be because of the recent thread about it, so it's fresh in everyone's mind, but it could also be because people really do want this feature.

-Steve
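
For comparison, the grouping proposed above can be approximated in present-day D with overloaded @property functions. This is only a sketch under that assumption: the @property attribute did not exist when this thread was written, it is not the block syntax Steve proposes, and the struct name Holder is made up for illustration.

```d
import std.conv : to;

// Hypothetical type standing in for whatever owns the property "x".
struct Holder
{
    private int _x;

    // getter
    @property int x() const { return _x; }

    // setter taking an int
    @property int x(int y) { return _x = y; }

    // setter taking a string, converted with std.conv.to
    @property int x(string s) { return _x = to!int(s); }
}

void main()
{
    Holder h;
    h.x = 5;        // calls x(int)
    assert(h.x == 5);
    h.x = "42";     // calls x(string)
    assert(h.x == 42);
}
```

The three functions are related only by sharing a name, so unlike the block syntax above the return type has to be repeated on each one.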
Oct 09 2008
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 For appending, I already put Appender in std.array.

Nice name, short and easy to understand. Is it based on my code, or is it an improvement of the code you showed some time ago? How many years will it take for you to re-implement most of the functionalities of my libs?

Bye,
bearophile
Oct 09 2008
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
bearophile wrote:
 Andrei Alexandrescu:
 For appending, I already put Appender in std.array.

 Nice name, short and easy to understand. Is it based on my code, or is it an improvement of the code you showed some time ago?

It's a very simple implementation that amortizes calls to gc.capacity. My tests show it does quite well.
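
As an illustration of the usage pattern being discussed, here is a small sketch against the std.array.appender API as it exists in current Phobos. Whether the 2008 version exposed exactly these members is an assumption, and the function name joinWords is made up for the example.

```d
import std.array : appender;

// Hypothetical helper: build one string from many small appends.
string joinWords(string[] words)
{
    auto app = appender!string();
    app.reserve(64);            // optional: pre-size the buffer
    foreach (word; words)
    {
        app.put(word);          // append a whole string
        app.put(' ');           // append a single character
    }
    return app.data;            // the accumulated array
}

void main()
{
    assert(joinWords(["foo", "bar", "baz"]) == "foo bar baz ");
}
```

The point of the wrapper is that the appender tracks the spare capacity itself, so repeated put calls do not have to ask the GC about the block on every append.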
 How many
 years will it take for you to re-implement most of the
 functionalities of my libs?

Last time I looked it was delegate-based, is that still the case?

Andrei
Oct 09 2008
prev sibling parent reply dsimcha <dsimcha yahoo.com> writes:
Andrei Alexandrescu:
 For appending, I already put Appender in std.array.

Can you please make sure std.array is included in the documentation for the next D release? Right now, it is completely undocumented. Also, while we're on the subject of Phobos documentation, what is the status of std.loader or whatever it is? Does it work properly? Is it usable? Is the lack of documentation for it an oversight or because it's not ready yet?
Oct 09 2008
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jarrett Billingsley wrote:
 On Thu, Oct 9, 2008 at 8:42 PM, dsimcha <dsimcha yahoo.com> wrote:
 Andrei Alexandrescu:
 For appending, I already put Appender in std.array.

 Can you please make sure std.array is included in the documentation for the next D release? Right now, it is completely undocumented. Also, while we're on the subject of Phobos documentation, what is the status of std.loader or whatever it is? Does it work properly? Is it usable? Is the lack of documentation for it an oversight or because it's not ready yet?

std.loader has been in Phobos for 5 years. It's one of a few additions to Phobos by Matt Wilson that have never really seemed like they were part of Phobos (along with std.perf, std.recls, std.openrj..). It's probably undocumented since Matt kind of disappeared and W might not know what the library actually does.

By the way, who is using openrj? After googling around for a bit, it looks like the Andy Warhol of text file formats.

Andrei
Oct 09 2008
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
dsimcha wrote:
 Andrei Alexandrescu:
 For appending, I already put Appender in std.array.

Can you please make sure std.array is included in the documentation for the next D release? Right now, it is completely undocumented.

Yes.
 Also, while we're on the subject of Phobos documentation, what is the
 status of std.loader or whatever it is?  Does it work properly?  Is it
 usable?  Is the lack of documentation for it an oversight or because
 it's not ready yet?

I know nothing about it.

Andrei
Oct 09 2008
prev sibling next sibling parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Thu, 09 Oct 2008 11:22:11 -0500,
Andrei Alexandrescu wrote:
 Sergey Gromov wrote:
 I'd generally like less syntactic freedom in D.  Excessive amounts of 
 syntactic sugar cause troubles in many different places.  Economy of 
 syntax principle has a flip side: a single mistake can make your program 
 mean a whole different thing instead of simply giving you a syntax 
 error.  Redundancy is required to detect an error and suggest a fix.

Ok. I guess what you're saying is that the Hamming distance between semantically correct notation should be large.

Exactly, I forgot the term and didn't manage to google it up. :D
 1.  Drop the "array is a slice" idea.  I liked it a lot when I saw the D 
 specs but it bit me immediately while I implemented a text parser.  The 
 idea sounds nice but it causes too many problems.
 
 There are two major use cases for arrays: building and parsing.  When 
 building, you want fast append, you've got a limited number of arrays, 
 and you likely don't care if your arrays are a little fat.  When 
 parsing, you want data consistency, that is you don't want a sub-array 
 to change if you occasionally append to another sub-array.  You also 
 want to pass those sub-arrays around a lot and you probably want them to 
 be cheap.
 
 An array should be a fat object optimized for fast appending.  It should 
 be a class, because you want to pass it into functions and store 
 references to it while keeping it a distinct object.  Call it Array!(T) 
 or List or Vector or whatever.  It should be implicitly convertible to a 
 slice type T[] which can stay what it is now.  It should be built-in 
 because array new, .dup and slice ~ should return a new array instance.
 
 It'll probably break lots of generic code.

I think there are very solid reasons for keeping slices the primitive array type of choice. Rock-solid in fact. I will go over them in TDPL, but in short pointers are too low-level a primitive for arrays, and more than (the equivalent of) a pair of pointers would be too non-primitive.

I think there's a terminology issue again. In this proposal when I talk about slices I mean the primitives which we currently have as arrays, T[]. Because they *are* slices. And when I talk about arrays I mean fat memory management constructs similar to your Appender or bearophile's ArrayBuilder.
 For appending, I already put Appender in std.array. For parsing, I 
 didn't get your argument.

When you try to use current arrays for memory management, i.e. appending/growing, you get two major problems. First is what your Appender solves, speed. Second is that appending to a slice can overwrite contents of a larger slice, i.e. broken safety. Your library solution can improve speed. But it cannot improve safety, and the speed improvement is kinda optional: you must know about that particular library feature to get a speed gain.

I think that the base type should offer both speed and safety by default. To achieve this I propose to disallow unsafe operations on current arrays, call them slices, and introduce a built-in memory management class, Array(T). Here's why built-in:

void main()
{
  auto arr = new char[]; // arr is of type Array!(char)
  arr.length = 10;       // OK
  arr ~= 'c';            // fast
  arr ~= "foo bar";      // fast
  foo(arr);              // implicit cast to char[]
  bar(arr);              // passed by reference
                         // "text" is appended
}

void foo(char[] a)
{
  char x = a[0];                // OK
  char[] word = a[10..14];      // fine
  size_t l = a.length;          // 18
  a.length = 20;                // error, .length is read-only
  a ~= "text";                  // error, append not supported
  char[] n = a[15..18] ~ "some" // type of expression is Array!(char)
             ~ word             // fast
             ~ "baz"            // fast
             ;                  // implicit cast to char[]
}

void bar(Array!(char) a)
{
  a ~= "text"; // OK
}
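For comparison, the split between "a builder that owns its memory" and "a slice that only views it" can be approximated with the library Appender mentioned earlier in the thread. The following is only a sketch written against the present-day std.array.appender interface (put and data), which may differ from the Appender discussed here; buildText is a hypothetical helper name, and none of this is the built-in Array!(T) being proposed.

```d
import std.array : appender;

// Hypothetical helper: does all the appending through the builder,
// then hands out a plain slice of the finished data.
char[] buildText()
{
    auto buf = appender!(char[])(); // the builder owns and grows the storage
    buf.put('c');                   // append a single element
    buf.put("foo bar");             // append a run of elements
    return buf.data;                // a T[] view of what was built
}

void main()
{
    char[] text = buildText();
    char[] word = text[1 .. 4];     // cheap sub-slice, no copy
    assert(text == "cfoo bar");
    assert(word == "foo");
}
```

The consumer side only reads and slices, which is the role the proposal assigns to T[]; the difference is that here nothing in the language stops the consumer from appending to the slice.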
 2.  Drop the "call a function however you want" idea.  Simple property 
 method syntax is nice, but excessive freedom is not.  Adding one simple 
 keyword fixes the situation.  See
 http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D.announce&artnum=13471

It'll probably break lots of regular code.

This would be an uphill battle with Walter. He'd like to implement the property calls to work only for pairs of functions, something discussed here before.

I'm not fighting here. Just saying what I think is the right thing to do.
Oct 09 2008
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Sergey Gromov wrote:
 I think that the base type should offer both speed and safety by 
 default.  To achieve this I propose to disallow unsafe operations on 
 current arrays, call them slices, and introduce a built-in memory 
 management class, Array(T).

I agree that ~= for T[] is bad. Andrei
Oct 09 2008
prev sibling parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Fri, 10 Oct 2008 13:45:41 +0400,
Denis Koroskin wrote:
 On Fri, 10 Oct 2008 03:50:58 +0400, Sergey Gromov <snake.scaly gmail.com>  
 wrote:
 I think that the base type should offer both speed and safety by
 default.  To achieve this I propose to disallow unsafe operations on
 current arrays, call them slices, and introduce a built-in memory
 management class, Array(T).  Here's why built-in:

 void main()
 {
   auto arr = new char[]; // arr is of type Array!(char)
   arr.length = 10;       // OK
   arr ~= 'c';            // fast
   arr ~= "foo bar";      // fast
   foo(arr);              // implicit cast to char[]
   bar(arr);              // passed by reference
                          // "text" is appended
 }

 void foo(char[] a)
 {
   char x = a[0];                // OK
   char[] word = a[10..14];      // fine
   size_t l = a.length;          // 18
   a.length = 20;                // error, .length is read-only
   a ~= "text";                  // error, append not supported
   char[] n = a[15..18] ~ "some" // type of expression is Array!(char)
              ~ word             // fast
              ~ "baz"            // fast
              ;                  // implicit cast to char[]
 }

 void bar(Array!(char) a)
 {
   a ~= "text"; // OK
 }

The basic idea is good, but I believe that the proposal as it stands is too complex. I think Array!(T) should not be cast to T[] implicitly; instead, it could have a T[] all() method to be on par with the ranges design. But then, this reduces the usefulness of T[]. When should it be used? What are the benefits of const(T)[] over const(Array!(T))? As you propose it, the difference between T[] functionality (mutable but not resizable) and Array!(T) (mutable and resizable) is just too narrow.

My T[] is useful when you want to recursively split a megabyte file into a couple thousands of tokens, and then modify some of those tokens. For that, your T[] must be lightweight, it must reference a bigger piece of data, and it must guarantee not to write anything into memory outside its boundaries. The Array is for appending. It must always own its memory. Therefore you should be able to pass it around by reference, so Array is a *class* and cannot be nearly as lightweight as T[]. You see, many of their properties are orthogonal. If you drop one, you lose flexibility.
 Besides, Array!(T) is not a good name for a built-in type.

Names are placeholders here, not an actual proposal.
Oct 10 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Sergey Gromov wrote:
 My T[] is useful when you want to recursively split a megabyte file into 
 a couple thousands of tokens, and then modify some of those tokens.  For 
 that, your T[] must be lightweight, it must reference a bigger piece of 
 data, and it must guarantee not to write anything into memory outside 
 its boundaries.
 
 The Array is for appending.  It must always own its memory.  Therefore 
 you should be able to pass it around by reference, so Array is a *class* 
 and cannot be nearly as lightweight as T[].
 
 You see, many of their properties are orthogonal.  If you drop one, you 
 lose flexibility.
 
 Besides, Array!(T) is not a good name for a built-in type.

Names are placeholders here, not an actual proposal.

What's wrong with making Array a library type? Andrei
Oct 10 2008
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Andrei Alexandrescu" wrote
 Sergey Gromov wrote:
 My T[] is useful when you want to recursively split a megabyte file into 
 a couple thousands of tokens, and then modify some of those tokens.  For 
 that, your T[] must be lightweight, it must reference a bigger piece of 
 data, and it must guarantee not to write anything into memory outside its 
 boundaries.

 The Array is for appending.  It must always own its memory.  Therefore 
 you should be able to pass it around by reference, so Array is a *class* 
 and cannot be nearly as lightweight as T[].

 You see, many of their properties are orthogonal.  If you drop one, you 
 lose flexibility.

 Besides, Array!(T) is not a good name for a built-in type.

Names are placeholders here, not an actual proposal.

What's wrong with making Array a library type?

What would new T[x] return? If it returns Array!(T), then this has to be a compiler-aware type. -Steve
Oct 10 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 "Andrei Alexandrescu" wrote
 Sergey Gromov wrote:
 My T[] is useful when you want to recursively split a megabyte file into 
 a couple thousands of tokens, and then modify some of those tokens.  For 
 that, your T[] must be lightweight, it must reference a bigger piece of 
 data, and it must guarantee not to write anything into memory outside its 
 boundaries.

 The Array is for appending.  It must always own its memory.  Therefore 
 you should be able to pass it around by reference, so Array is a *class* 
 and cannot be nearly as lightweight as T[].

 You see, many of their properties are orthogonal.  If you drop one, you 
 lose flexibility.

 Besides, Array!(T) is not a good name for a built-in type.



What would new T[x] return? If it returns Array!(T), then this has to be a compiler-aware type. -Steve

new T[x] is a brain-dead syntax that I wish Walter hadn't imported in the first place. Andrei
Oct 10 2008
parent reply Benji Smith <dlanguage benjismith.net> writes:
Andrei Alexandrescu wrote:
 new T[x] is a brain-dead syntax that I wish Walter hadn't imported in 
 the first place.

Really? I think it's very valuable. The "new T[x]" syntax lets you construct an array as an RValue. Without that syntax, you have to declare an array before using it.

    // nice
    x.setOutputBuffer(new char[64]);

    // not so nice
    char[64] buffer;
    x.setOutputBuffer(buffer);

Personally, I'd love to see the distinction between static arrays and dynamic arrays disappear. (The compiler can do whatever it wants behind the scenes, but usually I just don't care which is which, and I'd prefer a unified syntax.)

I think *all* arrays should be declared like this:

    T[] array = new T[n];

If "n" is known at compile time, then D can use CTFE to create a static array, and if "n" isn't known until runtime, it can create a dynamic array. But as the user, I don't want to care which is which.

(And I don't see how the distinction in the type-system between T[] and T[3] is useful.)

--benji
Oct 10 2008
next sibling parent reply KennyTM~ <kennytm gmail.com> writes:
Benji Smith wrote:
 Andrei Alexandrescu wrote:
 new T[x] is a brain-dead syntax that I wish Walter hadn't imported in 
 the first place.

Really? I think it's very valuable. The "new T[x]" syntax lets you construct an array as an RValue. Without that syntax, you have to declare an array before using it.

    // nice
    x.setOutputBuffer(new char[64]);

    // not so nice
    char[64] buffer;
    x.setOutputBuffer(buffer);

Personally, I'd love to see the distinction between static arrays and dynamic arrays disappear. (The compiler can do whatever it wants behind the scenes, but usually I just don't care which is which, and I'd prefer a unified syntax.)

I think *all* arrays should be declared like this:

    T[] array = new T[n];

If "n" is known at compile time, then D can use CTFE to create a static array, and if "n" isn't known until runtime, it can create a dynamic array. But as the user, I don't want to care which is which.

(And I don't see how the distinction in the type-system between T[] and T[3] is useful.)

--benji

I think... “new S” creating a struct pointer while “new C” creating a class object reference is confusing enough...
Oct 10 2008
parent Benji Smith <dlanguage benjismith.net> writes:
KennyTM~ wrote:
 I think... “new S” creating a struct pointer while “new C” creating a 
 class object reference is confusing enough...

The rule I'd like to see in place would be something like this: "To create an instance of a type, you always use the 'new' keyword. The value returned by 'new' depends on the type of instance you're creating. For structs, it's a pointer and for classes it's a reference. Since arrays are (or should be) objects, 'new' returns a reference to the resultant array object." Easy peasey, lemon-squeezey :-) --benji
Oct 10 2008
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Benji Smith" wrote
 Andrei Alexandrescu wrote:
 new T[x] is a brain-dead syntax that I wish Walter hadn't imported in the 
 first place.

Really? I think it's very valuable. The "new T[x]" syntax lets you construct an array as an RValue. Without that syntax, you have to declare an array before using it.

    // nice
    x.setOutputBuffer(new char[64]);

    // not so nice
    char[64] buffer;
    x.setOutputBuffer(buffer);

Personally, I'd love to see the distinction between static arrays and dynamic arrays disappear. (The compiler can do whatever it wants behind the scenes, but usually I just don't care which is which, and I'd prefer a unified syntax.)

I think *all* arrays should be declared like this:

    T[] array = new T[n];

If "n" is known at compile time, then D can use CTFE to create a static array, and if "n" isn't known until runtime, it can create a dynamic array. But as the user, I don't want to care which is which.

What if n is 10000? It's small enough that it could be stack allocated, but large enough that you might not want it to do that.
 (And I don't see how the distinction in the type-system between T[] and 
 T[3] is useful.)

It tells you the scope of where to allocate data. But once declared, usage should be identical. Currently there are some quirks in the language that make this not true (which should be fixed). e.g. you can't return static arrays from functions. -Steve
Oct 10 2008
next sibling parent KennyTM~ <kennytm gmail.com> writes:
Steven Schveighoffer wrote:
 "Benji Smith" wrote
 Andrei Alexandrescu wrote:
 new T[x] is a brain-dead syntax that I wish Walter hadn't imported in the 
 first place.

The "new T[x]" syntax lets you construct an array as an RValue. Without that syntax, you have to declare an array before using it.

    // nice
    x.setOutputBuffer(new char[64]);

    // not so nice
    char[64] buffer;
    x.setOutputBuffer(buffer);

Personally, I'd love to see the distinction between static arrays and dynamic arrays disappear. (The compiler can do whatever it wants behind the scenes, but usually I just don't care which is which, and I'd prefer a unified syntax.)

I think *all* arrays should be declared like this:

    T[] array = new T[n];

If "n" is known at compile time, then D can use CTFE to create a static array, and if "n" isn't known until runtime, it can create a dynamic array. But as the user, I don't want to care which is which.

What if n is 10000? It's small enough that it could be stack allocated, but large enough that you might not want it to do that.

I think this is implementation dependent. DMD for example limits the size to some power of 2 between 10k and 300k (forgotten the exact number ;) )
 (And I don't see how the distinction in the type-system between T[] and 
 T[3] is useful.)

It tells you the scope of where to allocate data. But once declared, usage should be identical. Currently there are some quirks in the language that make this not true (which should be fixed). e.g. you can't return static arrays from functions. -Steve

Oct 10 2008
prev sibling parent reply Benji Smith <dlanguage benjismith.net> writes:
Steven Schveighoffer wrote:
 "Benji Smith" wrote
 I think *all* arrays should be declared like this:

    T[] array = new T[n];

 If "n" is known at compile time, then D can use CTFE to create a static 
 array, and if "n" isn't known until runtime, it can create a dynamic 
 array. But as the user, I don't want to care which is which.

What if n is 10000? It's small enough that it could be stack allocated, but large enough that you might not want it to do that.

Sounds like a perfect decision for the compiler (or the runtime) to make. D eliminated the "register" and "inline" keywords for exactly the same reason. --benji
Oct 10 2008
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Benji Smith" wrote
 Steven Schveighoffer wrote:
 "Benji Smith" wrote
 I think *all* arrays should be declared like this:

    T[] array = new T[n];

 If "n" is known at compile time, then D can use CTFE to create a static 
 array, and if "n" isn't known until runtime, it can create a dynamic 
 array. But as the user, I don't want to care which is which.

What if n is 10000? It's small enough that it could be stack allocated, but large enough that you might not want it to do that.

Sounds like a perfect decision for the compiler (or the runtime) to make. D eliminated the "register" and "inline" keywords for exactly the same reason.

How can the compiler quantify exactly how many times this function will be called in the same stack? How will it know that this won't cause a stack overflow? The register keyword is easy, the compiler can't run out of register space and cause an exception. inline also won't cause unforeseen runtime errors. So I see this as a different issue. -Steve
Oct 10 2008
prev sibling next sibling parent Benji Smith <dlanguage benjismith.net> writes:
Denis Koroskin wrote:
 On Fri, 10 Oct 2008 19:54:33 +0400, Benji Smith 
 <dlanguage benjismith.net> wrote:
 
 I think *all* arrays should be declared like this:

     T[] array = new T[n];

You often want to avoid heap allocation at all cost. It can't be done as you propose.

For all other reference types, allocation on the stack is accomplished with the "scope" keyword, without having a different type, or a different constructor-call syntax. I think the same thing could apply to arrays just as easily. --benji
Oct 10 2008
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jarrett Billingsley wrote:
 On Fri, Oct 10, 2008 at 11:54 AM, Benji Smith <dlanguage benjismith.net> wrote:
 
 new T[x] is a brain-dead syntax that I wish Walter hadn't imported in the
 first place.

The "new T[x]" syntax lets you construct an array as an RValue. Without that syntax, you have to declare an array before using it.

No, what he's getting at is that "new T[x]" does not mean "allocate a statically-sized array", it means "allocate a dynamically-sized array". "new T" for any T should mean "allocate a T", not "allocate something that's kind of close to a T." What Andrei is implying, then is that for dynamic arrays, we should have to use the (already-legal) "new T[](n)" form, and "new T[x]" would mean to allocate a statically-sized array on the heap.

Well yah but I think this will confuse people coming from C++. I just wish new was abolished entirely:

    struct S {}
    auto a = S();
    auto b = Object();
    auto c = char[](15);
    auto d = char[15]();

So in general Type followed by "(" ...optional arguments... ")" yields a value.

Andrei
Oct 10 2008
next sibling parent Benji Smith <dlanguage benjismith.net> writes:
Andrei Alexandrescu wrote:
 Jarrett Billingsley wrote:
 What Andrei is implying, then is that for dynamic arrays, we should
 have to use the (already-legal) "new T[](n)" form, and "new T[x]"
 would mean to allocate a statically-sized array on the heap.

Well yah but I think this will confuse people coming from C++. I just wish new was abolished entirely:

    struct S {}
    auto a = S();
    auto b = Object();
    auto c = char[](15);
    auto d = char[15]();

So in general Type followed by "(" ...optional arguments... ")" yields a value.

Andrei

Interesting. I could live with that. --benji
Oct 10 2008
prev sibling next sibling parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Fri, 10 Oct 2008 13:24:51 -0500,
Andrei Alexandrescu wrote:
 Jarrett Billingsley wrote:
 What Andrei is implying, then is that for dynamic arrays, we should
 have to use the (already-legal) "new T[](n)" form, and "new T[x]"
 would mean to allocate a statically-sized array on the heap.

Well yah but I think this will confuse people coming from C++. I just wish new was abolished entirely:

    struct S {}
    auto a = S();
    auto b = Object();
    auto c = char[](15);
    auto d = char[15]();

So in general Type followed by "(" ...optional arguments... ")" yields a value.

Type constructors? Nice. You will have a hard time justifying the delete keyword though.
Oct 10 2008
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Sergey Gromov wrote:
 Fri, 10 Oct 2008 13:24:51 -0500,
 Andrei Alexandrescu wrote:
 Jarrett Billingsley wrote:
 What Andrei is implying, then is that for dynamic arrays, we should
 have to use the (already-legal) "new T[](n)" form, and "new T[x]"
 would mean to allocate a statically-sized array on the heap.

Well yah but I think this will confuse people coming from C++. I just wish new was abolished entirely:

    struct S {}
    auto a = S();
    auto b = Object();
    auto c = char[](15);
    auto d = char[15]();

So in general Type followed by "(" ...optional arguments... ")" yields a value.

Type constructors? Nice. You will have a hard time justifying the delete keyword though.

That's even more useless. It should be a library function. Andrei
Oct 10 2008
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Andrei Alexandrescu" wrote
 Jarrett Billingsley wrote:
 On Fri, Oct 10, 2008 at 11:54 AM, Benji Smith <dlanguage benjismith.net> 
 wrote:

 new T[x] is a brain-dead syntax that I wish Walter hadn't imported in 
 the
 first place.

The "new T[x]" syntax lets you construct an array as an RValue. Without that syntax, you have to declare an array before using it.

No, what he's getting at is that "new T[x]" does not mean "allocate a statically-sized array", it means "allocate a dynamically-sized array". "new T" for any T should mean "allocate a T", not "allocate something that's kind of close to a T." What Andrei is implying, then is that for dynamic arrays, we should have to use the (already-legal) "new T[](n)" form, and "new T[x]" would mean to allocate a statically-sized array on the heap.

Well yah but I think this will confuse people coming from C++. I just wish new was abolished entirely:

    struct S {}
    auto a = S();
    auto b = Object();
    auto c = char[](15);
    auto d = char[15]();

So in general Type followed by "(" ...optional arguments... ")" yields a value.

to clarify (for myself mostly), you are implying:

    typeof(a) == S *
    typeof(b) == Object
    typeof(c) == char[]
    typeof(d) == char[15]*

Is that correct?

What is the syntax for allocating a struct on the stack using a constructor?

It does look more appealing. That's a really really big change though :)

Not sure anyone uses it, but what about parameters to new? I suppose you could just write another function that does it.

-Steve
Oct 10 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 "Andrei Alexandrescu" wrote
 Jarrett Billingsley wrote:
 On Fri, Oct 10, 2008 at 11:54 AM, Benji Smith <dlanguage benjismith.net> 
 wrote:

 new T[x] is a brain-dead syntax that I wish Walter hadn't imported in 
 the
 first place.

The "new T[x]" syntax lets you construct an array as an RValue. Without that syntax, you have to declare an array before using it.

No, what he's getting at is that "new T[x]" does not mean "allocate a statically-sized array", it means "allocate a dynamically-sized array". "new T" for any T should mean "allocate a T", not "allocate something that's kind of close to a T." What Andrei is implying, then is that for dynamic arrays, we should have to use the (already-legal) "new T[](n)" form, and "new T[x]" would mean to allocate a statically-sized array on the heap.

Well yah but I think this will confuse people coming from C++. I just wish new was abolished entirely:

    struct S {}
    auto a = S();
    auto b = Object();
    auto c = char[](15);
    auto d = char[15]();

So in general Type followed by "(" ...optional arguments... ")" yields a value.

to clarify (for myself mostly), you are implying: typeof(a) == S *

typeof(a) = S just like now
 typeof(b) == Object

yah
 typeof(c) == char[]

yah, and 15 chars get allocated
 typeof(d) == char[15]*

just char[15].
 Is that correct?
 
 What is the syntax for allocating a struct on the stack using a constructor?

Library call:

    S * pS = allocate!(S)(... optional args ...);
 It does look more appealing.  That's a really really big change though :)
 
 Not sure anyone uses it, but what about parameters to new?  I suppose you 
 could just write another function that does it.

Maybe parameters to new are an even better illustration for how poor that syntax is. Much energy is expended on explaining the vagaries of new syntax in C++... and what is there to it? Nothing. Allocate memory. Plop an object in it. That's pretty much it. Andrei
Oct 10 2008
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Andrei Alexandrescu" wrote
 Steven Schveighoffer wrote:
 "Andrei Alexandrescu" wrote
 Jarrett Billingsley wrote:
 On Fri, Oct 10, 2008 at 11:54 AM, Benji Smith 
 <dlanguage benjismith.net> wrote:

 new T[x] is a brain-dead syntax that I wish Walter hadn't imported in 
 the
 first place.

The "new T[x]" syntax lets you construct an array as an RValue. Without that syntax, you have to declare an array before using it.

No, what he's getting at is that "new T[x]" does not mean "allocate a statically-sized array", it means "allocate a dynamically-sized array". "new T" for any T should mean "allocate a T", not "allocate something that's kind of close to a T." What Andrei is implying, then is that for dynamic arrays, we should have to use the (already-legal) "new T[](n)" form, and "new T[x]" would mean to allocate a statically-sized array on the heap.

Well yah but I think this will confuse people coming from C++. I just wish new was abolished entirely:

    struct S {}
    auto a = S();
    auto b = Object();
    auto c = char[](15);
    auto d = char[15]();

So in general Type followed by "(" ...optional arguments... ")" yields a value.

to clarify (for myself mostly), you are implying: typeof(a) == S *

typeof(a) = S just like now

Ah, ok. I thought anything with () was allocated on the heap...
 typeof(b) == Object

yah
 typeof(c) == char[]

yah, and 15 chars get allocated
 typeof(d) == char[15]*

just char[15].

What is the point of that? Why wouldn't you just say: char[15] d;
 Is that correct?

 What is the syntax for allocating a struct on the stack using a 
 constructor?

Library call:

    S * pS = allocate!(S)(... optional args ...);

Ugh. How many extra 'wrapper' functions are built into the code because of this? I suppose it probably would be inlined.

If new is abolished as a keyword, couldn't we use it instead of allocate?

    S * pS = new!(S)(...);

The proposal makes sense, I'd be concerned that syntax like this is ambiguous:

    auto x = X();
    auto y = Y();

You don't know if those are value or reference types, so you don't know how they are used. As opposed to:

    auto x = new X();
    auto y = Y();

Where you see that x is a reference type.

It might be more confusing to someone who cares about the value semantics.

-Steve
Oct 10 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 "Andrei Alexandrescu" wrote
 typeof(d) == char[15]*


What is the point of that? Why wouldn't you just say: char[15] d;

Only uniformity. In general Type() creates an instance of Type. Easy!
 S * pS = allocate!(S)(... optional args ...);

Ugh. How many extra 'wrapper' functions are built into the code because of this? I suppose it probably would be inlined.

Performance is not to worry about. allocate does only a call to gc.allocate, the requisite initialization, and a cast. The cost of call to gc.allocate dwarfs the call overhead even in absence of inlining.
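To make the three steps concrete (get memory from the GC, initialize it, return a typed pointer), here is a rough sketch of what such a library function might look like. It is an assumption-laden illustration, not the actual allocate being discussed: it uses GC.malloc from core.memory and emplace from std.conv as they exist in the present-day runtime and Phobos, and the struct S with its two fields is made up for the example.

```d
import core.memory : GC;
import std.conv : emplace;

// Hypothetical stand-in for the allocate!(S)(...) call in this post.
T* allocate(T, Args...)(auto ref Args args)
    if (is(T == struct))
{
    // 1. ask the garbage collector for raw memory of the right size
    void[] mem = GC.malloc(T.sizeof)[0 .. T.sizeof];
    // 2. and 3. construct a T in that memory and get a typed pointer back
    return emplace!T(mem, args);
}

// Made-up struct, only here to exercise the helper.
struct S
{
    int x;
    int y;
}

void main()
{
    S* pS = allocate!S(1, 2);
    assert(pS.x == 1 && pS.y == 2);
}
```

The wrapper itself is a couple of lines around the GC call, which is the point being made about call overhead.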
 If new is abolished as a keyword, couldn't we use it instead of allocate?
 
 S * pS = new!(S)(...);

Yah. Just don't want to surprise anyone.
 The proposal makes sense, I'd be concerned that syntax like this is 
 ambiguous:
 
 auto x = X();
 auto y = Y();
 
 You don't know if those are value or reference types, so you don't know how 
 they are used.  As opposed to:
 
 auto x = new X();
 auto y = Y();
 
 Where you see that x is a reference type.
 
 It might be more confusing to someone who cares about the value semantics.

That's a benefit in fact. Uniformity is good. Structs don't have value semantics unless designed to, and that means you can count on value semantics only when you know the type. Andrei
Oct 10 2008
parent reply Christopher Wright <dhasenan gmail.com> writes:
Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
 "Andrei Alexandrescu" wrote
 typeof(d) == char[15]*


What is the point of that? Why wouldn't you just say: char[15] d;

Only uniformity. In general Type() creates an instance of Type. Easy!
 S * pS = allocate!(S)(... optional args ...);

Ugh. How many extra 'wrapper' functions are built into the code because of this? I suppose it probably would be inlined.

Performance is not to worry about. allocate does only a call to gc.allocate, the requisite initialization, and a cast. The cost of call to gc.allocate dwarfs the call overhead even in absence of inlining.

Compile times are something to worry about. I'm quite hesitant to use templates when there is a reasonable alternative; they tend to increase compilation times dramatically, even if they're relatively simple. In this case, it's equivalent to: S* obj = allocate!(typeid (S) ...);
Oct 11 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Christopher Wright wrote:
 Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
 "Andrei Alexandrescu" wrote
 typeof(d) == char[15]*


What is the point of that? Why wouldn't you just say: char[15] d;

Only uniformity. In general Type() creates an instance of Type. Easy!
 S * pS = allocate!(S)(... optional args ...);

Ugh. How many extra 'wrapper' functions are built into the code because of this? I suppose it probably would be inlined.

Performance is not to worry about. allocate does only a call to gc.allocate, the requisite initialization, and a cast. The cost of call to gc.allocate dwarfs the call overhead even in absence of inlining.

Compile times are something to worry about. I'm quite hesitant to use templates when there is a reasonable alternative; they tend to increase compilation times dramatically, even if they're relatively simple.

I think this is a self-sustaining myth. I use many templates but rarely notice a compilation time issue. Andrei
Oct 11 2008
parent reply Christopher Wright <dhasenan gmail.com> writes:
Andrei Alexandrescu wrote:
 Christopher Wright wrote:
 Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
 "Andrei Alexandrescu" wrote
 typeof(d) == char[15]*


What is the point of that? Why wouldn't you just say: char[15] d;

Only uniformity. In general Type() creates an instance of Type. Easy!
 S * pS = allocate!(S)(... optional args ...);

Ugh. How many extra 'wrapper' functions are built into the code because of this? I suppose it probably would be inlined.

Performance is not to worry about. allocate does only a call to gc.allocate, the requisite initialization, and a cast. The cost of call to gc.allocate dwarfs the call overhead even in absence of inlining.

Compile times are something to worry about. I'm quite hesitant to use templates when there is a reasonable alternative; they tend to increase compilation times dramatically, even if they're relatively simple.

I think this is a self-sustaining myth. I use many templates but rarely notice a compilation time issue. Andrei

I've noticed a huge difference. However, that was with a largish recursive set of templates.
Oct 11 2008
parent Don <nospam nospam.com.au> writes:
Christopher Wright wrote:
 Andrei Alexandrescu wrote:
 Christopher Wright wrote:
 Andrei Alexandrescu wrote:
 Steven Schveighoffer wrote:
 "Andrei Alexandrescu" wrote
 typeof(d) == char[15]*


What is the point of that? Why wouldn't you just say: char[15] d;

Only uniformity. In general Type() creates an instance of Type. Easy!
 S * pS = allocate!(S)(... optional args ...);

Ugh. How many extra 'wrapper' functions are built into the code because of this? I suppose it probably would be inlined.

Performance is not to worry about. allocate does only a call to gc.allocate, the requisite initialization, and a cast. The cost of call to gc.allocate dwarfs the call overhead even in absence of inlining.

Compile times are something to worry about. I'm quite hesitant to use templates when there is a reasonable alternative; they tend to increase compilation times dramatically, even if they're relatively simple.

I think this is a self-sustaining myth. I use many templates but rarely notice a compilation time issue. Andrei

I've noticed a huge difference. However, that was with a largish recursive set of templates.

should disappear.
Oct 13 2008
prev sibling next sibling parent reply Benji Smith <dlanguage benjismith.net> writes:
Jarrett Billingsley wrote:
 On Fri, Oct 10, 2008 at 11:54 AM, Benji Smith <dlanguage benjismith.net> wrote:
 
 new T[x] is a brain-dead syntax that I wish Walter hadn't imported in the
 first place.

The "new T[x]" syntax lets you construct an array as an RValue. Without that syntax, you have to declare an array before using it.

No, what he's getting at is that "new T[x]" does not mean "allocate a statically-sized array", it means "allocate a dynamically-sized array". "new T" for any T should mean "allocate a T", not "allocate something that's kind of close to a T."

As long as T[3] and T[5] and T[] are considered different types, I agree with that sentiment. But then again, I think array semantics would make a lot more sense if all arrays were of type T[], regardless of their size, their location (stack vs heap), and whether they're static or dynamic. --benji
Oct 10 2008
parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Benji Smith wrote:
 Jarrett Billingsley wrote:
 On Fri, Oct 10, 2008 at 11:54 AM, Benji Smith 
 <dlanguage benjismith.net> wrote:

 new T[x] is a brain-dead syntax that I wish Walter hadn't imported 
 in the
 first place.

The "new T[x]" syntax lets you construct an array as an RValue. Without that syntax, you have to declare an array before using it.

No, what he's getting at is that "new T[x]" does not mean "allocate a statically-sized array", it means "allocate a dynamically-sized array". "new T" for any T should mean "allocate a T", not "allocate something that's kind of close to a T."

As long as T[3] and T[5] and T[] are considered different types, I agree with that sentiment. But then again, I think array semantics would make a lot more sense if all arrays were of type T[], regardless of their size, their location (stack vs heap), and whether they're static or dynamic. --benji

Static arrays are needed for C compatibility (in extern(C) structs), so they're not going anywhere.
Oct 10 2008
parent reply Benji Smith <dlanguage benjismith.net> writes:
Robert Fraser wrote:
 Benji Smith wrote:
 Jarrett Billingsley wrote:
 On Fri, Oct 10, 2008 at 11:54 AM, Benji Smith 
 <dlanguage benjismith.net> wrote:

 new T[x] is a brain-dead syntax that I wish Walter hadn't imported 
 in the
 first place.

The "new T[x]" syntax lets you construct an array as an RValue. Without that syntax, you have to declare an array before using it.

No, what he's getting at is that "new T[x]" does not mean "allocate a statically-sized array", it means "allocate a dynamically-sized array". "new T" for any T should mean "allocate a T", not "allocate something that's kind of close to a T."

As long as T[3] and T[5] and T[] are considered different types, I agree with that sentiment. But then again, I think array semantics would make a lot more sense if all arrays were of type T[], regardless of their size, their location (stack vs heap), and whether they're static or dynamic. --benji

Static arrays are needed for C compatibility (in extern(C) structs), so they're not going anywhere.

So what? Null-terminated strings are also necessary for C compatibility, but that doesn't mean *all* strings should be null terminated. Anyhow, I'm not going to keep chasing this point. For people new to D, the subtle differences between static and dynamic arrays can be a source of confusion. I still have my share of gotcha moments with them, and I think D would be well served by minimizing those differences. --benji
Oct 10 2008
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Benji Smith (dlanguage benjismith.net)'s article
 Anyhow, I'm not going to keep chasing this point. For people new to D,
 the subtle differences between static and dynamic arrays can be a source
 of confusion. I still have my share of gotcha moments with them, and I
 think D would be well served by minimizing those differences.
 --benji

I disagree, not only specifically on this issue but on a more philosophical level about a lot of stuff that's been mentioned here in the past few days about simplifying D. The fact is that D is a performance language that retains the ability to program close to the metal. It may be a relatively friendly, modern, easy to use performance language, but it's still a performance language. It's also a multiparadigm language. These two things necessarily mean that some tradeoffs have to be made in terms of simplicity to allow for more efficient code to be written, and for procedural, functional, metaprogramming and OO paradigms to be mixed and matched.

If people can't or don't want to understand the difference between a static, stack-allocated array and a dynamic, heap-allocated array, they probably don't need a performance language. They probably want Python or Ruby and they know where to find it. If they can't or don't want to learn more than one programming style, they probably don't need a multiparadigm language. They probably want Java and they know where to find it.

The bottom line is that yes, D is becoming a fairly complicated language, but it's complicated because it's powerful, unlike for example C++, which is complicated because it's full of cruft from before people like me were even born, or Haskell (disclaimer: I've never actually used Haskell before, though I've heard a decent amount about it), which is complicated because it rigidly emphasizes theoretical purity.
Oct 10 2008
parent reply Benji Smith <dlanguage benjismith.net> writes:
dsimcha wrote:
 == Quote from Benji Smith (dlanguage benjismith.net)'s article
 Anyhow, I'm not going to keep chasing this point. For people new to D,
 the subtle differences between static and dynamic arrays can be a source
 of confusion. I still have my share of gotcha moments with them, and I
 think D would be well served by minimizing those differences.
 --benji

I disagree, not only specifically on this issue but on a more philosophical level about a lot of stuff that's been mentioned here in the past few days about simplifying D. The fact is that D is a performance language that retains the ability to program close to the metal.

Actually, when it comes to string processing, D is decidedly *not* a "performance language". Compared to...say...Java (which gets a bum rap around here for being slow), D is nothing special when it comes to string processing speed.

I've attached a couple of benchmarks, implemented in both Java and D (the "shakespeare.txt" file I'm benchmarking against is from the Gutenberg project. It's about 5 MB, and you can grab it from here: http://www.gutenberg.org/dirs/etext94/shaks12.txt )

In some of those benchmarks, D is slightly faster. In some of them, Java is a lot faster. Overall, on my machine, the D code runs in about 12.5 seconds, and the Java code runs in about 2.5 seconds.

Keep in mind, all Java characters are two bytes wide. And you can't access a character directly. You have to retrieve it from the String object, using the charAt() method. And splitting a string creates a new object for every fragment.

I admire the goal in D to be a performance language, but it drives me crazy when people use performance as justification for an inferior design, when other languages that use the superior design also accomplish superior performance.

--benji
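The attached benchmark sources are not reproduced in this archive. As a rough idea of what the Java side of the character-iteration and word-find tasks looks like, here is a minimal sketch; it is a reconstruction for illustration, not the benchmark itself, and the class and method names are assumptions:

```java
// Illustrative reconstruction; not the benchmark code from the attachment.
final class StringTaskSketch {
    // Walks the string through charAt(), the only per-character access Java offers.
    static int countSpaces(String text) {
        int spaces = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == ' ') {
                spaces++;
            }
        }
        return spaces;
    }

    // Counts (possibly overlapping) occurrences of word using String.indexOf().
    static int countOccurrences(String text, String word) {
        if (word.isEmpty()) {
            throw new IllegalArgumentException("word must not be empty");
        }
        int count = 0;
        int pos = text.indexOf(word);
        while (pos >= 0) {
            count++;
            pos = text.indexOf(word, pos + 1);
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "the other then";
        long start = System.nanoTime();
        int spaces = countSpaces(sample);
        int hits = countOccurrences(sample, "the");
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.println(spaces + " spaces, " + hits + " hits in " + seconds + " seconds");
    }
}
```

Note that a substring search for "the" also matches inside longer words such as "other" and "then".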
Oct 10 2008
next sibling parent Benji Smith <dlanguage benjismith.net> writes:
Benji Smith wrote:
 blah blah blah

I forgot to mention... the reason my post was all about strings was because those are the arrays I've had to work with most often, and this set of benchmarks is something I cooked up last month to justify my position that strings should be objects. But I think the benchmarks are apropos to the general discussion about array design, since strings in D are arrays. Anyhoo... that's all for now. I'm going to bed :) --benji
Oct 11 2008
prev sibling parent reply Sascha Katzner <sorry.no spam.invalid> writes:
Benji Smith wrote:
 Actually, when it comes to string processing, D is decidedly *not* a 
 "performance language".
 
 Compared to...say...Java (which gets a bum rap around here for being 
 slow), D is nothing special when it comes to string processing speed.
 
 I've attached a couple of benchmarks, implemented in both Java and D 
 (the "shakespeare.txt" file I'm benchmarking against is from the 
 Gutenburg project. It's about 5 MB, and you can grab it from here: 
 http://www.gutenberg.org/dirs/etext94/shaks12.txt )
 
 In some of those benchmarks, D is slightly faster. In some of them, Java 
 is a lot faster. Overall, on my machine, the D code runs in about 12.5 
 seconds, and the Java code runs in about 2.5 seconds.
 
 Keep in mind, all java characters are two-bytes wide. And you can't 
 access a character directly. You have to retrieve it from the String 
 object, using the charAt() method. And splitting a string creates a new 
 object for every fragment.
 
 I admire the goal in D to be a performance language, but it drives me 
 crazy when people use performance as justification for an inferior 
 design, when other languages that use the superior design also 
 accomplish superior performance.

I think your benchmark is not very meaningful. Without going into implementation details of Tango (because I don't use Tango) here are some notes:

- The D version uses UTF8 strings whereas the Java version uses "wanna-be" UTF16 (Java has a lot of problems with surrogates). This means you are comparing apples with pears (D has to *parse* a UTF8 string and Java simply uses a wchar array without proper surrogate handling in *many* cases).

- At least in runCharIterateTest() you additionally convert the D UTF8 string into a UTF32 string; in the Java version you did not do this.

- The StringBuilder in the Java version is *much* faster because it doesn't have to allocate a new memory block in each step. You can use a similar class in D too, without the need of a special string class/object.

...

LLAP,
Sascha
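To make the surrogate remark concrete: Java's String.length() and charAt() work on UTF-16 code units, so a character outside the Basic Multilingual Plane counts as two. A minimal sketch (the class and method names are made up for illustration; the JDK calls are standard):

```java
// Illustrative sketch; SurrogateDemo is a hypothetical name.
final class SurrogateDemo {
    // Number of UTF-16 code units, which is what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1D11E (musical symbol G clef) lies outside the BMP and needs a surrogate pair.
        String clef = new String(Character.toChars(0x1D11E));
        String text = "a" + clef + "b";
        System.out.println(codeUnits(text));
        System.out.println(codePoints(text));
        // charAt(1) yields only the first half of the pair.
        System.out.println(Character.isHighSurrogate(text.charAt(1)));
    }
}
```

A loop over charAt() therefore visits four units for this three-character string, which is the sense in which the Java iteration does less work than a decoding loop over UTF8.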
Oct 11 2008
next sibling parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Sat, 11 Oct 2008 12:16:43 +0200,
Sascha Katzner wrote:
 Benji Smith wrote:
 Actually, when it comes to string processing, D is decidedly *not* a 
 "performance language".
 
 Compared to...say...Java (which gets a bum rap around here for being 
 slow), D is nothing special when it comes to string processing speed.
 
 I've attached a couple of benchmarks, implemented in both Java and D 
 (the "shakespeare.txt" file I'm benchmarking against is from the 
 Gutenburg project. It's about 5 MB, and you can grab it from here: 
 http://www.gutenberg.org/dirs/etext94/shaks12.txt )
 
 In some of those benchmarks, D is slightly faster. In some of them, Java 
 is a lot faster. Overall, on my machine, the D code runs in about 12.5 
 seconds, and the Java code runs in about 2.5 seconds.
 
 Keep in mind, all java characters are two-bytes wide. And you can't 
 access a character directly. You have to retrieve it from the String 
 object, using the charAt() method. And splitting a string creates a new 
 object for every fragment.
 
 I admire the goal in D to be a performance language, but it drives me 
 crazy when people use performance as justification for an inferior 
 design, when other languages that use the superior design also 
 accomplish superior performance.

I think your benchmark is not very meaningful. Without going into implementation details of Tango (because I don't use Tango) here are some notes: - The D version uses UTF8 strings whereas the Java version uses "wanna-be" UTF16 (Java has a lot of problems with surrogates). This means you are comparing apples with pears (D has to *parse* an UTF8 string and Java simply uses an wchar array without proper surrogate handling in *many* cases).

This is the whole point. The benchmark is valid because it performs the same *task*, and the task is somewhat close to real world. It measures *time*, which is universal. The compared languages use different approaches and techniques to achieve the goal, which is why the benchmark is useful. It lets you assess the usefulness of these languages for a particular class of tasks.
 - At least in runCharIterateTest() you also convert the D UTF8 string 
 also additionally into an UTF32 string, in the Java version you did not 
 do this.

Same as above. If they were using the same approach there wouldn't be much to benchmark. Why don't you mention, for instance, that Java is a virtual machine?
 - The StringBuilder in the Java version is *much* faster because it 
 doesn't have to allocate a new memory block in each step. You can use a 
 similar class in D too, without the need of a special string class/object.

I agree here. Both word tango.text.Util.split and runConcatenateTest use default array appending which is currently dead slow. Benji, to actually compare the speed of string operations you better use one of array builders discussed in this group.
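The difference between appending through a builder and appending by repeated reallocation can be sketched on the Java side as follows. This is an illustration of the technique, not the benchmark's code; the names are assumptions, and the trailing space after each word mirrors the concatenation test described in the thread:

```java
// Illustrative sketch; ConcatSketch and its methods are hypothetical names.
final class ConcatSketch {
    // Appends into one growable buffer that is sized up front.
    static String joinWithBuilder(String[] words) {
        int total = 0;
        for (String w : words) {
            total += w.length() + 1;
        }
        StringBuilder buffer = new StringBuilder(total);
        for (String w : words) {
            buffer.append(w).append(' ');
        }
        return buffer.toString();
    }

    // Builds a brand new String on every step, copying everything so far.
    static String joinNaively(String[] words) {
        String result = "";
        for (String w : words) {
            result = result + w + " ";
        }
        return result;
    }

    public static void main(String[] args) {
        String[] words = {"to", "be", "or", "not"};
        System.out.println(joinWithBuilder(words).equals(joinNaively(words)));
    }
}
```

Both methods produce the same text; only the number of allocations and copies differs.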
Oct 11 2008
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Sergey Gromov wrote:
 Sat, 11 Oct 2008 12:16:43 +0200,
 Sascha Katzner wrote:
 Benji Smith wrote:
 Actually, when it comes to string processing, D is decidedly *not* a 
 "performance language".

 Compared to...say...Java (which gets a bum rap around here for being 
 slow), D is nothing special when it comes to string processing speed.

 I've attached a couple of benchmarks, implemented in both Java and D 
 (the "shakespeare.txt" file I'm benchmarking against is from the 
 Gutenburg project. It's about 5 MB, and you can grab it from here: 
 http://www.gutenberg.org/dirs/etext94/shaks12.txt )

 In some of those benchmarks, D is slightly faster. In some of them, Java 
 is a lot faster. Overall, on my machine, the D code runs in about 12.5 
 seconds, and the Java code runs in about 2.5 seconds.

 Keep in mind, all java characters are two-bytes wide. And you can't 
 access a character directly. You have to retrieve it from the String 
 object, using the charAt() method. And splitting a string creates a new 
 object for every fragment.

 I admire the goal in D to be a performance language, but it drives me 
 crazy when people use performance as justification for an inferior 
 design, when other languages that use the superior design also 
 accomplish superior performance.

I think your benchmark is not very meaningful. Without going into implementation details of Tango (because I don't use Tango) here are some notes: - The D version uses UTF8 strings whereas the Java version uses "wanna-be" UTF16 (Java has a lot of problems with surrogates). This means you are comparing apples with pears (D has to *parse* an UTF8 string and Java simply uses an wchar array without proper surrogate handling in *many* cases).

This is the whole point. The benchmark is valid because it performs the same *task*, and the task is somewhat close to real world. It measures *time*, which is universal. The compared languages use different approaches and techniques to achieve the goal, that's why benchmark is useful. It allows to justify usefulness of these languages for a particular class of tasks.
 - At least in runCharIterateTest() you also convert the D UTF8 string 
 also additionally into an UTF32 string, in the Java version you did not 
 do this.

Same as above. If they were using the same approach there wouldn't be much to benchmark. Why don't you mention, for instance, that Java is a virtual machine?
 - The StringBuilder in the Java version is *much* faster because it 
 doesn't have to allocate a new memory block in each step. You can use a 
 similar class in D too, without the need of a special string class/object.

I agree here. Both word tango.text.Util.split and runConcatenateTest use default array appending which is currently dead slow. Benji, to actually compare the speed of string operations you better use one of array builders discussed in this group.

If anyone wants to try it, I'm pasting the draft version of Appender from std.array below.

Andrei

struct Appender(A : T[], T)
{
    private T[] * pArray;
    private size_t _capacity;

    this(T[] * p)
    {
        pArray = p;
        if (!pArray) pArray = (new typeof(*pArray)[1]).ptr;
        _capacity = .capacity(pArray.ptr) / T.sizeof;
    }

    T[] data()
    {
        return pArray ? *pArray : null;
    }

    size_t capacity() const { return _capacity; }

    void write(T item)
    {
        if (!pArray) pArray = (new typeof(*pArray)[1]).ptr;
        if (pArray.length < _capacity)
        {
            // Should do in-place construction here
            pArray.ptr[pArray.length] = item;
            *pArray = pArray.ptr[0 .. pArray.length + 1];
        }
        else
        {
            // Time to reallocate, do it and cache capacity
            *pArray ~= item;
            _capacity = .capacity(pArray.ptr) / T.sizeof;
        }
    }

    static if (is(const(T) : T))
    {
        alias const(T) AcceptedElementType;
    }
    else
    {
        alias T AcceptedElementType;
    }

    void write(AcceptedElementType[] items)
    {
        for (; !items.empty(); items.next())
        {
            write(items.head());
        }
    }

    static if (is(const(T) == const(char)))
    {
        void write(in wchar wc) { assert(false); }
        void write(in wchar[] wcs) { encode!(T)(wcs, *this); }
        void write(in dchar dc) { assert(false); }
        void write(in dchar[] dcs) { encode!(T)(dcs, *this); }
    }

    void clear()
    {
        if (!pArray) return;
        pArray.length = 0;
        _capacity = .capacity(pArray.ptr) / T.sizeof;
    }
}

auto appender(T)(T[] * t)
{
    Appender!(T[]) r = Appender!(T[])(t);
    return r;
}
Oct 11 2008
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article

 If anyone wants to try it, I'm pasting the draft version of Appender
 from std.array below.
 Andrei

One comment: Part of the reason why ~= is slow is that for anything other than a tiny array, the array doesn't expand geometrically. It expands a memory page at a time. You can see this by printing the capacity of an array as it is expanded, using this trivial program:

import std.stdio, std.gc;

void main () {
    uint[] foo;
    size_t oldCapacity;
    foreach(i; 0..100_000) {
        foo ~= 1;
        if(capacity(foo.ptr) != oldCapacity) {
            writeln(capacity(foo.ptr));
            oldCapacity = capacity(foo.ptr);
        }
    }
}

This means that, even with cached capacity, appends are worse than amortized O(1). On the other hand, sometimes space is more important than speed. Maybe how this memory allocation is done could be a ctor parameter, with the default being geometric.
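The amortized-cost argument can be illustrated with a small simulation in Java. It is only a model: it assumes every capacity increase relocates the whole array (no in-place expansion), and the class name, method name and step size are made up for the sketch:

```java
// Illustrative model; GrowthDemo and copiesForAppends are hypothetical names.
final class GrowthDemo {
    // Returns how many element copies n appends cause if each reallocation
    // moves all stored elements into a new block.
    // geometric == true doubles the capacity; otherwise it grows by step slots.
    static long copiesForAppends(int n, boolean geometric, int step) {
        long copies = 0;
        int capacity = 0;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size; // relocate everything stored so far
                capacity = geometric ? Math.max(1, capacity * 2) : capacity + step;
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        int n = 1024;
        System.out.println("geometric: " + copiesForAppends(n, true, 0));
        System.out.println("fixed step of 16: " + copiesForAppends(n, false, 16));
    }
}
```

Under this model doubling keeps the total number of copies below n, so the cost per append is constant on average, while growth by a fixed increment makes the total grow quadratically with n.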
Oct 11 2008
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 
 If anyone wants to try it, I'm pasting the draft version of Appender
 from std.array below.
 Andrei

 One comment: Part of the reason why ~= is slow is that for anything other than a tiny array, the array doesn't expand geometrically. It expands a memory page at a time. You can see this by printing the capacity of an array as it is expanded, using this trivial program:
 
 import std.stdio, std.gc;
 
 void main () {
     uint[] foo;
     size_t oldCapacity;
     foreach(i; 0..100_000) {
         foo ~= 1;
         if(capacity(foo.ptr) != oldCapacity) {
             writeln(capacity(foo.ptr));
             oldCapacity = capacity(foo.ptr);
         }
     }
 }
 
 This means that, even with cached capacity, appends are worse than amortized O(1). On the other hand, sometimes space is more important than speed. Maybe how this memory allocation is done could be a ctor parameter, with the default being geometric.

That test is misleading. A while ago Walter added in-place expansion; whenever a non-in-place expansion occurs, he does grow size geometrically. Andrei
Oct 11 2008
prev sibling parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Sat, 11 Oct 2008 09:05:26 -0500,
Andrei Alexandrescu wrote:
 Sergey Gromov wrote:
 Sat, 11 Oct 2008 12:16:43 +0200,
 Sascha Katzner wrote:
 - The StringBuilder in the Java version is *much* faster because it 
 doesn't have to allocate a new memory block in each step. You can use a 
 similar class in D too, without the need of a special string class/object.

I agree here. Both word tango.text.Util.split and runConcatenateTest use default array appending which is currently dead slow. Benji, to actually compare the speed of string operations you better use one of array builders discussed in this group.

If anyone wants to try it, I'm pasting the draft version of Appender from std.array below.

I took the trouble to port your Appender to DMD 1.033 and fixed Benji's benchmark to use it. Here are my timings, best of 5 runs each:

D: 0.37s
Java: 1.05s

Here are sample logs from both:

StringTest2.exe

iterated through 3921852 characters, and found 910473 spaces in 0.051324 seconds
String.indexOf(): found 31318 instances of 'the' in 0.033293 seconds
replaced 31318 instances of 'the' with 'XXXX' in 0.085224 seconds
split text into 910474 words in 0.109966 seconds
concatenated 910474 words (with 3921853 chars) in 0.085061 seconds
overall test duration: 0.374258 seconds

java StringTest

iterated through 3921852 characters, and found 910473 spaces in 0.026583267 seconds
String.indexOf(): found 31318 instances of 'the' in 0.014550173 seconds
replaced 31318 instances of 'the' with 'XXXX' in 0.212482871 seconds
split text into 910474 words in 0.452282064 seconds
concatenated 910474 words (with 3921853 chars) in 0.220009526 seconds
overall test duration: 1.054729581 seconds

Who rules? ;D
Oct 11 2008
parent reply Benji Smith <dlanguage benjismith.net> writes:
Sergey Gromov wrote:
 I took the trouble to port your Appender to DMD 1.033 and fixed Benji's 
 benchmark to use it.  Here's my timings, best of 5 runs each:

Hey, would you mind sending me a copy of that Appender? I was about to start backporting it myself, but then I figured it'd be a better apples-to-apples comparison if I use the version you've already written. Thanks! --benji
Oct 11 2008
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Benji Smith:
 Hey, would you mind sending me a copy of that Appender? I was about to 
 start backporting it myself, but then I figured it'd be a better 
 apples-to-apples comparison if I use the version you've already written.

Where can I find the code of the program that uses Appender? I may try my ArrayBuilder, if you want... (I presume its performance isn't that far from Appender). Bye, bearophile
Oct 11 2008
parent reply Benji Smith <dlanguage benjismith.net> writes:
bearophile wrote:
 Benji Smith:
 Hey, would you mind sending me a copy of that Appender? I was about to 
 start backporting it myself, but then I figured it'd be a better 
 apples-to-apples comparison if I use the version you've already written.

Where can I find the code of the program that uses Appender? I may try my ArrayBuilder, if you want... (I presume its performance isn't that far from Appender). Bye, bearophile

Where's the code for your ArrayBuilder? I'd like to try it too. --benji
Oct 11 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
Benji Smith:
 Where's the code for your ArrayBuilder? I'd like to try it too.

Inside my libs, module appender.d, works with D1+Phobos (and it requires some other modules of the same package, probably you can't adapt it for D2 or Tango): http://www.fantascienza.net/leonardo/so/libs_d.zip I have seen the full code posted by Sergey Gromov for D1+Tango, I'll try it out and see whether I can adapt it. Bye, bearophile
Oct 11 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
Benji Smith:
 Where's the code for your ArrayBuilder? I'd like to try it too.

If you need it, I can show stand-alone code too, but it will probably be rather long, because I have to pull the parts from the other modules... Bye, bearophile
Oct 11 2008
parent reply Benji Smith <dlanguage benjismith.net> writes:
bearophile wrote:
 Benji Smith:
 Where's the code for your ArrayBuilder? I'd like to try it too.

If you need it, I can show stand-alone code too, but it will probably be rather long, because I have to pull the parts from the other modules... Bye, bearophile

I just downloaded it, and I'll take a peek tomorrow :) --benji
Oct 11 2008
parent Benji Smith <dlanguage benjismith.net> writes:
Benji Smith wrote:
 bearophile wrote:
 Benji Smith:
 Where's the code for your ArrayBuilder? I'd like to try it too.

If you need it, I can show stand-alone code too, but it will probably be rather long, because I have to pull the parts from the other modules... Bye, bearophile

I just downloaded it, and I'll take a peek tomorrow :) --benji

So I just spent some time looking through your code (especially the "func" module) and it looks like you've put together a very useful collection of idioms. Nice work! Unfortunately, I'm using Tango extensively, and you're using Phobos extensively. So there's really no way for me to use your stuff right now :( I'll keep it on my radar screen, though, for when the mythical Phobos/Tango unification takes place. --benji
Oct 12 2008
prev sibling parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Sat, 11 Oct 2008 15:00:20 -0400,
Benji Smith wrote:
 Sergey Gromov wrote:
 I took the trouble to port your Appender to DMD 1.033 and fixed Benji's 
 benchmark to use it.  Here's my timings, best of 5 runs each:

Hey, would you mind sending me a copy of that Appender? I was about to start backporting it myself, but then I figured it'd be a better apples-to-apples comparison if I use the version you've already written.

Here's everything. Sorry for a long post! :D

module d1;

template Const(T)
{
    version (D_Version2)
        mixin("alias const(T) Const;");
    else
        alias T Const;
}

template Invariant(T)
{
    version (D_Version2)
        mixin("alias invariant(T) Invariant;");
    else
        alias T Invariant;
}

module appender;

import std.gc: capacity;
import d1;

struct Appender(A : T[], T)
{
    private T[] * pArray;
    private size_t _capacity;

    version (D_Version2)
    {
        this(T[] * p)
        {
            pArray = p;
            if (!pArray) pArray = (new typeof(*pArray)[1]).ptr;
            _capacity = .capacity(pArray.ptr) / T.sizeof;
        }
    }
    else
    {
        static Appender opCall(T[] * p)
        {
            Appender r;
            r.pArray = p;
            if (!r.pArray) r.pArray = (new typeof(*r.pArray)[1]).ptr;
            r._capacity = .capacity(r.pArray.ptr) / T.sizeof;
            return r;
        }
    }

    T[] data()
    {
        return pArray ? *pArray : null;
    }

    size_t capacity() /+ const +/ { return _capacity; }

    void write(T item)
    {
        if (!pArray) pArray = (new typeof(*pArray)[1]).ptr;
        if (pArray.length < _capacity)
        {
            // Should do in-place construction here
            pArray.ptr[pArray.length] = item;
            *pArray = pArray.ptr[0 .. pArray.length + 1];
        }
        else
        {
            // Time to reallocate, do it and cache capacity
            *pArray ~= item;
            _capacity = .capacity(pArray.ptr) / T.sizeof;
        }
    }

    static if (is(Const!(T) : T))
    {
        alias Const!(T) AcceptedElementType;
    }
    else
    {
        alias T AcceptedElementType;
    }

    void write(AcceptedElementType[] items)
    {
        while (items.length)
        {
            write(items[0]);
            items = items[1..$];
        }
    }

    void clear()
    {
        if (!pArray) return;
        pArray.length = 0;
        _capacity = .capacity(pArray.ptr) / T.sizeof;
    }
}

module StringTest2;

import tango.io.FileConduit;
import tango.io.Stdout;
import tango.time.StopWatch;
import Util = tango.text.Util;
import appender;

void main()
{
    StopWatch overallWatch;
    overallWatch.start();

    StopWatch loadWatch;
    loadWatch.start();
    auto file = new FileConduit("shakespeare.txt");
    char[] text = new char[cast(size_t)file.length];
    file.input.read(text);
    Stdout.formatln("time to load file: {0:f6} seconds", loadWatch.stop());

    runCharIterateTest(text);
    runWordFindTest(text);
    runWordReplaceTest(text);
    char[][] splitWords = runWordSplitTest(text);
    runConcatenateTest(splitWords);

    Stdout.formatln("overall test duration: {0:f6} seconds", overallWatch.stop());
}

private static void runCharIterateTest(char[] text)
{
    StopWatch watch;
    watch.start();
    int spaceCount = 0;
    foreach (dchar c; text)
    {
        if (c == ' ') spaceCount++;
    }
    Stdout.formatln("iterated through {0} characters, and found {1} spaces in {2:f6} seconds",
        text.length, spaceCount, watch.stop());
}

private static void runWordFindTest(char[] text)
{
    StopWatch watch;
    watch.start();
    int wordInstanceCount = -1;
    int position = 0;
    do
    {
        wordInstanceCount++;
        position = Util.locatePattern(text, "the", position + 1);
    } while (position < text.length);
    Stdout.formatln("String.indexOf(): found {0} instances of 'the' in {1:f6} seconds",
        wordInstanceCount, watch.stop());
}

private static void runWordReplaceTest(char[] text)
{
    int oldLength = text.length;
    StopWatch watch;
    watch.start();
    char[] replaced = Util.substitute(text, "the", "XXXX");
    int newLength = replaced.length;
    int replacementCount = newLength - oldLength;
    Stdout.formatln("replaced {0} instances of 'the' with 'XXXX' in {1:f6} seconds",
        replacementCount, watch.stop());
}

private static char[][] runWordSplitTest(char[] text)
{
    StopWatch watch;
    watch.start();
    Appender!(char[][]) splitWords;
    foreach (segment; Util.patterns (text, " "))
        splitWords.write(segment);
    Stdout.formatln("split text into {0} words in {1:f6} seconds",
        splitWords.data.length, watch.stop());
    return splitWords.data;
}

private static void runConcatenateTest(char[][] splitWords)
{
    StopWatch watch;
    watch.start();
    Appender!(char[]) buffer;
    foreach (char[] word; splitWords)
    {
        buffer.write(word);
        buffer.write(" ");
    }
    Stdout.formatln("concatenated {0} words (with {1} chars) in {2:f6} seconds",
        splitWords.length, buffer.data.length, watch.stop());
}
Oct 11 2008
parent reply Benji Smith <dlanguage benjismith.net> writes:
Sergey Gromov wrote:
 Here's everything.  Sorry for a long post!  :D

Sweet. Thanks! Here are my most recent timings:

StringTest.exe
----------------------------------------
time to load file: 0.016087 seconds
iterated through 5582655 characters, and found 1293934 spaces in 0.068550 seconds
String.indexOf(): found 43889 instances of 'the' in 0.022940 seconds
replaced 43889 instances of 'the' with 'XXXX' in 0.182749 seconds
split text into 1293935 words in 8.372263 seconds
concatenated 1293935 words (with 5582656 chars) in 3.933354 seconds
overall test duration: 12.597674 seconds

StringTest2.exe
----------------------------------------
time to load file: 0.037179 seconds
iterated through 5582655 characters, and found 1293934 spaces in 0.129138 seconds
String.indexOf(): found 43889 instances of 'the' in 0.023242 seconds
replaced 43889 instances of 'the' with 'XXXX' in 0.178492 seconds
split text into 1293935 words in 0.165258 seconds
concatenated 1293935 words (with 5582656 chars) in 0.183627 seconds
overall test duration: 0.718944 seconds

StringTest.java
----------------------------------------
time to load file: 0.154415715 seconds
iterated through 5582655 characters, and found 1293934 spaces in 0.037969071 seconds
String.indexOf(): found 43889 instances of 'the' in 0.04771919 seconds
replaced 43889 instances of 'the' with 'XXXX' in 0.363044465 seconds
split text into 1293935 words in 1.008946389 seconds
concatenated 1293935 words (with 5582656 chars) in 0.28996758 seconds
overall test duration: 1.916152193 seconds

Nice work Sergey and Andrei!

Do you think the Appender functionality should be in a template, or should it be built into the array implementation?

I still think the string *design* sucks though :)

--benji
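The ratios implied by these three logs can be checked with a few lines of Java. The helper is plain arithmetic on the durations quoted above; the class and method names are illustrative only:

```java
// Illustrative sketch; SpeedupDemo and speedup are hypothetical names.
final class SpeedupDemo {
    // How many times faster the improved run is than the baseline run.
    static double speedup(double baselineSeconds, double improvedSeconds) {
        return baselineSeconds / improvedSeconds;
    }

    public static void main(String[] args) {
        double originalD = 12.597674;   // StringTest.exe, overall
        double appenderD = 0.718944;    // StringTest2.exe, overall
        double java = 1.916152193;      // StringTest.java, overall
        System.out.printf("Appender D vs original D: %.1fx%n", speedup(originalD, appenderD));
        System.out.printf("Appender D vs Java: %.1fx%n", speedup(java, appenderD));
        // The word split test is where most of the original D time went.
        System.out.printf("split test alone: %.1fx%n", speedup(8.372263, 0.165258));
    }
}
```

On these numbers the Appender-based D version is roughly 17.5 times faster overall than the original D version and roughly 2.7 times faster than the Java version.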
Oct 11 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Benji Smith wrote:
 Sergey Gromov wrote:
 Here's everything.  Sorry for a long post!  :D

Sweet. Thanks!

[snip]
 Nice work Sergey and Andrei!

Thanks for all the work to you. The results are even better than I'd expected.
 Do you think the Appender functionality should be in a template, or 
 should it be built into the array implementation?

I wrote Appender with future extensibility in mind: any collection of the future (list, deque...) can specialize Appender to append to them. Then client code can transparently use Appender!C to append to any collection C. Yum, I even used the new syntax :o).
 I still think the string *design* sucks though :)

I agree it could be done better. Andrei
Oct 11 2008
parent Christian Kamm <kamm-incasoftware removethis.de> writes:
 I wrote Appender with future extensibility in mind: any collection of
 the future (list, deque...) can specialize Appender to append to them.
 Then client code can transparently use Appender!C to append to any
 collection C. Yum, I even used the new syntax :o).

Hm, so are there plans to make template specializations in different modules work together seamlessly? At the moment:

a.d:
struct Appender(A : T[], T) {}

b.d:
struct MyArr(T) {}
struct Appender(A : MyArr!(T), T) {}

use.d:
import a;
import b;
void main() {
    Appender!(int[]) ainst;
    Appender!(MyArr!(int)) binst;
}

yields (DMD 2.018)

use.d(6): template instance Appender is not a template declaration, it is a overloadset
use.d(6): Error: Appender!(int[]) is used as a type
use.d(6): variable use.main.a voids have no value
Oct 12 2008
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Denis Koroskin wrote:
 On Sat, 11 Oct 2008 18:00:38 +0400, Andrei Alexandrescu 
 Two notes:
 1) I thought Appender would have an 'append' method as well as opCatAssign.

Appender has write because it is an output range. That way you can direct any algorithm that uses output iterators to append to an array.
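The idea that an algorithm only needs a write operation, and does not care what sits behind it, can be sketched in Java with an interface. None of the names below come from Phobos or the JDK; they are assumptions made for the illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch; all type and method names here are hypothetical.
final class SinkDemo {
    // The output contract: anything that accepts items one at a time.
    interface OutputSink<T> {
        void write(T item);
    }

    // Adapter that satisfies the contract by appending to a List.
    static final class ListAppender<T> implements OutputSink<T> {
        private final List<T> target;

        ListAppender(List<T> target) {
            this.target = target;
        }

        @Override
        public void write(T item) {
            target.add(item);
        }
    }

    // A generic algorithm that depends only on the contract, not on the container.
    static <T> void copyAll(Iterable<? extends T> source, OutputSink<? super T> sink) {
        for (T item : source) {
            sink.write(item);
        }
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<String>();
        copyAll(Arrays.asList("to", "be"), new ListAppender<String>(words));
        System.out.println(words);
    }
}
```

Because ListAppender declares that it implements OutputSink, the compiler rejects it as soon as the interface changes and the class is not updated, so the contract is stated and checked in the code rather than only in documentation.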
 2) Shouldn't the following member be called 'size'/'length' instead?
 size_t capacity() const { return pArray ? pArray.length : 0; }
 'capacity' would look like size_t capacity() const { return _capacity; }

Yah, I deleted that post and reposted without the culprit. Andrei
Oct 11 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Denis Koroskin wrote:
 On Sat, 11 Oct 2008 18:32:25 +0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Denis Koroskin wrote:
 On Sat, 11 Oct 2008 18:00:38 +0400, Andrei Alexandrescu Two notes:
 1) I thought Appender would have an 'append' method as well as 
 opCatAssign.

Appender has write because it is an output range. That way you can direct any algorithm that uses output iterators to append to an array.

BTW, I wouldn't know that Appender is a range if you didn't say it. I believe it should be specified (and enforced) somehow in the code, like 'implements the output range contract' (C++0x contracts come to mind). For example, an error could be raised if the Output Range definition is changed and Appender is not updated yet.

Walter is considering allowing structs to inherit from interfaces. Andrei
Oct 11 2008
parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Sat, 11 Oct 2008 09:55:19 -0500,
Andrei Alexandrescu wrote:
 Denis Koroskin wrote:
 On Sat, 11 Oct 2008 18:32:25 +0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Denis Koroskin wrote:
 On Sat, 11 Oct 2008 18:00:38 +0400, Andrei Alexandrescu Two notes:
 1) I thought Appender would have an 'append' method as well as 
 opCatAssign.

Appender has write because it is an output range. That way you can direct any algorithm that uses output iterators to append to an array.

BTW, I wouldn't know that Appender is a range if you didn't say it. I believe it should be specified (and enforced) somehow in the code, like 'implements the output range contract' (C++0x contracts come to mind). For example, an error could be raised if the Output Range definition is changed and Appender is not updated yet.

Walter is considering allowing structs to inherit from interfaces.

I think Appender should be a class. It's obviously non-copyable, it's not too convenient to pass structs by pointers, and I don't think you need so many appenders sitting around that their being heap objects should become a concern.
Oct 11 2008
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Sergey Gromov wrote:
 Sat, 11 Oct 2008 09:55:19 -0500,
 Andrei Alexandrescu wrote:
 Denis Koroskin wrote:
 On Sat, 11 Oct 2008 18:32:25 +0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:

 Denis Koroskin wrote:
 On Sat, 11 Oct 2008 18:00:38 +0400, Andrei Alexandrescu Two notes:
 1) I thought Appender would have an 'append' method as well as 
 opCatAssign.

Appender has write because it is an output range. That way you can direct any algorithm that uses output iterators to append to an array.

BTW, I wouldn't know that Appender is a range if you didn't say it. I believe it should be specified (and enforced) somehow in the code, like 'implements the output range contract' (C++0x contracts come to mind). For example, an error could be raised if the Output Range definition is changed and Appender is not updated yet.


I think Appender should be a class. It's obviously non-copyable, it's not too convenient to pass structs by pointers, and I don't think you need so many appenders sitting around that their being heap objects should become a concern.

I didn't want to force a dynamic allocation just to store a little structure. Andrei
Oct 11 2008
prev sibling parent reply Sascha Katzner <sorry.no spam.invalid> writes:
Sergey Gromov wrote:
 This is the whole point.  The benchmark is valid because it performs
 the same *task*, and the task is somewhat close to real world.  It
 measures *time*, which is universal.  The compared languages use
 different approaches and techniques to achieve the goal, that's why
 benchmark is useful.  It allows to justify usefulness of these
 languages for a particular class of tasks.

My point was that it is *not* the same task both programs perform. The D version has to do a lot more because it accounts for multi-byte code points in UTF8, but the Java version doesn't account for surrogate pairs. I bet if you simply scanned byte-wise through the D UTF8 array for whitespace without converting to UTF32 it would perform even better, but that wouldn't be a fair comparison either. ;-)

It's as if you removed all runtime security checks and exception code from a program and benchmarked it against the original version... it simply doesn't make much sense. ;-)
 - At least in runCharIterateTest() you also convert the D UTF8
 string also additionally into an UTF32 string, in the Java version
 you did not do this.

Same as above. If they were using the same approach there wouldn't be much to benchmark.

That's my whole point here: you can use the exact same approach in D as in Java and achieve more or less the same speed. That's the beauty of D's string approach.
 Why don't you mention, for instance, that Java is a virtual machine?

Because that is irrelevant in this context. LLAP, Sascha
Oct 11 2008
parent reply Benji Smith <dlanguage benjismith.net> writes:
Sascha Katzner wrote:
 Sergey Gromov wrote:
 This is the whole point.  The benchmark is valid because it performs
 the same *task*, and the task is somewhat close to real world.  It
 measures *time*, which is universal.  The compared languages use
 different approaches and techniques to achieve the goal, that's why
 benchmark is useful.  It allows to justify usefulness of these
 languages for a particular class of tasks.

My point was that it is *not* the same task both programs perform. The D version has to do a lot more because it accounts for multi-byte code points in UTF8, but the Java version doesn't account for surrogate pairs. I bet if you simply scanned byte-wise through the D UTF8 array for whitespace without converting to UTF32 it would perform even better, but that wouldn't be a fair comparison either. ;-)

It's as if you removed all runtime security checks and exception code from a program and benchmarked it against the original version... it simply doesn't make much sense. ;-)

And my whole point was that Java's design decision to always use two-byte characters is a superior choice, since performance is not an issue, and since having a single character type makes the programmer's life a helluva lot simpler. The D design makes things pointlessly complex, and now you want brownie points for dealing with that pointless complexity?

And, btw, you *can't* scan bytewise through a D string to find space characters, because the value '32' can occur as the least-significant-byte in a multi-byte non-whitespace character. Any code that iterates bytewise through a char[] array is fundamentally broken.

But D's strings *look* like they can be iterated byte-by-byte, because they're arrays. And all other kinds of arrays in D can be iterated that way. You can't retrieve a long value from an int array, because it doesn't make sense. And it doesn't make sense to foreach through a collection of dchars in a char[] array.

The purpose of this benchmark is not to show Java's speed advantage (because my primary concern with string processing is not speed). The purpose was to show that the speed justifications for D's wonky design are not valid.

D strings are a trainwreck not because of a few milliseconds of execution time. They're a trainwreck because they break the rules of the language.

--benji
Oct 11 2008
next sibling parent KennyTM~ <kennytm gmail.com> writes:
Benji Smith wrote:
 Sascha Katzner wrote:
 Sergey Gromov wrote:
 This is the whole point.  The benchmark is valid because it performs
 the same *task*, and the task is somewhat close to real world.  It
 measures *time*, which is universal.  The compared languages use
 different approaches and techniques to achieve the goal, that's why
 benchmark is useful.  It allows to justify usefulness of these
 languages for a particular class of tasks.

My point was that it is *not* the same task both programs perform. The D version has to do a lot more because it accounts for multi-byte code points in UTF8, but the Java version doesn't account for surrogate pairs. I bet if you simply scanned byte-wise through the D UTF8 array for whitespace without converting to UTF32 it would perform even better, but that wouldn't be a fair comparison either. ;-)

It's as if you removed all runtime security checks and exception code from a program and benchmarked it against the original version... it simply doesn't make much sense. ;-)

And my whole point was that Java's design decision to always use two-byte characters is a superior choice, since performance is not an issue, and since having a single character type makes the programmer's life a helluva lot simpler. The D design makes things pointlessly complex, and now you want brownie points for dealing with that pointless complexity?

Many C libraries work on char arrays instead of wchar_t arrays. This can be tackled by defaulting string literals to wstring, but either way, the string/wstring division cannot be lifted. I agree that wstring is easier to work with if you expect non-English text.
 And, btw, you *can't* scan bytewise through a D string to find space 
 characters, because the value '32' can occur as the 
 least-significant-byte in a multi-byte non-whitespace character. Any 
 code that iterates bytewise through a char[] array is fundamentally broken.
 
 But D's strings *look* like they can be iterated byte-by-byte, because 
 they're arrays. And all other kinds of arrays in D can be iterated that 
 way. You can't retrieve a long value from an int array, because it 
 doesn't make sense. And it doesn't make sense to foreach through a 
 collection of dchars in a char[] array.
 
 The purpose of this benchmark is not to show Java's speed advantage 
 (because my primary concern with string processing is not speed). The 
 purpose was to show that the speed justifications for D's wonky design 
 are not valid.
 
 D strings are a trainwreck not because of a few milliseconds of 
 execution time. They're a trainwreck because they break the rules of the 
 language.
 
 --benji

Oct 11 2008
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Benji Smith:
 Java's design decision to always use two-byte characters is a superior choice,<

It's a design error caused by the early adoption of unicode by Java, because unicode needs 4 bytes. So it may lead to problems. Bye, bearophile
Oct 11 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
bearophile wrote:
 Benji Smith:
 Java's design decision to always use two-byte characters is a
 superior choice,<

It's a design error caused by the early adoption of unicode by Java, because unicode needs 4 bytes. So it may lead to problems.

I agree. I find it odd that anyone finds Java's character choice superior now, when it's acknowledged it missed the mark somewhat dramatically (only a short time shy of UTF-32 adoption). Andrei
Oct 11 2008
parent reply Benji Smith <dlanguage benjismith.net> writes:
Andrei Alexandrescu wrote:
 bearophile wrote:
 Benji Smith:
 Java's design decision to always use two-byte characters is a
 superior choice,<

It's a design error caused by the early adoption of unicode by Java, because unicode needs 4 bytes. So it may lead to problems.

I agree. I find it odd that anyone finds Java's character choice superior now, when it's acknowledged it missed the mark somewhat dramatically (only a short time shy of UTF-32 adoption). Andrei

I think you make a good point. It never occurred to me before, though, because I've never actually run across it in the last eight years of Java coding. But if you think java's implementation is a design mistake, because of sloppy integration across two-byte/four-byte lines, isn't D's string design guilty of the same mistake, but also across the one-byte/two-byte line? --benji
Oct 11 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Benji Smith wrote:
 Andrei Alexandrescu wrote:
 bearophile wrote:
 Benji Smith:
 Java's design decision to always use two-byte characters is a
 superior choice,<

It's a design error caused by the early adoption of unicode by Java, because unicode needs 4 bytes. So it may lead to problems.

I agree. I find it odd that anyone finds Java's character choice superior now, when it's acknowledged it missed the mark somewhat dramatically (only a short time shy of UTF-32 adoption). Andrei

I think you make a good point. It never occurred to me before, though, because I've never actually run across it in the last eight years of Java coding.

Well, it does occur, and the fact that it occurs less frequently makes it all the more catastrophic when it hits the unprepared. A friend of mine working at Adobe told me they have had huge issues with very, very rare 4-byte surrogates occurring in otherwise tame 16-bit characters.
 But if you think java's implementation is a design mistake, because of 
 sloppy integration across two-byte/four-byte lines, isn't D's string 
 design guilty of the same mistake, but also across the one-byte/two-byte 
 line?

I don't think so, because D openly acknowledges 8/16/32-bit encodings, whereas Java only does 16 and kind of acts as if 32-bit surrogates don't exist.

I honestly think D is on the brink of receiving the best character processing abilities of all languages in existence. It has openly embraced the reality of multi-byte characters when others either try to settle on one-size-fits-all or sweep the issue under the rug. The best encoding of the day, UTF, which is there to stay, is the standard embraced by the language.

In addition, the std.encoding module (where's Janice? sigh) is very promising in that it offers open-ended support to other current and possibly future encodings. I plan to work on that at some time to make it fast, because its use of delegates is inefficient. The advent of ranges also clarifies that the right way to treat a string of any encoding as a collection of characters is a bidirectional range: you can move forward or backward, but there's no random access. Once a library type UTFRange is in place, that will work with all current and future algorithms accepting bidirectional ranges. Insertion and replacement in strings is not easy to code, but certainly doable and most of the time as efficient as for regular arrays.

Andrei
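The bidirectional-range view of strings can be sketched with the range primitives in present-day Phobos. This is an assumption-laden illustration: UTFRange from the post is not used, the code relies on the library's built-in decoding of narrow strings, and the function name reversed is made up.

```d
import std.array : array;
import std.range : retro;
import std.range.primitives : isBidirectionalRange, isRandomAccessRange;

// A UTF-8 string is treated as a bidirectional range of code points:
// front/back/popFront/popBack work, random access does not.
static assert(isBidirectionalRange!string);
static assert(!isRandomAccessRange!string);

// Hypothetical helper: reverse a string code point by code point by
// walking it backwards, which only needs a bidirectional range.
dstring reversed(string s)
{
    return s.retro.array.idup;
}

void main()
{
    // \u00F1 takes two UTF-8 bytes and \u20AC takes three, yet both
    // are handled as single elements of the range.
    assert(reversed("a\u00F1\u20AC") == "\u20AC\u00F1a"d);
}
```

Indexing such a string with s[i] still yields individual code units, which is exactly why random access is not part of the range interface here.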
Oct 11 2008
parent Benji Smith <dlanguage benjismith.net> writes:
Andrei Alexandrescu wrote:
 In addition, the std.encoding module (where's Janice? sigh) is very 
 promising in that it offers open-ended support to other current and 
 possibly future encodings. I plan to work on that at some time to make 
 it fast, because its use of delegates is inefficient. The advent of 
 ranges also clarifies that the right way to treat a string of any 
 encoding as a collection of characters is a bidirectional range: you can 
 move forward or backward, but there's no random access. Once a library 
 type UTFRange is in place, that will work with all current and future 
 algorithms accepting bidirectional ranges. Insertion and replacement in 
 strings is not easy to code, but certainly doable and most of the time 
 as efficient as for regular arrays.

Yeah, I'm really looking forward to trying out the UTFRange stuff when it becomes available. Nice work on that. --benji
Oct 11 2008
prev sibling next sibling parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Sat, 11 Oct 2008 14:46:55 -0400,
Benji Smith wrote:
 And, btw, you *can't* scan bytewise through a D string to find space 
 characters, because the value '32' can occur as the 
 least-significant-byte in a multi-byte non-whitespace character. Any 
 code that iterates bytewise through a char[] array is fundamentally broken.

You're wrong. char[] is not MBCS, it's UTF-8. In UTF-8 any byte which is part of a multi-byte sequence always has its most significant bit set. You can safely search for any ASCII in UTF-8 sequence as if it were an array of bytes.
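A small sketch of that property, using a code point whose UTF-16 low byte is 0x20 as the awkward case. The function names are invented for the example, and std.string.representation is the modern Phobos way to view a string as raw bytes.

```d
import std.string : representation;

// Count ASCII spaces by looking at raw UTF-8 code units only.
size_t countSpacesBytewise(string s)
{
    size_t n;
    foreach (ubyte b; s.representation)
        if (b == 0x20)
            ++n;
    return n;
}

// Count spaces by decoding every code point (foreach over dchar decodes).
size_t countSpacesDecoded(string s)
{
    size_t n;
    foreach (dchar c; s)
        if (c == ' ')
            ++n;
    return n;
}

void main()
{
    // U+0120 has 0x20 as its low byte in UTF-16, but its UTF-8 encoding
    // is C4 A0, so no stray 0x20 byte appears in the char[] data.
    string s = "a \u0120 b\u0120 c";
    assert(countSpacesBytewise(s) == 3);
    assert(countSpacesBytewise(s) == countSpacesDecoded(s));
}
```

The same reasoning applies to any ASCII delimiter; it does not extend to searching for non-ASCII characters one byte at a time.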
Oct 11 2008
parent Benji Smith <dlanguage benjismith.net> writes:
Sergey Gromov wrote:
 Sat, 11 Oct 2008 14:46:55 -0400,
 Benji Smith wrote:
 And, btw, you *can't* scan bytewise through a D string to find space 
 characters, because the value '32' can occur as the 
 least-significant-byte in a multi-byte non-whitespace character. Any 
 code that iterates bytewise through a char[] array is fundamentally broken.

You're wrong. char[] is not MBCS, it's UTF-8. In UTF-8 any byte which is part of a multi-byte sequence always has its most significant bit set. You can safely search for any ASCII in UTF-8 sequence as if it were an array of bytes.

Oh yeah. I totally forgot about that. Good point. --benji
Oct 11 2008
prev sibling parent reply Sascha Katzner <sorry.no spam.invalid> writes:
Benji Smith wrote:
 And, btw, you *can't* scan bytewise through a D string to find space
  characters, because the value '32' can occur as the 
 least-significant-byte in a multi-byte non-whitespace character. Any
  code that iterates bytewise through a char[] array is fundamentally
 broken.

And here you're wrong. In fact you can do this with every ASCII character, because its code point is below 128, so its most significant bit is always clear and it is always represented with only one byte. *Every* other UTF8 code point is represented by a sequence of more than one byte in which *every* byte has its most significant bit set. If you don't believe me, here is a very good documentation of the Unicode standard: http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=IWS-AppendixA#sec3

LLAP, Sascha
Oct 11 2008
parent Benji Smith <dlanguage benjismith.net> writes:
Sascha Katzner wrote:
 Benji Smith wrote:
 And, btw, you *can't* scan bytewise through a D string to find space
  characters, because the value '32' can occur as the 
 least-significant-byte in a multi-byte non-whitespace character. Any
  code that iterates bytewise through a char[] array is fundamentally
 broken.

And here you're wrong. In fact you can do this with every ASCII character, because its code point is below 128, so its most significant bit is always clear and it is always represented with only one byte. *Every* other UTF8 code point is represented by a sequence of more than one byte in which *every* byte has its most significant bit set. If you don't believe me, here is a very good documentation of the Unicode standard: http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=IWS-AppendixA#sec3

LLAP, Sascha

Yeah. I have been schooled :( --benji
Oct 11 2008
prev sibling parent Benji Smith <dlanguage benjismith.net> writes:
Sascha Katzner wrote:
 Benji Smith wrote:
 Actually, when it comes to string processing, D is decidedly *not* a 
 "performance language".

 Compared to...say...Java (which gets a bum rap around here for being 
 slow), D is nothing special when it comes to string processing speed.

 I've attached a couple of benchmarks, implemented in both Java and D 
 (the "shakespeare.txt" file I'm benchmarking against is from the 
 Gutenberg project. It's about 5 MB, and you can grab it from here: 
 http://www.gutenberg.org/dirs/etext94/shaks12.txt )

 In some of those benchmarks, D is slightly faster. In some of them, 
 Java is a lot faster. Overall, on my machine, the D code runs in about 
 12.5 seconds, and the Java code runs in about 2.5 seconds.

 Keep in mind, all java characters are two-bytes wide. And you can't 
 access a character directly. You have to retrieve it from the String 
 object, using the charAt() method. And splitting a string creates a 
 new object for every fragment.

 I admire the goal in D to be a performance language, but it drives me 
 crazy when people use performance as justification for an inferior 
 design, when other languages that use the superior design also 
 accomplish superior performance.

I think your benchmark is not very meaningful. Without going into implementation details of Tango (because I don't use Tango), here are some notes:

- The D version uses UTF8 strings whereas the Java version uses "wanna-be" UTF16 (Java has a lot of problems with surrogates). This means you are comparing apples with pears (D has to *parse* a UTF8 string and Java simply uses a wchar array without proper surrogate handling in *many* cases).
- At least in runCharIterateTest() you additionally convert the D UTF8 string into a UTF32 string; in the Java version you did not do this.
- The StringBuilder in the Java version is *much* faster because it doesn't have to allocate a new memory block in each step. You can use a similar class in D too, without the need of a special string class/object.
...

LLAP, Sascha

Nonsense! The benchmark is valid because I use the best string processing tools that each language provides. If D had anything like a StringBuilder, I would use it. If D had any way of iterating over the characters in a string without converting them to UTF-32, I'd use that too. People argue that D string processing uses these funky idioms for performance reasons, and that using a more elegant design, with objects and polymorphism would be hopelessly slow. I'm just showing that those idioms don't actually provide the performance that people claim. --benji
Oct 11 2008
prev sibling parent Robert Fraser <fraserofthenight gmail.com> writes:
Benji Smith wrote:
 Static arrays are needed for C compatibility (in extern(C) structs), 
 so they're not going anywhere.

So what? Null-terminated strings are also necessary for C compatibility, but that doesn't mean *all* strings should be null terminated.

And *all* arrays aren't static; most people use dynamic arrays exclusively except when interfacing with C code.
Oct 10 2008
prev sibling parent Jacob Carlborg <doobnet gmail.com> writes:
I thought I would share my list

1. The Tango/Phobos issue
2. Stack traced exceptions, add it in the language and the specification
3. dmd on osx (but will not happen)

Unordered:

Be able to use versions the same way you can use bools, like this:
version ((linux || darwin) && (PPC64 || X86_64))
version (!Windows)

Property shortcut:
public get set int foo;

"Reasonable varargs that don't require either horrible
platform-dependent machinations or massive code bloat."
Oct 10 2008
prev sibling parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Fri, 10 Oct 2008 08:22:10 -0500,
Andrei Alexandrescu wrote:
 Sergey Gromov wrote:
 My T[] is useful when you want to recursively split a megabyte file into 
 a couple thousands of tokens, and then modify some of those tokens.  For 
 that, your T[] must be lightweight, it must reference a bigger piece of 
 data, and it must guarantee not to write anything into memory outside 
 its boundaries.
 
 The Array is for appending.  It must always own its memory.  Therefore 
 you should be able to pass it around by reference, so Array is a *class* 
 and cannot be nearly as lightweight as T[].
 
 You see, many of their properties are orthogonal.  If you drop one, you 
 lose flexibility.
 
 Besides, Array!(T) is not a good name for a built-in type.

Names are placeholders here, not an actual proposal.

What's wrong with making Array a library type?

Well, I'd like
  new Object[15];
to be immediately appendable and therefore a syntactic sugar for
  new Array!(Object)(15);
I'd also like
  "foo" ~ text ~ "bar"
to become something like
  (new Array!(char)) ~= "foo" ~= text ~= "bar"
that is what Java does to string concatenation. Sugar doesn't seem to couple well with a purely library type.

Well, the latter is probably too complex and can cause major problems. But new T[] should return something appendable.
Oct 10 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Sergey Gromov wrote:
 Fri, 10 Oct 2008 08:22:10 -0500,
 Andrei Alexandrescu wrote:
 Sergey Gromov wrote:
 My T[] is useful when you want to recursively split a megabyte file into 
 a couple thousands of tokens, and then modify some of those tokens.  For 
 that, your T[] must be lightweight, it must reference a bigger piece of 
 data, and it must guarantee not to write anything into memory outside 
 its boundaries.

 The Array is for appending.  It must always own its memory.  Therefore 
 you should be able to pass it around by reference, so Array is a *class* 
 and cannot be nearly as lightweight as T[].

 You see, many of their properties are orthogonal.  If you drop one, you 
 lose flexibility.

 Besides, Array!(T) is not a good name for a built-in type.



Well, I'd like new Object[15]; to be immediately appendable and therefore a syntactic sugar for new Array!(Object)(15);

I have a nagging impression the syntax Array!(Object) strikes you as hard on the hand and the eyes... Anyhow the syntax new Object[15] is idiotic because Object[15] is a type in itself. The syntax makes it next to impossible to actually generate a fixed-sized array dynamically. In fact here's a challenge for you. Please generate a pointer to an Object[15] using new.
 I'd also like
   "foo" ~ text ~ "bar"
 to become something like
   (new Array!(char)) ~= "foo" ~= text ~= "bar"
 that is what Java does to string concatenation.  Sugar doesn't seem to 
 couple well with a purely library type.
 
 Well, the latter is probably too complex and can cause major problems.  
 But new T[] should return something appendable.

I'm not 100% sure about that. Andrei
Oct 10 2008
next sibling parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Fri, 10 Oct 2008 19:10:19 +0400,
Denis Koroskin wrote:
 On Fri, 10 Oct 2008 18:33:43 +0400, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Sergey Gromov wrote:
 I'd also like
   "foo" ~ text ~ "bar"
 to become something like
   (new Array!(char)) ~= "foo" ~= text ~= "bar"
 that is what Java does to string concatenation.  Sugar doesn't seem to  
 couple well with a purely library type.


I doubt that. Java strings are immutable, their length can't be changed either (i.e. they are a direct analog of a proposed invariant(char)[]). Concatenating Java strings doesn't turn them into an array, and it is dead slow. Every sane person uses StringBuilder instead.

Java compiler substitutes
  "foo" + text + "bar"
with
  new StringBuilder().append("foo").append(text).append("bar").toString()
all by itself because both String and StringBuilder are built-in classes.
Oct 10 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Denis Koroskin wrote:
 On Fri, 10 Oct 2008 19:19:45 +0400, Sergey Gromov 
 <snake.scaly gmail.com> wrote:
 
 Fri, 10 Oct 2008 19:10:19 +0400,
 Denis Koroskin wrote:
 On Fri, 10 Oct 2008 18:33:43 +0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Sergey Gromov wrote:
 I'd also like
   "foo" ~ text ~ "bar"
 to become something like
   (new Array!(char)) ~= "foo" ~= text ~= "bar"
 that is what Java does to string concatenation.  Sugar doesn't seem to 
 couple well with a purely library type.


I doubt that. Java strings are immutable, their length can't be changed either (i.e. they are a direct analog of a proposed invariant(char)[]). Concatenating Java strings doesn't turn them into an array, and it is dead slow. Every sane person uses StringBuilder instead.

Java compiler substitutes
  "foo" + text + "bar"
with
  new StringBuilder().append("foo").append(text).append("bar").toString()
all by itself because both String and StringBuilder are built-in classes.

Yes, you are right. From The Java Language Specification, 3rd edition:
 To increase the performance of repeated string concatenation,
 a Java compiler may use the StringBuffer class or a similar technique
 to reduce the number of intermediate String objects that are created
 by evaluation of an expression.


But that doesn't require StringBuffer to be part of the language. With literals, the compiler can do whatever magic it wants. Andrei
Oct 10 2008
parent Sergey Gromov <snake.scaly gmail.com> writes:
Fri, 10 Oct 2008 10:37:28 -0500,
Andrei Alexandrescu wrote:
 Denis Koroskin wrote:
 On Fri, 10 Oct 2008 19:19:45 +0400, Sergey Gromov 
 <snake.scaly gmail.com> wrote:
 
 Fri, 10 Oct 2008 19:10:19 +0400,
 Denis Koroskin wrote:
 On Fri, 10 Oct 2008 18:33:43 +0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Sergey Gromov wrote:
 I'd also like
   "foo" ~ text ~ "bar"
 to become something like
   (new Array!(char)) ~= "foo" ~= text ~= "bar"
 that is what Java does to string concatenation.  Sugar doesn't seem to 
 couple well with a purely library type.


I doubt that. Java strings are immutable, their length can't be changed either (i.e. they are a direct analog of a proposed invariant(char)[]). Concatenating Java strings doesn't turn them into an array, and it is dead slow. Every sane person uses StringBuilder instead.

Java compiler substitutes
  "foo" + text + "bar"
with
  new StringBuilder().append("foo").append(text).append("bar").toString()
all by itself because both String and StringBuilder are built-in classes.

Yes, you are right. From The Java Language Specification, 3rd edition:
 To increase the performance of repeated string concatenation,
 a Java compiler may use the StringBuffer class or a similar technique
 to reduce the number of intermediate String objects that are created
 by evaluation of an expression.


But that doesn't require StringBuffer to be part of the language. With literals, the compiler can do whatever magic it wants.

They're not literals. They're String objects. Try this:

  public class test {
      public static void main(String[] args) {
          String s = "foo" + args[0] + "bar" + args[1];
      }
  }

  $ javac test.java
  $ javap -c test

Though you are right in the sense that the result of this expression is still String; StringBuilder can be replaced with anything else or not used at all.
Oct 10 2008
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Andrei Alexandrescu" wrote
 Sergey Gromov wrote:
 Fri, 10 Oct 2008 08:22:10 -0500,
 Andrei Alexandrescu wrote:
 Sergey Gromov wrote:
 My T[] is useful when you want to recursively split a megabyte file 
 into a couple thousands of tokens, and then modify some of those 
 tokens.  For that, your T[] must be lightweight, it must reference a 
 bigger piece of data, and it must guarantee not to write anything into 
 memory outside its boundaries.

 The Array is for appending.  It must always own its memory.  Therefore 
 you should be able to pass it around by reference, so Array is a 
 *class* and cannot be nearly as lightweight as T[].

 You see, many of their properties are orthogonal.  If you drop one, you 
 lose flexibility.

 Besides, Array!(T) is not a good name for a built-in type.



Well, I'd like new Object[15]; to be immediately appendable and therefore a syntactic sugar for new Array!(Object)(15);

I have a nagging impression the syntax Array!(Object) strikes you as hard on the hand and the eyes... Anyhow the syntax new Object[15] is idiotic because Object[15] is a type in itself. The syntax makes it next to impossible to actually generate a fixed-sized array dynamically. In fact here's a challenge for you. Please generate a pointer to an Object[15] using new.

  struct StaticArray(T, int x)
  {
      T[x] arr;
  }

  Object[15] * x = &((new StaticArray!(Object, 15)).arr);

lovely, no ;)

I'd probably make a static function in the template to do it, if it was really important to do that... There may be a way to do it nicer with a template alias, but I'm not sure it will work.

-Steve
Oct 10 2008
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 "Andrei Alexandrescu" wrote
 Sergey Gromov wrote:
 Fri, 10 Oct 2008 08:22:10 -0500,
 Andrei Alexandrescu wrote:
 Sergey Gromov wrote:
 My T[] is useful when you want to recursively split a megabyte file 
 into a couple thousands of tokens, and then modify some of those 
 tokens.  For that, your T[] must be lightweight, it must reference a 
 bigger piece of data, and it must guarantee not to write anything into 
 memory outside its boundaries.

 The Array is for appending.  It must always own its memory.  Therefore 
 you should be able to pass it around by reference, so Array is a 
 *class* and cannot be nearly as lightweight as T[].

 You see, many of their properties are orthogonal.  If you drop one, you 
 lose flexibility.

 Besides, Array!(T) is not a good name for a built-in type.



new Object[15]; to be immediately appendable and therefore a syntactic sugar for new Array!(Object)(15);

on the hand and the eyes... Anyhow the syntax new Object[15] is idiotic because Object[15] is a type in itself. The syntax makes it next to impossible to actually generate a fixed-sized array dynamically. In fact here's a challenge for you. Please generate a pointer to an Object[15] using new.

struct StaticArray(T, int x)
{
    T[x] arr;
}

Object[15] * x = &((new StaticArray!(Object, 15)).arr);

lovely, no ;)

I'd probably make a static function in the template to do it, if it was really important to do that... There may be a way to do it nicer with a template alias, but I'm not sure it will work.

auto a = (new Object[5][1]).ptr;

The new operator is the plague.

Andrei
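For anyone who wants to check the trick mechanically, here is a minimal sketch for the Object[15] case from the challenge. It assumes a current D2 compiler; the helper name makeFixed is made up for illustration and appears nowhere in the thread.

```d
// Allocate a single Object[15] on the GC heap and return a pointer to it.
// `new Object[15][1]` creates a dynamic array holding one Object[15];
// .ptr then points at that one element.
Object[15]* makeFixed()
{
    return (new Object[15][1]).ptr;
}

void main()
{
    auto p = makeFixed();
    static assert(is(typeof(p) == Object[15]*));
    assert((*p).length == 15);
    assert((*p)[0] is null);   // class references default to null
}
```

The static assert is the interesting line: the pointer really has the fixed-size array type, even though the allocation went through the dynamic-array form of new.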
Oct 10 2008
prev sibling parent Sergey Gromov <snake.scaly gmail.com> writes:
Fri, 10 Oct 2008 09:33:43 -0500,
Andrei Alexandrescu wrote:
 Sergey Gromov wrote:
 Fri, 10 Oct 2008 08:22:10 -0500,
 Andrei Alexandrescu wrote:
 Sergey Gromov wrote:
 My T[] is useful when you want to recursively split a megabyte file into 
 a couple thousands of tokens, and then modify some of those tokens.  For 
 that, your T[] must be lightweight, it must reference a bigger piece of 
 data, and it must guarantee not to write anything into memory outside 
 its boundaries.

 The Array is for appending.  It must always own its memory.  Therefore 
 you should be able to pass it around by reference, so Array is a *class* 
 and cannot be nearly as lightweight as T[].

 You see, many of their properties are orthogonal.  If you drop one, you 
 lose flexibility.

 Besides, Array!(T) is not a good name for a built-in type.



Well, I'd like new Object[15]; to be immediately appendable and therefore a syntactic sugar for new Array!(Object)(15);

I have a nagging impression the syntax Array!(Object) strikes you as hard on the hand and the eyes...

Yes you are right. If I had to write "new AA!(int, string)" instead of simply "int[string]" I think I wouldn't even bother learning D.
 Anyhow the syntax new Object[15] is idiotic because Object[15] is a type 
 in itself. The syntax makes it next to impossible to actually generate a 
 fixed-sized array dynamically.
 
 In fact here's a challenge for you. Please generate a pointer to an 
 Object[15] using new.

alias Object[15] Type;
pragma(msg, typeof(new Type).stringof);
 test.d(7): Error: new can only create structs, dynamic arrays or class
objects, not Object[15u]'s
 Object[15u]*

:-D
 
 I'd also like
   "foo" ~ text ~ "bar"
 to become something like
   (new Array!(char)) ~= "foo" ~= text ~= "bar"
 that is what Java does to string concatenation.  Sugar doesn't seem to 
 couple well with a purely library type.
 
 Well, the latter is probably too complex and can cause major problems.  
 But new T[] should return something appendable.

I'm not 100% sure about that.

I'm not sure either, in part because of implicit type juggling.
Oct 10 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Sat, 11 Oct 2008 18:32:25 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Denis Koroskin wrote:
 On Sat, 11 Oct 2008 18:00:38 +0400, Andrei Alexandrescu Two notes:
 1) I thought Appender would have an 'append' method as well as  
 opCatAssign.

Appender has write because it is an output range. That way you can direct any algorithm that uses output iterators to append to an array.

Yes, but ~= is A LOT more handy. Why not provide both?
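To illustrate what "both" could look like: a small sketch against present-day Phobos, where (as far as I know) the output-range primitive is spelled put rather than write, and Appender also accepts ~=. The function name squares is only an example and not part of any library.

```d
import std.array : appender;

// Collect the squares of 1..n using both spellings of "append".
int[] squares(int n)
{
    auto app = appender!(int[])();
    foreach (i; 1 .. n + 1)
    {
        if (i % 2)
            app.put(i * i);   // output-range style
        else
            app ~= i * i;     // operator style
    }
    return app.data;          // the array built so far
}

void main()
{
    assert(squares(4) == [1, 4, 9, 16]);
}
```

The put form is what generic algorithms would call; the ~= form is the convenience asked for above.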
Oct 11 2008
prev sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Sat, 11 Oct 2008 18:32:25 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Denis Koroskin wrote:
 On Sat, 11 Oct 2008 18:00:38 +0400, Andrei Alexandrescu Two notes:
 1) I thought Appender would have an 'append' method as well as  
 opCatAssign.

Appender has write because it is an output range. That way you can direct any algorithm that uses output iterators to append to an array.

BTW, I wouldn't know that Appender is a range if you didn't say it. I believe it should be specified (and enforced) somehow in the code, like 'implements the output range contract' (C++0x contracts come to mind). For example, an error could be raised if the Output Range definition is changed and Appender is not updated yet.
Oct 11 2008
prev sibling parent reply Christopher Wright <dhasenan gmail.com> writes:
Sergey Gromov wrote:
 Wed, 08 Oct 2008 15:07:27 -0500,
 Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?

I'd generally like less syntactic freedom in D. Excessive amounts of syntactic sugar cause troubles in many different places. Economy of syntax principle has a flip side: a single mistake can make your program mean a whole different thing instead of simply giving you a syntax error. Redundancy is required to detect an error and suggest a fix.

1. Drop the "array is a slice" idea. I liked it a lot when I saw the D specs but it bit me immediately while I implemented a text parser. The idea sounds nice but it causes too many problems.

There are two major use cases for arrays: building and parsing. When building, you want fast append, you've got a limited number of arrays, and you likely don't care if your arrays are a little fat. When parsing, you want data consistency, that is you don't want a sub-array to change if you occasionally append to another sub-array. You also want to pass those sub-arrays around a lot and you probably want them to be cheap.

Catenating an array to a slice shouldn't do it in place unless the slice includes the end of the original array. If it does, that's a bug. And parsers generally don't need to modify the input. I'm rather curious as to why yours was.
 An array should be a fat object optimized for fast appending.  It should 
 be a class, because you want to pass it into functions and store 
 references to it while keeping it a distinct object.  Call it Array!(T) 
 or List or Vector or whatever.  It should be implicitly convertable to a 
 slice type T[] which can stay what it is now.  It should be built-in 
 because array new, .dup and slice ~ should return a new array instance.
 
 It'll probably break lots of generic code.
 
 2.  Drop the "call a function however you want" idea.  Simple property 
 method syntax is nice, but excessive freedom is not.  Adding one simple 
 keyword fixes the situation.  See
 http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D.announce&artnum=13471

It'll probably break lots of regular code.

3. Same concern but little experience. Never bit me personally. But probably the delegate literal syntax should be dropped in favour of a dedicated, distinct lambda expression syntax.

4. Fix the const system. It scares people. It scared me off D2 for a while even though I liked D2 and I liked the *idea* of a const system. There are two items on my wish-list:

a) const-transparent and const-inheriting methods, so that you don't have to write the same method thrice as you do in C++ to grant const-correctness:

    class A {
        Object o;
        constof!(o)(Object) get() { return o; }
    }
    invariant(A) a;
    typeof(a.get()); // invariant(Object)

b) unique return values that can be cast to any of mutable, const and invariant. The proposal is to have a "unique" keyword and a unique constness type. Only rvalue can be unique. Unique can be returned. Unique casts implicitly to any other constness type. Nothing implicitly casts to unique. You can explicitly cast to unique. New, malloc, .deepdup return unique. .dup returns unique if hasNoPointers.

Oct 10 2008
next sibling parent Sergey Gromov <snake.scaly gmail.com> writes:
Fri, 10 Oct 2008 08:08:11 -0400,
Christopher Wright wrote:
 Catenating an array to a slice shouldn't do it in place unless the slice 
 includes the end of the original array. If it does, that's a bug.

Only the array knows where its end is. That's the point.
 And parsers generally don't need to modify the input. I'm rather curious 
 as to why yours was.

It was a sort of refactoring tool. It automated conversion of Java sources into C++ (J2ME -> BREW actually). It basically moved things around and replaced patterns but tried to keep the original formatting as much as possible.
Oct 10 2008
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Christopher Wright" wrote
 Sergey Gromov wrote:
 Wed, 08 Oct 2008 15:07:27 -0500,
 Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?

I'd generally like less syntactic freedom in D. Excessive amounts of syntactic sugar cause troubles in many different places. Economy of syntax principle has a flip side: a single mistake can make your program mean a whole different thing instead of simply giving you a syntax error. Redundancy is required to detect an error and suggest a fix.

1. Drop the "array is a slice" idea. I liked it a lot when I saw the D specs but it bit me immediately while I implemented a text parser. The idea sounds nice but it causes too many problems.

There are two major use cases for arrays: building and parsing. When building, you want fast append, you've got a limited number of arrays, and you likely don't care if your arrays are a little fat. When parsing, you want data consistency, that is you don't want a sub-array to change if you occasionally append to another sub-array. You also want to pass those sub-arrays around a lot and you probably want them to be cheap.

Catenating an array to a slice shouldn't do it in place unless the slice includes the end of the original array. If it does, that's a bug.

You'd think so, but you have it exactly backwards. In the current implementation a slice will append in place only if it contains the BEGINNING of an allocated array. I wrote a proposal a while back to do exactly what you said, and it can be done with minimal overhead in the GC. But it has been ignored AFAIK. -Steve
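To make the case under discussion concrete, here is a small sketch of appending to a slice that holds the beginning of a larger array. The function name appendToPrefix is made up for illustration. The assertion reflects the behaviour of current D runtimes, which as far as I know reallocate in this situation; on the 2008 implementation described above, the same append could instead happen in place and overwrite the third element of the original.

```d
// Append to a slice that starts at the beginning of a larger array.
// Returns the original array so the caller can inspect it afterwards.
int[] appendToPrefix()
{
    int[] whole = [1, 2, 3, 4];
    int[] head = whole[0 .. 2];
    head ~= 99;          // on a current runtime this reallocates head
    return whole;
}

void main()
{
    // whole[2] was not overwritten by the append to head
    assert(appendToPrefix() == [1, 2, 3, 4]);
}
```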
Oct 10 2008
prev sibling next sibling parent reply Gide Nwawudu <gide btinternet.com> writes:
On Wed, 08 Oct 2008 15:07:27 -0500, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

Ok, per Aarti's suggestion: without speaking officially for Walter, let 
me ask this - what do you think are the top issues you'd like to see 
fixed in D?

Andrei

1) Tango/Phobos compatibility.

2) Finalise const/invariant stuff and change manifest const from 'enum' to 'define' (or whatever).

define {
	double PI = 3.14;
	string author = "Walter";
}
define enum Direction { North, South, East, West };

The other things are mainly bug fixes but some of them have been around for ages, and make D look like it is not as polished as it needs to be. I was watching Top Gear (a motoring program on the BBC) and they were road testing an Audi, an Alfa Romeo and a Mazda. The Alfa was preferred by the presenters. I've got the feeling the D language might be the Alfa of the programming world, that is: it looks great, works fine *most* of the time and is good fun, but for some reason you would not recommend it to a friend/colleague.

3) Fix all forward reference bugs.
http://d.puremagic.com/issues/show_bug.cgi?id=340

4) Import bugs, and others in Bugzilla.
http://d.puremagic.com/issues/show_bug.cgi?id=314
http://d.puremagic.com/issues/show_bug.cgi?id=1238

5) Fix .length bug on associative arrays.
http://d.puremagic.com/issues/show_bug.cgi?id=929

Gide
Oct 09 2008
parent reply Benji Smith <dlanguage benjismith.net> writes:
Gide Nwawudu wrote:
 2) Finalise const/invariant stuff and change manifest const from
 'enum' to 'define' (or whatever).
 
 define {
 	double PI = 3.14;
 	string author = "Walter";
 }
 define enum Direction { North, South, East, West };

I've never quite understood what people are talking about when they refer to a "manifest" constant. What does that mean?

And why do we need any special keyword? What does the "define" keyword give you that an ordinary variable declaration doesn't? Why not just write the code from above like this:

    double PI = 3.14;
    string author = "Walter";
    enum Direction { North, South, East, West };

What am I missing here?

--benji
Oct 09 2008
next sibling parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Thu, 09 Oct 2008 09:07:19 -0400,
Benji Smith wrote:
 I've never quite understood what people are talking about when they 
 refer to a "manifest" constant. What does that mean?
 
 And why do we need any special keyword? What does the "define" keyword 
 give you that an ordinary variable declaration doesn't? Why not just 
 write the code from above like this:
 
    double PI = 3.14;
    string author = "Walter";
    enum Direction { North, South, East, West };
 
 What am I missing here?

Your "PI" and "author" cannot be optimized because they're public and mutable, so every time you use PI in your code the compiler must access a variable just in case some other module changed its value to 180. The value of "North", on the other hand, can never change, so it can take part in constant folding etc. You "manifest" an identity between the name "North" and the number 0.

The closest to a manifest constant would be

    invariant double PI = 3.14;
    invariant string author = "Walter";

I think it even works in D2. I don't know why enums were introduced for declaring constants.
Oct 09 2008
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Sergey Gromov" wrote
 Thu, 09 Oct 2008 09:07:19 -0400,
 Benji Smith wrote:
 I've never quite understood what people are talking about when they
 refer to a "manifest" constant. What does that mean?

 And why do we need any special keyword? What does the "define" keyword
 give you that an ordinary variable declaration doesn't? Why not just
 write the code from above like this:

    double PI = 3.14;
    string author = "Walter";
    enum Direction { North, South, East, West };

 What am I missing here?

Your "PI" and "author" cannot be optimized because they're public and mutable, so every time you use PI in your code compiler must access a variable just in case some other module changed its value to 180. Value of "North" on the other hand can never change so it can take part in constant folding etc. You "manifest" an identity between name "North" and a number 0. The closest to a manifest constant would be invariant double PI = 3.14; invariant string author = "Walter"; I think it even works in D2. I don't know why enum were introduced for declaring constants.

You can still take the address of invariant variables, and since the compiler never knows who will use them, it will reserve space for them in your executable, even if they aren't used. -Steve
Oct 09 2008
prev sibling parent reply Benji Smith <dlanguage benjismith.net> writes:
Sergey Gromov wrote:
 Thu, 09 Oct 2008 09:07:19 -0400,
 Benji Smith wrote:
 I've never quite understood what people are talking about when they 
 refer to a "manifest" constant. What does that mean?

 And why do we need any special keyword? What does the "define" keyword 
 give you that an ordinary variable declaration doesn't? Why not just 
 write the code from above like this:

    double PI = 3.14;
    string author = "Walter";
    enum Direction { North, South, East, West };

 What am I missing here?

Your "PI" and "author" cannot be optimized because they're public and mutable, so every time you use PI in your code compiler must access a variable just in case some other module changed its value to 180. Value of "North" on the other hand can never change so it can take part in constant folding etc. You "manifest" an identity between name "North" and a number 0. The closest to a manifest constant would be invariant double PI = 3.14; invariant string author = "Walter"; I think it even works in D2. I don't know why enum were introduced for declaring constants.

Oops. Silly me. I meant to declare those variables as const (or invariant, like in your example).

I understand the optimization benefits of const values. I've just never understood what people are talking about with "manifest" constants, and why they deserve to have some special keyword.

I take it, from your example code, that you agree with me that there's no need for a "define" or "manifest" keyword...

--benji
Oct 09 2008
next sibling parent reply Julio César Carrascal Urquijo <jcarrascal gmail.com> writes:
Hello Benji,

 I understand the optimization benefits of const values. I've just never
 understood what people are talking about with "manifest" constants,
 and why they deserve to have some special keyword.
 
 I take it, from your example code, that you agree with me that there's
 no need for a "define" or "manifest" keyword...
 
 --benji
 

Someone correct me if I'm wrong. Manifest constants grew out of the necessity of one of the winapi projects. There are lots of constants in the win32 api and they were taking lots of space in the resulting binary. Most projects opted to declare constants using enum because they got replaced by the value at compile time.

For a short period there was even a manifest keyword implemented in DMD but it got removed right before 1.0.
Oct 09 2008
parent Don <nospam nospam.com.au> writes:
Julio César Carrascal Urquijo wrote:
 Hello Benji,
 
 I understand the optimization benefits of const values. I've just never
 understood what people are talking about with "manifest" constants,
 and why they deserve to have some special keyword.

 I take it, from your example code, that you agree with me that there's
 no need for a "define" or "manifest" keyword...

 --benji

Someone correct me if I'm wrong. Manifest constants grew out of the necessity of one of the winapi projects. There are lots of constants in the win32 api and they were taking lots of space in the resulting binary. Most projects opted to declare constants using enum because they got replaced by the value at compile time.

It was requested by me. The motivation was a bit different. Template metaprogramming is slow to compile mainly because of all the constants which need to be written into the object files -- they can easily be a few megabytes in size. But manifest constants exist only at compile time, so they don't need to be written into the obj file. In fact, they could rapidly be discarded from the symbol table in most cases. I observed that in D1.0, it's impossible to take the address of a constant, so it was pointless having them in the binary. I proposed that they should not be stored in the binary any more, to reduce exe sizes and speed compilation. But Walter said that the behaviour was a bug in DMD. enum for manifest constants provides the functionality which in D1.0 was provided by the bug.
Oct 14 2008
prev sibling parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Thu, Oct 9, 2008 at 5:19 PM, Julio César Carrascal Urquijo
<jcarrascal gmail.com> wrote:
 Hello Benji,

 Someone correct me if I'm wrong.
 Manifest constants grew out of the necessity of one of the winapi projects.
 There are lots of constants in the win32 api and they were taking lots of
 space in the resulting binary. Most projects opted to declare constants
 using enum because they got replaced by the value at compile time.

It wasn't just the win32 API project.
 For a short period there was even a manifest keyword implemented in DMD but
 it got removed right before 1.0.

It was never in D 1, only in D 2 and only for a short time in the Phobos Subversion repository.
Oct 09 2008
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Benji Smith" wrote
 Gide Nwawudu wrote:
 2) Finalise const/invariant stuff and change manifest const from
 'enum' to 'define' (or whatever).

 define {
 double PI = 3.14;
 string author = "Walter";
 }
 define enum Direction { North, South, East, West };

I've never quite understood what people are talking about when they refer to a "manifest" constant. What does that mean? And why do we need any special keyword? What does the "define" keyword give you that an ordinary variable declaration doesn't? Why not just write the code from above like this: double PI = 3.14; string author = "Walter"; enum Direction { North, South, East, West }; What am I missing here?

You cannot take the address of a manifest constant, and it doesn't live in a static data space or in memory anywhere. Instead it is built directly into the code. So when you say

    define double PI = 3.14;

And then use PI:

    auto x = PI;

This generates code that declares a variable x, then assigns it to 3.14. The PI symbol isn't stored in the final code.

This has a huge benefit when you are declaring lots of constants, but only few will be used. You don't have to pay the penalty of storing all the constants in your code, only the ones you use, and only where you use them.

-Steve
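The address-of distinction can be checked at compile time. A minimal sketch, assuming a current D2 compiler and using the shipped 'enum' spelling rather than the proposed 'define'; the names E and circleArea are only illustrative.

```d
enum double PI = 3.14;        // manifest constant: no storage, no address
immutable double E = 2.71;    // ordinary immutable: lives in static data

// Taking the address of a manifest constant does not compile ...
static assert(!__traits(compiles, &PI));
// ... while the immutable variable has one.
static assert(__traits(compiles, &E));

double circleArea(double r)
{
    return PI * r * r;        // PI is substituted as a literal here
}

void main()
{
    auto p = &E;              // fine: E occupies memory
    assert(p !is null);
    auto a = circleArea(2.0);
    assert(a > 12.5 && a < 12.6);
}
```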
Oct 09 2008
next sibling parent reply Benji Smith <dlanguage benjismith.net> writes:
Steven Schveighoffer wrote:
 "Benji Smith" wrote
 Gide Nwawudu wrote:
 2) Finalise const/invariant stuff and change manifest const from
 'enum' to 'define' (or whatever).

 define {
 double PI = 3.14;
 string author = "Walter";
 }
 define enum Direction { North, South, East, West };

to a "manifest" constant. What does that mean? And why do we need any special keyword? What does the "define" keyword give you that an ordinary variable declaration doesn't? Why not just write the code from above like this: double PI = 3.14; string author = "Walter"; enum Direction { North, South, East, West }; What am I missing here?

You cannot take the address of a manifest constant, and it doesn't live in a static data space or in memory anywhere. Instead it is built directly into the code. So when you say define double PI = 3.14; And then use PI: auto x = PI; This generates code that declares a variable x, then assigns it to 3.14. the PI symbol isn't stored in the final code. This has a huge benefit when you are declaring lots of constants, but only few will be used. You don't have to pay the penalty of storing all the constants in your code, only the ones you use, and only where you use them. -Steve

Okay. That makes sense. I was just working with the tango.sys.win32.Types module yesterday, and it uses "enum" to declare something like five or six thousand different named constants, so I can see where that kind of thing would be helpful. --benji
Oct 09 2008
parent reply Yigal Chripun <yigal100 gmail.com> writes:
Benji Smith wrote:
 Steven Schveighoffer wrote:
 "Benji Smith" wrote
 Gide Nwawudu wrote:
 2) Finalise const/invariant stuff and change manifest const from
 'enum' to 'define' (or whatever).

 define {
 double PI = 3.14;
 string author = "Walter";
 }
 define enum Direction { North, South, East, West };

refer to a "manifest" constant. What does that mean? And why do we need any special keyword? What does the "define" keyword give you that an ordinary variable declaration doesn't? Why not just write the code from above like this: double PI = 3.14; string author = "Walter"; enum Direction { North, South, East, West }; What am I missing here?

You cannot take the address of a manifest constant, and it doesn't live in a static data space or in memory anywhere. Instead it is built directly into the code. So when you say define double PI = 3.14; And then use PI: auto x = PI; This generates code that declares a variable x, then assigns it to 3.14. the PI symbol isn't stored in the final code. This has a huge benefit when you are declaring lots of constants, but only few will be used. You don't have to pay the penalty of storing all the constants in your code, only the ones you use, and only where you use them. -Steve

Okay. That makes sense. I was just working with the tango.sys.win32.Types module yesterday, and it uses "enum" to declare something like five or six thousand different named constants, so I can see where that kind of thing would be helpful. --benji

instead of having a special syntax for manifest constants, they could be part of AST macros, or templates, or other compile time feature of D. also, can't we use: "invariant PI = 3.14;" and have the linker optimize this when necessary?
Oct 09 2008
parent Yigal Chripun <yigal100 gmail.com> writes:
Yigal Chripun wrote:
 Benji Smith wrote:
 Steven Schveighoffer wrote:
 "Benji Smith" wrote
 Gide Nwawudu wrote:
 2) Finalise const/invariant stuff and change manifest const from
 'enum' to 'define' (or whatever).

 define {
 double PI = 3.14;
 string author = "Walter";
 }
 define enum Direction { North, South, East, West };

refer to a "manifest" constant. What does that mean? And why do we need any special keyword? What does the "define" keyword give you that an ordinary variable declaration doesn't? Why not just write the code from above like this: double PI = 3.14; string author = "Walter"; enum Direction { North, South, East, West }; What am I missing here?

live in a static data space or in memory anywhere. Instead it is built directly into the code. So when you say define double PI = 3.14; And then use PI: auto x = PI; This generates code that declares a variable x, then assigns it to 3.14. the PI symbol isn't stored in the final code. This has a huge benefit when you are declaring lots of constants, but only few will be used. You don't have to pay the penalty of storing all the constants in your code, only the ones you use, and only where you use them. -Steve

I was just working with the tango.sys.win32.Types module yesterday, and it uses "enum" to declare something like five or six thousand different named constants, so I can see where that kind of thing would be helpful. --benji

instead of having a special syntax for manifest constants, they could be part of AST macros, or templates, or other compile time feature of D. also, can't we use: "invariant PI = 3.14;" and have the linker optimize this when necessary?

never mind my post, Don already raised the linker issue.
Oct 09 2008
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 You cannot take the address of a manifest constant, and it doesn't live in a 
 static data space or in memory anywhere.  Instead it is built directly into 
 the code.

Let me add something completely unrelated. You cannot take the address of an enum value, and it doesn't live in a static data space or in memory anywhere. Instead it is built directly into the code. Andrei Disclaimer: all resemblance with existing or suggested features is purely coincidental.
Oct 09 2008
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Andrei Alexandrescu" wrote
 Steven Schveighoffer wrote:
 You cannot take the address of a manifest constant, and it doesn't live 
 in a static data space or in memory anywhere.  Instead it is built 
 directly into the code.

Let me add something completely unrelated. You cannot take the address of an enum value, and it doesn't live in a static data space or in memory anywhere. Instead it is built directly into the code. Andrei Disclaimer: all resemblance with existing or suggested features is purely coincidental.

Nobody is saying that enums are not manifest constants. They are just saying that manifest constants are not necessarily enumerations, and shouldn't be labeled as such. Note that I'm just voicing what others have said. Naming manifest constants 'enum' is a level higher than a bikeshed color, but not much higher. I can live with the syntax. -Steve
Oct 09 2008
prev sibling parent reply Don <nospam nospam.com.au> writes:
Steven Schveighoffer wrote:
 "Benji Smith" wrote
 Gide Nwawudu wrote:
 2) Finalise const/invariant stuff and change manifest const from
 'enum' to 'define' (or whatever).

 define {
 double PI = 3.14;
 string author = "Walter";
 }
 define enum Direction { North, South, East, West };

to a "manifest" constant. What does that mean? And why do we need any special keyword? What does the "define" keyword give you that an ordinary variable declaration doesn't? Why not just write the code from above like this: double PI = 3.14; string author = "Walter"; enum Direction { North, South, East, West }; What am I missing here?

You cannot take the address of a manifest constant, and it doesn't live in a static data space or in memory anywhere. Instead it is built directly into the code. So when you say define double PI = 3.14; And then use PI: auto x = PI; This generates code that declares a variable x, then assigns it to 3.14. the PI symbol isn't stored in the final code. This has a huge benefit when you are declaring lots of constants, but only few will be used. You don't have to pay the penalty of storing all the constants in your code, only the ones you use, and only where you use them.

Actually a smart linker would do that anyway. The only reason we need manifest constants is because OPTLINK isn't smart enough. (And DMD isn't smart enough to discard unreachable variables from the symbol table).

    const double PI = 3.14;

works exactly the same in D2 as it does in D1. Except that you can't take the address of it in D1, but that's actually a bug. It's still stored. A perfect D compiler (D1 or D2) would make it identical to D2's

    enum PI = 3.14;
Oct 09 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Don wrote:
 Steven Schveighoffer wrote:
 "Benji Smith" wrote
 Gide Nwawudu wrote:
 2) Finalise const/invariant stuff and change manifest const from
 'enum' to 'define' (or whatever).

 define {
 double PI = 3.14;
 string author = "Walter";
 }
 define enum Direction { North, South, East, West };

refer to a "manifest" constant. What does that mean? And why do we need any special keyword? What does the "define" keyword give you that an ordinary variable declaration doesn't? Why not just write the code from above like this: double PI = 3.14; string author = "Walter"; enum Direction { North, South, East, West }; What am I missing here?

You cannot take the address of a manifest constant, and it doesn't live in a static data space or in memory anywhere. Instead it is built directly into the code. So when you say define double PI = 3.14; And then use PI: auto x = PI; This generates code that declares a variable x, then assigns it to 3.14. the PI symbol isn't stored in the final code. This has a huge benefit when you are declaring lots of constants, but only few will be used. You don't have to pay the penalty of storing all the constants in your code, only the ones you use, and only where you use them.

Actually a smart linker would do that anyway. The only reason we need manifest constants is because OPTLINK isn't smart enough. (And DMD isn't smart enough to discard unreachable variables from the symbol table).

Exactly so. Andrei
Oct 09 2008
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jarrett Billingsley wrote:
 On Thu, Oct 9, 2008 at 10:12 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Actually a smart linker would do that anyway. The only reason we need
 manifest constants is because OPTLINK isn't smart enough.
 (And DMD isn't smart enough to discard unreachable variables from the
 symbol table).


So.. the solution to a toolchain problem is to add a feature into the language? Exported templates in C++? The 'register' keyword? Isn't this the kind of stuff we want to avoid?

I hear you. It was not a lightly taken decision. Andrei
Oct 09 2008
prev sibling parent reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Andrei Alexandrescu wrote:
 Don wrote:
 Steven Schveighoffer wrote:
 "Benji Smith" wrote
 And why do we need any special keyword?


Actually a smart linker would do that anyway. The only reason we need manifest constants is because OPTLINK isn't smart enough. (And DMD isn't smart enough to discard unreachable variables from the symbol table).


 Exactly so.

That and the reason Walter mentioned: "There needs to be a way to declare a constant of type int." (as opposed to const(int)), but I disregard that statement as something symptomatic of a broken const-system. I've numerous times been trying to argue against the need for manifest constants, but it has been about as useful as repeatedly banging your head against a brick wall. And about as rewarding too. It is good to finally hear that the sole reason we have them is laziness. That makes acceptance much easier. :) -- Oskar
Oct 11 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Oskar Linde wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Steven Schveighoffer wrote:
 "Benji Smith" wrote
 And why do we need any special keyword?


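Actually a smart linker would do that anyway. The only reason we need manifest constants is because OPTLINK isn't smart enough. (And DMD isn't smart enough to discard unreachable variables from the symbol table).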


 Exactly so.

That and the reason Walter mentioned: "There needs to be a way to declare a constant of type int." (as opposed to const(int)), but I disregard that statement as something symptomatic of a broken const-system.

That is a random statement to make unless backed up. The problem was: auto x = CONSTANT; People would expect to initialize x with a particular constant but then be able to use it. But auto for all types means give x whatever type CONSTANT has. If CONSTANT has const as part of its type, that will be transferred into x because that's what auto does, not because const is broken in any way, shape, or form.
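To make the auto point concrete, here is a minimal sketch. It assumes the `enum` spelling for manifest constants that D2 adopted; the names `storedLimit`, `manifestLimit` and `bump` are made up for illustration.

```d
// Sketch only: hypothetical names, D2-style `enum` manifest constant.
const int storedLimit = 10;   // constant with storage; const is part of its type
enum int manifestLimit = 10;  // manifest constant of plain type int

int bump()
{
    auto a = storedLimit;     // auto copies the full type, qualifier included
    auto b = manifestLimit;   // auto sees a plain int

    static assert(is(typeof(a) == const(int)));
    static assert(is(typeof(b) == int));
    static assert(!__traits(compiles, a = 11)); // a cannot be reassigned

    b += 1;                   // fine: b is an ordinary mutable int
    return b;
}
```

The static asserts are there so the compiler itself confirms which type auto picked in each case.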
 I've numerous times been trying to argue against the need for manifest 
 constants, but it has been about as useful as repeatedly banging your 
 head against a brick wall. And about as rewarding too.

Manifest constants do exist. Maybe you mean you want a different naming for them.
 It is good to finally hear that the sole reason we have them is 
 laziness. That makes acceptance much easier. :)

Well priorities are an issue too. Rewriting the linker would be a major effort. Walter would do it if really necessary. Andrei
Oct 11 2008
parent Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Andrei Alexandrescu wrote:
 Oskar Linde wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Steven Schveighoffer wrote:
 "Benji Smith" wrote
 And why do we need any special keyword?


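Actually a smart linker would do that anyway. The only reason we need manifest constants is because OPTLINK isn't smart enough. (And DMD isn't smart enough to discard unreachable variables from the symbol table).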


 Exactly so.

That and the reason Walter mentioned: "There needs to be a way to declare a constant of type int." (as opposed to const(int)), but I disregard that statement as something symptomatic of a broken const-system.

That is a random statement to make unless backed up. The problem was: auto x = CONSTANT; People would expect to initialize x with a particular constant but then be able to use it. But auto for all types means give x whatever type CONSTANT has. If CONSTANT has const as part of its type, that will be transferred into x because that's what auto does, not because const is broken in any way, shape, or form.

D1 has the neat concept of constant storage classes, which means that basic values can be constant without carrying a separate type modifier. Const type modifiers are something that only reference types need (and D1 lacks), but the D2 const design has forced such type meta-data to spill over on the actual data types. I consider that to be a flaw in the const system.
 I've numerous times been trying to argue against the need for manifest 
 constants, but it has been about as useful as repeatedly banging your 
 head against a brick wall. And about as rewarding too.

Manifest constants do exist. Maybe you mean you want a different naming for them.

I argued against the separation of manifest constants and "regular" constants. The distinction is unnecessary and confusing.
 It is good to finally hear that the sole reason we have them is 
 laziness. That makes acceptance much easier. :)

Well priorities are an issue too. Rewriting the linker would be a major effort. Walter would do it if really necessary.

I fully understand that. I just felt I had to write something provocative to get a reply. I do consider the reason perfectly valid. :) -- Oskar
Oct 11 2008
prev sibling next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Thu, Oct 9, 2008 at 10:21 PM, Sergey Gromov <snake.scaly gmail.com> wrote:
 Thu, 09 Oct 2008 09:07:19 -0400,
 Benji Smith wrote:
 I've never quite understood what people are talking about when they
 refer to a "manifest" constant. What does that mean?

 And why do we need any special keyword? What does the "define" keyword
 give you that an ordinary variable declaration doesn't? Why not just
 write the code from above like this:

    double PI = 3.14;
    string author = "Walter";
    enum Direction { North, South, East, West };

 What am I missing here?

Your "PI" and "author" cannot be optimized because they're public and mutable, so every time you use PI in your code the compiler must access a variable, just in case some other module changed its value to 180. The value of "North", on the other hand, can never change, so it can take part in constant folding etc. You "manifest" an identity between the name "North" and the number 0. The closest to a manifest constant would be

    invariant double PI = 3.14;
    invariant string author = "Walter";

I think it even works in D2. I don't know why enums were introduced for declaring constants.

I think the main (only?) difference is that you can take the address of an invariant constant, but not a manifest constant. --bb
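A small sketch of that difference, under two assumptions: the manifest constant is written with the D2 `enum` spelling, and the stored constant uses `immutable`, the name `invariant` was later given. The helper names `circleArea` and `addressOfStoredPi` are hypothetical.

```d
// Sketch only: hypothetical names; `immutable` stands in for `invariant`.
enum double PI = 3.14;             // manifest constant: no storage, no address
immutable double storedPi = 3.14;  // constant with static storage

// The compiler rejects &PI but accepts &storedPi.
static assert(!__traits(compiles, &PI));
static assert( __traits(compiles, &storedPi));

// Uses of PI are replaced by the value 3.14 at each use site.
double circleArea(double r)
{
    return PI * r * r;
}

// The stored constant has a real address that can be handed out.
immutable(double)* addressOfStoredPi()
{
    return &storedPi;
}
```

Both constants can be used in expressions the same way; only the stored one can be pointed at.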
Oct 09 2008
prev sibling next sibling parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Thu, Oct 9, 2008 at 10:12 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Actually a smart linker would do that anyway. The only reason we need
 manifest constants is because OPTLINK isn't smart enough.
 (And DMD isn't smart enough to discard unreachable variables from the
 symbol table).

Exactly so.

So.. the solution to a toolchain problem is to add a feature into the language? Exported templates in C++? The 'register' keyword? Isn't this the kind of stuff we want to avoid?
Oct 09 2008
prev sibling next sibling parent reply Piotrek <starpit tlen.pl> writes:
Andrei Alexandrescu pisze:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?
 
 Andrei

1. Phobos and Tango fusion.
2. Phobos and Tango fusion.
3. Phobos and Tango fusion.
4. Phobos and Tango fusion.
5. Phobos and Tango fusion.

I mean ONE community-orientated standard library. Walter should still be the king of the language specification, however. And of course a good king listens to his people ;)

BTW. Big thank-you to Walter and community for D - the language I was dreaming of.
Oct 09 2008
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Piotrek wrote:
 Andrei Alexandrescu pisze:
 Ok, per Aarti's suggestion: without speaking officially for Walter, 
 let me ask this - what do you think are the top issues you'd like to 
 see fixed in D?

 Andrei

1. Phobos and Tango fusion.
2. Phobos and Tango fusion.
3. Phobos and Tango fusion.
4. Phobos and Tango fusion.
5. Phobos and Tango fusion.

I mean ONE community-orientated standard library. Walter should still be the king of the language specification, however. And of course a good king listens to his people ;)

Phobos and Tango will be at least interoperable soon, that's virtually a done deal through Sean's immense contribution.
 BTW. Big thank-you to Walter and community for D - the language I was 
 dreaming of.

Yah, I'm sometimes dreaming of it, probably not in the same way :o). Andrei
Oct 09 2008
prev sibling next sibling parent reply "Dave" <Dave_member pathlink.com> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:gcj3u6$1lga$1 digitalmars.com...
 Ok, per Aarti's suggestion: without speaking officially for Walter, let me 
 ask this - what do you think are the top issues you'd like to see fixed in 
 D?

 Andrei

I haven't seen this mentioned yet, so here goes:

What to do about reference aliasing and its effect on compiler optimizations?

Some background to clarify what I mean by 'reference aliasing':

http://en.wikipedia.org/wiki/Pointer_alias
http://en.wikipedia.org/wiki/Restrict
http://www.cellperformance.com/mike_acton/2006/05/demystifying_the_restrict_keyw.html

I say 'reference' instead of 'pointer' so as to not confuse the issue with raw pointers and C's 'restrict'.

I think since D arrays include the length, maybe something could be done there, because there is enough information at runtime to quickly check for memory overlap. Also, a combination of compile-time and runtime info could probably do the same for classes, and for struct and primitive value types passed by ref.

There don't seem to be many of the original scientific computing people left posting to this group, nor do I know of any game developers lurking about, but it may be important for D's adoption by those groups. I gather it certainly used to be an important reason why Fortran developers would not switch to C or C++.

I know that invariants and pure functions open the possibility to optimize away, but the issues there are:

a) Elements of invariant arrays are not straightforward to initialize at present (requires casting).
b) If used as intended, invariants don't allow re-population of the array elements for the same previously allocated array (encourages more memory use and pressure on the GC).
c) Pure functions would require new'ing a returned result set, which is probably quite a bit slower than a runtime alias check under most circumstances.
d) Assuming no aliasing on invariants can still lead to undefined behaviour that a runtime check could avoid.

So the bottom line is that the issue may need to be solved in a more general way than with invariants and pure functions to be really useful.

- Dave
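To illustrate the kind of runtime check I mean for arrays, here is a rough sketch written by hand in user code. It is not something the compiler does today, and the names `overlaps` and `addInto` are made up for the example.

```d
// Sketch only: a hand-written overlap test, not an existing compiler feature.
// Two slices overlap when each one starts before the other one ends.
bool overlaps(const(double)[] a, const(double)[] b)
{
    if (a.length == 0 || b.length == 0)
        return false;
    return a.ptr < b.ptr + b.length && b.ptr < a.ptr + a.length;
}

// Adds src into dst element by element, choosing a path by the alias check.
void addInto(double[] dst, const(double)[] src)
{
    assert(dst.length == src.length);
    if (overlaps(dst, src))
    {
        // Conservative path: plain loop, no assumptions about aliasing.
        foreach (i; 0 .. dst.length)
            dst[i] += src[i];
    }
    else
    {
        // No aliasing: the whole-array operation is free to be vectorized.
        dst[] += src[];
    }
}
```

The check costs two pointer comparisons per call, which is the "quick" part; whether a compiler could insert it automatically is the open question.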
Oct 09 2008
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Dave wrote:
 
 "Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
 news:gcj3u6$1lga$1 digitalmars.com...
 Ok, per Aarti's suggestion: without speaking officially for Walter, 
 let me ask this - what do you think are the top issues you'd like to 
 see fixed in D?

 Andrei

I haven't seen this mentioned yet, so here goes: What to do about reference aliasing and its effect on compiler optimizations?

This is a biggie. Unfortunately D has at present no solution in store for it. As for creating invariant objects, we'll definitely have good solutions for that. Andrei
Oct 09 2008
prev sibling next sibling parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Thu, Oct 9, 2008 at 8:42 PM, dsimcha <dsimcha yahoo.com> wrote:
 Andrei Alexandrescu:
 For appending, I already put Appender in std.array.

Can you please make sure std.array is included in the documentation for the next D release? Right now, it is completely undocumented. Also, while we're on the subject of Phobos documentation, what is the status of std.loader or whatever it is? Does it work properly? Is it usable? Is the lack of documentation for it an oversight or because it's not ready yet?

std.loader has been in Phobos for 5 years. It's one of a few additions to Phobos by Matt Wilson that have never really seemed like they were part of Phobos (along with std.perf, std.recls, std.openrj..). It's probably undocumented since Matt kind of disappeared and W might not know what the library actually does.
Oct 09 2008
prev sibling next sibling parent Chad J <gamerchad __spam.is.bad__gmail.com> writes:
Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?
 
 Andrei

OK. Here, have a wall of text.

1. Tango and Phobos merge. I'm with the others on this. The few people I've introduced to D seem very irritated by the existence of 2 standard libs. I think even a common runtime will be insufficient to calm the worries of newcomers, but it is a step in the right direction. A single standard API is what needs to happen.

2. Walter working on an open compiler. If Walter were to announce that he would be working on LLVMDC I would clench my fists and shout "FUCK YEAH!!". That's because this would be really awesome and would pump me up; sort of like watching Gurren Lagann. I think I might be a bit of an odd fellow, even by geek standards...

3. Forward references. Even I've run into this a few times, and I can see how people can run into this a lot and how it would make their lives miserable. This isn't quite a language issue as much as an implementation issue, but if you're looking for places to spend time then this is one of them.

4. The find-the-missing-parentheses game is not fun. I avoid playing it. Now let's make it easy for the unfortunate victi-, I mean newbies, to avoid playing this game. Just get rid of D's implicit property syntax. Walter seems opposed to explicit properties for whatever reason, so let's just not have properties. This implicit property syntax gives us this trade-off: On the upside, you save a few keystrokes every now and then. On the downside, you get some horrible runtime bugs that take hours and hours to debug. I'll give that one a miss. So if it's too hard to do it right, then please don't do it at all.

5. 4 is all you get. Sorry.

Notes:

4 may be a member of a broader category of language features: cute syntax sugary things that have the potential to cause really nasty bugs in people's code. Oh hey, switch-case, so glad you could join us (but not really).

Also I'd like to hear an update on how Walter, and maybe you Andrei, feel about this implicit properties issue. I remember in '07 Walter saying something to the effect that explicit properties were not a priority and that they weren't something with proven usefulness. I'm realizing now why I'm not entirely convinced: the current implicit properties cause nasty bugs, and getting rid of that possibility MUST be a good idea. At that point, explicit properties aren't nearly as useful as removing implicit properties. So, if you don't mind, what are the thoughts of the D designer(s) on this one?

On compilers, here are my observations:

- DMD is nice, but it only targets x86 Windows and Linux. It will only ever support x86 Windows and Linux. Want to target Macs? Nope. How about 64-bit systems, namely the really good 64-bit linux systems that are around nowadays? Nope. Cell phones? hahahaha. ... to me this is a fatal flaw. The fact that DMD has proprietary code in it doesn't help either.

- GDC. It has the potential to target everything under the sun through sheer brute force. Of course, someone has to bother to make the runtime and the glue work for these other platforms. Oh fun. Now, this is still pretty cool, but then we consider how GDC rarely gets updates. I use an SVN version of it nowadays--something I hardly ever do with software. GDC just doesn't have much muscle behind it, and I can see why. It's kind of like stereotypical sci-fi advanced high powered alien technology, only to use it you have to stick your hand into this horrid smelling sticky goopy stuff to use the controls. But wait, there's more! I've read that the GCC/GNU zealots won't let GDC into the fold without the frontend's copyright being assigned to the FSF. Have fun guys. It's still what I use though, because it can like, actually compile 64-bit programs for my 64-bit computer. Awesome feature, really.

- LLVM. I really wish the D frontend for this was fully functional. The license is nice, it has a growing number of backends, and it has a portable C backend. That C backend means that even if you don't have a highly optimized backend for your platform already, you STILL win. Seriously wtf, it's broken. OK, well there are still those damned Java phones, but maybe a Java bytecode backend would fix that issue. It's like that gun you get towards the end of the game. Up until now you've had a knife for when your sidearm runs out of bullets. Now you're given this improved BFG9000-like thing that (1) kills everything on the screen (2) does it in one hit (3) does it repeatedly and quickly (4) doesn't run out of ammo. There is only one right answer. Oh and in our case we can have it early in the game.

Another thing came to my mind as well. I think D is not strong as an experimental language. I think D IS strong as a robust, high-productivity language. I'm not too keen on numerous new nifty features. Back when I got into D, I noted that D seemed to follow this idea of progressive disclosure as well as the principle of least surprise. It wasn't the features that sold me. It was this remarkable ease of use and general smoothness that sold me. D is strong in that area, and I'd like to see D keep its strengths. Of course, new features are always nice, but the good design principles that are in D's roots should take priority, IMHO.

If you read the whole thing, thank you for your time. I do have this habit of rambling on. Maybe you could tell. But now I have a class to go to.

Later,
- Chad
Oct 09 2008
prev sibling next sibling parent reply downs <default_357-line yahoo.de> writes:
Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let
 me ask this - what do you think are the top issues you'd like to see
 fixed in D?
 
 Andrei

Let me throw in mine as well. As per request, I won't list enhancements, only things I perceive to be issues with D as it _currently_ is.

1) the Phobos/Tango schism.
2) CTFE leaks. (been bitten by this recently)
3) the Exception/Error problem (they're different things, they shouldn't inherit!)
4) std.zip
5) contracts under inheritance.

Feature improvements would be:

1) AST manipulation at compile-time (this might supersede and deprecate templates _and_ CTFE by merging them into a common component)
2) auto return type
3) reference types
4) tuple syntax
5) trailing delegates.

Of course, all of these have been brought up before (and mostly ignored), so I don't hold much hope.

--downs
Oct 09 2008
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
downs wrote:
 Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter,
 let me ask this - what do you think are the top issues you'd like
 to see fixed in D?
 
 Andrei

Let me throw in mine as well. As per request, I won't list enhancements, only things I perceive to be issues with D as it _currently_ is. 1) the Phobos/Tango schism.

Consider it done.
 2) CTFE leaks. (been bitten by this recently)

Yah.
 3) the Exception/Error problem (they're different things, they
 shouldn't inherit!)

I think Exception should inherit Error.
 4) std.zip

Haven't gotten around to it, but I promise I will. I also have a tar file reader hanging around my code somewhere, but I'd need to work on an extensible archive interface first.
 5) contracts under inheritance.

Go on...
 Feature improvements would be:
 
 1) AST manipulation at compile-time (this might supersede and
    deprecate templates _and_ CTFE by merging them into a common
    component)
 2) auto return type
 3) reference types
 4) tuple syntax
 5) trailing delegates.

I think this won't come around before D3.
 Of course, all of these have been brought up before (and mostly
 ignored), so I don't hold much hope.

I understand. Andrei
Oct 09 2008
next sibling parent Jason House <jason.james.house gmail.com> writes:
Andrei Alexandrescu Wrote:

 downs wrote:
 Andrei Alexandrescu wrote:
 5) contracts under inheritance.

Go on...

My interpretation: The D spec/website documents how contracts are inherited, but they're not implemented. It'd be nice to define an interface and provide contracts on the input and output of functions. Whatever implements the interface has to accept all valid in contracts, possibly accepting a wider range of input. It also has to satisfy the out contracts, possibly providing a narrower set of output. I hope that helps. It's definitely a feature I'd like to see.
Oct 09 2008
prev sibling next sibling parent reply downs <default_357-line yahoo.de> writes:
Andrei Alexandrescu wrote:
 downs wrote:
 Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter,
 let me ask this - what do you think are the top issues you'd like
 to see fixed in D?

 Andrei

Let me throw in mine as well. As per request, I won't list enhancements, only things I perceive to be issues with D as it _currently_ is.


Oh my god somebody important answered. I'm really calm right now and I'm hoping to finish writing this response before the wave of excitement hits.
 1) the Phobos/Tango schism.

Consider it done.

You have my gratitude.
 2) CTFE leaks. (been bitten by this recently)

Yah.

Good to know I'm not the only one.
 3) the Exception/Error problem (they're different things, they
 shouldn't inherit!)

I think Exception should inherit Error.

I agree. Certainly a step up from Error inheriting Exception! :)
 4) std.zip

Haven't gotten around to it, but I promise I will. I also have a tar file reader hanging around my code somewhere, but I'd need to work on an extensible archive interface first.
 5) contracts under inheritance.

Go on...

Basically, contracts imply that a method "promises" to behave in a certain way. Thus, any subclasses' methods (the public ones at least) should relax the in { } of the "parent" method, and strengthen the out { }, in a logical extension of covariance and contravariance. The first of those works. The second, however, does not. Here's some code that demonstrates the problem.

    class A {
      int test()
      out(result) { assert(result < 10); }
      body { return 8; }
    }

    class B : A {
      override int test() { return 15; }
    }

    import std.stdio;

    void main() {
      A a = new B;
      writefln(a.test()); // violates contract, but does not fail.
    }
 Feature improvements would be:

 1) AST manipulation at compile-time (this might supersede and
    deprecate templates _and_ CTFE by merging them into a common
    component)
 2) auto return type
 3) reference types
 4) tuple syntax
 5) trailing delegates.

I think this won't come around before D3.

I wouldn't expect it any earlier.
 Of course, all of these have been brought up before (and mostly
 ignored), so I don't hold much hope.

I understand. Andrei

Thank you for taking the time to read the original post, and your response. --downs
Oct 09 2008
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
downs wrote:
 Andrei Alexandrescu wrote:
 1) the Phobos/Tango schism.


You have my gratitude.

Sean deserves it. Andrei
Oct 09 2008
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
Andrei Alexandrescu wrote:
 downs wrote:
 
 3) the Exception/Error problem (they're different things, they
 shouldn't inherit!)

I think Exception should inherit Error.

I personally dislike the use of "Error" to denote exceptions of any sort. To me, errors are what /cause/ exceptions to be thrown. For this reason, in Tango and druntime, "Exception" is defined as the top-level class from which all exceptions should derive (imagine that), making it equivalent to your "Error" object. With this in mind, I'd like to solicit opinions about how exceptions should be categorized. Should unrecoverable exceptions derive directly from "Exception" with a sibling named something like "RecoverableException" as the parent for recoverable exceptions? Or would a bit more structure be better? I don't think it should be the runtime's responsibility to define a complex exception hierarchy, but there is clearly a desire to at least retain some distinction between recoverable and unrecoverable errors and my naming-fu is not terribly strong. Sean
Oct 09 2008
next sibling parent Kirk McDonald <kirklin.mcdonald gmail.com> writes:
Sean Kelly wrote:
 Andrei Alexandrescu wrote:
 downs wrote:

 3) the Exception/Error problem (they're different things, they
 shouldn't inherit!)

I think Exception should inherit Error.

I personally dislike the use of "Error" to denote exceptions of any sort. To me, errors are what /cause/ exceptions to be thrown. For this reason, in Tango and druntime, "Exception" is defined as the top-level class from which all exceptions should derive (imagine that), making it equivalent to your "Error" object. With this in mind, I'd like to solicit opinions about how exceptions should be categorized. Should unrecoverable exceptions derive directly from "Exception" with a sibling named something like "RecoverableException" as the parent for recoverable exceptions? Or would a bit more structure be better? I don't think it should be the runtime's responsibility to define a complex exception hierarchy, but there is clearly a desire to at least retain some distinction between recoverable and unrecoverable errors and my naming-fu is not terribly strong. Sean

I suggest looking at Python's exception hierarchy for inspiration: http://docs.python.org/library/exceptions.html The root of its exception tree is the BaseException type. The regular Exception type (whence all user exceptions should be derived) is a subclass of it. The purpose of this arrangement is so certain exceptions that should not be caught by an "except Exception:" block will get through. Exceptions that inherit from BaseException directly include SystemExit (the exception thrown to exit the program) and KeyboardInterrupt (thrown when the user presses ctrl-C at the console). This arrangement works well. It clearly delineates the "userland" exceptions (those which inherit from Exception which everyday code can interact with regularly) and those which are considered part of the system, which should not be touched (unless you really want to). -- Kirk McDonald
Oct 09 2008
prev sibling next sibling parent Benji Smith <dlanguage benjismith.net> writes:
Sean Kelly wrote:
 With this in mind, I'd like to solicit opinions about how exceptions 
 should be categorized.  Should unrecoverable exceptions derive directly 
 from "Exception" with a sibling named something like 
 "RecoverableException" as the parent for recoverable exceptions?  Or 
 would a bit more structure be better?  I don't think it should be the 
 runtime's responsibility to define a complex exception hierarchy, but 
 there is clearly a desire to at least retain some distinction between 
 recoverable and unrecoverable errors and my naming-fu is not terribly 
 strong.

I don't think the code throwing the exception necessarily knows whether or not it's recoverable. Only the caller really knows whether recovery is possible. --benji
Oct 09 2008
prev sibling next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Sean Kelly Wrote:

 Andrei Alexandrescu wrote:
 downs wrote:
 
 3) the Exception/Error problem (they're different things, they
 shouldn't inherit!)

I think Exception should inherit Error.

I personally dislike the use of "Error" to denote exceptions of any sort. To me, errors are what /cause/ exceptions to be thrown. For this reason, in Tango and druntime, "Exception" is defined as the top-level class from which all exceptions should derive (imagine that), making it equivalent to your "Error" object. With this in mind, I'd like to solicit opinions about how exceptions should be categorized. Should unrecoverable exceptions derive directly from "Exception" with a sibling named something like "RecoverableException" as the parent for recoverable exceptions? Or would a bit more structure be better? I don't think it should be the runtime's responsibility to define a complex exception hierarchy, but there is clearly a desire to at least retain some distinction between recoverable and unrecoverable errors and my naming-fu is not terribly strong. Sean

I think the common use case has to be catch (Exception)... That implies to me that Exception must extend something. My vote would be an interface with an ugly name. Then non-recoverable errors can extend that interface.
Oct 09 2008
parent Gide Nwawudu <gide btinternet.com> writes:
On Thu, 09 Oct 2008 17:21:12 -0400, Jason House
<jason.james.house gmail.com> wrote:

Sean Kelly Wrote:

 Andrei Alexandrescu wrote:
 downs wrote:
 
 3) the Exception/Error problem (they're different things, they
 shouldn't inherit!)

I think Exception should inherit Error.

I personally dislike the use of "Error" to denote exceptions of any sort. To me, errors are what /cause/ exceptions to be thrown. For this reason, in Tango and druntime, "Exception" is defined as the top-level class from which all exceptions should derive (imagine that), making it equivalent to your "Error" object. With this in mind, I'd like to solicit opinions about how exceptions should be categorized. Should unrecoverable exceptions derive directly from "Exception" with a sibling named something like "RecoverableException" as the parent for recoverable exceptions? Or would a bit more structure be better? I don't think it should be the runtime's responsibility to define a complex exception hierarchy, but there is clearly a desire to at least retain some distinction between recoverable and unrecoverable errors and my naming-fu is not terribly strong. Sean

I think the common use case has to be catch (Exception)... That implies to me that Exception must extend something. My vote would be an interface with an ugly name. Then non-recoverable errors can extend that interface.

Would something like Java's Throwable interface work and have Error and Exception inherit from it? http://www.artima.com/designtechniques/exceptions.html Gide
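For what it's worth, here is a sketch of that layout in D. It assumes a runtime whose root type is called `Throwable`, with `Exception` and `Error` as siblings beneath it, which is the arrangement later D runtimes ended up with rather than what ships today. `ParseFailure`, `InternalBug` and `classify` are hypothetical names for the example.

```d
// Sketch only: assumes Throwable at the root, Exception and Error as siblings.
class ParseFailure : Exception   // recoverable, "userland" problem
{
    this(string msg) { super(msg); }
}

class InternalBug : Error        // not meant to be caught by everyday code
{
    this(string msg) { super(msg); }
}

// Runs the action and reports which branch of the hierarchy it landed in.
string classify(void delegate() action)
{
    try
    {
        action();
        return "ok";
    }
    catch (Exception e)
    {
        // The common case: catch (Exception) never sees an Error.
        return "recovered: " ~ e.msg;
    }
    catch (Error e)
    {
        // Only here to show the split; normal code would let this propagate.
        return "fatal: " ~ e.msg;
    }
}
```

The point is that a plain catch (Exception) stays the everyday idiom, and the unrecoverable family passes straight through it.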
Oct 10 2008
prev sibling next sibling parent reply downs <default_357-line yahoo.de> writes:
Sean Kelly wrote:
 Andrei Alexandrescu wrote:
 downs wrote:

 3) the Exception/Error problem (they're different things, they
 shouldn't inherit!)

I think Exception should inherit Error.

I personally dislike the use of "Error" to denote exceptions of any sort. To me, errors are what /cause/ exceptions to be thrown. For this reason, in Tango and druntime, "Exception" is defined as the top-level class from which all exceptions should derive (imagine that), making it equivalent to your "Error" object.

Errors are useful. There is a class of problems that should not be trivially recoverable, i.e. not if you don't know what you're doing. It's the class of problems that would not be caught in release mode. Why? Because if such problems are caught thoughtlessly, then the program behavior changes in release mode .. which is a big, BIG no-no. So ArrayBoundsError, AssertFailedError, and InvariantError are all valid, and should NOT be derived from Exceptions. --downs
Oct 11 2008
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jarrett Billingsley wrote:
 On Sat, Oct 11, 2008 at 5:04 PM, downs <default_357-line yahoo.de> wrote:
 Sean Kelly wrote:
 Andrei Alexandrescu wrote:
 downs wrote:

 3) the Exception/Error problem (they're different things, they
 shouldn't inherit!)


I personally dislike the use of "Error" to denote exceptions of any sort. To me, errors are what /cause/ exceptions to be thrown. For this reason, in Tango and druntime, "Exception" is defined as the top-level class from which all exceptions should derive (imagine that), making it equivalent to your "Error" object.

Errors are useful. There is a class of problems that should not be trivially recoverable, i.e. not if you don't know what you're doing. It's the class of problems that would not be caught in release mode. Why? Because if such problems are caught thoughtlessly, then the program behavior changes in release mode .. which is a big, BIG no-no. So ArrayBoundsError, AssertFailedError, and InvariantError are all valid, and should NOT be derived from Exceptions.

And by the same token, string-to-integer conversion methods should not throw an irrecoverable error. I'm looking at you, ConvError.

What??? What a mistake. I fixed it now (incidentally I was working on std.conv). If anyone thinks Conv should inherit Error, speak now, or forever hold your silence. Andrei
Oct 11 2008
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jarrett Billingsley wrote:
 On Sat, Oct 11, 2008 at 5:04 PM, downs <default_357-line yahoo.de> wrote:
 Sean Kelly wrote:
 Andrei Alexandrescu wrote:
 downs wrote:

 3) the Exception/Error problem (they're different things, they
 shouldn't inherit!)


I personally dislike the use of "Error" to denote exceptions of any sort. To me, errors are what /cause/ exceptions to be thrown. For this reason, in Tango and druntime, "Exception" is defined as the top-level class from which all exceptions should derive (imagine that), making it equivalent to your "Error" object.

Errors are useful. There is a class of problems that should not be trivially recoverable, i.e. not if you don't know what you're doing. It's the class of problems that would not be caught in release mode. Why? Because if such problems are caught thoughtlessly, then the program behavior changes in release mode .. which is a big, BIG no-no. So ArrayBoundsError, AssertFailedError, and InvariantError are all valid, and should NOT be derived from Exceptions.

And by the same token, string-to-integer conversion methods should not throw an irrecoverable error. I'm looking at you, ConvError.

ConvOverflowError too, right? Andrei
Oct 11 2008
prev sibling parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Sean Kelly wrote:
 Andrei Alexandrescu wrote:
 downs wrote:

 3) the Exception/Error problem (they're different things, they
 shouldn't inherit!)

I think Exception should inherit Error.

I personally dislike the use of "Error" to denote exceptions of any sort. To me, errors are what /cause/ exceptions to be thrown. For this reason, in Tango and druntime, "Exception" is defined as the top-level class from which all exceptions should derive (imagine that), making it equivalent to your "Error" object. With this in mind, I'd like to solicit opinions about how exceptions should be categorized. Should unrecoverable exceptions derive directly from "Exception" with a sibling named something like "RecoverableException" as the parent for recoverable exceptions? Or would a bit more structure be better? I don't think it should be the runtime's responsibility to define a complex exception hierarchy, but there is clearly a desire to at least retain some distinction between recoverable and unrecoverable errors and my naming-fu is not terribly strong. Sean

I agree there should be a distinction between recoverable exceptions (normal exceptions) and nonrecoverable exceptions (contract failures?). I agree that "Exception" should be the name for normal exceptions. The others could be named "Error" or "Failure". If we want the ability to catch these two separately, I don't see any other way other than having a third, top-level class, ie, a "Throwable", from which Exception and Error/Failure derive from. But while one will certainly want to catch Exception's without catching Error's, I'm not 100% sure it would be useful to be able to easily catch an Error but not an Exception. Does anyone know of such a case? -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Oct 16 2008
next sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Denis Koroskin wrote:
 On Thu, 16 Oct 2008 16:10:44 +0400, Bruno Medeiros 
 <brunodomedeiros+spam com.gmail> wrote:
 
 Sean Kelly wrote:
 Andrei Alexandrescu wrote:
 downs wrote:

 3) the Exception/Error problem (they're different things, they
 shouldn't inherit!)

I think Exception should inherit Error.

I personally dislike the use of "Error" to denote exceptions of any sort. To me, errors are what /cause/ exceptions to be thrown. For this reason, in Tango and druntime, "Exception" is defined as the top-level class from which all exceptions should derive (imagine that), making it equivalent to your "Error" object. With this in mind, I'd like to solicit opinions about how exceptions should be categorized. Should unrecoverable exceptions derive directly from "Exception" with a sibling named something like "RecoverableException" as the parent for recoverable exceptions? Or would a bit more structure be better? I don't think it should be the runtime's responsibility to define a complex exception hierarchy, but there is clearly a desire to at least retain some distinction between recoverable and unrecoverable errors and my naming-fu is not terribly strong. Sean

I agree there should be a distinction between recoverable exceptions (normal exceptions) and nonrecoverable exceptions (contract failures?). I agree that "Exception" should be the name for normal exceptions. The others could be named "Error" or "Failure". If we want the ability to catch these two separately, I don't see any other way other than having a third, top-level class, ie, a "Throwable", from which Exception and Error/Failure derive from. But while one will certainly want to catch Exception's without catching Error's, I'm not 100% sure it would be useful to be able to easily catch an Error but not an Exception. Does anyone know of such a case?

You shouldn't catch Error to recover the application, it should die.

I understand well the meaning and implications of Errors and contract failures. Despite their nature, it doesn't mean it's wrong to try to catch them. There's the example you mentioned, but there are others where you want to catch them and not terminate the application, such as unit testing, or catching an error in a plug-in in a modular application (although the feasibility of maintaining a working/stable process of a modular application written in a systems-language, when one of its plug-ins fails, is somewhat dubious). -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Oct 16 2008
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
Bruno Medeiros wrote:
 
 I agree there should be a distinction between recoverable exceptions 
 (normal exceptions) and nonrecoverable exceptions (contract failures?). 
 I agree that "Exception" should be the name for normal exceptions. The 
 others could be named "Error" or "Failure".
 If we want the ability to catch these two separately, I don't see any 
 other way other than having a third, top-level class, ie, a "Throwable", 
 from which Exception and Error/Failure derive from.

This is exactly the design that was decided upon.
 But while one will certainly want to catch Exception's without catching 
 Error's, I'm not 100% sure it would be useful to be able to easily catch 
 an Error but not an Exception. Does anyone know of such a case?

Possibly for reporting purposes--catch an Error to test if a critical error occurred, then re-throw. But I've never had a need for this myself. One could argue, then, that these should all simply derive from Throwable, but I think the logical grouping is useful for classification purposes if nothing else. Sean
Oct 16 2008
next sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Sean Kelly wrote:
 Bruno Medeiros wrote:
 I agree there should be a distinction between recoverable exceptions 
 (normal exceptions) and nonrecoverable exceptions (contract 
 failures?). I agree that "Exception" should be the name for normal 
 exceptions. The others could be named "Error" or "Failure".
 If we want the ability to catch these two separately, I don't see any 
 other way other than having a third, top-level class, ie, a 
 "Throwable", from which Exception and Error/Failure derive from.

This is exactly the design that was decided upon.

Cool! :) -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Oct 16 2008
prev sibling next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
Denis Koroskin wrote:
 On Thu, 16 Oct 2008 19:07:12 +0400, Sean Kelly <sean invisibleduck.org> 
 wrote:
 
 Bruno Medeiros wrote:
  I agree there should be a distinction between recoverable exceptions 
 (normal exceptions) and nonrecoverable exceptions (contract 
 failures?). I agree that "Exception" should be the name for normal 
 exceptions. The others could be named "Error" or "Failure".
 If we want the ability to catch these two separately, I don't see any 
 other way other than having a third, top-level class, ie, a 
 "Throwable", from which Exception and Error/Failure derive from.

This is exactly the design that was decided upon.

Will we still be able to throw Object? Is Throwable an interface or a class?

Throwable is a class, and contains all the stuff that Exception once contained: message, file, line, a "next" reference, and trace info. And it sounds like Walter may require that all thrown objects in D be a descendant of Throwable, but that won't happen immediately. Sean
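
For readers following along, here is a minimal sketch of the hierarchy described above, assuming a present-day druntime in which Throwable, Exception and Error are all declared in object. ParseException and classify are hypothetical names used only for illustration; they are not part of any library.

```d
import std.stdio : writeln;

// Hypothetical recoverable failure, named only for this illustration.
class ParseException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

// Reports which branch of the hierarchy a thrown object belongs to.
string classify(Throwable t)
{
    if (cast(Error) t !is null)
        return "Error";
    if (cast(Exception) t !is null)
        return "Exception";
    return "Throwable";
}

void main()
{
    try
    {
        throw new ParseException("bad input");
    }
    catch (Exception e) // does not match Error or its descendants
    {
        // msg, file and line are declared on Throwable itself.
        writeln(classify(e), ": ", e.msg, " (", e.file, ":", e.line, ")");
    }
}
```

Because Error is a sibling of Exception rather than a descendant, a catch (Exception) clause lets contract failures and similar problems keep propagating.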
Oct 16 2008
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Sean Kelly wrote:
 Denis Koroskin wrote:
 On Thu, 16 Oct 2008 19:07:12 +0400, Sean Kelly 
 <sean invisibleduck.org> wrote:

 Bruno Medeiros wrote:
  I agree there should be a distinction between recoverable 
 exceptions (normal exceptions) and nonrecoverable exceptions 
 (contract failures?). I agree that "Exception" should be the name 
 for normal exceptions. The others could be named "Error" or "Failure".
 If we want the ability to catch these two separately, I don't see 
 any other way other than having a third, top-level class, ie, a 
 "Throwable", from which Exception and Error/Failure derive from.

This is exactly the design that was decided upon.

Will we still be able to throw Object? Is Throwable an interface or a class?

Throwable is a class, and contains all the stuff that Exception once contained: message, file, line, a "next" reference, and trace info. And it sounds like Walter may require that all thrown objects in D be a descendant of Throwable, but that won't happen immediately.

I suggested Srowable instead of Throwable, but nobody liked it. What were they sinking about??? Andrei
Oct 16 2008
prev sibling parent reply Benji Smith <dlanguage benjismith.net> writes:
Sean Kelly wrote:
 Denis Koroskin wrote:
 Will we still be able to throw Object? Is Throwable an interface or a 
 class?

Throwable is a class, and contains all the stuff that Exception once contained: message, file, line, a "next" reference, and trace info.

Really??? Trace info? On all throwables? That would be fantastic!!! --benji
Oct 17 2008
parent reply Sean Kelly <sean invisibleduck.org> writes:
Benji Smith wrote:
 Sean Kelly wrote:
 Denis Koroskin wrote:
 Will we still be able to throw Object? Is Throwable an interface or a 
 class?

Throwable is a class, and contains all the stuff that Exception once contained: message, file, line, a "next" reference, and trace info.

Really??? Trace info? On all throwables? That would be fantastic!!!

Throwable only provides a callback for trace info to be generated. You'll still need to link a package that actually generates the trace. I used to use flectioned for this purpose, but it's too stale now and no longer works. However, this is exactly why tracing is supported via a plugin :-) Sean
Oct 17 2008
parent reply Benji Smith <dlanguage benjismith.net> writes:
Sean Kelly wrote:
 Benji Smith wrote:
 Sean Kelly wrote:
 Denis Koroskin wrote:
 Will we still be able to throw Object? Is Throwable an interface or 
 a class?

Throwable is a class, and contains all the stuff that Exception once contained: message, file, line, a "next" reference, and trace info.

Really??? Trace info? On all throwables? That would be fantastic!!!

Throwable only provides a callback for trace info to be generated. You'll still need to link a package that actually generates the trace. I used to use flectioned for this purpose, but it's too stale now and no longer works. However, this is exactly why tracing is supported via a plugin :-)

Gotcha. Well, I suppose that's okay. Do you know if there are any stacktracing libraries that currently work with D1? For Sean, Walter & Andrei: any opinions about including stack tracing in the core runtime? --benji
Oct 17 2008
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Brad Roberts wrote:
 For anything running on top of glibc (ie, every linux distribution) it's 
 fairly trivial via backtrace() and backtrace_symbols() found in 
 execinfo.h.  The symbols are the mangled form, but that's a simple matter 
 of code to translate back to more meaningful names.  An incomplete version 
 of that is available in phobos already (and probably somewhere in Tango 
 too) -- and it should really be in the core runtime, probably.
 
 Given the 2 or 3 implementations floating around, I imagine it wouldn't be 
 hard to integrate for both windows and linux.

Hey, those are a handy pair of functions.
I just implemented a backtrace handler for Tango using those.

The source is attached.

Features:
* Produces stacktraces (obviously).
* Easy to use: just make sure it gets linked in (e.g. by importing trace.backtrace and using DSSS). The static constructor will then register as a stack trace provider with the runtime, and every exception will contain a stack trace.
* Uses a slightly modified std.demangle (just enough to get it to compile with Tango) to demangle function names.
* Filters out initial frames from its own module since they're not particularly interesting.
* Lazily converts a stack trace to strings. Traces that are never printed only allocate the trace object and a list of return addresses. Stack traces that are printed multiple times only allocate the strings once.
* Shouldn't be hard to port to Phobos.

Less desirable behavior:
* toString() returns a fresh string every time. This could be prevented with some more bookkeeping but I'm not sure it's worth it.

Everyone: feel free to use this any way you want. In particular, feel free to commit this to the Tango, Phobos and/or druntime repositories :) .

Example:
=====
$ cat test.d
module test;

import trace.stacktrace; // just so DSSS picks it up.

void main() {
    foo();
}

void foo() {
    bar();
}

void bar() {
    throw new Exception("Just showing off");
}
$ dsss build
[[[snip]]]
$ ./test
object.Exception: Just showing off
----------------
./test(class Exception object.Exception._ctor(char[], class Exception)+0x24) [0x42b174]
./test(void test.bar()+0x37) [0x4186ba]
./test(void test.foo()+0x9) [0x418681]
./test(_Dmain+0x9) [0x418671]
./test [0x42dece]
./test [0x42dfae]
./test [0x42e540]
./test [0x42dfae]
./test(_d_run_main+0xbb) [0x42e42b]
/lib/libc.so.6(__libc_start_main+0xf4) [0x7fd1aacff1c4]
./test [0x4185d9]
=====
Note: the trace may be wrapped in your newsreader.
Oct 18 2008
parent reply Sean Kelly <sean invisibleduck.org> writes:
Frits van Bommel wrote:
 Brad Roberts wrote:
 For anything running on top of glibc (ie, every linux distribution) 
 it's fairly trivial via backtrace() and backtrace_symbols() found in 
 execinfo.h.  The symbols are the mangled form, but that's a simple 
 matter of code to translate back to more meaningful names.  An 
 incomplete version of that is available in phobos already (and 
 probably somewhere in Tango too) -- and it should really be in the 
 core runtime, probably.

 Given the 2 or 3 implementations floating around, I imagine it 
 wouldn't be hard to integrate for both windows and linux.

Hey, those are a handy pair of functions. I just implemented a backtrace handler for Tango using those. The source is attached.

Nice work! I think this would be a good thing to bundle with Tango and druntime. I haven't looked at the source yet, but do you have any objection to it being distributed under the Tango or druntime licenses? Sean P.S. Anyone interested in making a Win32 backtrace package? :-)
Oct 18 2008
next sibling parent Robert Fraser <fraserofthenight gmail.com> writes:
Sean Kelly wrote:
 P.S. Anyone interested in making a Win32 backtrace package? :-)

Depending on how much raytracing I'm doing next week, I may be able to work on one... If anyone else wants to give an attempt here's a good starting place: http://msdn.microsoft.com/en-us/library/ms680650(VS.85).aspx
Oct 18 2008
prev sibling next sibling parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Sean Kelly wrote:
 Frits van Bommel wrote:
 Hey, those are a handy pair of functions.
 I just implemented a backtrace handler for Tango using those.

 The source is attached.

Nice work! I think this would be a good thing to bundle with Tango and druntime. I haven't looked at the source yet, but do you have any objection to it being distributed under the Tango or druntime licenses?

You seem to have missed this bit of my message:
 Everyone: feel free to use this any way you want. In particular, feel
 free to commit this to the Tango, Phobos and/or druntime repositories 

So no, no problem at all. In fact, I'd be very happy if functionality like this made it in. If possible, I'd like my name to be kept in the file as the (original) author though. Please note there's a comment in the class about what you need to do to keep the "excluding initial stackframes from this module" working if you change the module name. (There's a hard-coded string constant that's the first part of any mangled name in the module, used to ignore the frames from that module)
Oct 18 2008
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
Jarrett Billingsley wrote:
 On Sat, Oct 18, 2008 at 12:30 PM, Sean Kelly <sean invisibleduck.org> wrote:
 Frits van Bommel wrote:
 Brad Roberts wrote:
 For anything running on top of glibc (ie, every linux distribution) it's
 fairly trivial via backtrace() and backtrace_symbols() found in execinfo.h.
  The symbols are the mangled form, but that's a simple matter of code to
 translate back to more meaningful names.  An incomplete version of that is
 available in phobos already (and probably somewhere in Tango too) -- and it
 should really be in the core runtime, probably.

 Given the 2 or 3 implementations floating around, I imagine it wouldn't
 be hard to integrate for both windows and linux.

Hey, those are a handy pair of functions. I just implemented a backtrace handler for Tango using those. The source is attached.

Nice work! I think this would be a good thing to bundle with Tango and druntime. I haven't looked at the source yet, but do you have any objection to it being distributed under the Tango or druntime licenses? Sean P.S. Anyone interested in making a Win32 backtrace package? :-)

Erm, I think team0xf has had one for a while now. I think it's a patch, though, and not a plug-in sort of thing.

I think it modifies the "throw" code in the compiler runtime. If it could be made into a plugin though, I'd love to use it. Sean
Oct 18 2008
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Jarrett Billingsley:
 Erm, I think team0xf has had one for a while now.  I think it's a
 patch, though, and not a plug-in sort of thing.

The phobos.lib of DMD 1.035 is about 1.8 MB, while their last "stacktrace hack" is just a few hundred KB. Do they compress it in some way? Bye, bearophile
Oct 18 2008
prev sibling next sibling parent Brad Roberts <braddr bellevue.puremagic.com> writes:
On Fri, 17 Oct 2008, Benji Smith wrote:

 Sean Kelly wrote:
 Benji Smith wrote:
 Sean Kelly wrote:
 Denis Koroskin wrote:
 Will we still be able to throw Object? Is Throwable an interface or a
 class?

Throwable is a class, and contains all the stuff that Exception once contained: message, file, line, a "next" reference, and trace info.

Really??? Trace info? On all throwables? That would be fantastic!!!

Throwable only provides a callback for trace info to be generated. You'll still need to link a package that actually generates the trace. I used to use flectioned for this purpose, but it's too stale now and no longer works. However, this is exactly why tracing is supported via a plugin :-)

Gotcha. Well, I suppose that's okay. Do you know if there are any stacktracing libraries that currently work with D1? For Sean, Walter & Andrei: any opinions about including stack tracing in the core runtime? --benji

For anything running on top of glibc (ie, every linux distribution) it's fairly trivial via backtrace() and backtrace_symbols() found in execinfo.h. The symbols are the mangled form, but that's a simple matter of code to translate back to more meaningful names. An incomplete version of that is available in phobos already (and probably somewhere in Tango too) -- and it should really be in the core runtime, probably. Given the 2 or 3 implementations floating around, I imagine it wouldn't be hard to integrate for both windows and linux. Later, Brad
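
A minimal sketch of the glibc approach described above, for illustration only. It assumes Linux with glibc and a current D compiler; the two prototypes are declared by hand from execinfo.h, and captureTrace is a hypothetical helper name. The strings come back with mangled symbol names, so a demangling pass would still be needed for readable output.

```d
import core.stdc.stdlib : free;
import std.stdio : writeln;
import std.string : fromStringz;

// glibc prototypes from <execinfo.h>, declared by hand to keep the sketch self-contained.
extern (C) int backtrace(void** buffer, int size);
extern (C) char** backtrace_symbols(const(void*)* buffer, int size);

// Hypothetical helper: returns one string per stack frame of the caller.
string[] captureTrace()
{
    void*[64] frames;
    immutable count = backtrace(frames.ptr, cast(int) frames.length);

    // backtrace_symbols returns a single malloc'ed block; only that block is freed.
    char** symbols = backtrace_symbols(frames.ptr, count);
    if (symbols is null)
        return null;
    scope (exit) free(symbols);

    string[] result;
    foreach (i; 0 .. count)
        result ~= fromStringz(symbols[i]).idup; // copy before the block is freed
    return result;
}

void main()
{
    foreach (line; captureTrace())
        writeln(line);
}
```

How many frames carry a symbol name depends on how the binary was linked; frames without exported symbols show only an address.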
Oct 17 2008
prev sibling parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Sat, Oct 18, 2008 at 12:30 PM, Sean Kelly <sean invisibleduck.org> wrote:
 Frits van Bommel wrote:
 Brad Roberts wrote:
 For anything running on top of glibc (ie, every linux distribution) it's
 fairly trivial via backtrace() and backtrace_symbols() found in execinfo.h.
  The symbols are the mangled form, but that's a simple matter of code to
 translate back to more meaningful names.  An incomplete version of that is
 available in phobos already (and probably somewhere in Tango too) -- and it
 should really be in the core runtime, probably.

 Given the 2 or 3 implementations floating around, I imagine it wouldn't
 be hard to integrate for both windows and linux.

Hey, those are a handy pair of functions. I just implemented a backtrace handler for Tango using those. The source is attached.

Nice work! I think this would be a good thing to bundle with Tango and druntime. I haven't looked at the source yet, but do you have any objection to it being distributed under the Tango or druntime licenses? Sean P.S. Anyone interested in making a Win32 backtrace package? :-)

Erm, I think team0xf has had one for a while now. I think it's a patch, though, and not a plug-in sort of thing.
Oct 18 2008
prev sibling next sibling parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Sat, Oct 11, 2008 at 5:04 PM, downs <default_357-line yahoo.de> wrote:
 Sean Kelly wrote:
 Andrei Alexandrescu wrote:
 downs wrote:

 3) the Exception/Error problem (they're different things, they
 shouldn't inherit!)

I think Exception should inherit Error.

I personally dislike the use of "Error" to denote exceptions of any sort. To me, errors are what /cause/ exceptions to be thrown. For this reason, in Tango and druntime, "Exception" is defined as the top-level class from which all exceptions should derive (imagine that), making it equivalent to your "Error" object.

Errors are useful. There is a class of problems that should not be trivially recoverable, i.e. not if you don't know what you're doing. It's the class of problems that would not be caught in release mode. Why? Because if such problems are caught thoughtlessly, then the program behavior changes in release mode .. which is a big, BIG no-no. So ArrayBoundsError, AssertFailedError, and InvariantError are all valid, and should NOT be derived from Exceptions.

And by the same token, string-to-integer conversion methods should not throw an irrecoverable error. I'm looking at you, ConvError.
Oct 11 2008
prev sibling next sibling parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Sat, Oct 11, 2008 at 11:29 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Jarrett Billingsley wrote:
 On Sat, Oct 11, 2008 at 5:04 PM, downs <default_357-line yahoo.de> wrote:
 Sean Kelly wrote:
 Andrei Alexandrescu wrote:
 downs wrote:

 3) the Exception/Error problem (they're different things, they
 shouldn't inherit!)

I think Exception should inherit Error.

I personally dislike the use of "Error" to denote exceptions of any sort. To me, errors are what /cause/ exceptions to be thrown. For this reason, in Tango and druntime, "Exception" is defined as the top-level class from which all exceptions should derive (imagine that), making it equivalent to your "Error" object.

Errors are useful. There is a class of problems that should not be trivially recoverable, i.e. not if you don't know what you're doing. It's the class of problems that would not be caught in release mode. Why? Because if such problems are caught thoughtlessly, then the program behavior changes in release mode .. which is a big, BIG no-no. So ArrayBoundsError, AssertFailedError, and InvariantError are all valid, and should NOT be derived from Exceptions.

And by the same token, string-to-integer conversion methods should not throw an irrecoverable error. I'm looking at you, ConvError.

ConvOverflowError too, right?

And why the heck doesn't ConvOverflowError inherit from ConvError?
Oct 11 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 16 Oct 2008 16:10:44 +0400, Bruno Medeiros  
<brunodomedeiros+spam com.gmail> wrote:

 Sean Kelly wrote:
 Andrei Alexandrescu wrote:
 downs wrote:

 3) the Exception/Error problem (they're different things, they
 shouldn't inherit!)

I think Exception should inherit Error.

I personally dislike the use of "Error" to denote exceptions of any sort. To me, errors are what /cause/ exceptions to be thrown. For this reason, in Tango and druntime, "Exception" is defined as the top-level class from which all exceptions should derive (imagine that), making it equivalent to your "Error" object. With this in mind, I'd like to solicit opinions about how exceptions should be categorized. Should unrecoverable exceptions derive directly from "Exception" with a sibling named something like "RecoverableException" as the parent for recoverable exceptions? Or would a bit more structure be better? I don't think it should be the runtime's responsibility to define a complex exception hierarchy, but there is clearly a desire to at least retain some distinction between recoverable and unrecoverable errors and my naming-fu is not terribly strong. Sean

I agree there should be a distinction between recoverable exceptions (normal exceptions) and nonrecoverable exceptions (contract failures?). I agree that "Exception" should be the name for normal exceptions. The others could be named "Error" or "Failure". If we want the ability to catch these two separately, I don't see any other way other than having a third, top-level class, ie, a "Throwable", from which Exception and Error/Failure derive from. But while one will certainly want to catch Exception's without catching Error's, I'm not 100% sure it would be useful to be able to easily catch an Error but not an Exception. Does anyone know of such a case?

You shouldn't catch Error to recover the application, it should die. However, you might want to get as much info as possible (so that it would be easier to track the issue) before terminating the application. Something like that, for example:

void main()
{
    try {
        runApp();
    } catch (Error e) {
        logError(e.toString());
    } catch {
        logError("Unknown exception has been caught.");
    }
}
Oct 16 2008
prev sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 16 Oct 2008 19:07:12 +0400, Sean Kelly <sean invisibleduck.org>  
wrote:

 Bruno Medeiros wrote:
  I agree there should be a distinction between recoverable exceptions  
 (normal exceptions) and nonrecoverable exceptions (contract failures?).  
 I agree that "Exception" should be the name for normal exceptions. The  
 others could be named "Error" or "Failure".
 If we want the ability to catch these two separately, I don't see any  
 other way other than having a third, top-level class, ie, a  
 "Throwable", from which Exception and Error/Failure derive from.

This is exactly the design that was decided upon.

Will we still be able to throw Object? Is Throwable an interface or a class?
Oct 16 2008
prev sibling next sibling parent Lars Ivar Igesund <larsivar igesund.net> writes:
Andrei Alexandrescu wrote:

 Ok, per Aarti's suggestion: without speaking officially for Walter, let
 me ask this - what do you think are the top issues you'd like to see
 fixed in D?
 
 Andrei

* Ditch enum for manifest constants
* Get rid of foreach_reverse
* Make closures stack allocatable
* prettier name for __traits

What is really missing:
* Struct interfaces (not necessarily implemented as with classes, although casting would be nice, but mainly the ability to say that a struct conforms to a given interface)

-- 
Lars Ivar Igesund
blog at http://larsivi.net
DSource, #d.tango & #D: larsivi
Dancing the Tango
Oct 10 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 10 Oct 2008 03:50:58 +0400, Sergey Gromov <snake.scaly gmail.com>  
wrote:
 I think that the base type should offer both speed and safety by
 default.  To achieve this I propose to disallow unsafe operations on
 current arrays, call them slices, and introduce a built-in memory
 management class, Array(T).  Here's why built-in:

 void main()
 {
   auto arr = new char[]; // arr is of type Array!(char)
   arr.length = 10;       // OK
   arr ~= 'c';            // fast
   arr ~= "foo bar";      // fast
   foo(arr);              // implicit cast to char[]
   bar(arr);              // passed by reference
                          // "text" is appended
 }

 void foo(char[] a)
 {
   char x = a[0];                // OK
   char[] word = a[10..14];      // fine
   size_t l = a.length;          // 18
   a.length = 20;                // error, .length is read-only
   a ~= "text";                  // error, append not supported
   char[] n = a[15..18] ~ "some" // type of expression is Array!(char)
              ~ word             // fast
              ~ "baz"            // fast
              ;                  // implicit cast to char[]
 }

 void bar(Array!(char) a)
 {
   a ~= "text"; // OK
 }

The basic idea is good, but I believe that the proposal as is is too complex. I think Array!(T) should not be cast to T[] implicitly; instead, it could have a T[] all() method to be on par with the ranges design. But then, this reduces the usefulness of T[]. When should it be used? What are the benefits of const(T)[] over const(Array!(T))? As you propose it, the difference between T[] functionality (mutable but not resizable) and Array!(T) (mutable and resizable) is just too narrow. Besides, Array!(T) is not a good name for a built-in type. Again, I think that the basic idea is good, but the details still need to be worked on.
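
To make the trade-off concrete, here is a rough sketch of the split under discussion, written as a library type in present-day D. Everything in it is an assumption for illustration: the class Array, its put and all methods, and the helpers foo and bar are hypothetical names, and the sketch is only meant for value-like element types such as char. Built-in slices in current D still allow appending, so the "no append on T[]" rule from the proposal cannot be shown here.

```d
import std.stdio : writeln;

// Hypothetical growable array that owns its storage. It is a class, so it
// is passed by reference, as in the proposal.
final class Array(T)
{
    private T[] data;

    // Append a single element.
    void put(T value) { data ~= value; }

    // Append a run of elements by copying them into the owned storage.
    void put(const(T)[] values) { data ~= values; }

    // Explicit slice view of the current contents, instead of an implicit cast.
    T[] all() { return data; }

    size_t length() const { return data.length; }
}

// Takes the class by reference: the caller sees the appended text.
void bar(Array!char a) { a.put("text"); }

// Takes a slice: modifies an element in place and reports how many it saw.
size_t foo(char[] a)
{
    a[0] = 'C';
    return a.length;
}

void main()
{
    auto arr = new Array!char;
    arr.put('c');
    arr.put("foo bar");
    immutable seen = foo(arr.all()); // explicit conversion to a slice
    bar(arr);
    writeln(arr.all(), " ", seen, " ", arr.length);
}
```

The explicit all() call marks every place where owned storage is handed out as a plain slice, which is the point of dropping the implicit cast.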
Oct 10 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 10 Oct 2008 17:22:10 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Sergey Gromov wrote:
 My T[] is useful when you want to recursively split a megabyte file  
 into a couple thousands of tokens, and then modify some of those  
 tokens.  For that, your T[] must be lightweight, it must reference a  
 bigger piece of data, and it must guarantee not to write anything into  
 memory outside its boundaries.
  The Array is for appending.  It must always own its memory.  Therefore  
 you should be able to pass it around by reference, so Array is a  
 *class* and cannot be nearly as lightweight as T[].
  You see, many of their properties are orthogonal.  If you drop one,  
 you lose flexibility.

 Besides, Array!(T) is not a good name for a built-in type.


What's wrong with making Array a library type? Andrei

Then, you'll have to drop new T[] syntax in favor of new Array!(T);
Oct 10 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 10 Oct 2008 17:43:42 +0400, Denis Koroskin <2korden gmail.com>  
wrote:

 On Fri, 10 Oct 2008 17:22:10 +0400, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:

 Sergey Gromov wrote:
 My T[] is useful when you want to recursively split a megabyte file  
 into a couple thousands of tokens, and then modify some of those  
 tokens.  For that, your T[] must be lightweight, it must reference a  
 bigger piece of data, and it must guarantee not to write anything into  
 memory outside its boundaries.
  The Array is for appending.  It must always own its memory.   
 Therefore you should be able to pass it around by reference, so Array  
 is a *class* and cannot be nearly as lightweight as T[].
  You see, many of their properties are orthogonal.  If you drop one,  
 you lose flexibility.

 Besides, Array!(T) is not a good name for a built-in type.


What's wrong with making Array a library type? Andrei

Then, you'll have to drop new T[] syntax in favor of new Array!(T);

Not too bad if you stop thinking about T[] as a resizable array:

    T[n] foo;              // an array of fixed size (n)
    T[] bar = array.all(); // fixed-sized, too

It will take time to get used to it, but it seems reasonable to me.
Oct 10 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 10 Oct 2008 18:33:43 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Sergey Gromov wrote:
 Fri, 10 Oct 2008 08:22:10 -0500,
 Andrei Alexandrescu wrote:
 Sergey Gromov wrote:
 My T[] is useful when you want to recursively split a megabyte file  
 into a couple thousands of tokens, and then modify some of those  
 tokens.  For that, your T[] must be lightweight, it must reference a  
 bigger piece of data, and it must guarantee not to write anything  
 into memory outside its boundaries.

 The Array is for appending.  It must always own its memory.   
 Therefore you should be able to pass it around by reference, so Array  
 is a *class* and cannot be nearly as lightweight as T[].

 You see, many of their properties are orthogonal.  If you drop one,  
 you lose flexibility.

 Besides, Array!(T) is not a good name for a built-in type.



new Object[15]; to be immediately appendable and therefore a syntactic sugar for new Array!(Object)(15);

I have a nagging impression the syntax Array!(Object) strikes you as hard on the hand and the eyes...

I think "auto o = new Array!(Object)(15);" is good.
 Anyhow the syntax new Object[15] is idiotic because Object[15] is a type  
 in itself. The syntax makes it next to impossible to actually generate a  
 fixed-sized array dynamically.

I Agree.
 In fact here's a challenge for you. Please generate a pointer to an  
 Object[15] using new.

 I'd also like
   "foo" ~ text ~ "bar"
 to become something like
   (new Array!(char)) ~= "foo" ~= text ~= "bar"
 that is what Java does to string concatenation.  Sugar doesn't seem to  
 couple well with a purely library type.


I doubt that. Java strings are immutable, their length can't be changed either (i.e. they are a direct analog of a proposed invariant(char)[]). Concatenating Java strings doesn't turn them into an array, and it is dead slow. Every sane person uses StringBuilder instead.
  Well, the latter is probably too complex and can cause major  
 problems.  But new T[] should return something appendable.

I'm not 100% sure about that. Andrei

    T[] t = new T[n];

Returning a fixed-sized array seems reasonable to me; we can't resize it through t anyway. OTOH, how often do you create fixed-sized arrays?

Other issue:

    auto f = "foo"; // f is invariant(char)[], i.e. a read-only view. No mutating, no length change
    f = f ~ "bar";  // allow this or not? it *should* be allowed because of CTFE and templates

If this is allowed, then

    f ~= "bar";

should be allowed as well (a shortcut), right? And if so, should it call gc.capacity() prior to allocation or not? I guess not, but this way appending becomes even slower (*much* slower).
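
For reference, here is a small sketch of how the two forms behave in present-day D, where the type is spelled immutable(char)[] (aliased as string) rather than invariant(char)[]. It reflects the current compiler, not the 2008 proposal under discussion, and appendTwice is a made-up helper name used only for illustration.

```d
// Assumption: modern D, where `string` is an alias for immutable(char)[].
// appendTwice is a hypothetical helper, not part of any library.
string appendTwice(string f)
{
    f = f ~ "bar"; // builds a new array; the original characters are untouched
    f ~= "baz";    // shortcut form; rebinds f and may reallocate
    return f;
}

unittest
{
    auto s = "foo";
    static assert(is(typeof(s) == string));
    assert(appendTwice(s) == "foobarbaz");
    assert(s == "foo"); // the caller's read-only view is unchanged
}
```

Neither form mutates the characters the view points at; both only rebind the slice to a new or extended block.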
Oct 10 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 10 Oct 2008 19:19:45 +0400, Sergey Gromov <snake.scaly gmail.com>  
wrote:

 Fri, 10 Oct 2008 19:10:19 +0400,
 Denis Koroskin wrote:
 On Fri, 10 Oct 2008 18:33:43 +0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Sergey Gromov wrote:
 I'd also like
   "foo" ~ text ~ "bar"
 to become something like
   (new Array!(char)) ~= "foo" ~= text ~= "bar"
 that is what Java does to string concatenation.  Sugar doesn't seem  


 couple well with a purely library type.


I doubt that. Java strings are immutable, their length can't be changed either (i.e. they are a direct analog of a proposed invariant(char)[]). Concatenating Java strings doesn't turn them into an array, and it is dead slow. Every sane person uses StringBuilder instead.

Java compiler substitutes "foo" + text + "bar" with new StringBuilder().append("foo").append(text).append("bar").toString() all by itself because both String and StringBuilder are built-in classes.

Yes, you are right. From The Java Language Specification, 3rd edition:
 To increase the performance of repeated string concatenation,
 a Java compiler may use the StringBuffer class or a similar technique
 to reduce the number of intermediate String objects that are created
 by evaluation of an expression.

Oct 10 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 10 Oct 2008 19:37:28 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Denis Koroskin wrote:
 On Fri, 10 Oct 2008 19:19:45 +0400, Sergey Gromov  
 <snake.scaly gmail.com> wrote:

 Fri, 10 Oct 2008 19:10:19 +0400,
 Denis Koroskin wrote:
 On Fri, 10 Oct 2008 18:33:43 +0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Sergey Gromov wrote:
 I'd also like
   "foo" ~ text ~ "bar"
 to become something like
   (new Array!(char)) ~= "foo" ~= text ~= "bar"
 that is what Java does to string concatenation.  Sugar doesn't  


 couple well with a purely library type.


I doubt that. Java strings are immutable, their length can't be changed either (i.e. they are a direct analog of a proposed invariant(char)[]). Concatenating Java strings doesn't turn them into an array, and it is dead slow. Every sane person uses StringBuilder instead.

Java compiler substitutes "foo" + text + "bar" with new StringBuilder().append("foo").append(text).append("bar").toString() all by itself because both String and StringBuilder are built-in classes.

 To increase the performance of repeated string concatenation,
 a Java compiler may use the StringBuffer class or a similar technique
 to reduce the number of intermediate String objects that are created
 by evaluation of an expression.


But that doesn't require StringBuffer to be part of the language. With literals, the compiler can do whatever magic it wants. Andrei

Of course. The C# solution needs consideration too, I believe: it substitutes the equivalent of "foo" ~ text ~ "bar" with String.Concat("foo", text, "bar");
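
To make the idea concrete, this is roughly what such a rewrite could look like if written by hand in today's D with std.array.appender. It is only a sketch: the compiler does not perform this lowering, and the function name lowered is hypothetical.

```d
import std.array : appender;

// Hypothetical hand-written equivalent of "foo" ~ text ~ "bar",
// using one growable buffer instead of intermediate arrays.
string lowered(string text)
{
    auto buf = appender!string();
    buf.put("foo");
    buf.put(text);
    buf.put("bar");
    return buf.data;
}

unittest
{
    assert(lowered("-x-") == "foo" ~ "-x-" ~ "bar");
}
```

Because the buffer type lives in the library, the sketch also illustrates Andrei's point that the builder need not be part of the language itself.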
Oct 10 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 10 Oct 2008 19:54:33 +0400, Benji Smith <dlanguage benjismith.net>  
wrote:

 I think *all* arrays should be declared like this:

     T[] array = new T[n];

You often want to avoid heap allocation at all cost. It can't be done as you propose.
Oct 10 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 10 Oct 2008 21:30:23 +0400, Benji Smith <dlanguage benjismith.net>  
wrote:

 Denis Koroskin wrote:
 On Fri, 10 Oct 2008 19:54:33 +0400, Benji Smith  
 <dlanguage benjismith.net> wrote:

 I think *all* arrays should be declared like this:

     T[] array = new T[n];

 You often want to avoid heap allocation at all cost. It can't be done as you propose.

For all other reference types, allocation on the stack is accomplished with the "scope" keyword, without having a different type, or a different constructor-call syntax. I think the same thing could apply to arrays just as easily. --benji

No, you can't create scoped arrays this way. Assume you have:

    scope T[] array = new T[n];
    array ~= t; // calls realloc(array.ptr);

If ptr points to stack space, then you get an access violation.
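
For comparison, here is a sketch of what appending to a slice of stack memory does in a current D runtime. This is an observation about modern D, not about the compiler discussed in this thread, and appendMovesOffStack is a made-up name.

```d
// Assumption: modern D runtime. A slice over a static (stack) array has no
// spare GC capacity, so ~= copies the data into a fresh heap block instead
// of reallocating the stack memory in place.
bool appendMovesOffStack()
{
    int[4] onStack = [1, 2, 3, 4];
    int[] view = onStack[];      // view over stack memory, no allocation
    view ~= 5;                   // allocates and copies
    return view.ptr != onStack.ptr
        && onStack == [1, 2, 3, 4]
        && view == [1, 2, 3, 4, 5];
}

unittest
{
    assert(appendMovesOffStack());
}
```

The cost is that the view silently stops referring to the original storage after the append, which is exactly the kind of surprise the slice/array split is meant to avoid.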
Oct 10 2008
prev sibling next sibling parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Fri, Oct 10, 2008 at 11:54 AM, Benji Smith <dlanguage benjismith.net> wrote:

 new T[x] is a brain-dead syntax that I wish Walter hadn't imported in the
 first place.

Really? I think it's very valuable. The "new T[x]" syntax lets you construct an array as an RValue. Without that syntax, you have to declare an array before using it.

No, what he's getting at is that "new T[x]" does not mean "allocate a statically-sized array", it means "allocate a dynamically-sized array". "new T" for any T should mean "allocate a T", not "allocate something that's kind of close to a T." What Andrei is implying, then is that for dynamic arrays, we should have to use the (already-legal) "new T[](n)" form, and "new T[x]" would mean to allocate a statically-sized array on the heap.
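
A short sketch of the distinction as it stands in current D, assuming nothing beyond the built-in syntax. The helper names makeDynamic and makeFixed are hypothetical. It also shows one way to answer the earlier challenge of obtaining a pointer to a fixed-size array through new, by allocating a one-element array of int[15].

```d
// new T[](n): explicitly a dynamic array of n elements.
int[] makeDynamic(size_t n)
{
    return new int[](n);
}

// One way to get a heap-allocated int[15]: allocate a dynamic array
// holding a single int[15] and take the address of that element.
int[15]* makeFixed()
{
    auto block = new int[15][1]; // type is int[15][], length 1
    return &block[0];
}

unittest
{
    // In current D, new int[15] still yields a dynamic int[], not int[15].
    static assert(is(typeof(new int[15]) == int[]));
    assert(makeDynamic(15).length == 15);

    auto p = makeFixed();
    static assert(is(typeof(*p) == int[15]));
    assert((*p).length == 15);
}
```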
Oct 10 2008
prev sibling next sibling parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Fri, Oct 10, 2008 at 2:24 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Jarrett Billingsley wrote:
 On Fri, Oct 10, 2008 at 11:54 AM, Benji Smith <dlanguage benjismith.net>
 wrote:

 new T[x] is a brain-dead syntax that I wish Walter hadn't imported in
 the
 first place.

Really? I think it's very valuable. The "new T[x]" syntax lets you construct an array as an RValue. Without that syntax, you have to declare an array before using it.

No, what he's getting at is that "new T[x]" does not mean "allocate a statically-sized array", it means "allocate a dynamically-sized array". "new T" for any T should mean "allocate a T", not "allocate something that's kind of close to a T." What Andrei is implying, then is that for dynamic arrays, we should have to use the (already-legal) "new T[](n)" form, and "new T[x]" would mean to allocate a statically-sized array on the heap.

Well yah but I think this will confuse people coming from C++. I just wish new was abolished entirely: struct S {} auto a = S(); auto b = Object(); auto c = char[](15); auto d = char[15](); So in general Type followed by "(" ...optional arguments... ")" yields a value.

I like that, but then you can't distinguish between heap and stack structs, and static opCall on classes is also taken (though that's not too big a deal).
Oct 10 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 10 Oct 2008 23:23:28 +0400, Benji Smith <dlanguage benjismith.net>  
wrote:

 Jarrett Billingsley wrote:
 On Fri, Oct 10, 2008 at 11:54 AM, Benji Smith  
 <dlanguage benjismith.net> wrote:

 new T[x] is a brain-dead syntax that I wish Walter hadn't imported in  
 the
 first place.

Really? I think it's very valuable. The "new T[x]" syntax lets you construct an array as an RValue. Without that syntax, you have to declare an array before using it.

No, what he's getting at is that "new T[x]" does not mean "allocate a statically-sized array", it means "allocate a dynamically-sized array". "new T" for any T should mean "allocate a T", not "allocate something that's kind of close to a T."

As long as T[3] and T[5] and T[] are considered different types, I agree with that sentiment. But then again, I think array semantics would make a lot more sense if all arrays were of type T[], regardless of their size, their location (stack vs heap), and whether they're static or dynamic. --benji

No prob, get a static array, dynamic array, Vector, Appender, whatever, and take a slice out of it:

    T[] t;
    T[3] t3;
    t = t3[];
    T[4] t4;
    t = t4[];
    auto a = new Array!(T);
    t = a.all; // I would prefer a[] here :)

etc.
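
The same idea can be tried in today's D. In this sketch std.array.appender stands in for the proposed Array!(T), which is an assumption on my part, and sumOf is a hypothetical function that only ever sees a T[] view.

```d
import std.array : appender;

// Hypothetical consumer: it accepts any int[] view, whatever owns the memory.
int sumOf(int[] view)
{
    int total;
    foreach (x; view)
        total += x;
    return total;
}

unittest
{
    int[3] t3 = [1, 2, 3];
    int[4] t4 = [1, 2, 3, 4];

    auto a = appender!(int[])(); // stand-in for the proposed Array!(T)
    a.put(5);
    a.put(6);

    assert(sumOf(t3[]) == 6);    // slice of a static array
    assert(sumOf(t4[]) == 10);   // different static size, same view type
    assert(sumOf(a.data) == 11); // view over the appender's storage
}
```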
Oct 10 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Sat, 11 Oct 2008 10:59:17 +0400, Benji Smith <dlanguage benjismith.net>  
wrote:

 dsimcha wrote:
 == Quote from Benji Smith (dlanguage benjismith.net)'s article
 Anyhow, I'm not going to keep chasing this point. For people new to D,
 the subtle differences between static and dynamic arrays can be a  
 source
 of confusion. I still have my share of gotcha moments with them, and I
 think D would be well served by minimizing those differences.
 --benji

I disagree, not only specifically on this issue but on a more philosophical level about a lot of stuff that's been mentioned here in the past few days about simplifying D. The fact is that D is a performance language that retains the ability to program close to the metal.

Actually, when it comes to string processing, D is decidedly *not* a "performance language". Compared to...say...Java (which gets a bum rap around here for being slow), D is nothing special when it comes to string processing speed.

I've attached a couple of benchmarks, implemented in both Java and D (the "shakespeare.txt" file I'm benchmarking against is from the Gutenberg project. It's about 5 MB, and you can grab it from here: http://www.gutenberg.org/dirs/etext94/shaks12.txt )

In some of those benchmarks, D is slightly faster. In some of them, Java is a lot faster. Overall, on my machine, the D code runs in about 12.5 seconds, and the Java code runs in about 2.5 seconds.

Keep in mind, all Java characters are two bytes wide. And you can't access a character directly. You have to retrieve it from the String object, using the charAt() method. And splitting a string creates a new object for every fragment.

I admire the goal in D to be a performance language, but it drives me crazy when people use performance as justification for an inferior design, when other languages that use the superior design also accomplish superior performance.

--benji

I bet most of the performance is eaten by ~= (both your code and the Tango routines you use rely on it extensively).
Oct 11 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Sat, 11 Oct 2008 18:00:38 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Sergey Gromov wrote:
 Sat, 11 Oct 2008 12:16:43 +0200,
 Sascha Katzner wrote:
 Benji Smith wrote:
 Actually, when it comes to string processing, D is decidedly *not* a  
 "performance language".

 Compared to...say...Java (which gets a bum rap around here for being  
 slow), D is nothing special when it comes to string processing speed.

 I've attached a couple of benchmarks, implemented in both Java and D  
 (the "shakespeare.txt" file I'm benchmarking against is from the  
 Gutenberg project. It's about 5 MB, and you can grab it from here:  
 http://www.gutenberg.org/dirs/etext94/shaks12.txt )

 In some of those benchmarks, D is slightly faster. In some of them,  
 Java is a lot faster. Overall, on my machine, the D code runs in  
 about 12.5 seconds, and the Java code runs in about 2.5 seconds.

 Keep in mind, all java characters are two-bytes wide. And you can't  
 access a character directly. You have to retrieve it from the String  
 object, using the charAt() method. And splitting a string creates a  
 new object for every fragment.

 I admire the goal in D to be a performance language, but it drives me  
 crazy when people use performance as justification for an inferior  
 design, when other languages that use the superior design also  
 accomplish superior performance.

implementation details of Tango (because I don't use Tango) here are some notes:
 - The D version uses UTF8 strings whereas the Java version uses "wanna-be" UTF16 (Java has a lot of problems with surrogates). This means you are comparing apples with pears (D has to *parse* a UTF8 string and Java simply uses a wchar array without proper surrogate handling in *many* cases).

the same *task*, and the task is somewhat close to real world. It measures *time*, which is universal. The compared languages use different approaches and techniques to achieve the goal, that's why benchmark is useful. It allows to justify usefulness of these languages for a particular class of tasks.
 - At least in runCharIterateTest() you also convert the D UTF8 string  
 also additionally into an UTF32 string, in the Java version you did  
 not do this.

much to benchmark. Why don't you mention, for instance, that Java is a virtual machine?
 - The StringBuilder in the Java version is *much* faster because it  
 doesn't have to allocate a new memory block in each step. You can use  
 a similar class in D too, without the need of a special string  
 class/object.

use default array appending which is currently dead slow. Benji, to actually compare the speed of string operations you better use one of array builders discussed in this group.

If anyone wants to try it, I'm pasting the draft version from std.array below.

Andrei

//------------------------------------------------------------------------------
struct Appender(A : T[], T)
{
    private T[] * pArray;
    private size_t _capacity;

    this(T[] * p)
    {
        pArray = p;
        if (!pArray) pArray = (new typeof(*pArray)[1]).ptr;
        _capacity = .capacity(pArray.ptr) / T.sizeof;
    }

    this(this)
    {
        enforce(pArray);
    }

    T[] data()
    {
        return pArray ? *pArray : null;
    }

    size_t capacity() const { return pArray ? pArray.length : 0; }

    void write(T item)
    {
        if (!pArray) pArray = (new typeof(*pArray)[1]).ptr;
        if (pArray.length < _capacity)
        {
            // Should do in-place construction here
            pArray.ptr[pArray.length] = item;
            *pArray = pArray.ptr[0 .. pArray.length + 1];
        }
        else
        {
            // Time to reallocate, do it and cache capacity
            *pArray ~= item;
            _capacity = .capacity(pArray.ptr) / T.sizeof;
        }
    }

    static if (is(const(T) : T))
    {
        alias const(T) AcceptedElementType;
    }
    else
    {
        alias T AcceptedElementType;
    }

    void write(AcceptedElementType[] items)
    {
        for (; !items.empty(); items.next())
        {
            write(items.head());
        }
    }

    static if (is(const(T) == const(char)))
    {
        void write(in wchar wc) { assert(false); }
        void write(in wchar[] wcs) { encode!(T)(wcs, *this); }
        void write(in dchar dc) { assert(false); }
        void write(in dchar[] dcs) { encode!(T)(dcs, *this); }
    }

    void clear()
    {
        if (!pArray) return;
        pArray.length = 0;
        _capacity = .capacity(pArray.ptr) / T.sizeof;
    }
}

auto appender(T)(T[] * t)
{
    Appender!(T[]) r = Appender!(T[])(t);
    return r;
}

unittest
{
    auto arr = new char[0];
    auto app = appender(&arr);
    string b = "abcdefg";
    foreach (char c; b) app.write(c);
    assert(app.data == "abcdefg");
}

Two notes:

1) I thought Appender would have an 'append' method as well as opCatAssign.
2) Shouldn't the following member be called 'size'/'length' instead?

    size_t capacity() const { return pArray ? pArray.length : 0; }

'capacity' would look like

    size_t capacity() const { return _capacity; }
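
The length/capacity distinction in note 2 can be seen with plain built-in arrays in current D. This is a sketch against today's runtime, not against the draft Appender, and lengthAndCapacity is a made-up helper.

```d
// Assumption: modern D, where reserve and capacity are available on
// built-in dynamic arrays. Returns [length, capacity] after reserving.
size_t[2] lengthAndCapacity(size_t n)
{
    int[] a;
    a.reserve(n);                 // asks for room for at least n elements
    return [a.length, a.capacity];
}

unittest
{
    auto r = lengthAndCapacity(100);
    assert(r[0] == 0);    // nothing stored yet
    assert(r[1] >= 100);  // but room for at least 100 appends
}
```

Length counts the elements actually stored, while capacity is how far the array can grow before it has to reallocate, so the two should not share one accessor.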
Oct 11 2008
prev sibling next sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Andrei Alexandrescu wrote:
 Ok, per Aarti's suggestion: without speaking officially for Walter, let 
 me ask this - what do you think are the top issues you'd like to see 
 fixed in D?
 
 Andrei

1. The D runtime issue, so that the standard libraries are both compatible. (seems we are on a good track on this one)

2. Better compiler tools. We keep hearing about limitations in the DMD linker and back-end, and it seems like it's really holding back lots of things (performance, some template features, portability to other platforms, etc.). Maybe LDC will help these, but as a side project, it's not as nice as Walter directly working on it, or something similar. It would also be nice if the bud/rebuild functionality were integrated into a D compiler.

3. Concentrate on cleaning up various language issues, even before you try venturing into complex new features for concurrency and whatnot. *Get the basics right first.* Here are some examples, quoted from Sean because I fully agree:

* "I'd like to see real support for properties or at least more stringent compiler rules regarding the instances where a function can and can't be called without parens. Calling delegates should always require the use of parens, for example. This isn't a show-stopper, but the current situation feels like it requires experimentation to make sure actual behavior matches expected behavior, and I think this wouldn't be an issue with in-language support for properties."
Yes. There will be no peace as long as there are ambiguities in the function call syntax.

* "Dump support for legacy features such as the pre-D1 dual meaning of 'auto'. I'd also like to see support for C-style array and function pointer declarations dropped. These aren't a huge issue to me, but I'd prefer if D didn't have any such easter eggs. Drop foreach_reverse too--it is a well-intended disaster."

* I'll add my own: fix static arrays so that they are regular types, just like any other. Fix forward reference problems, as well as the myriad of other bugs that keep hampering D development.

4. Reform the system for implicit conversions between basic types. Forget about this C legacy, and implement a safer, better system (like the one Don proposed).

5. I know of a lot of other issues, but at least for now can't think of any that stands out more than the others to earn the 5th place. In any case the 4 issues above form a good picture of what I think are the most important issues.

-- 
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Oct 16 2008
prev sibling parent reply Max Samukha <samukha voliacable.com.removethis> writes:
On Wed, 08 Oct 2008 15:07:27 -0500, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

Ok, per Aarti's suggestion: without speaking officially for Walter, let 
me ask this - what do you think are the top issues you'd like to see 
fixed in D?

Andrei

1. CTFE and template bugs

Besides those already mentioned:
Uninitialized arrays cause trouble.
Byref parameters are usable only with basic types.
Associative arrays have never worked.
Problems with structs.
"cannot evaluate at compile time" is not enough for functions with more than a couple of statements!

__traits is buggy and incomplete.

Are there still plans for static foreach?
Oct 28 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
Max Samukha:
 Are there still plans for static foreach?

While you/we wait for the static foreach, you can often do something similar by defining a Range like this:

    template Range(int stop) {
        static if (stop <= 0)
            alias Tuple!() Range;
        else
            alias Tuple!(Range!(stop-1), stop-1) Range;
    }

In my libs you can find a Range!() for 2 and 3 arguments too (stop, start-stop, start-stop-stride). Then you can use it for example like this:

    auto const foo = [x1, x2, x3];
    foreach (i; Range!(foo.length))
        func(x1);

That foreach is static. But I suggest you not use it with a very large number of iterations.

Bye,
bearophile
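
For readers trying this today, here is a sketch of the same recursive Range written against current D, using std.meta.AliasSeq in place of the Tuple template from bearophile's libs (that substitution is my assumption). sumBelow is a hypothetical function added only to exercise the loop.

```d
import std.meta : AliasSeq;

// Compile-time sequence 0, 1, ..., stop-1, built recursively.
template Range(int stop)
{
    static if (stop <= 0)
        alias Range = AliasSeq!();
    else
        alias Range = AliasSeq!(Range!(stop - 1), stop - 1);
}

// Hypothetical example: the foreach below is unrolled at compile time,
// one copy of the body per element of Range!n.
int sumBelow(int n)()
{
    int total;
    foreach (i; Range!n)
        total += i;
    return total;
}

unittest
{
    static assert(Range!3.length == 3);
    assert(sumBelow!4() == 6); // 0 + 1 + 2 + 3
}
```

Since every iteration instantiates another copy of the loop body, the warning about large iteration counts still applies.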
Oct 28 2008
parent reply Max Samukha <samukha voliacable.com.removethis> writes:
On Tue, 28 Oct 2008 07:12:31 -0400, bearophile
<bearophileHUGS lycos.com> wrote:

Max Samukha:
 Are there still plans for static foreach?

While you/we wait for the static foreach, you can often do something similar by defining a Range like this:

    template Range(int stop) {
        static if (stop <= 0)
            alias Tuple!() Range;
        else
            alias Tuple!(Range!(stop-1), stop-1) Range;
    }

In my libs you can find a Range!() for 2 and 3 arguments too (stop, start-stop, start-stop-stride). Then you can use it for example like this:

    auto const foo = [x1, x2, x3];
    foreach (i; Range!(foo.length))
        func(x1);

works:

    void main() {
        alias Tuple!(1, int, "string") t;
        const l = t.length; // have to use temporary because of a bug
        foreach (i; Range!(l)) {
            static if (is(t[i]))
                pragma(msg, t[i].stringof ~ " is type");
            else
                pragma(msg, t[i].stringof ~ " is not type");
        }
    }

The disadvantage is that you can use the trick only in functions
Oct 28 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
Max Samukha:
     alias Tuple!(1, int, "string") t;
     const l = t.length; // have to use temporary because of a bug
     foreach (i; Range!(l)) {
         static if (is(t[i]))

This may be enough (not tested):

    foreach (el; Tuple!(1, int, "string")) {
        static if (is(el))

Bye,
bearophile
Oct 28 2008
parent Max Samukha <samukha voliacable.com.removethis> writes:
On Tue, 28 Oct 2008 13:04:51 -0400, bearophile
<bearophileHUGS lycos.com> wrote:

Max Samukha:
     alias Tuple!(1, int, "string") t;
     const l = t.length; // have to use temporary because of a bug
     foreach (i; Range!(l)) {
         static if (is(t[i]))

This may be enough (not tested):

    foreach (el; Tuple!(1, int, "string")) {
        static if (is(el))

Bye,
bearophile

Sadly, no. It doesn't like the int, and the type of el is always the type of the first element.
Oct 28 2008