digitalmars.D - The demise of T[new]

Walter Bright (15/15) Oct 18 2009 The purpose of T[new] was to solve the problems T[] had with passing T[]...

bearophile (18/23) Oct 18 2009 I like how Andrei keeps trying to invent new possible ideas.
Denis Koroskin (16/31) Oct 18 2009 Well, I personally don't feel much of a need for T[new], and I agree T[]...

Andrei Alexandrescu (7/50) Oct 18 2009 That was the exact plan. I even have half a chapter written about it

Denis Koroskin (13/57) Oct 18 2009 What's problem with implicit cast of T[new] to T[]?

grauzone (9/12) Oct 18 2009 Whether T[new] is bultin (mostly implemented in the runtime), or a

Andrei Alexandrescu (3/8) Oct 18 2009 Can't increment length if it's read-only.

Jason House (2/12) Oct 18 2009 removing syntactic sugar doesn't really remove corner cases. T[new] is b...

Andrei Alexandrescu (5/22) Oct 18 2009 I think ArrayBuilder could a logic similar to Appender for "~=" and

dsimcha (6/28) Oct 18 2009 So basically, ArrayBuilder would be like T[new], except:

Andrei Alexandrescu (3/33) Oct 18 2009 Yes, that's the plan. I hope you're not setting me up or something :o).

Jason House (2/37) Oct 18 2009 I was ;) Well, only sort of. I'm undecided if array-like syntactic sugar...
dsimcha (15/48) Oct 18 2009 Given the issues surrounding T[new] and how ugly some cases of the impli...

dsimcha (53/68) Oct 18 2009 This is ridiculous. The status quo works well most of the time and just...

dsimcha (5/11) Oct 18 2009 Scratch this one. It breaks on the following:
Andrei Alexandrescu (15/59) Oct 18 2009 Check.

dsimcha (32/71) Oct 18 2009 So then why is slices not being closed over .dup, ~, etc. still a proble...

Andrei Alexandrescu (58/140) Oct 18 2009 Oh yah. Check that sucker.

dsimcha (62/123) Oct 18 2009 Ok, now I can see how this would be a legitimate problem, but only in a ...

Walter Bright (25/28) Oct 18 2009 We'd like to get D to the point where guarantees can be offered. This

Denis Koroskin (17/104) Oct 19 2009 Could you please bring any example? You talk a lot about corner cases an...

dsimcha (26/33) Oct 19 2009 This discussion originated in the T[new] thread, but I think it deserves...

Denis Koroskin (20/62) Oct 19 2009 It was suggested before and IIRC Walter said it is next to impossible to...

Don (14/70) Oct 19 2009 I had a bit of an attempt at it. version(ctfe) seems to be nearly

dsimcha (4/74) Oct 19 2009 So this magic bool would be like a regular if statement in that it would...

Don (8/81) Oct 19 2009 It'd just be a variable, which the CTFE interpreter would evaluate as

Andrei Alexandrescu (13/52) Oct 19 2009 version(ctfe) would be helpful in many other situations (e.g.

Leandro Lucarella (15/22) Oct 19 2009 That would mean moving the overhead of arrays to the GC, but that overhe...

Andrei Alexandrescu (6/22) Oct 19 2009 I forgot to mention that ~= would also have a means to communicate the

dsimcha (5/21) Oct 19 2009 This is the part of this thread I'm not understanding: Isn't this info ...

Andrei Alexandrescu (5/26) Oct 19 2009 Sorry, I got confused a bit: you have the size in the array and you can

Fawzi Mohamed (15/45) Oct 19 2009 The problem with the current state, apart slow access to the capacity,

Leandro Lucarella (17/35) Oct 19 2009 The control block is not part of the object's memory, so I don't think

Leandro Lucarella (76/101) Oct 19 2009 I think this is getting overcomplicated. I don't see it as complex, I se...

Denis Koroskin (10/106) Oct 19 2009 That's what was initially planned (except some nuances).

Don (4/136) Oct 19 2009 Don't worry. It's not hugely difficult to implement, though it'll be a

Leandro Lucarella (16/17) Oct 19 2009 To avoid making the language aware of the array type, slice.dup can be

Leandro Lucarella (14/27) Oct 19 2009 Unless you want to have an appendable array type in CTFE. I guess the

Andrei Alexandrescu (8/62) Oct 19 2009 At the point I introduce arrays I hadn't described classes.

Leandro Lucarella (25/90) Oct 19 2009 You are talking about the book, right? You don't have to! You can explai...

Andrei Alexandrescu (13/60) Oct 19 2009 I thought I explained that above and in several other posts. I have the

Johan Granberg (8/74) Oct 19 2009 I think you are seeing a larger problem than their is. But consider this...

Andrei Alexandrescu (6/11) Oct 19 2009 It's not the nice syntax that's hard to explain.

Denis Koroskin (9/19) Oct 19 2009 Put it simple: T[] is a range, and T[new] is a container. They belong to...

Andrei Alexandrescu (8/31) Oct 19 2009 No. You see, even you who have a close perspective on the issue have

Fawzi Mohamed (14/27) Oct 19 2009 unless I miss something

Leandro Lucarella (27/44) Oct 19 2009 If T[new] is "array" and T[] is "slice" as I described them, I see no

Andrei Alexandrescu (6/41) Oct 19 2009 Python lists and slices are quite different from D arrays.

Leandro Lucarella (14/50) Oct 19 2009 Not the API.

Andrei Alexandrescu (4/18) Oct 19 2009 At least in that case I had an objective point! I'd agree many others

Chris Nicholson-Sauls (9/58) Oct 19 2009 I would argue that, if they truly *needed* a slice and are relying on "a...

Andrei Alexandrescu (8/23) Oct 19 2009 I understand that, but I also believe it's a symptom of a problem

Yigal Chripun (3/97) Oct 19 2009 Vote += 1_000_000;

downs (7/27) Oct 18 2009 I think ArrayBuilder can work almost entirely transparently provided the...

dsimcha (8/35) Oct 18 2009 ArrayBuilder

downs (2/38) Oct 18 2009 Yes but changing the semantics of a library object is easier than changi...

Ary Borenszweig (4/24) Oct 19 2009 I remember seeing a lot of CTFE code that created a dynamic array and

Walter Bright (6/9) Oct 19 2009 Probably not. But you can rewrite:

Bill Baxter (5/15) Oct 19 2009 en
downs (2/16) Oct 19 2009 Is there any reason the first can't be a short-hand for the second?

dsimcha (3/19) Oct 19 2009 Devil's advocate because I somewhat agree with you: a = a ~ stuff; is s...

Michel Fortin (6/22) Oct 19 2009 I'd call that a premature optimization of the programmer's behavior. :-)

Kagamin (3/8) Oct 19 2009 You write it as if you can do it with T[new].
Yigal Chripun (11/31) Oct 19 2009 I think that Arrays and AAs need to be removed from the language.

BCS (4/8) Oct 19 2009 If it can be assured that all the current array ops will get inlined is ...

Yigal Chripun (9/18) Oct 19 2009 See Leandro Lucarella's posts which describe the same idea.

BCS (9/38) Oct 19 2009 // a is an array of ints

Yigal Chripun (4/42) Oct 19 2009 I don't see a reason why this can't be inlined by the compiler if Array

BCS (8/23) Oct 19 2009 Yes, it should inline. What I'm saying is that unless it is /guaranteed/...

gzp (63/66) Oct 19 2009 I've tried to follow the T[], T[new], T+slice discussion, but i'm lost,
Don (4/24) Oct 19 2009 Since noone else seems to have said it: The fact that you're both

Andrei Alexandrescu (6/31) Oct 19 2009 I'm relieved that somebody mentioned that :o). As soon as we gave up

Leandro Lucarella (10/11) Oct 19 2009 Keep throwing the good news :(
Denis Koroskin (4/34) Oct 19 2009 I *really* hope you told it so that we encourage actually implementing i...

Ary Borenszweig (3/43) Oct 19 2009 Not if the ones that implement it didn't use attributes before or don't

grauzone (3/36) Oct 19 2009 Yeah, let's just keep the language in the broken state it is, because we...

Andrei Alexandrescu (3/39) Oct 19 2009 Silly me, I was thinking the humor was all too obvious.

grauzone (2/42) Oct 19 2009

Piotrek (8/12) Oct 19 2009 Hey Don, you speak my words. Specially that I can't see good reason for

Michel Fortin (12/32) Oct 19 2009 I never liked T[new] much from the beginning, so good riddance. :-)
Steven Schveighoffer (20/35) Oct 20 2009 At the risk of sounding like bearophile -- I've proposed 2 solutions in ...

Bill Baxter (23/58) Oct 20 2009 he

Steven Schveighoffer (11/76) Oct 20 2009 Having the append operation extend into already allocated memory is an

grauzone (6/9) Oct 20 2009 Would that work with Andrei's append cache at all? Setting the length to...

Andrei Alexandrescu (6/17) Oct 20 2009 For my money, I'd get rid of that trick:

Denis Koroskin (4/19) Oct 20 2009 I agree it's ugly but that's the best we have in D, and it looks like
grauzone (5/25) Oct 20 2009 Yes, that's what we currently use for setting the capacity of an array.

Bill Baxter (34/110) Oct 20 2009 :

Andrei Alexandrescu (3/10) Oct 20 2009 I guess I'll try and let you all know.

Bill Baxter (26/36) Oct 20 2009 I can also see this becoming an Effective D tip --

Steven Schveighoffer (30/146) Oct 20 2009 or

Andrei Alexandrescu (3/5) Oct 20 2009 I think that's an entirely reasonable (and easy to explain) stance.

Bill Baxter (7/12) Oct 20 2009 I've definitely spent time tracking down exactly such bugs, where I

Steven Schveighoffer (24/37) Oct 20 2009 Except when you are passing ownership of an array. Basically, there are...

Stewart Gordon (18/22) Oct 30 2009 I've just caught up on this thread. Built-in dynamic arrays have been

Walter Bright <newshound1 digitalmars.com> writes:

The purpose of T[new] was to solve the problems T[] had with passing T[] 
to a function and then the function resizes the T[]. What happens with 
the original?

The solution we came up with was to create a third array type, T[new], 
which was a reference type.

Andrei had the idea that T[new] could be dispensed with by making a 
"builder" library type to handle creating arrays by doing things like 
appending, and then delivering a finished T[] type. This is similar to 
what std.outbuffer and std.array.Appender do, they just need a bit of 
refining.

The .length property of T[] would then become an rvalue only, not an 
lvalue, and ~= would no longer be allowed for T[].

We both feel that this would simplify D, make it more flexible, and 
remove some awkward corner cases like the inability to say a.length++.

What do you think?

Oct 18 2009

bearophile <bearophileHUGS lycos.com> writes:

Walter Bright:

 Andrei had the idea that T[new] could be dispensed with by making a 
 "builder" library type to handle creating arrays by doing things like 
 appending, and then delivering a finished T[] type. This is similar to 
 what std.outbuffer and std.array.Appender do, they just need a bit of 
 refining.

I like how Andrei keeps trying to invent new possible ideas.
Sometimes situations can be improved adding things, and some other times
removing things.
So arrays will keep being kinda values, so if inside a function you use an
appender to add items to an array, outside the function, when the function
returns, you will keep seeing the original shorter array. This is not fully
intuitive (but you can get used to it just because you use arrays all the time).

The following code is not possible any more because you can't change the length:

import std.stdio: writefln;
void main() {
    int[] a1 = new int[5];
    a1[] = 1;
    int[] a2 = a1[1..3];
    writefln(a1, " ", a2); // [1,1,1,1,1] [1,1]
    a2.length = a2.length + 4;
    writefln(a1, " ", a2); // [1,1,1,1,1] [1,1,0,0,0,0]
}

But how do you change the length of an array (because appending many single
items isn't always what you need to do)?
And how does that interacts with the slicing, as in this code example?

Bye,
bearophile

Oct 18 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Mon, 19 Oct 2009 01:05:39 +0400, Walter Bright  
<newshound1 digitalmars.com> wrote:

 The purpose of T[new] was to solve the problems T[] had with passing T[]  
 to a function and then the function resizes the T[]. What happens with  
 the original?

 The solution we came up with was to create a third array type, T[new],  
 which was a reference type.

 Andrei had the idea that T[new] could be dispensed with by making a  
 "builder" library type to handle creating arrays by doing things like  
 appending, and then delivering a finished T[] type. This is similar to  
 what std.outbuffer and std.array.Appender do, they just need a bit of  
 refining.

 The .length property of T[] would then become an rvalue only, not an  
 lvalue, and ~= would no longer be allowed for T[].

 We both feel that this would simplify D, make it more flexible, and  
 remove some awkward corner cases like the inability to say a.length++.

 What do you think?

Well, I personally don't feel much of a need for T[new], and I agree T[]  
needs to be "fixed" the way you describe (i.e. make a shrink-only range  
out of it).
But I believe a change like this coupled with a demise of T[new] would be  
way too restricting. Given a

T[] array;

What type would result of these operations be?

1) auto x = array.dup;
2) auto y = array ~ array;

T[] is a view into someone else's container. But when you create a new  
array (by dup'ing or concatenating), you get a fresh copy. It points to a  
beginning of data sequence, and there is nothing wrong to expand it.

I believe it should be T[new]. A reference type, yes, a type that would  
allow appending and would render Appender unnecessary.

Oct 18 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Denis Koroskin wrote:
 On Mon, 19 Oct 2009 01:05:39 +0400, Walter Bright 
 <newshound1 digitalmars.com> wrote:
 
 The purpose of T[new] was to solve the problems T[] had with passing 
 T[] to a function and then the function resizes the T[]. What happens 
 with the original?

 The solution we came up with was to create a third array type, T[new], 
 which was a reference type.

 Andrei had the idea that T[new] could be dispensed with by making a 
 "builder" library type to handle creating arrays by doing things like 
 appending, and then delivering a finished T[] type. This is similar to 
 what std.outbuffer and std.array.Appender do, they just need a bit of 
 refining.

 The .length property of T[] would then become an rvalue only, not an 
 lvalue, and ~= would no longer be allowed for T[].

 We both feel that this would simplify D, make it more flexible, and 
 remove some awkward corner cases like the inability to say a.length++.

 What do you think?

 
 Well, I personally don't feel much of a need for T[new], and I agree T[] 
 needs to be "fixed" the way you describe (i.e. make a shrink-only range 
 out of it).
 But I believe a change like this coupled with a demise of T[new] would 
 be way too restricting. Given a
 
 T[] array;
 
 What type would result of these operations be?
 
 1) auto x = array.dup;
 2) auto y = array ~ array;
 
 T[] is a view into someone else's container. But when you create a new 
 array (by dup'ing or concatenating), you get a fresh copy. It points to 
 a beginning of data sequence, and there is nothing wrong to expand it.
 
 I believe it should be T[new]. A reference type, yes, a type that would 
 allow appending and would render Appender unnecessary.

That was the exact plan. I even have half a chapter written about it 
with nice figures and all, that I can make available. The problem was, 
it sucked. Returning a distinct type from .dup and ~ makes slices not 
closed over these operations, a source of complication, confusion, and 
bloating.

Andrei

Oct 18 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Mon, 19 Oct 2009 01:55:28 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Denis Koroskin wrote:
 On Mon, 19 Oct 2009 01:05:39 +0400, Walter Bright  
 <newshound1 digitalmars.com> wrote:

 The purpose of T[new] was to solve the problems T[] had with passing  
 T[] to a function and then the function resizes the T[]. What happens  
 with the original?

 The solution we came up with was to create a third array type, T[new],  
 which was a reference type.

 Andrei had the idea that T[new] could be dispensed with by making a  
 "builder" library type to handle creating arrays by doing things like  
 appending, and then delivering a finished T[] type. This is similar to  
 what std.outbuffer and std.array.Appender do, they just need a bit of  
 refining.

 The .length property of T[] would then become an rvalue only, not an  
 lvalue, and ~= would no longer be allowed for T[].

 We both feel that this would simplify D, make it more flexible, and  
 remove some awkward corner cases like the inability to say a.length++.

 What do you think?

  Well, I personally don't feel much of a need for T[new], and I agree  
 T[] needs to be "fixed" the way you describe (i.e. make a shrink-only  
 range out of it).
 But I believe a change like this coupled with a demise of T[new] would  
 be way too restricting. Given a
  T[] array;
  What type would result of these operations be?
  1) auto x = array.dup;
 2) auto y = array ~ array;
  T[] is a view into someone else's container. But when you create a new  
 array (by dup'ing or concatenating), you get a fresh copy. It points to  
 a beginning of data sequence, and there is nothing wrong to expand it.
  I believe it should be T[new]. A reference type, yes, a type that  
 would allow appending and would render Appender unnecessary.

 That was the exact plan. I even have half a chapter written about it  
 with nice figures and all, that I can make available. The problem was,  
 it sucked. Returning a distinct type from .dup and ~ makes slices not  
 closed over these operations, a source of complication, confusion, and  
 bloating.

 Andrei

What's problem with implicit cast of T[new] to T[]?

T[new] container = [1, 2, 3].dup;
T[] range1 = container; // implicit container[] call via alias this
T[] range2 = [1, 2, 3].dup; // same here

An other option would be to declare

// duplicates a range
T[] rdup(T[] range)
{
     return range.dup[];
}

but it's less likely to happen.

Oct 18 2009

grauzone <none example.net> writes:

Walter Bright wrote:
 We both feel that this would simplify D, make it more flexible, and 
 remove some awkward corner cases like the inability to say a.length++.

How would this remove this corner case?

 What do you think?

Whether T[new] is bultin (mostly implemented in the runtime), or a 
library type (implemented in Phobos) doesn't make much a difference; 
it's pretty much the same, isn't it? I don't understand what the exact 
reasons for this decision.

If anything, the array type should be available without additional 
Phobos imports (i.e. it should be declared in object.d). Ease of access 
and ease of use is the key thing here.

Oct 18 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

grauzone wrote:
 Walter Bright wrote:
 We both feel that this would simplify D, make it more flexible, and 
 remove some awkward corner cases like the inability to say a.length++.

 
 How would this remove this corner case?

Can't increment length if it's read-only.

Andrei

Oct 18 2009

Jason House <jason.james.house gmail.com> writes:

Andrei Alexandrescu Wrote:

 grauzone wrote:
 Walter Bright wrote:
 We both feel that this would simplify D, make it more flexible, and 
 remove some awkward corner cases like the inability to say a.length++.

 
 How would this remove this corner case?

 
 Can't increment length if it's read-only.
 
 Andrei

removing syntactic sugar doesn't really remove corner cases. T[new] is being
replaced with an array builder. What usage semantics are given up with an array
builder? How will implicit conversions be handled? (alias this, etc...). How
will the Phobos array buiilder support user specified lengths (for efficiency)?

Oct 18 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Jason House wrote:
 Andrei Alexandrescu Wrote:
 
 grauzone wrote:
 Walter Bright wrote:
 We both feel that this would simplify D, make it more flexible,
 and remove some awkward corner cases like the inability to say
 a.length++.

 How would this remove this corner case?

 Can't increment length if it's read-only.
 
 Andrei

 
 removing syntactic sugar doesn't really remove corner cases. T[new]
 is being replaced with an array builder. What usage semantics are
 given up with an array builder? How will implicit conversions be
 handled? (alias this, etc...). How will the Phobos array buiilder
 support user specified lengths (for efficiency)?

I think ArrayBuilder could a logic similar to Appender for "~=" and 
otherwise offer regular array primitives. I don't think implicit 
conversion to T[] is an essential feature, but it can be done.

Andrei

Oct 18 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Jason House wrote:
 Andrei Alexandrescu Wrote:

 grauzone wrote:
 Walter Bright wrote:
 We both feel that this would simplify D, make it more flexible,
 and remove some awkward corner cases like the inability to say
 a.length++.

 How would this remove this corner case?

 Can't increment length if it's read-only.

 Andrei

 removing syntactic sugar doesn't really remove corner cases. T[new]
 is being replaced with an array builder. What usage semantics are
 given up with an array builder? How will implicit conversions be
 handled? (alias this, etc...). How will the Phobos array buiilder
 support user specified lengths (for efficiency)?

 I think ArrayBuilder could a logic similar to Appender for "~=" and
 otherwise offer regular array primitives. I don't think implicit
 conversion to T[] is an essential feature, but it can be done.
 Andrei

So basically, ArrayBuilder would be like T[new], except:

1.  It would be a plain old library type, and the core language would know
nothing
about it.
2.  T[].dup, T[] ~ T[], new T[3], etc. would return T[], not
ArrayBuilder/T[new].

Is this basically correct?

Oct 18 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Jason House wrote:
 Andrei Alexandrescu Wrote:

 grauzone wrote:
 Walter Bright wrote:
 We both feel that this would simplify D, make it more flexible,
 and remove some awkward corner cases like the inability to say
 a.length++.

 How would this remove this corner case?

 Can't increment length if it's read-only.

 Andrei

 removing syntactic sugar doesn't really remove corner cases. T[new]
 is being replaced with an array builder. What usage semantics are
 given up with an array builder? How will implicit conversions be
 handled? (alias this, etc...). How will the Phobos array buiilder
 support user specified lengths (for efficiency)?

 I think ArrayBuilder could a logic similar to Appender for "~=" and
 otherwise offer regular array primitives. I don't think implicit
 conversion to T[] is an essential feature, but it can be done.
 Andrei

 
 So basically, ArrayBuilder would be like T[new], except:
 
 1.  It would be a plain old library type, and the core language would know
nothing
 about it.
 2.  T[].dup, T[] ~ T[], new T[3], etc. would return T[], not
ArrayBuilder/T[new].
 
 Is this basically correct?

Yes, that's the plan. I hope you're not setting me up or something :o).

Andrei

Oct 18 2009

Jason House <jason.james.house gmail.com> writes:

Andrei Alexandrescu Wrote:

 dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Jason House wrote:
 Andrei Alexandrescu Wrote:

 grauzone wrote:
 Walter Bright wrote:
 We both feel that this would simplify D, make it more flexible,
 and remove some awkward corner cases like the inability to say
 a.length++.

 How would this remove this corner case?

 Can't increment length if it's read-only.

 Andrei

 removing syntactic sugar doesn't really remove corner cases. T[new]
 is being replaced with an array builder. What usage semantics are
 given up with an array builder? How will implicit conversions be
 handled? (alias this, etc...). How will the Phobos array buiilder
 support user specified lengths (for efficiency)?

 I think ArrayBuilder could a logic similar to Appender for "~=" and
 otherwise offer regular array primitives. I don't think implicit
 conversion to T[] is an essential feature, but it can be done.
 Andrei

 
 So basically, ArrayBuilder would be like T[new], except:
 
 1.  It would be a plain old library type, and the core language would know
nothing
 about it.
 2.  T[].dup, T[] ~ T[], new T[3], etc. would return T[], not
ArrayBuilder/T[new].
 
 Is this basically correct?

 
 Yes, that's the plan. I hope you're not setting me up or something :o).
 
 Andrei

I was ;) Well, only sort of. I'm undecided if array-like syntactic sugar makes
sense for array builder. Afterall, AA's fit into a similar category. I did
notice you wanted to make array builder a struct which implies it won't be used
in the same manner as T[new]

Oct 18 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Jason House wrote:
 Andrei Alexandrescu Wrote:

 grauzone wrote:
 Walter Bright wrote:
 We both feel that this would simplify D, make it more flexible,
 and remove some awkward corner cases like the inability to say
 a.length++.

 How would this remove this corner case?

 Can't increment length if it's read-only.

 Andrei

 removing syntactic sugar doesn't really remove corner cases. T[new]
 is being replaced with an array builder. What usage semantics are
 given up with an array builder? How will implicit conversions be
 handled? (alias this, etc...). How will the Phobos array buiilder
 support user specified lengths (for efficiency)?

 I think ArrayBuilder could a logic similar to Appender for "~=" and
 otherwise offer regular array primitives. I don't think implicit
 conversion to T[] is an essential feature, but it can be done.
 Andrei

 So basically, ArrayBuilder would be like T[new], except:

 1.  It would be a plain old library type, and the core language would know
nothing
 about it.
 2.  T[].dup, T[] ~ T[], new T[3], etc. would return T[], not
ArrayBuilder/T[new].

 Is this basically correct?

 Yes, that's the plan. I hope you're not setting me up or something :o).
 Andrei

Given the issues surrounding T[new] and how ugly some cases of the implicit
conversion can get, you've convinced me that this may be a good solution as
long as:

1.  ArrayBuilder implements full array semantics and is usable as a "real" array
as you suggest.  I wasn't aware that this was what was being proposed, as
ArrayBuilder implied to me that it's only good for building arrays as opposed to
being a full-fledged array library type.  Ideally, for convenience it should be
implicitly convertible to T[].
2.  The semantics are such that bug 2093
(http://d.puremagic.com/issues/show_bug.cgi?id=2093) is fixed, while still
allowing appending to arrays of immutables.  This would probably require some
casting under the hood, but the standard lib. is only one step above the core
language, so it's acceptable.  The semantics of the length property would
probably
have to be reference semantics, any block of array memory would have to be owned
exclusively by one ArrayBuilder, etc. just like with T[new].

Oct 18 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Walter Bright (newshound1 digitalmars.com)'s article
 The purpose of T[new] was to solve the problems T[] had with passing T[]
 to a function and then the function resizes the T[]. What happens with
 the original?
 The solution we came up with was to create a third array type, T[new],
 which was a reference type.
 Andrei had the idea that T[new] could be dispensed with by making a
 "builder" library type to handle creating arrays by doing things like
 appending, and then delivering a finished T[] type. This is similar to
 what std.outbuffer and std.array.Appender do, they just need a bit of
 refining.
 The .length property of T[] would then become an rvalue only, not an
 lvalue, and ~= would no longer be allowed for T[].
 We both feel that this would simplify D, make it more flexible, and
 remove some awkward corner cases like the inability to say a.length++.
 What do you think?

This is ridiculous.  The status quo works well most of the time and just has a
few
really ugly corner cases.  As long as you're only appending to one array at a
time, appending isn't even that slow anymore now that bug 2900
(http://d.puremagic.com/issues/show_bug.cgi?id=2900) is fixed.

I frankly think it's absurd to make working with arrays an order of magnitude
harder and less elegant in the 90+% of cases where they work fine just to fix a
few corner case bugs.  Don't get me wrong, the corner case bugs should be fixed
because they're pretty nasty safety issues.  They just shouldn't be fixed in a
way
that makes arrays substantially harder to use, or even syntactically uglier, in
the cases where they already work well.  As good a programmer as Andrei is, I'm
sure whatever he comes up with will be much less syntactially pleasing and easy
to
use than something the core language understands.

Here's my proposal for how T[new] should work:

1.  It should be a reference type to be consistent with slices.  Yes, slices are
kind of a hybrid, but they're more semantically similar to reference types than
value types.  If you can't modify the length of a slice anymore, then for all
practical purposes it will be a reference type.

2.  A T[new] should support all the same operations as a T[] with semantics as
similar as common sense will allow, including indexing, ~, .dup, .idup, slice
assign, etc.  Basically, it should have, to the greatest degree possible without
defeating the purpose, the same compile time interface.

3.  A T[new] should be implicitly convertible to a slice.  For example:

auto foo = someFunctionThatReturnsTnew();
// foo is a T[new].
T[] bar = someFunctionThatReturnsTnew();
// Works.  bar is a T[].  The T[new] went into oblivion.

This solves the problem of slices not being closed over .dup and ~.

4.  It should be guaranteed that no block of memory is ever referenced by more
than one T[new] instance.  This is needed to guarantee safety when appending to
immutable arrays, etc.

5.  Assigning a T[new] to another T[new] should be by reference, just like
assigning a class instance to another class instance.  Assigning a T[] to a
T[new]
should duplicate the memory block referenced by the T[] because this is probably
the only way to guarantee (4).

6.  Since T[new] guarantees unique access to a memory block, it should have an
assumeUnique() method that returns an immutable slice and sets the T[new]'s
reference to the memory block to null.  This solves the problem of building
immutable arrays without the performance penalty of not being able to
pre-allocate
or the unsafeness of having to cowboy cast it to immutable.

7.  As long as the GC is conservative, there absolutely *must* be a method of
manually freeing the memory block referenced by a T[new] provided that the GC
supports this operation, though it doesn't have to be particularly pretty.  In
general, since D is a systems language, T[new] should not be too opaque.  A good
way to do this might be to make all of the fields of the T[new] public but
undocumented.  If you *really* want to mess with it, you'll read the source code
and figure it out.

8.  The first call to opSlice on a T[new] should set a flag that indicates that
there may be multiple pointers to the underlying memory block.  Before that flag
is set, appends to a T[new] should result in calls to GC.free() to free the old
block whenever it needs to be expanded (since we can guarantee that we own it
exclusively).  This will help deal with false pointer issues, since D's GC looks
like it will remain conservative for the foreseeable future.

Oct 18 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from dsimcha (dsimcha yahoo.com)'s article
 8.  The first call to opSlice on a T[new] should set a flag that indicates that
 there may be multiple pointers to the underlying memory block.  Before that
flag
 is set, appends to a T[new] should result in calls to GC.free() to free the old
 block whenever it needs to be expanded (since we can guarantee that we own it
 exclusively).  This will help deal with false pointer issues, since D's GC
looks
 like it will remain conservative for the foreseeable future.

Scratch this one.  It breaks on the following:

T[new] stuff = somethingThatReturnsTnew();
T* ptrToStuffMemory = &(stuff[8]);
stuff ~= otherStuff;  // May delete memory referred to by ptrToStuffMemory.

Oct 18 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

dsimcha wrote:
 Here's my proposal for how T[new] should work:
 
 1.  It should be a reference type to be consistent with slices.  Yes, slices
are
 kind of a hybrid, but they're more semantically similar to reference types than
 value types.  If you can't modify the length of a slice anymore, then for all
 practical purposes it will be a reference type.

Check.

 2.  A T[new] should support all the same operations as a T[] with semantics as
 similar as common sense will allow, including indexing, ~, .dup, .idup, slice
 assign, etc.  Basically, it should have, to the greatest degree possible
without
 defeating the purpose, the same compile time interface.

Check.

 3.  A T[new] should be implicitly convertible to a slice.  For example:
 
 auto foo = someFunctionThatReturnsTnew();
 // foo is a T[new].
 T[] bar = someFunctionThatReturnsTnew();
 // Works.  bar is a T[].  The T[new] went into oblivion.
 
 This solves the problem of slices not being closed over .dup and ~.

Check.

 4.  It should be guaranteed that no block of memory is ever referenced by more
 than one T[new] instance.  This is needed to guarantee safety when appending to
 immutable arrays, etc.

That doesn't go with reference semantics. Uncheck.

 5.  Assigning a T[new] to another T[new] should be by reference, just like
 assigning a class instance to another class instance.

Check. (BTW contradicts 4)

 Assigning a T[] to a T[new]
 should duplicate the memory block referenced by the T[] because this is
probably
 the only way to guarantee (4).

No check, but could have been done.

 6.  Since T[new] guarantees unique access to a memory block, it should have an
 assumeUnique() method that returns an immutable slice and sets the T[new]'s
 reference to the memory block to null.  This solves the problem of building
 immutable arrays without the performance penalty of not being able to
pre-allocate
 or the unsafeness of having to cowboy cast it to immutable.

Uncheck.

 7.  As long as the GC is conservative, there absolutely *must* be a method of
 manually freeing the memory block referenced by a T[new] provided that the GC
 supports this operation, though it doesn't have to be particularly pretty.  In
 general, since D is a systems language, T[new] should not be too opaque.  A
good
 way to do this might be to make all of the fields of the T[new] public but
 undocumented.  If you *really* want to mess with it, you'll read the source
code
 and figure it out.

Check while delete still exists. Please use malloc for that stuff.

 8.  The first call to opSlice on a T[new] should set a flag that indicates that
 there may be multiple pointers to the underlying memory block.  Before that
flag
 is set, appends to a T[new] should result in calls to GC.free() to free the old
 block whenever it needs to be expanded (since we can guarantee that we own it
 exclusively).  This will help deal with false pointer issues, since D's GC
looks
 like it will remain conservative for the foreseeable future.

Uncheck.

So now: every place I've said "check" means there was implementation and 
book-quality illustrated documentation that Walter and I have done with 
the sweat of our brow. At the end we looked at the result and concluded 
we should throw away all that work.


Andrei

Oct 18 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 3.  A T[new] should be implicitly convertible to a slice.  For example:

 auto foo = someFunctionThatReturnsTnew();
 // foo is a T[new].
 T[] bar = someFunctionThatReturnsTnew();
 // Works.  bar is a T[].  The T[new] went into oblivion.

 This solves the problem of slices not being closed over .dup and ~.

 Check.

So then why is slices not being closed over .dup, ~, etc. still a problem?  With
implicit conversion, they for all practical purposes are.

 4.  It should be guaranteed that no block of memory is ever referenced by more
 than one T[new] instance.  This is needed to guarantee safety when appending to
 immutable arrays, etc.

 That doesn't go with reference semantics. Uncheck.
 5.  Assigning a T[new] to another T[new] should be by reference, just like
 assigning a class instance to another class instance.

 Check. (BTW contradicts 4)

There was a slight misunderstanding here.  When I say an instance, I mean one
T[new] heap object == one instance, no matter how many pointers/references to
this
instance you have.  The assumption is that T[new] objects live entirely on the
heap and have semantics similar to classes.  For example:

uint[new] foo = new uint[100];
uint[new] bar = foo;  // Really just a pointer assignment.

Now, both foo and bar point to the same uint[new] instance.  If anything is
modified via foo, it is seen when viewed from bar and vice-versa.

To clarify, no *main block of memory that holds the data in the array* is ever
referenced by more than one T[new] *heap object*.

 Assigning a T[] to a T[new]
 should duplicate the memory block referenced by the T[] because this is
probably
 the only way to guarantee (4).

 No check, but could have been done.

Actually, this can even be done by COW.  Initially, assigning a T[] to a T[new]
could just be a pointer assignment and setting a flag, as long as a copy of the
data is made when you try to increase the length of the T[new] or append to it.
Basically, as long as you use the T[new] as if it were a T[], no copying needs
to
be done.

 6.  Since T[new] guarantees unique access to a memory block, it should have an
 assumeUnique() method that returns an immutable slice and sets the T[new]'s
 reference to the memory block to null.  This solves the problem of building
 immutable arrays without the performance penalty of not being able to
pre-allocate
 or the unsafeness of having to cowboy cast it to immutable.

 Uncheck.

Now that I've explained my uniqueness thing better, does this sound feasible?

 7.  As long as the GC is conservative, there absolutely *must* be a method of
 manually freeing the memory block referenced by a T[new] provided that the GC
 supports this operation, though it doesn't have to be particularly pretty.  In
 general, since D is a systems language, T[new] should not be too opaque.  A
good
 way to do this might be to make all of the fields of the T[new] public but
 undocumented.  If you *really* want to mess with it, you'll read the source
code
 and figure it out.

 Check while delete still exists. Please use malloc for that stuff.

As long as I can get at the pointer to the memory block held by T[new] I can
pass
it to GC.free.  All that would really be needed is a .ptr property that does
something like what .ptr does for T[]s.

 So now: every place I've said "check" means there was implementation and
 book-quality illustrated documentation that Walter and I have done with
 the sweat of our brow. At the end we looked at the result and concluded
 we should throw away all that work.
 Andrei

This begs the question:  Why?  Walter's post on the subject was rather brief
and I
can't understand for the life of me why you guys would throw away such an
elegant
solution.  Given that we already agree that a T[new] can be implicitly cast to a
T[], the lack of closure under ~, .dup, etc. seems like a non-issue.  When I
was a
beginner, before I got into templates, a major attraction to D was that it was a
fast, compiled language where basic things like arrays (mostly) just worked.  To
take away all the syntactic sugar for appending and lengthening from arrays and
push this into a separate library type that doesn't feel like a first class
object
or (I assume) even support most array semantics would be a massive, massive
kludge
no matter how well-implemented that library type was.

Oct 18 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 3.  A T[new] should be implicitly convertible to a slice.  For example:

 auto foo = someFunctionThatReturnsTnew();
 // foo is a T[new].
 T[] bar = someFunctionThatReturnsTnew();
 // Works.  bar is a T[].  The T[new] went into oblivion.

 This solves the problem of slices not being closed over .dup and ~.

 Check.

 
 So then why is slices not being closed over .dup, ~, etc. still a problem? 
With
 implicit conversion, they for all practical purposes are.

The problems are with auto and template argument deduction.

 4.  It should be guaranteed that no block of memory is ever referenced by more
 than one T[new] instance.  This is needed to guarantee safety when appending to
 immutable arrays, etc.

 That doesn't go with reference semantics. Uncheck.
 5.  Assigning a T[new] to another T[new] should be by reference, just like
 assigning a class instance to another class instance.

 Check. (BTW contradicts 4)

 
 There was a slight misunderstanding here.  When I say an instance, I mean one
 T[new] heap object == one instance, no matter how many pointers/references to
this
 instance you have.  The assumption is that T[new] objects live entirely on the
 heap and have semantics similar to classes.  For example:
 
 uint[new] foo = new uint[100];
 uint[new] bar = foo;  // Really just a pointer assignment.
 
 Now, both foo and bar point to the same uint[new] instance.  If anything is
 modified via foo, it is seen when viewed from bar and vice-versa.
 
 To clarify, no *main block of memory that holds the data in the array* is ever
 referenced by more than one T[new] *heap object*.

Oh yah. Check that sucker.

 Assigning a T[] to a T[new]
 should duplicate the memory block referenced by the T[] because this is
probably
 the only way to guarantee (4).

 No check, but could have been done.

 
 Actually, this can even be done by COW.  Initially, assigning a T[] to a T[new]
 could just be a pointer assignment and setting a flag, as long as a copy of the
 data is made when you try to increase the length of the T[new] or append to it.
 Basically, as long as you use the T[new] as if it were a T[], no copying needs
to
 be done.
 
 6.  Since T[new] guarantees unique access to a memory block, it should have an
 assumeUnique() method that returns an immutable slice and sets the T[new]'s
 reference to the memory block to null.  This solves the problem of building
 immutable arrays without the performance penalty of not being able to
pre-allocate
 or the unsafeness of having to cowboy cast it to immutable.

 Uncheck.

 
 Now that I've explained my uniqueness thing better, does this sound feasible?

Nah, there are still problems with Ts that themselves are elaborate 
types containing references, aliasing etc. assumeUnique makes a rather 
strong claim.

 7.  As long as the GC is conservative, there absolutely *must* be a method of
 manually freeing the memory block referenced by a T[new] provided that the GC
 supports this operation, though it doesn't have to be particularly pretty.  In
 general, since D is a systems language, T[new] should not be too opaque.  A
good
 way to do this might be to make all of the fields of the T[new] public but
 undocumented.  If you *really* want to mess with it, you'll read the source
code
 and figure it out.

 Check while delete still exists. Please use malloc for that stuff.

 
 As long as I can get at the pointer to the memory block held by T[new] I can
pass
 it to GC.free.  All that would really be needed is a .ptr property that does
 something like what .ptr does for T[]s.

Alrighty. Check that.

 So now: every place I've said "check" means there was implementation and
 book-quality illustrated documentation that Walter and I have done with
 the sweat of our brow. At the end we looked at the result and concluded
 we should throw away all that work.
 Andrei

 
 This begs the question:  Why?  Walter's post on the subject was rather brief
and I
 can't understand for the life of me why you guys would throw away such an
elegant
 solution.

Here's what I wrote to Walter:

====================
I'm going to suggest something terrible - let's get rid of T[new]. I 
know it's difficult to throw away work you've already done, but really 
things with T[new] start to look like a Pyrrhic victory. Here are some 
issues:

* The abstraction doesn't seem to come off as crisp and clean as we both 
wanted;

* There are efficiency issues, such as the two allocations that you 
valiantly tried to eliminate in a subset of cases;

* Explaining two very similar but subtly different types to newcomers is 
excruciatingly difficult (I'll send you a draft of the chapter - it 
looks like a burn victim who didn't make it);

* Furthermore, explaining people when to use one vs. the other is much 
more difficult than it seems. On the surface, it goes like this: "If you 
need to append stuff, use T[new]. If not, use T[]." Reality is much more 
subtle. For one thing, T[new] does not allow contraction from the left, 
whereas T[] does. That puts T[] at an advantage. So if you want to 
append stuff and also contract from the left, there's nothing our 
abstractions can help you with.

Instead of all T[new] stuff, I suggest the following:

1. We stay with T[] and we define a struct ArrayBuilder that replaces 
T[new] with a much more clear name and charter. Phobos already has 
Appender which works very well. We can beef that up to allow array-like 
primitives.

2. Assigning to a slice's .length allocates a new slice if growth is needed.

3. Disallow ~= for slices. ArrayBuilder will define it.

4. That's it.

Java got away with a similar approach using StringBuilder:

http://java.sun.com/j2se/1.5.0/docs/api/java/lang/StringBuilder.html

Scala has something very similar called ArrayBuffer:

http://www.nabble.com/ArrayList-and-ArrayBuffer-td15448842.html



http://msdn.microsoft.com/en-us/library/2839d5h5%28VS.71%29.aspx

So it looks like many programmers coming from other languages will 
already be familiar with the idea that you use a "builder" to grow an 
array, and then you use a non-growable array. One thing that Appender 
has and is really cool is that it can grow an already-existing slice. So 
you can grow a slice, play with it for a while, and then grow it again 
at low cost. I don't think the other languages allow that.

I understand how you must feel about having implemented T[new] and all, 
but really please please try to detach for a minute and think back. Does 
what we've got now with T[new] make D a much better place? Between the 
increase of the language, the difficulty to explain the minute 
subtleties, and the annoying corner cases and oddities, I think it might 
be good to reconsider.
================

 Given that we already agree that a T[new] can be implicitly cast to a
 T[], the lack of closure under ~, .dup, etc. seems like a non-issue.  When I
was a
 beginner, before I got into templates, a major attraction to D was that it was
a
 fast, compiled language where basic things like arrays (mostly) just worked. 
To
 take away all the syntactic sugar for appending and lengthening from arrays and
 push this into a separate library type that doesn't feel like a first class
object
 or (I assume) even support most array semantics would be a massive, massive
kludge
 no matter how well-implemented that library type was.

I think the language will be in considerably better shape without T[new] 
than with it. I understand this is a subjective point. You can argue 
against on virtually every one of my objections.


Andrei

Oct 18 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 3.  A T[new] should be implicitly convertible to a slice.  For example:

 auto foo = someFunctionThatReturnsTnew();
 // foo is a T[new].
 T[] bar = someFunctionThatReturnsTnew();
 // Works.  bar is a T[].  The T[new] went into oblivion.

 This solves the problem of slices not being closed over .dup and ~.

 Check.

 So then why is slices not being closed over .dup, ~, etc. still a problem? 
With
 implicit conversion, they for all practical purposes are.

 The problems are with auto and template argument deduction.

Ok, now I can see how this would be a legitimate problem, but only in a few
corner
cases:

void doStuff(T)(T someArrayLikeObject) {
    someArrayLikeObject = someArrayLikeObject[0..$ - 1];
}

int[] foo = someFunction();
int[] bar = someFunction();
auto baz = foo ~ bar;

// baz is an int[new].  This doesn't usually matter, since you can use it
// just like an int[] and if you care about the small performance difference,
// you should be more careful.

doStuff(baz);

// Are the effects of doStuff() visible in this scope?
// Depends if baz is an int[] or an int[new].

However, this is really, really, *really* a corner case.  Any reasonable
template
would either slice the array to make sure it owns the slice information or pass
by
reference to make sure the change is propagated, so that there is no ambiguity.

 Here's what I wrote to Walter:
 ====================
 I'm going to suggest something terrible - let's get rid of T[new]. I
 know it's difficult to throw away work you've already done, but really
 things with T[new] start to look like a Pyrrhic victory. Here are some
 issues:
 * The abstraction doesn't seem to come off as crisp and clean as we both
 wanted;
 * There are efficiency issues, such as the two allocations that you
 valiantly tried to eliminate in a subset of cases;

Once you've opened the can of worms of having to perform allocations, one versus
two allocations isn't very important.

 * Explaining two very similar but subtly different types to newcomers is
 excruciatingly difficult (I'll send you a draft of the chapter - it
 looks like a burn victim who didn't make it);

This is admittedly a legitimate concern.  However, you can get off the ground in
D, or any language for that matter, without being a full-fledged language
lawyer.
 I frankly don't think it's important for beginners to understand every subtlety
and corner case.  Heck, I've been using D for a while and have done some
non-trivial projects in it and there are definitely dark corners that I don't
understand.  (Complex numbers come to mind.)  I just don't care because I figure
I'll learn them if/when I need to know them.

 * Furthermore, explaining people when to use one vs. the other is much
 more difficult than it seems. On the surface, it goes like this: "If you
 need to append stuff, use T[new]. If not, use T[]." Reality is much more
 subtle. For one thing, T[new] does not allow contraction from the left,
 whereas T[] does. That puts T[] at an advantage. So if you want to
 append stuff and also contract from the left, there's nothing our
 abstractions can help you with.

Why can't a T[new] contract from the left?  As far as I can tell, you could do
something like:

struct TNew(T) {
    typeof(this) opAssign(T[] slice) {
        if(slice.ptr >= this.ptr && slice.ptr < this.ptr + capacity) {
            // Then we own this block of memory and can assign
            // a slice by reference.

            // Adjust effective capacity.
            capacity -= cast(size_t) (slice.ptr - this.ptr);
            length = slice.length;
            this.ptr = slice.ptr;
       } else {
           // Assign the slice by copying.
       }
    }

    // Other stuff.
}

You would then contract from the left the same way as for a slice:

int[new] foo = new int[5];
foo = foo[1..$];

It would simply require a few integer/pointer comparisons and still be
reasonably
efficient.

 Instead of all T[new] stuff, I suggest the following:
 1. We stay with T[] and we define a struct ArrayBuilder that replaces
 T[new] with a much more clear name and charter. Phobos already has
 Appender which works very well. We can beef that up to allow array-like
 primitives.
 2. Assigning to a slice's .length allocates a new slice if growth is needed.
 3. Disallow ~= for slices. ArrayBuilder will define it.
 4. That's it.
 Java got away with a similar approach using StringBuilder:
 http://java.sun.com/j2se/1.5.0/docs/api/java/lang/StringBuilder.html
 Scala has something very similar called ArrayBuffer:
 http://www.nabble.com/ArrayList-and-ArrayBuffer-td15448842.html

 http://msdn.microsoft.com/en-us/library/2839d5h5%28VS.71%29.aspx
 So it looks like many programmers coming from other languages will
 already be familiar with the idea that you use a "builder" to grow an
 array, and then you use a non-growable array. One thing that Appender
 has and is really cool is that it can grow an already-existing slice. So
 you can grow a slice, play with it for a while, and then grow it again
 at low cost. I don't think the other languages allow that.
 I understand how you must feel about having implemented T[new] and all,
 but really please please try to detach for a minute and think back. Does
 what we've got now with T[new] make D a much better place? Between the
 increase of the language, the difficulty to explain the minute
 subtleties, and the annoying corner cases and oddities, I think it might
 be good to reconsider.

Yes, Java, etc. get away with this because Java has no pretense of being a make
simple things simple kind of language, but one big thing that D has over Java,
etc. is that it can be used almost like a scripting language for simple stuff. 
It
is about the only language that scales well all the way from simple scripts to
uber-complicated metaprogramming.  If we get rid of nice builtin arrays that
support everything with clean syntax, we're throwing out making simple things
simple in exchange for fixing a few corner cases.  IMHO this is a terrible
tradeoff.  If you and Walter *really* despise T[new], I would prefer to see
builtin arrays kept exactly the way they are, bugs and all, and for array
builders/appenders/whatever to be improved but still be considered purely a
performance hack.  The bugs in the current arrays are pretty nasty from a
theoretical safety/purity point of view (esp. the one that's a hole in
immutability), but are seldom run into in practice.

Oct 18 2009

Walter Bright <newshound1 digitalmars.com> writes:

dsimcha wrote:
 The bugs in the current arrays are pretty nasty from a
 theoretical safety/purity point of view (esp. the one that's a hole in
 immutability), but are seldom run into in practice.

We'd like to get D to the point where guarantees can be offered. This 
really sets it apart from C++, which offers no guarantees, and safety 
only happens if the programmer carefully follows convention. We need to 
do something about those nasty corner cases.

Another issue is that D currently has two array types. Adding T[new] 
makes for 3 array types, and that starts to look like we couldn't figure 
out what we were doing.

And lastly, the reason to make a type a built-in one is in order to 
offer substantial advantages over a library one. If a library one can do 
  a good job, it is better to push it into the library. Making for a 
smaller core language makes it easier for people to get into D and feel 
comfortable with it.

I was getting hard pressed to find significant advantages for T[new] 
over a library type. In particular, I find few uses for resizeable 
arrays as opposed to slices which are everywhere. When resizeable arrays 
are needed, they are usually isolated in one function, and when the 
function returns the resizeable array no longer needs to be resized - it 
can be converted to a slice type.

Another interesting advantage to the library type for T[new] is it 
solves the problem of creating a safe immutable string. When the library 
T[new] is converted to an immutable string, the T[new] contents are 
removed - hence it won't be possible to create a mutable alias of an 
immutable string without a user-applied cast, which would not be allowed 
in safe mode.

Oct 18 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Mon, 19 Oct 2009 05:16:45 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s  
 article
 3.  A T[new] should be implicitly convertible to a slice.  For  
 example:

 auto foo = someFunctionThatReturnsTnew();
 // foo is a T[new].
 T[] bar = someFunctionThatReturnsTnew();
 // Works.  bar is a T[].  The T[new] went into oblivion.

 This solves the problem of slices not being closed over .dup and ~.

 Check.

  So then why is slices not being closed over .dup, ~, etc. still a  
 problem?  With
 implicit conversion, they for all practical purposes are.

 The problems are with auto and template argument deduction.

Could you please bring any example? You talk a lot about corner cases and  
inconsistencies, but didn't bring a single example.

  This begs the question:  Why?  Walter's post on the subject was rather  
 brief and I
 can't understand for the life of me why you guys would throw away such  
 an elegant
 solution.

 Here's what I wrote to Walter:

 ====================
 I'm going to suggest something terrible - let's get rid of T[new]. I  
 know it's difficult to throw away work you've already done, but really  
 things with T[new] start to look like a Pyrrhic victory. Here are some  
 issues:

 * The abstraction doesn't seem to come off as crisp and clean as we both  
 wanted;

 * There are efficiency issues, such as the two allocations that you  
 valiantly tried to eliminate in a subset of cases;

 * Explaining two very similar but subtly different types to newcomers is  
 excruciatingly difficult (I'll send you a draft of the chapter - it  
 looks like a burn victim who didn't make it);

 * Furthermore, explaining people when to use one vs. the other is much  
 more difficult than it seems. On the surface, it goes like this: "If you  
 need to append stuff, use T[new]. If not, use T[]." Reality is much more  
 subtle. For one thing, T[new] does not allow contraction from the left,  
 whereas T[] does. That puts T[] at an advantage. So if you want to  
 append stuff and also contract from the left, there's nothing our  
 abstractions can help you with.

An Array!(T) is really just a different name to a T[new]. You'll have the  
same problem explaining difference between Array!(T) and T[].

But you are also creating a nightmare for CTFE. Since you can't use "a ~=  
b;" anymore, you'll have to use "a = a ~ b;" which *always* allocates. Not  
only it is syntactically less pleasant, this way you render this function  
useless at run-time - who in the sane mind will use such an inefficient  
stuff?

 Instead of all T[new] stuff, I suggest the following:

 1. We stay with T[] and we define a struct ArrayBuilder that replaces  
 T[new] with a much more clear name and charter. Phobos already has  
 Appender which works very well. We can beef that up to allow array-like  
 primitives.

 2. Assigning to a slice's .length allocates a new slice if growth is  
 needed.

 3. Disallow ~= for slices. ArrayBuilder will define it.

 4. That's it.

 Java got away with a similar approach using StringBuilder:

 http://java.sun.com/j2se/1.5.0/docs/api/java/lang/StringBuilder.html

 Scala has something very similar called ArrayBuffer:

 http://www.nabble.com/ArrayList-and-ArrayBuffer-td15448842.html



 http://msdn.microsoft.com/en-us/library/2839d5h5%28VS.71%29.aspx

This is not a fair comparison. StringBuilders only exists because strings  
are immutable and any operation on them create a new object. They are just  
a performance optimization, nothing else. StringBuilder is great, but its  
functionality doesn't really overlap with a general-purpose array.

 So it looks like many programmers coming from other languages will  
 already be familiar with the idea that you use a "builder" to grow an  
 array, and then you use a non-growable array. One thing that Appender  
 has and is really cool is that it can grow an already-existing slice. So  
 you can grow a slice, play with it for a while, and then grow it again  
 at low cost. I don't think the other languages allow that.

 I understand how you must feel about having implemented T[new] and all,  
 but really please please try to detach for a minute and think back. Does  
 what we've got now with T[new] make D a much better place? Between the  
 increase of the language, the difficulty to explain the minute  
 subtleties, and the annoying corner cases and oddities, I think it might  
 be good to reconsider.
 ================

 Given that we already agree that a T[new] can be implicitly cast to a
 T[], the lack of closure under ~, .dup, etc. seems like a non-issue.   
 When I was a
 beginner, before I got into templates, a major attraction to D was that  
 it was a
 fast, compiled language where basic things like arrays (mostly) just  
 worked.  To
 take away all the syntactic sugar for appending and lengthening from  
 arrays and
 push this into a separate library type that doesn't feel like a first  
 class object
 or (I assume) even support most array semantics would be a massive,  
 massive kludge
 no matter how well-implemented that library type was.

 I think the language will be in considerably better shape without T[new]  
 than with it. I understand this is a subjective point. You can argue  
 against on virtually every one of my objections.

It is a shame that D will have built-in associative arrays and no dynamic  
arrays. After all, which one of them do you use more frequently?

Oct 19 2009

dsimcha <dsimcha yahoo.com> writes:

This discussion originated in the T[new] thread, but I think it deserves its own
thread.

== Quote from Denis Koroskin (2korden gmail.com)'s article
 An Array!(T) is really just a different name to a T[new]. You'll have the
 same problem explaining difference between Array!(T) and T[].
 But you are also creating a nightmare for CTFE. Since you can't use "a ~=
 b;" anymore, you'll have to use "a = a ~ b;" which *always* allocates. Not
 only it is syntactically less pleasant, this way you render this function
 useless at run-time - who in the sane mind will use such an inefficient
 stuff?

Maybe what we need is a version(ctfe) statement.  Stuff inside such a block
would
be executed only if a function is being compile time evaluated.  When code is
generated for runtime evaluation the else block would be used.  This would allow
problems like this to be solved in a well-encapsulated way.  Example:

uint[] findPrimes(uint maxPrime) {
    version(ctfe) {
        uint[] ret;
    } else {
        ArrayBuilder!uint ret;
    }

    foreach(i; 0..maxPrime) {
        if(!isPrime(i)) {
             continue;
        }

        version(ctfe) {
            ret = ret ~ i;
        } else {
            ret ~= i;
        }
    }
}

Given that CTFE will likely never support everything that is supported at
runtime,
this will likely make it much more useful.

Oct 19 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Mon, 19 Oct 2009 17:13:46 +0400, dsimcha <dsimcha yahoo.com> wrote:

 This discussion originated in the T[new] thread, but I think it deserves  
 its own
 thread.

 == Quote from Denis Koroskin (2korden gmail.com)'s article
 An Array!(T) is really just a different name to a T[new]. You'll have  
 the
 same problem explaining difference between Array!(T) and T[].
 But you are also creating a nightmare for CTFE. Since you can't use "a  
 ~=
 b;" anymore, you'll have to use "a = a ~ b;" which *always* allocates.  
 Not
 only it is syntactically less pleasant, this way you render this  
 function
 useless at run-time - who in the sane mind will use such an inefficient
 stuff?

 Maybe what we need is a version(ctfe) statement.  Stuff inside such a  
 block would
 be executed only if a function is being compile time evaluated.  When  
 code is
 generated for runtime evaluation the else block would be used.  This  
 would allow
 problems like this to be solved in a well-encapsulated way.  Example:

 uint[] findPrimes(uint maxPrime) {
     version(ctfe) {
         uint[] ret;
     } else {
         ArrayBuilder!uint ret;
     }

     foreach(i; 0..maxPrime) {
         if(!isPrime(i)) {
              continue;
         }

         version(ctfe) {
             ret = ret ~ i;
         } else {
             ret ~= i;
         }
     }
 }

 Given that CTFE will likely never support everything that is supported  
 at runtime,
 this will likely make it much more useful.

It was suggested before and IIRC Walter said it is next to impossible to  
implement.

What you suggest is almost the same as writing 2 different functions. The  
killer feature of CTFE is that you write it just once and use both at  
compile and run-time without modifications.

I'd like to write it as

immutable(uint)[] findPrimes(uint maxPrime) {
     ArrayOfInt ret;

     foreach(i; 0..maxPrime) {
         if(!isPrime(i)) {
              continue;
         }

         ret ~= i;
     }

     return assumeUnique(ret[]);
}

ArrayOfInt may be an alias to Array!(int), int[] or int[new], or whatever  
else. All I care is it should be valid and efficient for both run- and  
compile-time cases.

Oct 19 2009

Don <nospam nospam.com> writes:

Denis Koroskin wrote:
 On Mon, 19 Oct 2009 17:13:46 +0400, dsimcha <dsimcha yahoo.com> wrote:
 
 This discussion originated in the T[new] thread, but I think it 
 deserves its own
 thread.

 == Quote from Denis Koroskin (2korden gmail.com)'s article
 An Array!(T) is really just a different name to a T[new]. You'll have 
 the
 same problem explaining difference between Array!(T) and T[].
 But you are also creating a nightmare for CTFE. Since you can't use 
 "a ~=
 b;" anymore, you'll have to use "a = a ~ b;" which *always* 
 allocates. Not
 only it is syntactically less pleasant, this way you render this 
 function
 useless at run-time - who in the sane mind will use such an inefficient
 stuff?

 Maybe what we need is a version(ctfe) statement.  Stuff inside such a 
 block would
 be executed only if a function is being compile time evaluated.  When 
 code is
 generated for runtime evaluation the else block would be used.  This 
 would allow
 problems like this to be solved in a well-encapsulated way.  Example:

 uint[] findPrimes(uint maxPrime) {
     version(ctfe) {
         uint[] ret;
     } else {
         ArrayBuilder!uint ret;
     }

     foreach(i; 0..maxPrime) {
         if(!isPrime(i)) {
              continue;
         }

         version(ctfe) {
             ret = ret ~ i;
         } else {
             ret ~= i;
         }
     }
 }

 Given that CTFE will likely never support everything that is supported 
 at runtime,
 this will likely make it much more useful.

 
 It was suggested before and IIRC Walter said it is next to impossible to 
 implement.

I had a bit of an attempt at it. version(ctfe) seems to be nearly 
impossible.
But I'm almost certain I could get a magic bool __ctfe to work:

if (__ctfe) {
   ...      // only contains D code
} else {
    asm { .... }
}

It would only be accessible inside unsafe modules (which are the only 
place you'd need it).

 What you suggest is almost the same as writing 2 different functions. 
 The killer feature of CTFE is that you write it just once and use both 
 at compile and run-time without modifications.

Yes, but there are problems. A few low-level, high-speed functions can't 
be used in CTFE because they use asm or nasty unions.

eg, std.math.poly()

Oct 19 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Don (nospam nospam.com)'s article
 Denis Koroskin wrote:
 On Mon, 19 Oct 2009 17:13:46 +0400, dsimcha <dsimcha yahoo.com> wrote:

 This discussion originated in the T[new] thread, but I think it
 deserves its own
 thread.

 == Quote from Denis Koroskin (2korden gmail.com)'s article
 An Array!(T) is really just a different name to a T[new]. You'll have
 the
 same problem explaining difference between Array!(T) and T[].
 But you are also creating a nightmare for CTFE. Since you can't use
 "a ~=
 b;" anymore, you'll have to use "a = a ~ b;" which *always*
 allocates. Not
 only it is syntactically less pleasant, this way you render this
 function
 useless at run-time - who in the sane mind will use such an inefficient
 stuff?

 Maybe what we need is a version(ctfe) statement.  Stuff inside such a
 block would
 be executed only if a function is being compile time evaluated.  When
 code is
 generated for runtime evaluation the else block would be used.  This
 would allow
 problems like this to be solved in a well-encapsulated way.  Example:

 uint[] findPrimes(uint maxPrime) {
     version(ctfe) {
         uint[] ret;
     } else {
         ArrayBuilder!uint ret;
     }

     foreach(i; 0..maxPrime) {
         if(!isPrime(i)) {
              continue;
         }

         version(ctfe) {
             ret = ret ~ i;
         } else {
             ret ~= i;
         }
     }
 }

 Given that CTFE will likely never support everything that is supported
 at runtime,
 this will likely make it much more useful.

 It was suggested before and IIRC Walter said it is next to impossible to
 implement.

 I had a bit of an attempt at it. version(ctfe) seems to be nearly
 impossible.
 But I'm almost certain I could get a magic bool __ctfe to work:
 if (__ctfe) {
    ...      // only contains D code
 } else {
     asm { .... }
 }
 It would only be accessible inside unsafe modules (which are the only
 place you'd need it).
 What you suggest is almost the same as writing 2 different functions.
 The killer feature of CTFE is that you write it just once and use both
 at compile and run-time without modifications.

 Yes, but there are problems. A few low-level, high-speed functions can't
 be used in CTFE because they use asm or nasty unions.
 eg, std.math.poly()

So this magic bool would be like a regular if statement in that it would create
a
new scope?  I guess if you could do it such that it worked with static if, you'd
be able to get it to work with version.

Oct 19 2009

Don <nospam nospam.com> writes:

dsimcha wrote:
 == Quote from Don (nospam nospam.com)'s article
 Denis Koroskin wrote:
 On Mon, 19 Oct 2009 17:13:46 +0400, dsimcha <dsimcha yahoo.com> wrote:

 This discussion originated in the T[new] thread, but I think it
 deserves its own
 thread.

 == Quote from Denis Koroskin (2korden gmail.com)'s article
 An Array!(T) is really just a different name to a T[new]. You'll have
 the
 same problem explaining difference between Array!(T) and T[].
 But you are also creating a nightmare for CTFE. Since you can't use
 "a ~=
 b;" anymore, you'll have to use "a = a ~ b;" which *always*
 allocates. Not
 only it is syntactically less pleasant, this way you render this
 function
 useless at run-time - who in the sane mind will use such an inefficient
 stuff?

 Maybe what we need is a version(ctfe) statement.  Stuff inside such a
 block would
 be executed only if a function is being compile time evaluated.  When
 code is
 generated for runtime evaluation the else block would be used.  This
 would allow
 problems like this to be solved in a well-encapsulated way.  Example:

 uint[] findPrimes(uint maxPrime) {
     version(ctfe) {
         uint[] ret;
     } else {
         ArrayBuilder!uint ret;
     }

     foreach(i; 0..maxPrime) {
         if(!isPrime(i)) {
              continue;
         }

         version(ctfe) {
             ret = ret ~ i;
         } else {
             ret ~= i;
         }
     }
 }

 Given that CTFE will likely never support everything that is supported
 at runtime,
 this will likely make it much more useful.

 It was suggested before and IIRC Walter said it is next to impossible to
 implement.

 I had a bit of an attempt at it. version(ctfe) seems to be nearly
 impossible.
 But I'm almost certain I could get a magic bool __ctfe to work:
 if (__ctfe) {
    ...      // only contains D code
 } else {
     asm { .... }
 }
 It would only be accessible inside unsafe modules (which are the only
 place you'd need it).
 What you suggest is almost the same as writing 2 different functions.
 The killer feature of CTFE is that you write it just once and use both
 at compile and run-time without modifications.

 Yes, but there are problems. A few low-level, high-speed functions can't
 be used in CTFE because they use asm or nasty unions.
 eg, std.math.poly()

 
 So this magic bool would be like a regular if statement in that it would
create a
 new scope?  I guess if you could do it such that it worked with static if,
you'd
 be able to get it to work with version.

It'd just be a variable, which the CTFE interpreter would evaluate as 
'true', but would be a normal bool variable everywhere else. Similar to 
the magic variable __dollar.
It'd only become false just before being sent to the code generator. 
Dead code elimination would remove the CTFE part.

The important thing is that the AST must be identical for both CTFE and 
run-time.

Oct 19 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

dsimcha wrote:
 This discussion originated in the T[new] thread, but I think it deserves its
own
 thread.
 
 == Quote from Denis Koroskin (2korden gmail.com)'s article
 An Array!(T) is really just a different name to a T[new]. You'll have the
 same problem explaining difference between Array!(T) and T[].
 But you are also creating a nightmare for CTFE. Since you can't use "a ~=
 b;" anymore, you'll have to use "a = a ~ b;" which *always* allocates. Not
 only it is syntactically less pleasant, this way you render this function
 useless at run-time - who in the sane mind will use such an inefficient
 stuff?

 
 Maybe what we need is a version(ctfe) statement.  Stuff inside such a block
would
 be executed only if a function is being compile time evaluated.  When code is
 generated for runtime evaluation the else block would be used.  This would
allow
 problems like this to be solved in a well-encapsulated way.  Example:
 
 uint[] findPrimes(uint maxPrime) {
     version(ctfe) {
         uint[] ret;
     } else {
         ArrayBuilder!uint ret;
     }
 
     foreach(i; 0..maxPrime) {
         if(!isPrime(i)) {
              continue;
         }
 
         version(ctfe) {
             ret = ret ~ i;
         } else {
             ret ~= i;
         }
     }
 }
 
 Given that CTFE will likely never support everything that is supported at
runtime,
 this will likely make it much more useful.

version(ctfe) would be helpful in many other situations (e.g. 
asm-optimized routines etc.)

I thought more about it and realized that ~= can be still implemented 
for arrays if the GC cooperates.

For relatively large chunks of memory, the GC keeps a control block. We 
could add a member size_t requestedSize that keeps the size that was 
requested with an allocation/reallocation call. The GC can take 
initiative in overallocating memory and expose primitives such as 
requestedSize (how much did the program ask for?) and capacity (how much 
is really allocated?).  With that API, it is possible to implement ~= 
efficiently.


Andrei

Oct 19 2009

Leandro Lucarella <llucax gmail.com> writes:

Andrei Alexandrescu, el 19 de octubre a las 08:38 me escribiste:
 For relatively large chunks of memory, the GC keeps a control block.
 We could add a member size_t requestedSize that keeps the size that
 was requested with an allocation/reallocation call. The GC can take
 initiative in overallocating memory and expose primitives such as
 requestedSize (how much did the program ask for?) and capacity (how
 much is really allocated?).  With that API, it is possible to
 implement ~= efficiently.

That would mean moving the overhead of arrays to the GC, but that overhead
will be present for *all* objects, arrays or not, so I don't think it will
be a good idea...

Having that information could be used if present though, for example, to
avoid scanning chunks of memory that are not used. That could help
avoiding false positives (if that chunks of memory are not zeroed
already).

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Yo soy Peperino, mártir latino, venid al asado pero traed el vino.
	-- Peperino Pómoro

Oct 19 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Leandro Lucarella wrote:
 Andrei Alexandrescu, el 19 de octubre a las 08:38 me escribiste:
 For relatively large chunks of memory, the GC keeps a control block.
 We could add a member size_t requestedSize that keeps the size that
 was requested with an allocation/reallocation call. The GC can take
 initiative in overallocating memory and expose primitives such as
 requestedSize (how much did the program ask for?) and capacity (how
 much is really allocated?).  With that API, it is possible to
 implement ~= efficiently.

 
 That would mean moving the overhead of arrays to the GC, but that overhead
 will be present for *all* objects, arrays or not, so I don't think it will
 be a good idea...

I forgot to mention that ~= would also have a means to communicate the 
GC to overallocate. My point is simple: the problem is that slices don't 
  store enough info. That info could get in the memory control block.

 Having that information could be used if present though, for example, to
 avoid scanning chunks of memory that are not used. That could help
 avoiding false positives (if that chunks of memory are not zeroed
 already).

Good point.


Andrei

Oct 19 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 19 de octubre a las 08:38 me escribiste:
 For relatively large chunks of memory, the GC keeps a control block.
 We could add a member size_t requestedSize that keeps the size that
 was requested with an allocation/reallocation call. The GC can take
 initiative in overallocating memory and expose primitives such as
 requestedSize (how much did the program ask for?) and capacity (how
 much is really allocated?).  With that API, it is possible to
 implement ~= efficiently.

 That would mean moving the overhead of arrays to the GC, but that overhead
 will be present for *all* objects, arrays or not, so I don't think it will
 be a good idea...

 I forgot to mention that ~= would also have a means to communicate the
 GC to overallocate. My point is simple: the problem is that slices don't
   store enough info. That info could get in the memory control block.

This is the part of this thread I'm not understanding:  Isn't this info stored
already?  The only problem is querying it is slow.  See GC.sizeOf(), and bug
2900.
 (2900 is fixed, but it's still slow, and there's still bug 2093.)  What are you
proposing that's different from the status quo?

Oct 19 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 19 de octubre a las 08:38 me escribiste:
 For relatively large chunks of memory, the GC keeps a control block.
 We could add a member size_t requestedSize that keeps the size that
 was requested with an allocation/reallocation call. The GC can take
 initiative in overallocating memory and expose primitives such as
 requestedSize (how much did the program ask for?) and capacity (how
 much is really allocated?).  With that API, it is possible to
 implement ~= efficiently.

 That would mean moving the overhead of arrays to the GC, but that overhead
 will be present for *all* objects, arrays or not, so I don't think it will
 be a good idea...

 I forgot to mention that ~= would also have a means to communicate the
 GC to overallocate. My point is simple: the problem is that slices don't
   store enough info. That info could get in the memory control block.

 
 This is the part of this thread I'm not understanding:  Isn't this info stored
 already?  The only problem is querying it is slow.  See GC.sizeOf(), and bug
2900.
  (2900 is fixed, but it's still slow, and there's still bug 2093.)  What are
you
 proposing that's different from the status quo?

Sorry, I got confused a bit: you have the size in the array and you can 
get the capacity from the GC, just slowly. Let's sleep on this some 
more, it may not be impossible to find a growth scheme that works fast.

Andrei

Oct 19 2009

Fawzi Mohamed <fmohamed mac.com> writes:

On 2009-10-19 18:19:34 +0200, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 19 de octubre a las 08:38 me escribiste:
 For relatively large chunks of memory, the GC keeps a control block.
 We could add a member size_t requestedSize that keeps the size that
 was requested with an allocation/reallocation call. The GC can take
 initiative in overallocating memory and expose primitives such as
 requestedSize (how much did the program ask for?) and capacity (how
 much is really allocated?).  With that API, it is possible to
 implement ~= efficiently.

 That would mean moving the overhead of arrays to the GC, but that overhead
 will be present for *all* objects, arrays or not, so I don't think it will
 be a good idea...

 I forgot to mention that ~= would also have a means to communicate the
 GC to overallocate. My point is simple: the problem is that slices don't
   store enough info. That info could get in the memory control block.

 
 This is the part of this thread I'm not understanding:  Isn't this info stored
 already?  The only problem is querying it is slow.  See GC.sizeOf(), 
 and bug 2900.
  (2900 is fixed, but it's still slow, and there's still bug 2093.)  
 What are you
 proposing that's different from the status quo?

 
 Sorry, I got confused a bit: you have the size in the array and you can 
 get the capacity from the GC, just slowly. Let's sleep on this some 
 more, it may not be impossible to find a growth scheme that works fast.
 
 Andrei

The problem with the current state, apart slow access to the capacity, 
is an ownership problem: which of the possibly multiple slices owns the 
data after the slice? what happens when multiple slices grow?

In some way the current state is the optimal solution with slice=array.
To improve on this the only way is (I think) to have slice!=array, and 
this was basically what T[new] was all about (at least in my 
understanding).

Ownership of data is a difficult problem with arrays, and clearly 
T[new] and slices is more complex than having slice=array, I would be 
tempted to say that on the whole having T[new] (or even a library 
vector type, if T[new] does not improve upon what a library type does), 
might be a good idea.

Fawzi

Oct 19 2009

Leandro Lucarella <llucax gmail.com> writes:

Andrei Alexandrescu, el 19 de octubre a las 10:33 me escribiste:
 Leandro Lucarella wrote:
Andrei Alexandrescu, el 19 de octubre a las 08:38 me escribiste:
For relatively large chunks of memory, the GC keeps a control block.
We could add a member size_t requestedSize that keeps the size that
was requested with an allocation/reallocation call. The GC can take
initiative in overallocating memory and expose primitives such as
requestedSize (how much did the program ask for?) and capacity (how
much is really allocated?).  With that API, it is possible to
implement ~= efficiently.

That would mean moving the overhead of arrays to the GC, but that overhead
will be present for *all* objects, arrays or not, so I don't think it will
be a good idea...

 
 I forgot to mention that ~= would also have a means to communicate
 the GC to overallocate. My point is simple: the problem is that
 slices don't  store enough info. That info could get in the memory
 control block.

The control block is not part of the object's memory, so I don't think
it's something you want to manipulate frequently (because of locality
issues). This works well for the GC because when you manipute control
information often is when you are collecting, so for collection most of
the control info should be cached and have good locality. In collections
you have the exact opposite scheme, you don't look at the data, only the
control info. When you manipulate an array you will be probably touching
both. That's why I think you should keep all the array data in one place.
Well, that and because I don't think is a good idea to transfer array
complexities to the GC. It's just not the place for that.

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Si por el chancho fuera, se autocomería con chimichurri Worshestershire!

Oct 19 2009

Leandro Lucarella <llucax gmail.com> writes:

Andrei Alexandrescu, el 18 de octubre a las 20:16 me escribiste:
 Here's what I wrote to Walter:
 
 ====================
 I'm going to suggest something terrible - let's get rid of T[new]. I
 know it's difficult to throw away work you've already done, but
 really things with T[new] start to look like a Pyrrhic victory. Here
 are some issues:
 
 * The abstraction doesn't seem to come off as crisp and clean as we
 both wanted;
 
 * There are efficiency issues, such as the two allocations that you
 valiantly tried to eliminate in a subset of cases;
 
 * Explaining two very similar but subtly different types to
 newcomers is excruciatingly difficult (I'll send you a draft of the
 chapter - it looks like a burn victim who didn't make it);
 
 * Furthermore, explaining people when to use one vs. the other is
 much more difficult than it seems. On the surface, it goes like
 this: "If you need to append stuff, use T[new]. If not, use T[]."
 Reality is much more subtle. For one thing, T[new] does not allow
 contraction from the left, whereas T[] does. That puts T[] at an
 advantage. So if you want to append stuff and also contract from the
 left, there's nothing our abstractions can help you with.

I think this is getting overcomplicated. I don't see it as complex, I see
it like this:

2 types should be provided: array and slice.

array is a *real* type, storing and owning memory, it should be something
like this (conceptually):

class array(T)
{
	size_t length;
	size_t capacity;
	T[capacity] elements;
}

1) a pure reference type.
2) 1 allocation only (interior pointers are not a problem, the GC have to
   support them anyways).
3) easily appendable (capacity field).

slice should be something like this:

struct slice(T)
{
	size_t length;
	T* ptr;
}

1) a pure value type.
2) no allocation at all, *ever*.
3) not appendable at all.
4) You can change both ptr and length, and you can mutate the elements (if
   not immutable).

So a slice is a window to peek a chunk of data.

Then you have the syntax. In this discussion, T[new] is array and T[] is
slice. I find that syntax very confusing. I think it could be even better
to just put those 2 types in object.d (well, do public imports of those
2 types in object.d) and let the people write:

array!T a;
slice!T s;

The array literals should be immutable chunks of memory in the static
memory, as ClassInfo.init. The type of [1,2,3] should be, for example,
slice!(immutable int), so:
auto s = [1,2,3];
should create a slice!(immutable int) with length 3 and ptr pointing to
the static memory where the [1,2,3] was stored (this is what happens with
strings, right? So no surprises here, they are consistent).

slice.dup should return an array. If you want to just copy the length and
ptr members, use assignment (it's a value type, remember)?

auto s = [1,2,3];
auto t = s;
assert(s.length == t.length);
assert(s.ptr == t.ptr);
assert(&s != &t);

Assignment of arrays is just a pointer assignment (is a reference type):

auto a = [1,2,3].dup;
auto b = a;
assert(a is b);

array.dup returns another array. array[] returns a slice though (you can
allow implicit casting from array to slice but I don't see the point as
array[] is short enough and more explicit). slices should not be
convertible to arrays (use slice.dup for that).


Back to the syntax, I think T[new] is *really* confusing. It tell me
nothing about the type, it provides the same information as calling it
it wazzzaaap!T for me. I *really* think it would be better to call it
array!T. But if we have both types, T[] is not clear anymore that is
a slice, again, you can't figure that out. So maybe T[] should be used for
arrays, not slices; if you want some syntax sugar I think T[] is more
guessable to be an array than a slice, and it would be a little more
backward compatible. Maybe slices can be T[..], but I'm not sure is clear
enough, again I think we could live with something more explicit like
array!T and slice!T. But at least slices should be part of the language,
because that should be the type of "array literals" (damn! that didn't
sound right =P).


I'm missing something? Why this shouldn't work?

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
- Que hacés, ratita?
- Espero un ratito...

Oct 19 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Mon, 19 Oct 2009 18:57:45 +0400, Leandro Lucarella <llucax gmail.com>  
wrote:

 Andrei Alexandrescu, el 18 de octubre a las 20:16 me escribiste:
 Here's what I wrote to Walter:

 ====================
 I'm going to suggest something terrible - let's get rid of T[new]. I
 know it's difficult to throw away work you've already done, but
 really things with T[new] start to look like a Pyrrhic victory. Here
 are some issues:

 * The abstraction doesn't seem to come off as crisp and clean as we
 both wanted;

 * There are efficiency issues, such as the two allocations that you
 valiantly tried to eliminate in a subset of cases;

 * Explaining two very similar but subtly different types to
 newcomers is excruciatingly difficult (I'll send you a draft of the
 chapter - it looks like a burn victim who didn't make it);

 * Furthermore, explaining people when to use one vs. the other is
 much more difficult than it seems. On the surface, it goes like
 this: "If you need to append stuff, use T[new]. If not, use T[]."
 Reality is much more subtle. For one thing, T[new] does not allow
 contraction from the left, whereas T[] does. That puts T[] at an
 advantage. So if you want to append stuff and also contract from the
 left, there's nothing our abstractions can help you with.

 I think this is getting overcomplicated. I don't see it as complex, I see
 it like this:

 2 types should be provided: array and slice.

 array is a *real* type, storing and owning memory, it should be something
 like this (conceptually):

 class array(T)
 {
 	size_t length;
 	size_t capacity;
 	T[capacity] elements;
 }

 1) a pure reference type.
 2) 1 allocation only (interior pointers are not a problem, the GC have to
    support them anyways).
 3) easily appendable (capacity field).

 slice should be something like this:

 struct slice(T)
 {
 	size_t length;
 	T* ptr;
 }

 1) a pure value type.
 2) no allocation at all, *ever*.
 3) not appendable at all.
 4) You can change both ptr and length, and you can mutate the elements  
 (if
    not immutable).

 So a slice is a window to peek a chunk of data.

 Then you have the syntax. In this discussion, T[new] is array and T[] is
 slice. I find that syntax very confusing. I think it could be even better
 to just put those 2 types in object.d (well, do public imports of those
 2 types in object.d) and let the people write:

 array!T a;
 slice!T s;

 The array literals should be immutable chunks of memory in the static
 memory, as ClassInfo.init. The type of [1,2,3] should be, for example,
 slice!(immutable int), so:
 auto s = [1,2,3];
 should create a slice!(immutable int) with length 3 and ptr pointing to
 the static memory where the [1,2,3] was stored (this is what happens with
 strings, right? So no surprises here, they are consistent).

 slice.dup should return an array. If you want to just copy the length and
 ptr members, use assignment (it's a value type, remember)?

 auto s = [1,2,3];
 auto t = s;
 assert(s.length == t.length);
 assert(s.ptr == t.ptr);
 assert(&s != &t);

 Assignment of arrays is just a pointer assignment (is a reference type):

 auto a = [1,2,3].dup;
 auto b = a;
 assert(a is b);

 array.dup returns another array. array[] returns a slice though (you can
 allow implicit casting from array to slice but I don't see the point as
 array[] is short enough and more explicit). slices should not be
 convertible to arrays (use slice.dup for that).


 Back to the syntax, I think T[new] is *really* confusing. It tell me
 nothing about the type, it provides the same information as calling it
 it wazzzaaap!T for me. I *really* think it would be better to call it
 array!T. But if we have both types, T[] is not clear anymore that is
 a slice, again, you can't figure that out. So maybe T[] should be used  
 for
 arrays, not slices; if you want some syntax sugar I think T[] is more
 guessable to be an array than a slice, and it would be a little more
 backward compatible. Maybe slices can be T[..], but I'm not sure is clear
 enough, again I think we could live with something more explicit like
 array!T and slice!T. But at least slices should be part of the language,
 because that should be the type of "array literals" (damn! that didn't
 sound right =P).


 I'm missing something? Why this shouldn't work?

That's what was initially planned (except some nuances).

I also think we need a precedent of mapping built-in language type to a  
library type, starting with Array!(T) and Range!(T). We could then map  
V[K] to AArray!(K,V), T? to Nullable!(T) etc.

There is one big issue, though: classes and allocations via 'new' don't  
work with CTFE. I believe this is something that is planned (Walter once  
said that an entire subset of D called SafeD should work with CTFE), but  
it is *very* hard to implement, not feasible in a nearest future for sure.

Oct 19 2009

Don <nospam nospam.com> writes:

Denis Koroskin wrote:
 On Mon, 19 Oct 2009 18:57:45 +0400, Leandro Lucarella <llucax gmail.com> 
 wrote:
 
 Andrei Alexandrescu, el 18 de octubre a las 20:16 me escribiste:
 Here's what I wrote to Walter:

 ====================
 I'm going to suggest something terrible - let's get rid of T[new]. I
 know it's difficult to throw away work you've already done, but
 really things with T[new] start to look like a Pyrrhic victory. Here
 are some issues:

 * The abstraction doesn't seem to come off as crisp and clean as we
 both wanted;

 * There are efficiency issues, such as the two allocations that you
 valiantly tried to eliminate in a subset of cases;

 * Explaining two very similar but subtly different types to
 newcomers is excruciatingly difficult (I'll send you a draft of the
 chapter - it looks like a burn victim who didn't make it);

 * Furthermore, explaining people when to use one vs. the other is
 much more difficult than it seems. On the surface, it goes like
 this: "If you need to append stuff, use T[new]. If not, use T[]."
 Reality is much more subtle. For one thing, T[new] does not allow
 contraction from the left, whereas T[] does. That puts T[] at an
 advantage. So if you want to append stuff and also contract from the
 left, there's nothing our abstractions can help you with.

 I think this is getting overcomplicated. I don't see it as complex, I see
 it like this:

 2 types should be provided: array and slice.

 array is a *real* type, storing and owning memory, it should be something
 like this (conceptually):

 class array(T)
 {
     size_t length;
     size_t capacity;
     T[capacity] elements;
 }

 1) a pure reference type.
 2) 1 allocation only (interior pointers are not a problem, the GC have to
    support them anyways).
 3) easily appendable (capacity field).

 slice should be something like this:

 struct slice(T)
 {
     size_t length;
     T* ptr;
 }

 1) a pure value type.
 2) no allocation at all, *ever*.
 3) not appendable at all.
 4) You can change both ptr and length, and you can mutate the elements 
 (if
    not immutable).

 So a slice is a window to peek a chunk of data.

 Then you have the syntax. In this discussion, T[new] is array and T[] is
 slice. I find that syntax very confusing. I think it could be even better
 to just put those 2 types in object.d (well, do public imports of those
 2 types in object.d) and let the people write:

 array!T a;
 slice!T s;

 The array literals should be immutable chunks of memory in the static
 memory, as ClassInfo.init. The type of [1,2,3] should be, for example,
 slice!(immutable int), so:
 auto s = [1,2,3];
 should create a slice!(immutable int) with length 3 and ptr pointing to
 the static memory where the [1,2,3] was stored (this is what happens with
 strings, right? So no surprises here, they are consistent).

 slice.dup should return an array. If you want to just copy the length and
 ptr members, use assignment (it's a value type, remember)?

 auto s = [1,2,3];
 auto t = s;
 assert(s.length == t.length);
 assert(s.ptr == t.ptr);
 assert(&s != &t);

 Assignment of arrays is just a pointer assignment (is a reference type):

 auto a = [1,2,3].dup;
 auto b = a;
 assert(a is b);

 array.dup returns another array. array[] returns a slice though (you can
 allow implicit casting from array to slice but I don't see the point as
 array[] is short enough and more explicit). slices should not be
 convertible to arrays (use slice.dup for that).


 Back to the syntax, I think T[new] is *really* confusing. It tell me
 nothing about the type, it provides the same information as calling it
 it wazzzaaap!T for me. I *really* think it would be better to call it
 array!T. But if we have both types, T[] is not clear anymore that is
 a slice, again, you can't figure that out. So maybe T[] should be used 
 for
 arrays, not slices; if you want some syntax sugar I think T[] is more
 guessable to be an array than a slice, and it would be a little more
 backward compatible. Maybe slices can be T[..], but I'm not sure is clear
 enough, again I think we could live with something more explicit like
 array!T and slice!T. But at least slices should be part of the language,
 because that should be the type of "array literals" (damn! that didn't
 sound right =P).


 I'm missing something? Why this shouldn't work?

 
 That's what was initially planned (except some nuances).
 
 I also think we need a precedent of mapping built-in language type to a 
 library type, starting with Array!(T) and Range!(T). We could then map 
 V[K] to AArray!(K,V), T? to Nullable!(T) etc.
 
 There is one big issue, though: classes and allocations via 'new' don't 
 work with CTFE. I believe this is something that is planned (Walter once 
 said that an entire subset of D called SafeD should work with CTFE), but 
 it is *very* hard to implement, not feasible in a nearest future for sure.

Don't worry. It's not hugely difficult to implement, though it'll be a 
while before all the bugs are ironed out. Most of the difficult stuff is 
already in place. (The CTFE memory bug is the difficult one).

Oct 19 2009

Leandro Lucarella <llucax gmail.com> writes:

Leandro Lucarella, el 19 de octubre a las 11:57 me escribiste:
 slice.dup should return an array.

To avoid making the language aware of the array type, slice.dup can be
removed and use an array constructor instead:

auto a = slice.dup;

should be:

auto a = array!T(slice);

This way, I think the language only have to know about slices.

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
He cometido pecados, he hecho el mal, he sido víctima de la envidia, el
egoísmo, la ambición, la mentira y la frivolidad, pero siempre he sido
un padre argentino que quiere que su hijo triunfe en la vida.
	-- Ricardo Vaporeso

Oct 19 2009

Leandro Lucarella <llucax gmail.com> writes:

Leandro Lucarella, el 19 de octubre a las 12:13 me escribiste:
 Leandro Lucarella, el 19 de octubre a las 11:57 me escribiste:
 slice.dup should return an array.

 
 To avoid making the language aware of the array type, slice.dup can be
 removed and use an array constructor instead:
 
 auto a = slice.dup;
 
 should be:
 
 auto a = array!T(slice);
 
 This way, I think the language only have to know about slices.

Unless you want to have an appendable array type in CTFE. I guess the
language should know about array anyways :S

(I should read the entire thread before posting)

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Si el enemigo se siente abatido, recuérdele que su familia está lejos y
que mientras siga la pelea él se estará perdiendo el crecimiento de sus
hijos. Si está feliz, recuerdele de la existencia de su familia
política, en especial de los cuñados que tienen sexo con sus hermanas.
	-- Ricardo Vaporeso. De la desmoralización del enemigo.

Oct 19 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Leandro Lucarella wrote:
 Andrei Alexandrescu, el 18 de octubre a las 20:16 me escribiste:
 Here's what I wrote to Walter:

 ====================
 I'm going to suggest something terrible - let's get rid of T[new]. I
 know it's difficult to throw away work you've already done, but
 really things with T[new] start to look like a Pyrrhic victory. Here
 are some issues:

 * The abstraction doesn't seem to come off as crisp and clean as we
 both wanted;

 * There are efficiency issues, such as the two allocations that you
 valiantly tried to eliminate in a subset of cases;

 * Explaining two very similar but subtly different types to
 newcomers is excruciatingly difficult (I'll send you a draft of the
 chapter - it looks like a burn victim who didn't make it);

 * Furthermore, explaining people when to use one vs. the other is
 much more difficult than it seems. On the surface, it goes like
 this: "If you need to append stuff, use T[new]. If not, use T[]."
 Reality is much more subtle. For one thing, T[new] does not allow
 contraction from the left, whereas T[] does. That puts T[] at an
 advantage. So if you want to append stuff and also contract from the
 left, there's nothing our abstractions can help you with.

 
 I think this is getting overcomplicated. I don't see it as complex, I see
 it like this:
 
 2 types should be provided: array and slice.
 
 array is a *real* type, storing and owning memory, it should be something
 like this (conceptually):
 
 class array(T)
 {
 	size_t length;
 	size_t capacity;
 	T[capacity] elements;
 }

At the point I introduce arrays I hadn't described classes.

One major problem with writing a "The X Programming Language" book is 
sequencing. It's very easy to explain a complicated feature if the 
listener knows all others. It is very difficult to introduce them serially.

 1) a pure reference type.
 2) 1 allocation only (interior pointers are not a problem, the GC have to
    support them anyways).
 3) easily appendable (capacity field).
 
 slice should be something like this:
 
 struct slice(T)
 {
 	size_t length;
 	T* ptr;
 }

structs and pointers have also not been introduced yet.

 I'm missing something? Why this shouldn't work?

It may work, but I was unable to pull it off reasonably well.


Andrei

Oct 19 2009

Leandro Lucarella <llucax gmail.com> writes:

Andrei Alexandrescu, el 19 de octubre a las 10:18 me escribiste:
 Leandro Lucarella wrote:
Andrei Alexandrescu, el 18 de octubre a las 20:16 me escribiste:
Here's what I wrote to Walter:

====================
I'm going to suggest something terrible - let's get rid of T[new]. I
know it's difficult to throw away work you've already done, but
really things with T[new] start to look like a Pyrrhic victory. Here
are some issues:

* The abstraction doesn't seem to come off as crisp and clean as we
both wanted;

* There are efficiency issues, such as the two allocations that you
valiantly tried to eliminate in a subset of cases;

* Explaining two very similar but subtly different types to
newcomers is excruciatingly difficult (I'll send you a draft of the
chapter - it looks like a burn victim who didn't make it);

* Furthermore, explaining people when to use one vs. the other is
much more difficult than it seems. On the surface, it goes like
this: "If you need to append stuff, use T[new]. If not, use T[]."
Reality is much more subtle. For one thing, T[new] does not allow
contraction from the left, whereas T[] does. That puts T[] at an
advantage. So if you want to append stuff and also contract from the
left, there's nothing our abstractions can help you with.

I think this is getting overcomplicated. I don't see it as complex, I see
it like this:

2 types should be provided: array and slice.

array is a *real* type, storing and owning memory, it should be something
like this (conceptually):

class array(T)
{
	size_t length;
	size_t capacity;
	T[capacity] elements;
}

 
 At the point I introduce arrays I hadn't described classes.

You are talking about the book, right? You don't have to! You can explain
arrays without talking about classes at all! See the Python Tutorial for
example:
http://docs.python.org/tutorial/index.html

You see all kind of complex data structures (lists, dictionaries, tuples,
strings, sets), input and output, and even exceptions! before classes. And
I read it, and was a very easy and pleasant reading.

Talking about a class here was just to prove my point more easily because
I *know* everybody here knows very well about classes semantics. I was not
describing it for a book.

 One major problem with writing a "The X Programming Language" book
 is sequencing. It's very easy to explain a complicated feature if
 the listener knows all others. It is very difficult to introduce
 them serially.

I don't think an array would be the case, come on! =)

1) a pure reference type.
2) 1 allocation only (interior pointers are not a problem, the GC have to
   support them anyways).
3) easily appendable (capacity field).

slice should be something like this:

struct slice(T)
{
	size_t length;
	T* ptr;
}

 
 structs and pointers have also not been introduced yet.

I don't know what this have to do with the book. Should be D designed
based on how good are you to explain the concepts on the book?  I think
this is getting ridiculous... seriously. I don't want to be rude, I know
you're putting a lot of work in the book and we all appreciate that, but
you didn't mention any technical point about what I wrote.

I'm missing something? Why this shouldn't work?

 
 It may work, but I was unable to pull it off reasonably well.

What problems did you find?

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
DESCARTAN BIDET VIRTUAL PORQUE NO LAVA
	-- Cronista TV

Oct 19 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Leandro Lucarella wrote:
 Andrei Alexandrescu, el 19 de octubre a las 10:18 me escribiste:
 2 types should be provided: array and slice.

 array is a *real* type, storing and owning memory, it should be something
 like this (conceptually):

 class array(T)
 {
 	size_t length;
 	size_t capacity;
 	T[capacity] elements;
 }

 At the point I introduce arrays I hadn't described classes.

 
 You are talking about the book, right? You don't have to! You can explain
 arrays without talking about classes at all!

I was just replying to your suggestion.

 One major problem with writing a "The X Programming Language" book
 is sequencing. It's very easy to explain a complicated feature if
 the listener knows all others. It is very difficult to introduce
 them serially.

 
 I don't think an array would be the case, come on! =)
 
 1) a pure reference type.
 2) 1 allocation only (interior pointers are not a problem, the GC have to
   support them anyways).
 3) easily appendable (capacity field).

 slice should be something like this:

 struct slice(T)
 {
 	size_t length;
 	T* ptr;
 }

 structs and pointers have also not been introduced yet.

 
 I don't know what this have to do with the book. Should be D designed
 based on how good are you to explain the concepts on the book?  I think
 this is getting ridiculous... seriously. I don't want to be rude, I know
 you're putting a lot of work in the book and we all appreciate that, but
 you didn't mention any technical point about what I wrote.
 
 I'm missing something? Why this shouldn't work?

 It may work, but I was unable to pull it off reasonably well.

 
 What problems did you find?

I thought I explained that above and in several other posts. I have the 
feeling this is going in circles, but let me add one more thing. People 
would want to have a reasonable way of choosing between T[new] and T[]. 
The differences between them are subtle (I have two tables showing the 
primitives of T[] and T[new], which are very similar). That static 
decision concerns future appends to the array, which doesn't strike me 
as something you know from the get-go through future iterations of a 
design. Use of "auto" messes up things further: a nice function may 
choose to return T[new] because it just created an array (an 
implementation detail), but clients may find that unexpected.


Andrei

Oct 19 2009

Johan Granberg <lijat.meREM OVEgmail.com> writes:

Andrei Alexandrescu wrote:

 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 19 de octubre a las 10:18 me escribiste:
 2 types should be provided: array and slice.

 array is a *real* type, storing and owning memory, it should be
 something like this (conceptually):

 class array(T)
 {
 size_t length;
 size_t capacity;
 T[capacity] elements;
 }

 At the point I introduce arrays I hadn't described classes.

 
 You are talking about the book, right? You don't have to! You can explain
 arrays without talking about classes at all!

 
 I was just replying to your suggestion.
 
 One major problem with writing a "The X Programming Language" book
 is sequencing. It's very easy to explain a complicated feature if
 the listener knows all others. It is very difficult to introduce
 them serially.

 
 I don't think an array would be the case, come on! =)
 
 1) a pure reference type.
 2) 1 allocation only (interior pointers are not a problem, the GC have
 to
   support them anyways).
 3) easily appendable (capacity field).

 slice should be something like this:

 struct slice(T)
 {
 size_t length;
 T* ptr;
 }

 structs and pointers have also not been introduced yet.

 
 I don't know what this have to do with the book. Should be D designed
 based on how good are you to explain the concepts on the book?  I think
 this is getting ridiculous... seriously. I don't want to be rude, I know
 you're putting a lot of work in the book and we all appreciate that, but
 you didn't mention any technical point about what I wrote.
 
 I'm missing something? Why this shouldn't work?

 It may work, but I was unable to pull it off reasonably well.

 
 What problems did you find?

 
 I thought I explained that above and in several other posts. I have the
 feeling this is going in circles, but let me add one more thing. People
 would want to have a reasonable way of choosing between T[new] and T[].
 The differences between them are subtle (I have two tables showing the
 primitives of T[] and T[new], which are very similar). That static
 decision concerns future appends to the array, which doesn't strike me
 as something you know from the get-go through future iterations of a
 design. Use of "auto" messes up things further: a nice function may
 choose to return T[new] because it just created an array (an
 implementation detail), but clients may find that unexpected.
 
 
 Andrei

I think you are seeing a larger problem than their is. But consider this,
who is D a language for, is it for beginers only? advanced users only? or
everyone, if it is a language for everyone don't complicate the language
for the advanced users by rejecting nice syntax just because it is hard to
explain. The beginers already will just pick one and write ineficient code
and I don't think one more type will manage to confuse them more (as most
beginers I thaught is already maximaly confused).

Oct 19 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Johan Granberg wrote:
 I think you are seeing a larger problem than their is. But consider this,
 who is D a language for, is it for beginers only? advanced users only? or
 everyone, if it is a language for everyone don't complicate the language
 for the advanced users by rejecting nice syntax just because it is hard to
 explain.

It's not the nice syntax that's hard to explain.

We all have the same goal - make the language easy to acquire, easy to 
use, and reasonably efficient. What I can say is that T[new] is not a 
good step towards achieving those goals.


Andrei

Oct 19 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Mon, 19 Oct 2009 20:16:50 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 I thought I explained that above and in several other posts. I have the  
 feeling this is going in circles, but let me add one more thing. People  
 would want to have a reasonable way of choosing between T[new] and T[].  
 The differences between them are subtle (I have two tables showing the  
 primitives of T[] and T[new], which are very similar). That static  
 decision concerns future appends to the array, which doesn't strike me  
 as something you know from the get-go through future iterations of a  
 design. Use of "auto" messes up things further: a nice function may  
 choose to return T[new] because it just created an array (an  
 implementation detail), but clients may find that unexpected.

Put it simple: T[] is a range, and T[new] is a container. They belong to  
different leagues.
Yes, there is a lot common between them, because T[] supports some subset  
of operations that T[new] support.

We are going in circles because you couldn't convince anyone that we don't  
need dynamic array type in language core. Not unless library types are  
usable in CTFE (and have nice syntax).

Oct 19 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Denis Koroskin wrote:
 On Mon, 19 Oct 2009 20:16:50 +0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 I thought I explained that above and in several other posts. I have 
 the feeling this is going in circles, but let me add one more thing. 
 People would want to have a reasonable way of choosing between T[new] 
 and T[]. The differences between them are subtle (I have two tables 
 showing the primitives of T[] and T[new], which are very similar). 
 That static decision concerns future appends to the array, which 
 doesn't strike me as something you know from the get-go through future 
 iterations of a design. Use of "auto" messes up things further: a nice 
 function may choose to return T[new] because it just created an array 
 (an implementation detail), but clients may find that unexpected.

 
 Put it simple: T[] is a range, and T[new] is a container. They belong to 
 different leagues.

Define ranges and define containers.

 Yes, there is a lot common between them, because T[] supports some 
 subset of operations that T[new] support.

No. You see, even you who have a close perspective on the issue have 
gotten confused. T[new] does not support assignment from range. So T[] 
and T[new] are not in a subtyping relationship. They support subtly 
different primitives.

 We are going in circles because you couldn't convince anyone that we 
 don't need dynamic array type in language core. Not unless library types 
 are usable in CTFE (and have nice syntax).

"Anyone" is a bit assuming I think.


Andrei

Oct 19 2009

Fawzi Mohamed <fmohamed mac.com> writes:

On 2009-10-19 19:34:04 +0200, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Denis Koroskin wrote:
 Put it simple: T[] is a range, and T[new] is a container. They belong 
 to different leagues.

 
 Define ranges and define containers.
 
 Yes, there is a lot common between them, because T[] supports some 
 subset of operations that T[new] support.

 
 No. You see, even you who have a close perspective on the issue have 
 gotten confused. T[new] does not support assignment from range. So T[] 
 and T[new] are not in a subtyping relationship. They support subtly 
 different primitives.

unless I miss something
a_new[]=slice;
works just as
a[]=slice;

as in any case you have a slice on the left

a=slice; // should redefine the slice

and yes that is (correctly) not supported by T[new]

a=range;
should not be used in general imho because a slice should be a valid 
range, and I don't want similar operations to have very different 
results (changing the data or just changing the slice object itself).

Fawzi

Oct 19 2009

Leandro Lucarella <llucax gmail.com> writes:

Andrei Alexandrescu, el 19 de octubre a las 11:16 me escribiste:
I'm missing something? Why this shouldn't work?

It may work, but I was unable to pull it off reasonably well.

What problems did you find?

 
 I thought I explained that above and in several other posts. I have
 the feeling this is going in circles, but let me add one more thing.
 People would want to have a reasonable way of choosing between
 T[new] and T[]. The differences between them are subtle (I have two
 tables showing the primitives of T[] and T[new], which are very
 similar).

If T[new] is "array" and T[] is "slice" as I described them, I see no
subtlety, they are huge differences, one is a dynamic array, the other is
a view on somebody else's memory.

I know you probably hate when people says this (that's what I avoided
doing it before :):
I works very well in Python. You have arrays (well, lists, but the
interface is the same as an array) and slices. I never heard anybody
complaining about that being confusing.

 That static decision concerns future appends to the array,
 which doesn't strike me as something you know from the get-go
 through future iterations of a design. Use of "auto" messes up
 things further: a nice function may choose to return T[new] because
 it just created an array (an implementation detail), but clients may
 find that unexpected.

I don't see much problem. You should always return an array (T[new]) if
you have one, because you can get an slice from it (the inverse is not
true). Because of this, implicit conversion from array to slice can be
a good idea, so people expecting a slice when an array is returned will
never know an array was returned anyways. If you need to keep appending to
the array, you can have the original array and keep appending to it
without any performance hit (no copying, no new allocations).

What's wrong with that?

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Cuando intenté arrimarle mi brazo
Se puso a hablar de Miller, de Anais Nin y Picasso
Y si osaba intentar robarle un beso
Se ponía a leer de Neruda unos versos
Me hizo mucho mal la cumbiera intelectual

Oct 19 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Leandro Lucarella wrote:
 Andrei Alexandrescu, el 19 de octubre a las 11:16 me escribiste:
 I'm missing something? Why this shouldn't work?

 It may work, but I was unable to pull it off reasonably well.

 What problems did you find?

 I thought I explained that above and in several other posts. I have
 the feeling this is going in circles, but let me add one more thing.
 People would want to have a reasonable way of choosing between
 T[new] and T[]. The differences between them are subtle (I have two
 tables showing the primitives of T[] and T[new], which are very
 similar).

 
 If T[new] is "array" and T[] is "slice" as I described them, I see no
 subtlety, they are huge differences, one is a dynamic array, the other is
 a view on somebody else's memory.

Their primitives overlap 90% in syntax and semantics.

 I know you probably hate when people says this (that's what I avoided
 doing it before :):
 I works very well in Python. You have arrays (well, lists, but the
 interface is the same as an array) and slices. I never heard anybody
 complaining about that being confusing.

Python lists and slices are quite different from D arrays.

 That static decision concerns future appends to the array,
 which doesn't strike me as something you know from the get-go
 through future iterations of a design. Use of "auto" messes up
 things further: a nice function may choose to return T[new] because
 it just created an array (an implementation detail), but clients may
 find that unexpected.

 
 I don't see much problem. You should always return an array (T[new]) if
 you have one, because you can get an slice from it (the inverse is not
 true). Because of this, implicit conversion from array to slice can be
 a good idea, so people expecting a slice when an array is returned will
 never know an array was returned anyways. If you need to keep appending to
 the array, you can have the original array and keep appending to it
 without any performance hit (no copying, no new allocations).
 
 What's wrong with that?

People expecting a slice and using "auto" or templates are in for a rude 
awakening.


Andrei

Oct 19 2009

Leandro Lucarella <llucax gmail.com> writes:

Andrei Alexandrescu, el 19 de octubre a las 12:40 me escribiste:
 Leandro Lucarella wrote:
Andrei Alexandrescu, el 19 de octubre a las 11:16 me escribiste:
I'm missing something? Why this shouldn't work?

It may work, but I was unable to pull it off reasonably well.

What problems did you find?

I thought I explained that above and in several other posts. I have
the feeling this is going in circles, but let me add one more thing.
People would want to have a reasonable way of choosing between
T[new] and T[]. The differences between them are subtle (I have two
tables showing the primitives of T[] and T[new], which are very
similar).

If T[new] is "array" and T[] is "slice" as I described them, I see no
subtlety, they are huge differences, one is a dynamic array, the other is
a view on somebody else's memory.

 
 Their primitives overlap 90% in syntax and semantics.

So what? Ranges and slices too.

I know you probably hate when people says this (that's what I avoided
doing it before :):
I works very well in Python. You have arrays (well, lists, but the
interface is the same as an array) and slices. I never heard anybody
complaining about that being confusing.

 
 Python lists and slices are quite different from D arrays.

Not the API.

I don't see much problem. You should always return an array (T[new]) if
you have one, because you can get an slice from it (the inverse is not
true). Because of this, implicit conversion from array to slice can be
a good idea, so people expecting a slice when an array is returned will
never know an array was returned anyways. If you need to keep appending to
the array, you can have the original array and keep appending to it
without any performance hit (no copying, no new allocations).

What's wrong with that?

 
 People expecting a slice and using "auto" or templates are in for a
 rude awakening.

Ok, you are in one of those days when you don't listen to reasons. I give
up! =)

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Si ella es el sol, yo soy la luna
Si ella es el mar, soy el desierto
Y estamos en eclipse total
Y estamos en eclipse total

Oct 19 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Leandro Lucarella wrote:
 I don't see much problem. You should always return an array (T[new]) if
 you have one, because you can get an slice from it (the inverse is not
 true). Because of this, implicit conversion from array to slice can be
 a good idea, so people expecting a slice when an array is returned will
 never know an array was returned anyways. If you need to keep appending to
 the array, you can have the original array and keep appending to it
 without any performance hit (no copying, no new allocations).

 What's wrong with that?

 People expecting a slice and using "auto" or templates are in for a
 rude awakening.

 
 Ok, you are in one of those days when you don't listen to reasons. I give
 up! =)

At least in that case I had an objective point! I'd agree many others 
are subjective.

Andrei

Oct 19 2009

Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:

Andrei Alexandrescu wrote:
 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 19 de octubre a las 11:16 me escribiste:
 I'm missing something? Why this shouldn't work?

 It may work, but I was unable to pull it off reasonably well.

 What problems did you find?

 I thought I explained that above and in several other posts. I have
 the feeling this is going in circles, but let me add one more thing.
 People would want to have a reasonable way of choosing between
 T[new] and T[]. The differences between them are subtle (I have two
 tables showing the primitives of T[] and T[new], which are very
 similar).

 If T[new] is "array" and T[] is "slice" as I described them, I see no
 subtlety, they are huge differences, one is a dynamic array, the other is
 a view on somebody else's memory.

 
 Their primitives overlap 90% in syntax and semantics.
 
 I know you probably hate when people says this (that's what I avoided
 doing it before :):
 I works very well in Python. You have arrays (well, lists, but the
 interface is the same as an array) and slices. I never heard anybody
 complaining about that being confusing.

 
 Python lists and slices are quite different from D arrays.
 
 That static decision concerns future appends to the array,
 which doesn't strike me as something you know from the get-go
 through future iterations of a design. Use of "auto" messes up
 things further: a nice function may choose to return T[new] because
 it just created an array (an implementation detail), but clients may
 find that unexpected.

 I don't see much problem. You should always return an array (T[new]) if
 you have one, because you can get an slice from it (the inverse is not
 true). Because of this, implicit conversion from array to slice can be
 a good idea, so people expecting a slice when an array is returned will
 never know an array was returned anyways. If you need to keep 
 appending to
 the array, you can have the original array and keep appending to it
 without any performance hit (no copying, no new allocations).

 What's wrong with that?

 
 People expecting a slice and using "auto" or templates are in for a rude 
 awakening.
 
 
 Andrei

I would argue that, if they truly *needed* a slice and are relying on "auto"
then they 
implicitly accept responsibility for putting the [] in place to guarantee that.
 In other 
words:
auto foo = someFunc()[];

Or T[]'s and T[new]'s could have an alternative to .dup that does the same
thing (return 
self for T[], and return self[] for T[new]).  Either way, it would be something
to point 
out in the book as a best practice: when expecting a slice, guarantee a slice
with []'s.

-- Chris Nicholson-Sauls

Oct 19 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Chris Nicholson-Sauls wrote:
 People expecting a slice and using "auto" or templates are in for a 
 rude awakening.


 Andrei

 
 I would argue that, if they truly *needed* a slice and are relying on 
 "auto" then they implicitly accept responsibility for putting the [] in 
 place to guarantee that.  In other words:
 auto foo = someFunc()[];
 
 Or T[]'s and T[new]'s could have an alternative to .dup that does the 
 same thing (return self for T[], and return self[] for T[new]).  Either 
 way, it would be something to point out in the book as a best practice: 
 when expecting a slice, guarantee a slice with []'s.

I understand that, but I also believe it's a symptom of a problem 
instead of a solution. One of our goals is to make the best practices 
book as high-level as possible, and integrate many best practices in the 
language.

The above code, therefore, I put in the bin "stuff that the language's 
user must keep in mind". At best that should go away.


Andrei

Oct 19 2009

Yigal Chripun <yigal100 gmail.com> writes:

On 19/10/2009 16:57, Leandro Lucarella wrote:
 Andrei Alexandrescu, el 18 de octubre a las 20:16 me escribiste:
 Here's what I wrote to Walter:

 ====================
 I'm going to suggest something terrible - let's get rid of T[new]. I
 know it's difficult to throw away work you've already done, but
 really things with T[new] start to look like a Pyrrhic victory. Here
 are some issues:

 * The abstraction doesn't seem to come off as crisp and clean as we
 both wanted;

 * There are efficiency issues, such as the two allocations that you
 valiantly tried to eliminate in a subset of cases;

 * Explaining two very similar but subtly different types to
 newcomers is excruciatingly difficult (I'll send you a draft of the
 chapter - it looks like a burn victim who didn't make it);

 * Furthermore, explaining people when to use one vs. the other is
 much more difficult than it seems. On the surface, it goes like
 this: "If you need to append stuff, use T[new]. If not, use T[]."
 Reality is much more subtle. For one thing, T[new] does not allow
 contraction from the left, whereas T[] does. That puts T[] at an
 advantage. So if you want to append stuff and also contract from the
 left, there's nothing our abstractions can help you with.

 I think this is getting overcomplicated. I don't see it as complex, I see
 it like this:

 2 types should be provided: array and slice.

 array is a *real* type, storing and owning memory, it should be something
 like this (conceptually):

 class array(T)
 {
 	size_t length;
 	size_t capacity;
 	T[capacity] elements;
 }

 1) a pure reference type.
 2) 1 allocation only (interior pointers are not a problem, the GC have to
     support them anyways).
 3) easily appendable (capacity field).

 slice should be something like this:

 struct slice(T)
 {
 	size_t length;
 	T* ptr;
 }

 1) a pure value type.
 2) no allocation at all, *ever*.
 3) not appendable at all.
 4) You can change both ptr and length, and you can mutate the elements (if
     not immutable).

 So a slice is a window to peek a chunk of data.

 Then you have the syntax. In this discussion, T[new] is array and T[] is
 slice. I find that syntax very confusing. I think it could be even better
 to just put those 2 types in object.d (well, do public imports of those
 2 types in object.d) and let the people write:

 array!T a;
 slice!T s;

 The array literals should be immutable chunks of memory in the static
 memory, as ClassInfo.init. The type of [1,2,3] should be, for example,
 slice!(immutable int), so:
 auto s = [1,2,3];
 should create a slice!(immutable int) with length 3 and ptr pointing to
 the static memory where the [1,2,3] was stored (this is what happens with
 strings, right? So no surprises here, they are consistent).

 slice.dup should return an array. If you want to just copy the length and
 ptr members, use assignment (it's a value type, remember)?

 auto s = [1,2,3];
 auto t = s;
 assert(s.length == t.length);
 assert(s.ptr == t.ptr);
 assert(&s !=&t);

 Assignment of arrays is just a pointer assignment (is a reference type):

 auto a = [1,2,3].dup;
 auto b = a;
 assert(a is b);

 array.dup returns another array. array[] returns a slice though (you can
 allow implicit casting from array to slice but I don't see the point as
 array[] is short enough and more explicit). slices should not be
 convertible to arrays (use slice.dup for that).


 Back to the syntax, I think T[new] is *really* confusing. It tell me
 nothing about the type, it provides the same information as calling it
 it wazzzaaap!T for me. I *really* think it would be better to call it
 array!T. But if we have both types, T[] is not clear anymore that is
 a slice, again, you can't figure that out. So maybe T[] should be used for
 arrays, not slices; if you want some syntax sugar I think T[] is more
 guessable to be an array than a slice, and it would be a little more
 backward compatible. Maybe slices can be T[..], but I'm not sure is clear
 enough, again I think we could live with something more explicit like
 array!T and slice!T. But at least slices should be part of the language,
 because that should be the type of "array literals" (damn! that didn't
 sound right =P).


 I'm missing something? Why this shouldn't work?

Vote += 1_000_000;

Also, slices are Ranges so this should integrate with the Ranges in phobos.

Oct 19 2009

downs <default_357-line yahoo.de> writes:

Walter Bright wrote:
 The purpose of T[new] was to solve the problems T[] had with passing T[]
 to a function and then the function resizes the T[]. What happens with
 the original?
 
 The solution we came up with was to create a third array type, T[new],
 which was a reference type.
 
 Andrei had the idea that T[new] could be dispensed with by making a
 "builder" library type to handle creating arrays by doing things like
 appending, and then delivering a finished T[] type. This is similar to
 what std.outbuffer and std.array.Appender do, they just need a bit of
 refining.
 
 The .length property of T[] would then become an rvalue only, not an
 lvalue, and ~= would no longer be allowed for T[].
 
 We both feel that this would simplify D, make it more flexible, and
 remove some awkward corner cases like the inability to say a.length++.
 
 What do you think?

I think ArrayBuilder can work almost entirely transparently provided the
following conditions are met:

1) when trying to cat to an array, the first array automatically promotes to
ArrayBuilder

2) Setting .length is, depending on new length, either a slice or a cat
operation with the length difference.

2) ArrayBuilders implicitly cast to string.

This should allow a syntax that is almost entirely identical to the one used in
D1, aside from typeof("a"~"b").stringof :)

Is this feasible in D2?

Oct 18 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from downs (default_357-line yahoo.de)'s article
 Walter Bright wrote:
 The purpose of T[new] was to solve the problems T[] had with passing T[]
 to a function and then the function resizes the T[]. What happens with
 the original?

 The solution we came up with was to create a third array type, T[new],
 which was a reference type.

 Andrei had the idea that T[new] could be dispensed with by making a
 "builder" library type to handle creating arrays by doing things like
 appending, and then delivering a finished T[] type. This is similar to
 what std.outbuffer and std.array.Appender do, they just need a bit of
 refining.

 The .length property of T[] would then become an rvalue only, not an
 lvalue, and ~= would no longer be allowed for T[].

 We both feel that this would simplify D, make it more flexible, and
 remove some awkward corner cases like the inability to say a.length++.

 What do you think?

 I think ArrayBuilder can work almost entirely transparently provided the

following conditions are met:
 1) when trying to cat to an array, the first array automatically promotes to

ArrayBuilder
 2) Setting .length is, depending on new length, either a slice or a cat

operation with the length difference.
 2) ArrayBuilders implicitly cast to string.
 This should allow a syntax that is almost entirely identical to the one used in

D1, aside from typeof("a"~"b").stringof :)
 Is this feasible in D2?

Once you throw in all the implicit conversions to ArrayBuilder and having the
core
language know about ArrayBuilder, aren't you, for all practical purposes,
implementing T[new] but calling it ArrayBuilder?

Oct 18 2009

downs <default_357-line yahoo.de> writes:

dsimcha wrote:
 == Quote from downs (default_357-line yahoo.de)'s article
 Walter Bright wrote:
 The purpose of T[new] was to solve the problems T[] had with passing T[]
 to a function and then the function resizes the T[]. What happens with
 the original?

 The solution we came up with was to create a third array type, T[new],
 which was a reference type.

 Andrei had the idea that T[new] could be dispensed with by making a
 "builder" library type to handle creating arrays by doing things like
 appending, and then delivering a finished T[] type. This is similar to
 what std.outbuffer and std.array.Appender do, they just need a bit of
 refining.

 The .length property of T[] would then become an rvalue only, not an
 lvalue, and ~= would no longer be allowed for T[].

 We both feel that this would simplify D, make it more flexible, and
 remove some awkward corner cases like the inability to say a.length++.

 What do you think?

 I think ArrayBuilder can work almost entirely transparently provided the

 following conditions are met:
 1) when trying to cat to an array, the first array automatically promotes to

 ArrayBuilder
 2) Setting .length is, depending on new length, either a slice or a cat

 operation with the length difference.
 2) ArrayBuilders implicitly cast to string.
 This should allow a syntax that is almost entirely identical to the one used in

 D1, aside from typeof("a"~"b").stringof :)
 Is this feasible in D2?

 
 Once you throw in all the implicit conversions to ArrayBuilder and having the
core
 language know about ArrayBuilder, aren't you, for all practical purposes,
 implementing T[new] but calling it ArrayBuilder?

Yes but changing the semantics of a library object is easier than changing the
semantics of a language feature. Besides, I think the builder works differently
internally than T[new], although I'm not sure - storing lists or trees of
sub-arrays and concatenating them lazily? At least that's what I would do :p

Oct 18 2009

Ary Borenszweig <ary esperanto.org.ar> writes:

Walter Bright wrote:
 The purpose of T[new] was to solve the problems T[] had with passing T[] 
 to a function and then the function resizes the T[]. What happens with 
 the original?
 
 The solution we came up with was to create a third array type, T[new], 
 which was a reference type.
 
 Andrei had the idea that T[new] could be dispensed with by making a 
 "builder" library type to handle creating arrays by doing things like 
 appending, and then delivering a finished T[] type. This is similar to 
 what std.outbuffer and std.array.Appender do, they just need a bit of 
 refining.
 
 The .length property of T[] would then become an rvalue only, not an 
 lvalue, and ~= would no longer be allowed for T[].
 
 We both feel that this would simplify D, make it more flexible, and 
 remove some awkward corner cases like the inability to say a.length++.
 
 What do you think?

I remember seeing a lot of CTFE code that created a dynamic array and 
then appended stuff to it, like for example to build a list of prime 
numbers. Would that still work with ArrayBuilder?

Oct 19 2009

Walter Bright <newshound1 digitalmars.com> writes:

Ary Borenszweig wrote:
 I remember seeing a lot of CTFE code that created a dynamic array and 
 then appended stuff to it, like for example to build a list of prime 
 numbers. Would that still work with ArrayBuilder?

Probably not. But you can rewrite:

   a ~= stuff;

as:

   a = a ~ stuff;

to make it work.

Oct 19 2009

Bill Baxter <wbaxter gmail.com> writes:

On Mon, Oct 19, 2009 at 2:48 AM, Walter Bright
<newshound1 digitalmars.com> wrote:
 Ary Borenszweig wrote:
 I remember seeing a lot of CTFE code that created a dynamic array and th=


en
 appended stuff to it, like for example to build a list of prime numbers.
 Would that still work with ArrayBuilder?

 Probably not. But you can rewrite:

 =A0a ~=3D stuff;

 as:

 =A0a =3D a ~ stuff;

 to make it work.

Ouch.  Gives me Matlab nightmares.

--bb

Oct 19 2009

downs <default_357-line yahoo.de> writes:

Walter Bright wrote:
 Ary Borenszweig wrote:
 I remember seeing a lot of CTFE code that created a dynamic array and
 then appended stuff to it, like for example to build a list of prime
 numbers. Would that still work with ArrayBuilder?

 
 Probably not. But you can rewrite:
 
   a ~= stuff;
 
 as:
 
   a = a ~ stuff;
 
 to make it work.

Is there any reason the first can't be a short-hand for the second?

Oct 19 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from downs (default_357-line yahoo.de)'s article
 Walter Bright wrote:
 Ary Borenszweig wrote:
 I remember seeing a lot of CTFE code that created a dynamic array and
 then appended stuff to it, like for example to build a list of prime
 numbers. Would that still work with ArrayBuilder?

 Probably not. But you can rewrite:

   a ~= stuff;

 as:

   a = a ~ stuff;

 to make it work.

 Is there any reason the first can't be a short-hand for the second?

Devil's advocate because I somewhat agree with you:  a = a ~ stuff; is so
inefficient that it should be ugly.

Oct 19 2009

Michel Fortin <michel.fortin michelf.com> writes:

On 2009-10-19 10:14:02 -0400, dsimcha <dsimcha yahoo.com> said:

 == Quote from downs (default_357-line yahoo.de)'s article
 Walter Bright wrote:
 
 Probably not. But you can rewrite:
 
 a ~= stuff;
 
 as:
 
 a = a ~ stuff;
 
 to make it work.

 Is there any reason the first can't be a short-hand for the second?

 
 Devil's advocate because I somewhat agree with you:  a = a ~ stuff; is so
 inefficient that it should be ugly.

I'd call that a premature optimization of the programmer's behavior. :-)

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Oct 19 2009

Kagamin <spam here.lot> writes:

Walter Bright Wrote:

 The purpose of T[new] was to solve the problems T[] had with passing T[] 
 to a function and then the function resizes the T[]. What happens with 
 the original?

Why do you need such feature built into language? So many proposals to
implement this feature and that feature in a library, and unable to implement
reftype array?

 We both feel that this would simplify D, make it more flexible, and 
 remove some awkward corner cases like the inability to say a.length++.

You write it as if you can do it with T[new].

Oct 19 2009

Yigal Chripun <yigal100 gmail.com> writes:

Walter Bright Wrote:

 The purpose of T[new] was to solve the problems T[] had with passing T[] 
 to a function and then the function resizes the T[]. What happens with 
 the original?
 
 The solution we came up with was to create a third array type, T[new], 
 which was a reference type.
 
 Andrei had the idea that T[new] could be dispensed with by making a 
 "builder" library type to handle creating arrays by doing things like 
 appending, and then delivering a finished T[] type. This is similar to 
 what std.outbuffer and std.array.Appender do, they just need a bit of 
 refining.
 
 The .length property of T[] would then become an rvalue only, not an 
 lvalue, and ~= would no longer be allowed for T[].
 
 We both feel that this would simplify D, make it more flexible, and 
 remove some awkward corner cases like the inability to say a.length++.
 
 What do you think?

I think that Arrays and AAs need to be removed from the language. 
I prefer to have a collections framework as part of Phobos without special
cases in the language for specific containers. 

The only thing that needs to remain on the language side is support for user
defined literals and operators so that it would be straight forward to use (as
opposed to the clumsiness of C++). This syntax sugar is the *only* benefit of
built in types and this should be possible to have without cluttering the
language.

such a framework would ideally support polymorphism. 

List!(Foo) list = new Array!(Foo); // instead of Foo[new]

auto arr = new Array!(Foo);
...
auto slice = arr[1..4]; // slice is a Range as in Andrei's Ranges in Phobos

Foo[] a; // compile-error Foo[] is not a type, use Range!Foo instead

Foo[3] a; // this is still perfectly legal

Oct 19 2009

BCS <none anon.com> writes:

Hello Yigal,

 I think that Arrays and AAs need to be removed from the language. I
 prefer to have a collections framework as part of Phobos without
 special cases in the language for specific containers.

If it can be assured that all the current array ops will get inlined is all 
cases and can result in identical asm I'd be willing to consider this.

 List!(Foo) list = new Array!(Foo); // instead of Foo[new]

For me to not dislike the idea you would also need to find a cleaner syntax.

Oct 19 2009

Yigal Chripun <yigal100 gmail.com> writes:

On 19/10/2009 19:10, BCS wrote:
 Hello Yigal,

 I think that Arrays and AAs need to be removed from the language. I
 prefer to have a collections framework as part of Phobos without
 special cases in the language for specific containers.

 If it can be assured that all the current array ops will get inlined is
 all cases and can result in identical asm I'd be willing to consider this.

 List!(Foo) list = new Array!(Foo); // instead of Foo[new]

 For me to not dislike the idea you would also need to find a cleaner
 syntax.

See Leandro Lucarella's posts which describe the same idea.

currently, AAs and arrays are implemented in the run-time. What I 
suggest is to move them to the stdlib. I don't see how such a move can 
impact performance which you're concerned with above. on the contrary, 
moving these types outside the runtime will make it easier to improve 
their implementations.

regarding syntax, what's cleaner than calling an array an "Array"?
T[new] is less clear and hackish.

Oct 19 2009

BCS <none anon.com> writes:

Hello Yigal,

 On 19/10/2009 19:10, BCS wrote:
 
 Hello Yigal,
 
 I think that Arrays and AAs need to be removed from the language. I
 prefer to have a collections framework as part of Phobos without
 special cases in the language for specific containers.
 

 If it can be assured that all the current array ops will get inlined
 is all cases and can result in identical asm I'd be willing to
 consider this.
 
 List!(Foo) list = new Array!(Foo); // instead of Foo[new]
 

 For me to not dislike the idea you would also need to find a cleaner
 syntax.
 

 See Leandro Lucarella's posts which describe the same idea.
 
 currently, AAs and arrays are implemented in the run-time. What I
 suggest is to move them to the stdlib. I don't see how such a move can
 impact performance which you're concerned with above.

// a is an array of ints

int b;
if(a.length > 5) b = a[5];
else b = a[$/2];

No runtime in that at all now.

 on the contrary,
 moving these types outside the runtime will make it easier to improve
 their implementations.
 
 regarding syntax, what's cleaner than calling an array an "Array"?
 T[new] is less clear and hackish.
 

if T[new] gets done as Array!T, what does T[] look like?

I'm not to worried about the fancy feature, it the bone simple stuff staying 
bone simple that I'm looking after.

Oct 19 2009

Yigal Chripun <yigal100 gmail.com> writes:

On 19/10/2009 22:53, BCS wrote:
 Hello Yigal,

 On 19/10/2009 19:10, BCS wrote:

 Hello Yigal,

 I think that Arrays and AAs need to be removed from the language. I
 prefer to have a collections framework as part of Phobos without
 special cases in the language for specific containers.

 If it can be assured that all the current array ops will get inlined
 is all cases and can result in identical asm I'd be willing to
 consider this.

 List!(Foo) list = new Array!(Foo); // instead of Foo[new]

 For me to not dislike the idea you would also need to find a cleaner
 syntax.

 See Leandro Lucarella's posts which describe the same idea.

 currently, AAs and arrays are implemented in the run-time. What I
 suggest is to move them to the stdlib. I don't see how such a move can
 impact performance which you're concerned with above.

 // a is an array of ints

 int b;
 if(a.length > 5) b = a[5];
 else b = a[$/2];

 No runtime in that at all now.

I don't see a reason why this can't be inlined by the compiler if Array 
is a lib type.
 on the contrary,
 moving these types outside the runtime will make it easier to improve
 their implementations.

 regarding syntax, what's cleaner than calling an array an "Array"?
 T[new] is less clear and hackish.

 if T[new] gets done as Array!T, what does T[] look like?

T[] should be replaced by Ranges which are already implemented in Phobos.
 I'm not to worried about the fancy feature, it the bone simple stuff
 staying bone simple that I'm looking after.

Oct 19 2009

BCS <none anon.com> writes:

Hello Yigal,

 On 19/10/2009 22:53, BCS wrote:
 
 int b;
 if(a.length > 5) b = a[5];
 else b = a[$/2];
 No runtime in that at all now.
 

 I don't see a reason why this can't be inlined by the compiler if
 Array is a lib type.
 

Yes, it should inline. What I'm saying is that unless it is /guaranteed/ 
that it will /always/ inline, I'm going to have an issue.

 if T[new] gets done as Array!T, what does T[] look like?
 

 T[] should be replaced by Ranges which are already implemented in
 Phobos.
 

I think that T[] should be (at a minimum) the current slice semantics and 
implementation (the lvalue-length and ~/~= stuff is another question). If 
this happens to have the needed semantics to act as a range, all the better.

Having a clean, light construct to deal with low level memory access is just 
to handy to give it up completely.

Oct 19 2009

gzp <galap freemail.hu> writes:

Walter Bright �rta:
 The purpose of T[new] was to solve the problems T[] had with passing T[] 
 to a function and then the function resizes the T[]. What happens with 
 the original?

I've tried to follow the T[], T[new], T+slice discussion, but i'm lost,
I don't think if it's a good idea, to extend the language with all kind 
of decorated version of the same object ( scope  (*T)[scope new][1..$] :) )

"It will tell you for instance how to describe something that was about 
to happen to you in the past before you avoided it by time-jumping 
forward two days in order to avoid it. The event will be described 
differently according to whether you are talking about it from the 
standpoint of your own natural time, from a time in the further future, 
or a time in the further past and is further complicated by the 
possibility of conducting conversations whilst you are actually 
travelling from one time to another with the intention of becoming your 
own father or mother." Douglas Adams - The Hitchhiker's Guide to the Galaxy

Just an idea to simplify this whole thing, maybe it's whole wrong, but 
who knows... :)

Instead of creating new types of arrays we have already two: static and 
dynamic. A dynamic version can be resized (restructured) any time, the 
static has a fixed size, thus the dynamic version works as a class and 
the static version works as a struct for either in,out,ref/etc.

To create slices, slices those references the whole array i'd introduce 
a new type (as there is something like this): range!(T). A range is 
actually an inner class of the array implementing an IRange(T) interface 
with the usual popFront, front, etc. For a general class if one wants to 
add range, he just have to provide an onRange (onSlice) function.

As the range interface itself provides the opIndex, opAssign, opXAssign 
etc., they can be implemented as desired and no more opSliceAddAssign 
and other decorated function's are required. Even more as the range is 
an inner class of the array (and knows about the array), a finer tune of 
what_shall_happen_when_my_referrer_array_is_resize policy can be created.

As the range itself can work as an array, range[1] (or maybe even 
rang[1..2], calling the IRange!(T).onRange() ) it can be used as an 
array  whose structure (length) cannot be altered as desired for T[]

ex:
T[] a; // is a dynamic array as before
range(T) s1 = T[1..2]; // the slice from 1..2
range(T) s1 = T.opRange(); // the slice representing the whole range

foo( T[new] a ) is same as foo( T[] )
foo( T[] ) would be replaced by foo( range(T) ) and thus cannot be resized

furthermore

  foreach( a; array )
translates to something like ->
  foreach( a; a.onRange() )...

foreach( a; array[1..2] )
translates to something like ->
  foreach( a; a.onRange(1,2) )...

foreach( a; array[1..2,3..4] )
translates to something like ->
  foreach( a; a.onRange(1,2,3,4) )...

array[1..2] += 1
translates to something like ->
array.onRange(1,2).opAddAssign(1);


It'd be also great to create more then one onRange implementation of a 
class ex: matrix.opRangeRowWise matrix.opRangeColumnWise, but i don't
know how to select the appropriate one in a foreach loop "automatically":
foreach( a; mx.rowWise ) ->
  mx.selectRange(name : string) { mixin( "onRange" ~ name )  }
  foreach( a; ma.selectRange("rowWise")(1,2) )...


As a conclusion: please don't create any more array type, we have just 
enough. We have already all the features in the language to express both 
the desired and explained (above) behaviour. Only the syntax should be 
cleared out. Using the same notation multiple times with small 
decorations is not a good idea, and makes things confusing.

Thanks.

Oct 19 2009

Don <nospam nospam.com> writes:

Walter Bright wrote:
 The purpose of T[new] was to solve the problems T[] had with passing T[] 
 to a function and then the function resizes the T[]. What happens with 
 the original?
 
 The solution we came up with was to create a third array type, T[new], 
 which was a reference type.
 
 Andrei had the idea that T[new] could be dispensed with by making a 
 "builder" library type to handle creating arrays by doing things like 
 appending, and then delivering a finished T[] type. This is similar to 
 what std.outbuffer and std.array.Appender do, they just need a bit of 
 refining.
 
 The .length property of T[] would then become an rvalue only, not an 
 lvalue, and ~= would no longer be allowed for T[].
 
 We both feel that this would simplify D, make it more flexible, and 
 remove some awkward corner cases like the inability to say a.length++.
 
 What do you think?

Since noone else seems to have said it: The fact that you're both 
willing to let it go, after having already invested a lot of time in it, 
is a good sign for the language. Well done.

Oct 19 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Don wrote:
 Walter Bright wrote:
 The purpose of T[new] was to solve the problems T[] had with passing 
 T[] to a function and then the function resizes the T[]. What happens 
 with the original?

 The solution we came up with was to create a third array type, T[new], 
 which was a reference type.

 Andrei had the idea that T[new] could be dispensed with by making a 
 "builder" library type to handle creating arrays by doing things like 
 appending, and then delivering a finished T[] type. This is similar to 
 what std.outbuffer and std.array.Appender do, they just need a bit of 
 refining.

 The .length property of T[] would then become an rvalue only, not an 
 lvalue, and ~= would no longer be allowed for T[].

 We both feel that this would simplify D, make it more flexible, and 
 remove some awkward corner cases like the inability to say a.length++.

 What do you think?

 
 Since noone else seems to have said it: The fact that you're both 
 willing to let it go, after having already invested a lot of time in it, 
 is a good sign for the language. Well done.

I'm relieved that somebody mentioned that :o). As soon as we gave up 
with T[new], people started to sell it to us. We should preemptively 
post about eliminating feature plans before actually implementing them.

By the way: implementation of  property has been canceled.


Andrei

Oct 19 2009

Leandro Lucarella <llucax gmail.com> writes:

Andrei Alexandrescu, el 19 de octubre a las 14:31 me escribiste:
 By the way: implementation of  property has been canceled.

Keep throwing the good news :(

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Es mas posible, que un elefante maneje un cero km a que un camello
habite un departamento de un ambiente.
	-- Peperino Pómoro

Oct 19 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Mon, 19 Oct 2009 23:31:37 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Don wrote:
 Walter Bright wrote:
 The purpose of T[new] was to solve the problems T[] had with passing  
 T[] to a function and then the function resizes the T[]. What happens  
 with the original?

 The solution we came up with was to create a third array type, T[new],  
 which was a reference type.

 Andrei had the idea that T[new] could be dispensed with by making a  
 "builder" library type to handle creating arrays by doing things like  
 appending, and then delivering a finished T[] type. This is similar to  
 what std.outbuffer and std.array.Appender do, they just need a bit of  
 refining.

 The .length property of T[] would then become an rvalue only, not an  
 lvalue, and ~= would no longer be allowed for T[].

 We both feel that this would simplify D, make it more flexible, and  
 remove some awkward corner cases like the inability to say a.length++.

 What do you think?

  Since noone else seems to have said it: The fact that you're both  
 willing to let it go, after having already invested a lot of time in  
 it, is a good sign for the language. Well done.

 I'm relieved that somebody mentioned that :o). As soon as we gave up  
 with T[new], people started to sell it to us. We should preemptively  
 post about eliminating feature plans before actually implementing them.

 By the way: implementation of  property has been canceled.


 Andrei

I *really* hope you told it so that we encourage actually implementing it!
I strongly believe attributes and properties are very important for D.

Oct 19 2009

Ary Borenszweig <ary esperanto.org.ar> writes:

Denis Koroskin wrote:
 On Mon, 19 Oct 2009 23:31:37 +0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Don wrote:
 Walter Bright wrote:
 The purpose of T[new] was to solve the problems T[] had with passing 
 T[] to a function and then the function resizes the T[]. What 
 happens with the original?

 The solution we came up with was to create a third array type, 
 T[new], which was a reference type.

 Andrei had the idea that T[new] could be dispensed with by making a 
 "builder" library type to handle creating arrays by doing things 
 like appending, and then delivering a finished T[] type. This is 
 similar to what std.outbuffer and std.array.Appender do, they just 
 need a bit of refining.

 The .length property of T[] would then become an rvalue only, not an 
 lvalue, and ~= would no longer be allowed for T[].

 We both feel that this would simplify D, make it more flexible, and 
 remove some awkward corner cases like the inability to say a.length++.

 What do you think?

  Since noone else seems to have said it: The fact that you're both 
 willing to let it go, after having already invested a lot of time in 
 it, is a good sign for the language. Well done.

 I'm relieved that somebody mentioned that :o). As soon as we gave up 
 with T[new], people started to sell it to us. We should preemptively 
 post about eliminating feature plans before actually implementing them.

 By the way: implementation of  property has been canceled.


 Andrei

 
 I *really* hope you told it so that we encourage actually implementing it!
 I strongly believe attributes and properties are very important for D.

Not if the ones that implement it didn't use attributes before or don't 
know what they are useful for or how they could be implemented. :-)

Oct 19 2009

grauzone <none example.net> writes:

Andrei Alexandrescu wrote:
 Don wrote:
 Walter Bright wrote:
 The purpose of T[new] was to solve the problems T[] had with passing 
 T[] to a function and then the function resizes the T[]. What happens 
 with the original?

 The solution we came up with was to create a third array type, 
 T[new], which was a reference type.

 Andrei had the idea that T[new] could be dispensed with by making a 
 "builder" library type to handle creating arrays by doing things like 
 appending, and then delivering a finished T[] type. This is similar 
 to what std.outbuffer and std.array.Appender do, they just need a bit 
 of refining.

 The .length property of T[] would then become an rvalue only, not an 
 lvalue, and ~= would no longer be allowed for T[].

 We both feel that this would simplify D, make it more flexible, and 
 remove some awkward corner cases like the inability to say a.length++.

 What do you think?

 Since noone else seems to have said it: The fact that you're both 
 willing to let it go, after having already invested a lot of time in 
 it, is a good sign for the language. Well done.

 
 I'm relieved that somebody mentioned that :o). As soon as we gave up 
 with T[new], people started to sell it to us. We should preemptively 
 post about eliminating feature plans before actually implementing them.
 
 By the way: implementation of  property has been canceled.

Yeah, let's just keep the language in the broken state it is, because we 
can't think of a better solution.

 Andrei

Oct 19 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

grauzone wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Walter Bright wrote:
 The purpose of T[new] was to solve the problems T[] had with passing 
 T[] to a function and then the function resizes the T[]. What 
 happens with the original?

 The solution we came up with was to create a third array type, 
 T[new], which was a reference type.

 Andrei had the idea that T[new] could be dispensed with by making a 
 "builder" library type to handle creating arrays by doing things 
 like appending, and then delivering a finished T[] type. This is 
 similar to what std.outbuffer and std.array.Appender do, they just 
 need a bit of refining.

 The .length property of T[] would then become an rvalue only, not an 
 lvalue, and ~= would no longer be allowed for T[].

 We both feel that this would simplify D, make it more flexible, and 
 remove some awkward corner cases like the inability to say a.length++.

 What do you think?

 Since noone else seems to have said it: The fact that you're both 
 willing to let it go, after having already invested a lot of time in 
 it, is a good sign for the language. Well done.

 I'm relieved that somebody mentioned that :o). As soon as we gave up 
 with T[new], people started to sell it to us. We should preemptively 
 post about eliminating feature plans before actually implementing them.

 By the way: implementation of  property has been canceled.

 
 Yeah, let's just keep the language in the broken state it is, because we 
 can't think of a better solution.

Silly me, I was thinking the humor was all too obvious.

Andrei

Oct 19 2009

grauzone <none example.net> writes:

Andrei Alexandrescu wrote:
 grauzone wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Walter Bright wrote:
 The purpose of T[new] was to solve the problems T[] had with 
 passing T[] to a function and then the function resizes the T[]. 
 What happens with the original?

 The solution we came up with was to create a third array type, 
 T[new], which was a reference type.

 Andrei had the idea that T[new] could be dispensed with by making a 
 "builder" library type to handle creating arrays by doing things 
 like appending, and then delivering a finished T[] type. This is 
 similar to what std.outbuffer and std.array.Appender do, they just 
 need a bit of refining.

 The .length property of T[] would then become an rvalue only, not 
 an lvalue, and ~= would no longer be allowed for T[].

 We both feel that this would simplify D, make it more flexible, and 
 remove some awkward corner cases like the inability to say a.length++.

 What do you think?

 Since noone else seems to have said it: The fact that you're both 
 willing to let it go, after having already invested a lot of time in 
 it, is a good sign for the language. Well done.

 I'm relieved that somebody mentioned that :o). As soon as we gave up 
 with T[new], people started to sell it to us. We should preemptively 
 post about eliminating feature plans before actually implementing them.

 By the way: implementation of  property has been canceled.

 Yeah, let's just keep the language in the broken state it is, because 
 we can't think of a better solution.

 
 Silly me, I was thinking the humor was all too obvious.

It was only a joke? That's a relief.

 Andrei

Oct 19 2009

Piotrek <starpit tlen.pl> writes:

Don pisze:
 
 Since noone else seems to have said it: The fact that you're both 
 willing to let it go, after having already invested a lot of time in it, 
 is a good sign for the language. Well done.

Hey Don, you speak my words. Specially that I can't see good reason for 
T[new] with present array language support (but that's probably my poor 
knowledge). Walter's main post and one of the later are reflecting my 
feelings.

We need some polishing and good library now.

Cheers
Piotrek

Oct 19 2009

Michel Fortin <michel.fortin michelf.com> writes:

On 2009-10-18 17:05:39 -0400, Walter Bright <newshound1 digitalmars.com> said:

 The purpose of T[new] was to solve the problems T[] had with passing 
 T[] to a function and then the function resizes the T[]. What happens 
 with the original?
 
 The solution we came up with was to create a third array type, T[new], 
 which was a reference type.
 
 Andrei had the idea that T[new] could be dispensed with by making a 
 "builder" library type to handle creating arrays by doing things like 
 appending, and then delivering a finished T[] type. This is similar to 
 what std.outbuffer and std.array.Appender do, they just need a bit of 
 refining.
 
 The .length property of T[] would then become an rvalue only, not an 
 lvalue, and ~= would no longer be allowed for T[].
 
 We both feel that this would simplify D, make it more flexible, and 
 remove some awkward corner cases like the inability to say a.length++.
 
 What do you think?

I never liked T[new] much from the beginning, so good riddance. :-)

But seriously, disallowing '~=' but not '~'? I can already see the newbies:

Q: Why can't I write '~=' as a shortcut for '~' on a slice? It works 
fine for every other operator, everywhere else.
A: Because appending is not very efficient.
Q: So '~' is more efficient than '~='?
A: Hum, no. They're both as inefficient, but...

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Oct 19 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sun, 18 Oct 2009 17:05:39 -0400, Walter Bright  
<newshound1 digitalmars.com> wrote:

 The purpose of T[new] was to solve the problems T[] had with passing T[]  
 to a function and then the function resizes the T[]. What happens with  
 the original?

 The solution we came up with was to create a third array type, T[new],  
 which was a reference type.

 Andrei had the idea that T[new] could be dispensed with by making a  
 "builder" library type to handle creating arrays by doing things like  
 appending, and then delivering a finished T[] type. This is similar to  
 what std.outbuffer and std.array.Appender do, they just need a bit of  
 refining.

 The .length property of T[] would then become an rvalue only, not an  
 lvalue, and ~= would no longer be allowed for T[].

 We both feel that this would simplify D, make it more flexible, and  
 remove some awkward corner cases like the inability to say a.length++.

 What do you think?

At the risk of sounding like bearophile -- I've proposed 2 solutions in  
the past for this that *don't* involve creating a T[new] type.

1. Store the allocated length in the GC structure, then only allow  
appending when the length of the array being appended matches the  
allocated length.

2. Store the allocated length at the beginning of the array, and use a bit  
in the array length to determine if it starts at the beginning of the  
block.

The first solution has space concerns, and the second has lots more  
concerns, but can help in the case of having to do a GC lookup to  
determine if a slice can be appended (you'd still have to lock the GC to  
do an actual append or realloc).  I prefer the first solution over the  
second.

I like the current behavior *except* for appending.  Most of the time it  
does what you want, and the syntax is beautiful.

In regards to disallowing x ~= y, I'd propose you at least make it  
equivalent to x = x ~ y instead of removing it.

-Steve

Oct 20 2009

Bill Baxter <wbaxter gmail.com> writes:

On Tue, Oct 20, 2009 at 6:25 AM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 On Sun, 18 Oct 2009 17:05:39 -0400, Walter Bright
 <newshound1 digitalmars.com> wrote:

 The purpose of T[new] was to solve the problems T[] had with passing T[]
 to a function and then the function resizes the T[]. What happens with t=


he
 original?

 The solution we came up with was to create a third array type, T[new],
 which was a reference type.

 Andrei had the idea that T[new] could be dispensed with by making a
 "builder" library type to handle creating arrays by doing things like
 appending, and then delivering a finished T[] type. This is similar to w=


hat
 std.outbuffer and std.array.Appender do, they just need a bit of refinin=


g.
 The .length property of T[] would then become an rvalue only, not an
 lvalue, and ~=3D would no longer be allowed for T[].

 We both feel that this would simplify D, make it more flexible, and remo=


ve
 some awkward corner cases like the inability to say a.length++.

 What do you think?

 At the risk of sounding like bearophile -- I've proposed 2 solutions in t=

he
 past for this that *don't* involve creating a T[new] type.

 1. Store the allocated length in the GC structure, then only allow append=

ing
 when the length of the array being appended matches the allocated length.

 2. Store the allocated length at the beginning of the array, and use a bi=

t
 in the array length to determine if it starts at the beginning of the blo=

ck.
 The first solution has space concerns, and the second has lots more
 concerns, but can help in the case of having to do a GC lookup to determi=

ne
 if a slice can be appended (you'd still have to lock the GC to do an actu=

al
 append or realloc). =A0I prefer the first solution over the second.

 I like the current behavior *except* for appending. =A0Most of the time i=

t
 does what you want, and the syntax is beautiful.

 In regards to disallowing x ~=3D y, I'd propose you at least make it
 equivalent to x =3D x ~ y instead of removing it.

If you're going to do ~=3D a lot then you should convert to the dynamic
array type.
If you're not going to do ~=3D a lot, then you can afford to write out x =
=3D x ~ y.

The bottom line is that it just doesn't make sense to append onto a
"view" type.  It's really a kind of constness.  Having a view says the
underlying memory locations you are looking at are fixed.  It doesn't
make sense to imply there's an operation that can change those memory
locations (other than shrinking the window to view fewer of them).

--bb

Oct 20 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 20 Oct 2009 11:10:20 -0400, Bill Baxter <wbaxter gmail.com> wrote:

 On Tue, Oct 20, 2009 at 6:25 AM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 On Sun, 18 Oct 2009 17:05:39 -0400, Walter Bright
 <newshound1 digitalmars.com> wrote:

 The purpose of T[new] was to solve the problems T[] had with passing  
 T[]
 to a function and then the function resizes the T[]. What happens with  
 the
 original?

 The solution we came up with was to create a third array type, T[new],
 which was a reference type.

 Andrei had the idea that T[new] could be dispensed with by making a
 "builder" library type to handle creating arrays by doing things like
 appending, and then delivering a finished T[] type. This is similar to  
 what
 std.outbuffer and std.array.Appender do, they just need a bit of  
 refining.

 The .length property of T[] would then become an rvalue only, not an
 lvalue, and ~= would no longer be allowed for T[].

 We both feel that this would simplify D, make it more flexible, and  
 remove
 some awkward corner cases like the inability to say a.length++.

 What do you think?

 At the risk of sounding like bearophile -- I've proposed 2 solutions in  
 the
 past for this that *don't* involve creating a T[new] type.

 1. Store the allocated length in the GC structure, then only allow  
 appending
 when the length of the array being appended matches the allocated  
 length.

 2. Store the allocated length at the beginning of the array, and use a  
 bit
 in the array length to determine if it starts at the beginning of the  
 block.

 The first solution has space concerns, and the second has lots more
 concerns, but can help in the case of having to do a GC lookup to  
 determine
 if a slice can be appended (you'd still have to lock the GC to do an  
 actual
 append or realloc).  I prefer the first solution over the second.

 I like the current behavior *except* for appending.  Most of the time it
 does what you want, and the syntax is beautiful.

 In regards to disallowing x ~= y, I'd propose you at least make it
 equivalent to x = x ~ y instead of removing it.

 If you're going to do ~= a lot then you should convert to the dynamic
 array type.
 If you're not going to do ~= a lot, then you can afford to write out x =  
 x ~ y.

 The bottom line is that it just doesn't make sense to append onto a
 "view" type.  It's really a kind of constness.  Having a view says the
 underlying memory locations you are looking at are fixed.  It doesn't
 make sense to imply there's an operation that can change those memory
 locations (other than shrinking the window to view fewer of them).

Having the append operation extend into already allocated memory is an  
optimization.  In this case, it's an optimization that can corrupt memory.

If we can make append extend into already allocated memory *and* not cause  
corruption, I don't see the downside.  And then there is one less array  
type to deal with (, create functions that handle, etc.).

Besides, I think Andrei's LRU solution is better than mine (and pretty  
much in line with it).

I still think having an Appender object or struct is a worthwhile thing,  
the "pre-allocate array then set length to zero" model is a hack at best.

-Steve

Oct 20 2009

grauzone <none example.net> writes:

Steven Schveighoffer wrote:
 I still think having an Appender object or struct is a worthwhile thing, 
 the "pre-allocate array then set length to zero" model is a hack at best.

Would that work with Andrei's append cache at all? Setting the length to 
zero and then appending is like taking a slice of length 0 and then 
appending.

Maybe introduce a write/readable .capacity property, that magically 
accesses the cache/GC?

 -Steve

Oct 20 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

grauzone wrote:
 Steven Schveighoffer wrote:
 I still think having an Appender object or struct is a worthwhile 
 thing, the "pre-allocate array then set length to zero" model is a 
 hack at best.

 
 Would that work with Andrei's append cache at all? Setting the length to 
 zero and then appending is like taking a slice of length 0 and then 
 appending.
 
 Maybe introduce a write/readable .capacity property, that magically 
 accesses the cache/GC?

For my money, I'd get rid of that trick:

a.length = 1000;
a.length = 0;
for (...) a ~= x;


Andrei

Oct 20 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Tue, 20 Oct 2009 21:01:17 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 grauzone wrote:
 Steven Schveighoffer wrote:
 I still think having an Appender object or struct is a worthwhile  
 thing, the "pre-allocate array then set length to zero" model is a  
 hack at best.

  Would that work with Andrei's append cache at all? Setting the length  
 to zero and then appending is like taking a slice of length 0 and then  
 appending.
  Maybe introduce a write/readable .capacity property, that magically  
 accesses the cache/GC?

 For my money, I'd get rid of that trick:

 a.length = 1000;
 a.length = 0;
 for (...) a ~= x;


 Andrei

I agree it's ugly but that's the best we have in D, and it looks like  
things are getting even worse...

Oct 20 2009

grauzone <none example.net> writes:

Andrei Alexandrescu wrote:
 grauzone wrote:
 Steven Schveighoffer wrote:
 I still think having an Appender object or struct is a worthwhile 
 thing, the "pre-allocate array then set length to zero" model is a 
 hack at best.

 Would that work with Andrei's append cache at all? Setting the length 
 to zero and then appending is like taking a slice of length 0 and then 
 appending.

 Maybe introduce a write/readable .capacity property, that magically 
 accesses the cache/GC?

 
 For my money, I'd get rid of that trick:
 
 a.length = 1000;
 a.length = 0;
 for (...) a ~= x;

Yes, that's what we currently use for setting the capacity of an array. 
And it looks stupid and non-intuitive; someone new to D might think it's 
a no-op.

Better way please?

 
 Andrei

Oct 20 2009

Bill Baxter <wbaxter gmail.com> writes:

On Tue, Oct 20, 2009 at 8:50 AM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 On Tue, 20 Oct 2009 11:10:20 -0400, Bill Baxter <wbaxter gmail.com> wrote=

:
 On Tue, Oct 20, 2009 at 6:25 AM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 On Sun, 18 Oct 2009 17:05:39 -0400, Walter Bright
 <newshound1 digitalmars.com> wrote:

 The purpose of T[new] was to solve the problems T[] had with passing T=




[]
 to a function and then the function resizes the T[]. What happens with
 the
 original?

 The solution we came up with was to create a third array type, T[new],
 which was a reference type.

 Andrei had the idea that T[new] could be dispensed with by making a
 "builder" library type to handle creating arrays by doing things like
 appending, and then delivering a finished T[] type. This is similar to
 what
 std.outbuffer and std.array.Appender do, they just need a bit of
 refining.

 The .length property of T[] would then become an rvalue only, not an
 lvalue, and ~=3D would no longer be allowed for T[].

 We both feel that this would simplify D, make it more flexible, and
 remove
 some awkward corner cases like the inability to say a.length++.

 What do you think?

 At the risk of sounding like bearophile -- I've proposed 2 solutions in
 the
 past for this that *don't* involve creating a T[new] type.

 1. Store the allocated length in the GC structure, then only allow
 appending
 when the length of the array being appended matches the allocated lengt=



h.
 2. Store the allocated length at the beginning of the array, and use a
 bit
 in the array length to determine if it starts at the beginning of the
 block.

 The first solution has space concerns, and the second has lots more
 concerns, but can help in the case of having to do a GC lookup to
 determine
 if a slice can be appended (you'd still have to lock the GC to do an
 actual
 append or realloc). =A0I prefer the first solution over the second.

 I like the current behavior *except* for appending. =A0Most of the time=



 it
 does what you want, and the syntax is beautiful.

 In regards to disallowing x ~=3D y, I'd propose you at least make it
 equivalent to x =3D x ~ y instead of removing it.

 If you're going to do ~=3D a lot then you should convert to the dynamic
 array type.
 If you're not going to do ~=3D a lot, then you can afford to write out x=


 =3D x
 ~ y.

 The bottom line is that it just doesn't make sense to append onto a
 "view" type. =A0It's really a kind of constness. =A0Having a view says t=


he
 underlying memory locations you are looking at are fixed. =A0It doesn't
 make sense to imply there's an operation that can change those memory
 locations (other than shrinking the window to view fewer of them).

 Having the append operation extend into already allocated memory is an
 optimization. =A0In this case, it's an optimization that can corrupt memo=

ry.
 If we can make append extend into already allocated memory *and* not caus=

e
 corruption, I don't see the downside. =A0And then there is one less array=

 type
 to deal with (, create functions that handle, etc.).

 Besides, I think Andrei's LRU solution is better than mine (and pretty mu=

ch
 in line with it).

 I still think having an Appender object or struct is a worthwhile thing, =

the
 "pre-allocate array then set length to zero" model is a hack at best.

But you still have the problem Andrei posted.  Code like this:

void func(int[] x)
{
     x ~=3D 3;
     x[0] =3D 42;
}

it'll compile and maybe run just fine, but there's no way to know if
the caller will see the 42 or not.   Unpredictable behavior like that
is breeding grounds for subtle bugs.

Perhaps that potential for bugs can be reduced by turning off the LRU
stuff in debug builds, and just making ~=3D reallocate always there.
Since, as you said, it's an optimization, makes sense to only turn it
on in release or maybe optimized builds.

To Andrei, do you really feel comfortable trying to explain this in
your book?  It seems like it will be difficult to explain that ~=3D is
sometimes efficient for appending but not necessarily if you're
working with a lot of arrays because it actually keeps this cache
under the hood that may or may not remember the actual underlying
capacity of the array you're appending to, so you should probably use
ArrayBuilder if you can, despite the optimization.

--bb

Oct 20 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bill Baxter wrote:
 To Andrei, do you really feel comfortable trying to explain this in
 your book?  It seems like it will be difficult to explain that ~= is
 sometimes efficient for appending but not necessarily if you're
 working with a lot of arrays because it actually keeps this cache
 under the hood that may or may not remember the actual underlying
 capacity of the array you're appending to, so you should probably use
 ArrayBuilder if you can, despite the optimization.

I guess I'll try and let you all know.

Andrei

Oct 20 2009

Bill Baxter <wbaxter gmail.com> writes:

On Tue, Oct 20, 2009 at 10:05 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 To Andrei, do you really feel comfortable trying to explain this in
 your book? =A0It seems like it will be difficult to explain that ~=3D is
 sometimes efficient for appending but not necessarily if you're
 working with a lot of arrays because it actually keeps this cache
 under the hood that may or may not remember the actual underlying
 capacity of the array you're appending to, so you should probably use
 ArrayBuilder if you can, despite the optimization.

 I guess I'll try and let you all know.

I can also see this becoming an Effective D tip --
"""


For common cases appending to slices is fast.  However the performance
depends on a hidden LRU cache to remember the capacities of the most
recent N arrays.  This works fine until you hit that N limit.
Unfortunately as you compose code together it is easy to overflow that
cache without realizing it, leading to sudden performance drops for no
apparent reason.   Thus we suggest you always use ArrayBuilder when
appending to arrays rather than slices.
"""

Or not.  This is one of those places where some data is really needed.
 It may be that 99.9% of code is only actively appending to 4 arrays
or fewer.  It just seems too tricky that this innocent-looking code:

     int[] i;
     foreach(k; 1..20_000) {
             i ~=3D some_function(k);
     }

could hit a performance cliff based on how many arrays get used deep
in the call chain of some_function().   Granted, cache issues can
cause these kinds of cliffs for any kind of code, but I suspect this
cliff would be particularly noticeable, given the slowness of
allocations.

--bb

Oct 20 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 20 Oct 2009 12:30:57 -0400, Bill Baxter <wbaxter gmail.com> wrote:

 On Tue, Oct 20, 2009 at 8:50 AM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 On Tue, 20 Oct 2009 11:10:20 -0400, Bill Baxter <wbaxter gmail.com>  
 wrote:

 On Tue, Oct 20, 2009 at 6:25 AM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 On Sun, 18 Oct 2009 17:05:39 -0400, Walter Bright
 <newshound1 digitalmars.com> wrote:

 The purpose of T[new] was to solve the problems T[] had with passing  
 T[]
 to a function and then the function resizes the T[]. What happens  
 with
 the
 original?

 The solution we came up with was to create a third array type,  
 T[new],
 which was a reference type.

 Andrei had the idea that T[new] could be dispensed with by making a
 "builder" library type to handle creating arrays by doing things like
 appending, and then delivering a finished T[] type. This is similar  
 to
 what
 std.outbuffer and std.array.Appender do, they just need a bit of
 refining.

 The .length property of T[] would then become an rvalue only, not an
 lvalue, and ~= would no longer be allowed for T[].

 We both feel that this would simplify D, make it more flexible, and
 remove
 some awkward corner cases like the inability to say a.length++.

 What do you think?

 At the risk of sounding like bearophile -- I've proposed 2 solutions  
 in
 the
 past for this that *don't* involve creating a T[new] type.

 1. Store the allocated length in the GC structure, then only allow
 appending
 when the length of the array being appended matches the allocated  
 length.

 2. Store the allocated length at the beginning of the array, and use a
 bit
 in the array length to determine if it starts at the beginning of the
 block.

 The first solution has space concerns, and the second has lots more
 concerns, but can help in the case of having to do a GC lookup to
 determine
 if a slice can be appended (you'd still have to lock the GC to do an
 actual
 append or realloc).  I prefer the first solution over the second.

 I like the current behavior *except* for appending.  Most of the time  
 it
 does what you want, and the syntax is beautiful.

 In regards to disallowing x ~= y, I'd propose you at least make it
 equivalent to x = x ~ y instead of removing it.

 If you're going to do ~= a lot then you should convert to the dynamic
 array type.
 If you're not going to do ~= a lot, then you can afford to write out x  
 = x
 ~ y.

 The bottom line is that it just doesn't make sense to append onto a
 "view" type.  It's really a kind of constness.  Having a view says the
 underlying memory locations you are looking at are fixed.  It doesn't
 make sense to imply there's an operation that can change those memory
 locations (other than shrinking the window to view fewer of them).

 Having the append operation extend into already allocated memory is an
 optimization.  In this case, it's an optimization that can corrupt  
 memory.

 If we can make append extend into already allocated memory *and* not  
 cause
 corruption, I don't see the downside.  And then there is one less array  
 type
 to deal with (, create functions that handle, etc.).

 Besides, I think Andrei's LRU solution is better than mine (and pretty  
 much
 in line with it).

 I still think having an Appender object or struct is a worthwhile  
 thing, the
 "pre-allocate array then set length to zero" model is a hack at best.

 But you still have the problem Andrei posted.  Code like this:

 void func(int[] x)
 {
      x ~= 3;
      x[0] = 42;
 }

depending on what you want, you then rewrite:

 void func(int[] x)
 {
      x[0] = 42;
      x ~= 3;
 }

or


 void func(int[] x)
 {
      x = x ~ 3;
      x[0] = 42;
 }

Generally when you are appending, you are not also changing the original  
data, so you don't care whether it's an optimization or not.

Your code is obviously broken anyways, since *nobody* ever sees the 3.

 it'll compile and maybe run just fine, but there's no way to know if
 the caller will see the 42 or not.   Unpredictable behavior like that
 is breeding grounds for subtle bugs.

I'm sure we could spend days coming up with code that introduces subtle  
bugs.  You can't fix all bugs that people may write.  I don't think your  
scenario is very likely.

More importantly, the problem with the current appending behavior is this:

void foo(int[] x)
{
   x ~= 3;
   ...
}

That may have just corrupted some data that you don't own, so defensively,  
you should write:

void foo(int[] x)
{
   x = x ~ 3;
   ...
}

But with Andrei's solution, you cannot possibly corrupt data with this  
line.  Now, if you then go and set one of the values in the original array  
(like you did), then you may or may not change the original array.  But as  
the function takes a mutable array, *you own the array* so it is a mistake  
to think when you pass in an array that's not const, you should expect it  
to remain unchanged.  If your goal is to affect the original array, then  
you should accept a ref argument or not append to it.

-Steve

Oct 20 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Steven Schveighoffer wrote:
 If your goal is to affect the 
 original array, then you should accept a ref argument or not append to it.

I think that's an entirely reasonable (and easy to explain) stance.

Andrei

Oct 20 2009

Bill Baxter <wbaxter gmail.com> writes:

On Tue, Oct 20, 2009 at 11:30 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Steven Schveighoffer wrote:
 If your goal is to affect the original array, then you should accept a ref
 argument or not append to it.

 I think that's an entirely reasonable (and easy to explain) stance.

I've definitely spent time tracking down exactly such bugs, where I
meant to make the argument a ref but didn't.  If the above is to be
the official stance, then I think it should be enforced by the
compiler.  Appending to non-ref slice args should be an error.

--bb

Oct 20 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 20 Oct 2009 14:43:16 -0400, Bill Baxter <wbaxter gmail.com> wrote:

 On Tue, Oct 20, 2009 at 11:30 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Steven Schveighoffer wrote:
 If your goal is to affect the original array, then you should accept a  
 ref
 argument or not append to it.

 I think that's an entirely reasonable (and easy to explain) stance.

 I've definitely spent time tracking down exactly such bugs, where I
 meant to make the argument a ref but didn't.  If the above is to be
 the official stance, then I think it should be enforced by the
 compiler.  Appending to non-ref slice args should be an error.

Except when you are passing ownership of an array.  Basically, there are  
four modes:

T[] x : a) you are passing ownership to the function, and the array might  
not be deterministically usable after the function returns, e.g. T[]  
padAndCapitalize(T[]).  Usually such functions' return values are  
deterministic, and the focus of what you care about.

   -or-
         b) you are lending ownership to the function, only the array  
elements will be altered, e.g. replace().

ref T[] x : You are lending ownership of the array to the function, but  
you get ownership back, e.g. push().  Fully deterministic altering.

const(T)[] x : You retain ownership of the array, the function cannot  
alter it, e.g. toUpper().

So it's not possible to flag on the signature between the first a) and b)  
modes, we must currently rely on documentation.  But the "append and  
modify" routines are pretty rare.

Essentially, what you want is a type where the length is const, but the  
data is mutable.  I think creating such a type should be possible in the  
library.

But fixing the append problem alone is a huge step forward, since all of  
these cases could result in corruption of totally unrelated data if the  
arrays were appended to.  Especially the const version is bad.

-Steve

Oct 20 2009

Stewart Gordon <smjg_1998 yahoo.com> writes:

I've just caught up on this thread.  Built-in dynamic arrays have been 
one of D's major features since the early days.  Now you're planning to 
remove this feature?

http://www.digitalmars.com/d/1.0/builtin.html
states that the C++ STL has many types that have been created to 
compensate for the limitations of the built-in array type, and the power 
of D's built-in arrays largely eliminates the need for these.  So now 
you're suggesting that we do away with this power, and create these 
library types that the point was to avoid?

Walter Bright wrote:
<snip>
 The .length property of T[] would then become an rvalue only, not an 
 lvalue, and ~= would no longer be allowed for T[].

I thought you were already moving that functionality into T[new].

I think T[new] versus T[] is actually a good design - it makes for a 
form of array length constancy as well as possibly getting rid of such 
nasties as bug 2093.

 We both feel that this would simplify D, make it more flexible, and 
 remove some awkward corner cases like the inability to say a.length++.

I must've missed the discussion - what's wrong with fixing such 
expressions to work in terms of property setters/getters in the natural way?

Stewart.

Oct 30 2009

D Programming

C/C++ Programming

Other

digitalmars.D - The demise of T[new]