digitalmars.D - Re: D array expansion and non-deterministic re-allocation

dsimcha Wrote:

 
 The one thing that I think has been missing from this discussion is, what
would be
 the alternative if we didn't have this "non-deterministic" reallocation?  How
else
 could you **efficiently** implement dynamic arrays?


In the long run (D3), I proposed using the "unique" type modifier. If an array
is unique, the compiler knows that there are no slices to worry about, and it
can use in-place reallocation to its heart content. That pretty much solves the
performance problem. 

In the short run (D2), I would suggest sticking to "reallocate on every
extension" semantics (especially in SafeD) and provide a library solution (a la
C++ std::vector) where the performance of appending is an issue.

Nov 17 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Bartosz Milewski (bartosz-nospam relisoft.com)'s article
 dsimcha Wrote:
 The one thing that I think has been missing from this discussion is, what
would be
 the alternative if we didn't have this "non-deterministic" reallocation?  How
else
 could you **efficiently** implement dynamic arrays?




use in-place reallocation to its heart content. That pretty much solves the
performance problem.
 In the short run (D2), I would suggest sticking to "reallocate on every


C++ std::vector) where the performance of appending is an issue.

Probably not a bad idea, since:

1.  I've concluded that appending to slices simply can't be made both efficient
and safe.

2.  We'll probably get a std.collections anyhow.

3.  If you **really** care about performance, you should only append when you
don't know the length in advance.  If you know the length, you should always
pre-allocate.

Nov 17 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

dsimcha wrote:
 == Quote from Bartosz Milewski (bartosz-nospam relisoft.com)'s article
 dsimcha Wrote:
 The one thing that I think has been missing from this discussion is, what
would be
 the alternative if we didn't have this "non-deterministic" reallocation?  How
else
 could you **efficiently** implement dynamic arrays?




 use in-place reallocation to its heart content. That pretty much solves the
 performance problem.
 In the short run (D2), I would suggest sticking to "reallocate on every


 C++ std::vector) where the performance of appending is an issue.
 
 Probably not a bad idea, since:
 
 1.  I've concluded that appending to slices simply can't be made both efficient
 and safe.
 
 2.  We'll probably get a std.collections anyhow.
 
 3.  If you **really** care about performance, you should only append when you
 don't know the length in advance.  If you know the length, you should always
 pre-allocate.


We will have collections and all those good things, but I don't see how 
the proposal follows from the feedback. My perception is that this is a 
group of people who use D. Bartosz' concern didn't cause quite a riot, 
so as far as I can tell there is no big issue at stake. (Add to that the 
fact that today's arrays are even _more_ potentially puzzling because, 
without the MRU cache, expanding slices may trample over other slices.)

In contrast, slowness of ~= comes time and again to haunt this 
newsgroup. Want it or not, people will want to use T[] and ~= to manage 
arrays. We could remove the ~= operator and require use of a library 
type for appends, but the aforementioned facts suggest that that might 
not be the best decision.

One thing that Bartosz didn't mention is that "this unique" is different 
than "the other unique", which I think reveals the half-bakedness of the 
proposal. "The other unique" that was intensively discussed before is 
transitive, meaning that the root reference points to a graph of objects 
having no other connection to the outer world. "This unique" is very 
different because it's shallow - only the array chunk is unaliased, but 
the members of the array don't need to be. This is a very important 
distinction.


Andrei

Nov 17 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Leandro Lucarella wrote:
 Andrei Alexandrescu, el 17 de noviembre a las 18:45 me escribiste:
 3.  If you **really** care about performance, you should only append when you
 don't know the length in advance.  If you know the length, you should always
 pre-allocate.


 how the proposal follows from the feedback. My perception is that
 this is a group of people who use D. Bartosz' concern didn't cause
 quite a riot, so as far as I can tell there is no big issue at
 stake.


 I didn't say anything (until now) because this was discussed already.
 Dynamic arrays/slices appending is horribly broken (I know it's well
 defined, and deterministic, but you are just condemned to make mistakes,
 it just doesn't work as one would expect, even when you know how it works
 you have to keep fighting your intuition all the time).


In which ways do you think arrays horribly broken? Same as Bartosz mentions?

One question is whether you actually have had bugs and problems coding, 
or if arrays stopped you from getting work done.

 But it's a little pointless to keep making riots, when we were so close to
 finally fix this with T[new]/whatever and it all went back.


Hmmmm... resignation is not a trait of this group :o).


Andrei

Nov 18 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Andrei Alexandrescu Wrote:

 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 17 de noviembre a las 18:45 me escribiste:
 3.  If you **really** care about performance, you should only append when you
 don't know the length in advance.  If you know the length, you should always
 pre-allocate.


 how the proposal follows from the feedback. My perception is that
 this is a group of people who use D. Bartosz' concern didn't cause
 quite a riot, so as far as I can tell there is no big issue at
 stake.


 I didn't say anything (until now) because this was discussed already.
 Dynamic arrays/slices appending is horribly broken (I know it's well
 defined, and deterministic, but you are just condemned to make mistakes,
 it just doesn't work as one would expect, even when you know how it works
 you have to keep fighting your intuition all the time).


 In which ways do you think arrays horribly broken? Same as Bartosz mentions?
 
 One question is whether you actually have had bugs and problems coding, 
 or if arrays stopped you from getting work done.
 


What Leandro points out is that using D arrays is not straightforward and may
present a steep learning curve. In particular, even the beginner would be
required to understand section 4.1.9 of TDPL (Expanding). For many people
arrays' behavior might be counter-intuitive and a source of mistakes.

In fact, even after reading 4.1.9 I don't know what to expect in some cases.
Here's an example:

int[] a = [0];
auto b = a;
a ~= 1;
b ~= 2;
What is a[1]?

Is this considered "stomping" and requiring a re-allocation?

Nov 18 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Steven Schveighoffer Wrote:

 In fact, even after reading 4.1.9 I don't know what to expect in some  
 cases. Here's an example:

 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?


 a[1] would be 1 if an MRU cache was introduced.  This is exactly the sort  
 of problem that the MRU cache is supposed to solve.
 
 What should happen is:
 
 a ~= 1 should allocate in place because its parameters were in the MRU  
 cache.
 b ~= 1 should reallocate because the a~=1 should have removed its  
 parameters from the MRU cache.
 


Notice that you are using particular implementation detail (MRU cache) to
explain the semantics of D arrays. There is a very important distinction
between language specification and compiler implementation. Andrei already had
to go pretty deep into implementation to describe arrays and slices: You can't
define D arrays without talking about shared buffers and memory allocation. I
don't think including the description of the MRU cache in the language
specification is the right solution. But I'm afraid an abstract definition of
"stomping" might turn out to be quite non-trivial.

Nov 19 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Bartosz Milewski (bartosz-nospam relisoft.com)'s article
 Steven Schveighoffer Wrote:
 In fact, even after reading 4.1.9 I don't know what to expect in some
 cases. Here's an example:

 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?


 a[1] would be 1 if an MRU cache was introduced.  This is exactly the sort
 of problem that the MRU cache is supposed to solve.

 What should happen is:

 a ~= 1 should allocate in place because its parameters were in the MRU
 cache.
 b ~= 1 should reallocate because the a~=1 should have removed its
 parameters from the MRU cache.




language specification and compiler implementation. Andrei already had to go
pretty deep into implementation to describe arrays and slices: You can't define
D
arrays without talking about shared buffers and memory allocation. I don't think
including the description of the MRU cache in the language specification is the
right solution. But I'm afraid an abstract definition of "stomping" might turn
out
to be quite non-trivial.

Face it, D is a close to the metal language.  Any attempt to define the spec at
a
purely abstract level without reference to any implementation details, and
without
leaving certain details implementation defined is absurd.  I think people need
to
face the fact that you can't have close-to-the-metal flexibility and performance
and at the same time far-from-the-metal strict separation of specification from
implementation details.

Nov 19 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Leandro Lucarella wrote:
 dsimcha, el 19 de noviembre a las 23:21 me escribiste:
 Face it, D is a close to the metal language.  Any attempt to define the spec
at a
 purely abstract level without reference to any implementation details, and
without
 leaving certain details implementation defined is absurd.  I think people need
to
 face the fact that you can't have close-to-the-metal flexibility and
performance
 and at the same time far-from-the-metal strict separation of specification from
 implementation details.


 Yes, but I think this is a place where you are doing early optimization at
 the expense of usability. You can always do the slice-thinguie yourself if
 you are that needed for speed that you can't afford an extra indirection
 when using an array (which will be the only overhead if T[new] were
 implemented as a pure reference type).
 
 I think the cost here in terms of usability if far higher than the little
 performance you loose, specially when you can overcome the issue for those
 weird cases where you need it.


I think one aspect that tends to be forgotten is that built-in arrays 
are the lowest abstraction - really just one small step above pointers. 
Efficiency lost there is lost forever. That's why I consider the current 
design adequate and the proposed designs less so.

In other words: you can build a more convenient facility on top of T[], 
but you couldn't build a more efficient facility on top of a convenient 
one. I think D made the right choice with built-in arrays.


Andrei

Nov 19 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Denis Koroskin wrote:
 On Fri, 20 Nov 2009 04:13:42 +0300, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Leandro Lucarella wrote:
 dsimcha, el 19 de noviembre a las 23:21 me escribiste:
 Face it, D is a close to the metal language.  Any attempt to define 
 the spec at a
 purely abstract level without reference to any implementation 
 details, and without
 leaving certain details implementation defined is absurd.  I think 
 people need to
 face the fact that you can't have close-to-the-metal flexibility and 
 performance
 and at the same time far-from-the-metal strict separation of 
 specification from
 implementation details.


 optimization at
 the expense of usability. You can always do the slice-thinguie 
 yourself if
 you are that needed for speed that you can't afford an extra indirection
 when using an array (which will be the only overhead if T[new] were
 implemented as a pure reference type).
  I think the cost here in terms of usability if far higher than the 
 little
 performance you loose, specially when you can overcome the issue for 
 those
 weird cases where you need it.


 I think one aspect that tends to be forgotten is that built-in arrays 
 are the lowest abstraction - really just one small step above 
 pointers. Efficiency lost there is lost forever. That's why I consider 
 the current design adequate and the proposed designs less so.

 In other words: you can build a more convenient facility on top of 
 T[], but you couldn't build a more efficient facility on top of a 
 convenient one. I think D made the right choice with built-in arrays.


 Andrei


 Not so true. T[] appends are slow, *very* slow. Whenever I need anything 
 efficient, I always use an additional "size" field (and arr.length is 
 used as capacity). This is tricky, error-prone and not too intuitive to 
 newcomers. As such, it contradicts with all your statements above.


I was speaking somewhat anticipatively; the MRU is slated to accelerate 
append speed while also improving its safety. I have tried in vain to 
convince someone to look into implementing it.

 And there is also a hack to make it work a bit faster:
 
 T[] arr;
 arr.length = capacity
 arr.length = 0;
 
 Much to my disgust, having no higher-level abstraction in language core 
 only encourages writing code like this.


I hope that idiom will be put to rest soon.

 TBH, I used to write code like above before, too, but not now. I slowly 
 eliminate everything like this (as well as all the ~= operations) from 
 the code I wrote. It's unpredictable, slow, and leads to hard-to-fix bugs.
 
 In other words, I disagree with your opinion that D made a right choice 
 with built-in pseudo-arrays.


I really meant to say "will have made".


Andrei

Nov 19 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Leandro Lucarella wrote:
 Andrei Alexandrescu, el 19 de noviembre a las 17:13 me escribiste:
 Leandro Lucarella wrote:
 dsimcha, el 19 de noviembre a las 23:21 me escribiste:
 Face it, D is a close to the metal language.  Any attempt to define the spec
at a
 purely abstract level without reference to any implementation details, and
without
 leaving certain details implementation defined is absurd.  I think people need
to
 face the fact that you can't have close-to-the-metal flexibility and
performance
 and at the same time far-from-the-metal strict separation of specification from
 implementation details.


 the expense of usability. You can always do the slice-thinguie yourself if
 you are that needed for speed that you can't afford an extra indirection
 when using an array (which will be the only overhead if T[new] were
 implemented as a pure reference type).

 I think the cost here in terms of usability if far higher than the little
 performance you loose, specially when you can overcome the issue for those
 weird cases where you need it.


 arrays are the lowest abstraction - really just one small step above
 pointers. Efficiency lost there is lost forever. That's why I
 consider the current design adequate and the proposed designs less
 so.


 I don't think so, slices (a view over somebody elses memory, no appending)
 should be the lowest level abstraction.
 
 In other words: you can build a more convenient facility on top of
 T[], but you couldn't build a more efficient facility on top of a
 convenient one. I think D made the right choice with built-in
 arrays.


 I don't. Current built-in arrays are both dynamic arrays and slices, which
 is what messes all up.


Well it also makes arrays/slices quite capable.

Right now they only have one problem (which different people ascribe a 
different gravity to) and are otherwise very expressive. Now think of 
this: the MRU solves a grave issue with slices. We haven't thought of 
that for years, and all of a sudden stomping is not an issue anymore. I 
wouldn't be in the least surprised if someone posts in this group 
tomorrow that they found a way to make ~= more predictable.

 But I'll drop the subject, once again, for the sake of... well, everybody
 ;)


I encourage you to further your point (while in understanding that your 
existing arguments and proposals have been understood). This is the best 
time to make a contribution to D's design.

We may as well undefine "~" or "~=" or both for arrays if there's enough 
arguments and evidence that it's a problem to keep them. When making 
proposals just keep in mind one important thing: with the context that 
you (all) have accumulated regarding this discussion, a lot of things 
seem very easy to understand, which are really weird when just starting 
with the language.

One other thing is that it's all tradeoffs; things that are "good" in 
all dimensions are in very short supply. With arrays, we may be in the 
position of choosing a solution that won't please everyone. The question 
right now is "which 5% of users you will disappoint". As for reviving 
T[new], I believe that, unless some new insight comes forward, 
practically that ship has sailed. That doesn't preclude defining a 
library type that is a more encapsulated array. Fortunately, that can be 
implemented using T[] because T[] is expressive enough :o).


Andrei

Nov 19 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 19 de noviembre a las 17:13 me escribiste:
 Leandro Lucarella wrote:
 dsimcha, el 19 de noviembre a las 23:21 me escribiste:
 Face it, D is a close to the metal language.  Any attempt to define the










 purely abstract level without reference to any implementation details, and










 leaving certain details implementation defined is absurd.  I think people










 face the fact that you can't have close-to-the-metal flexibility and










 and at the same time far-from-the-metal strict separation of specification from
 implementation details.


 the expense of usability. You can always do the slice-thinguie yourself if
 you are that needed for speed that you can't afford an extra indirection
 when using an array (which will be the only overhead if T[new] were
 implemented as a pure reference type).

 I think the cost here in terms of usability if far higher than the little
 performance you loose, specially when you can overcome the issue for those
 weird cases where you need it.


 arrays are the lowest abstraction - really just one small step above
 pointers. Efficiency lost there is lost forever. That's why I
 consider the current design adequate and the proposed designs less
 so.


 I don't think so, slices (a view over somebody elses memory, no appending)
 should be the lowest level abstraction.

 In other words: you can build a more convenient facility on top of
 T[], but you couldn't build a more efficient facility on top of a
 convenient one. I think D made the right choice with built-in
 arrays.


 I don't. Current built-in arrays are both dynamic arrays and slices, which
 is what messes all up.


 Right now they only have one problem (which different people ascribe a
 different gravity to) and are otherwise very expressive. Now think of
 this: the MRU solves a grave issue with slices. We haven't thought of
 that for years, and all of a sudden stomping is not an issue anymore. I
 wouldn't be in the least surprised if someone posts in this group
 tomorrow that they found a way to make ~= more predictable.
 But I'll drop the subject, once again, for the sake of... well, everybody
 ;)


 existing arguments and proposals have been understood). This is the best
 time to make a contribution to D's design.
 We may as well undefine "~" or "~=" or both for arrays if there's enough
 arguments and evidence that it's a problem to keep them. When making
 proposals just keep in mind one important thing: with the context that
 you (all) have accumulated regarding this discussion, a lot of things
 seem very easy to understand, which are really weird when just starting
 with the language.
 One other thing is that it's all tradeoffs; things that are "good" in
 all dimensions are in very short supply. With arrays, we may be in the
 position of choosing a solution that won't please everyone. The question
 right now is "which 5% of users you will disappoint". As for reviving
 T[new], I believe that, unless some new insight comes forward,
 practically that ship has sailed. That doesn't preclude defining a
 library type that is a more encapsulated array. Fortunately, that can be
 implemented using T[] because T[] is expressive enough :o).
 Andrei


I'd be inclined to support the MRU (and even implement it) iff:

1.  We still get a good library array type that pulls out all the stops to make
appending efficient.  (I'm putting my effort where my mouth is and working on
writing one pursuant to my ideas about combining it with uniqueness right now,
actually).

2.  We state in the documentation that ~= on slices is a convenience that isn't
terribly efficient and recommend that people that care more about performance
than
simplicity use the library type for their serious array building, especially for
large arrays.

Nov 19 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Andrei Alexandrescu Wrote:
 I think one aspect that tends to be forgotten is that built-in arrays 
 are the lowest abstraction - really just one small step above pointers. 
 Efficiency lost there is lost forever. That's why I consider the current 
 design adequate and the proposed designs less so.


Let's look at your proposal from a slightly different perspective. First of
all, a conforming implementation is free to do the re-allocation on every
append. There are no complexity requirements specified. 

The controversy stems from the fact that the spec deliberately leaves a window
for certain optimizations that might result in the array being write-shared
after the append. In that case the spec dictates certain restrictions that are
expressed informally as no-stomping. BTW, we can't currently evaluate the
proposal until no-stomping is formally defined.

Because of that, a program written for and tested on one conforming
implementation may not work correctly on another, depending on what
optimization strategy is picked. That will probably lead to implementations
that use a compile-time switch to disable such optimizations. 

The bugs resulting from not understanding when write sharing is possible (and
I'm afraid a lot of programmers will have problems following the spec in this
area) will be subtle. They will also be "soft" (as opposed to "hard" bugs), and
those are much harder to discover and track down. 

All these things have to be carefully weighed. We are in uncharted territory.

Nov 20 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Steven Schveighoffer wrote:
[snip]
 Andrei has mentioned that he thinks we can store the allocated length in 
 the GC block, which I think would also work.  You also wouldn't need an 
 MRU cache in that case, but he says it's in *addition* to the MRU cache, 
 so I'm not sure what he means.



Reaching the GC block is relatively expensive, so the MRU still helps. 
In essence it's like this. When appending:

a) look up the cache, if there, you have O(1) amortized append that's 
really fast

b) if not in the cache, talk to the GC, still O(1) amortized append 
that's not as fast

Both help providing an important performance guarantee. I was a bit 
worried about guaranteeing "O(1) amortized append for up to 8 arrays at 
a time."


Andrei

Nov 24 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Steven Schveighoffer wrote:
 On Tue, 24 Nov 2009 11:01:10 -0500, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Steven Schveighoffer wrote:
 [snip]
 Andrei has mentioned that he thinks we can store the allocated length 
 in the GC block, which I think would also work.  You also wouldn't 
 need an MRU cache in that case, but he says it's in *addition* to the 
 MRU cache, so I'm not sure what he means.



 Reaching the GC block is relatively expensive, so the MRU still helps. 
 In essence it's like this. When appending:

 a) look up the cache, if there, you have O(1) amortized append that's 
 really fast

 b) if not in the cache, talk to the GC, still O(1) amortized append 
 that's not as fast

 Both help providing an important performance guarantee. I was a bit 
 worried about guaranteeing "O(1) amortized append for up to 8 arrays 
 at a time."


 Have you considered the performance impact on allocating non-array 
 types?  That is, are you intending on all allocations storing the 
 allocated length, even class or struct allocations who will likely never 
 append?  Or will there be a "non-appendable" flag?
 
 Also, the part without the MRU cache was my original proposal from last 
 year, I had some ideas on how length could be stored.  For example, in a 
 page of up to 128 byte blocks, you only need 8 bits to store the length 
 (alas, you cannot store with 4 bits for 16-byte blocks because you need 
 to cover both 0 and 16).  This reduces the overhead for those blocks.  
 For 256 byte to 1-page blocks, 16 bits is acceptable multi-page blocks, 
 the cost of storing a 32-bit integer is negligible.


Having access to the requested length is important at larger lengths, so 
probably we could be fine by not actually storing it for allocations up 
to 128 bytes.

 It is true the lookup of the MRU cache will not involve dissecting the 
 address of the block to find it's container block, but you still will 
 need a lock, or are you planning on doing something clever?   I think 
 the locking will be the bottleneck, and if you don't make it the same as 
 the global lock, add the cost of both locks when you actually need to 
 append.


The cache is a thread-local map from pointers to size_t. Using it does 
not require any locking I think.


Andrei

Nov 24 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bill Baxter wrote:
 On Tue, Nov 24, 2009 at 8:26 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Steven Schveighoffer wrote:
 On Tue, 24 Nov 2009 11:01:10 -0500, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 [snip]
 Andrei has mentioned that he thinks we can store the allocated length in
 the GC block, which I think would also work.  You also wouldn't need an MRU
 cache in that case, but he says it's in *addition* to the MRU cache, so I'm
 not sure what he means.



 Reaching the GC block is relatively expensive, so the MRU still helps. In
 essence it's like this. When appending:

 a) look up the cache, if there, you have O(1) amortized append that's
 really fast

 b) if not in the cache, talk to the GC, still O(1) amortized append
 that's not as fast

 Both help providing an important performance guarantee. I was a bit
 worried about guaranteeing "O(1) amortized append for up to 8 arrays at a
 time."


  That is, are you intending on all allocations storing the allocated length,
 even class or struct allocations who will likely never append?  Or will
 there be a "non-appendable" flag?

 Also, the part without the MRU cache was my original proposal from last
 year, I had some ideas on how length could be stored.  For example, in a
 page of up to 128 byte blocks, you only need 8 bits to store the length
 (alas, you cannot store with 4 bits for 16-byte blocks because you need to
 cover both 0 and 16).  This reduces the overhead for those blocks.  For 256
 byte to 1-page blocks, 16 bits is acceptable multi-page blocks, the cost of
 storing a 32-bit integer is negligible.


 probably we could be fine by not actually storing it for allocations up to
 128 bytes.

 It is true the lookup of the MRU cache will not involve dissecting the
 address of the block to find it's container block, but you still will need a
 lock, or are you planning on doing something clever?   I think the locking
 will be the bottleneck, and if you don't make it the same as the global
 lock, add the cost of both locks when you actually need to append.


 require any locking I think.


 Wait, what happens if you're sharing an array between two different threads?


The cache does not work with shared arrays, and threads cannot share 
non-shared arrays.


Andrei

Nov 24 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Steven Schveighoffer wrote:
 On Tue, 24 Nov 2009 11:41:17 -0500, Steven Schveighoffer 
 <schveiguy yahoo.com> wrote:
 
 
 The cache is a thread-local map from pointers to size_t. Using it 
 does not require any locking I think.


 When reallocating, do you not also need to update the allocated length 
 in the heap, even if the allocation fits into the same block to 
 prevent other threads from stomping?  Doesn't that require a lock?  
 Somewhere you have to share the allocated length between threads 
 sharing the same array.  I assumed it was going to be in the MRU cache...


 If all this is true, I had another thought for an MRU cache.  It could 
 simply be a map of pointers to the right blocks in the GC that store the 
 lengths (to avoid the block search with the global mutex locked).


That's a great idea. I keep on dreaming someone will implement all this 
stuff we talk about :o).

 Lookup should be atomic without locking (I think, simply an integer load 
 right?), but you'd have to lock to actually do an append.
 
 I don't think we solve the lock problem without having multiple heaps...


I don't think we need to worry about optimizing growth of shared arrays.


Andrei

Nov 24 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Steven Schveighoffer wrote:
 On Tue, 24 Nov 2009 11:41:17 -0500, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:


 The cache is a thread-local map from pointers to size_t. Using it
 does not require any locking I think.


 When reallocating, do you not also need to update the allocated length
 in the heap, even if the allocation fits into the same block to
 prevent other threads from stomping?  Doesn't that require a lock?
 Somewhere you have to share the allocated length between threads
 sharing the same array.  I assumed it was going to be in the MRU cache...


 If all this is true, I had another thought for an MRU cache.  It could
 simply be a map of pointers to the right blocks in the GC that store the
 lengths (to avoid the block search with the global mutex locked).


 stuff we talk about :o).
 Lookup should be atomic without locking (I think, simply an integer load
 right?), but you'd have to lock to actually do an append.

 I don't think we solve the lock problem without having multiple heaps...


 Andrei


But if the full semantics of shared are not implemented for D2 (will they be),
how
do we even detect shared arrays?

Nov 24 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

It's hard to argue whether something is or isn't bug prone and how severe the
bugs could be. An experienced programmer knows all the gotchas so he or she
will never introduce such bugs. A more bug-prone person will be afraid to speak
up in public so as not to appear stupid or be bullied away. It's only when the
language goes into production that the problems surface. Then an experienced
programmer becomes part of a group or not so experienced programmers and he or
she starts seeing how much the rests of the team is prone to certain bugs and
how hard it is to prevent or even detect them. That's why the best languages
attempt to turn as many errors as possible into hard bugs. 

Non-deterministic behavior is a major source of soft bugs. A buggy program
might work correctly 99% of the time, introduce minor errors 0.9% of the time,
and in 0.1% of the time lead to a disaster.

I will try to come up with some scenarios that would qualify for the "100
Gotchas with D arrays" to provide more food for though. Consider these little
language puzzles. 

Imagine you're reviewing this code written by a relative newbee:

char[] quote(char[] word) {
  word.length += 2;
  word[length-1] = '"';
  for(int i = word.length-2; i > 0; --i)
    word[i] = word[i - 1];
  word[0] = '"';
  return word;
}

void storeFoos(char[] buffer, string pattern, Container c)
{
	char[] res = find(buffer, pattern);
	c.store(quote(res[0..pattern.len]);
}

Is this buggy? If so, how and when will the bug manifest itself? 

Or another story. An enthusiastic programmer comes to you with a very clever
rewrite of a function. This is the original:

T[] duplicate(T)(T[] src) {
   static if (is(T == invariant))
      return src.idup;
   else
      return src.dup;
}

This is his improved version:

T[] duplicate(T)(T[] src) {
   T tmp = src[$-1];
   src.length -= 1;
   src ~= tmp;
   return src;
}

No need for static if! The extension is bound to re-allocate because of
stomping rules. Will this always work, or is there a gotcha?

Nov 24 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Steven Schveighoffer Wrote:
 Imagine you're reviewing this code written by a relative newbee:

 char[] quote(char[] word) {
   word.length += 2;
   word[length-1] = '"';
   for(int i = word.length-2; i > 0; --i)
     word[i] = word[i - 1];
   word[0] = '"';
   return word;
 }

 void storeFoos(char[] buffer, string pattern, Container c)
 {
 	char[] res = find(buffer, pattern);
 	c.store(quote(res[0..pattern.len]);
 }

 Is this buggy? If so, how and when will the bug manifest itself?


 In my mind it is not buggy.  If quote or storeFoos is not meant to be able  
 to molest the data, it should require a const(char)[].  If this was D1  
 code, I'd say it depends on the purpose of storeFoos, if storeFoos is  
 expecting to own the input buffer, it's not buggy.  If it's not, then  
 storeFoos is buggy (in all cases).  But we aren't talking about D1.


Yes, a lot depends on what's expected of storeFoos. Suppose that the
surrounding code expects the buffer not to be modified. The newbee didn't want
to idup the buffer because in most buffers the pattern is not found.

 Or another story. An enthusiastic programmer comes to you with a very  
 clever rewrite of a function. This is the original:

 T[] duplicate(T)(T[] src) {
    static if (is(T == invariant))
       return src.idup;
    else
       return src.dup;
 }

 This is his improved version:

 T[] duplicate(T)(T[] src) {
    T tmp = src[$-1];
    src.length -= 1;
    src ~= tmp;
    return src;
 }

 No need for static if! The extension is bound to re-allocate because of  
 stomping rules. Will this always work, or is there a gotcha?


 He forgot to check for length 0.
 


That he did. So now he goes back to the drawing board and adds an early return
for length 0. Is his code correct now?

Nov 24 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Steven Schveighoffer Wrote:

 On Tue, 24 Nov 2009 16:52:52 -0500, Bartosz Milewski  
 <bartosz-nospam relisoft.com> wrote:
 
 Steven Schveighoffer Wrote:
 Imagine you're reviewing this code written by a relative newbee:

 char[] quote(char[] word) {
   word.length += 2;
   word[length-1] = '"';
   for(int i = word.length-2; i > 0; --i)
     word[i] = word[i - 1];
   word[0] = '"';
   return word;
 }

 void storeFoos(char[] buffer, string pattern, Container c)
 {
 	char[] res = find(buffer, pattern);
 	c.store(quote(res[0..pattern.len]);
 }

 Is this buggy? If so, how and when will the bug manifest itself?


 In my mind it is not buggy.  If quote or storeFoos is not meant to be  
 able
 to molest the data, it should require a const(char)[].  If this was D1
 code, I'd say it depends on the purpose of storeFoos, if storeFoos is
 expecting to own the input buffer, it's not buggy.  If it's not, then
 storeFoos is buggy (in all cases).  But we aren't talking about D1.


 surrounding code expects the buffer not to be modified. The newbee  
 didn't want to idup the buffer because in most buffers the pattern is  
 not found.


 Then he forgot the const decoration for the parameter.  No duping is  
 necessary:
 
 void storeFoos(const(char)[] buffer, string pattern, Container c)
 {
     ... // will result in compiler errors for given quote function, fix  
 that.
 }


Well, he tried that, but then his quote function wouldn't work and he needs it
in other places as well. You know how they are ;-). Besides he thinks "const"
is for wimps. He won't stop arguing until you prove to him that there is an
actual bug in his code.

 Or another story. An enthusiastic programmer comes to you with a very
 clever rewrite of a function. This is the original:

 T[] duplicate(T)(T[] src) {
    static if (is(T == invariant))
       return src.idup;
    else
       return src.dup;
 }

 This is his improved version:

 T[] duplicate(T)(T[] src) {
    T tmp = src[$-1];
    src.length -= 1;
    src ~= tmp;
    return src;
 }

 No need for static if! The extension is bound to re-allocate because  


 stomping rules. Will this always work, or is there a gotcha?


 He forgot to check for length 0.


 That he did. So now he goes back to the drawing board and adds an early  
 return for length 0. Is his code correct now?


 Yes.
 
 I have a feeling you have some trick up your sleeve that you think makes  
 it incorrect :)


Well, I was wondering if it could be rewritten as:
T[] duplicate(T)(T[] src) {
    assert(src.length != 0);
    T tmp = src[$-1];
    src.length -= 1;
    src.length += 1;
    src[$-1] = tmp;
    return src;
}
and whether it's okay for the optimizer to do some simple arithmetic
optimization.

Nov 24 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Steven Schveighoffer Wrote:


 Bottom line: if a function isn't supposed to change the buffer, the  
 signature should be const for that parameter.  It's one of the principles  
 of const, and why it's in D2 in the first place.  I'd explain to the coder  
 that he is wrong to expect that modifying a buffer in a function that  
 isn't supposed to modify a buffer is acceptable (and hopefully he gets it,  
 or else I don't have time to deal with people who insist on being right  
 when they are not).
 
 BTW, in my experience, the newbie expectaction of ~= is usually that it  
 modifies the original even when it doesn't, not the other way around.
 


The guy insists that the reallocation happens in the quote function (otherwise
there would be stomping), so no sharing happens and the original is not
modified. He tested it on a variety of inputs! I'm not totally making it up--I
had this kind of arguments with C++ programmers. You tell them that it's safer
to use const and they laugh at you. "My code works, why should I change it!"

 Second, the optimizer cannot do simple math optimization on src.length -=  




 the length calls a function, which cannot be optimized out.


You are again resorting to implementation. I guess in the current
implementation it's true that the compiler will indeed insert a call to its
internal function. But the language spec does not proscribe that. 

Would you also argue that this optimization is incorrect? Changing:

a.length += 2;
a.length += 3;

to:

a.lenght += 5;

I'm not saying that this is an unsolvable problem. I'm just showing how hard it
is to reason about D arrays.

Nov 24 2009

Steven Schveighoffer <schveiguy yahoo.com> writes:

Bartosz Milewski Wrote:

 Steven Schveighoffer Wrote:
 
 
 Bottom line: if a function isn't supposed to change the buffer, the  
 signature should be const for that parameter.  It's one of the principles  
 of const, and why it's in D2 in the first place.  I'd explain to the coder  
 that he is wrong to expect that modifying a buffer in a function that  
 isn't supposed to modify a buffer is acceptable (and hopefully he gets it,  
 or else I don't have time to deal with people who insist on being right  
 when they are not).
 
 BTW, in my experience, the newbie expectaction of ~= is usually that it  
 modifies the original even when it doesn't, not the other way around.
 


 The guy insists that the reallocation happens in the quote function (otherwise
there would be stomping), so no sharing happens and the original is not
modified. He tested it on a variety of inputs! I'm not totally making it up--I
had this kind of arguments with C++ programmers. You tell them that it's safer
to use const and they laugh at you. "My code works, why should I change it!"


C++ const is kind of a joke :)

But in any case, it's trivial to come up with a case that causes buffer issues.
 If he insists on a test case, then I'd give him one.

Programmers like this don't last long at companies.  I had a coworker once who
insisted that a bug he was working on was going to be impossible to fix, so
there was no point in working on it.  My boss said that if he (the boss) could
fix it, the coworker would be let go.  My boss fixed it in one day.

And there are *tons* of logic errors that don't cause bugs for most inputs, I
don't see why we are focusing on this one case.

 
 Second, the optimizer cannot do simple math optimization on src.length -=  




 the length calls a function, which cannot be optimized out.


 You are again resorting to implementation. I guess in the current
implementation it's true that the compiler will indeed insert a call to its
internal function. But the language spec does not proscribe that. 
 


Sure it does.  It says that setting the length may reallocate the array to hold
enough space.  How do you do that without a function call?

 Would you also argue that this optimization is incorrect? Changing:
 
 a.length += 2;
 a.length += 3;
 
 to:
 
 a.lenght += 5;


To the developer?  no.  To the compiler?  yes.  The compiler cannot know what
affect it will have by making this optimization, it does not know what the
underlying function will do or the semantic meaning of those statements.  What
if adding to a.length outputs it's argument?  Then the optimizer would have
changed the output from 23 to 5.

But the developer who knows the semantic meaning can make the optimization
change.

 
 I'm not saying that this is an unsolvable problem. I'm just showing how hard
it is to reason about D arrays. 
 


I will agree that D arrays are hard to explain, I've had my share of training
sessions on IRC.  It's not an easy concept to get right off, but the benefits
of doing arrays this way are huge.

-Steve

Nov 25 2009

Steven Schveighoffer <schveiguy yahoo.com> writes:

Simen kjaeraas Wrote:

 Steven Schveighoffer <schveiguy yahoo.com> wrote:
 
 You are again resorting to implementation. I guess in the current  
 implementation it's true that the compiler will indeed insert a call to  
 its internal function. But the language spec does not proscribe that.


 Sure it does.  It says that setting the length may reallocate the array  
 to hold enough space.  How do you do that without a function call?


 As long as the compiler is aware of arrays' existence, and knows how
 those functions work, there might indeed be an optimization that turns
 length+=1;length-=1; into a nop. There isn't at the moment, and there
 might never be, but it's a conceivable optimization.


As long as the spec says changing length may expand the array to hold enough
space, the optimizer can't, because the optimization would change the side
effects of the function.  An optimizer should not change the outcome or side
effects of a function.  It's not unheard of for an optimizer to eliminate
important operations that it thinks are invalid, but in that case, it's a
broken optimizer, I don't see why we would add this case.

Now, I could see a future optimizer changing

a.length += 2;
a.length += 3;

into

a.length += 5;

But I don't see a huge gain.  Why not just write a.length += 5 in the first
place?

This point brought up is a non-issue, an optimizer that changes the
outcome/side effects of a function is a broken optimizer, end of story.

-Steve

Nov 25 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Steven Schveighoffer Wrote:

 As long as the spec says changing length may expand the array to hold enough
space, the optimizer can't, because the optimization would change the side
effects of the function.  An optimizer should not change the outcome or side
effects of a function.  It's not unheard of for an optimizer to eliminate
important operations that it thinks are invalid, but in that case, it's a
broken optimizer, I don't see why we would add this case.


This is all true for user-defined data types, but array is totally under
compiler's control. What optimizations are acceptable depends only on the
interpretation of the language spec. In particular the sequence

a.length -= 1;
a.length += 1;

as a whole does not result in stomping, so why would it be wrong to optimize it
away? I claim it's perfectly legitimate.

Nov 25 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bartosz Milewski wrote:
 Steven Schveighoffer Wrote:
 
 As long as the spec says changing length may expand the array to hold enough
space, the optimizer can't, because the optimization would change the side
effects of the function.  An optimizer should not change the outcome or side
effects of a function.  It's not unheard of for an optimizer to eliminate
important operations that it thinks are invalid, but in that case, it's a
broken optimizer, I don't see why we would add this case.


 This is all true for user-defined data types, but array is totally under
compiler's control. What optimizations are acceptable depends only on the
interpretation of the language spec. In particular the sequence
 
 a.length -= 1;
 a.length += 1;
 
 as a whole does not result in stomping, so why would it be wrong to optimize
it away? I claim it's perfectly legitimate. 


Yes, but this gets into optimizing sequences of complex operations, so 
the question is one of expectancy. What if I insert something there:

a.length -= 1;
writeln("Why oh why???");
a.length += 1;

After all people don't ordinarily expect C++ compilers to optimize away

a.resize(a.size() - 1);
a.resize(a.size() + 1);

Each function has some semantics and effects, and I don't know why we'd 
want to get down the route that effects depend on the flow in which the 
function gets called.

What is the example trying to accomplish?


Andrei

Nov 25 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Andrei Alexandrescu Wrote:

 Bartosz Milewski wrote:
 Steven Schveighoffer Wrote:
 
 As long as the spec says changing length may expand the array to hold enough
space, the optimizer can't, because the optimization would change the side
effects of the function.  An optimizer should not change the outcome or side
effects of a function.  It's not unheard of for an optimizer to eliminate
important operations that it thinks are invalid, but in that case, it's a
broken optimizer, I don't see why we would add this case.


 This is all true for user-defined data types, but array is totally under
compiler's control. What optimizations are acceptable depends only on the
interpretation of the language spec. In particular the sequence
 
 a.length -= 1;
 a.length += 1;
 
 as a whole does not result in stomping, so why would it be wrong to optimize
it away? I claim it's perfectly legitimate. 


 Yes, but this gets into optimizing sequences of complex operations, so 
 the question is one of expectancy. What if I insert something there:
 
 a.length -= 1;
 writeln("Why oh why???");
 a.length += 1;
 
 After all people don't ordinarily expect C++ compilers to optimize away
 
 a.resize(a.size() - 1);
 a.resize(a.size() + 1);
 
 Each function has some semantics and effects, and I don't know why we'd 
 want to get down the route that effects depend on the flow in which the 
 function gets called.
 
 What is the example trying to accomplish?
 


I'm trying to nail down the semantics. Here is the case where there provably is
no stomping. If this can be optimized away then the language guarantee has to
be formulated accordingly. Right now I don't know what to expect from the code
I wrote. Is it a good substitute for duplicate or not? 

I get worried when the semantics of a very important and essential
primitive--the array--is hard to explain and comprehend.

Nov 25 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bartosz Milewski wrote:
 Andrei Alexandrescu Wrote:
 
 Bartosz Milewski wrote:
 Steven Schveighoffer Wrote:
 
 As long as the spec says changing length may expand the array
 to hold enough space, the optimizer can't, because the
 optimization would change the side effects of the function.  An
 optimizer should not change the outcome or side effects of a
 function.  It's not unheard of for an optimizer to eliminate
 important operations that it thinks are invalid, but in that
 case, it's a broken optimizer, I don't see why we would add
 this case.


 totally under compiler's control. What optimizations are
 acceptable depends only on the interpretation of the language
 spec. In particular the sequence
 
 a.length -= 1; a.length += 1;
 
 as a whole does not result in stomping, so why would it be wrong
 to optimize it away? I claim it's perfectly legitimate.


 so the question is one of expectancy. What if I insert something
 there:
 
 a.length -= 1; writeln("Why oh why???"); a.length += 1;
 
 After all people don't ordinarily expect C++ compilers to optimize
 away
 
 a.resize(a.size() - 1); a.resize(a.size() + 1);
 
 Each function has some semantics and effects, and I don't know why
 we'd want to get down the route that effects depend on the flow in
 which the function gets called.
 
 What is the example trying to accomplish?
 


 I'm trying to nail down the semantics. Here is the case where there
 provably is no stomping. If this can be optimized away then the
 language guarantee has to be formulated accordingly. Right now I
 don't know what to expect from the code I wrote. Is it a good
 substitute for duplicate or not?


The answer is very simple, and it is present in the book. I have also 
repeated it here. Appending to an array may or may not terminate an 
existing sharing relationship, and it never initiates a sharing 
relationship. Assigning a larger length is equivalent to appending a 
number of default-initialized values. There is nothing difficult here, 
although convoluted examples can be created that compound "if"s in 
complex interactions.

 I get worried when the semantics of a very important and essential
 primitive--the array--is hard to explain and comprehend.


We have a couple of dozens of other things to worry about. There are 
deadlines and many other more important and more urgent things to work 
on. The primitive is not important and not essential. It's a primitive, 
and like all primitives, it can be used to create better abstractions. 
UDP is a primitive protocol with few guarantees, but TCP is implemented 
on top of it to provide the guarantees. Simple, efficient primitives are 
a defensible choice.


Andrei

Nov 25 2009

Steven Schveighoffer <schveiguy yahoo.com> writes:

Bartosz Milewski Wrote:

 Steven Schveighoffer Wrote:
 
 As long as the spec says changing length may expand the array to hold enough
space, the optimizer can't, because the optimization would change the side
effects of the function.  An optimizer should not change the outcome or side
effects of a function.  It's not unheard of for an optimizer to eliminate
important operations that it thinks are invalid, but in that case, it's a
broken optimizer, I don't see why we would add this case.


 This is all true for user-defined data types, but array is totally under
compiler's control. What optimizations are acceptable depends only on the
interpretation of the language spec. In particular the sequence
 
 a.length -= 1;
 a.length += 1;
 
 as a whole does not result in stomping, so why would it be wrong to optimize
it away? I claim it's perfectly legitimate. 


In fact, adding length modifies the contents.

For example:

auto str = "hello";
str.length -= 1;
str.length += 1;
assert(str == "hell\xFF");

If the length modifications are optimized out, the assert fails.  Looks like an
invalid optimization to me.

-Steve

Nov 25 2009

Walter Bright <newshound1 digitalmars.com> writes:

Steven Schveighoffer wrote:
 As long as the spec says changing length may expand the array to hold
 enough space, the optimizer can't, because the optimization would
 change the side effects of the function.  An optimizer should not
 change the outcome or side effects of a function.  It's not unheard
 of for an optimizer to eliminate important operations that it thinks
 are invalid, but in that case, it's a broken optimizer, I don't see
 why we would add this case.


The optimizer is free to change around implementation-defined-behavior 
and undefined-behavior. For defined-behavior, it can change things 
around as long as the observer cannot observe a change.

 This point brought up is a non-issue, an optimizer that changes the
 outcome/side effects of a function is a broken optimizer, end of
 story.


It can within the constraints of defined-behavior.

Nov 25 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Walter Bright Wrote:

 Steven Schveighoffer wrote:
 As long as the spec says changing length may expand the array to hold
 enough space, the optimizer can't, because the optimization would
 change the side effects of the function.  An optimizer should not
 change the outcome or side effects of a function.  It's not unheard
 of for an optimizer to eliminate important operations that it thinks
 are invalid, but in that case, it's a broken optimizer, I don't see
 why we would add this case.


 The optimizer is free to change around implementation-defined-behavior 
 and undefined-behavior. For defined-behavior, it can change things 
 around as long as the observer cannot observe a change.
 
 This point brought up is a non-issue, an optimizer that changes the
 outcome/side effects of a function is a broken optimizer, end of
 story.


 It can within the constraints of defined-behavior.


So the real question is, is extension defined in terms of re-allocation or in
terms of stomping. In the former case, the compiler should not optimize,
because re-allocation is the expected side effect. In the latter case, it can
eliminate the code, because what counts is the absence of stomping.

Nov 26 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Steven Schveighoffer Wrote:

 Bartosz Milewski Wrote:
 
 Steven Schveighoffer Wrote:
 
 
 Bottom line: if a function isn't supposed to change the buffer, the  
 signature should be const for that parameter.  It's one of the principles  
 of const, and why it's in D2 in the first place.  I'd explain to the coder  
 that he is wrong to expect that modifying a buffer in a function that  
 isn't supposed to modify a buffer is acceptable (and hopefully he gets it,  
 or else I don't have time to deal with people who insist on being right  
 when they are not).
 
 BTW, in my experience, the newbie expectaction of ~= is usually that it  
 modifies the original even when it doesn't, not the other way around.
 


 The guy insists that the reallocation happens in the quote function (otherwise
there would be stomping), so no sharing happens and the original is not
modified. He tested it on a variety of inputs! I'm not totally making it up--I
had this kind of arguments with C++ programmers. You tell them that it's safer
to use const and they laugh at you. "My code works, why should I change it!"


 C++ const is kind of a joke :)
 
 But in any case, it's trivial to come up with a case that causes buffer
issues.  If he insists on a test case, then I'd give him one.
 
 Programmers like this don't last long at companies.  I had a coworker once who
insisted that a bug he was working on was going to be impossible to fix, so
there was no point in working on it.  My boss said that if he (the boss) could
fix it, the coworker would be let go.  My boss fixed it in one day.
 
 And there are *tons* of logic errors that don't cause bugs for most inputs, I
don't see why we are focusing on this one case.
 


The problem here is that the guy is sort of right. Indeed quote() expands the
slice before overwriting it and that requires re-allocation to avoid stomping
over the original buffer. There is however a boundary case, when the whole
buffer matches the pattern. Then the expansion is done in place and the buffer
is modified. This would never happen if expansion were guaranteed to
re-allocate. 

What this example was supposed to illustrate is that it's easy to forget to
explicitly duplicate an array, especially if it's not clear who's
responsible--the caller or the callee. 

This is somewhat reminiscent of the situation in C++ and the responsibility to
delete an object. Interestingly, in C++ programs you can use unique_ptr's to
have deterministic deletion. In D you could also use uniqueness to have
deterministic reallocation (or rather avoid it when it's not observable).

Nov 25 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bartosz Milewski wrote:
 Steven Schveighoffer Wrote:
 
 Bartosz Milewski Wrote:

 Steven Schveighoffer Wrote:


 Bottom line: if a function isn't supposed to change the buffer, the  
 signature should be const for that parameter.  It's one of the principles  
 of const, and why it's in D2 in the first place.  I'd explain to the coder  
 that he is wrong to expect that modifying a buffer in a function that  
 isn't supposed to modify a buffer is acceptable (and hopefully he gets it,  
 or else I don't have time to deal with people who insist on being right  
 when they are not).

 BTW, in my experience, the newbie expectaction of ~= is usually that it  
 modifies the original even when it doesn't, not the other way around.





 But in any case, it's trivial to come up with a case that causes buffer
issues.  If he insists on a test case, then I'd give him one.

 Programmers like this don't last long at companies.  I had a coworker once who
insisted that a bug he was working on was going to be impossible to fix, so
there was no point in working on it.  My boss said that if he (the boss) could
fix it, the coworker would be let go.  My boss fixed it in one day.

 And there are *tons* of logic errors that don't cause bugs for most inputs, I
don't see why we are focusing on this one case.


 The problem here is that the guy is sort of right. Indeed quote() expands the
slice before overwriting it and that requires re-allocation to avoid stomping
over the original buffer. There is however a boundary case, when the whole
buffer matches the pattern. Then the expansion is done in place and the buffer
is modified. This would never happen if expansion were guaranteed to
re-allocate. 
 
 What this example was supposed to illustrate is that it's easy to forget to
explicitly duplicate an array, especially if it's not clear who's
responsible--the caller or the callee. 
 
 This is somewhat reminiscent of the situation in C++ and the responsibility to
delete an object. Interestingly, in C++ programs you can use unique_ptr's to
have deterministic deletion. In D you could also use uniqueness to have
deterministic reallocation (or rather avoid it when it's not observable). 


I agree that the underlying array sharing can be often a hassle. This is 
a problem that goes beyond just expansion - it's all the implied 
aliasing that goes on whenever you pass arrays around.

How about creating a struct Value!T that transforms T (be it an array or 
a class) into a value type? Then if you use Value!(int[]), you're 
effectively dealing with values throughout (even though internally they 
might be COWed). Sometimes I also see a need for DeepValue!T which would 
e.g. duplicate transitively arrays of arrays and full object trees. For 
the latter we need some more introspection though. But we have 
everything in the laguage to make Value and DeepValue work with arrays.

What do you think?


Andrei

Nov 25 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Andrei Alexandrescu Wrote:

 How about creating a struct Value!T that transforms T (be it an array or 
 a class) into a value type? Then if you use Value!(int[]), you're 
 effectively dealing with values throughout (even though internally they 
 might be COWed). Sometimes I also see a need for DeepValue!T which would 
 e.g. duplicate transitively arrays of arrays and full object trees. For 
 the latter we need some more introspection though. But we have 
 everything in the laguage to make Value and DeepValue work with arrays.
 
 What do you think?


I'm afraid this would further muddle the message: "If you want safe arrays, use
the Value device; if you want to live dangerously, use the built in type." I'd
rather see the reverse: D arrays are safe to use. They have the usual reference
semantics of static arrays. But if you expand them, the sharing goes away and
you get a unique reference to a copy. This is the "no gotcha" semantics,
totally predictable, easy to reason about. 

How the compiler supports that semantics while performing clever optimizations
is another story. It's fine if this part is hard. The language can even impose
complexity requirements, if you are sure that they are possible to implement
(it's enough to have one compliant implementation to prove this point). 

By the way, what are the library algorithms that rely on O(1) behavior of array
append?

Nov 25 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bartosz Milewski wrote:
 Andrei Alexandrescu Wrote:
 
 How about creating a struct Value!T that transforms T (be it an
 array or a class) into a value type? Then if you use Value!(int[]),
 you're effectively dealing with values throughout (even though
 internally they might be COWed). Sometimes I also see a need for
 DeepValue!T which would e.g. duplicate transitively arrays of
 arrays and full object trees. For the latter we need some more
 introspection though. But we have everything in the laguage to make
 Value and DeepValue work with arrays.
 
 What do you think?


 I'm afraid this would further muddle the message: "If you want safe
 arrays, use the Value device; if you want to live dangerously, use
 the built in type."


I think the message is "If you want values, use Value. If you want 
slices, use slices." To me that's rather a clear message.

 I'd rather see the reverse: D arrays are safe to
 use.


But that's backwards. You can do Value with slices. You can't do slices 
with values. The logical primitive to pick is the slice.

 They have the usual reference semantics of static arrays. But if
 you expand them, the sharing goes away and you get a unique reference
 to a copy. This is the "no gotcha" semantics, totally predictable,
 easy to reason about.
 
 How the compiler supports that semantics while performing clever
 optimizations is another story. It's fine if this part is hard. The
 language can even impose complexity requirements, if you are sure
 that they are possible to implement (it's enough to have one
 compliant implementation to prove this point).


Well the problem is we don't have that. It's more difficult to "be fine" 
if the onus of implementation is on you.

 By the way, what are the library algorithms that rely on O(1)
 behavior of array append?


I don't know, but there are plenty of algorithms out there that rely on 
that.


Andrei

Nov 25 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Andrei Alexandrescu Wrote:

 Bartosz Milewski wrote:
 Andrei Alexandrescu Wrote:
 
 How about creating a struct Value!T that transforms T (be it an
 array or a class) into a value type? Then if you use Value!(int[]),
 you're effectively dealing with values throughout (even though
 internally they might be COWed). Sometimes I also see a need for
 DeepValue!T which would e.g. duplicate transitively arrays of
 arrays and full object trees. For the latter we need some more
 introspection though. But we have everything in the laguage to make
 Value and DeepValue work with arrays.
 
 What do you think?


 I'm afraid this would further muddle the message: "If you want safe
 arrays, use the Value device; if you want to live dangerously, use
 the built in type."


 I think the message is "If you want values, use Value. If you want 
 slices, use slices." To me that's rather a clear message.
 
 I'd rather see the reverse: D arrays are safe to
 use.


 But that's backwards. You can do Value with slices. You can't do slices 
 with values. The logical primitive to pick is the slice.
 


I know you don't like analogies, but for me it sounds like an advice to a  C++
programmer: if you want to control destruction, use unique_ptr (or shared_ptr).
This, BTW, is how I program in C++ because I can't live with memory leaks and
double-deletions. 

What would your advice be to somebody who is learning D as her first language.
Don't use arrays directly, use Value!(T[]) ? 

Should libraries be written in terms of Value!(T[])? 

 They have the usual reference semantics of static arrays. But if
 you expand them, the sharing goes away and you get a unique reference
 to a copy. This is the "no gotcha" semantics, totally predictable,
 easy to reason about.
 
 How the compiler supports that semantics while performing clever
 optimizations is another story. It's fine if this part is hard. The
 language can even impose complexity requirements, if you are sure
 that they are possible to implement (it's enough to have one
 compliant implementation to prove this point).


 Well the problem is we don't have that. It's more difficult to "be fine" 
 if the onus of implementation is on you.
 
 By the way, what are the library algorithms that rely on O(1)
 behavior of array append?


 I don't know, but there are plenty of algorithms out there that rely on 
 that.


I thought you had concrete examples.

Nov 26 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bartosz Milewski wrote:
 Andrei Alexandrescu Wrote:
 
 Bartosz Milewski wrote:
 Andrei Alexandrescu Wrote:

 How about creating a struct Value!T that transforms T (be it an
 array or a class) into a value type? Then if you use Value!(int[]),
 you're effectively dealing with values throughout (even though
 internally they might be COWed). Sometimes I also see a need for
 DeepValue!T which would e.g. duplicate transitively arrays of
 arrays and full object trees. For the latter we need some more
 introspection though. But we have everything in the laguage to make
 Value and DeepValue work with arrays.

 What do you think?


 arrays, use the Value device; if you want to live dangerously, use
 the built in type."


 slices, use slices." To me that's rather a clear message.

 I'd rather see the reverse: D arrays are safe to
 use.


 with values. The logical primitive to pick is the slice.


 I know you don't like analogies, but for me it sounds like an advice to a  C++
programmer: if you want to control destruction, use unique_ptr (or shared_ptr).
This, BTW, is how I program in C++ because I can't live with memory leaks and
double-deletions. 
 
 What would your advice be to somebody who is learning D as her first language.
Don't use arrays directly, use Value!(T[]) ? 
 
 Should libraries be written in terms of Value!(T[])? 


Some would use that, some would use T[]. For example std.algorithm could 
derive only little, yet nonzero, use from Value!(T[]).

 They have the usual reference semantics of static arrays. But if
 you expand them, the sharing goes away and you get a unique reference
 to a copy. This is the "no gotcha" semantics, totally predictable,
 easy to reason about.

 How the compiler supports that semantics while performing clever
 optimizations is another story. It's fine if this part is hard. The
 language can even impose complexity requirements, if you are sure
 that they are possible to implement (it's enough to have one
 compliant implementation to prove this point).


 if the onus of implementation is on you.

 By the way, what are the library algorithms that rely on O(1)
 behavior of array append?


 that.


 I thought you had concrete examples.


I do (one famous example is in Windows 3), but I wouldn't think there's 
a need to argue that point. If we're facing the prospect of an append 
that always reallocates, we better don't provide it at all. People here 
have been complaining about ~= being slow because it acquires a lock, 
even though its complexity is amortized O(1).


Andrei

Nov 26 2009

Steven Schveighoffer <schveiguy yahoo.com> writes:

Bartosz Milewski Wrote:

 The problem here is that the guy is sort of right. Indeed quote() expands the
slice before overwriting it and that requires re-allocation to avoid stomping
over the original buffer. There is however a boundary case, when the whole
buffer matches the pattern. Then the expansion is done in place and the buffer
is modified. This would never happen if expansion were guaranteed to
re-allocate. 


Actually, the requirement for not reallocating is that the slice ends at the
end of an allocated block, so for example if the pattern matched at the end of
the buffer, you may have problems.

The larger problem in the code is that the developer expects reallocation when
it is not guaranteed.  That is, he is *depending* on reallocation.  In that
case, you should explicitly reallocate.  This doesn't necessarily mean using
.dup, the following works just fine:

return '"' ~ word ~ '"';

And it actually performs much better than the example.  In all honesty, I was
sort of trying to play your game for this example.  In real life, I would have
mentioned that the above line replaces the quote function, and most likely the
coder would have accepted it not because it deterministically allocates but
because it's cleaner and shorter :)

 
 What this example was supposed to illustrate is that it's easy to forget to
explicitly duplicate an array, especially if it's not clear who's
responsible--the caller or the callee. 


I don't think the example illustrates that.  It illustrates that there are
convoluted ways to try and explicitly use appending to duplicate arrays, and
why one shouldn't do that.

-Steve

Nov 25 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Steve, I don't know about you, but this exchange clarified some things for me.
The major one is that it's dangerous to define the semantics of a language
construct in terms of implementation. You were defending some points using
implementation arguments rather than sticking to defined semantics. We have
found out that one should never rely on the array being re-allocated on
expansion, even if it seems like there's no other way. The only correct
statement is that the freshly expanded part of the array is guaranteed not to
be write-shared with any other array. 

However, this discussion veered away from a more important point. I don't
believe that programmers will consciously make assumptions about re-allocation
breaking sharing. 

The danger is that it's easy to miss accidental sharing and it's very hard to
test for it. 

That was my original point and I haven't been convinced otherwise.


 Bartosz Milewski Wrote:
 
 The problem here is that the guy is sort of right. Indeed quote() expands the
slice before overwriting it and that requires re-allocation to avoid stomping
over the original buffer. There is however a boundary case, when the whole
buffer matches the pattern. Then the expansion is done in place and the buffer
is modified. This would never happen if expansion were guaranteed to
re-allocate. 


 Actually, the requirement for not reallocating is that the slice ends at the
end of an allocated block, so for example if the pattern matched at the end of
the buffer, you may have problems.
 
 The larger problem in the code is that the developer expects reallocation when
it is not guaranteed.  That is, he is *depending* on reallocation.  In that
case, you should explicitly reallocate.  This doesn't necessarily mean using
.dup, the following works just fine:
 
 return '"' ~ word ~ '"';
 
 And it actually performs much better than the example.  In all honesty, I was
sort of trying to play your game for this example.  In real life, I would have
mentioned that the above line replaces the quote function, and most likely the
coder would have accepted it not because it deterministically allocates but
because it's cleaner and shorter :)
 
 
 What this example was supposed to illustrate is that it's easy to forget to
explicitly duplicate an array, especially if it's not clear who's
responsible--the caller or the callee. 


 I don't think the example illustrates that.  It illustrates that there are
convoluted ways to try and explicitly use appending to duplicate arrays, and
why one shouldn't do that.
 
 -Steve

Nov 26 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Steven Schveighoffer Wrote:

 The danger is that it's easy to miss accidental sharing and it's very  
 hard to test for it.


 I think this danger is rare, and it's easy to search for (just search for  
 ~= in your code, I did it with Tango).  I think it can be very well  
 defined in a tutorial or book chapter.


This might just be it. Experienced D programmers will avoid using ~= . It's the
newbies who will be stumbling into these problems.

Dec 01 2009

Bill Baxter <wbaxter gmail.com> writes:

On Tue, Nov 24, 2009 at 8:26 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Steven Schveighoffer wrote:
 On Tue, 24 Nov 2009 11:01:10 -0500, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 [snip]
 Andrei has mentioned that he thinks we can store the allocated length =








 the GC block, which I think would also work. =A0You also wouldn't need=








 cache in that case, but he says it's in *addition* to the MRU cache, s=








 not sure what he means.


 [snip]

 Reaching the GC block is relatively expensive, so the MRU still helps. =






 essence it's like this. When appending:

 a) look up the cache, if there, you have O(1) amortized append that's
 really fast

 b) if not in the cache, talk to the GC, still O(1) amortized append
 that's not as fast

 Both help providing an important performance guarantee. I was a bit
 worried about guaranteeing "O(1) amortized append for up to 8 arrays at=






 time."


 Have you considered the performance impact on allocating non-array types=




 =A0That is, are you intending on all allocations storing the allocated l=




 even class or struct allocations who will likely never append? =A0Or wil=




 there be a "non-appendable" flag?

 Also, the part without the MRU cache was my original proposal from last
 year, I had some ideas on how length could be stored. =A0For example, in=




 page of up to 128 byte blocks, you only need 8 bits to store the length
 (alas, you cannot store with 4 bits for 16-byte blocks because you need =




 cover both 0 and 16). =A0This reduces the overhead for those blocks. =A0=




 byte to 1-page blocks, 16 bits is acceptable multi-page blocks, the cost=




 storing a 32-bit integer is negligible.


 Having access to the requested length is important at larger lengths, so
 probably we could be fine by not actually storing it for allocations up t=


 128 bytes.

 It is true the lookup of the MRU cache will not involve dissecting the
 address of the block to find it's container block, but you still will ne=




 lock, or are you planning on doing something clever? =A0 I think the loc=




 will be the bottleneck, and if you don't make it the same as the global
 lock, add the cost of both locks when you actually need to append.


 The cache is a thread-local map from pointers to size_t. Using it does no=


 require any locking I think.


Wait, what happens if you're sharing an array between two different threads=
?
The MRU in my thread might be out of date.  So some kind of cache
flushing or snooping seems required.  And that means a lock.  It's the
same cache consistency problem faced by multiple CPU cores sharing one
system memory.

--bb

Nov 24 2009

Leandro Lucarella <llucax gmail.com> writes:

dsimcha, el 20 de noviembre a las 03:35 me escribiste:
 2.  We state in the documentation that ~= on slices is a convenience that isn't
 terribly efficient and recommend that people that care more about performance
than
 simplicity use the library type for their serious array building, especially
for
 large arrays.


I think this is a really bad idea, it's so obvious dynamic arrays are
broken when you have to say in the documentation "you have this operation,
but don't use it because it's broken".

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Si ella es la flor, yo soy la espina
Si ella es quien florece, yo quien se marchita
Y estamos en eclipse total, y estamos en eclipse total
Completamente cruzados, completamente cruzados

Nov 20 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 24 Nov 2009 16:52:52 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 Steven Schveighoffer Wrote:
 Imagine you're reviewing this code written by a relative newbee:

 char[] quote(char[] word) {
   word.length += 2;
   word[length-1] = '"';
   for(int i = word.length-2; i > 0; --i)
     word[i] = word[i - 1];
   word[0] = '"';
   return word;
 }

 void storeFoos(char[] buffer, string pattern, Container c)
 {
 	char[] res = find(buffer, pattern);
 	c.store(quote(res[0..pattern.len]);
 }

 Is this buggy? If so, how and when will the bug manifest itself?


 In my mind it is not buggy.  If quote or storeFoos is not meant to be  
 able
 to molest the data, it should require a const(char)[].  If this was D1
 code, I'd say it depends on the purpose of storeFoos, if storeFoos is
 expecting to own the input buffer, it's not buggy.  If it's not, then
 storeFoos is buggy (in all cases).  But we aren't talking about D1.


 surrounding code expects the buffer not to be modified. The newbee  
 didn't want to idup the buffer because in most buffers the pattern is  
 not found.


Then he forgot the const decoration for the parameter.  No duping is  
necessary:

void storeFoos(const(char)[] buffer, string pattern, Container c)
{
    ... // will result in compiler errors for given quote function, fix  
that.
}

 Or another story. An enthusiastic programmer comes to you with a very
 clever rewrite of a function. This is the original:

 T[] duplicate(T)(T[] src) {
    static if (is(T == invariant))
       return src.idup;
    else
       return src.dup;
 }

 This is his improved version:

 T[] duplicate(T)(T[] src) {
    T tmp = src[$-1];
    src.length -= 1;
    src ~= tmp;
    return src;
 }

 No need for static if! The extension is bound to re-allocate because  


 stomping rules. Will this always work, or is there a gotcha?


 He forgot to check for length 0.


 That he did. So now he goes back to the drawing board and adds an early  
 return for length 0. Is his code correct now?


Yes.

I have a feeling you have some trick up your sleeve that you think makes  
it incorrect :)

-Steve

Nov 24 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Andrei Alexandrescu Wrote:


 One thing that Bartosz didn't mention is that "this unique" is different 
 than "the other unique", which I think reveals the half-bakedness of the 
 proposal. "The other unique" that was intensively discussed before is 
 transitive, meaning that the root reference points to a graph of objects 
 having no other connection to the outer world. "This unique" is very 
 different because it's shallow - only the array chunk is unaliased, but 
 the members of the array don't need to be. This is a very important 
 distinction.


Actually I was thinking of deep unique, but understandably I neither described
full details of my proposal, nor have I completely "baked" it, which would
require some serious work and a lot more discussion. One of these days I will
blog about uniqueness in more detail.

The problem of transitivity with respect to arrays has been discussed before in
the context of const and immutable. We realized that shallow
constness/immutability would be useful in some scenarios but, because of the
added complexity, we punted it. My feeling is that, similarly, in most cases
the restriction that a unique array may only hold unique objects is not
unreasonable.

Nov 18 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Steven Schveighoffer Wrote:

 I wonder what your expert opinion is:  Is something like unique possible  
 without introducing all the other type modifiers (like 'lent', etc.)?  If  
 not, what are the minimum type modifiers you'd need to get unique to  
 work?  I really like the unique concept, and think it is really one of the  
 biggest holes missing from the const system (aside from bug 1961).


The most complete description of "unique" that I've seen is in Haller/Odersky
Scala paper "Capabilities for External Uniqueness." It's not an easy paper to
read, but I'm planning on writing a blog post in which I'll try to make it more
palatable. They use annotations to extend the type system, something that was
pioneered in Java. The recent introduction of annotations (like  property) in D
makes this approach quite attractive. They used the following annotations:
 unique,  transient, and  exposed ( transient loosely corresponds to "lent").

Don't get me wrong, the full specification of uniqueness is non-trivial and I
wouldn't propose it for D just to solve the array expansion problem. The real
power of "unique" shows in multithreaded programming where you can pass unique
objects between threads and safely operate on them without any synchronization.

Nov 19 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?


Can this question be answered without the recourse to implementation, and the
MRU cache in particular? 

I thought about an abstract definition of "stomping".Something like that: The
expanded part of the array is never implicitly shared with any other array. But
I'm not sure about this case:

int[] a = [1, 2, 3];
int[] b = a[0..1];
b ~= 4;
what is a[1]?

By my definition, this would be considered stomping, but I couldn't find this
example in TDPL. 

I'm also not sure what to make of this statement: "client code is guaranteed to
benefit of good average performance over a large number of appends to the same
array". Since "good" is open to interpretation, such a guarantee is
meaningless. If it is meant as "amortized O(1)", it's impossible to fulfill,
unless the MRU cache is infinite and a lookup is O(1).

Then there is the problem of multithreading. A global MRU cache would require
potentially expensive synchronization. A thread-local cache, on the other hand,
makes thread creation even more heavy-weight than it is right now. My
prediction is that D will be evolving towards user-level threads and fine
granularity concurrency, which requires very small thread footprint.

Nov 21 2009

Ali Cehreli <acehreli yahoo.com> writes:

Bartosz Milewski Wrote:

 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?


 Can this question be answered without the recourse to implementation, and the
MRU cache in particular? 
 
 I thought about an abstract definition of "stomping".Something like that: The
expanded part of the array is never implicitly shared with any other array. But
I'm not sure about this case:
 


It is natural that a language like D has dynamic arrays (as core features or as
library implentations). I think that's why we keep using the word "array" to
describe slices. (No, I don't have proof for this theory.)

In *current* D, the user code does not handle dynamic arrays. Both 'a' and 'b'
below are slices.

 int[] a = [1, 2, 3];
 int[] b = a[0..1];
 b ~= 4;
 what is a[1]?
 
 By my definition, this would be considered stomping, but I couldn't find this
example in TDPL. 


Neither 'a' nor 'b' own the elements; in that sense, if one is a slice then the
other is a slice... In D, slices have "discretionary sharing semantics." Any
slice may leave the sharing contract as it sees unfit.

Ali

Nov 21 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Ali Cehreli Wrote:

 Neither 'a' nor 'b' own the elements; in that sense, if one is a slice then
the other is a slice... In D, slices have "discretionary sharing semantics."
Any slice may leave the sharing contract as it sees unfit.


So is your answer different from that of dsimcha? I'm sitting on a fence about
it. Andrei, could you settle this issue (and a few others I raised before)?

Nov 21 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Bartosz Milewski (bartosz-nospam relisoft.com)'s article
 Ali Cehreli Wrote:
 Neither 'a' nor 'b' own the elements; in that sense, if one is a slice then




slice may leave the sharing contract as it sees unfit.
 So is your answer different from that of dsimcha? I'm sitting on a fence about



I think the confusion stems from Ali Cehreli describing things as currently
implemented and me describing things as proposed under the MRU cache scheme.

Nov 21 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bartosz Milewski wrote:
 Ali Cehreli Wrote:
 
 Neither 'a' nor 'b' own the elements; in that sense, if one is a
 slice then the other is a slice... In D, slices have "discretionary
 sharing semantics." Any slice may leave the sharing contract as it
 sees unfit.


 So is your answer different from that of dsimcha? I'm sitting on a
 fence about it. Andrei, could you settle this issue (and a few others
 I raised before)?


I'm not sure I figure what the issue is, and I'd appreciate a rehash of 
the few other issues brought up. (I thought they had been responded to.) 
David's  reply is valid but may be a bit misleading because it explains 
things in terms of the cache, which could reinforce the impression that 
the cache is part of the definition.

Ali's point is also correct in the sense that arrays are not owned. 
Garbage collection makes that possible.

It's very simple, really. Appending to an array never results in 
overwriting elements of an existing array. Appending may terminate a 
sharing relationship. This is simple, easy to understand, and requires 
no understanding of the optimization paraphernalia, which I hope to be 
able to improve.


Andrei

Nov 22 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Andrei Alexandrescu Wrote:

 I'm not sure I figure what the issue is, and I'd appreciate a rehash of 
 the few other issues brought up. (I thought they had been responded to.) 


Some of the points are in my message from Sat, 05:19 pm. Especially this point:
Can a conforming implementation simply re-allocate on every expansion? You make
it sound like that would violate some performance guarantees but there are no
specifics.

Nov 22 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bartosz Milewski wrote:
 Andrei Alexandrescu Wrote:
 
 I'm not sure I figure what the issue is, and I'd appreciate a
 rehash of the few other issues brought up. (I thought they had been
 responded to.)


 Some of the points are in my message from Sat, 05:19 pm. Especially
 this point: Can a conforming implementation simply re-allocate on
 every expansion? You make it sound like that would violate some
 performance guarantees but there are no specifics.


I don't think a conforming implementation is allowed to do that. The 
functioning of many algorithms assumes constant-time append. If D can't 
guarantee that, those algorithms won't work.

I've been deliberately vague in the book because I wasn't sure we can 
offer a guarantee if the cache is filled. Now I know we can: if a memory 
block stores the requested size in addition to today's effective 
capacity, then the allocator can overallocate growing arrays and then 
detect and correctly handle in-place reallocation requests, even when 
the MRU cache capacity is exceeded. There would be a larger cost for 
that (acquiring a lock), but it's still a constant amortized per-element 
cost.


Andrei

Nov 23 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Andrei Alexandrescu Wrote:


 Some of the points are in my message from Sat, 05:19 pm. Especially
 this point: Can a conforming implementation simply re-allocate on
 every expansion? You make it sound like that would violate some
 performance guarantees but there are no specifics.


 I don't think a conforming implementation is allowed to do that. The 
 functioning of many algorithms assumes constant-time append. If D can't 
 guarantee that, those algorithms won't work.
 
 I've been deliberately vague in the book because I wasn't sure we can 
 offer a guarantee if the cache is filled. Now I know we can: if a memory 
 block stores the requested size in addition to today's effective 
 capacity, then the allocator can overallocate growing arrays and then 
 detect and correctly handle in-place reallocation requests, even when 
 the MRU cache capacity is exceeded. There would be a larger cost for 
 that (acquiring a lock), but it's still a constant amortized per-element 
 cost.


I wish we had some performance numbers to support this optimization. There's
always a break-even point between O(1) and O(N) and I wonder at what N it
happens. Also, how often does it happen that the cache is full--whether it's
normal state or an exception. Hitting the cache and the heap lock on every
extension will also break locality. There are so many factors...

Nov 23 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bartosz Milewski wrote:
 Andrei Alexandrescu Wrote:
 
 
 Some of the points are in my message from Sat, 05:19 pm. Especially
 this point: Can a conforming implementation simply re-allocate on
 every expansion? You make it sound like that would violate some
 performance guarantees but there are no specifics.


 functioning of many algorithms assumes constant-time append. If D can't 
 guarantee that, those algorithms won't work.

 I've been deliberately vague in the book because I wasn't sure we can 
 offer a guarantee if the cache is filled. Now I know we can: if a memory 
 block stores the requested size in addition to today's effective 
 capacity, then the allocator can overallocate growing arrays and then 
 detect and correctly handle in-place reallocation requests, even when 
 the MRU cache capacity is exceeded. There would be a larger cost for 
 that (acquiring a lock), but it's still a constant amortized per-element 
 cost.


 I wish we had some performance numbers to support this optimization. There's
always a break-even point between O(1) and O(N) and I wonder at what N it
happens. Also, how often does it happen that the cache is full--whether it's
normal state or an exception. Hitting the cache and the heap lock on every
extension will also break locality. There are so many factors...


Very true. It would be great if you or someone else interested could 
find the time to run a few tests. I am committed way over the top and 
simply cannot look into this.

Andrei

Nov 23 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Bartosz Milewski wrote:
 Andrei Alexandrescu Wrote:


 Some of the points are in my message from Sat, 05:19 pm. Especially
 this point: Can a conforming implementation simply re-allocate on
 every expansion? You make it sound like that would violate some
 performance guarantees but there are no specifics.


 functioning of many algorithms assumes constant-time append. If D can't
 guarantee that, those algorithms won't work.

 I've been deliberately vague in the book because I wasn't sure we can
 offer a guarantee if the cache is filled. Now I know we can: if a memory
 block stores the requested size in addition to today's effective
 capacity, then the allocator can overallocate growing arrays and then
 detect and correctly handle in-place reallocation requests, even when
 the MRU cache capacity is exceeded. There would be a larger cost for
 that (acquiring a lock), but it's still a constant amortized per-element
 cost.


 I wish we had some performance numbers to support this optimization. There's




Also, how often does it happen that the cache is full--whether it's normal state
or an exception. Hitting the cache and the heap lock on every extension will
also
break locality. There are so many factors...
 Very true. It would be great if you or someone else interested could
 find the time to run a few tests. I am committed way over the top and
 simply cannot look into this.
 Andrei


But one of my biggest gripes about the current appending implementation, far
more
important from a practical perspective than any theoretical safety guarantees,
is
that it doesn't scale to multiple threads.  I've experienced this in a
real-world
program that I translated from someone else's C++ code that used
vector::push_back() quite liberally inside core algorithms.  (The reason is that
it was written more for simplicity/readability than performance, so the person
who
wrote it never thought to optimize this kind of stuff.)  When I did a literal
translation of this to D, everything was completely serialized.  It didn't even
scale to two cores.  I had to rewrite significant portions of the code in a way
that did not use appending to get it to scale at all.

D is supposed to be a language that is geared toward multi-threading.  If
something as simple as array appending is a threading bottleneck (and it was for
me in a real-world program), I don't care if it is O(1) because effectively it
has
a constant the size of Jupiter.

Nov 23 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bill Baxter wrote:
 On Mon, Nov 23, 2009 at 2:09 PM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 On Mon, 23 Nov 2009 16:33:42 -0500, dsimcha <dsimcha yahoo.com> wrote:

 But one of my biggest gripes about the current appending implementation,
 far more
 important from a practical perspective than any theoretical safety
 guarantees, is
 that it doesn't scale to multiple threads.  I've experienced this in a
 real-world
 program that I translated from someone else's C++ code that used
 vector::push_back() quite liberally inside core algorithms.  (The reason
 is that
 it was written more for simplicity/readability than performance, so the
 person who
 wrote it never thought to optimize this kind of stuff.)  When I did a
 literal
 translation of this to D, everything was completely serialized.  It didn't
 even
 scale to two cores.  I had to rewrite significant portions of the code in
 a way
 that did not use appending to get it to scale at all.

 D is supposed to be a language that is geared toward multi-threading.  If
 something as simple as array appending is a threading bottleneck (and it
 was for
 me in a real-world program), I don't care if it is O(1) because
 effectively it has
 a constant the size of Jupiter.


 you need is a library type that does not require a global lock for
 appending.  Such is not the case for every example of appending in D, and
 the existence of the current D arrays does not prevent you from adding more
 specialized types.  There are other costs to using such types besides the
 global lock, such as storing the capacity, and if you are slicing, locally
 storing the slice extents.


 That's likely to be far too big a limitation for the upcoming world of
 parallelism everywhere.
 But the single-threaded GC is the real culprit.  Fix that and arrays
 shouldn't be a problem either.


I agree. Overall, our eyes should be at this point focused on principle 
matters rather than matters of optimizations that can be done but 
haven't been done yet.

Without being 100% sure, I conjecture that array primitives can be 
implemented efficiently if we give the implementation the freedom to end 
sharing for ~=. Also I am not ruling out the possibility that we can 
guarantee a ~= that never keeps sharing with existing arrays. (One idea 
I discussed with Walter was encoding uniqueness of arrays in a bit 
stored with the array limits.)


Andrei

Nov 23 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

Andrei Alexandrescu Wrote:
 sharing for ~=. Also I am not ruling out the possibility that we can 
 guarantee a ~= that never keeps sharing with existing arrays. (One idea 
 I discussed with Walter was encoding uniqueness of arrays in a bit 
 stored with the array limits.)


I've been thinking the same. Full uniqueness might be hard to implement, but if
you are only interested in "opportunistic uniqueness" as an optimization tool,
that might be easier. In full uniqueness you worry about lending and deep
aliases. Here you might just conservatively decide that passing an array to a
function, or storing it in an object, destroys its uniqueness (on top of
explicit slicing). And you don't have to care about deep aliases as long as
your only interest is in the uniqueness of the array buffer and not the objects
it might be storing. Uniqueness one way or another is the key.

Nov 24 2009

Michel Fortin <michel.fortin michelf.com> writes:

On 2009-11-22 19:58:37 -0500, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 It's very simple, really. Appending to an array never results in 
 overwriting elements of an existing array. Appending may terminate a 
 sharing relationship. This is simple, easy to understand, and requires 
 no understanding of the optimization paraphernalia, which I hope to be 
 able to improve.


I think it's the "may terminate" part Bartosz (and some others) don't 
like, since it's hard to predict and may cause bugs that appear in an 
hard to understand pattern when misused.

I'm not exactly sure where to stand on this issue, but I might have a 
solution that looks like a compromise: use the MRU only for arrays of 
immutable elements, like strings. Since it's not possible to modify 
immutable data, it doesn't change the semantics if two slices are 
aliasing each other or not[^1]. Other arrays could get the more 
predictable behavior of always reallocating, but an optimizer could 
still avoid that when static analysis can prove that only one slice 
points to the data.

[^1]: It changes things somewhat if you take the address of elements 
and start comparing them, but I find it reasonable that taking address 
of things could reveal implementation details. It could change things 
if the element type has a postblit function, or a destructor, since you 
will avoid some copying, but I think that's fine.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Nov 22 2009

Bill Baxter <wbaxter gmail.com> writes:

On Mon, Nov 23, 2009 at 2:09 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 On Mon, 23 Nov 2009 16:33:42 -0500, dsimcha <dsimcha yahoo.com> wrote:

 But one of my biggest gripes about the current appending implementation,
 far more
 important from a practical perspective than any theoretical safety
 guarantees, is
 that it doesn't scale to multiple threads. =A0I've experienced this in a
 real-world
 program that I translated from someone else's C++ code that used
 vector::push_back() quite liberally inside core algorithms. =A0(The reas=




 is that
 it was written more for simplicity/readability than performance, so the
 person who
 wrote it never thought to optimize this kind of stuff.) =A0When I did a
 literal
 translation of this to D, everything was completely serialized. =A0It di=




 even
 scale to two cores. =A0I had to rewrite significant portions of the code=




 a way
 that did not use appending to get it to scale at all.

 D is supposed to be a language that is geared toward multi-threading. =




 something as simple as array appending is a threading bottleneck (and it
 was for
 me in a real-world program), I don't care if it is O(1) because
 effectively it has
 a constant the size of Jupiter.


 It's hard to say that your example is a good fit for the arrays of D. =A0=


 you need is a library type that does not require a global lock for
 appending. =A0Such is not the case for every example of appending in D, a=


 the existence of the current D arrays does not prevent you from adding mo=


 specialized types. =A0There are other costs to using such types besides t=


 global lock, such as storing the capacity, and if you are slicing, locall=


 storing the slice extents.


That's likely to be far too big a limitation for the upcoming world of
parallelism everywhere.
But the single-threaded GC is the real culprit.  Fix that and arrays
shouldn't be a problem either.

--bb

Nov 23 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Bartosz Milewski (bartosz-nospam relisoft.com)'s article
 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?




 I thought about an abstract definition of "stomping".Something like that: The


I'm not sure about this case:
 int[] a = [1, 2, 3];
 int[] b = a[0..1];
 b ~= 4;
 what is a[1]?
 By my definition, this would be considered stomping, but I couldn't find this



a[1] == 2.  The MRU cache stores both the pointer and the length.  When you
attempt to append to b, the MRU is searched.  No array that starts at b.ptr and
has length b.length is found.  b is reallocated.  This is easy to specify:  "The
efficiency of the ~= operator for dynamic arrays is implementation defined but
guaranteed to be no worse than O(N).  An implementation may optimize this
operation to avoid a reallocation on every append.  The current reference
implementation does so.  However, the observable behavior of performing the
operation a ~= b is guaranteed to be equivalent to the observable behavior of
performing the operation a = a ~ b."

Furthermore, the purpose of language design is to create a usable language first
and foremost, and then figure out how to specify it formally and abstractly, not
to create a perfectly bulletproof specification first and foremost and then hope
that results in a usable language.  Again, not saying I particularly like the
MRU
cache idea (it has its issues at an implementation level) but I think that
"perfectly specified" has to take a back seat to "works well in practice".

Nov 21 2009

Bartosz Milewski <bartosz-nospam relisoft.com> writes:

dsimcha Wrote:

 Furthermore, the purpose of language design is to create a usable language
first
 and foremost, and then figure out how to specify it formally and abstractly,
not
 to create a perfectly bulletproof specification first and foremost and then
hope
 that results in a usable language.  Again, not saying I particularly like the
MRU
 cache idea (it has its issues at an implementation level) but I think that
 "perfectly specified" has to take a back seat to "works well in practice".


This might be true at the experimental phase of language design. But I expect
more from TDPL, if it's going to be the official definition of the language for
the foreseeable future. I'm not expecting it to be as scrupulous as the C++
Standard, but a bit more solid than, "If in doubt, look how it's implemented in
DMD--MRU cache and all."

Nov 21 2009

Ali Cehreli <acehreli yahoo.com> writes:

dsimcha Wrote:

 == Quote from Bartosz Milewski (bartosz-nospam relisoft.com)'s article
 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?




 I thought about an abstract definition of "stomping".Something like that: The


 I'm not sure about this case:
 int[] a = [1, 2, 3];
 int[] b = a[0..1];
 b ~= 4;
 what is a[1]?
 By my definition, this would be considered stomping, but I couldn't find this


 
 a[1] == 2.  The MRU cache stores both the pointer and the length.  When you
 attempt to append to b, the MRU is searched.  No array that starts at b.ptr and
 has length b.length is found.  b is reallocated.


Consider there is also c:

  int[] c = b[0..1];

According to the above definition, after b~=4 'b' would be relocated and b[0]
would be disjoint from c[0].

Now consider that a's definition is changed to just [ 1 ].

If I understand this correctly, then b.ptr and b.length would match the
original array, and in this case b ~= 4 operation might not relocate b, and
b[0] and c[0] would refer to the same element.

A change in 'a' would be defining the relation between b and c.

Ali

Nov 21 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Ali Cehreli (acehreli yahoo.com)'s article
 dsimcha Wrote:
 == Quote from Bartosz Milewski (bartosz-nospam relisoft.com)'s article
 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?




 I thought about an abstract definition of "stomping".Something like that: The


 I'm not sure about this case:
 int[] a = [1, 2, 3];
 int[] b = a[0..1];
 b ~= 4;
 what is a[1]?
 By my definition, this would be considered stomping, but I couldn't find this



 a[1] == 2.  The MRU cache stores both the pointer and the length.  When you
 attempt to append to b, the MRU is searched.  No array that starts at b.ptr and
 has length b.length is found.  b is reallocated.


   int[] c = b[0..1];
 According to the above definition, after b~=4 'b' would be relocated and b[0]


 Now consider that a's definition is changed to just [ 1 ].
 If I understand this correctly, then b.ptr and b.length would match the
original


would refer to the same element.
 A change in 'a' would be defining the relation between b and c.
 Ali


You're right.  Indexing, unlike appending provides no guarantees about stomping,
except that, if none of the arrays involved are resized, then they will point to
the same memory block.  I don't see this as a problem in practice, because it
does
not break immutability guarantees, etc.  I've been using D for a year and a half
and never had a bug caused by that.  All you need to do is follow these rules:

1.  If you want a slice to point to the same memory as the original array, make
sure none of the arrays are resized.

2.  If you don't want a slice to point to the same memory as the original array,
use .dup.

3.  If you don't care either way, then do neither.

Nov 22 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 23 Nov 2009 18:34:48 -0500, Leandro Lucarella <llucax gmail.com>  
wrote:

 Steven Schveighoffer, el 23 de noviembre a las 15:18 me escribiste:
 On Mon, 23 Nov 2009 11:10:48 -0500, Leandro Lucarella <llucax gmail.com>
 wrote:
The thing is, with realloc() is less likely that you forget that the  


can be copied because it returns the new pointer (that can be the same  


the original pointer). And even in this case, I saw a lot of bugs  


to realloc() misuse (and I made a couple myself).

With slices is much worse.


 realloc is worse.  If I have multiple aliases to the same data, then
 I realloc one of those, the others all can become dangling pointers
 if the runtime decides to move the data.


 Well, you are comparing GC vs no-GC, not realloc() vs. slices. I have no
 intention on following that discussion, I'm just saying that realloc() is
 less error-prone than slices (and realloc() is error-prone already).


I was originally saying that realloc is similar to appending because it  
may or may not choose to move the data.
The only difference from realloc in slices is that your other aliases to  
the old data don't become invalidated, so there are no dangling pointers.   
The GC is partly the cause for that, but also the fact that slices don't  
deallocate the original data (that would be possible, even in a GC'd  
environment).

Your assertion that realloc is less error prone doesn't make any sense to  
me.  realloc is as error prone as appending slices *and* it can also leave  
dangling pointers.

 You also cannot realloc data that's not malloc'd but you can append to
 a slice of non-heap data without issue.


 How is that? AFAIK slices uses gc_realloc() to do the actual realloc, if
 that's done in a piece of malloc'ed data it will fail.


I believe step 1 of the append process is to find the GC heap block that  
contains the slice.  If it doesn't, it simply copies the data to a new  
block.  If the original data is stack-allocated or allocated outside the  
GC, no errors occur AFAIK.

 And even if it were
 true, I don't really see this as a big source of bugs, I really never had
 a bug because I tried to realloc() a non-heap piece of memory or  
 appending
 to a slice of non-GC-heap memory either.


You probably didn't have such bugs because you probably didn't think of  
doing this in C:

void foo(char *arg)
{
    realloc(arg, 10000);
}

Oops, what if arg isn't malloc'd?  But in D this is safe code, and works  
as you expect, no matter what size arg was to begin with.  In fact, tango  
uses this all the time for buffer optimization.  Basically, you pass in a  
stack buffer, if it's big enough, you save the penalty of heap  
allocation.  If it's not, no big deal, just resize the buffer and the GC  
takes care of the rest.

void foo(char[] arg)
{
    arg.length = 10000;
}

 No matter what you do with slice appending in D, you cannot access
 dangling pointers unless you access the slice's ptr field.


 Again, that's only because D is GCed, not because slices are great.


We are comparing realloc to slices.  You assert slices are worse and I  
gave you a reason why they are not.  You can't just ignore that because  
comparing slices to realloc requires taking the GC into account.

 Yes, you can run into trouble if you append to a slice and then
 change the original data in the slice, but that is a rare event, and
 it won't result in corrupting memory you didn't already have access
 to (at least, with the MRU cache).


 I'm a little lost now. I don't know of what hypothetical D are you  
 talking
 about.


Here's the deal.  You have two very common operations on an array:  
appending and modification.  In practice you either append data or you  
modify data.  Rarely do you append data *and* modify the original data,  
and the only case I've seen for that is for the buffer optimization trick  
above.  This is the only case where slices might get you  
"non-deterministic" behavior (and that's only if you still want to use the  
original array before appending).  I have no proof of this except for my  
experience reading tango code.

The two major use cases for appending are, building an array of items  
using append, and using an array as a buffer.  In the first case, you  
start out with an empty or fully owned array, so no harm.  In the second  
case, there are actually two types of functions that accept buffers, ones  
that return a slice of the buffer, and ones which don't.  The ones that  
return a slice, you use the slice, not the original buffer.  The ones  
which don't return a slice, you use the original buffer if you expect  
important data to be there.

I don't think there's ever a case where you see a function that takes an  
array append to the array, then modify the original part of the array  
*without* returning the result.

 I can't see how the MRU cache can provide any safety. The cache is
 finite, and not all the slices will fit in it, so for those slices that
 are not cached, I don't see how the cache can provide any safety.


In those cases, the slice is reallocated because the runtime isn't sure  
whether it will result in stomping.  The MRU cache is an optimization  
which defaults to reallocation for safety.

Andrei has mentioned that he thinks we can store the allocated length in  
the GC block, which I think would also work.  You also wouldn't need an  
MRU cache in that case, but he says it's in *addition* to the MRU cache,  
so I'm not sure what he means.

 Safety can be provided if a ~= b is defined to be semantically the same  
 as
 a = a ~ b, and leaving the MRU cache as an optimization.


Yes, exactly.  That is what the MRU cache does.  It's no good if the  
current behavior becomes the fallback, the fallback must be a reallocation.

 In that case we
 agree slices are predictable and a little safer (because stomping is not
 possible). But they are still error prone if you expect them to be a full
 reference type. Being the only entity in D with such semantics, is
 something one can forget very easily and introduce subtle bugs. In this
 case, I really think providing ~= is a bad idea, it's just too error  
 prone
 and doesn't give you anything.


Providing the simple syntax for simple cases is what this does.  I think  
it's worth having in the language.

Let's all face it: slices and arrays being the same thing is a new concept  
that most of us have only seen in D (I don't know many languages, but  
probably it's been done before).  And the hybrid value/reference nature of  
a slice is definitely new to me.  I think it provides great power and  
flexibility that makes code just work the way you want it to *in most  
cases*.  If you read a lot of the tango code, you see how elegant things  
can be with slices.  But it's always going to be confusing to newcomers.   
It's like explaining pointers for the first time to someone.

 I still think the best is to just make slices immutable value types (you
 can mutate the data they point to, you just can't modify slices; ptr and
 length), and provide a proper dynamic array type in the library or
 something.


I think you mean slices should only be *shrinkable*, immutable makes no  
sense, you need to be able to rebind a slice.

I don't see the harm in allowing appending as long as it's safe.

Having a separate type in the library that optimizes appending and has  
complete reference semantics is fine with me, just leave slices and arrays  
alone (well, except the fix for stomping).  They *can* live together in  
the same language.

-Steve

Nov 24 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 24 Nov 2009 11:01:10 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 [snip]
 Andrei has mentioned that he thinks we can store the allocated length  
 in the GC block, which I think would also work.  You also wouldn't need  
 an MRU cache in that case, but he says it's in *addition* to the MRU  
 cache, so I'm not sure what he means.



 Reaching the GC block is relatively expensive, so the MRU still helps.  
 In essence it's like this. When appending:

 a) look up the cache, if there, you have O(1) amortized append that's  
 really fast

 b) if not in the cache, talk to the GC, still O(1) amortized append  
 that's not as fast

 Both help providing an important performance guarantee. I was a bit  
 worried about guaranteeing "O(1) amortized append for up to 8 arrays at  
 a time."


Have you considered the performance impact on allocating non-array types?   
That is, are you intending on all allocations storing the allocated  
length, even class or struct allocations who will likely never append?  Or  
will there be a "non-appendable" flag?

Also, the part without the MRU cache was my original proposal from last  
year, I had some ideas on how length could be stored.  For example, in a  
page of up to 128 byte blocks, you only need 8 bits to store the length  
(alas, you cannot store with 4 bits for 16-byte blocks because you need to  
cover both 0 and 16).  This reduces the overhead for those blocks.  For  
256 byte to 1-page blocks, 16 bits is acceptable multi-page blocks, the  
cost of storing a 32-bit integer is negligible.

It is true the lookup of the MRU cache will not involve dissecting the  
address of the block to find it's container block, but you still will need  
a lock, or are you planning on doing something clever?   I think the  
locking will be the bottleneck, and if you don't make it the same as the  
global lock, add the cost of both locks when you actually need to append.

-Steve

Nov 24 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 24 Nov 2009 11:26:11 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 On Tue, 24 Nov 2009 11:01:10 -0500, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 [snip]
 Andrei has mentioned that he thinks we can store the allocated length  
 in the GC block, which I think would also work.  You also wouldn't  
 need an MRU cache in that case, but he says it's in *addition* to the  
 MRU cache, so I'm not sure what he means.



 Reaching the GC block is relatively expensive, so the MRU still helps.  
 In essence it's like this. When appending:

 a) look up the cache, if there, you have O(1) amortized append that's  
 really fast

 b) if not in the cache, talk to the GC, still O(1) amortized append  
 that's not as fast

 Both help providing an important performance guarantee. I was a bit  
 worried about guaranteeing "O(1) amortized append for up to 8 arrays  
 at a time."


 types?  That is, are you intending on all allocations storing the  
 allocated length, even class or struct allocations who will likely  
 never append?  Or will there be a "non-appendable" flag?
  Also, the part without the MRU cache was my original proposal from  
 last year, I had some ideas on how length could be stored.  For  
 example, in a page of up to 128 byte blocks, you only need 8 bits to  
 store the length (alas, you cannot store with 4 bits for 16-byte blocks  
 because you need to cover both 0 and 16).  This reduces the overhead  
 for those blocks.  For 256 byte to 1-page blocks, 16 bits is acceptable  
 multi-page blocks, the cost of storing a 32-bit integer is negligible.


 Having access to the requested length is important at larger lengths, so  
 probably we could be fine by not actually storing it for allocations up  
 to 128 bytes.


So in other words, appending any array under 128 bytes which does not  
still sit in the MRU cache causes a reallocation?  I'm not sure if it's  
the right move, but it is an optimization, so maybe it works well.  It  
also handily fixes the problem of requiring allocated lengths for  
non-array allocations since most would be under 128 bytes.

 It is true the lookup of the MRU cache will not involve dissecting the  
 address of the block to find it's container block, but you still will  
 need a lock, or are you planning on doing something clever?   I think  
 the locking will be the bottleneck, and if you don't make it the same  
 as the global lock, add the cost of both locks when you actually need  
 to append.


 The cache is a thread-local map from pointers to size_t. Using it does  
 not require any locking I think.


When reallocating, do you not also need to update the allocated length in  
the heap, even if the allocation fits into the same block to prevent other  
threads from stomping?  Doesn't that require a lock?  Somewhere you have  
to share the allocated length between threads sharing the same array.  I  
assumed it was going to be in the MRU cache...

-Steve

Nov 24 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 24 Nov 2009 11:41:17 -0500, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:


 The cache is a thread-local map from pointers to size_t. Using it does  
 not require any locking I think.


 When reallocating, do you not also need to update the allocated length  
 in the heap, even if the allocation fits into the same block to prevent  
 other threads from stomping?  Doesn't that require a lock?  Somewhere  
 you have to share the allocated length between threads sharing the same  
 array.  I assumed it was going to be in the MRU cache...


If all this is true, I had another thought for an MRU cache.  It could  
simply be a map of pointers to the right blocks in the GC that store the  
lengths (to avoid the block search with the global mutex locked).  Lookup  
should be atomic without locking (I think, simply an integer load right?),  
but you'd have to lock to actually do an append.

I don't think we solve the lock problem without having multiple heaps...

-Steve

Nov 24 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 24 Nov 2009 12:46:37 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 On Tue, 24 Nov 2009 11:41:17 -0500, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 The cache is a thread-local map from pointers to size_t. Using it  
 does not require any locking I think.


 When reallocating, do you not also need to update the allocated length  
 in the heap, even if the allocation fits into the same block to  
 prevent other threads from stomping?  Doesn't that require a lock?   
 Somewhere you have to share the allocated length between threads  
 sharing the same array.  I assumed it was going to be in the MRU  
 cache...


 simply be a map of pointers to the right blocks in the GC that store  
 the lengths (to avoid the block search with the global mutex locked).


 That's a great idea. I keep on dreaming someone will implement all this  
 stuff we talk about :o).

 Lookup should be atomic without locking (I think, simply an integer  
 load right?), but you'd have to lock to actually do an append.
  I don't think we solve the lock problem without having multiple  
 heaps...


 I don't think we need to worry about optimizing growth of shared arrays.


What about immutable arrays?  Can't those be shared without being marked  
as shared?

-Steve

Nov 24 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 24 Nov 2009 12:46:37 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 On Tue, 24 Nov 2009 11:41:17 -0500, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 The cache is a thread-local map from pointers to size_t. Using it  
 does not require any locking I think.


 When reallocating, do you not also need to update the allocated length  
 in the heap, even if the allocation fits into the same block to  
 prevent other threads from stomping?  Doesn't that require a lock?   
 Somewhere you have to share the allocated length between threads  
 sharing the same array.  I assumed it was going to be in the MRU  
 cache...


 simply be a map of pointers to the right blocks in the GC that store  
 the lengths (to avoid the block search with the global mutex locked).


 That's a great idea. I keep on dreaming someone will implement all this  
 stuff we talk about :o).


Note this also solves the problem of having to flush the MRU cache on a  
collection -- the same allocated length property should remain attached to  
that block.

:)

-Steve

Nov 24 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 24 Nov 2009 12:46:37 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 Lookup should be atomic without locking (I think, simply an integer  
 load right?), but you'd have to lock to actually do an append.
  I don't think we solve the lock problem without having multiple  
 heaps...


 I don't think we need to worry about optimizing growth of shared arrays.


This doesn't compute.  The heap is shared, you need to store the allocated  
length in the heap, you need to lock on every allocate, even for  
non-shared arrays.  What am I missing?  Can you update *parts* of the heap  
without locking?  If so, why do we need an MRU cache?

-Steve

Nov 24 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 24 Nov 2009 15:00:11 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 It's hard to argue whether something is or isn't bug prone and how  
 severe the bugs could be. An experienced programmer knows all the  
 gotchas so he or she will never introduce such bugs. A more bug-prone  
 person will be afraid to speak up in public so as not to appear stupid  
 or be bullied away. It's only when the language goes into production  
 that the problems surface. Then an experienced programmer becomes part  
 of a group or not so experienced programmers and he or she starts seeing  
 how much the rests of the team is prone to certain bugs and how hard it  
 is to prevent or even detect them. That's why the best languages attempt  
 to turn as many errors as possible into hard bugs.


Yes, but there is a very fuzzy line between preventing all bugs and  
preventing useful code.  The safest most secure system is one that has no  
inputs or outputs.

 Non-deterministic behavior is a major source of soft bugs. A buggy  
 program might work correctly 99% of the time, introduce minor errors  
 0.9% of the time, and in 0.1% of the time lead to a disaster.

 I will try to come up with some scenarios that would qualify for the  
 "100 Gotchas with D arrays" to provide more food for though. Consider  
 these little language puzzles.

 Imagine you're reviewing this code written by a relative newbee:

 char[] quote(char[] word) {
   word.length += 2;
   word[length-1] = '"';
   for(int i = word.length-2; i > 0; --i)
     word[i] = word[i - 1];
   word[0] = '"';
   return word;
 }

 void storeFoos(char[] buffer, string pattern, Container c)
 {
 	char[] res = find(buffer, pattern);
 	c.store(quote(res[0..pattern.len]);
 }

 Is this buggy? If so, how and when will the bug manifest itself?


In my mind it is not buggy.  If quote or storeFoos is not meant to be able  
to molest the data, it should require a const(char)[].  If this was D1  
code, I'd say it depends on the purpose of storeFoos, if storeFoos is  
expecting to own the input buffer, it's not buggy.  If it's not, then  
storeFoos is buggy (in all cases).  But we aren't talking about D1.

 Or another story. An enthusiastic programmer comes to you with a very  
 clever rewrite of a function. This is the original:

 T[] duplicate(T)(T[] src) {
    static if (is(T == invariant))
       return src.idup;
    else
       return src.dup;
 }

 This is his improved version:

 T[] duplicate(T)(T[] src) {
    T tmp = src[$-1];
    src.length -= 1;
    src ~= tmp;
    return src;
 }

 No need for static if! The extension is bound to re-allocate because of  
 stomping rules. Will this always work, or is there a gotcha?


He forgot to check for length 0.

Also, you are a bit generous on your use of 'clever' :)

-Steve

Nov 24 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 18 Nov 2009 13:21:49 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 Andrei Alexandrescu Wrote:

 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 17 de noviembre a las 18:45 me escribiste:
 3.  If you **really** care about performance, you should only  






 don't know the length in advance.  If you know the length, you  






 pre-allocate.


 how the proposal follows from the feedback. My perception is that
 this is a group of people who use D. Bartosz' concern didn't cause
 quite a riot, so as far as I can tell there is no big issue at
 stake.


 I didn't say anything (until now) because this was discussed already.
 Dynamic arrays/slices appending is horribly broken (I know it's well
 defined, and deterministic, but you are just condemned to make  


 it just doesn't work as one would expect, even when you know how it  


 you have to keep fighting your intuition all the time).


 In which ways do you think arrays horribly broken? Same as Bartosz  
 mentions?

 One question is whether you actually have had bugs and problems coding,
 or if arrays stopped you from getting work done.


 What Leandro points out is that using D arrays is not straightforward  
 and may present a steep learning curve. In particular, even the beginner  
 would be required to understand section 4.1.9 of TDPL (Expanding). For  
 many people arrays' behavior might be counter-intuitive and a source of  
 mistakes.

 In fact, even after reading 4.1.9 I don't know what to expect in some  
 cases. Here's an example:

 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?


a[1] would be 1 if an MRU cache was introduced.  This is exactly the sort  
of problem that the MRU cache is supposed to solve.

What should happen is:

a ~= 1 should allocate in place because its parameters were in the MRU  
cache.
b ~= 1 should reallocate because the a~=1 should have removed its  
parameters from the MRU cache.

I agree that arrays behave differently than what people naively expect.   
However, their usage is so much better than anything from other languages  
I've used.  One of the biggest "problems" in understanding arrays I've  
found from having to explain them to other people is that they are half  
value type and half reference type.  I'd say most of the confusion comes  
 from that, but it's that feature that makes them so great IMO.  I'm unsure  
whether introducing a different usage would leave them as great, but I  
hope we can find a way to make them intuitive *and* keep them as useful as  
they are now.

The implementation-defined determinisim is probably another issue that  
bites people, but I think the cases are so much rarer.  Generally when you  
are appending *and* manipulating the original data, you have a unique  
array.  Otherwise your usage falls into either read-only + append usage,  
or buffer usage where you don't usually append.  Data-stomping is a memory  
issue (and a const/invariant issue), so I think that needs the most  
immediate attention.  If it can be solved without changing the syntax or  
system, then I think it's a win-win.  We can always decide to use a new  
concept later if we want to fix the deterministic behavior.

In fact, how hard would it be to introduce a deterministic construct  
*without* changing current behavior?  I mean, if the end result of your  
proposal is you add something like std::vector, why not still allow  
appending to slices *and* add something like std::vector where people can  
use that if they want better behavior?

-Steve

Nov 19 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 18 Nov 2009 14:03:11 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 Andrei Alexandrescu Wrote:


 One thing that Bartosz didn't mention is that "this unique" is different
 than "the other unique", which I think reveals the half-bakedness of the
 proposal. "The other unique" that was intensively discussed before is
 transitive, meaning that the root reference points to a graph of objects
 having no other connection to the outer world. "This unique" is very
 different because it's shallow - only the array chunk is unaliased, but
 the members of the array don't need to be. This is a very important
 distinction.


 Actually I was thinking of deep unique, but understandably I neither  
 described full details of my proposal, nor have I completely "baked" it,  
 which would require some serious work and a lot more discussion. One of  
 these days I will blog about uniqueness in more detail.

 The problem of transitivity with respect to arrays has been discussed  
 before in the context of const and immutable. We realized that shallow  
 constness/immutability would be useful in some scenarios but, because of  
 the added complexity, we punted it. My feeling is that, similarly, in  
 most cases the restriction that a unique array may only hold unique  
 objects is not unreasonable.


I wonder what your expert opinion is:  Is something like unique possible  
without introducing all the other type modifiers (like 'lent', etc.)?  If  
not, what are the minimum type modifiers you'd need to get unique to  
work?  I really like the unique concept, and think it is really one of the  
biggest holes missing from the const system (aside from bug 1961).

-Steve

Nov 19 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Thu, 19 Nov 2009 14:56:40 +0300, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Wed, 18 Nov 2009 13:21:49 -0500, Bartosz Milewski  
 <bartosz-nospam relisoft.com> wrote:

 Andrei Alexandrescu Wrote:

 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 17 de noviembre a las 18:45 me escribiste:
 3.  If you **really** care about performance, you should only  






 don't know the length in advance.  If you know the length, you  






 pre-allocate.


 how the proposal follows from the feedback. My perception is that
 this is a group of people who use D. Bartosz' concern didn't cause
 quite a riot, so as far as I can tell there is no big issue at
 stake.


 I didn't say anything (until now) because this was discussed already.
 Dynamic arrays/slices appending is horribly broken (I know it's well
 defined, and deterministic, but you are just condemned to make  


 it just doesn't work as one would expect, even when you know how it  


 you have to keep fighting your intuition all the time).


 In which ways do you think arrays horribly broken? Same as Bartosz  
 mentions?

 One question is whether you actually have had bugs and problems coding,
 or if arrays stopped you from getting work done.


 What Leandro points out is that using D arrays is not straightforward  
 and may present a steep learning curve. In particular, even the  
 beginner would be required to understand section 4.1.9 of TDPL  
 (Expanding). For many people arrays' behavior might be  
 counter-intuitive and a source of mistakes.

 In fact, even after reading 4.1.9 I don't know what to expect in some  
 cases. Here's an example:

 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?


 a[1] would be 1 if an MRU cache was introduced.  This is exactly the  
 sort of problem that the MRU cache is supposed to solve.

 What should happen is:

 a ~= 1 should allocate in place because its parameters were in the MRU  
 cache.
 b ~= 1 should reallocate because the a~=1 should have removed its  
 parameters from the MRU cache.

 I agree that arrays behave differently than what people naively expect.   
 However, their usage is so much better than anything from other  
 languages I've used.  One of the biggest "problems" in understanding  
 arrays I've found from having to explain them to other people is that  
 they are half value type and half reference type.  I'd say most of the  
 confusion comes  from that, but it's that feature that makes them so  
 great IMO.  I'm unsure whether introducing a different usage would leave  
 them as great, but I hope we can find a way to make them intuitive *and*  
 keep them as useful as they are now.

 The implementation-defined determinisim is probably another issue that  
 bites people, but I think the cases are so much rarer.  Generally when  
 you are appending *and* manipulating the original data, you have a  
 unique array.  Otherwise your usage falls into either read-only + append  
 usage, or buffer usage where you don't usually append.  Data-stomping is  
 a memory issue (and a const/invariant issue), so I think that needs the  
 most immediate attention.  If it can be solved without changing the  
 syntax or system, then I think it's a win-win.  We can always decide to  
 use a new concept later if we want to fix the deterministic behavior.

 In fact, how hard would it be to introduce a deterministic construct  
 *without* changing current behavior?  I mean, if the end result of your  
 proposal is you add something like std::vector, why not still allow  
 appending to slices *and* add something like std::vector where people  
 can use that if they want better behavior?

 -Steve


Slices are ranges. And ranges are *shrink-only*. Are you going to append  
to your data structure? Use an Array instead. Slices are not suitable for  
that purpose!

Nov 19 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 19 Nov 2009 07:32:03 -0500, Denis Koroskin <2korden gmail.com>  
wrote:

 Slices are ranges. And ranges are *shrink-only*. Are you going to append  
 to your data structure? Use an Array instead. Slices are not suitable  
 for that purpose!


Arrays are ranges too.  Just because the range interface does not define  
operations that allow growth doesn't mean that something that implements  
the interface *can't* add growth methods.

I see no problem with slices and arrays being the same type, as long as  
the stomping problem is solved.

-Steve

Nov 19 2009

Leandro Lucarella <llucax gmail.com> writes:

Andrei Alexandrescu, el 19 de noviembre a las 17:13 me escribiste:
 Leandro Lucarella wrote:
dsimcha, el 19 de noviembre a las 23:21 me escribiste:
Face it, D is a close to the metal language.  Any attempt to define the spec at
a
purely abstract level without reference to any implementation details, and
without
leaving certain details implementation defined is absurd.  I think people need
to
face the fact that you can't have close-to-the-metal flexibility and performance
and at the same time far-from-the-metal strict separation of specification from
implementation details.


Yes, but I think this is a place where you are doing early optimization at
the expense of usability. You can always do the slice-thinguie yourself if
you are that needed for speed that you can't afford an extra indirection
when using an array (which will be the only overhead if T[new] were
implemented as a pure reference type).

I think the cost here in terms of usability if far higher than the little
performance you loose, specially when you can overcome the issue for those
weird cases where you need it.


 I think one aspect that tends to be forgotten is that built-in
 arrays are the lowest abstraction - really just one small step above
 pointers. Efficiency lost there is lost forever. That's why I
 consider the current design adequate and the proposed designs less
 so.


I don't think so, slices (a view over somebody elses memory, no appending)
should be the lowest level abstraction.

 In other words: you can build a more convenient facility on top of
 T[], but you couldn't build a more efficient facility on top of a
 convenient one. I think D made the right choice with built-in
 arrays.


I don't. Current built-in arrays are both dynamic arrays and slices, which
is what messes all up.

But I'll drop the subject, once again, for the sake of... well, everybody
;)

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
SEÑOR BIELSA: CON TODO RESPETO ¿USTED LO VE JUGAR A RIQUELME?
	-- Crónica TV

Nov 19 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Fri, 20 Nov 2009 04:13:42 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Leandro Lucarella wrote:
 dsimcha, el 19 de noviembre a las 23:21 me escribiste:
 Face it, D is a close to the metal language.  Any attempt to define  
 the spec at a
 purely abstract level without reference to any implementation details,  
 and without
 leaving certain details implementation defined is absurd.  I think  
 people need to
 face the fact that you can't have close-to-the-metal flexibility and  
 performance
 and at the same time far-from-the-metal strict separation of  
 specification from
 implementation details.


 optimization at
 the expense of usability. You can always do the slice-thinguie yourself  
 if
 you are that needed for speed that you can't afford an extra indirection
 when using an array (which will be the only overhead if T[new] were
 implemented as a pure reference type).
  I think the cost here in terms of usability if far higher than the  
 little
 performance you loose, specially when you can overcome the issue for  
 those
 weird cases where you need it.


 I think one aspect that tends to be forgotten is that built-in arrays  
 are the lowest abstraction - really just one small step above pointers.  
 Efficiency lost there is lost forever. That's why I consider the current  
 design adequate and the proposed designs less so.

 In other words: you can build a more convenient facility on top of T[],  
 but you couldn't build a more efficient facility on top of a convenient  
 one. I think D made the right choice with built-in arrays.


 Andrei


Not so true. T[] appends are slow, *very* slow. Whenever I need anything  
efficient, I always use an additional "size" field (and arr.length is used  
as capacity). This is tricky, error-prone and not too intuitive to  
newcomers. As such, it contradicts with all your statements above.

And there is also a hack to make it work a bit faster:

T[] arr;
arr.length = capacity
arr.length = 0;

Much to my disgust, having no higher-level abstraction in language core  
only encourages writing code like this.

TBH, I used to write code like above before, too, but not now. I slowly  
eliminate everything like this (as well as all the ~= operations) from the  
code I wrote. It's unpredictable, slow, and leads to hard-to-fix bugs.

In other words, I disagree with your opinion that D made a right choice  
with built-in pseudo-arrays.

Nov 19 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 19 Nov 2009 17:58:15 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 Steven Schveighoffer Wrote:

 In fact, even after reading 4.1.9 I don't know what to expect in some
 cases. Here's an example:

 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?


 a[1] would be 1 if an MRU cache was introduced.  This is exactly the  
 sort
 of problem that the MRU cache is supposed to solve.

 What should happen is:

 a ~= 1 should allocate in place because its parameters were in the MRU
 cache.
 b ~= 1 should reallocate because the a~=1 should have removed its
 parameters from the MRU cache.


 Notice that you are using particular implementation detail (MRU cache)  
 to explain the semantics of D arrays. There is a very important  
 distinction between language specification and compiler implementation.  
 Andrei already had to go pretty deep into implementation to describe  
 arrays and slices: You can't define D arrays without talking about  
 shared buffers and memory allocation. I don't think including the  
 description of the MRU cache in the language specification is the right  
 solution. But I'm afraid an abstract definition of "stomping" might turn  
 out to be quite non-trivial.


I haven't yet read all the other posts, so someone else may already have  
pointed this out but...

Having an MRU cache makes it so you *don't* have to explain its semantics  
(or stomping).  Currently there is a paragraph in the spec (complete with  
example) talking about how stomping can occur, so you can just remove that  
part.  There is no need to talk about the MRU cache when talking about  
arrays, that's an implementation detail.  I was pointing it out because  
you are already used to the current bad implementation :)  I wouldn't even  
bring up the MRU cache in the book or the spec.  You just say that you can  
append to an array and it may make a copy of the data if necessary.  It's  
just like realloc, except safer.

-Steve

Nov 23 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sun, 22 Nov 2009 21:19:15 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 Andrei Alexandrescu Wrote:

 I'm not sure I figure what the issue is, and I'd appreciate a rehash of
 the few other issues brought up. (I thought they had been responded to.)


 Some of the points are in my message from Sat, 05:19 pm. Especially this  
 point: Can a conforming implementation simply re-allocate on every  
 expansion? You make it sound like that would violate some performance  
 guarantees but there are no specifics.


My view is that a conforming implementation could reallocate on every  
expansion.  That translates to an MRU cache size of zero.

But the MRU cache is just one way to solve the problem, and does not need  
to be in the language spec.  It's an implementation detail.

BTW, the more disturbing case than those being discussed, one which  
violates more than just your expectations is:

string x = "hello".idup;
auto y = x[0..1];
y ~= "appy";

We must have a solution to this problem, to keep the promise of  
immutability intact.  The other problem of sometimes sharing data is not  
as important IMO.

Basically, we have 2 options for the spec:

A. Once you append to an array, you cannot change the data through that  
array for any other alias to the original array, so x ~= y is equivalent  
to x = x ~ y.
B. Appending to an array may reallocate, so appending to x may allow x to  
share it's prefix bytes with another array.

Whatever implementation satisfies the choice is correct, however optimized  
you want to make it.

For instance, you could use an MRU cache for immutable data only, and  
still satisfy rule A.

I also agree that there should be an array building mechanism, because  
these rules may not satisfy all usages.

-Steve

Nov 23 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 21 Nov 2009 17:19:08 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?


 Can this question be answered without the recourse to implementation,  
 and the MRU cache in particular?

 I thought about an abstract definition of "stomping".Something like  
 that: The expanded part of the array is never implicitly shared with any  
 other array. But I'm not sure about this case:

 int[] a = [1, 2, 3];
 int[] b = a[0..1];
 b ~= 4;
 what is a[1]?

 By my definition, this would be considered stomping, but I couldn't find  
 this example in TDPL.


I believe that's because Andrei is hoping this problem will be fixed  
before TDPL comes out...

 I'm also not sure what to make of this statement: "client code is  
 guaranteed to benefit of good average performance over a large number of  
 appends to the same array". Since "good" is open to interpretation, such  
 a guarantee is meaningless. If it is meant as "amortized O(1)", it's  
 impossible to fulfill, unless the MRU cache is infinite and a lookup is  
 O(1).


If a reasonably small size limit is put on the MRU cache (defined by  
Andrei as 8), then MRU cache lookup is O(1).  You still need a lock, but  
that's not any different than today's requirement, and it certainly  
wouldn't be any worse than re-allocating for every append.  I agree there  
needs to be a better solution if you want the highest performance, but the  
simplest and most convenient solution should also perform reasonably well,  
and shouldn't corrupt data.

 Then there is the problem of multithreading. A global MRU cache would  
 require potentially expensive synchronization. A thread-local cache, on  
 the other hand, makes thread creation even more heavy-weight than it is  
 right now. My prediction is that D will be evolving towards user-level  
 threads and fine granularity concurrency, which requires very small  
 thread footprint.


You mean more heavyweight than allocating a new heap for each thread?  The  
only construct which makes sense to tie individual MRU caches to is a  
heap, and I think adding the 64 bytes required for an MRU cache to a heap  
is not that big a deal.

-Steve

Nov 23 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 23 Nov 2009 11:10:48 -0500, Leandro Lucarella <llucax gmail.com>
wrote:

 Steven Schveighoffer, el 23 de noviembre a las 07:34 me escribiste:
Notice that you are using particular implementation detail (MRU
cache) to explain the semantics of D arrays. There is a very
important distinction between language specification and compiler
implementation. Andrei already had to go pretty deep into
implementation to describe arrays and slices: You can't define D
arrays without talking about shared buffers and memory allocation.
I don't think including the description of the MRU cache in the
language specification is the right solution. But I'm afraid an
abstract definition of "stomping" might turn out to be quite
non-trivial.


 I haven't yet read all the other posts, so someone else may already
 have pointed this out but...

 Having an MRU cache makes it so you *don't* have to explain its
 semantics (or stomping).  Currently there is a paragraph in the spec
 (complete with example) talking about how stomping can occur, so you
 can just remove that part.  There is no need to talk about the MRU
 cache when talking about arrays, that's an implementation detail.  I
 was pointing it out because you are already used to the current bad
 implementation :)  I wouldn't even bring up the MRU cache in the
 book or the spec.  You just say that you can append to an array and
 it may make a copy of the data if necessary.  It's just like
 realloc, except safer.


 The thing is, with realloc() is less likely that you forget that the data
 can be copied because it returns the new pointer (that can be the same as
 the original pointer). And even in this case, I saw a lot of bugs related
 to realloc() misuse (and I made a couple myself).

 With slices is much worse.


realloc is worse.  If I have multiple aliases to the same data, then I  
realloc one of those, the others all can become dangling pointers if the  
runtime decides to move the data.  You also cannot realloc data that's not  
malloc'd but you can append to a slice of non-heap data without issue.

No matter what you do with slice appending in D, you cannot access  
dangling pointers unless you access the slice's ptr field.

Yes, you can run into trouble if you append to a slice and then change the  
original data in the slice, but that is a rare event, and it won't result  
in corrupting memory you didn't already have access to (at least, with the  
MRU cache).

-Steve

Nov 23 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 23 Nov 2009 16:33:42 -0500, dsimcha <dsimcha yahoo.com> wrote:

 But one of my biggest gripes about the current appending implementation,  
 far more
 important from a practical perspective than any theoretical safety  
 guarantees, is
 that it doesn't scale to multiple threads.  I've experienced this in a  
 real-world
 program that I translated from someone else's C++ code that used
 vector::push_back() quite liberally inside core algorithms.  (The reason  
 is that
 it was written more for simplicity/readability than performance, so the  
 person who
 wrote it never thought to optimize this kind of stuff.)  When I did a  
 literal
 translation of this to D, everything was completely serialized.  It  
 didn't even
 scale to two cores.  I had to rewrite significant portions of the code  
 in a way
 that did not use appending to get it to scale at all.

 D is supposed to be a language that is geared toward multi-threading.  If
 something as simple as array appending is a threading bottleneck (and it  
 was for
 me in a real-world program), I don't care if it is O(1) because  
 effectively it has
 a constant the size of Jupiter.


It's hard to say that your example is a good fit for the arrays of D.   
What you need is a library type that does not require a global lock for  
appending.  Such is not the case for every example of appending in D, and  
the existence of the current D arrays does not prevent you from adding  
more specialized types.  There are other costs to using such types besides  
the global lock, such as storing the capacity, and if you are slicing,  
locally storing the slice extents.

-Steve

Nov 23 2009

Leandro Lucarella <llucax gmail.com> writes:

Steven Schveighoffer, el 23 de noviembre a las 15:18 me escribiste:
 On Mon, 23 Nov 2009 11:10:48 -0500, Leandro Lucarella <llucax gmail.com>
 wrote:
 
Steven Schveighoffer, el 23 de noviembre a las 07:34 me escribiste:
Notice that you are using particular implementation detail (MRU
cache) to explain the semantics of D arrays. There is a very
important distinction between language specification and compiler
implementation. Andrei already had to go pretty deep into
implementation to describe arrays and slices: You can't define D
arrays without talking about shared buffers and memory allocation.
I don't think including the description of the MRU cache in the
language specification is the right solution. But I'm afraid an
abstract definition of "stomping" might turn out to be quite
non-trivial.


I haven't yet read all the other posts, so someone else may already
have pointed this out but...

Having an MRU cache makes it so you *don't* have to explain its
semantics (or stomping).  Currently there is a paragraph in the spec
(complete with example) talking about how stomping can occur, so you
can just remove that part.  There is no need to talk about the MRU
cache when talking about arrays, that's an implementation detail.  I
was pointing it out because you are already used to the current bad
implementation :)  I wouldn't even bring up the MRU cache in the
book or the spec.  You just say that you can append to an array and
it may make a copy of the data if necessary.  It's just like
realloc, except safer.


The thing is, with realloc() is less likely that you forget that the data
can be copied because it returns the new pointer (that can be the same as
the original pointer). And even in this case, I saw a lot of bugs related
to realloc() misuse (and I made a couple myself).

With slices is much worse.


 realloc is worse.  If I have multiple aliases to the same data, then
 I realloc one of those, the others all can become dangling pointers
 if the runtime decides to move the data.


Well, you are comparing GC vs no-GC, not realloc() vs. slices. I have no
intention on following that discussion, I'm just saying that realloc() is
less error-prone than slices (and realloc() is error-prone already).

 You also cannot realloc data that's not malloc'd but you can append to
 a slice of non-heap data without issue.


How is that? AFAIK slices uses gc_realloc() to do the actual realloc, if
that's done in a piece of malloc'ed data it will fail. And even if it were
true, I don't really see this as a big source of bugs, I really never had
a bug because I tried to realloc() a non-heap piece of memory or appending
to a slice of non-GC-heap memory either.

 No matter what you do with slice appending in D, you cannot access
 dangling pointers unless you access the slice's ptr field.


Again, that's only because D is GCed, not because slices are great.

 Yes, you can run into trouble if you append to a slice and then
 change the original data in the slice, but that is a rare event, and
 it won't result in corrupting memory you didn't already have access
 to (at least, with the MRU cache).


I'm a little lost now. I don't know of what hypothetical D are you talking
about. I can't see how the MRU cache can provide any safety. The cache is
finite, and not all the slices will fit in it, so for those slices that
are not cached, I don't see how the cache can provide any safety.

Safety can be provided if a ~= b is defined to be semantically the same as
a = a ~ b, and leaving the MRU cache as an optimization. In that case we
agree slices are predictable and a little safer (because stomping is not
possible). But they are still error prone if you expect them to be a full
reference type. Being the only entity in D with such semantics, is
something one can forget very easily and introduce subtle bugs. In this
case, I really think providing ~= is a bad idea, it's just too error prone
and doesn't give you anything.

I still think the best is to just make slices immutable value types (you
can mutate the data they point to, you just can't modify slices; ptr and
length), and provide a proper dynamic array type in the library or
something.

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
The biggest lie you can tell yourself is
When I get what I want I will be happy

Nov 23 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 24 Nov 2009 17:35:43 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 Steven Schveighoffer Wrote:

 On Tue, 24 Nov 2009 16:52:52 -0500, Bartosz Milewski
 <bartosz-nospam relisoft.com> wrote:

 Steven Schveighoffer Wrote:
 Imagine you're reviewing this code written by a relative newbee:

 char[] quote(char[] word) {
   word.length += 2;
   word[length-1] = '"';
   for(int i = word.length-2; i > 0; --i)
     word[i] = word[i - 1];
   word[0] = '"';
   return word;
 }

 void storeFoos(char[] buffer, string pattern, Container c)
 {
 	char[] res = find(buffer, pattern);
 	c.store(quote(res[0..pattern.len]);
 }

 Is this buggy? If so, how and when will the bug manifest itself?


 In my mind it is not buggy.  If quote or storeFoos is not meant to be
 able
 to molest the data, it should require a const(char)[].  If this was  




 code, I'd say it depends on the purpose of storeFoos, if storeFoos is
 expecting to own the input buffer, it's not buggy.  If it's not, then
 storeFoos is buggy (in all cases).  But we aren't talking about D1.


 surrounding code expects the buffer not to be modified. The newbee
 didn't want to idup the buffer because in most buffers the pattern is
 not found.


 Then he forgot the const decoration for the parameter.  No duping is
 necessary:

 void storeFoos(const(char)[] buffer, string pattern, Container c)
 {
     ... // will result in compiler errors for given quote function, fix
 that.
 }


 Well, he tried that, but then his quote function wouldn't work and he  
 needs it in other places as well. You know how they are ;-). Besides he  
 thinks "const" is for wimps. He won't stop arguing until you prove to  
 him that there is an actual bug in his code.


I think this warrants more code reviews of this particular developer's  
code :)

Bottom line: if a function isn't supposed to change the buffer, the  
signature should be const for that parameter.  It's one of the principles  
of const, and why it's in D2 in the first place.  I'd explain to the coder  
that he is wrong to expect that modifying a buffer in a function that  
isn't supposed to modify a buffer is acceptable (and hopefully he gets it,  
or else I don't have time to deal with people who insist on being right  
when they are not).

BTW, in my experience, the newbie expectaction of ~= is usually that it  
modifies the original even when it doesn't, not the other way around.

 Or another story. An enthusiastic programmer comes to you with a  






 clever rewrite of a function. This is the original:

 T[] duplicate(T)(T[] src) {
    static if (is(T == invariant))
       return src.idup;
    else
       return src.dup;
 }










 This is his improved version:

 T[] duplicate(T)(T[] src) {
    T tmp = src[$-1];
    src.length -= 1;
    src ~= tmp;
    return src;
 }

 No need for static if! The extension is bound to re-allocate  






 of
 stomping rules. Will this always work, or is there a gotcha?


 He forgot to check for length 0.


 That he did. So now he goes back to the drawing board and adds an  


 return for length 0. Is his code correct now?


 Yes.

 I have a feeling you have some trick up your sleeve that you think makes
 it incorrect :)


 Well, I was wondering if it could be rewritten as:
 T[] duplicate(T)(T[] src) {
     assert(src.length != 0);
     T tmp = src[$-1];
     src.length -= 1;
     src.length += 1;
     src[$-1] = tmp;
     return src;
 }
 and whether it's okay for the optimizer to do some simple arithmetic  
 optimization.


First, this code isn't equivalent to the original, because it doesn't work  
on const and immutable arrays.
Second, the optimizer cannot do simple math optimization on src.length -=  
1 and src.length += 1 because length is a property, not a field.  Setting  
the length calls a function, which cannot be optimized out.

I also thought of some problems with the original code with static if's.   
Assuming that version, why must this code fail?:

class C {}
const(C)[] arr;
const(C)[] arr2 = arr.duplicate();

Oops, just tried it and no problems!  I'll file a bug if it still happens  
on the latest compiler.

-Steve

Nov 24 2009

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Steven Schveighoffer <schveiguy yahoo.com> wrote:

 You are again resorting to implementation. I guess in the current  
 implementation it's true that the compiler will indeed insert a call to  
 its internal function. But the language spec does not proscribe that.


 Sure it does.  It says that setting the length may reallocate the array  
 to hold enough space.  How do you do that without a function call?


As long as the compiler is aware of arrays' existence, and knows how
those functions work, there might indeed be an optimization that turns
length+=1;length-=1; into a nop. There isn't at the moment, and there
might never be, but it's a conceivable optimization.

-- 
Simen

Nov 25 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 26 Nov 2009 17:45:30 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 Steve, I don't know about you, but this exchange clarified some things  
 for me. The major one is that it's dangerous to define the semantics of  
 a language construct in terms of implementation. You were defending some  
 points using implementation arguments rather than sticking to defined  
 semantics.


I was defending the semantics by using an example implementation.  I was  
not defining the semantics in terms of implementation.  The semantics are  
defined by the spec, and do not indicate when an array is reallocated and  
when it is not.  That detail is implementation defined.  My examples use  
dmd's implementation to show how the assumption can break.  You said the  
guy needs me to show him that it is broken, and all his tests pass, why  
can't I use my knowledge of the implementation to come up with an example?

I could rewrite my statements as:  "You should not rely on the array being  
reallocated via append, because D does not guarantee such reallocation.   
Using the reference implementation of dmd, it is possible to come up with  
an example of where this fails: ..."

 We have found out that one should never rely on the array being  
 re-allocated on expansion, even if it seems like there's no other way.  
 The only correct statement is that the freshly expanded part of the  
 array is guaranteed not to be write-shared with any other array.


I agree with this (except for "even if it seems like there's no other  
way,"  The spec says an allocation always occurs when you do a ~ b, so you  
can always rewrite a ~= b as a = a ~ b).  In fact, at one point to avoid  
stomping I went through Tango and found all places where append could  
result in stomping, and changed the code this way.  There were probably  
less than 5 instances.  Append is not a very common operation when you  
didn't create the array to begin with.

 However, this discussion veered away from a more important point. I  
 don't believe that programmers will consciously make assumptions about  
 re-allocation breaking sharing.


For the most part, this is ok -- rarely do you see someone append to an  
array they didn't create *and* modify the original data.

My belief is that people will expect more that appending an array  
*doesn't* reallocate.  If you have experience in programming, the language  
you are used to either treats arrays as value types or as reference  
types.  I don't think I've ever seen a language besides D that uses the  
hybrid type for arrays.  So you are going to come to D expecting value or  
reference.  If you expect value, you should quickly learn that's not the  
case because 99% of the time, arrays look like reference types.  It is  
natural then to expect appending to an array to affect all other aliases  
of that array, after all it is a reference type.  I just think your  
examples don't ring true in practice because there are simpler ways to  
guarantee allocation.  You have to go out of your way to write bad code  
that doesn't work correctly.

Finally, it's easy to turn an array into a reference type when passing as  
a parameter, just use the ref decorator.  All we need is a way to turn it  
into a value type, and I think Andrei's idea of Value!(arr) would be great  
for that.

 The danger is that it's easy to miss accidental sharing and it's very  
 hard to test for it.


I think this danger is rare, and it's easy to search for (just search for  
~= in your code, I did it with Tango).  I think it can be very well  
defined in a tutorial or book chapter.

-Steve

Dec 01 2009

Leandro Lucarella <llucax gmail.com> writes:

Andrei Alexandrescu, el 17 de noviembre a las 18:45 me escribiste:
3.  If you **really** care about performance, you should only append when you
don't know the length in advance.  If you know the length, you should always
pre-allocate.


 We will have collections and all those good things, but I don't see
 how the proposal follows from the feedback. My perception is that
 this is a group of people who use D. Bartosz' concern didn't cause
 quite a riot, so as far as I can tell there is no big issue at
 stake.


I didn't say anything (until now) because this was discussed already.
Dynamic arrays/slices appending is horribly broken (I know it's well
defined, and deterministic, but you are just condemned to make mistakes,
it just doesn't work as one would expect, even when you know how it works
you have to keep fighting your intuition all the time).

But it's a little pointless to keep making riots, when we were so close to
finally fix this with T[new]/whatever and it all went back.

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Hey you, out there on your own
Sitting naked by the phone
Would you touch me?

Nov 18 2009

Leandro Lucarella <llucax gmail.com> writes:

Andrei Alexandrescu, el 19 de noviembre a las 18:25 me escribiste:
 Leandro Lucarella wrote:
Andrei Alexandrescu, el 19 de noviembre a las 17:13 me escribiste:
Leandro Lucarella wrote:
dsimcha, el 19 de noviembre a las 23:21 me escribiste:
Face it, D is a close to the metal language.  Any attempt to define the spec at
a
purely abstract level without reference to any implementation details, and
without
leaving certain details implementation defined is absurd.  I think people need
to
face the fact that you can't have close-to-the-metal flexibility and performance
and at the same time far-from-the-metal strict separation of specification from
implementation details.


the expense of usability. You can always do the slice-thinguie yourself if
you are that needed for speed that you can't afford an extra indirection
when using an array (which will be the only overhead if T[new] were
implemented as a pure reference type).

I think the cost here in terms of usability if far higher than the little
performance you loose, specially when you can overcome the issue for those
weird cases where you need it.


arrays are the lowest abstraction - really just one small step above
pointers. Efficiency lost there is lost forever. That's why I
consider the current design adequate and the proposed designs less
so.


I don't think so, slices (a view over somebody elses memory, no appending)
should be the lowest level abstraction.

In other words: you can build a more convenient facility on top of
T[], but you couldn't build a more efficient facility on top of a
convenient one. I think D made the right choice with built-in
arrays.


I don't. Current built-in arrays are both dynamic arrays and slices, which
is what messes all up.


 Well it also makes arrays/slices quite capable.


Yes, capable of introducing bugs specially =)

 Right now they only have one problem (which different people ascribe
 a different gravity to) and are otherwise very expressive.


Nobody is judging the expressiveness of arrays, the bug-prone semantics
is the problem.

 Now think of this: the MRU solves a grave issue with slices.


You are using arrays and slices interchangeably on purpose, right? I hate
you! =)

 We haven't thought of that for years, and all of a sudden stomping is
 not an issue anymore.


And suddenly you are tying array implementation with the GC. You are
making things worse, not better. This whole issue reminds me to this blog:
http://thereifixedit.com/
=P

 I wouldn't be in the least surprised if someone posts in this group
 tomorrow that they found a way to make ~= more predictable.


Split slices and arrays, add a capacity field to arrays, remove appending
from slices. There.

But I'll drop the subject, once again, for the sake of... well, everybody
;)


 I encourage you to further your point (while in understanding that
 your existing arguments and proposals have been understood). This is
 the best time to make a contribution to D's design.


I keep replying just because I can't control myself, but I really think
it's pointless, I have said (as many others) several times what are the
problems and what would be a nice solution. You just find the solution
unsuitable and think the cache is good enough.

 We may as well undefine "~" or "~=" or both for arrays if there's
 enough arguments and evidence that it's a problem to keep them.


I think that would be very bad. For once, I thought "~" where out of
discussion, because it always create a new temporal (like any other binary
operator), so I can't see a problem with it. And I hope you are only
seeing as a possibility to remove "~=" from slices, making possible for
a library array type to define it. I can totally lived with that. If the
array type were imported in object.d it would be even better.

 When making proposals just keep in mind one important thing: with the
 context that you (all) have accumulated regarding this discussion, a lot
 of things seem very easy to understand, which are really weird when just
 starting with the language.


Please, name them! And BTW, understanding the MRU cache (and trying to
predict it) will be any easier?

 One other thing is that it's all tradeoffs; things that are "good"
 in all dimensions are in very short supply.


I know that! This discussion is just because some think one tradeoff is
better than the other.

 With arrays, we may be in the position of choosing a solution that won't
 please everyone.  The question right now is "which 5% of users you will
 disappoint".


I really think more than 5% of the users will hit the weird half value
half reference semantics of slices, but, of course, I'm making this
figures up, just like you did ;)

 As for reviving T[new], I believe that, unless some new insight comes
 forward, practically that ship has sailed. That doesn't preclude
 defining a library type that is a more encapsulated array.


I don't care about T[new], I'm fine with a library defined type. Just
remove appending from slices (and make .length read-only).

 Fortunately, that can be implemented using T[] because T[] is expressive
 enough :o).


...

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Relax.
I'll need some information first.
Just the basic facts.
Can you show me where it hurts?

Nov 20 2009

Leandro Lucarella <llucax gmail.com> writes:

Andrei Alexandrescu, el 18 de noviembre a las 07:22 me escribiste:
 Leandro Lucarella wrote:
Andrei Alexandrescu, el 17 de noviembre a las 18:45 me escribiste:
3.  If you **really** care about performance, you should only append when you
don't know the length in advance.  If you know the length, you should always
pre-allocate.


how the proposal follows from the feedback. My perception is that
this is a group of people who use D. Bartosz' concern didn't cause
quite a riot, so as far as I can tell there is no big issue at
stake.


I didn't say anything (until now) because this was discussed already.
Dynamic arrays/slices appending is horribly broken (I know it's well
defined, and deterministic, but you are just condemned to make mistakes,
it just doesn't work as one would expect, even when you know how it works
you have to keep fighting your intuition all the time).


 In which ways do you think arrays horribly broken? Same as Bartosz mentions?


Yes, I think dynamic arrays are grouping 2 separate things: a proper
dynamic array able to efficiently appending stuff (without hacks like the
size cache) and a slice (a read-only view of a piece of memory);  a "random
range" if you want to put it in terms of D.

A dynamic array should be a reference type, a slice a value type.
I explained my view before. I wont explain it with much detail again (I
hope you remember the thread).

 One question is whether you actually have had bugs and problems
 coding, or if arrays stopped you from getting work done.


I had many bugs because of the reallocation when appending.

But it's a little pointless to keep making riots, when we were so close
to finally fix this with T[new]/whatever and it all went back.


 Hmmmm... resignation is not a trait of this group :o).


It is after a good fight :)

From time to time this kind of issues (that are present in the group since
I can remember, which is about 4 years maybe) keep popping in the NG. I
get tired of repeating myself over and over again (like may others,
I guess).

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Hey, it's George. I got nothing to say.
	-- George Constanza, dejando un mensaje en la contestadora de Jerry

Nov 18 2009

gzp <galap freemail.hu> writes:

Bartosz Milewski �rta:
 dsimcha Wrote:
 
 The one thing that I think has been missing from this discussion is, what
would be
 the alternative if we didn't have this "non-deterministic" reallocation?  How
else
 could you **efficiently** implement dynamic arrays?


 In the long run (D3), I proposed using the "unique" type modifier. If an array
is unique, the compiler knows that there are no slices to worry about, and it
can use in-place reallocation to its heart content. That pretty much solves the
performance problem. 
 
 In the short run (D2), I would suggest sticking to "reallocate on every
extension" semantics (especially in SafeD) and provide a library solution (a la
C++ std::vector) where the performance of appending is an issue. 



It seems as we cannot persuade Andrei and Walter, that it is a 
really-really bad and dangerous implementation. Everybody knows what is 
going on when an int[] "array" is resized. It doesn't really matters, 
how we call it, indeterminism, undefined behaviour, implementation 
dependent behaviour. But we all know it's a source of a hardly 
detectable, dangerous bug.
As I've said it before it's worse than buffer overrun, as they can be 
detected using some memory patterns and memory guarding techniques. But 
this bug cannot be detected, it just occurs once in a while, (like 
implementing parallel programs without the proper critical sections.)
It's hard enough to create good (parallel) programs by itself, so don't
harden the work with placing such a trap.

So please take a look at the language with a different eye. I know 
you've worked a lot on it, but I'm afraid such a mistake could ruin the 
whole language. Sorry. Or at least just think why are there so many 
posts concerning this feature.

int[] is not a range, it is a range + random_copy_on_resize (replace the 
word random with any of your choice, it is not a matter)

I think we can agree that passing "array" int[] as a (non-cont) value 
type is highly avoidable. Than why do we allow it ???

With std::vector + iterator you always now when it is invalid (when the 
array is resized). But here you don't have an object whose modification 
(surly) invalidates the ranges. I'm missing to have reference for the 
actual array.

The semantics is not so important for me (now), but
distinct the two object:
1. have array those contains the element + structure, that can be 
resized: RandomAccessArray!(int)
2. have slices (ranges) of array, where the structure cannot be altered 
only the elements: Rang!(int)
3. decide if int[] stands for a short version for either 
RandomAccessArray!(int) or Range!(int), but DO NOT have a mixed meaning.

Gzp

Nov 19 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Thu, 19 Nov 2009 18:33:22 +0300, gzp <galap freemail.hu> wrote:

 Bartosz Milewski =C3=ADrta:
 dsimcha Wrote:

 The one thing that I think has been missing from this discussion is,=







 what would be
 the alternative if we didn't have this "non-deterministic"  =






 reallocation?  How else
 could you **efficiently** implement dynamic arrays?







 an array is unique, the compiler knows that there are no slices to  =




 worry about, and it can use in-place reallocation to its heart conten=





 That pretty much solves the performance problem.  In the short run  =




 (D2), I would suggest sticking to "reallocate on every extension"  =




 semantics (especially in SafeD) and provide a library solution (a la =





 C++ std::vector) where the performance of appending is an issue.



 It seems as we cannot persuade Andrei and Walter, that it is a  =


 really-really bad and dangerous implementation. Everybody knows what i=



 going on when an int[] "array" is resized. It doesn't really matters, =



 how we call it, indeterminism, undefined behaviour, implementation  =


 dependent behaviour. But we all know it's a source of a hardly  =


 detectable, dangerous bug.
 As I've said it before it's worse than buffer overrun, as they can be =



 detected using some memory patterns and memory guarding techniques. Bu=



 this bug cannot be detected, it just occurs once in a while, (like  =


 implementing parallel programs without the proper critical sections.)
 It's hard enough to create good (parallel) programs by itself, so don'=


 harden the work with placing such a trap.

 So please take a look at the language with a different eye. I know  =


 you've worked a lot on it, but I'm afraid such a mistake could ruin th=



 whole language. Sorry. Or at least just think why are there so many  =


 posts concerning this feature.

 int[] is not a range, it is a range + random_copy_on_resize (replace t=



 word random with any of your choice, it is not a matter)

 I think we can agree that passing "array" int[] as a (non-cont) value =



 type is highly avoidable. Than why do we allow it ???

 With std::vector + iterator you always now when it is invalid (when th=



 array is resized). But here you don't have an object whose modificatio=



 (surly) invalidates the ranges. I'm missing to have reference for the =



 actual array.

 The semantics is not so important for me (now), but
 distinct the two object:
 1. have array those contains the element + structure, that can be  =


 resized: RandomAccessArray!(int)
 2. have slices (ranges) of array, where the structure cannot be altere=



 only the elements: Rang!(int)
 3. decide if int[] stands for a short version for either  =


 RandomAccessArray!(int) or Range!(int), but DO NOT have a mixed meanin=


 Gzp


Same here. FWIW, I strongly believe demise of T[new] was a step in a wro=
ng  =

direction, and I feel highly frustrated about it. It was one of the most=
  =

anticipated features that was asked for *years*! And it's gone when it w=
as  =

so close to be implemented...

The only more or less reasonable answer why T[new] "sucked" was:

On Mon, 19 Oct 2009 01:55:28 +0400, Andrei Alexandrescu  =

<SeeWebsiteForEmail erdani.org> wrote:
 Returning a distinct type from .dup and ~ makes slices not closed over=



 these operations, a source of complication, confusion, and bloating.


I see no problem returning T[] when a slice is dup'ed or concatenated.  =

That's what always happened anyway.

Nov 19 2009

grauzone <none example.net> writes:

Denis Koroskin wrote:
 Same here. FWIW, I strongly believe demise of T[new] was a step in a 
 wrong direction, and I feel highly frustrated about it. It was one of 
 the most anticipated features that was asked for *years*! And it's gone 
 when it was so close to be implemented...
 
 The only more or less reasonable answer why T[new] "sucked" was:
 
 On Mon, 19 Oct 2009 01:55:28 +0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 Returning a distinct type from .dup and ~ makes slices not closed over 
 these operations, a source of complication, confusion, and bloating.


 I see no problem returning T[] when a slice is dup'ed or concatenated. 
 That's what always happened anyway.


I think Walter and Andrei just felt like many people do when they have 
to adapt to a new environment or some other big change: the new thing 
"sucks", you feel bad, you thought "the past was better", you want go go 
back. Also, you don't see the advantages, you only get riled up by the 
disadvantage (because _every_ change has disadvantages; that's only 
natural).

So they just threw it away.

Also, I don't quite understand why they find the new semantics caused by 
the append-caching simpler. Even Bartosz got worried about the 
undeterministic behavior of this.

Nov 19 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

grauzone wrote:
 Denis Koroskin wrote:
 Same here. FWIW, I strongly believe demise of T[new] was a step in a 
 wrong direction, and I feel highly frustrated about it. It was one of 
 the most anticipated features that was asked for *years*! And it's 
 gone when it was so close to be implemented...

 The only more or less reasonable answer why T[new] "sucked" was:

 On Mon, 19 Oct 2009 01:55:28 +0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 Returning a distinct type from .dup and ~ makes slices not closed 
 over these operations, a source of complication, confusion, and 
 bloating.


 I see no problem returning T[] when a slice is dup'ed or concatenated. 
 That's what always happened anyway.


 I think Walter and Andrei just felt like many people do when they have 
 to adapt to a new environment or some other big change: the new thing 
 "sucks", you feel bad, you thought "the past was better", you want go go 
 back. Also, you don't see the advantages, you only get riled up by the 
 disadvantage (because _every_ change has disadvantages; that's only 
 natural).
 
 So they just threw it away.
 
 Also, I don't quite understand why they find the new semantics caused by 
 the append-caching simpler. Even Bartosz got worried about the 
 undeterministic behavior of this.


However flattering the attention to Walter's and my psychology could be, 
it may be difficult to find supporting evidence for the theory above.

Andrei

Nov 19 2009

Leandro Lucarella <llucax gmail.com> writes:

dsimcha, el 19 de noviembre a las 23:21 me escribiste:
 Face it, D is a close to the metal language.  Any attempt to define the spec
at a
 purely abstract level without reference to any implementation details, and
without
 leaving certain details implementation defined is absurd.  I think people need
to
 face the fact that you can't have close-to-the-metal flexibility and
performance
 and at the same time far-from-the-metal strict separation of specification from
 implementation details.


Yes, but I think this is a place where you are doing early optimization at
the expense of usability. You can always do the slice-thinguie yourself if
you are that needed for speed that you can't afford an extra indirection
when using an array (which will be the only overhead if T[new] were
implemented as a pure reference type).

I think the cost here in terms of usability if far higher than the little
performance you loose, specially when you can overcome the issue for those
weird cases where you need it.

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Pa' ella cociné, pa' ella lavé, pa' ella soñe
Paella completa, $2,50
Pero, la luz mala me tira, y yo? yo soy ligero pa'l trote
La luz buena, está en el monte, allá voy, al horizonte

Nov 19 2009

Leandro Lucarella <llucax gmail.com> writes:

Steven Schveighoffer, el 23 de noviembre a las 07:34 me escribiste:
Notice that you are using particular implementation detail (MRU
cache) to explain the semantics of D arrays. There is a very
important distinction between language specification and compiler
implementation. Andrei already had to go pretty deep into
implementation to describe arrays and slices: You can't define D
arrays without talking about shared buffers and memory allocation.
I don't think including the description of the MRU cache in the
language specification is the right solution. But I'm afraid an
abstract definition of "stomping" might turn out to be quite
non-trivial.


 I haven't yet read all the other posts, so someone else may already
 have pointed this out but...
 
 Having an MRU cache makes it so you *don't* have to explain its
 semantics (or stomping).  Currently there is a paragraph in the spec
 (complete with example) talking about how stomping can occur, so you
 can just remove that part.  There is no need to talk about the MRU
 cache when talking about arrays, that's an implementation detail.  I
 was pointing it out because you are already used to the current bad
 implementation :)  I wouldn't even bring up the MRU cache in the
 book or the spec.  You just say that you can append to an array and
 it may make a copy of the data if necessary.  It's just like
 realloc, except safer.


The thing is, with realloc() is less likely that you forget that the data
can be copied because it returns the new pointer (that can be the same as
the original pointer). And even in this case, I saw a lot of bugs related
to realloc() misuse (and I made a couple myself).

With slices is much worse.

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Si le decía "Vamos al cine, rica"
Me decía "Veamos una de Kusturica"
Si le decía "Vamos a oler las flores"
Me hablaba de Virginia Wolf y sus amores
Me hizo mucho mal la cumbiera intelectual

Nov 23 2009

D Programming

C/C++ Programming

Other

digitalmars.D - Re: D array expansion and non-deterministic re-allocation