www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Re: D array expansion and non-deterministic re-allocation

reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
dsimcha Wrote:

 
 The one thing that I think has been missing from this discussion is, what
would be
 the alternative if we didn't have this "non-deterministic" reallocation?  How
else
 could you **efficiently** implement dynamic arrays?

In the long run (D3), I proposed using the "unique" type modifier. If an array is unique, the compiler knows that there are no slices to worry about, and it can use in-place reallocation to its heart content. That pretty much solves the performance problem. In the short run (D2), I would suggest sticking to "reallocate on every extension" semantics (especially in SafeD) and provide a library solution (a la C++ std::vector) where the performance of appending is an issue.
Nov 17 2009
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Bartosz Milewski (bartosz-nospam relisoft.com)'s article
 dsimcha Wrote:
 The one thing that I think has been missing from this discussion is, what
would be
 the alternative if we didn't have this "non-deterministic" reallocation?  How
else
 could you **efficiently** implement dynamic arrays?


use in-place reallocation to its heart content. That pretty much solves the performance problem.
 In the short run (D2), I would suggest sticking to "reallocate on every

C++ std::vector) where the performance of appending is an issue. Probably not a bad idea, since: 1. I've concluded that appending to slices simply can't be made both efficient and safe. 2. We'll probably get a std.collections anyhow. 3. If you **really** care about performance, you should only append when you don't know the length in advance. If you know the length, you should always pre-allocate.
Nov 17 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
dsimcha wrote:
 == Quote from Bartosz Milewski (bartosz-nospam relisoft.com)'s article
 dsimcha Wrote:
 The one thing that I think has been missing from this discussion is, what
would be
 the alternative if we didn't have this "non-deterministic" reallocation?  How
else
 could you **efficiently** implement dynamic arrays?


use in-place reallocation to its heart content. That pretty much solves the performance problem.
 In the short run (D2), I would suggest sticking to "reallocate on every

C++ std::vector) where the performance of appending is an issue. Probably not a bad idea, since: 1. I've concluded that appending to slices simply can't be made both efficient and safe. 2. We'll probably get a std.collections anyhow. 3. If you **really** care about performance, you should only append when you don't know the length in advance. If you know the length, you should always pre-allocate.

We will have collections and all those good things, but I don't see how the proposal follows from the feedback. My perception is that this is a group of people who use D. Bartosz' concern didn't cause quite a riot, so as far as I can tell there is no big issue at stake. (Add to that the fact that today's arrays are even _more_ potentially puzzling because, without the MRU cache, expanding slices may trample over other slices.) In contrast, slowness of ~= comes time and again to haunt this newsgroup. Want it or not, people will want to use T[] and ~= to manage arrays. We could remove the ~= operator and require use of a library type for appends, but the aforementioned facts suggest that that might not be the best decision. One thing that Bartosz didn't mention is that "this unique" is different than "the other unique", which I think reveals the half-bakedness of the proposal. "The other unique" that was intensively discussed before is transitive, meaning that the root reference points to a graph of objects having no other connection to the outer world. "This unique" is very different because it's shallow - only the array chunk is unaliased, but the members of the array don't need to be. This is a very important distinction. Andrei
Nov 17 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
 Andrei Alexandrescu, el 17 de noviembre a las 18:45 me escribiste:
 3.  If you **really** care about performance, you should only append when you
 don't know the length in advance.  If you know the length, you should always
 pre-allocate.

how the proposal follows from the feedback. My perception is that this is a group of people who use D. Bartosz' concern didn't cause quite a riot, so as far as I can tell there is no big issue at stake.

I didn't say anything (until now) because this was discussed already. Dynamic arrays/slices appending is horribly broken (I know it's well defined, and deterministic, but you are just condemned to make mistakes, it just doesn't work as one would expect, even when you know how it works you have to keep fighting your intuition all the time).

In which ways do you think arrays horribly broken? Same as Bartosz mentions? One question is whether you actually have had bugs and problems coding, or if arrays stopped you from getting work done.
 But it's a little pointless to keep making riots, when we were so close to
 finally fix this with T[new]/whatever and it all went back.

Hmmmm... resignation is not a trait of this group :o). Andrei
Nov 18 2009
next sibling parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Andrei Alexandrescu Wrote:

 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 17 de noviembre a las 18:45 me escribiste:
 3.  If you **really** care about performance, you should only append when you
 don't know the length in advance.  If you know the length, you should always
 pre-allocate.

how the proposal follows from the feedback. My perception is that this is a group of people who use D. Bartosz' concern didn't cause quite a riot, so as far as I can tell there is no big issue at stake.

I didn't say anything (until now) because this was discussed already. Dynamic arrays/slices appending is horribly broken (I know it's well defined, and deterministic, but you are just condemned to make mistakes, it just doesn't work as one would expect, even when you know how it works you have to keep fighting your intuition all the time).

In which ways do you think arrays horribly broken? Same as Bartosz mentions? One question is whether you actually have had bugs and problems coding, or if arrays stopped you from getting work done.

What Leandro points out is that using D arrays is not straightforward and may present a steep learning curve. In particular, even the beginner would be required to understand section 4.1.9 of TDPL (Expanding). For many people arrays' behavior might be counter-intuitive and a source of mistakes. In fact, even after reading 4.1.9 I don't know what to expect in some cases. Here's an example: int[] a = [0]; auto b = a; a ~= 1; b ~= 2; What is a[1]? Is this considered "stomping" and requiring a re-allocation?
Nov 18 2009
parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Steven Schveighoffer Wrote:

 In fact, even after reading 4.1.9 I don't know what to expect in some  
 cases. Here's an example:

 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?

a[1] would be 1 if an MRU cache was introduced. This is exactly the sort of problem that the MRU cache is supposed to solve. What should happen is: a ~= 1 should allocate in place because its parameters were in the MRU cache. b ~= 1 should reallocate because the a~=1 should have removed its parameters from the MRU cache.

Notice that you are using particular implementation detail (MRU cache) to explain the semantics of D arrays. There is a very important distinction between language specification and compiler implementation. Andrei already had to go pretty deep into implementation to describe arrays and slices: You can't define D arrays without talking about shared buffers and memory allocation. I don't think including the description of the MRU cache in the language specification is the right solution. But I'm afraid an abstract definition of "stomping" might turn out to be quite non-trivial.
Nov 19 2009
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Bartosz Milewski (bartosz-nospam relisoft.com)'s article
 Steven Schveighoffer Wrote:
 In fact, even after reading 4.1.9 I don't know what to expect in some
 cases. Here's an example:

 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?

a[1] would be 1 if an MRU cache was introduced. This is exactly the sort of problem that the MRU cache is supposed to solve. What should happen is: a ~= 1 should allocate in place because its parameters were in the MRU cache. b ~= 1 should reallocate because the a~=1 should have removed its parameters from the MRU cache.


language specification and compiler implementation. Andrei already had to go pretty deep into implementation to describe arrays and slices: You can't define D arrays without talking about shared buffers and memory allocation. I don't think including the description of the MRU cache in the language specification is the right solution. But I'm afraid an abstract definition of "stomping" might turn out to be quite non-trivial. Face it, D is a close to the metal language. Any attempt to define the spec at a purely abstract level without reference to any implementation details, and without leaving certain details implementation defined is absurd. I think people need to face the fact that you can't have close-to-the-metal flexibility and performance and at the same time far-from-the-metal strict separation of specification from implementation details.
Nov 19 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
 dsimcha, el 19 de noviembre a las 23:21 me escribiste:
 Face it, D is a close to the metal language.  Any attempt to define the spec
at a
 purely abstract level without reference to any implementation details, and
without
 leaving certain details implementation defined is absurd.  I think people need
to
 face the fact that you can't have close-to-the-metal flexibility and
performance
 and at the same time far-from-the-metal strict separation of specification from
 implementation details.

Yes, but I think this is a place where you are doing early optimization at the expense of usability. You can always do the slice-thinguie yourself if you are that needed for speed that you can't afford an extra indirection when using an array (which will be the only overhead if T[new] were implemented as a pure reference type). I think the cost here in terms of usability if far higher than the little performance you loose, specially when you can overcome the issue for those weird cases where you need it.

I think one aspect that tends to be forgotten is that built-in arrays are the lowest abstraction - really just one small step above pointers. Efficiency lost there is lost forever. That's why I consider the current design adequate and the proposed designs less so. In other words: you can build a more convenient facility on top of T[], but you couldn't build a more efficient facility on top of a convenient one. I think D made the right choice with built-in arrays. Andrei
Nov 19 2009
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Denis Koroskin wrote:
 On Fri, 20 Nov 2009 04:13:42 +0300, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Leandro Lucarella wrote:
 dsimcha, el 19 de noviembre a las 23:21 me escribiste:
 Face it, D is a close to the metal language.  Any attempt to define 
 the spec at a
 purely abstract level without reference to any implementation 
 details, and without
 leaving certain details implementation defined is absurd.  I think 
 people need to
 face the fact that you can't have close-to-the-metal flexibility and 
 performance
 and at the same time far-from-the-metal strict separation of 
 specification from
 implementation details.

optimization at the expense of usability. You can always do the slice-thinguie yourself if you are that needed for speed that you can't afford an extra indirection when using an array (which will be the only overhead if T[new] were implemented as a pure reference type). I think the cost here in terms of usability if far higher than the little performance you loose, specially when you can overcome the issue for those weird cases where you need it.

I think one aspect that tends to be forgotten is that built-in arrays are the lowest abstraction - really just one small step above pointers. Efficiency lost there is lost forever. That's why I consider the current design adequate and the proposed designs less so. In other words: you can build a more convenient facility on top of T[], but you couldn't build a more efficient facility on top of a convenient one. I think D made the right choice with built-in arrays. Andrei

Not so true. T[] appends are slow, *very* slow. Whenever I need anything efficient, I always use an additional "size" field (and arr.length is used as capacity). This is tricky, error-prone and not too intuitive to newcomers. As such, it contradicts with all your statements above.

I was speaking somewhat anticipatively; the MRU is slated to accelerate append speed while also improving its safety. I have tried in vain to convince someone to look into implementing it.
 And there is also a hack to make it work a bit faster:
 
 T[] arr;
 arr.length = capacity
 arr.length = 0;
 
 Much to my disgust, having no higher-level abstraction in language core 
 only encourages writing code like this.

I hope that idiom will be put to rest soon.
 TBH, I used to write code like above before, too, but not now. I slowly 
 eliminate everything like this (as well as all the ~= operations) from 
 the code I wrote. It's unpredictable, slow, and leads to hard-to-fix bugs.
 
 In other words, I disagree with your opinion that D made a right choice 
 with built-in pseudo-arrays.

I really meant to say "will have made". Andrei
Nov 19 2009
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
 Andrei Alexandrescu, el 19 de noviembre a las 17:13 me escribiste:
 Leandro Lucarella wrote:
 dsimcha, el 19 de noviembre a las 23:21 me escribiste:
 Face it, D is a close to the metal language.  Any attempt to define the spec
at a
 purely abstract level without reference to any implementation details, and
without
 leaving certain details implementation defined is absurd.  I think people need
to
 face the fact that you can't have close-to-the-metal flexibility and
performance
 and at the same time far-from-the-metal strict separation of specification from
 implementation details.

the expense of usability. You can always do the slice-thinguie yourself if you are that needed for speed that you can't afford an extra indirection when using an array (which will be the only overhead if T[new] were implemented as a pure reference type). I think the cost here in terms of usability if far higher than the little performance you loose, specially when you can overcome the issue for those weird cases where you need it.

arrays are the lowest abstraction - really just one small step above pointers. Efficiency lost there is lost forever. That's why I consider the current design adequate and the proposed designs less so.

I don't think so, slices (a view over somebody elses memory, no appending) should be the lowest level abstraction.
 In other words: you can build a more convenient facility on top of
 T[], but you couldn't build a more efficient facility on top of a
 convenient one. I think D made the right choice with built-in
 arrays.

I don't. Current built-in arrays are both dynamic arrays and slices, which is what messes all up.

Well it also makes arrays/slices quite capable. Right now they only have one problem (which different people ascribe a different gravity to) and are otherwise very expressive. Now think of this: the MRU solves a grave issue with slices. We haven't thought of that for years, and all of a sudden stomping is not an issue anymore. I wouldn't be in the least surprised if someone posts in this group tomorrow that they found a way to make ~= more predictable.
 But I'll drop the subject, once again, for the sake of... well, everybody
 ;)

I encourage you to further your point (while in understanding that your existing arguments and proposals have been understood). This is the best time to make a contribution to D's design. We may as well undefine "~" or "~=" or both for arrays if there's enough arguments and evidence that it's a problem to keep them. When making proposals just keep in mind one important thing: with the context that you (all) have accumulated regarding this discussion, a lot of things seem very easy to understand, which are really weird when just starting with the language. One other thing is that it's all tradeoffs; things that are "good" in all dimensions are in very short supply. With arrays, we may be in the position of choosing a solution that won't please everyone. The question right now is "which 5% of users you will disappoint". As for reviving T[new], I believe that, unless some new insight comes forward, practically that ship has sailed. That doesn't preclude defining a library type that is a more encapsulated array. Fortunately, that can be implemented using T[] because T[] is expressive enough :o). Andrei
Nov 19 2009
parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 19 de noviembre a las 17:13 me escribiste:
 Leandro Lucarella wrote:
 dsimcha, el 19 de noviembre a las 23:21 me escribiste:
 Face it, D is a close to the metal language.  Any attempt to define the





 purely abstract level without reference to any implementation details, and





 leaving certain details implementation defined is absurd.  I think people





 face the fact that you can't have close-to-the-metal flexibility and





 and at the same time far-from-the-metal strict separation of specification from
 implementation details.

the expense of usability. You can always do the slice-thinguie yourself if you are that needed for speed that you can't afford an extra indirection when using an array (which will be the only overhead if T[new] were implemented as a pure reference type). I think the cost here in terms of usability if far higher than the little performance you loose, specially when you can overcome the issue for those weird cases where you need it.

arrays are the lowest abstraction - really just one small step above pointers. Efficiency lost there is lost forever. That's why I consider the current design adequate and the proposed designs less so.

I don't think so, slices (a view over somebody elses memory, no appending) should be the lowest level abstraction.
 In other words: you can build a more convenient facility on top of
 T[], but you couldn't build a more efficient facility on top of a
 convenient one. I think D made the right choice with built-in
 arrays.

I don't. Current built-in arrays are both dynamic arrays and slices, which is what messes all up.

Right now they only have one problem (which different people ascribe a different gravity to) and are otherwise very expressive. Now think of this: the MRU solves a grave issue with slices. We haven't thought of that for years, and all of a sudden stomping is not an issue anymore. I wouldn't be in the least surprised if someone posts in this group tomorrow that they found a way to make ~= more predictable.
 But I'll drop the subject, once again, for the sake of... well, everybody
 ;)

existing arguments and proposals have been understood). This is the best time to make a contribution to D's design. We may as well undefine "~" or "~=" or both for arrays if there's enough arguments and evidence that it's a problem to keep them. When making proposals just keep in mind one important thing: with the context that you (all) have accumulated regarding this discussion, a lot of things seem very easy to understand, which are really weird when just starting with the language. One other thing is that it's all tradeoffs; things that are "good" in all dimensions are in very short supply. With arrays, we may be in the position of choosing a solution that won't please everyone. The question right now is "which 5% of users you will disappoint". As for reviving T[new], I believe that, unless some new insight comes forward, practically that ship has sailed. That doesn't preclude defining a library type that is a more encapsulated array. Fortunately, that can be implemented using T[] because T[] is expressive enough :o). Andrei

I'd be inclined to support the MRU (and even implement it) iff: 1. We still get a good library array type that pulls out all the stops to make appending efficient. (I'm putting my effort where my mouth is and working on writing one pursuant to my ideas about combining it with uniqueness right now, actually). 2. We state in the documentation that ~= on slices is a convenience that isn't terribly efficient and recommend that people that care more about performance than simplicity use the library type for their serious array building, especially for large arrays.
Nov 19 2009
prev sibling parent Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Andrei Alexandrescu Wrote:
 I think one aspect that tends to be forgotten is that built-in arrays 
 are the lowest abstraction - really just one small step above pointers. 
 Efficiency lost there is lost forever. That's why I consider the current 
 design adequate and the proposed designs less so.

Let's look at your proposal from a slightly different perspective. First of all, a conforming implementation is free to do the re-allocation on every append. There are no complexity requirements specified. The controversy stems from the fact that the spec deliberately leaves a window for certain optimizations that might result in the array being write-shared after the append. In that case the spec dictates certain restrictions that are expressed informally as no-stomping. BTW, we can't currently evaluate the proposal until no-stomping is formally defined. Because of that, a program written for and tested on one conforming implementation may not work correctly on another, depending on what optimization strategy is picked. That will probably lead to implementations that use a compile-time switch to disable such optimizations. The bugs resulting from not understanding when write sharing is possible (and I'm afraid a lot of programmers will have problems following the spec in this area) will be subtle. They will also be "soft" (as opposed to "hard" bugs), and those are much harder to discover and track down. All these things have to be carefully weighed. We are in uncharted territory.
Nov 20 2009
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
[snip]
 Andrei has mentioned that he thinks we can store the allocated length in 
 the GC block, which I think would also work.  You also wouldn't need an 
 MRU cache in that case, but he says it's in *addition* to the MRU cache, 
 so I'm not sure what he means.

Reaching the GC block is relatively expensive, so the MRU still helps. In essence it's like this. When appending: a) look up the cache, if there, you have O(1) amortized append that's really fast b) if not in the cache, talk to the GC, still O(1) amortized append that's not as fast Both help providing an important performance guarantee. I was a bit worried about guaranteeing "O(1) amortized append for up to 8 arrays at a time." Andrei
Nov 24 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 On Tue, 24 Nov 2009 11:01:10 -0500, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Steven Schveighoffer wrote:
 [snip]
 Andrei has mentioned that he thinks we can store the allocated length 
 in the GC block, which I think would also work.  You also wouldn't 
 need an MRU cache in that case, but he says it's in *addition* to the 
 MRU cache, so I'm not sure what he means.

Reaching the GC block is relatively expensive, so the MRU still helps. In essence it's like this. When appending: a) look up the cache, if there, you have O(1) amortized append that's really fast b) if not in the cache, talk to the GC, still O(1) amortized append that's not as fast Both help providing an important performance guarantee. I was a bit worried about guaranteeing "O(1) amortized append for up to 8 arrays at a time."

Have you considered the performance impact on allocating non-array types? That is, are you intending on all allocations storing the allocated length, even class or struct allocations who will likely never append? Or will there be a "non-appendable" flag? Also, the part without the MRU cache was my original proposal from last year, I had some ideas on how length could be stored. For example, in a page of up to 128 byte blocks, you only need 8 bits to store the length (alas, you cannot store with 4 bits for 16-byte blocks because you need to cover both 0 and 16). This reduces the overhead for those blocks. For 256 byte to 1-page blocks, 16 bits is acceptable multi-page blocks, the cost of storing a 32-bit integer is negligible.

Having access to the requested length is important at larger lengths, so probably we could be fine by not actually storing it for allocations up to 128 bytes.
 It is true the lookup of the MRU cache will not involve dissecting the 
 address of the block to find it's container block, but you still will 
 need a lock, or are you planning on doing something clever?   I think 
 the locking will be the bottleneck, and if you don't make it the same as 
 the global lock, add the cost of both locks when you actually need to 
 append.

The cache is a thread-local map from pointers to size_t. Using it does not require any locking I think. Andrei
Nov 24 2009
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bill Baxter wrote:
 On Tue, Nov 24, 2009 at 8:26 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Steven Schveighoffer wrote:
 On Tue, 24 Nov 2009 11:01:10 -0500, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 [snip]
 Andrei has mentioned that he thinks we can store the allocated length in
 the GC block, which I think would also work.  You also wouldn't need an MRU
 cache in that case, but he says it's in *addition* to the MRU cache, so I'm
 not sure what he means.

Reaching the GC block is relatively expensive, so the MRU still helps. In essence it's like this. When appending: a) look up the cache, if there, you have O(1) amortized append that's really fast b) if not in the cache, talk to the GC, still O(1) amortized append that's not as fast Both help providing an important performance guarantee. I was a bit worried about guaranteeing "O(1) amortized append for up to 8 arrays at a time."

That is, are you intending on all allocations storing the allocated length, even class or struct allocations who will likely never append? Or will there be a "non-appendable" flag? Also, the part without the MRU cache was my original proposal from last year, I had some ideas on how length could be stored. For example, in a page of up to 128 byte blocks, you only need 8 bits to store the length (alas, you cannot store with 4 bits for 16-byte blocks because you need to cover both 0 and 16). This reduces the overhead for those blocks. For 256 byte to 1-page blocks, 16 bits is acceptable multi-page blocks, the cost of storing a 32-bit integer is negligible.

probably we could be fine by not actually storing it for allocations up to 128 bytes.
 It is true the lookup of the MRU cache will not involve dissecting the
 address of the block to find it's container block, but you still will need a
 lock, or are you planning on doing something clever?   I think the locking
 will be the bottleneck, and if you don't make it the same as the global
 lock, add the cost of both locks when you actually need to append.

require any locking I think.

Wait, what happens if you're sharing an array between two different threads?

The cache does not work with shared arrays, and threads cannot share non-shared arrays. Andrei
Nov 24 2009
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 On Tue, 24 Nov 2009 11:41:17 -0500, Steven Schveighoffer 
 <schveiguy yahoo.com> wrote:
 
 
 The cache is a thread-local map from pointers to size_t. Using it 
 does not require any locking I think.

When reallocating, do you not also need to update the allocated length in the heap, even if the allocation fits into the same block to prevent other threads from stomping? Doesn't that require a lock? Somewhere you have to share the allocated length between threads sharing the same array. I assumed it was going to be in the MRU cache...

If all this is true, I had another thought for an MRU cache. It could simply be a map of pointers to the right blocks in the GC that store the lengths (to avoid the block search with the global mutex locked).

That's a great idea. I keep on dreaming someone will implement all this stuff we talk about :o).
 Lookup should be atomic without locking (I think, simply an integer load 
 right?), but you'd have to lock to actually do an append.
 
 I don't think we solve the lock problem without having multiple heaps...

I don't think we need to worry about optimizing growth of shared arrays. Andrei
Nov 24 2009
next sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Steven Schveighoffer wrote:
 On Tue, 24 Nov 2009 11:41:17 -0500, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:


 The cache is a thread-local map from pointers to size_t. Using it
 does not require any locking I think.

When reallocating, do you not also need to update the allocated length in the heap, even if the allocation fits into the same block to prevent other threads from stomping? Doesn't that require a lock? Somewhere you have to share the allocated length between threads sharing the same array. I assumed it was going to be in the MRU cache...

If all this is true, I had another thought for an MRU cache. It could simply be a map of pointers to the right blocks in the GC that store the lengths (to avoid the block search with the global mutex locked).

stuff we talk about :o).
 Lookup should be atomic without locking (I think, simply an integer load
 right?), but you'd have to lock to actually do an append.

 I don't think we solve the lock problem without having multiple heaps...

Andrei

But if the full semantics of shared are not implemented for D2 (will they be), how do we even detect shared arrays?
Nov 24 2009
prev sibling parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
It's hard to argue whether something is or isn't bug prone and how severe the
bugs could be. An experienced programmer knows all the gotchas so he or she
will never introduce such bugs. A more bug-prone person will be afraid to speak
up in public so as not to appear stupid or be bullied away. It's only when the
language goes into production that the problems surface. Then an experienced
programmer becomes part of a group or not so experienced programmers and he or
she starts seeing how much the rests of the team is prone to certain bugs and
how hard it is to prevent or even detect them. That's why the best languages
attempt to turn as many errors as possible into hard bugs. 

Non-deterministic behavior is a major source of soft bugs. A buggy program
might work correctly 99% of the time, introduce minor errors 0.9% of the time,
and in 0.1% of the time lead to a disaster.

I will try to come up with some scenarios that would qualify for the "100
Gotchas with D arrays" to provide more food for though. Consider these little
language puzzles. 

Imagine you're reviewing this code written by a relative newbee:

char[] quote(char[] word) {
  word.length += 2;
  word[length-1] = '"';
  for(int i = word.length-2; i > 0; --i)
    word[i] = word[i - 1];
  word[0] = '"';
  return word;
}

void storeFoos(char[] buffer, string pattern, Container c)
{
	char[] res = find(buffer, pattern);
	c.store(quote(res[0..pattern.len]);
}

Is this buggy? If so, how and when will the bug manifest itself? 

Or another story. An enthusiastic programmer comes to you with a very clever
rewrite of a function. This is the original:

T[] duplicate(T)(T[] src) {
   static if (is(T == invariant))
      return src.idup;
   else
      return src.dup;
}

This is his improved version:

T[] duplicate(T)(T[] src) {
   T tmp = src[$-1];
   src.length -= 1;
   src ~= tmp;
   return src;
}

No need for static if! The extension is bound to re-allocate because of
stomping rules. Will this always work, or is there a gotcha?
Nov 24 2009
parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Steven Schveighoffer Wrote:
 Imagine you're reviewing this code written by a relative newbee:

 char[] quote(char[] word) {
   word.length += 2;
   word[length-1] = '"';
   for(int i = word.length-2; i > 0; --i)
     word[i] = word[i - 1];
   word[0] = '"';
   return word;
 }

 void storeFoos(char[] buffer, string pattern, Container c)
 {
 	char[] res = find(buffer, pattern);
 	c.store(quote(res[0..pattern.len]);
 }

 Is this buggy? If so, how and when will the bug manifest itself?

In my mind it is not buggy. If quote or storeFoos is not meant to be able to molest the data, it should require a const(char)[]. If this was D1 code, I'd say it depends on the purpose of storeFoos, if storeFoos is expecting to own the input buffer, it's not buggy. If it's not, then storeFoos is buggy (in all cases). But we aren't talking about D1.

Yes, a lot depends on what's expected of storeFoos. Suppose that the surrounding code expects the buffer not to be modified. The newbee didn't want to idup the buffer because in most buffers the pattern is not found.
 Or another story. An enthusiastic programmer comes to you with a very  
 clever rewrite of a function. This is the original:

 T[] duplicate(T)(T[] src) {
    static if (is(T == invariant))
       return src.idup;
    else
       return src.dup;
 }

 This is his improved version:

 T[] duplicate(T)(T[] src) {
    T tmp = src[$-1];
    src.length -= 1;
    src ~= tmp;
    return src;
 }

 No need for static if! The extension is bound to re-allocate because of  
 stomping rules. Will this always work, or is there a gotcha?

He forgot to check for length 0.

That he did. So now he goes back to the drawing board and adds an early return for length 0. Is his code correct now?
Nov 24 2009
parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Steven Schveighoffer Wrote:

 On Tue, 24 Nov 2009 16:52:52 -0500, Bartosz Milewski  
 <bartosz-nospam relisoft.com> wrote:
 
 Steven Schveighoffer Wrote:
 Imagine you're reviewing this code written by a relative newbee:

 char[] quote(char[] word) {
   word.length += 2;
   word[length-1] = '"';
   for(int i = word.length-2; i > 0; --i)
     word[i] = word[i - 1];
   word[0] = '"';
   return word;
 }

 void storeFoos(char[] buffer, string pattern, Container c)
 {
 	char[] res = find(buffer, pattern);
 	c.store(quote(res[0..pattern.len]);
 }

 Is this buggy? If so, how and when will the bug manifest itself?

In my mind it is not buggy. If quote or storeFoos is not meant to be able to molest the data, it should require a const(char)[]. If this was D1 code, I'd say it depends on the purpose of storeFoos, if storeFoos is expecting to own the input buffer, it's not buggy. If it's not, then storeFoos is buggy (in all cases). But we aren't talking about D1.

surrounding code expects the buffer not to be modified. The newbee didn't want to idup the buffer because in most buffers the pattern is not found.

Then he forgot the const decoration for the parameter. No duping is necessary: void storeFoos(const(char)[] buffer, string pattern, Container c) { ... // will result in compiler errors for given quote function, fix that. }

Well, he tried that, but then his quote function wouldn't work and he needs it in other places as well. You know how they are ;-). Besides he thinks "const" is for wimps. He won't stop arguing until you prove to him that there is an actual bug in his code.
 Or another story. An enthusiastic programmer comes to you with a very
 clever rewrite of a function. This is the original:

 T[] duplicate(T)(T[] src) {
    static if (is(T == invariant))
       return src.idup;
    else
       return src.dup;
 }

 This is his improved version:

 T[] duplicate(T)(T[] src) {
    T tmp = src[$-1];
    src.length -= 1;
    src ~= tmp;
    return src;
 }

 No need for static if! The extension is bound to re-allocate because  

 stomping rules. Will this always work, or is there a gotcha?

He forgot to check for length 0.

That he did. So now he goes back to the drawing board and adds an early return for length 0. Is his code correct now?

Yes. I have a feeling you have some trick up your sleeve that you think makes it incorrect :)

Well, I was wondering if it could be rewritten as: T[] duplicate(T)(T[] src) { assert(src.length != 0); T tmp = src[$-1]; src.length -= 1; src.length += 1; src[$-1] = tmp; return src; } and whether it's okay for the optimizer to do some simple arithmetic optimization.
Nov 24 2009
parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Steven Schveighoffer Wrote:


 Bottom line: if a function isn't supposed to change the buffer, the  
 signature should be const for that parameter.  It's one of the principles  
 of const, and why it's in D2 in the first place.  I'd explain to the coder  
 that he is wrong to expect that modifying a buffer in a function that  
 isn't supposed to modify a buffer is acceptable (and hopefully he gets it,  
 or else I don't have time to deal with people who insist on being right  
 when they are not).
 
 BTW, in my experience, the newbie expectaction of ~= is usually that it  
 modifies the original even when it doesn't, not the other way around.
 

The guy insists that the reallocation happens in the quote function (otherwise there would be stomping), so no sharing happens and the original is not modified. He tested it on a variety of inputs! I'm not totally making it up--I had this kind of arguments with C++ programmers. You tell them that it's safer to use const and they laugh at you. "My code works, why should I change it!"
 Second, the optimizer cannot do simple math optimization on src.length -=  


the length calls a function, which cannot be optimized out.

You are again resorting to implementation. I guess in the current implementation it's true that the compiler will indeed insert a call to its internal function. But the language spec does not proscribe that. Would you also argue that this optimization is incorrect? Changing: a.length += 2; a.length += 3; to: a.lenght += 5; I'm not saying that this is an unsolvable problem. I'm just showing how hard it is to reason about D arrays.
Nov 24 2009
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Bartosz Milewski Wrote:

 Steven Schveighoffer Wrote:
 
 
 Bottom line: if a function isn't supposed to change the buffer, the  
 signature should be const for that parameter.  It's one of the principles  
 of const, and why it's in D2 in the first place.  I'd explain to the coder  
 that he is wrong to expect that modifying a buffer in a function that  
 isn't supposed to modify a buffer is acceptable (and hopefully he gets it,  
 or else I don't have time to deal with people who insist on being right  
 when they are not).
 
 BTW, in my experience, the newbie expectaction of ~= is usually that it  
 modifies the original even when it doesn't, not the other way around.
 

The guy insists that the reallocation happens in the quote function (otherwise there would be stomping), so no sharing happens and the original is not modified. He tested it on a variety of inputs! I'm not totally making it up--I had this kind of arguments with C++ programmers. You tell them that it's safer to use const and they laugh at you. "My code works, why should I change it!"

C++ const is kind of a joke :) But in any case, it's trivial to come up with a case that causes buffer issues. If he insists on a test case, then I'd give him one. Programmers like this don't last long at companies. I had a coworker once who insisted that a bug he was working on was going to be impossible to fix, so there was no point in working on it. My boss said that if he (the boss) could fix it, the coworker would be let go. My boss fixed it in one day. And there are *tons* of logic errors that don't cause bugs for most inputs, I don't see why we are focusing on this one case.
 
 Second, the optimizer cannot do simple math optimization on src.length -=  


the length calls a function, which cannot be optimized out.

You are again resorting to implementation. I guess in the current implementation it's true that the compiler will indeed insert a call to its internal function. But the language spec does not proscribe that.

Sure it does. It says that setting the length may reallocate the array to hold enough space. How do you do that without a function call?
 Would you also argue that this optimization is incorrect? Changing:
 
 a.length += 2;
 a.length += 3;
 
 to:
 
 a.lenght += 5;

To the developer? no. To the compiler? yes. The compiler cannot know what affect it will have by making this optimization, it does not know what the underlying function will do or the semantic meaning of those statements. What if adding to a.length outputs it's argument? Then the optimizer would have changed the output from 23 to 5. But the developer who knows the semantic meaning can make the optimization change.
 
 I'm not saying that this is an unsolvable problem. I'm just showing how hard
it is to reason about D arrays. 
 

I will agree that D arrays are hard to explain, I've had my share of training sessions on IRC. It's not an easy concept to get right off, but the benefits of doing arrays this way are huge. -Steve
Nov 25 2009
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Simen kjaeraas Wrote:

 Steven Schveighoffer <schveiguy yahoo.com> wrote:
 
 You are again resorting to implementation. I guess in the current  
 implementation it's true that the compiler will indeed insert a call to  
 its internal function. But the language spec does not proscribe that.

Sure it does. It says that setting the length may reallocate the array to hold enough space. How do you do that without a function call?

As long as the compiler is aware of arrays' existence, and knows how those functions work, there might indeed be an optimization that turns length+=1;length-=1; into a nop. There isn't at the moment, and there might never be, but it's a conceivable optimization.

As long as the spec says changing length may expand the array to hold enough space, the optimizer can't, because the optimization would change the side effects of the function. An optimizer should not change the outcome or side effects of a function. It's not unheard of for an optimizer to eliminate important operations that it thinks are invalid, but in that case, it's a broken optimizer, I don't see why we would add this case. Now, I could see a future optimizer changing a.length += 2; a.length += 3; into a.length += 5; But I don't see a huge gain. Why not just write a.length += 5 in the first place? This point brought up is a non-issue, an optimizer that changes the outcome/side effects of a function is a broken optimizer, end of story. -Steve
Nov 25 2009
next sibling parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Steven Schveighoffer Wrote:

 As long as the spec says changing length may expand the array to hold enough
space, the optimizer can't, because the optimization would change the side
effects of the function.  An optimizer should not change the outcome or side
effects of a function.  It's not unheard of for an optimizer to eliminate
important operations that it thinks are invalid, but in that case, it's a
broken optimizer, I don't see why we would add this case.

This is all true for user-defined data types, but array is totally under compiler's control. What optimizations are acceptable depends only on the interpretation of the language spec. In particular the sequence a.length -= 1; a.length += 1; as a whole does not result in stomping, so why would it be wrong to optimize it away? I claim it's perfectly legitimate.
Nov 25 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bartosz Milewski wrote:
 Steven Schveighoffer Wrote:
 
 As long as the spec says changing length may expand the array to hold enough
space, the optimizer can't, because the optimization would change the side
effects of the function.  An optimizer should not change the outcome or side
effects of a function.  It's not unheard of for an optimizer to eliminate
important operations that it thinks are invalid, but in that case, it's a
broken optimizer, I don't see why we would add this case.

This is all true for user-defined data types, but array is totally under compiler's control. What optimizations are acceptable depends only on the interpretation of the language spec. In particular the sequence a.length -= 1; a.length += 1; as a whole does not result in stomping, so why would it be wrong to optimize it away? I claim it's perfectly legitimate.

Yes, but this gets into optimizing sequences of complex operations, so the question is one of expectancy. What if I insert something there: a.length -= 1; writeln("Why oh why???"); a.length += 1; After all people don't ordinarily expect C++ compilers to optimize away a.resize(a.size() - 1); a.resize(a.size() + 1); Each function has some semantics and effects, and I don't know why we'd want to get down the route that effects depend on the flow in which the function gets called. What is the example trying to accomplish? Andrei
Nov 25 2009
parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Andrei Alexandrescu Wrote:

 Bartosz Milewski wrote:
 Steven Schveighoffer Wrote:
 
 As long as the spec says changing length may expand the array to hold enough
space, the optimizer can't, because the optimization would change the side
effects of the function.  An optimizer should not change the outcome or side
effects of a function.  It's not unheard of for an optimizer to eliminate
important operations that it thinks are invalid, but in that case, it's a
broken optimizer, I don't see why we would add this case.

This is all true for user-defined data types, but array is totally under compiler's control. What optimizations are acceptable depends only on the interpretation of the language spec. In particular the sequence a.length -= 1; a.length += 1; as a whole does not result in stomping, so why would it be wrong to optimize it away? I claim it's perfectly legitimate.

Yes, but this gets into optimizing sequences of complex operations, so the question is one of expectancy. What if I insert something there: a.length -= 1; writeln("Why oh why???"); a.length += 1; After all people don't ordinarily expect C++ compilers to optimize away a.resize(a.size() - 1); a.resize(a.size() + 1); Each function has some semantics and effects, and I don't know why we'd want to get down the route that effects depend on the flow in which the function gets called. What is the example trying to accomplish?

I'm trying to nail down the semantics. Here is the case where there provably is no stomping. If this can be optimized away then the language guarantee has to be formulated accordingly. Right now I don't know what to expect from the code I wrote. Is it a good substitute for duplicate or not? I get worried when the semantics of a very important and essential primitive--the array--is hard to explain and comprehend.
Nov 25 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bartosz Milewski wrote:
 Andrei Alexandrescu Wrote:
 
 Bartosz Milewski wrote:
 Steven Schveighoffer Wrote:
 
 As long as the spec says changing length may expand the array
 to hold enough space, the optimizer can't, because the
 optimization would change the side effects of the function.  An
 optimizer should not change the outcome or side effects of a
 function.  It's not unheard of for an optimizer to eliminate
 important operations that it thinks are invalid, but in that
 case, it's a broken optimizer, I don't see why we would add
 this case.

totally under compiler's control. What optimizations are acceptable depends only on the interpretation of the language spec. In particular the sequence a.length -= 1; a.length += 1; as a whole does not result in stomping, so why would it be wrong to optimize it away? I claim it's perfectly legitimate.

so the question is one of expectancy. What if I insert something there: a.length -= 1; writeln("Why oh why???"); a.length += 1; After all people don't ordinarily expect C++ compilers to optimize away a.resize(a.size() - 1); a.resize(a.size() + 1); Each function has some semantics and effects, and I don't know why we'd want to get down the route that effects depend on the flow in which the function gets called. What is the example trying to accomplish?

I'm trying to nail down the semantics. Here is the case where there provably is no stomping. If this can be optimized away then the language guarantee has to be formulated accordingly. Right now I don't know what to expect from the code I wrote. Is it a good substitute for duplicate or not?

The answer is very simple, and it is present in the book. I have also repeated it here. Appending to an array may or may not terminate an existing sharing relationship, and it never initiates a sharing relationship. Assigning a larger length is equivalent to appending a number of default-initialized values. There is nothing difficult here, although convoluted examples can be created that compound "if"s in complex interactions.
 I get worried when the semantics of a very important and essential
 primitive--the array--is hard to explain and comprehend.

We have a couple of dozens of other things to worry about. There are deadlines and many other more important and more urgent things to work on. The primitive is not important and not essential. It's a primitive, and like all primitives, it can be used to create better abstractions. UDP is a primitive protocol with few guarantees, but TCP is implemented on top of it to provide the guarantees. Simple, efficient primitives are a defensible choice. Andrei
Nov 25 2009
prev sibling parent Steven Schveighoffer <schveiguy yahoo.com> writes:
Bartosz Milewski Wrote:

 Steven Schveighoffer Wrote:
 
 As long as the spec says changing length may expand the array to hold enough
space, the optimizer can't, because the optimization would change the side
effects of the function.  An optimizer should not change the outcome or side
effects of a function.  It's not unheard of for an optimizer to eliminate
important operations that it thinks are invalid, but in that case, it's a
broken optimizer, I don't see why we would add this case.

This is all true for user-defined data types, but array is totally under compiler's control. What optimizations are acceptable depends only on the interpretation of the language spec. In particular the sequence a.length -= 1; a.length += 1; as a whole does not result in stomping, so why would it be wrong to optimize it away? I claim it's perfectly legitimate.

In fact, adding length modifies the contents. For example: auto str = "hello"; str.length -= 1; str.length += 1; assert(str == "hell\xFF"); If the length modifications are optimized out, the assert fails. Looks like an invalid optimization to me. -Steve
Nov 25 2009
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Steven Schveighoffer wrote:
 As long as the spec says changing length may expand the array to hold
 enough space, the optimizer can't, because the optimization would
 change the side effects of the function.  An optimizer should not
 change the outcome or side effects of a function.  It's not unheard
 of for an optimizer to eliminate important operations that it thinks
 are invalid, but in that case, it's a broken optimizer, I don't see
 why we would add this case.

The optimizer is free to change around implementation-defined-behavior and undefined-behavior. For defined-behavior, it can change things around as long as the observer cannot observe a change.
 This point brought up is a non-issue, an optimizer that changes the
 outcome/side effects of a function is a broken optimizer, end of
 story.

It can within the constraints of defined-behavior.
Nov 25 2009
parent Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Walter Bright Wrote:

 Steven Schveighoffer wrote:
 As long as the spec says changing length may expand the array to hold
 enough space, the optimizer can't, because the optimization would
 change the side effects of the function.  An optimizer should not
 change the outcome or side effects of a function.  It's not unheard
 of for an optimizer to eliminate important operations that it thinks
 are invalid, but in that case, it's a broken optimizer, I don't see
 why we would add this case.

The optimizer is free to change around implementation-defined-behavior and undefined-behavior. For defined-behavior, it can change things around as long as the observer cannot observe a change.
 This point brought up is a non-issue, an optimizer that changes the
 outcome/side effects of a function is a broken optimizer, end of
 story.

It can within the constraints of defined-behavior.

So the real question is, is extension defined in terms of re-allocation or in terms of stomping. In the former case, the compiler should not optimize, because re-allocation is the expected side effect. In the latter case, it can eliminate the code, because what counts is the absence of stomping.
Nov 26 2009
prev sibling parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Steven Schveighoffer Wrote:

 Bartosz Milewski Wrote:
 
 Steven Schveighoffer Wrote:
 
 
 Bottom line: if a function isn't supposed to change the buffer, the  
 signature should be const for that parameter.  It's one of the principles  
 of const, and why it's in D2 in the first place.  I'd explain to the coder  
 that he is wrong to expect that modifying a buffer in a function that  
 isn't supposed to modify a buffer is acceptable (and hopefully he gets it,  
 or else I don't have time to deal with people who insist on being right  
 when they are not).
 
 BTW, in my experience, the newbie expectaction of ~= is usually that it  
 modifies the original even when it doesn't, not the other way around.
 

The guy insists that the reallocation happens in the quote function (otherwise there would be stomping), so no sharing happens and the original is not modified. He tested it on a variety of inputs! I'm not totally making it up--I had this kind of arguments with C++ programmers. You tell them that it's safer to use const and they laugh at you. "My code works, why should I change it!"

C++ const is kind of a joke :) But in any case, it's trivial to come up with a case that causes buffer issues. If he insists on a test case, then I'd give him one. Programmers like this don't last long at companies. I had a coworker once who insisted that a bug he was working on was going to be impossible to fix, so there was no point in working on it. My boss said that if he (the boss) could fix it, the coworker would be let go. My boss fixed it in one day. And there are *tons* of logic errors that don't cause bugs for most inputs, I don't see why we are focusing on this one case.

The problem here is that the guy is sort of right. Indeed quote() expands the slice before overwriting it and that requires re-allocation to avoid stomping over the original buffer. There is however a boundary case, when the whole buffer matches the pattern. Then the expansion is done in place and the buffer is modified. This would never happen if expansion were guaranteed to re-allocate. What this example was supposed to illustrate is that it's easy to forget to explicitly duplicate an array, especially if it's not clear who's responsible--the caller or the callee. This is somewhat reminiscent of the situation in C++ and the responsibility to delete an object. Interestingly, in C++ programs you can use unique_ptr's to have deterministic deletion. In D you could also use uniqueness to have deterministic reallocation (or rather avoid it when it's not observable).
Nov 25 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bartosz Milewski wrote:
 Steven Schveighoffer Wrote:
 
 Bartosz Milewski Wrote:

 Steven Schveighoffer Wrote:


 Bottom line: if a function isn't supposed to change the buffer, the  
 signature should be const for that parameter.  It's one of the principles  
 of const, and why it's in D2 in the first place.  I'd explain to the coder  
 that he is wrong to expect that modifying a buffer in a function that  
 isn't supposed to modify a buffer is acceptable (and hopefully he gets it,  
 or else I don't have time to deal with people who insist on being right  
 when they are not).

 BTW, in my experience, the newbie expectaction of ~= is usually that it  
 modifies the original even when it doesn't, not the other way around.


But in any case, it's trivial to come up with a case that causes buffer issues. If he insists on a test case, then I'd give him one. Programmers like this don't last long at companies. I had a coworker once who insisted that a bug he was working on was going to be impossible to fix, so there was no point in working on it. My boss said that if he (the boss) could fix it, the coworker would be let go. My boss fixed it in one day. And there are *tons* of logic errors that don't cause bugs for most inputs, I don't see why we are focusing on this one case.

The problem here is that the guy is sort of right. Indeed quote() expands the slice before overwriting it and that requires re-allocation to avoid stomping over the original buffer. There is however a boundary case, when the whole buffer matches the pattern. Then the expansion is done in place and the buffer is modified. This would never happen if expansion were guaranteed to re-allocate. What this example was supposed to illustrate is that it's easy to forget to explicitly duplicate an array, especially if it's not clear who's responsible--the caller or the callee. This is somewhat reminiscent of the situation in C++ and the responsibility to delete an object. Interestingly, in C++ programs you can use unique_ptr's to have deterministic deletion. In D you could also use uniqueness to have deterministic reallocation (or rather avoid it when it's not observable).

I agree that the underlying array sharing can be often a hassle. This is a problem that goes beyond just expansion - it's all the implied aliasing that goes on whenever you pass arrays around. How about creating a struct Value!T that transforms T (be it an array or a class) into a value type? Then if you use Value!(int[]), you're effectively dealing with values throughout (even though internally they might be COWed). Sometimes I also see a need for DeepValue!T which would e.g. duplicate transitively arrays of arrays and full object trees. For the latter we need some more introspection though. But we have everything in the laguage to make Value and DeepValue work with arrays. What do you think? Andrei
Nov 25 2009
parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Andrei Alexandrescu Wrote:

 How about creating a struct Value!T that transforms T (be it an array or 
 a class) into a value type? Then if you use Value!(int[]), you're 
 effectively dealing with values throughout (even though internally they 
 might be COWed). Sometimes I also see a need for DeepValue!T which would 
 e.g. duplicate transitively arrays of arrays and full object trees. For 
 the latter we need some more introspection though. But we have 
 everything in the laguage to make Value and DeepValue work with arrays.
 
 What do you think?

I'm afraid this would further muddle the message: "If you want safe arrays, use the Value device; if you want to live dangerously, use the built in type." I'd rather see the reverse: D arrays are safe to use. They have the usual reference semantics of static arrays. But if you expand them, the sharing goes away and you get a unique reference to a copy. This is the "no gotcha" semantics, totally predictable, easy to reason about. How the compiler supports that semantics while performing clever optimizations is another story. It's fine if this part is hard. The language can even impose complexity requirements, if you are sure that they are possible to implement (it's enough to have one compliant implementation to prove this point). By the way, what are the library algorithms that rely on O(1) behavior of array append?
Nov 25 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bartosz Milewski wrote:
 Andrei Alexandrescu Wrote:
 
 How about creating a struct Value!T that transforms T (be it an
 array or a class) into a value type? Then if you use Value!(int[]),
 you're effectively dealing with values throughout (even though
 internally they might be COWed). Sometimes I also see a need for
 DeepValue!T which would e.g. duplicate transitively arrays of
 arrays and full object trees. For the latter we need some more
 introspection though. But we have everything in the laguage to make
 Value and DeepValue work with arrays.
 
 What do you think?

I'm afraid this would further muddle the message: "If you want safe arrays, use the Value device; if you want to live dangerously, use the built in type."

I think the message is "If you want values, use Value. If you want slices, use slices." To me that's rather a clear message.
 I'd rather see the reverse: D arrays are safe to
 use.

But that's backwards. You can do Value with slices. You can't do slices with values. The logical primitive to pick is the slice.
 They have the usual reference semantics of static arrays. But if
 you expand them, the sharing goes away and you get a unique reference
 to a copy. This is the "no gotcha" semantics, totally predictable,
 easy to reason about.
 
 How the compiler supports that semantics while performing clever
 optimizations is another story. It's fine if this part is hard. The
 language can even impose complexity requirements, if you are sure
 that they are possible to implement (it's enough to have one
 compliant implementation to prove this point).

Well the problem is we don't have that. It's more difficult to "be fine" if the onus of implementation is on you.
 By the way, what are the library algorithms that rely on O(1)
 behavior of array append?

I don't know, but there are plenty of algorithms out there that rely on that. Andrei
Nov 25 2009
parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Andrei Alexandrescu Wrote:

 Bartosz Milewski wrote:
 Andrei Alexandrescu Wrote:
 
 How about creating a struct Value!T that transforms T (be it an
 array or a class) into a value type? Then if you use Value!(int[]),
 you're effectively dealing with values throughout (even though
 internally they might be COWed). Sometimes I also see a need for
 DeepValue!T which would e.g. duplicate transitively arrays of
 arrays and full object trees. For the latter we need some more
 introspection though. But we have everything in the laguage to make
 Value and DeepValue work with arrays.
 
 What do you think?

I'm afraid this would further muddle the message: "If you want safe arrays, use the Value device; if you want to live dangerously, use the built in type."

I think the message is "If you want values, use Value. If you want slices, use slices." To me that's rather a clear message.
 I'd rather see the reverse: D arrays are safe to
 use.

But that's backwards. You can do Value with slices. You can't do slices with values. The logical primitive to pick is the slice.

I know you don't like analogies, but for me it sounds like an advice to a C++ programmer: if you want to control destruction, use unique_ptr (or shared_ptr). This, BTW, is how I program in C++ because I can't live with memory leaks and double-deletions. What would your advice be to somebody who is learning D as her first language. Don't use arrays directly, use Value!(T[]) ? Should libraries be written in terms of Value!(T[])?
 They have the usual reference semantics of static arrays. But if
 you expand them, the sharing goes away and you get a unique reference
 to a copy. This is the "no gotcha" semantics, totally predictable,
 easy to reason about.
 
 How the compiler supports that semantics while performing clever
 optimizations is another story. It's fine if this part is hard. The
 language can even impose complexity requirements, if you are sure
 that they are possible to implement (it's enough to have one
 compliant implementation to prove this point).

Well the problem is we don't have that. It's more difficult to "be fine" if the onus of implementation is on you.
 By the way, what are the library algorithms that rely on O(1)
 behavior of array append?

I don't know, but there are plenty of algorithms out there that rely on that.

I thought you had concrete examples.
Nov 26 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bartosz Milewski wrote:
 Andrei Alexandrescu Wrote:
 
 Bartosz Milewski wrote:
 Andrei Alexandrescu Wrote:

 How about creating a struct Value!T that transforms T (be it an
 array or a class) into a value type? Then if you use Value!(int[]),
 you're effectively dealing with values throughout (even though
 internally they might be COWed). Sometimes I also see a need for
 DeepValue!T which would e.g. duplicate transitively arrays of
 arrays and full object trees. For the latter we need some more
 introspection though. But we have everything in the laguage to make
 Value and DeepValue work with arrays.

 What do you think?

arrays, use the Value device; if you want to live dangerously, use the built in type."

slices, use slices." To me that's rather a clear message.
 I'd rather see the reverse: D arrays are safe to
 use.

with values. The logical primitive to pick is the slice.

I know you don't like analogies, but for me it sounds like an advice to a C++ programmer: if you want to control destruction, use unique_ptr (or shared_ptr). This, BTW, is how I program in C++ because I can't live with memory leaks and double-deletions. What would your advice be to somebody who is learning D as her first language. Don't use arrays directly, use Value!(T[]) ? Should libraries be written in terms of Value!(T[])?

Some would use that, some would use T[]. For example std.algorithm could derive only little, yet nonzero, use from Value!(T[]).
 They have the usual reference semantics of static arrays. But if
 you expand them, the sharing goes away and you get a unique reference
 to a copy. This is the "no gotcha" semantics, totally predictable,
 easy to reason about.

 How the compiler supports that semantics while performing clever
 optimizations is another story. It's fine if this part is hard. The
 language can even impose complexity requirements, if you are sure
 that they are possible to implement (it's enough to have one
 compliant implementation to prove this point).

if the onus of implementation is on you.
 By the way, what are the library algorithms that rely on O(1)
 behavior of array append?

that.

I thought you had concrete examples.

I do (one famous example is in Windows 3), but I wouldn't think there's a need to argue that point. If we're facing the prospect of an append that always reallocates, we better don't provide it at all. People here have been complaining about ~= being slow because it acquires a lock, even though its complexity is amortized O(1). Andrei
Nov 26 2009
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Bartosz Milewski Wrote:

 The problem here is that the guy is sort of right. Indeed quote() expands the
slice before overwriting it and that requires re-allocation to avoid stomping
over the original buffer. There is however a boundary case, when the whole
buffer matches the pattern. Then the expansion is done in place and the buffer
is modified. This would never happen if expansion were guaranteed to
re-allocate. 

Actually, the requirement for not reallocating is that the slice ends at the end of an allocated block, so for example if the pattern matched at the end of the buffer, you may have problems. The larger problem in the code is that the developer expects reallocation when it is not guaranteed. That is, he is *depending* on reallocation. In that case, you should explicitly reallocate. This doesn't necessarily mean using .dup, the following works just fine: return '"' ~ word ~ '"'; And it actually performs much better than the example. In all honesty, I was sort of trying to play your game for this example. In real life, I would have mentioned that the above line replaces the quote function, and most likely the coder would have accepted it not because it deterministically allocates but because it's cleaner and shorter :)
 
 What this example was supposed to illustrate is that it's easy to forget to
explicitly duplicate an array, especially if it's not clear who's
responsible--the caller or the callee. 

I don't think the example illustrates that. It illustrates that there are convoluted ways to try and explicitly use appending to duplicate arrays, and why one shouldn't do that. -Steve
Nov 25 2009
parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Steve, I don't know about you, but this exchange clarified some things for me.
The major one is that it's dangerous to define the semantics of a language
construct in terms of implementation. You were defending some points using
implementation arguments rather than sticking to defined semantics. We have
found out that one should never rely on the array being re-allocated on
expansion, even if it seems like there's no other way. The only correct
statement is that the freshly expanded part of the array is guaranteed not to
be write-shared with any other array. 

However, this discussion veered away from a more important point. I don't
believe that programmers will consciously make assumptions about re-allocation
breaking sharing. 

The danger is that it's easy to miss accidental sharing and it's very hard to
test for it. 

That was my original point and I haven't been convinced otherwise.


 Bartosz Milewski Wrote:
 
 The problem here is that the guy is sort of right. Indeed quote() expands the
slice before overwriting it and that requires re-allocation to avoid stomping
over the original buffer. There is however a boundary case, when the whole
buffer matches the pattern. Then the expansion is done in place and the buffer
is modified. This would never happen if expansion were guaranteed to
re-allocate. 

Actually, the requirement for not reallocating is that the slice ends at the end of an allocated block, so for example if the pattern matched at the end of the buffer, you may have problems. The larger problem in the code is that the developer expects reallocation when it is not guaranteed. That is, he is *depending* on reallocation. In that case, you should explicitly reallocate. This doesn't necessarily mean using .dup, the following works just fine: return '"' ~ word ~ '"'; And it actually performs much better than the example. In all honesty, I was sort of trying to play your game for this example. In real life, I would have mentioned that the above line replaces the quote function, and most likely the coder would have accepted it not because it deterministically allocates but because it's cleaner and shorter :)
 
 What this example was supposed to illustrate is that it's easy to forget to
explicitly duplicate an array, especially if it's not clear who's
responsible--the caller or the callee. 

I don't think the example illustrates that. It illustrates that there are convoluted ways to try and explicitly use appending to duplicate arrays, and why one shouldn't do that. -Steve

Nov 26 2009
parent Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Steven Schveighoffer Wrote:

 The danger is that it's easy to miss accidental sharing and it's very  
 hard to test for it.

I think this danger is rare, and it's easy to search for (just search for ~= in your code, I did it with Tango). I think it can be very well defined in a tutorial or book chapter.

This might just be it. Experienced D programmers will avoid using ~= . It's the newbies who will be stumbling into these problems.
Dec 01 2009
prev sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Tue, Nov 24, 2009 at 8:26 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Steven Schveighoffer wrote:
 On Tue, 24 Nov 2009 11:01:10 -0500, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 [snip]
 Andrei has mentioned that he thinks we can store the allocated length =




 the GC block, which I think would also work. =A0You also wouldn't need=




 cache in that case, but he says it's in *addition* to the MRU cache, s=




 not sure what he means.

[snip] Reaching the GC block is relatively expensive, so the MRU still helps. =



 essence it's like this. When appending:

 a) look up the cache, if there, you have O(1) amortized append that's
 really fast

 b) if not in the cache, talk to the GC, still O(1) amortized append
 that's not as fast

 Both help providing an important performance guarantee. I was a bit
 worried about guaranteeing "O(1) amortized append for up to 8 arrays at=



 time."

Have you considered the performance impact on allocating non-array types=


 =A0That is, are you intending on all allocations storing the allocated l=


 even class or struct allocations who will likely never append? =A0Or wil=


 there be a "non-appendable" flag?

 Also, the part without the MRU cache was my original proposal from last
 year, I had some ideas on how length could be stored. =A0For example, in=


 page of up to 128 byte blocks, you only need 8 bits to store the length
 (alas, you cannot store with 4 bits for 16-byte blocks because you need =


 cover both 0 and 16). =A0This reduces the overhead for those blocks. =A0=


 byte to 1-page blocks, 16 bits is acceptable multi-page blocks, the cost=


 storing a 32-bit integer is negligible.

Having access to the requested length is important at larger lengths, so probably we could be fine by not actually storing it for allocations up t=

 128 bytes.

 It is true the lookup of the MRU cache will not involve dissecting the
 address of the block to find it's container block, but you still will ne=


 lock, or are you planning on doing something clever? =A0 I think the loc=


 will be the bottleneck, and if you don't make it the same as the global
 lock, add the cost of both locks when you actually need to append.

The cache is a thread-local map from pointers to size_t. Using it does no=

 require any locking I think.

Wait, what happens if you're sharing an array between two different threads= ? The MRU in my thread might be out of date. So some kind of cache flushing or snooping seems required. And that means a lock. It's the same cache consistency problem faced by multiple CPU cores sharing one system memory. --bb
Nov 24 2009
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
dsimcha, el 20 de noviembre a las 03:35 me escribiste:
 2.  We state in the documentation that ~= on slices is a convenience that isn't
 terribly efficient and recommend that people that care more about performance
than
 simplicity use the library type for their serious array building, especially
for
 large arrays.

I think this is a really bad idea, it's so obvious dynamic arrays are broken when you have to say in the documentation "you have this operation, but don't use it because it's broken". -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Si ella es la flor, yo soy la espina Si ella es quien florece, yo quien se marchita Y estamos en eclipse total, y estamos en eclipse total Completamente cruzados, completamente cruzados
Nov 20 2009
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 24 Nov 2009 16:52:52 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 Steven Schveighoffer Wrote:
 Imagine you're reviewing this code written by a relative newbee:

 char[] quote(char[] word) {
   word.length += 2;
   word[length-1] = '"';
   for(int i = word.length-2; i > 0; --i)
     word[i] = word[i - 1];
   word[0] = '"';
   return word;
 }

 void storeFoos(char[] buffer, string pattern, Container c)
 {
 	char[] res = find(buffer, pattern);
 	c.store(quote(res[0..pattern.len]);
 }

 Is this buggy? If so, how and when will the bug manifest itself?

In my mind it is not buggy. If quote or storeFoos is not meant to be able to molest the data, it should require a const(char)[]. If this was D1 code, I'd say it depends on the purpose of storeFoos, if storeFoos is expecting to own the input buffer, it's not buggy. If it's not, then storeFoos is buggy (in all cases). But we aren't talking about D1.

surrounding code expects the buffer not to be modified. The newbee didn't want to idup the buffer because in most buffers the pattern is not found.

Then he forgot the const decoration for the parameter. No duping is necessary: void storeFoos(const(char)[] buffer, string pattern, Container c) { ... // will result in compiler errors for given quote function, fix that. }
 Or another story. An enthusiastic programmer comes to you with a very
 clever rewrite of a function. This is the original:

 T[] duplicate(T)(T[] src) {
    static if (is(T == invariant))
       return src.idup;
    else
       return src.dup;
 }

 This is his improved version:

 T[] duplicate(T)(T[] src) {
    T tmp = src[$-1];
    src.length -= 1;
    src ~= tmp;
    return src;
 }

 No need for static if! The extension is bound to re-allocate because  

 stomping rules. Will this always work, or is there a gotcha?

He forgot to check for length 0.

That he did. So now he goes back to the drawing board and adds an early return for length 0. Is his code correct now?

Yes. I have a feeling you have some trick up your sleeve that you think makes it incorrect :) -Steve
Nov 24 2009
prev sibling next sibling parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Andrei Alexandrescu Wrote:


 One thing that Bartosz didn't mention is that "this unique" is different 
 than "the other unique", which I think reveals the half-bakedness of the 
 proposal. "The other unique" that was intensively discussed before is 
 transitive, meaning that the root reference points to a graph of objects 
 having no other connection to the outer world. "This unique" is very 
 different because it's shallow - only the array chunk is unaliased, but 
 the members of the array don't need to be. This is a very important 
 distinction.

Actually I was thinking of deep unique, but understandably I neither described full details of my proposal, nor have I completely "baked" it, which would require some serious work and a lot more discussion. One of these days I will blog about uniqueness in more detail. The problem of transitivity with respect to arrays has been discussed before in the context of const and immutable. We realized that shallow constness/immutability would be useful in some scenarios but, because of the added complexity, we punted it. My feeling is that, similarly, in most cases the restriction that a unique array may only hold unique objects is not unreasonable.
Nov 18 2009
parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Steven Schveighoffer Wrote:

 I wonder what your expert opinion is:  Is something like unique possible  
 without introducing all the other type modifiers (like 'lent', etc.)?  If  
 not, what are the minimum type modifiers you'd need to get unique to  
 work?  I really like the unique concept, and think it is really one of the  
 biggest holes missing from the const system (aside from bug 1961).

The most complete description of "unique" that I've seen is in Haller/Odersky Scala paper "Capabilities for External Uniqueness." It's not an easy paper to read, but I'm planning on writing a blog post in which I'll try to make it more palatable. They use annotations to extend the type system, something that was pioneered in Java. The recent introduction of annotations (like property) in D makes this approach quite attractive. They used the following annotations: unique, transient, and exposed ( transient loosely corresponds to "lent"). Don't get me wrong, the full specification of uniqueness is non-trivial and I wouldn't propose it for D just to solve the array expansion problem. The real power of "unique" shows in multithreaded programming where you can pass unique objects between threads and safely operate on them without any synchronization.
Nov 19 2009
parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?

Can this question be answered without the recourse to implementation, and the MRU cache in particular? I thought about an abstract definition of "stomping".Something like that: The expanded part of the array is never implicitly shared with any other array. But I'm not sure about this case: int[] a = [1, 2, 3]; int[] b = a[0..1]; b ~= 4; what is a[1]? By my definition, this would be considered stomping, but I couldn't find this example in TDPL. I'm also not sure what to make of this statement: "client code is guaranteed to benefit of good average performance over a large number of appends to the same array". Since "good" is open to interpretation, such a guarantee is meaningless. If it is meant as "amortized O(1)", it's impossible to fulfill, unless the MRU cache is infinite and a lookup is O(1). Then there is the problem of multithreading. A global MRU cache would require potentially expensive synchronization. A thread-local cache, on the other hand, makes thread creation even more heavy-weight than it is right now. My prediction is that D will be evolving towards user-level threads and fine granularity concurrency, which requires very small thread footprint.
Nov 21 2009
next sibling parent reply Ali Cehreli <acehreli yahoo.com> writes:
Bartosz Milewski Wrote:

 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?

Can this question be answered without the recourse to implementation, and the MRU cache in particular? I thought about an abstract definition of "stomping".Something like that: The expanded part of the array is never implicitly shared with any other array. But I'm not sure about this case:

It is natural that a language like D has dynamic arrays (as core features or as library implentations). I think that's why we keep using the word "array" to describe slices. (No, I don't have proof for this theory.) In *current* D, the user code does not handle dynamic arrays. Both 'a' and 'b' below are slices.
 int[] a = [1, 2, 3];
 int[] b = a[0..1];
 b ~= 4;
 what is a[1]?
 
 By my definition, this would be considered stomping, but I couldn't find this
example in TDPL. 

Neither 'a' nor 'b' own the elements; in that sense, if one is a slice then the other is a slice... In D, slices have "discretionary sharing semantics." Any slice may leave the sharing contract as it sees unfit. Ali
Nov 21 2009
next sibling parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Ali Cehreli Wrote:

 Neither 'a' nor 'b' own the elements; in that sense, if one is a slice then
the other is a slice... In D, slices have "discretionary sharing semantics."
Any slice may leave the sharing contract as it sees unfit.

So is your answer different from that of dsimcha? I'm sitting on a fence about it. Andrei, could you settle this issue (and a few others I raised before)?
Nov 21 2009
next sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Bartosz Milewski (bartosz-nospam relisoft.com)'s article
 Ali Cehreli Wrote:
 Neither 'a' nor 'b' own the elements; in that sense, if one is a slice then


slice may leave the sharing contract as it sees unfit.
 So is your answer different from that of dsimcha? I'm sitting on a fence about

I think the confusion stems from Ali Cehreli describing things as currently implemented and me describing things as proposed under the MRU cache scheme.
Nov 21 2009
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bartosz Milewski wrote:
 Ali Cehreli Wrote:
 
 Neither 'a' nor 'b' own the elements; in that sense, if one is a
 slice then the other is a slice... In D, slices have "discretionary
 sharing semantics." Any slice may leave the sharing contract as it
 sees unfit.

So is your answer different from that of dsimcha? I'm sitting on a fence about it. Andrei, could you settle this issue (and a few others I raised before)?

I'm not sure I figure what the issue is, and I'd appreciate a rehash of the few other issues brought up. (I thought they had been responded to.) David's reply is valid but may be a bit misleading because it explains things in terms of the cache, which could reinforce the impression that the cache is part of the definition. Ali's point is also correct in the sense that arrays are not owned. Garbage collection makes that possible. It's very simple, really. Appending to an array never results in overwriting elements of an existing array. Appending may terminate a sharing relationship. This is simple, easy to understand, and requires no understanding of the optimization paraphernalia, which I hope to be able to improve. Andrei
Nov 22 2009
next sibling parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Andrei Alexandrescu Wrote:

 I'm not sure I figure what the issue is, and I'd appreciate a rehash of 
 the few other issues brought up. (I thought they had been responded to.) 

Some of the points are in my message from Sat, 05:19 pm. Especially this point: Can a conforming implementation simply re-allocate on every expansion? You make it sound like that would violate some performance guarantees but there are no specifics.
Nov 22 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bartosz Milewski wrote:
 Andrei Alexandrescu Wrote:
 
 I'm not sure I figure what the issue is, and I'd appreciate a
 rehash of the few other issues brought up. (I thought they had been
 responded to.)

Some of the points are in my message from Sat, 05:19 pm. Especially this point: Can a conforming implementation simply re-allocate on every expansion? You make it sound like that would violate some performance guarantees but there are no specifics.

I don't think a conforming implementation is allowed to do that. The functioning of many algorithms assumes constant-time append. If D can't guarantee that, those algorithms won't work. I've been deliberately vague in the book because I wasn't sure we can offer a guarantee if the cache is filled. Now I know we can: if a memory block stores the requested size in addition to today's effective capacity, then the allocator can overallocate growing arrays and then detect and correctly handle in-place reallocation requests, even when the MRU cache capacity is exceeded. There would be a larger cost for that (acquiring a lock), but it's still a constant amortized per-element cost. Andrei
Nov 23 2009
parent reply Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Andrei Alexandrescu Wrote:


 Some of the points are in my message from Sat, 05:19 pm. Especially
 this point: Can a conforming implementation simply re-allocate on
 every expansion? You make it sound like that would violate some
 performance guarantees but there are no specifics.

I don't think a conforming implementation is allowed to do that. The functioning of many algorithms assumes constant-time append. If D can't guarantee that, those algorithms won't work. I've been deliberately vague in the book because I wasn't sure we can offer a guarantee if the cache is filled. Now I know we can: if a memory block stores the requested size in addition to today's effective capacity, then the allocator can overallocate growing arrays and then detect and correctly handle in-place reallocation requests, even when the MRU cache capacity is exceeded. There would be a larger cost for that (acquiring a lock), but it's still a constant amortized per-element cost.

I wish we had some performance numbers to support this optimization. There's always a break-even point between O(1) and O(N) and I wonder at what N it happens. Also, how often does it happen that the cache is full--whether it's normal state or an exception. Hitting the cache and the heap lock on every extension will also break locality. There are so many factors...
Nov 23 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bartosz Milewski wrote:
 Andrei Alexandrescu Wrote:
 
 
 Some of the points are in my message from Sat, 05:19 pm. Especially
 this point: Can a conforming implementation simply re-allocate on
 every expansion? You make it sound like that would violate some
 performance guarantees but there are no specifics.

functioning of many algorithms assumes constant-time append. If D can't guarantee that, those algorithms won't work. I've been deliberately vague in the book because I wasn't sure we can offer a guarantee if the cache is filled. Now I know we can: if a memory block stores the requested size in addition to today's effective capacity, then the allocator can overallocate growing arrays and then detect and correctly handle in-place reallocation requests, even when the MRU cache capacity is exceeded. There would be a larger cost for that (acquiring a lock), but it's still a constant amortized per-element cost.

I wish we had some performance numbers to support this optimization. There's always a break-even point between O(1) and O(N) and I wonder at what N it happens. Also, how often does it happen that the cache is full--whether it's normal state or an exception. Hitting the cache and the heap lock on every extension will also break locality. There are so many factors...

Very true. It would be great if you or someone else interested could find the time to run a few tests. I am committed way over the top and simply cannot look into this. Andrei
Nov 23 2009
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Bartosz Milewski wrote:
 Andrei Alexandrescu Wrote:


 Some of the points are in my message from Sat, 05:19 pm. Especially
 this point: Can a conforming implementation simply re-allocate on
 every expansion? You make it sound like that would violate some
 performance guarantees but there are no specifics.

functioning of many algorithms assumes constant-time append. If D can't guarantee that, those algorithms won't work. I've been deliberately vague in the book because I wasn't sure we can offer a guarantee if the cache is filled. Now I know we can: if a memory block stores the requested size in addition to today's effective capacity, then the allocator can overallocate growing arrays and then detect and correctly handle in-place reallocation requests, even when the MRU cache capacity is exceeded. There would be a larger cost for that (acquiring a lock), but it's still a constant amortized per-element cost.

I wish we had some performance numbers to support this optimization. There's


Also, how often does it happen that the cache is full--whether it's normal state or an exception. Hitting the cache and the heap lock on every extension will also break locality. There are so many factors...
 Very true. It would be great if you or someone else interested could
 find the time to run a few tests. I am committed way over the top and
 simply cannot look into this.
 Andrei

But one of my biggest gripes about the current appending implementation, far more important from a practical perspective than any theoretical safety guarantees, is that it doesn't scale to multiple threads. I've experienced this in a real-world program that I translated from someone else's C++ code that used vector::push_back() quite liberally inside core algorithms. (The reason is that it was written more for simplicity/readability than performance, so the person who wrote it never thought to optimize this kind of stuff.) When I did a literal translation of this to D, everything was completely serialized. It didn't even scale to two cores. I had to rewrite significant portions of the code in a way that did not use appending to get it to scale at all. D is supposed to be a language that is geared toward multi-threading. If something as simple as array appending is a threading bottleneck (and it was for me in a real-world program), I don't care if it is O(1) because effectively it has a constant the size of Jupiter.
Nov 23 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bill Baxter wrote:
 On Mon, Nov 23, 2009 at 2:09 PM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 On Mon, 23 Nov 2009 16:33:42 -0500, dsimcha <dsimcha yahoo.com> wrote:

 But one of my biggest gripes about the current appending implementation,
 far more
 important from a practical perspective than any theoretical safety
 guarantees, is
 that it doesn't scale to multiple threads.  I've experienced this in a
 real-world
 program that I translated from someone else's C++ code that used
 vector::push_back() quite liberally inside core algorithms.  (The reason
 is that
 it was written more for simplicity/readability than performance, so the
 person who
 wrote it never thought to optimize this kind of stuff.)  When I did a
 literal
 translation of this to D, everything was completely serialized.  It didn't
 even
 scale to two cores.  I had to rewrite significant portions of the code in
 a way
 that did not use appending to get it to scale at all.

 D is supposed to be a language that is geared toward multi-threading.  If
 something as simple as array appending is a threading bottleneck (and it
 was for
 me in a real-world program), I don't care if it is O(1) because
 effectively it has
 a constant the size of Jupiter.

you need is a library type that does not require a global lock for appending. Such is not the case for every example of appending in D, and the existence of the current D arrays does not prevent you from adding more specialized types. There are other costs to using such types besides the global lock, such as storing the capacity, and if you are slicing, locally storing the slice extents.

That's likely to be far too big a limitation for the upcoming world of parallelism everywhere. But the single-threaded GC is the real culprit. Fix that and arrays shouldn't be a problem either.

I agree. Overall, our eyes should be at this point focused on principle matters rather than matters of optimizations that can be done but haven't been done yet. Without being 100% sure, I conjecture that array primitives can be implemented efficiently if we give the implementation the freedom to end sharing for ~=. Also I am not ruling out the possibility that we can guarantee a ~= that never keeps sharing with existing arrays. (One idea I discussed with Walter was encoding uniqueness of arrays in a bit stored with the array limits.) Andrei
Nov 23 2009
parent Bartosz Milewski <bartosz-nospam relisoft.com> writes:
Andrei Alexandrescu Wrote:
 sharing for ~=. Also I am not ruling out the possibility that we can 
 guarantee a ~= that never keeps sharing with existing arrays. (One idea 
 I discussed with Walter was encoding uniqueness of arrays in a bit 
 stored with the array limits.)

I've been thinking the same. Full uniqueness might be hard to implement, but if you are only interested in "opportunistic uniqueness" as an optimization tool, that might be easier. In full uniqueness you worry about lending and deep aliases. Here you might just conservatively decide that passing an array to a function, or storing it in an object, destroys its uniqueness (on top of explicit slicing). And you don't have to care about deep aliases as long as your only interest is in the uniqueness of the array buffer and not the objects it might be storing. Uniqueness one way or another is the key.
Nov 24 2009
prev sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2009-11-22 19:58:37 -0500, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 It's very simple, really. Appending to an array never results in 
 overwriting elements of an existing array. Appending may terminate a 
 sharing relationship. This is simple, easy to understand, and requires 
 no understanding of the optimization paraphernalia, which I hope to be 
 able to improve.

I think it's the "may terminate" part Bartosz (and some others) don't like, since it's hard to predict and may cause bugs that appear in an hard to understand pattern when misused. I'm not exactly sure where to stand on this issue, but I might have a solution that looks like a compromise: use the MRU only for arrays of immutable elements, like strings. Since it's not possible to modify immutable data, it doesn't change the semantics if two slices are aliasing each other or not[^1]. Other arrays could get the more predictable behavior of always reallocating, but an optimizer could still avoid that when static analysis can prove that only one slice points to the data. [^1]: It changes things somewhat if you take the address of elements and start comparing them, but I find it reasonable that taking address of things could reveal implementation details. It could change things if the element type has a postblit function, or a destructor, since you will avoid some copying, but I think that's fine. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Nov 22 2009
prev sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Mon, Nov 23, 2009 at 2:09 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 On Mon, 23 Nov 2009 16:33:42 -0500, dsimcha <dsimcha yahoo.com> wrote:

 But one of my biggest gripes about the current appending implementation,
 far more
 important from a practical perspective than any theoretical safety
 guarantees, is
 that it doesn't scale to multiple threads. =A0I've experienced this in a
 real-world
 program that I translated from someone else's C++ code that used
 vector::push_back() quite liberally inside core algorithms. =A0(The reas=


 is that
 it was written more for simplicity/readability than performance, so the
 person who
 wrote it never thought to optimize this kind of stuff.) =A0When I did a
 literal
 translation of this to D, everything was completely serialized. =A0It di=


 even
 scale to two cores. =A0I had to rewrite significant portions of the code=


 a way
 that did not use appending to get it to scale at all.

 D is supposed to be a language that is geared toward multi-threading. =


 something as simple as array appending is a threading bottleneck (and it
 was for
 me in a real-world program), I don't care if it is O(1) because
 effectively it has
 a constant the size of Jupiter.

It's hard to say that your example is a good fit for the arrays of D. =A0=

 you need is a library type that does not require a global lock for
 appending. =A0Such is not the case for every example of appending in D, a=

 the existence of the current D arrays does not prevent you from adding mo=

 specialized types. =A0There are other costs to using such types besides t=

 global lock, such as storing the capacity, and if you are slicing, locall=

 storing the slice extents.

That's likely to be far too big a limitation for the upcoming world of parallelism everywhere. But the single-threaded GC is the real culprit. Fix that and arrays shouldn't be a problem either. --bb
Nov 23 2009
prev sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Bartosz Milewski (bartosz-nospam relisoft.com)'s article
 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?


 I thought about an abstract definition of "stomping".Something like that: The

I'm not sure about this case:
 int[] a = [1, 2, 3];
 int[] b = a[0..1];
 b ~= 4;
 what is a[1]?
 By my definition, this would be considered stomping, but I couldn't find this

a[1] == 2. The MRU cache stores both the pointer and the length. When you attempt to append to b, the MRU is searched. No array that starts at b.ptr and has length b.length is found. b is reallocated. This is easy to specify: "The efficiency of the ~= operator for dynamic arrays is implementation defined but guaranteed to be no worse than O(N). An implementation may optimize this operation to avoid a reallocation on every append. The current reference implementation does so. However, the observable behavior of performing the operation a ~= b is guaranteed to be equivalent to the observable behavior of performing the operation a = a ~ b." Furthermore, the purpose of language design is to create a usable language first and foremost, and then figure out how to specify it formally and abstractly, not to create a perfectly bulletproof specification first and foremost and then hope that results in a usable language. Again, not saying I particularly like the MRU cache idea (it has its issues at an implementation level) but I think that "perfectly specified" has to take a back seat to "works well in practice".
Nov 21 2009
next sibling parent Bartosz Milewski <bartosz-nospam relisoft.com> writes:
dsimcha Wrote:

 Furthermore, the purpose of language design is to create a usable language
first
 and foremost, and then figure out how to specify it formally and abstractly,
not
 to create a perfectly bulletproof specification first and foremost and then
hope
 that results in a usable language.  Again, not saying I particularly like the
MRU
 cache idea (it has its issues at an implementation level) but I think that
 "perfectly specified" has to take a back seat to "works well in practice".

This might be true at the experimental phase of language design. But I expect more from TDPL, if it's going to be the official definition of the language for the foreseeable future. I'm not expecting it to be as scrupulous as the C++ Standard, but a bit more solid than, "If in doubt, look how it's implemented in DMD--MRU cache and all."
Nov 21 2009
prev sibling parent reply Ali Cehreli <acehreli yahoo.com> writes:
dsimcha Wrote:

 == Quote from Bartosz Milewski (bartosz-nospam relisoft.com)'s article
 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?


 I thought about an abstract definition of "stomping".Something like that: The

I'm not sure about this case:
 int[] a = [1, 2, 3];
 int[] b = a[0..1];
 b ~= 4;
 what is a[1]?
 By my definition, this would be considered stomping, but I couldn't find this

a[1] == 2. The MRU cache stores both the pointer and the length. When you attempt to append to b, the MRU is searched. No array that starts at b.ptr and has length b.length is found. b is reallocated.

Consider there is also c: int[] c = b[0..1]; According to the above definition, after b~=4 'b' would be relocated and b[0] would be disjoint from c[0]. Now consider that a's definition is changed to just [ 1 ]. If I understand this correctly, then b.ptr and b.length would match the original array, and in this case b ~= 4 operation might not relocate b, and b[0] and c[0] would refer to the same element. A change in 'a' would be defining the relation between b and c. Ali
Nov 21 2009
parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Ali Cehreli (acehreli yahoo.com)'s article
 dsimcha Wrote:
 == Quote from Bartosz Milewski (bartosz-nospam relisoft.com)'s article
 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?


 I thought about an abstract definition of "stomping".Something like that: The

I'm not sure about this case:
 int[] a = [1, 2, 3];
 int[] b = a[0..1];
 b ~= 4;
 what is a[1]?
 By my definition, this would be considered stomping, but I couldn't find this

a[1] == 2. The MRU cache stores both the pointer and the length. When you attempt to append to b, the MRU is searched. No array that starts at b.ptr and has length b.length is found. b is reallocated.

int[] c = b[0..1]; According to the above definition, after b~=4 'b' would be relocated and b[0]

 Now consider that a's definition is changed to just [ 1 ].
 If I understand this correctly, then b.ptr and b.length would match the
original

would refer to the same element.
 A change in 'a' would be defining the relation between b and c.
 Ali

You're right. Indexing, unlike appending provides no guarantees about stomping, except that, if none of the arrays involved are resized, then they will point to the same memory block. I don't see this as a problem in practice, because it does not break immutability guarantees, etc. I've been using D for a year and a half and never had a bug caused by that. All you need to do is follow these rules: 1. If you want a slice to point to the same memory as the original array, make sure none of the arrays are resized. 2. If you don't want a slice to point to the same memory as the original array, use .dup. 3. If you don't care either way, then do neither.
Nov 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 23 Nov 2009 18:34:48 -0500, Leandro Lucarella <llucax gmail.com>  
wrote:

 Steven Schveighoffer, el 23 de noviembre a las 15:18 me escribiste:
 On Mon, 23 Nov 2009 11:10:48 -0500, Leandro Lucarella <llucax gmail.com>
 wrote:
The thing is, with realloc() is less likely that you forget that the  

can be copied because it returns the new pointer (that can be the same  

the original pointer). And even in this case, I saw a lot of bugs  

to realloc() misuse (and I made a couple myself).

With slices is much worse.

realloc is worse. If I have multiple aliases to the same data, then I realloc one of those, the others all can become dangling pointers if the runtime decides to move the data.

Well, you are comparing GC vs no-GC, not realloc() vs. slices. I have no intention on following that discussion, I'm just saying that realloc() is less error-prone than slices (and realloc() is error-prone already).

I was originally saying that realloc is similar to appending because it may or may not choose to move the data. The only difference from realloc in slices is that your other aliases to the old data don't become invalidated, so there are no dangling pointers. The GC is partly the cause for that, but also the fact that slices don't deallocate the original data (that would be possible, even in a GC'd environment). Your assertion that realloc is less error prone doesn't make any sense to me. realloc is as error prone as appending slices *and* it can also leave dangling pointers.
 You also cannot realloc data that's not malloc'd but you can append to
 a slice of non-heap data without issue.

How is that? AFAIK slices uses gc_realloc() to do the actual realloc, if that's done in a piece of malloc'ed data it will fail.

I believe step 1 of the append process is to find the GC heap block that contains the slice. If it doesn't, it simply copies the data to a new block. If the original data is stack-allocated or allocated outside the GC, no errors occur AFAIK.
 And even if it were
 true, I don't really see this as a big source of bugs, I really never had
 a bug because I tried to realloc() a non-heap piece of memory or  
 appending
 to a slice of non-GC-heap memory either.

You probably didn't have such bugs because you probably didn't think of doing this in C: void foo(char *arg) { realloc(arg, 10000); } Oops, what if arg isn't malloc'd? But in D this is safe code, and works as you expect, no matter what size arg was to begin with. In fact, tango uses this all the time for buffer optimization. Basically, you pass in a stack buffer, if it's big enough, you save the penalty of heap allocation. If it's not, no big deal, just resize the buffer and the GC takes care of the rest. void foo(char[] arg) { arg.length = 10000; }
 No matter what you do with slice appending in D, you cannot access
 dangling pointers unless you access the slice's ptr field.

Again, that's only because D is GCed, not because slices are great.

We are comparing realloc to slices. You assert slices are worse and I gave you a reason why they are not. You can't just ignore that because comparing slices to realloc requires taking the GC into account.
 Yes, you can run into trouble if you append to a slice and then
 change the original data in the slice, but that is a rare event, and
 it won't result in corrupting memory you didn't already have access
 to (at least, with the MRU cache).

I'm a little lost now. I don't know of what hypothetical D are you talking about.

Here's the deal. You have two very common operations on an array: appending and modification. In practice you either append data or you modify data. Rarely do you append data *and* modify the original data, and the only case I've seen for that is for the buffer optimization trick above. This is the only case where slices might get you "non-deterministic" behavior (and that's only if you still want to use the original array before appending). I have no proof of this except for my experience reading tango code. The two major use cases for appending are, building an array of items using append, and using an array as a buffer. In the first case, you start out with an empty or fully owned array, so no harm. In the second case, there are actually two types of functions that accept buffers, ones that return a slice of the buffer, and ones which don't. The ones that return a slice, you use the slice, not the original buffer. The ones which don't return a slice, you use the original buffer if you expect important data to be there. I don't think there's ever a case where you see a function that takes an array append to the array, then modify the original part of the array *without* returning the result.
 I can't see how the MRU cache can provide any safety. The cache is
 finite, and not all the slices will fit in it, so for those slices that
 are not cached, I don't see how the cache can provide any safety.

In those cases, the slice is reallocated because the runtime isn't sure whether it will result in stomping. The MRU cache is an optimization which defaults to reallocation for safety. Andrei has mentioned that he thinks we can store the allocated length in the GC block, which I think would also work. You also wouldn't need an MRU cache in that case, but he says it's in *addition* to the MRU cache, so I'm not sure what he means.
 Safety can be provided if a ~= b is defined to be semantically the same  
 as
 a = a ~ b, and leaving the MRU cache as an optimization.

Yes, exactly. That is what the MRU cache does. It's no good if the current behavior becomes the fallback, the fallback must be a reallocation.
 In that case we
 agree slices are predictable and a little safer (because stomping is not
 possible). But they are still error prone if you expect them to be a full
 reference type. Being the only entity in D with such semantics, is
 something one can forget very easily and introduce subtle bugs. In this
 case, I really think providing ~= is a bad idea, it's just too error  
 prone
 and doesn't give you anything.

Providing the simple syntax for simple cases is what this does. I think it's worth having in the language. Let's all face it: slices and arrays being the same thing is a new concept that most of us have only seen in D (I don't know many languages, but probably it's been done before). And the hybrid value/reference nature of a slice is definitely new to me. I think it provides great power and flexibility that makes code just work the way you want it to *in most cases*. If you read a lot of the tango code, you see how elegant things can be with slices. But it's always going to be confusing to newcomers. It's like explaining pointers for the first time to someone.
 I still think the best is to just make slices immutable value types (you
 can mutate the data they point to, you just can't modify slices; ptr and
 length), and provide a proper dynamic array type in the library or
 something.

I think you mean slices should only be *shrinkable*, immutable makes no sense, you need to be able to rebind a slice. I don't see the harm in allowing appending as long as it's safe. Having a separate type in the library that optimizes appending and has complete reference semantics is fine with me, just leave slices and arrays alone (well, except the fix for stomping). They *can* live together in the same language. -Steve
Nov 24 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 24 Nov 2009 11:01:10 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 [snip]
 Andrei has mentioned that he thinks we can store the allocated length  
 in the GC block, which I think would also work.  You also wouldn't need  
 an MRU cache in that case, but he says it's in *addition* to the MRU  
 cache, so I'm not sure what he means.

Reaching the GC block is relatively expensive, so the MRU still helps. In essence it's like this. When appending: a) look up the cache, if there, you have O(1) amortized append that's really fast b) if not in the cache, talk to the GC, still O(1) amortized append that's not as fast Both help providing an important performance guarantee. I was a bit worried about guaranteeing "O(1) amortized append for up to 8 arrays at a time."

Have you considered the performance impact on allocating non-array types? That is, are you intending on all allocations storing the allocated length, even class or struct allocations who will likely never append? Or will there be a "non-appendable" flag? Also, the part without the MRU cache was my original proposal from last year, I had some ideas on how length could be stored. For example, in a page of up to 128 byte blocks, you only need 8 bits to store the length (alas, you cannot store with 4 bits for 16-byte blocks because you need to cover both 0 and 16). This reduces the overhead for those blocks. For 256 byte to 1-page blocks, 16 bits is acceptable multi-page blocks, the cost of storing a 32-bit integer is negligible. It is true the lookup of the MRU cache will not involve dissecting the address of the block to find it's container block, but you still will need a lock, or are you planning on doing something clever? I think the locking will be the bottleneck, and if you don't make it the same as the global lock, add the cost of both locks when you actually need to append. -Steve
Nov 24 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 24 Nov 2009 11:26:11 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 On Tue, 24 Nov 2009 11:01:10 -0500, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 [snip]
 Andrei has mentioned that he thinks we can store the allocated length  
 in the GC block, which I think would also work.  You also wouldn't  
 need an MRU cache in that case, but he says it's in *addition* to the  
 MRU cache, so I'm not sure what he means.

Reaching the GC block is relatively expensive, so the MRU still helps. In essence it's like this. When appending: a) look up the cache, if there, you have O(1) amortized append that's really fast b) if not in the cache, talk to the GC, still O(1) amortized append that's not as fast Both help providing an important performance guarantee. I was a bit worried about guaranteeing "O(1) amortized append for up to 8 arrays at a time."

types? That is, are you intending on all allocations storing the allocated length, even class or struct allocations who will likely never append? Or will there be a "non-appendable" flag? Also, the part without the MRU cache was my original proposal from last year, I had some ideas on how length could be stored. For example, in a page of up to 128 byte blocks, you only need 8 bits to store the length (alas, you cannot store with 4 bits for 16-byte blocks because you need to cover both 0 and 16). This reduces the overhead for those blocks. For 256 byte to 1-page blocks, 16 bits is acceptable multi-page blocks, the cost of storing a 32-bit integer is negligible.

Having access to the requested length is important at larger lengths, so probably we could be fine by not actually storing it for allocations up to 128 bytes.

So in other words, appending any array under 128 bytes which does not still sit in the MRU cache causes a reallocation? I'm not sure if it's the right move, but it is an optimization, so maybe it works well. It also handily fixes the problem of requiring allocated lengths for non-array allocations since most would be under 128 bytes.
 It is true the lookup of the MRU cache will not involve dissecting the  
 address of the block to find it's container block, but you still will  
 need a lock, or are you planning on doing something clever?   I think  
 the locking will be the bottleneck, and if you don't make it the same  
 as the global lock, add the cost of both locks when you actually need  
 to append.

The cache is a thread-local map from pointers to size_t. Using it does not require any locking I think.

When reallocating, do you not also need to update the allocated length in the heap, even if the allocation fits into the same block to prevent other threads from stomping? Doesn't that require a lock? Somewhere you have to share the allocated length between threads sharing the same array. I assumed it was going to be in the MRU cache... -Steve
Nov 24 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 24 Nov 2009 11:41:17 -0500, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:


 The cache is a thread-local map from pointers to size_t. Using it does  
 not require any locking I think.

When reallocating, do you not also need to update the allocated length in the heap, even if the allocation fits into the same block to prevent other threads from stomping? Doesn't that require a lock? Somewhere you have to share the allocated length between threads sharing the same array. I assumed it was going to be in the MRU cache...

If all this is true, I had another thought for an MRU cache. It could simply be a map of pointers to the right blocks in the GC that store the lengths (to avoid the block search with the global mutex locked). Lookup should be atomic without locking (I think, simply an integer load right?), but you'd have to lock to actually do an append. I don't think we solve the lock problem without having multiple heaps... -Steve
Nov 24 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 24 Nov 2009 12:46:37 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 On Tue, 24 Nov 2009 11:41:17 -0500, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 The cache is a thread-local map from pointers to size_t. Using it  
 does not require any locking I think.

When reallocating, do you not also need to update the allocated length in the heap, even if the allocation fits into the same block to prevent other threads from stomping? Doesn't that require a lock? Somewhere you have to share the allocated length between threads sharing the same array. I assumed it was going to be in the MRU cache...

simply be a map of pointers to the right blocks in the GC that store the lengths (to avoid the block search with the global mutex locked).

That's a great idea. I keep on dreaming someone will implement all this stuff we talk about :o).
 Lookup should be atomic without locking (I think, simply an integer  
 load right?), but you'd have to lock to actually do an append.
  I don't think we solve the lock problem without having multiple  
 heaps...

I don't think we need to worry about optimizing growth of shared arrays.

What about immutable arrays? Can't those be shared without being marked as shared? -Steve
Nov 24 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 24 Nov 2009 12:46:37 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 On Tue, 24 Nov 2009 11:41:17 -0500, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 The cache is a thread-local map from pointers to size_t. Using it  
 does not require any locking I think.

When reallocating, do you not also need to update the allocated length in the heap, even if the allocation fits into the same block to prevent other threads from stomping? Doesn't that require a lock? Somewhere you have to share the allocated length between threads sharing the same array. I assumed it was going to be in the MRU cache...

simply be a map of pointers to the right blocks in the GC that store the lengths (to avoid the block search with the global mutex locked).

That's a great idea. I keep on dreaming someone will implement all this stuff we talk about :o).

Note this also solves the problem of having to flush the MRU cache on a collection -- the same allocated length property should remain attached to that block. :) -Steve
Nov 24 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 24 Nov 2009 12:46:37 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 Lookup should be atomic without locking (I think, simply an integer  
 load right?), but you'd have to lock to actually do an append.
  I don't think we solve the lock problem without having multiple  
 heaps...

I don't think we need to worry about optimizing growth of shared arrays.

This doesn't compute. The heap is shared, you need to store the allocated length in the heap, you need to lock on every allocate, even for non-shared arrays. What am I missing? Can you update *parts* of the heap without locking? If so, why do we need an MRU cache? -Steve
Nov 24 2009
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 24 Nov 2009 15:00:11 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 It's hard to argue whether something is or isn't bug prone and how  
 severe the bugs could be. An experienced programmer knows all the  
 gotchas so he or she will never introduce such bugs. A more bug-prone  
 person will be afraid to speak up in public so as not to appear stupid  
 or be bullied away. It's only when the language goes into production  
 that the problems surface. Then an experienced programmer becomes part  
 of a group or not so experienced programmers and he or she starts seeing  
 how much the rests of the team is prone to certain bugs and how hard it  
 is to prevent or even detect them. That's why the best languages attempt  
 to turn as many errors as possible into hard bugs.

Yes, but there is a very fuzzy line between preventing all bugs and preventing useful code. The safest most secure system is one that has no inputs or outputs.
 Non-deterministic behavior is a major source of soft bugs. A buggy  
 program might work correctly 99% of the time, introduce minor errors  
 0.9% of the time, and in 0.1% of the time lead to a disaster.

 I will try to come up with some scenarios that would qualify for the  
 "100 Gotchas with D arrays" to provide more food for though. Consider  
 these little language puzzles.

 Imagine you're reviewing this code written by a relative newbee:

 char[] quote(char[] word) {
   word.length += 2;
   word[length-1] = '"';
   for(int i = word.length-2; i > 0; --i)
     word[i] = word[i - 1];
   word[0] = '"';
   return word;
 }

 void storeFoos(char[] buffer, string pattern, Container c)
 {
 	char[] res = find(buffer, pattern);
 	c.store(quote(res[0..pattern.len]);
 }

 Is this buggy? If so, how and when will the bug manifest itself?

In my mind it is not buggy. If quote or storeFoos is not meant to be able to molest the data, it should require a const(char)[]. If this was D1 code, I'd say it depends on the purpose of storeFoos, if storeFoos is expecting to own the input buffer, it's not buggy. If it's not, then storeFoos is buggy (in all cases). But we aren't talking about D1.
 Or another story. An enthusiastic programmer comes to you with a very  
 clever rewrite of a function. This is the original:

 T[] duplicate(T)(T[] src) {
    static if (is(T == invariant))
       return src.idup;
    else
       return src.dup;
 }

 This is his improved version:

 T[] duplicate(T)(T[] src) {
    T tmp = src[$-1];
    src.length -= 1;
    src ~= tmp;
    return src;
 }

 No need for static if! The extension is bound to re-allocate because of  
 stomping rules. Will this always work, or is there a gotcha?

He forgot to check for length 0. Also, you are a bit generous on your use of 'clever' :) -Steve
Nov 24 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 18 Nov 2009 13:21:49 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 Andrei Alexandrescu Wrote:

 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 17 de noviembre a las 18:45 me escribiste:
 3.  If you **really** care about performance, you should only  



 don't know the length in advance.  If you know the length, you  



 pre-allocate.

how the proposal follows from the feedback. My perception is that this is a group of people who use D. Bartosz' concern didn't cause quite a riot, so as far as I can tell there is no big issue at stake.

I didn't say anything (until now) because this was discussed already. Dynamic arrays/slices appending is horribly broken (I know it's well defined, and deterministic, but you are just condemned to make

 it just doesn't work as one would expect, even when you know how it  

 you have to keep fighting your intuition all the time).

In which ways do you think arrays horribly broken? Same as Bartosz mentions? One question is whether you actually have had bugs and problems coding, or if arrays stopped you from getting work done.

What Leandro points out is that using D arrays is not straightforward and may present a steep learning curve. In particular, even the beginner would be required to understand section 4.1.9 of TDPL (Expanding). For many people arrays' behavior might be counter-intuitive and a source of mistakes. In fact, even after reading 4.1.9 I don't know what to expect in some cases. Here's an example: int[] a = [0]; auto b = a; a ~= 1; b ~= 2; What is a[1]? Is this considered "stomping" and requiring a re-allocation?

a[1] would be 1 if an MRU cache was introduced. This is exactly the sort of problem that the MRU cache is supposed to solve. What should happen is: a ~= 1 should allocate in place because its parameters were in the MRU cache. b ~= 1 should reallocate because the a~=1 should have removed its parameters from the MRU cache. I agree that arrays behave differently than what people naively expect. However, their usage is so much better than anything from other languages I've used. One of the biggest "problems" in understanding arrays I've found from having to explain them to other people is that they are half value type and half reference type. I'd say most of the confusion comes from that, but it's that feature that makes them so great IMO. I'm unsure whether introducing a different usage would leave them as great, but I hope we can find a way to make them intuitive *and* keep them as useful as they are now. The implementation-defined determinisim is probably another issue that bites people, but I think the cases are so much rarer. Generally when you are appending *and* manipulating the original data, you have a unique array. Otherwise your usage falls into either read-only + append usage, or buffer usage where you don't usually append. Data-stomping is a memory issue (and a const/invariant issue), so I think that needs the most immediate attention. If it can be solved without changing the syntax or system, then I think it's a win-win. We can always decide to use a new concept later if we want to fix the deterministic behavior. In fact, how hard would it be to introduce a deterministic construct *without* changing current behavior? I mean, if the end result of your proposal is you add something like std::vector, why not still allow appending to slices *and* add something like std::vector where people can use that if they want better behavior? -Steve
Nov 19 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 18 Nov 2009 14:03:11 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 Andrei Alexandrescu Wrote:


 One thing that Bartosz didn't mention is that "this unique" is different
 than "the other unique", which I think reveals the half-bakedness of the
 proposal. "The other unique" that was intensively discussed before is
 transitive, meaning that the root reference points to a graph of objects
 having no other connection to the outer world. "This unique" is very
 different because it's shallow - only the array chunk is unaliased, but
 the members of the array don't need to be. This is a very important
 distinction.

Actually I was thinking of deep unique, but understandably I neither described full details of my proposal, nor have I completely "baked" it, which would require some serious work and a lot more discussion. One of these days I will blog about uniqueness in more detail. The problem of transitivity with respect to arrays has been discussed before in the context of const and immutable. We realized that shallow constness/immutability would be useful in some scenarios but, because of the added complexity, we punted it. My feeling is that, similarly, in most cases the restriction that a unique array may only hold unique objects is not unreasonable.

I wonder what your expert opinion is: Is something like unique possible without introducing all the other type modifiers (like 'lent', etc.)? If not, what are the minimum type modifiers you'd need to get unique to work? I really like the unique concept, and think it is really one of the biggest holes missing from the const system (aside from bug 1961). -Steve
Nov 19 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 19 Nov 2009 14:56:40 +0300, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Wed, 18 Nov 2009 13:21:49 -0500, Bartosz Milewski  
 <bartosz-nospam relisoft.com> wrote:

 Andrei Alexandrescu Wrote:

 Leandro Lucarella wrote:
 Andrei Alexandrescu, el 17 de noviembre a las 18:45 me escribiste:
 3.  If you **really** care about performance, you should only  



 don't know the length in advance.  If you know the length, you  



 pre-allocate.

how the proposal follows from the feedback. My perception is that this is a group of people who use D. Bartosz' concern didn't cause quite a riot, so as far as I can tell there is no big issue at stake.

I didn't say anything (until now) because this was discussed already. Dynamic arrays/slices appending is horribly broken (I know it's well defined, and deterministic, but you are just condemned to make

 it just doesn't work as one would expect, even when you know how it  

 you have to keep fighting your intuition all the time).

In which ways do you think arrays horribly broken? Same as Bartosz mentions? One question is whether you actually have had bugs and problems coding, or if arrays stopped you from getting work done.

What Leandro points out is that using D arrays is not straightforward and may present a steep learning curve. In particular, even the beginner would be required to understand section 4.1.9 of TDPL (Expanding). For many people arrays' behavior might be counter-intuitive and a source of mistakes. In fact, even after reading 4.1.9 I don't know what to expect in some cases. Here's an example: int[] a = [0]; auto b = a; a ~= 1; b ~= 2; What is a[1]? Is this considered "stomping" and requiring a re-allocation?

a[1] would be 1 if an MRU cache was introduced. This is exactly the sort of problem that the MRU cache is supposed to solve. What should happen is: a ~= 1 should allocate in place because its parameters were in the MRU cache. b ~= 1 should reallocate because the a~=1 should have removed its parameters from the MRU cache. I agree that arrays behave differently than what people naively expect. However, their usage is so much better than anything from other languages I've used. One of the biggest "problems" in understanding arrays I've found from having to explain them to other people is that they are half value type and half reference type. I'd say most of the confusion comes from that, but it's that feature that makes them so great IMO. I'm unsure whether introducing a different usage would leave them as great, but I hope we can find a way to make them intuitive *and* keep them as useful as they are now. The implementation-defined determinisim is probably another issue that bites people, but I think the cases are so much rarer. Generally when you are appending *and* manipulating the original data, you have a unique array. Otherwise your usage falls into either read-only + append usage, or buffer usage where you don't usually append. Data-stomping is a memory issue (and a const/invariant issue), so I think that needs the most immediate attention. If it can be solved without changing the syntax or system, then I think it's a win-win. We can always decide to use a new concept later if we want to fix the deterministic behavior. In fact, how hard would it be to introduce a deterministic construct *without* changing current behavior? I mean, if the end result of your proposal is you add something like std::vector, why not still allow appending to slices *and* add something like std::vector where people can use that if they want better behavior? -Steve

Slices are ranges. And ranges are *shrink-only*. Are you going to append to your data structure? Use an Array instead. Slices are not suitable for that purpose!
Nov 19 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 19 Nov 2009 07:32:03 -0500, Denis Koroskin <2korden gmail.com>  
wrote:

 Slices are ranges. And ranges are *shrink-only*. Are you going to append  
 to your data structure? Use an Array instead. Slices are not suitable  
 for that purpose!

Arrays are ranges too. Just because the range interface does not define operations that allow growth doesn't mean that something that implements the interface *can't* add growth methods. I see no problem with slices and arrays being the same type, as long as the stomping problem is solved. -Steve
Nov 19 2009
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, el 19 de noviembre a las 17:13 me escribiste:
 Leandro Lucarella wrote:
dsimcha, el 19 de noviembre a las 23:21 me escribiste:
Face it, D is a close to the metal language.  Any attempt to define the spec at
a
purely abstract level without reference to any implementation details, and
without
leaving certain details implementation defined is absurd.  I think people need
to
face the fact that you can't have close-to-the-metal flexibility and performance
and at the same time far-from-the-metal strict separation of specification from
implementation details.

Yes, but I think this is a place where you are doing early optimization at the expense of usability. You can always do the slice-thinguie yourself if you are that needed for speed that you can't afford an extra indirection when using an array (which will be the only overhead if T[new] were implemented as a pure reference type). I think the cost here in terms of usability if far higher than the little performance you loose, specially when you can overcome the issue for those weird cases where you need it.

I think one aspect that tends to be forgotten is that built-in arrays are the lowest abstraction - really just one small step above pointers. Efficiency lost there is lost forever. That's why I consider the current design adequate and the proposed designs less so.

I don't think so, slices (a view over somebody elses memory, no appending) should be the lowest level abstraction.
 In other words: you can build a more convenient facility on top of
 T[], but you couldn't build a more efficient facility on top of a
 convenient one. I think D made the right choice with built-in
 arrays.

I don't. Current built-in arrays are both dynamic arrays and slices, which is what messes all up. But I'll drop the subject, once again, for the sake of... well, everybody ;) -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- SEÑOR BIELSA: CON TODO RESPETO ¿USTED LO VE JUGAR A RIQUELME? -- Crónica TV
Nov 19 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 20 Nov 2009 04:13:42 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Leandro Lucarella wrote:
 dsimcha, el 19 de noviembre a las 23:21 me escribiste:
 Face it, D is a close to the metal language.  Any attempt to define  
 the spec at a
 purely abstract level without reference to any implementation details,  
 and without
 leaving certain details implementation defined is absurd.  I think  
 people need to
 face the fact that you can't have close-to-the-metal flexibility and  
 performance
 and at the same time far-from-the-metal strict separation of  
 specification from
 implementation details.

optimization at the expense of usability. You can always do the slice-thinguie yourself if you are that needed for speed that you can't afford an extra indirection when using an array (which will be the only overhead if T[new] were implemented as a pure reference type). I think the cost here in terms of usability if far higher than the little performance you loose, specially when you can overcome the issue for those weird cases where you need it.

I think one aspect that tends to be forgotten is that built-in arrays are the lowest abstraction - really just one small step above pointers. Efficiency lost there is lost forever. That's why I consider the current design adequate and the proposed designs less so. In other words: you can build a more convenient facility on top of T[], but you couldn't build a more efficient facility on top of a convenient one. I think D made the right choice with built-in arrays. Andrei

Not so true. T[] appends are slow, *very* slow. Whenever I need anything efficient, I always use an additional "size" field (and arr.length is used as capacity). This is tricky, error-prone and not too intuitive to newcomers. As such, it contradicts with all your statements above. And there is also a hack to make it work a bit faster: T[] arr; arr.length = capacity arr.length = 0; Much to my disgust, having no higher-level abstraction in language core only encourages writing code like this. TBH, I used to write code like above before, too, but not now. I slowly eliminate everything like this (as well as all the ~= operations) from the code I wrote. It's unpredictable, slow, and leads to hard-to-fix bugs. In other words, I disagree with your opinion that D made a right choice with built-in pseudo-arrays.
Nov 19 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 19 Nov 2009 17:58:15 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 Steven Schveighoffer Wrote:

 In fact, even after reading 4.1.9 I don't know what to expect in some
 cases. Here's an example:

 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?

a[1] would be 1 if an MRU cache was introduced. This is exactly the sort of problem that the MRU cache is supposed to solve. What should happen is: a ~= 1 should allocate in place because its parameters were in the MRU cache. b ~= 1 should reallocate because the a~=1 should have removed its parameters from the MRU cache.

Notice that you are using particular implementation detail (MRU cache) to explain the semantics of D arrays. There is a very important distinction between language specification and compiler implementation. Andrei already had to go pretty deep into implementation to describe arrays and slices: You can't define D arrays without talking about shared buffers and memory allocation. I don't think including the description of the MRU cache in the language specification is the right solution. But I'm afraid an abstract definition of "stomping" might turn out to be quite non-trivial.

I haven't yet read all the other posts, so someone else may already have pointed this out but... Having an MRU cache makes it so you *don't* have to explain its semantics (or stomping). Currently there is a paragraph in the spec (complete with example) talking about how stomping can occur, so you can just remove that part. There is no need to talk about the MRU cache when talking about arrays, that's an implementation detail. I was pointing it out because you are already used to the current bad implementation :) I wouldn't even bring up the MRU cache in the book or the spec. You just say that you can append to an array and it may make a copy of the data if necessary. It's just like realloc, except safer. -Steve
Nov 23 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 22 Nov 2009 21:19:15 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 Andrei Alexandrescu Wrote:

 I'm not sure I figure what the issue is, and I'd appreciate a rehash of
 the few other issues brought up. (I thought they had been responded to.)

Some of the points are in my message from Sat, 05:19 pm. Especially this point: Can a conforming implementation simply re-allocate on every expansion? You make it sound like that would violate some performance guarantees but there are no specifics.

My view is that a conforming implementation could reallocate on every expansion. That translates to an MRU cache size of zero. But the MRU cache is just one way to solve the problem, and does not need to be in the language spec. It's an implementation detail. BTW, the more disturbing case than those being discussed, one which violates more than just your expectations is: string x = "hello".idup; auto y = x[0..1]; y ~= "appy"; We must have a solution to this problem, to keep the promise of immutability intact. The other problem of sometimes sharing data is not as important IMO. Basically, we have 2 options for the spec: A. Once you append to an array, you cannot change the data through that array for any other alias to the original array, so x ~= y is equivalent to x = x ~ y. B. Appending to an array may reallocate, so appending to x may allow x to share it's prefix bytes with another array. Whatever implementation satisfies the choice is correct, however optimized you want to make it. For instance, you could use an MRU cache for immutable data only, and still satisfy rule A. I also agree that there should be an array building mechanism, because these rules may not satisfy all usages. -Steve
Nov 23 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 21 Nov 2009 17:19:08 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 int[] a = [0];
 auto b = a;
 a ~= 1;
 b ~= 2;
 What is a[1]?

 Is this considered "stomping" and requiring a re-allocation?

Can this question be answered without the recourse to implementation, and the MRU cache in particular? I thought about an abstract definition of "stomping".Something like that: The expanded part of the array is never implicitly shared with any other array. But I'm not sure about this case: int[] a = [1, 2, 3]; int[] b = a[0..1]; b ~= 4; what is a[1]? By my definition, this would be considered stomping, but I couldn't find this example in TDPL.

I believe that's because Andrei is hoping this problem will be fixed before TDPL comes out...
 I'm also not sure what to make of this statement: "client code is  
 guaranteed to benefit of good average performance over a large number of  
 appends to the same array". Since "good" is open to interpretation, such  
 a guarantee is meaningless. If it is meant as "amortized O(1)", it's  
 impossible to fulfill, unless the MRU cache is infinite and a lookup is  
 O(1).

If a reasonably small size limit is put on the MRU cache (defined by Andrei as 8), then MRU cache lookup is O(1). You still need a lock, but that's not any different than today's requirement, and it certainly wouldn't be any worse than re-allocating for every append. I agree there needs to be a better solution if you want the highest performance, but the simplest and most convenient solution should also perform reasonably well, and shouldn't corrupt data.
 Then there is the problem of multithreading. A global MRU cache would  
 require potentially expensive synchronization. A thread-local cache, on  
 the other hand, makes thread creation even more heavy-weight than it is  
 right now. My prediction is that D will be evolving towards user-level  
 threads and fine granularity concurrency, which requires very small  
 thread footprint.

You mean more heavyweight than allocating a new heap for each thread? The only construct which makes sense to tie individual MRU caches to is a heap, and I think adding the 64 bytes required for an MRU cache to a heap is not that big a deal. -Steve
Nov 23 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 23 Nov 2009 11:10:48 -0500, Leandro Lucarella <llucax gmail.com>
wrote:

 Steven Schveighoffer, el 23 de noviembre a las 07:34 me escribiste:
Notice that you are using particular implementation detail (MRU
cache) to explain the semantics of D arrays. There is a very
important distinction between language specification and compiler
implementation. Andrei already had to go pretty deep into
implementation to describe arrays and slices: You can't define D
arrays without talking about shared buffers and memory allocation.
I don't think including the description of the MRU cache in the
language specification is the right solution. But I'm afraid an
abstract definition of "stomping" might turn out to be quite
non-trivial.

I haven't yet read all the other posts, so someone else may already have pointed this out but... Having an MRU cache makes it so you *don't* have to explain its semantics (or stomping). Currently there is a paragraph in the spec (complete with example) talking about how stomping can occur, so you can just remove that part. There is no need to talk about the MRU cache when talking about arrays, that's an implementation detail. I was pointing it out because you are already used to the current bad implementation :) I wouldn't even bring up the MRU cache in the book or the spec. You just say that you can append to an array and it may make a copy of the data if necessary. It's just like realloc, except safer.

The thing is, with realloc() is less likely that you forget that the data can be copied because it returns the new pointer (that can be the same as the original pointer). And even in this case, I saw a lot of bugs related to realloc() misuse (and I made a couple myself). With slices is much worse.

realloc is worse. If I have multiple aliases to the same data, then I realloc one of those, the others all can become dangling pointers if the runtime decides to move the data. You also cannot realloc data that's not malloc'd but you can append to a slice of non-heap data without issue. No matter what you do with slice appending in D, you cannot access dangling pointers unless you access the slice's ptr field. Yes, you can run into trouble if you append to a slice and then change the original data in the slice, but that is a rare event, and it won't result in corrupting memory you didn't already have access to (at least, with the MRU cache). -Steve
Nov 23 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 23 Nov 2009 16:33:42 -0500, dsimcha <dsimcha yahoo.com> wrote:

 But one of my biggest gripes about the current appending implementation,  
 far more
 important from a practical perspective than any theoretical safety  
 guarantees, is
 that it doesn't scale to multiple threads.  I've experienced this in a  
 real-world
 program that I translated from someone else's C++ code that used
 vector::push_back() quite liberally inside core algorithms.  (The reason  
 is that
 it was written more for simplicity/readability than performance, so the  
 person who
 wrote it never thought to optimize this kind of stuff.)  When I did a  
 literal
 translation of this to D, everything was completely serialized.  It  
 didn't even
 scale to two cores.  I had to rewrite significant portions of the code  
 in a way
 that did not use appending to get it to scale at all.

 D is supposed to be a language that is geared toward multi-threading.  If
 something as simple as array appending is a threading bottleneck (and it  
 was for
 me in a real-world program), I don't care if it is O(1) because  
 effectively it has
 a constant the size of Jupiter.

It's hard to say that your example is a good fit for the arrays of D. What you need is a library type that does not require a global lock for appending. Such is not the case for every example of appending in D, and the existence of the current D arrays does not prevent you from adding more specialized types. There are other costs to using such types besides the global lock, such as storing the capacity, and if you are slicing, locally storing the slice extents. -Steve
Nov 23 2009
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Steven Schveighoffer, el 23 de noviembre a las 15:18 me escribiste:
 On Mon, 23 Nov 2009 11:10:48 -0500, Leandro Lucarella <llucax gmail.com>
 wrote:
 
Steven Schveighoffer, el 23 de noviembre a las 07:34 me escribiste:
Notice that you are using particular implementation detail (MRU
cache) to explain the semantics of D arrays. There is a very
important distinction between language specification and compiler
implementation. Andrei already had to go pretty deep into
implementation to describe arrays and slices: You can't define D
arrays without talking about shared buffers and memory allocation.
I don't think including the description of the MRU cache in the
language specification is the right solution. But I'm afraid an
abstract definition of "stomping" might turn out to be quite
non-trivial.

I haven't yet read all the other posts, so someone else may already have pointed this out but... Having an MRU cache makes it so you *don't* have to explain its semantics (or stomping). Currently there is a paragraph in the spec (complete with example) talking about how stomping can occur, so you can just remove that part. There is no need to talk about the MRU cache when talking about arrays, that's an implementation detail. I was pointing it out because you are already used to the current bad implementation :) I wouldn't even bring up the MRU cache in the book or the spec. You just say that you can append to an array and it may make a copy of the data if necessary. It's just like realloc, except safer.

The thing is, with realloc() is less likely that you forget that the data can be copied because it returns the new pointer (that can be the same as the original pointer). And even in this case, I saw a lot of bugs related to realloc() misuse (and I made a couple myself). With slices is much worse.

realloc is worse. If I have multiple aliases to the same data, then I realloc one of those, the others all can become dangling pointers if the runtime decides to move the data.

Well, you are comparing GC vs no-GC, not realloc() vs. slices. I have no intention on following that discussion, I'm just saying that realloc() is less error-prone than slices (and realloc() is error-prone already).
 You also cannot realloc data that's not malloc'd but you can append to
 a slice of non-heap data without issue.

How is that? AFAIK slices uses gc_realloc() to do the actual realloc, if that's done in a piece of malloc'ed data it will fail. And even if it were true, I don't really see this as a big source of bugs, I really never had a bug because I tried to realloc() a non-heap piece of memory or appending to a slice of non-GC-heap memory either.
 No matter what you do with slice appending in D, you cannot access
 dangling pointers unless you access the slice's ptr field.

Again, that's only because D is GCed, not because slices are great.
 Yes, you can run into trouble if you append to a slice and then
 change the original data in the slice, but that is a rare event, and
 it won't result in corrupting memory you didn't already have access
 to (at least, with the MRU cache).

I'm a little lost now. I don't know of what hypothetical D are you talking about. I can't see how the MRU cache can provide any safety. The cache is finite, and not all the slices will fit in it, so for those slices that are not cached, I don't see how the cache can provide any safety. Safety can be provided if a ~= b is defined to be semantically the same as a = a ~ b, and leaving the MRU cache as an optimization. In that case we agree slices are predictable and a little safer (because stomping is not possible). But they are still error prone if you expect them to be a full reference type. Being the only entity in D with such semantics, is something one can forget very easily and introduce subtle bugs. In this case, I really think providing ~= is a bad idea, it's just too error prone and doesn't give you anything. I still think the best is to just make slices immutable value types (you can mutate the data they point to, you just can't modify slices; ptr and length), and provide a proper dynamic array type in the library or something. -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- The biggest lie you can tell yourself is When I get what I want I will be happy
Nov 23 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 24 Nov 2009 17:35:43 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 Steven Schveighoffer Wrote:

 On Tue, 24 Nov 2009 16:52:52 -0500, Bartosz Milewski
 <bartosz-nospam relisoft.com> wrote:

 Steven Schveighoffer Wrote:
 Imagine you're reviewing this code written by a relative newbee:

 char[] quote(char[] word) {
   word.length += 2;
   word[length-1] = '"';
   for(int i = word.length-2; i > 0; --i)
     word[i] = word[i - 1];
   word[0] = '"';
   return word;
 }

 void storeFoos(char[] buffer, string pattern, Container c)
 {
 	char[] res = find(buffer, pattern);
 	c.store(quote(res[0..pattern.len]);
 }

 Is this buggy? If so, how and when will the bug manifest itself?

In my mind it is not buggy. If quote or storeFoos is not meant to be able to molest the data, it should require a const(char)[]. If this was


 code, I'd say it depends on the purpose of storeFoos, if storeFoos is
 expecting to own the input buffer, it's not buggy.  If it's not, then
 storeFoos is buggy (in all cases).  But we aren't talking about D1.

surrounding code expects the buffer not to be modified. The newbee didn't want to idup the buffer because in most buffers the pattern is not found.

Then he forgot the const decoration for the parameter. No duping is necessary: void storeFoos(const(char)[] buffer, string pattern, Container c) { ... // will result in compiler errors for given quote function, fix that. }

Well, he tried that, but then his quote function wouldn't work and he needs it in other places as well. You know how they are ;-). Besides he thinks "const" is for wimps. He won't stop arguing until you prove to him that there is an actual bug in his code.

I think this warrants more code reviews of this particular developer's code :) Bottom line: if a function isn't supposed to change the buffer, the signature should be const for that parameter. It's one of the principles of const, and why it's in D2 in the first place. I'd explain to the coder that he is wrong to expect that modifying a buffer in a function that isn't supposed to modify a buffer is acceptable (and hopefully he gets it, or else I don't have time to deal with people who insist on being right when they are not). BTW, in my experience, the newbie expectaction of ~= is usually that it modifies the original even when it doesn't, not the other way around.
 Or another story. An enthusiastic programmer comes to you with a  



 clever rewrite of a function. This is the original:

 T[] duplicate(T)(T[] src) {
    static if (is(T == invariant))
       return src.idup;
    else
       return src.dup;
 }





 This is his improved version:

 T[] duplicate(T)(T[] src) {
    T tmp = src[$-1];
    src.length -= 1;
    src ~= tmp;
    return src;
 }

 No need for static if! The extension is bound to re-allocate  



 of
 stomping rules. Will this always work, or is there a gotcha?

He forgot to check for length 0.

That he did. So now he goes back to the drawing board and adds an

 return for length 0. Is his code correct now?

Yes. I have a feeling you have some trick up your sleeve that you think makes it incorrect :)

Well, I was wondering if it could be rewritten as: T[] duplicate(T)(T[] src) { assert(src.length != 0); T tmp = src[$-1]; src.length -= 1; src.length += 1; src[$-1] = tmp; return src; } and whether it's okay for the optimizer to do some simple arithmetic optimization.

First, this code isn't equivalent to the original, because it doesn't work on const and immutable arrays. Second, the optimizer cannot do simple math optimization on src.length -= 1 and src.length += 1 because length is a property, not a field. Setting the length calls a function, which cannot be optimized out. I also thought of some problems with the original code with static if's. Assuming that version, why must this code fail?: class C {} const(C)[] arr; const(C)[] arr2 = arr.duplicate(); Oops, just tried it and no problems! I'll file a bug if it still happens on the latest compiler. -Steve
Nov 24 2009
prev sibling next sibling parent "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Steven Schveighoffer <schveiguy yahoo.com> wrote:

 You are again resorting to implementation. I guess in the current  
 implementation it's true that the compiler will indeed insert a call to  
 its internal function. But the language spec does not proscribe that.

Sure it does. It says that setting the length may reallocate the array to hold enough space. How do you do that without a function call?

As long as the compiler is aware of arrays' existence, and knows how those functions work, there might indeed be an optimization that turns length+=1;length-=1; into a nop. There isn't at the moment, and there might never be, but it's a conceivable optimization. -- Simen
Nov 25 2009
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 26 Nov 2009 17:45:30 -0500, Bartosz Milewski  
<bartosz-nospam relisoft.com> wrote:

 Steve, I don't know about you, but this exchange clarified some things  
 for me. The major one is that it's dangerous to define the semantics of  
 a language construct in terms of implementation. You were defending some  
 points using implementation arguments rather than sticking to defined  
 semantics.

I was defending the semantics by using an example implementation. I was not defining the semantics in terms of implementation. The semantics are defined by the spec, and do not indicate when an array is reallocated and when it is not. That detail is implementation defined. My examples use dmd's implementation to show how the assumption can break. You said the guy needs me to show him that it is broken, and all his tests pass, why can't I use my knowledge of the implementation to come up with an example? I could rewrite my statements as: "You should not rely on the array being reallocated via append, because D does not guarantee such reallocation. Using the reference implementation of dmd, it is possible to come up with an example of where this fails: ..."
 We have found out that one should never rely on the array being  
 re-allocated on expansion, even if it seems like there's no other way.  
 The only correct statement is that the freshly expanded part of the  
 array is guaranteed not to be write-shared with any other array.

I agree with this (except for "even if it seems like there's no other way," The spec says an allocation always occurs when you do a ~ b, so you can always rewrite a ~= b as a = a ~ b). In fact, at one point to avoid stomping I went through Tango and found all places where append could result in stomping, and changed the code this way. There were probably less than 5 instances. Append is not a very common operation when you didn't create the array to begin with.
 However, this discussion veered away from a more important point. I  
 don't believe that programmers will consciously make assumptions about  
 re-allocation breaking sharing.

For the most part, this is ok -- rarely do you see someone append to an array they didn't create *and* modify the original data. My belief is that people will expect more that appending an array *doesn't* reallocate. If you have experience in programming, the language you are used to either treats arrays as value types or as reference types. I don't think I've ever seen a language besides D that uses the hybrid type for arrays. So you are going to come to D expecting value or reference. If you expect value, you should quickly learn that's not the case because 99% of the time, arrays look like reference types. It is natural then to expect appending to an array to affect all other aliases of that array, after all it is a reference type. I just think your examples don't ring true in practice because there are simpler ways to guarantee allocation. You have to go out of your way to write bad code that doesn't work correctly. Finally, it's easy to turn an array into a reference type when passing as a parameter, just use the ref decorator. All we need is a way to turn it into a value type, and I think Andrei's idea of Value!(arr) would be great for that.
 The danger is that it's easy to miss accidental sharing and it's very  
 hard to test for it.

I think this danger is rare, and it's easy to search for (just search for ~= in your code, I did it with Tango). I think it can be very well defined in a tutorial or book chapter. -Steve
Dec 01 2009
prev sibling next sibling parent reply Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, el 17 de noviembre a las 18:45 me escribiste:
3.  If you **really** care about performance, you should only append when you
don't know the length in advance.  If you know the length, you should always
pre-allocate.

We will have collections and all those good things, but I don't see how the proposal follows from the feedback. My perception is that this is a group of people who use D. Bartosz' concern didn't cause quite a riot, so as far as I can tell there is no big issue at stake.

I didn't say anything (until now) because this was discussed already. Dynamic arrays/slices appending is horribly broken (I know it's well defined, and deterministic, but you are just condemned to make mistakes, it just doesn't work as one would expect, even when you know how it works you have to keep fighting your intuition all the time). But it's a little pointless to keep making riots, when we were so close to finally fix this with T[new]/whatever and it all went back. -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Hey you, out there on your own Sitting naked by the phone Would you touch me?
Nov 18 2009
parent Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, el 19 de noviembre a las 18:25 me escribiste:
 Leandro Lucarella wrote:
Andrei Alexandrescu, el 19 de noviembre a las 17:13 me escribiste:
Leandro Lucarella wrote:
dsimcha, el 19 de noviembre a las 23:21 me escribiste:
Face it, D is a close to the metal language.  Any attempt to define the spec at
a
purely abstract level without reference to any implementation details, and
without
leaving certain details implementation defined is absurd.  I think people need
to
face the fact that you can't have close-to-the-metal flexibility and performance
and at the same time far-from-the-metal strict separation of specification from
implementation details.

the expense of usability. You can always do the slice-thinguie yourself if you are that needed for speed that you can't afford an extra indirection when using an array (which will be the only overhead if T[new] were implemented as a pure reference type). I think the cost here in terms of usability if far higher than the little performance you loose, specially when you can overcome the issue for those weird cases where you need it.

arrays are the lowest abstraction - really just one small step above pointers. Efficiency lost there is lost forever. That's why I consider the current design adequate and the proposed designs less so.

I don't think so, slices (a view over somebody elses memory, no appending) should be the lowest level abstraction.
In other words: you can build a more convenient facility on top of
T[], but you couldn't build a more efficient facility on top of a
convenient one. I think D made the right choice with built-in
arrays.

I don't. Current built-in arrays are both dynamic arrays and slices, which is what messes all up.

Well it also makes arrays/slices quite capable.

Yes, capable of introducing bugs specially =)
 Right now they only have one problem (which different people ascribe
 a different gravity to) and are otherwise very expressive.

Nobody is judging the expressiveness of arrays, the bug-prone semantics is the problem.
 Now think of this: the MRU solves a grave issue with slices.

You are using arrays and slices interchangeably on purpose, right? I hate you! =)
 We haven't thought of that for years, and all of a sudden stomping is
 not an issue anymore.

And suddenly you are tying array implementation with the GC. You are making things worse, not better. This whole issue reminds me to this blog: http://thereifixedit.com/ =P
 I wouldn't be in the least surprised if someone posts in this group
 tomorrow that they found a way to make ~= more predictable.

Split slices and arrays, add a capacity field to arrays, remove appending from slices. There.
But I'll drop the subject, once again, for the sake of... well, everybody
;)

I encourage you to further your point (while in understanding that your existing arguments and proposals have been understood). This is the best time to make a contribution to D's design.

I keep replying just because I can't control myself, but I really think it's pointless, I have said (as many others) several times what are the problems and what would be a nice solution. You just find the solution unsuitable and think the cache is good enough.
 We may as well undefine "~" or "~=" or both for arrays if there's
 enough arguments and evidence that it's a problem to keep them.

I think that would be very bad. For once, I thought "~" where out of discussion, because it always create a new temporal (like any other binary operator), so I can't see a problem with it. And I hope you are only seeing as a possibility to remove "~=" from slices, making possible for a library array type to define it. I can totally lived with that. If the array type were imported in object.d it would be even better.
 When making proposals just keep in mind one important thing: with the
 context that you (all) have accumulated regarding this discussion, a lot
 of things seem very easy to understand, which are really weird when just
 starting with the language.

Please, name them! And BTW, understanding the MRU cache (and trying to predict it) will be any easier?
 One other thing is that it's all tradeoffs; things that are "good"
 in all dimensions are in very short supply.

I know that! This discussion is just because some think one tradeoff is better than the other.
 With arrays, we may be in the position of choosing a solution that won't
 please everyone.  The question right now is "which 5% of users you will
 disappoint".

I really think more than 5% of the users will hit the weird half value half reference semantics of slices, but, of course, I'm making this figures up, just like you did ;)
 As for reviving T[new], I believe that, unless some new insight comes
 forward, practically that ship has sailed. That doesn't preclude
 defining a library type that is a more encapsulated array.

I don't care about T[new], I'm fine with a library defined type. Just remove appending from slices (and make .length read-only).
 Fortunately, that can be implemented using T[] because T[] is expressive
 enough :o).

... -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Relax. I'll need some information first. Just the basic facts. Can you show me where it hurts?
Nov 20 2009
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, el 18 de noviembre a las 07:22 me escribiste:
 Leandro Lucarella wrote:
Andrei Alexandrescu, el 17 de noviembre a las 18:45 me escribiste:
3.  If you **really** care about performance, you should only append when you
don't know the length in advance.  If you know the length, you should always
pre-allocate.

how the proposal follows from the feedback. My perception is that this is a group of people who use D. Bartosz' concern didn't cause quite a riot, so as far as I can tell there is no big issue at stake.

I didn't say anything (until now) because this was discussed already. Dynamic arrays/slices appending is horribly broken (I know it's well defined, and deterministic, but you are just condemned to make mistakes, it just doesn't work as one would expect, even when you know how it works you have to keep fighting your intuition all the time).

In which ways do you think arrays horribly broken? Same as Bartosz mentions?

Yes, I think dynamic arrays are grouping 2 separate things: a proper dynamic array able to efficiently appending stuff (without hacks like the size cache) and a slice (a read-only view of a piece of memory); a "random range" if you want to put it in terms of D. A dynamic array should be a reference type, a slice a value type. I explained my view before. I wont explain it with much detail again (I hope you remember the thread).
 One question is whether you actually have had bugs and problems
 coding, or if arrays stopped you from getting work done.

I had many bugs because of the reallocation when appending.
But it's a little pointless to keep making riots, when we were so close
to finally fix this with T[new]/whatever and it all went back.

Hmmmm... resignation is not a trait of this group :o).

It is after a good fight :) From time to time this kind of issues (that are present in the group since I can remember, which is about 4 years maybe) keep popping in the NG. I get tired of repeating myself over and over again (like may others, I guess). -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Hey, it's George. I got nothing to say. -- George Constanza, dejando un mensaje en la contestadora de Jerry
Nov 18 2009
prev sibling next sibling parent reply gzp <galap freemail.hu> writes:
Bartosz Milewski rta:
 dsimcha Wrote:
 
 The one thing that I think has been missing from this discussion is, what
would be
 the alternative if we didn't have this "non-deterministic" reallocation?  How
else
 could you **efficiently** implement dynamic arrays?

In the long run (D3), I proposed using the "unique" type modifier. If an array is unique, the compiler knows that there are no slices to worry about, and it can use in-place reallocation to its heart content. That pretty much solves the performance problem. In the short run (D2), I would suggest sticking to "reallocate on every extension" semantics (especially in SafeD) and provide a library solution (a la C++ std::vector) where the performance of appending is an issue.

It seems as we cannot persuade Andrei and Walter, that it is a really-really bad and dangerous implementation. Everybody knows what is going on when an int[] "array" is resized. It doesn't really matters, how we call it, indeterminism, undefined behaviour, implementation dependent behaviour. But we all know it's a source of a hardly detectable, dangerous bug. As I've said it before it's worse than buffer overrun, as they can be detected using some memory patterns and memory guarding techniques. But this bug cannot be detected, it just occurs once in a while, (like implementing parallel programs without the proper critical sections.) It's hard enough to create good (parallel) programs by itself, so don't harden the work with placing such a trap. So please take a look at the language with a different eye. I know you've worked a lot on it, but I'm afraid such a mistake could ruin the whole language. Sorry. Or at least just think why are there so many posts concerning this feature. int[] is not a range, it is a range + random_copy_on_resize (replace the word random with any of your choice, it is not a matter) I think we can agree that passing "array" int[] as a (non-cont) value type is highly avoidable. Than why do we allow it ??? With std::vector + iterator you always now when it is invalid (when the array is resized). But here you don't have an object whose modification (surly) invalidates the ranges. I'm missing to have reference for the actual array. The semantics is not so important for me (now), but distinct the two object: 1. have array those contains the element + structure, that can be resized: RandomAccessArray!(int) 2. have slices (ranges) of array, where the structure cannot be altered only the elements: Rang!(int) 3. decide if int[] stands for a short version for either RandomAccessArray!(int) or Range!(int), but DO NOT have a mixed meaning. Gzp
Nov 19 2009
parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 19 Nov 2009 18:33:22 +0300, gzp <galap freemail.hu> wrote:

 Bartosz Milewski =C3=ADrta:
 dsimcha Wrote:

 The one thing that I think has been missing from this discussion is,=



 what would be
 the alternative if we didn't have this "non-deterministic"  =



 reallocation?  How else
 could you **efficiently** implement dynamic arrays?



 an array is unique, the compiler knows that there are no slices to  =


 worry about, and it can use in-place reallocation to its heart conten=


 That pretty much solves the performance problem.  In the short run  =


 (D2), I would suggest sticking to "reallocate on every extension"  =


 semantics (especially in SafeD) and provide a library solution (a la =


 C++ std::vector) where the performance of appending is an issue.

It seems as we cannot persuade Andrei and Walter, that it is a =

 really-really bad and dangerous implementation. Everybody knows what i=

 going on when an int[] "array" is resized. It doesn't really matters, =

 how we call it, indeterminism, undefined behaviour, implementation  =

 dependent behaviour. But we all know it's a source of a hardly  =

 detectable, dangerous bug.
 As I've said it before it's worse than buffer overrun, as they can be =

 detected using some memory patterns and memory guarding techniques. Bu=

 this bug cannot be detected, it just occurs once in a while, (like  =

 implementing parallel programs without the proper critical sections.)
 It's hard enough to create good (parallel) programs by itself, so don'=

 harden the work with placing such a trap.

 So please take a look at the language with a different eye. I know  =

 you've worked a lot on it, but I'm afraid such a mistake could ruin th=

 whole language. Sorry. Or at least just think why are there so many  =

 posts concerning this feature.

 int[] is not a range, it is a range + random_copy_on_resize (replace t=

 word random with any of your choice, it is not a matter)

 I think we can agree that passing "array" int[] as a (non-cont) value =

 type is highly avoidable. Than why do we allow it ???

 With std::vector + iterator you always now when it is invalid (when th=

 array is resized). But here you don't have an object whose modificatio=

 (surly) invalidates the ranges. I'm missing to have reference for the =

 actual array.

 The semantics is not so important for me (now), but
 distinct the two object:
 1. have array those contains the element + structure, that can be  =

 resized: RandomAccessArray!(int)
 2. have slices (ranges) of array, where the structure cannot be altere=

 only the elements: Rang!(int)
 3. decide if int[] stands for a short version for either  =

 RandomAccessArray!(int) or Range!(int), but DO NOT have a mixed meanin=

 Gzp

Same here. FWIW, I strongly believe demise of T[new] was a step in a wro= ng = direction, and I feel highly frustrated about it. It was one of the most= = anticipated features that was asked for *years*! And it's gone when it w= as = so close to be implemented... The only more or less reasonable answer why T[new] "sucked" was: On Mon, 19 Oct 2009 01:55:28 +0400, Andrei Alexandrescu = <SeeWebsiteForEmail erdani.org> wrote:
 Returning a distinct type from .dup and ~ makes slices not closed over=

 these operations, a source of complication, confusion, and bloating.

I see no problem returning T[] when a slice is dup'ed or concatenated. = That's what always happened anyway.
Nov 19 2009
parent reply grauzone <none example.net> writes:
Denis Koroskin wrote:
 Same here. FWIW, I strongly believe demise of T[new] was a step in a 
 wrong direction, and I feel highly frustrated about it. It was one of 
 the most anticipated features that was asked for *years*! And it's gone 
 when it was so close to be implemented...
 
 The only more or less reasonable answer why T[new] "sucked" was:
 
 On Mon, 19 Oct 2009 01:55:28 +0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 Returning a distinct type from .dup and ~ makes slices not closed over 
 these operations, a source of complication, confusion, and bloating.

I see no problem returning T[] when a slice is dup'ed or concatenated. That's what always happened anyway.

I think Walter and Andrei just felt like many people do when they have to adapt to a new environment or some other big change: the new thing "sucks", you feel bad, you thought "the past was better", you want go go back. Also, you don't see the advantages, you only get riled up by the disadvantage (because _every_ change has disadvantages; that's only natural). So they just threw it away. Also, I don't quite understand why they find the new semantics caused by the append-caching simpler. Even Bartosz got worried about the undeterministic behavior of this.
Nov 19 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 Denis Koroskin wrote:
 Same here. FWIW, I strongly believe demise of T[new] was a step in a 
 wrong direction, and I feel highly frustrated about it. It was one of 
 the most anticipated features that was asked for *years*! And it's 
 gone when it was so close to be implemented...

 The only more or less reasonable answer why T[new] "sucked" was:

 On Mon, 19 Oct 2009 01:55:28 +0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 Returning a distinct type from .dup and ~ makes slices not closed 
 over these operations, a source of complication, confusion, and 
 bloating.

I see no problem returning T[] when a slice is dup'ed or concatenated. That's what always happened anyway.

I think Walter and Andrei just felt like many people do when they have to adapt to a new environment or some other big change: the new thing "sucks", you feel bad, you thought "the past was better", you want go go back. Also, you don't see the advantages, you only get riled up by the disadvantage (because _every_ change has disadvantages; that's only natural). So they just threw it away. Also, I don't quite understand why they find the new semantics caused by the append-caching simpler. Even Bartosz got worried about the undeterministic behavior of this.

However flattering the attention to Walter's and my psychology could be, it may be difficult to find supporting evidence for the theory above. Andrei
Nov 19 2009
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
dsimcha, el 19 de noviembre a las 23:21 me escribiste:
 Face it, D is a close to the metal language.  Any attempt to define the spec
at a
 purely abstract level without reference to any implementation details, and
without
 leaving certain details implementation defined is absurd.  I think people need
to
 face the fact that you can't have close-to-the-metal flexibility and
performance
 and at the same time far-from-the-metal strict separation of specification from
 implementation details.

Yes, but I think this is a place where you are doing early optimization at the expense of usability. You can always do the slice-thinguie yourself if you are that needed for speed that you can't afford an extra indirection when using an array (which will be the only overhead if T[new] were implemented as a pure reference type). I think the cost here in terms of usability if far higher than the little performance you loose, specially when you can overcome the issue for those weird cases where you need it. -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Pa' ella cociné, pa' ella lavé, pa' ella soñe Paella completa, $2,50 Pero, la luz mala me tira, y yo? yo soy ligero pa'l trote La luz buena, está en el monte, allá voy, al horizonte
Nov 19 2009
prev sibling parent Leandro Lucarella <llucax gmail.com> writes:
Steven Schveighoffer, el 23 de noviembre a las 07:34 me escribiste:
Notice that you are using particular implementation detail (MRU
cache) to explain the semantics of D arrays. There is a very
important distinction between language specification and compiler
implementation. Andrei already had to go pretty deep into
implementation to describe arrays and slices: You can't define D
arrays without talking about shared buffers and memory allocation.
I don't think including the description of the MRU cache in the
language specification is the right solution. But I'm afraid an
abstract definition of "stomping" might turn out to be quite
non-trivial.

I haven't yet read all the other posts, so someone else may already have pointed this out but... Having an MRU cache makes it so you *don't* have to explain its semantics (or stomping). Currently there is a paragraph in the spec (complete with example) talking about how stomping can occur, so you can just remove that part. There is no need to talk about the MRU cache when talking about arrays, that's an implementation detail. I was pointing it out because you are already used to the current bad implementation :) I wouldn't even bring up the MRU cache in the book or the spec. You just say that you can append to an array and it may make a copy of the data if necessary. It's just like realloc, except safer.

The thing is, with realloc() is less likely that you forget that the data can be copied because it returns the new pointer (that can be the same as the original pointer). And even in this case, I saw a lot of bugs related to realloc() misuse (and I made a couple myself). With slices is much worse. -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Si le decía "Vamos al cine, rica" Me decía "Veamos una de Kusturica" Si le decía "Vamos a oler las flores" Me hablaba de Virginia Wolf y sus amores Me hizo mucho mal la cumbiera intelectual
Nov 23 2009