www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Don't use arrays as stacks

reply "Nick Sabalausky" <a a.a> writes:
If anyone cares, I've put up a little thing about why it's best not to use 
D's arrays as stacks. This is drawn from a direct experience a few months 
ago. Figured if I fell into that trap then others might too, so this could 
be helpful for some people. There's a no-login-needed comments section 
already there.

https://www.semitwist.com/articles/article/view/don-t-use-arrays-as-stacks 
Sep 24 2011
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, September 25, 2011 02:37:25 Nick Sabalausky wrote:
 If anyone cares, I've put up a little thing about why it's best not to use
 D's arrays as stacks. This is drawn from a direct experience a few months
 ago. Figured if I fell into that trap then others might too, so this could
 be helpful for some people. There's a no-login-needed comments section
 already there.
 
 https://www.semitwist.com/articles/article/view/don-t-use-arrays-as-stacks

Yeah. If you want to use an array for a stack, it really needs to be wrapped in a struct or class like you'd do in C++. Expanding it when the size of the array needs to increase to accomodate items pushed onto the stack is certainly much easier in D, but trying to use the array as a stack directly is definitely going to cause issues as you show in the article. - Jonathan M Davis
Sep 24 2011
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/25/11 1:37 AM, Nick Sabalausky wrote:
 If anyone cares, I've put up a little thing about why it's best not to use
 D's arrays as stacks. This is drawn from a direct experience a few months
 ago. Figured if I fell into that trap then others might too, so this could
 be helpful for some people. There's a no-login-needed comments section
 already there.

 https://www.semitwist.com/articles/article/view/don-t-use-arrays-as-stacks

Nice piece. http://www.reddit.com/r/programming/comments/kqoz2/dont_use_d_arrays_as_stacks/ Andrei
Sep 24 2011
prev sibling next sibling parent "Nick Sabalausky" <a a.a> writes:
"Nick Sabalausky" <a a.a> wrote in message 
news:j5mi7l$1dtf$1 digitalmars.com...
 If anyone cares, I've put up a little thing about why it's best not to use 
 D's arrays as stacks. This is drawn from a direct experience a few months 
 ago. Figured if I fell into that trap then others might too, so this could 
 be helpful for some people. There's a no-login-needed comments section 
 already there.

 https://www.semitwist.com/articles/article/view/don-t-use-arrays-as-stacks

And don't worry, it's not as long as my contest "article" was ;)
Sep 24 2011
prev sibling next sibling parent Andrew Wiley <wiley.andrew.j gmail.com> writes:
On Sun, Sep 25, 2011 at 1:45 AM, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 On Sunday, September 25, 2011 02:37:25 Nick Sabalausky wrote:
 If anyone cares, I've put up a little thing about why it's best not to use
 D's arrays as stacks. This is drawn from a direct experience a few months
 ago. Figured if I fell into that trap then others might too, so this could
 be helpful for some people. There's a no-login-needed comments section
 already there.

 https://www.semitwist.com/articles/article/view/don-t-use-arrays-as-stacks

Yeah. If you want to use an array for a stack, it really needs to be wrapped in a struct or class like you'd do in C++. Expanding it when the size of the array needs to increase to accomodate items pushed onto the stack is certainly much easier in D, but trying to use the array as a stack directly is definitely going to cause issues as you show in the article. - Jonathan M Davis

Isn't this exactly what assumeSafeAppend is for?
Sep 25 2011
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, September 25, 2011 03:18:29 Andrew Wiley wrote:
 On Sun, Sep 25, 2011 at 1:45 AM, Jonathan M Davis <jmdavisProg gmx.com> 

 On Sunday, September 25, 2011 02:37:25 Nick Sabalausky wrote:
 If anyone cares, I've put up a little thing about why it's best not to
 use D's arrays as stacks. This is drawn from a direct experience a
 few months ago. Figured if I fell into that trap then others might
 too, so this could be helpful for some people. There's a
 no-login-needed comments section already there.
 
 https://www.semitwist.com/articles/article/view/don-t-use-arrays-as-st
 acks> 

wrapped in a struct or class like you'd do in C++. Expanding it when the size of the array needs to increase to accomodate items pushed onto the stack is certainly much easier in D, but trying to use the array as a stack directly is definitely going to cause issues as you show in the article. - Jonathan M Davis

Isn't this exactly what assumeSafeAppend is for?

Sure, you could do that, but simply concatenating to push and then slicing to pop doesn't work. You need to do something like use assumeSafeAppend or just treat the stack as being part of the array instead of the whole. And in either case, if the array isn't wrapped in a struct or class, you're asking for trouble. In the case where you're just using part of the array as a stack, you need something to keep track of which part of the array is currently the stack, so you need a wrapper, and if you're using assumeSafeAppend, then you need to guarantee that nowhere else has a reference to that array (otherwise it's _not_ safe to assume that it's safe to append), so you need a wrapper. In either case, you need a wrapper. - Jonathan M Davis
Sep 25 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.160.1316939358.26225.digitalmars-d puremagic.com...
 On Sunday, September 25, 2011 03:18:29 Andrew Wiley wrote:
 Isn't this exactly what assumeSafeAppend is for?


Hmm, I didn't know about that. (Actually, I remember hearing it mentioned before, but then totally forgot about it.)
 and if you're using assumeSafeAppend, then you
 need to guarantee that nowhere else has a reference to that array 
 (otherwise
 it's _not_ safe to assume that it's safe to append)

Would the consequences of failing to do that be any worse (or any different at all?) than what I mentioned about: "One caveat about this method: If you save a slice of the stack, pop elements off the stack, and then push new values back on, the old slice you took will likely reflect the new values, not the original ones." ...?
Sep 25 2011
parent Lutger Blijdestijn <lutger.blijdestijn gmail.com> writes:
Nick Sabalausky wrote:

 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.160.1316939358.26225.digitalmars-d puremagic.com...
 On Sunday, September 25, 2011 03:18:29 Andrew Wiley wrote:
 Isn't this exactly what assumeSafeAppend is for?


Hmm, I didn't know about that. (Actually, I remember hearing it mentioned before, but then totally forgot about it.)
 and if you're using assumeSafeAppend, then you
 need to guarantee that nowhere else has a reference to that array
 (otherwise
 it's _not_ safe to assume that it's safe to append)

Would the consequences of failing to do that be any worse (or any different at all?) than what I mentioned about: "One caveat about this method: If you save a slice of the stack, pop elements off the stack, and then push new values back on, the old slice you took will likely reflect the new values, not the original ones." ...?

It looks like it behaves the same, but the docs mention this: 'Calling this function, and then using references to data located after the given array results in undefined behavior.' So it is not wise to depend on it. One thing that wasn't clear to me (but might be obvious): after modifying an array, the behavior is reset and you will need to call assumeSafeAppend again.
Sep 25 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, September 25, 2011 07:05:34 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.160.1316939358.26225.digitalmars-d puremagic.com...
 
 On Sunday, September 25, 2011 03:18:29 Andrew Wiley wrote:
 Isn't this exactly what assumeSafeAppend is for?


Hmm, I didn't know about that. (Actually, I remember hearing it mentioned before, but then totally forgot about it.)
 and if you're using assumeSafeAppend, then you
 need to guarantee that nowhere else has a reference to that array
 (otherwise
 it's _not_ safe to assume that it's safe to append)

Would the consequences of failing to do that be any worse (or any different at all?) than what I mentioned about: "One caveat about this method: If you save a slice of the stack, pop elements off the stack, and then push new values back on, the old slice you took will likely reflect the new values, not the original ones." ...?

I'm not sure. If you have 2 slices which point to the same memory, and one is longer than the other, and then you call assumeSafeAppend on the smaller one, then if you append to the smaller one, it will increase its size into the smaller one, which may or may not cause issues depending on what you're doing. The real potential issue though is what happens if you use the longer slice and append to it. I really don't know what happens at that point. It'll probably just adjust the information on that block such that the pointer (or length - I'm not sure which it is) that it keeps to the end of the longest slice may be adjusted accordingly, or it could cause issues, because the end of that slice is already beyond what the block thinks is the end of the longest slice. So, the problem is more or less the same as the situation that you gave, but I'm not sure that it's identical. Regardless, it's just safer to wrap the array in a stack struct and not give access to the array. Normally all you care about with a stack is pushing and popping elements on and off and possibly peeking at the top element - none of which require having access to the array - so wrapping the array and not giving external access to it should work just fine. - Jonathan M Davis
Sep 25 2011
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Nick Sabalausky:

 https://www.semitwist.com/articles/article/view/don-t-use-arrays-as-stacks

Is it possible to write a function like: ForeachType!A unsafePop(A)(A)if(isDynamicArray!A) { /*...*/ } that contains both the popping (and something like assumeSafeAppend to mess with the druntime data structures) and use it to allow a stack-like usage of D dynamic arrays? (This function is not meant to replace the usage of a proper stack data structure based on a deque when you need a heavy-duty stack, but it's handy in other lighter cases). Bye, bearophile
Sep 25 2011
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 25 Sep 2011 02:37:25 -0400, Nick Sabalausky <a a.a> wrote:

 If anyone cares, I've put up a little thing about why it's best not to  
 use
 D's arrays as stacks. This is drawn from a direct experience a few months
 ago. Figured if I fell into that trap then others might too, so this  
 could
 be helpful for some people. There's a no-login-needed comments section
 already there.

 https://www.semitwist.com/articles/article/view/don-t-use-arrays-as-stacks

Nice article!
 For more information about D's arrays and slicing, see Steven  
 Schveighoffer's award-winning article, D Slices.

:D
 [1] Why is the capacity set to zero instead of the actual length of  
 four? I'm not certain, but my guess is that zeroing the value is  
 slightly faster than copying the length. Either that, or perhaps it has  
 to do with the fact that the slice doesn't actually "own" the data.

It was an arbitrary decision, but actually serves as an important differentiator. 0 simply means "any append to this array will reallocate". capacity == length would mean the same thing, but has the distinction of knowing that the length is the actual block length. This might not seem like much, but you may want to know this difference, for example, if you wanted to call assumeSafeAppend. The stack example is a good example. If you wrote a generic "popStack" function, which takes an array, knowing before popping that you have control of the block is important (you may not call assumeSafeAppend if your pre-pop capacity was 0).
 One caveat about this method: If you save a slice of the stack, pop  
 elements off the stack, and then push new values back on, the old slice  
 you took will likely [4] reflect the new values, not the original ones.

 [4] I say "likely" instead of "definitely" because of an unfortunate  
 indeterminism inherent in D's array/slice system. This quirk is  
 explained in the "Determinism" section of Steven Schveighoffer's article  
 mentioned above.

Actually, with the new functions capacity, assumeSafeAppend, and reserve, most of the "non-determinism" is mitigated. In your case, however, you know you have control of the stack, and all its data, popping and pushing will *definitely* overwrite the old value. -Steve
Sep 26 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Steven Schveighoffer" <schveiguy yahoo.com> wrote in message 
news:op.v2e19fkieav7ka localhost.localdomain...
 On Sun, 25 Sep 2011 02:37:25 -0400, Nick Sabalausky <a a.a> wrote:

 If anyone cares, I've put up a little thing about why it's best not to 
 use
 D's arrays as stacks. This is drawn from a direct experience a few months
 ago. Figured if I fell into that trap then others might too, so this 
 could
 be helpful for some people. There's a no-login-needed comments section
 already there.

 https://www.semitwist.com/articles/article/view/don-t-use-arrays-as-stacks

Nice article!

Thanks! Means a lot coming from people like you and Andrei. :)
 [1] Why is the capacity set to zero instead of the actual length of 
 four? I'm not certain, but my guess is that zeroing the value is 
 slightly faster than copying the length. Either that, or perhaps it has 
 to do with the fact that the slice doesn't actually "own" the data.

It was an arbitrary decision, but actually serves as an important differentiator. 0 simply means "any append to this array will reallocate". capacity == length would mean the same thing, but has the distinction of knowing that the length is the actual block length.

Or rather that the *end* of the array is the *end* of the actual block, right? (Since the array could also have some of first X elements sliced off.)
 This  might not seem like much, but you may want to know this difference, 
 for  example, if you wanted to call assumeSafeAppend.  The stack example 
 is a  good example.  If you wrote a generic "popStack" function, which 
 takes an  array, knowing before popping that you have control of the block 
 is  important (you may not call assumeSafeAppend if your pre-pop capacity 
 was  0).

 One caveat about this method: If you save a slice of the stack, pop 
 elements off the stack, and then push new values back on, the old slice 
 you took will likely [4] reflect the new values, not the original ones.

 [4] I say "likely" instead of "definitely" because of an unfortunate 
 indeterminism inherent in D's array/slice system. This quirk is 
 explained in the "Determinism" section of Steven Schveighoffer's article 
 mentioned above.

Actually, with the new functions capacity, assumeSafeAppend, and reserve, most of the "non-determinism" is mitigated. In your case, however, you know you have control of the stack, and all its data, popping and pushing will *definitely* overwrite the old value.

But when the stack grows past its currently-allocated size and it needs to expand (ex: from 1024 to 1500, in order to accomodate that 1025th element), there's still a chance it would get moved to a different memory location, right? In which case, if you then pop all the way back to 900, and then push 100 or so back again, *now* any slice that *had* been pointing to the original, say, [920..970] will no longer be pointing to the new values, they'll still be pointing to the old values, right? So, for instance, you can't take an instance on my Stack struct, do "foo = stack[0..10]", and then say "foo is guaranteed to always point to the bottom 10 elements of stack (at least until you change foo)". Right? That's what I was referring to with that footnote [4]. To make that kind of guarantee, you'd have to specifically code the Stack class so that all slices of the Stack get updated whenever the Stack's data gets moved. Not that slicing a stack would be all that common, but if it were done for whatever reason...
Sep 26 2011
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Nick Sabalausky:

 Not that slicing a stack would be all that common, but if it were done for 
 whatever reason...

Is it possible to write a function like: ForeachType!A unsafePop(A)(A)if(isDynamicArray!A) { /*...*/ } that contains both the popping (and something like assumeSafeAppend to mess with the druntime data structures) and use it to allow a stack-like usage of D dynamic arrays? (This function is not meant to replace the usage of a proper stack data structure based on a deque when you need a heavy-duty stack, but it's handy in other lighter cases). Bye, bearophile
Sep 26 2011
parent "Nick Sabalausky" <a a.a> writes:
"bearophile" <bearophileHUGS lycos.com> wrote in message 
news:j5rn5o$1q0g$1 digitalmars.com...
 Nick Sabalausky:

 Not that slicing a stack would be all that common, but if it were done 
 for
 whatever reason...

Is it possible to write a function like: ForeachType!A unsafePop(A)(A)if(isDynamicArray!A) { /*...*/ } that contains both the popping (and something like assumeSafeAppend to mess with the druntime data structures) and use it to allow a stack-like usage of D dynamic arrays? (This function is not meant to replace the usage of a proper stack data structure based on a deque when you need a heavy-duty stack, but it's handy in other lighter cases).

Umm, maybe. Sounds like it should be feasable, though I haven't tried. Although going by what Steve said about assumeSafeAppend, it might actually end up slower than a wrapped Stack type that doesn't use assumeSafeAppend.
Sep 26 2011
prev sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Steven Schveighoffer" <schveiguy yahoo.com> wrote in message 
news:op.v2ita9hseav7ka localhost.localdomain...
 On Tue, 27 Sep 2011 00:22:02 -0400, Nick Sabalausky <a a.a> wrote:

 There's never a guarantee that a slice will always point at the current 
 values.  There can't be, because you can never guarantee it will not be 
 reallocated on expansion.

Right.
 But what I thought you were talking about is popping values off, and then 
 pushing values on, without exceeding the capacity.  In fact, you would be 
 guaranteed the slice contains the newer values, even if a reallocation 
 occurs, since you have to push values into the slice before exceeding the 
 capacity.

 The situation you describe in this reply is, take a slice, push elements 
 on exceeding capacity, then pop elements back, then push elements on. 
 Quite different :)

Yes, but the slice "owner" won't necessarily know which of those scenarios has occurred. Of course, as it is now, the slice owner won't even know if the values it points to have been pooped off and *not* been pushed back on again. That said, I agree with you below:
 You could create a "stack slice" which instead of pointing at the memory, 
 points at the stack aggregate itself (which would actually have to be 
 heap-allocated), and have the lower and upper bounds.  Then a "slice" of 
 that type would continue to point at the current data.

I should definitely do something like that (or just eliminate the ability to take a slice, but I suspect that might be a little more limiting that it might initially seem). Or maybe I should just take a look at dcollections ;)
Sep 28 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Steven Schveighoffer" <schveiguy yahoo.com> wrote in message 
news:op.v2i2yv0beav7ka localhost.localdomain...
 On Wed, 28 Sep 2011 12:11:29 -0400, Nick Sabalausky <a a.a> wrote:

 Of course, as it is now, the slice owner won't even know if
 the values it points to have been pooped off and *not* been pushed back 
 on
 again.

Freudian slip or simple typo? Either way, *hilarious* :)

lmao. And it's surprisingly appropriate for the context.
 Gives a new  mental picture for a stack...

Yea, and properly "upside-down" just like our trees. Although I'm not I like the implications that has for "push"...I guess that means the analogy would be better suited for a FIFO.
Sep 28 2011
parent "Nick Sabalausky" <a a.a> writes:
"Nick Sabalausky" <a a.a> wrote in message 
news:j5vte2$3ql$1 digitalmars.com...
 ...Although I'm not [*sure*] I like the implications...
 

Sep 28 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 25 Sep 2011 07:05:34 -0400, Nick Sabalausky <a a.a> wrote:

 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.160.1316939358.26225.digitalmars-d puremagic.com...
 On Sunday, September 25, 2011 03:18:29 Andrew Wiley wrote:
 Isn't this exactly what assumeSafeAppend is for?


Hmm, I didn't know about that. (Actually, I remember hearing it mentioned before, but then totally forgot about it.)
 and if you're using assumeSafeAppend, then you
 need to guarantee that nowhere else has a reference to that array
 (otherwise
 it's _not_ safe to assume that it's safe to append)

Would the consequences of failing to do that be any worse (or any different at all?) than what I mentioned about: "One caveat about this method: If you save a slice of the stack, pop elements off the stack, and then push new values back on, the old slice you took will likely reflect the new values, not the original ones." ...?

The caveat is the same, however, you should continue doing it the way you are doing it. Using the runtime is somewhat magic, and there is a cost for being that magical -- too many runtime calls :) Every call to capacity and assumeSafeAppend *cannot* be inlined or optimized, since they end up calling extern(C) function prototypes. BTW, if you want to compare your performance with a stack implementation that uses assumeSafeAppend, try out dcollections' ArrayList, in which popBack does use assumeSafeAppend. -Steve
Sep 26 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 27 Sep 2011 00:22:02 -0400, Nick Sabalausky <a a.a> wrote:

 "Steven Schveighoffer" <schveiguy yahoo.com> wrote in message
 [1] Why is the capacity set to zero instead of the actual length of
 four? I'm not certain, but my guess is that zeroing the value is
 slightly faster than copying the length. Either that, or perhaps it has
 to do with the fact that the slice doesn't actually "own" the data.

It was an arbitrary decision, but actually serves as an important differentiator. 0 simply means "any append to this array will reallocate". capacity == length would mean the same thing, but has the distinction of knowing that the length is the actual block length.

Or rather that the *end* of the array is the *end* of the actual block, right? (Since the array could also have some of first X elements sliced off.)

Yes, that's a better description.
 One caveat about this method: If you save a slice of the stack, pop
 elements off the stack, and then push new values back on, the old slice
 you took will likely [4] reflect the new values, not the original ones.

 [4] I say "likely" instead of "definitely" because of an unfortunate
 indeterminism inherent in D's array/slice system. This quirk is
 explained in the "Determinism" section of Steven Schveighoffer's  
 article
 mentioned above.

Actually, with the new functions capacity, assumeSafeAppend, and reserve, most of the "non-determinism" is mitigated. In your case, however, you know you have control of the stack, and all its data, popping and pushing will *definitely* overwrite the old value.

But when the stack grows past its currently-allocated size and it needs to expand (ex: from 1024 to 1500, in order to accomodate that 1025th element), there's still a chance it would get moved to a different memory location, right? In which case, if you then pop all the way back to 900, and then push 100 or so back again, *now* any slice that *had* been pointing to the original, say, [920..970] will no longer be pointing to the new values, they'll still be pointing to the old values, right? So, for instance, you can't take an instance on my Stack struct, do "foo = stack[0..10]", and then say "foo is guaranteed to always point to the bottom 10 elements of stack (at least until you change foo)". Right? That's what I was referring to with that footnote [4]. To make that kind of guarantee, you'd have to specifically code the Stack class so that all slices of the Stack get updated whenever the Stack's data gets moved.

There's never a guarantee that a slice will always point at the current values. There can't be, because you can never guarantee it will not be reallocated on expansion. But what I thought you were talking about is popping values off, and then pushing values on, without exceeding the capacity. In fact, you would be guaranteed the slice contains the newer values, even if a reallocation occurs, since you have to push values into the slice before exceeding the capacity. The situation you describe in this reply is, take a slice, push elements on exceeding capacity, then pop elements back, then push elements on. Quite different :) You could create a "stack slice" which instead of pointing at the memory, points at the stack aggregate itself (which would actually have to be heap-allocated), and have the lower and upper bounds. Then a "slice" of that type would continue to point at the current data. -Steve
Sep 28 2011
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 28 Sep 2011 12:11:29 -0400, Nick Sabalausky <a a.a> wrote:

 Of course, as it is now, the slice owner won't even know if
 the values it points to have been pooped off and *not* been pushed back  
 on
 again.

Freudian slip or simple typo? Either way, *hilarious* :) Gives a new mental picture for a stack... -Steve
Sep 28 2011