www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - More range woes: std.array.save is invalid

reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8061

After being sidetracked by the stack-allocated buffer fiasco, I finally
found the root problem in the above bug.

The problem is std.array.save:

 property T[] save(T)(T[] a)  safe pure nothrow
{
    return a;
}

This implementation is only correct for certain values of T. It is wrong
when T is a forward range, for example, because if you save a T[] and
iterate over its elements, it will consume the T's in the array, so that
the original range has elements that are now consumed.

The correct implementation when T is a forward range is to return a copy
of the array where the elements are obtained by calling T.save.

The problem is, now .save is a potentially very expensive operation.

At a deeper level, it can be argued that actually, std.array.save is not
wrong, because you asked it to save the array, not the elements within
the array. So in that sense, there's nothing wrong with the
implementation.

What is wrong here is that many algorithms in std.range and
std.algorithm wrongly assume that just because the outer range (that is,
T[]) is a forward range, *and* the inner ranges (that is, T) are also
forward ranges, that therefore T[].save will save the current position
of the *range of ranges*.  Actually, the only thing that is guaranteed
to be saved is the current position of the *containing range*, it does
NOT save the position of the subranges at all (at least, not according
to the current definition of .save, which is NOT transitive). This means
that many wrapper ranges are broken, because they export a .save method
that does not work correctly under certain circumstances.

The correct way to wrap a range of ranges as a forward range, is if the
wrapping range's .save method *copies* the outer range and populates it
with .save'd subranges. Otherwise, it must be treated only as an input
range. However, since there's no general way to copy a range, all of
these wrapper ranges cannot be more than *input* ranges.

That is, unless we redefine .save to be transitive. But then, it becomes
just a variant of a general .dup function for ranges. And so we come
back to a fundamental issue in D, that there's no generic way to
duplicate a value. This is causing us a lot of grief in generic code
(the whole transient .front issue is another instance of this problem,
since transience wouldn't be a problem if there was a generic .dup
method), and, as I see it, will continue to cause us more grief in the
future, because there's no way you can assume that assigning anything to
anything else will not magically become invalid when you subsequently
perform a seemingly-unrelated operation. That makes writing generic code
a veritable minefield.


T

-- 
Why can't you just be a nonconformist like everyone else? -- YHL
Dec 19 2012
next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 20 December 2012 at 07:05:08 UTC, H. S. Teoh wrote:
 The problem is std.array.save:

  property T[] save(T)(T[] a)  safe pure nothrow
 {
     return a;
 }

 This implementation is only correct for certain values of T. It 
 is wrong
 when T is a forward range, for example, because if you save a 
 T[] and
 iterate over its elements, it will consume the T's in the 
 array, so that
 the original range has elements that are now consumed.

 T

I'd say that's the correct behavior: A range is just a way to iterate over contents, and "save" makes no promise (and *should* make no promise) to make copies of the range's elements. If you modify one of the elements in your range, then you'd better hope that change is seen by all other ranges iterating over the same data. What I'm basically saying is: //---- auto a = [1, 2, 3] auto b = a.save; b[0] = 5; assert(a[0] == 5); //Correct behavior //---- The fact that you have ints or forwards ranges as elements is irrelevant. --------------------------- BTW: Just the same way, "dup" will not transitively duplicate the contents of an array, or of std.container.array. --------------------------- To get your described behavior, then you should wrap your array into your own range, and make the array's elements part of your iteration definition. This way you'll surprise no one.
Dec 19 2012
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 20 December 2012 at 07:05:08 UTC, H. S. Teoh wrote:
 What is wrong here is that many algorithms in std.range and
 std.algorithm wrongly assume that just because the outer range 
 (that is,
 T[]) is a forward range, *and* the inner ranges (that is, T) 
 are also
 forward ranges, that therefore T[].save will save the current 
 position
 of the *range of ranges*.

Most probably, algorithm is still not quite great about using save. This is especially true in regards to RoR, but that's a special case, and we are fixing those one by one.
Dec 19 2012
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/20/12 2:03 AM, H. S. Teoh wrote:
 http://d.puremagic.com/issues/show_bug.cgi?id=8061

 After being sidetracked by the stack-allocated buffer fiasco, I finally
 found the root problem in the above bug.

 The problem is std.array.save:

  property T[] save(T)(T[] a)  safe pure nothrow
 {
      return a;
 }

 This implementation is only correct for certain values of T. It is wrong
 when T is a forward range, for example, because if you save a T[] and
 iterate over its elements, it will consume the T's in the array, so that
 the original range has elements that are now consumed.

 The correct implementation when T is a forward range is to return a copy
 of the array where the elements are obtained by calling T.save.

I think you're overthinking this. Save saves the position in the current range without any promise about its contents. Andrei
Dec 19 2012