digitalmars.D - BIg semantic issues with SList & DList

monarch_dodra (109/109) Sep 12 2012 Basically, the problem is that their copy semantics adhere to

monarch_dodra (51/52) Sep 13 2012 I did some tests and implementations, and I came with 3 options:

monarch_dodra (135/139) Sep 13 2012 I'll just copy paste here the 3 headers which contain examples,

"monarch_dodra" <monarchdodra gmail.com> writes:

Basically, the problem is that their copy semantics adhere to 
neither value nor reference semantics. The embody a "reference to 
a third party chain system" semantic.

Erm... What does this mean?

Basically, a "chain of nodes is created", and the lists reference 
points in that chain, but not necessarily the beginnings nor 
ends. How is this a problem? I'll start with SList:

SList
--------
import std.stdio, std.container, std.range, std.iostream, 
std.algorithm;
void main()
{
     auto a = SList!int([3, 4]);
     auto b = a;
     // ab -> 3 -> 4 -> null
     assert(a[].equal([3, 4]));
     assert(b[].equal([3, 4]));
     a.stableInsertFront(1);
     b.stableInsertFront(2);
     // a -> 1
     //       \
     //        3 -> 4 -> null
     //       /
     // b -> 2
     assert(a[].equal([1, 3, 4]));
     assert(b[].equal([2, 3, 4]));

     //a: change the 2nd element (4) into 5;
     a[].drop(2).front = 5;
     // a -> 1
     //       \
     //        3 -> 5 -> null
     //       /
     // b -> 2
     assert(a[].equal([1, 3, 5]));
     assert(b[].equal([2, 3, 5]));

     a.clear();
     // a -> null
     //
     //        3 -> 4 -> null
     //       /
     // b -> 2
     assert(a[].empty);
     assert(b[].equal([2, 3, 5]));

     a.insert(1);
     // -> 1 -> null
     //
     //      3 -> 5-> null
     //     /
     // -> 2
     assert(a[].equal([1]));
     assert(b[].equal([2, 3, 5]));
}
--------
Is this a clear example of what I mean?
NOW: The big question:

IS THIS A LEGITIMATE BEHAVIOR?

IMO, for SList, yes, yes it is. _HOWEVER_, it must be much 
clearly documented in the docs, as well as examples provided.

I have done some relatively heavy testing, and I have found no 
bugs in SList.

However, there are a couple *Gotcha's*, in particular, regarding 
the effects of "remove": Are we merely advancing the SList's 
"root pointer", or actually excising elements from the "chain" 
itself? Depends on context :D but nothing that can't be 
documented.

DLists has the same semantic:
--------
import std.stdio, std.container, std.range, std.algorithm;
void main()
{
     auto a = DList!int([3, 4]);
     auto b = a;
     // (3 - 4)
     assert(a[].equal([3, 4]));
     assert(b[].equal([3, 4]));

     b.stableInsertFront(1);
     b.stableInsertBack(5);
     // (2 - (3 - 4) - 5)
     assert(a[].equal([3, 4]));
     assert(b[].equal([1, 3, 4, 5]));

     a.stableInsertFront(2);
     // (1 - (2 - 3 - 4) - 5)
     assert(a[].equal([2, 3, 4]));
     writeln(b[]);
     assert(b[].equal([1, 2, 3, 4, 5]));

     //Does not work as expected!
     a.remove(a[]);
     writeln(a[]); //[2, 3, 4]
     writeln(b[]); //[1, 5]
}
--------
The example stops here, because DList was obviously written 
without this in mind. I don't have enough fingers on my hands to 
list the use-cases that break it.

Again, same question:

IS THIS A LEGITIMATE BEHAVIOR?

The gotchas are even more prevalent here. Not only gotchas, but 
just plain problems regarding theoretical behavior: What does it 
mean to append two DLists, if those DLists are views into two 
different, but, bigger, chains?

IMO: For DList, it is just too damn complicated to keep this 
semantic. I'd migrate to "true reference".


--------
The good news is that having "true" reference semantics with 
{SD}List should be relatively easy. Depending on the output of 
this discussion, it is something <s>I wouldn't mind</s> I'd enjoy 
doing.

TY for reading.

Sep 12 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Wednesday, 12 September 2012 at 17:24:57 UTC, monarch_dodra
wrote:
[SNIP]

I did some tests and implementations, and I came with 3 options:

1) Embrace the "view of a chain" implementation and document it.
2) Use classic reference semantics. However, ranges can still
have an "out of bonds" view of the chain. This means the ranges
are virtually never invalidated.
3) Use classic reference semantics. ENFORCE range bounds be
inside the underlying DList. This forces us to systematically
"detach" removed nodes (as opposed to just advancing the _first
pointer)
...
4) There was a 4th solution I did not like: use classic reference
semantics. Do not provide support for "out of bounds" view of the
Range, but do not enforce it either. This is the "Bites you in
the ass" solution, and kind of what we currently had :(

All 3 solutions have been implemented.

Common to all 3 implementations:
*Added a LOT of unit tests.
*Added a description in the beggining
*Fixed the opOpAssign functions. I know they weren't working
because their names were typoed (!).
*MUCH more robust assertions of internals (and minor bugfixes).
*Ranges are now assertion based validated. (as opposed to
enforced).

Implementation 1:
This embraces the "chain" view. It works as advertised. The
downside though is in the concept, in that it _IS_ actually
possible to invalidate a DList itself (not just the range).
Everything is Asserted, and I do not think it is possible to have
some bad usage which isn't asserted. I had to change a few
implementations, but everything works now.
Here:
https://github.com/monarchdodra/phobos/commit/054fe9bea5652aa0e3089d279d4df2ae069d9f30#std/container.d

Implementation 2:
Reference semantics. However, Ranges can still have an out of
bounds view of the chain. This is because functions like
"removeFront" don't actually detach nodes, merelly advance the
get pointer.
The *real* advantage of this approach is that Range are virtually
never invalidated. Also had to change a few implementations.

This is kind of a hybrid soltuion, as the DLists themselves keep
the traditional semantics, but the ranges themselves are more of
a "can see anywhere in the chain" object.
https://github.com/monarchdodra/phobos/commit/2b08b46bf552f208deb897ec0f8c5b3327225c8c#std/container.d

Implementation 3:
Classic implementation. No surprises. Robust, safe, expected.
Boring. Boring is good. IMO, this is what people expect. Not a
lot of changes here actually.
https://github.com/monarchdodra/phobos/commit/8ce21e1a9a0107e74ff86a9acc736595ef73362c#std/container.d

Could I get any kind of feedback?

Sep 13 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Thursday, 13 September 2012 at 19:37:41 UTC, monarch_dodra 
wrote:
 On Wednesday, 12 September 2012 at 17:24:57 UTC, monarch_dodra
 wrote:
 [SNIP]

 [SNIP]

I'll just copy paste here the 3 headers which contain examples, 
to give a quicker idea of what I went for:

111 CHAIN IMPLEMENTATION 111
/**
    Implements a doubly-linked list.

    $(D DList) uses neither reference nor value semantics. They 
can be seen as
    several different handles into an external chain of nodes. 
Several different
    $(D DList)s can all reference different points in a same chain.

    $(D DList.Range) is, for all intents and purposes, a DList 
with range
    semantics. The $(D DList.Range) has a view directly into the 
chain itself.
    It is not tied to its parent $(D DList), and may be used to 
operate on
    other lists (that point to the same chain).

    The ONLY operation that can invalidate a $(D DList) or $(D 
DList.Range), but
    which will invalidate BOTH, is the $(D remove) operation, if 
the cut Range
    overlaps with the boundaries of another DList or DList.Range.

    Example:
----
auto a = DList!int([3, 4]); //Create a new chain
auto b = a; //Point to the same chain
// (3 - 4)
assert(a[].equal([3, 4]));
assert(b[].equal([3, 4]));

b.stableInsertFront(1); //insert before of b
b.stableInsertBack(5);  //insert after of b
// (2 - (3 - 4) - 5)
assert(a[].equal([3, 4])); //a is not changed
assert(b[].equal([1, 3, 4, 5])); // but a is

a.stableInsertFront(2); //insert in front of a, this will insert 
"inside" the chain
// (1 - (2 - 3 - 4) - 5)
assert(a[].equal([2, 3, 4])); //a is modified
assert(b[].equal([1, 2, 3, 4, 5])); //and so is b;

a.remove(a[]); //remove all the elements of a;
// (1 - 5)
assert(a[].empty); //a is empty
assert(b[].equal([1, 5])); //b has lost some of its elements;

a.insert(2); //insert in a. This will create a new chain
// (2)
// (1 - 5)
assert(a[].equal([2])); //a is empty
assert(b[].equal([1, 5])); //b has lost some of its elements;
----
  */

222 HYBRID REFERENCE IMPLEMENATION 222
/**
    Implements a doubly-linked list.

    $(D DList) use reference semantics. They give a view into a 
bidirectional
    chain of nodes.

    $(D DList.Range) can be used to have $(D BidirectionalRange) 
view of the
    chain of nodes. $(D DList.Ranges) are almost never 
invalidated, but they
    may have access to nodes that are past the end of original $(D 
DList)'s,
    if that $(D DList) was shortened.

    Example:
----
auto a = DList!int([1, 2, 3]); //Create a new DList
auto b = a; //A reference to that DList
auto r = a[]; //A range view of the list

a.removeFront();
a.removeBack();
assert(a[].equal([2])); // both a and b have been affected
assert(b[].equal([2])); // both a and b have been affected
assert(r.equal([1, 2, 3])); //But the range remains unchanged

a.insertFront(1);
assert(a[].equal([1, 2])); // both a and b have been affected
assert(b[].equal([1, 2])); // both a and b have been affected
assert(r.equal([1, 1, 2, 3])); //But so has the range

auto c = a; //Create a third reference
a.clear();  //Detaches from the chain
assert(a[].empty); //a is now empty
assert(b[].equal([1, 2])); //But a and b are unchanged
assert(c[].equal([1, 2])); //But a and b are unchanged
assert(r.equal([1, 1, 2, 3]));

b.remove(b[]); //Do a hard removal of the actual elements
assert(b[].empty); //b is empty of course
assert(c[].empty); //but so is c
assert(r.equal([1, 3])); //and the range has lost an element: It 
was removed from the chain

b.insert(1); //Insert an element into b
assert(a[].empty); //a remains unchanged
assert(b[].equal([1])); //b has the new element
assert(c[].equal([1])); //and so does c
assert(r.equal([1, 3])); //r points to the old chain, so does not 
see this new element
----
  */

333 CLASSIC IMPLEMENTATION 333
/**
    Implements a doubly-linked list.

    $(D DList) use reference semantics.

    $(D DList.Range) may be used to have a $(D BidirectionalRange) 
view into
    the DList.

    Example:
----
auto a = DList!int([1, 2, 3]); //Create a new DList
auto b = a; //A reference to that DList

a.removeFront();
a.removeBack();
assert(a[].equal([2])); // both a and b have been affected
assert(b[].equal([2])); // both a and b have been affected

a.insertFront(1);
assert(a[].equal([1, 2])); // both a and b have been affected
assert(b[].equal([1, 2])); // both a and b have been affected

auto c = a; //Create a third reference
a.clear();  //Detaches from the chain
assert(a[].empty); //a is now empty
assert(b[].equal([1, 2])); //But a and b are unchanged
assert(c[].equal([1, 2])); //But a and b are unchanged

b.remove(b[]); //Do a hard removal of the actual elements
assert(b[].empty); //b is empty of course
assert(c[].empty); //but so is c

b.insert(1); //Insert an element into b
assert(a[].empty); //a remains unchanged
assert(b[].equal([1])); //b has the new element
assert(c[].equal([1])); //and so does c
----
  */
/*
There is not much to show here actually: Ranges are *immediatly* 
invalidated, and attempt to use an invalid range will assert 
pretty quickly.
  */

Sep 13 2012

D Programming

C/C++ Programming

Other

digitalmars.D - BIg semantic issues with SList & DList