## digitalmars.D - BIg semantic issues with SList & DList

• monarch_dodra (109/109) Sep 12 2012 Basically, the problem is that their copy semantics adhere to
• monarch_dodra (51/52) Sep 13 2012 I did some tests and implementations, and I came with 3 options:
• monarch_dodra (135/139) Sep 13 2012 I'll just copy paste here the 3 headers which contain examples,
"monarch_dodra" <monarchdodra gmail.com> writes:
```Basically, the problem is that their copy semantics adhere to
neither value nor reference semantics. The embody a "reference to
a third party chain system" semantic.

Erm... What does this mean?

Basically, a "chain of nodes is created", and the lists reference
points in that chain, but not necessarily the beginnings nor

SList
--------
import std.stdio, std.container, std.range, std.iostream,
std.algorithm;
void main()
{
auto a = SList!int([3, 4]);
auto b = a;
// ab -> 3 -> 4 -> null
assert(a[].equal([3, 4]));
assert(b[].equal([3, 4]));
a.stableInsertFront(1);
b.stableInsertFront(2);
// a -> 1
//       \
//        3 -> 4 -> null
//       /
// b -> 2
assert(a[].equal([1, 3, 4]));
assert(b[].equal([2, 3, 4]));

//a: change the 2nd element (4) into 5;
a[].drop(2).front = 5;
// a -> 1
//       \
//        3 -> 5 -> null
//       /
// b -> 2
assert(a[].equal([1, 3, 5]));
assert(b[].equal([2, 3, 5]));

a.clear();
// a -> null
//
//        3 -> 4 -> null
//       /
// b -> 2
assert(a[].empty);
assert(b[].equal([2, 3, 5]));

a.insert(1);
// -> 1 -> null
//
//      3 -> 5-> null
//     /
// -> 2
assert(a[].equal([1]));
assert(b[].equal([2, 3, 5]));
}
--------
Is this a clear example of what I mean?
NOW: The big question:

IS THIS A LEGITIMATE BEHAVIOR?

IMO, for SList, yes, yes it is. _HOWEVER_, it must be much
clearly documented in the docs, as well as examples provided.

I have done some relatively heavy testing, and I have found no
bugs in SList.

However, there are a couple *Gotcha's*, in particular, regarding
the effects of "remove": Are we merely advancing the SList's
"root pointer", or actually excising elements from the "chain"
itself? Depends on context :D but nothing that can't be
documented.

DLists has the same semantic:
--------
import std.stdio, std.container, std.range, std.algorithm;
void main()
{
auto a = DList!int([3, 4]);
auto b = a;
// (3 - 4)
assert(a[].equal([3, 4]));
assert(b[].equal([3, 4]));

b.stableInsertFront(1);
b.stableInsertBack(5);
// (2 - (3 - 4) - 5)
assert(a[].equal([3, 4]));
assert(b[].equal([1, 3, 4, 5]));

a.stableInsertFront(2);
// (1 - (2 - 3 - 4) - 5)
assert(a[].equal([2, 3, 4]));
writeln(b[]);
assert(b[].equal([1, 2, 3, 4, 5]));

//Does not work as expected!
a.remove(a[]);
writeln(a[]); //[2, 3, 4]
writeln(b[]); //[1, 5]
}
--------
The example stops here, because DList was obviously written
without this in mind. I don't have enough fingers on my hands to
list the use-cases that break it.

Again, same question:

IS THIS A LEGITIMATE BEHAVIOR?

The gotchas are even more prevalent here. Not only gotchas, but
just plain problems regarding theoretical behavior: What does it
mean to append two DLists, if those DLists are views into two
different, but, bigger, chains?

IMO: For DList, it is just too damn complicated to keep this
semantic. I'd migrate to "true reference".

--------
The good news is that having "true" reference semantics with
{SD}List should be relatively easy. Depending on the output of
this discussion, it is something <s>I wouldn't mind</s> I'd enjoy
doing.

```
Sep 12 2012
"monarch_dodra" <monarchdodra gmail.com> writes:
```On Wednesday, 12 September 2012 at 17:24:57 UTC, monarch_dodra
wrote:
[SNIP]

I did some tests and implementations, and I came with 3 options:

1) Embrace the "view of a chain" implementation and document it.
2) Use classic reference semantics. However, ranges can still
have an "out of bonds" view of the chain. This means the ranges
are virtually never invalidated.
3) Use classic reference semantics. ENFORCE range bounds be
inside the underlying DList. This forces us to systematically
"detach" removed nodes (as opposed to just advancing the _first
pointer)
...
4) There was a 4th solution I did not like: use classic reference
semantics. Do not provide support for "out of bounds" view of the
Range, but do not enforce it either. This is the "Bites you in
the ass" solution, and kind of what we currently had :(

All 3 solutions have been implemented.

Common to all 3 implementations:
*Added a LOT of unit tests.
*Added a description in the beggining
*Fixed the opOpAssign functions. I know they weren't working
because their names were typoed (!).
*MUCH more robust assertions of internals (and minor bugfixes).
*Ranges are now assertion based validated. (as opposed to
enforced).

Implementation 1:
This embraces the "chain" view. It works as advertised. The
downside though is in the concept, in that it _IS_ actually
possible to invalidate a DList itself (not just the range).
Everything is Asserted, and I do not think it is possible to have
some bad usage which isn't asserted. I had to change a few
implementations, but everything works now.
Here:
https://github.com/monarchdodra/phobos/commit/054fe9bea5652aa0e3089d279d4df2ae069d9f30#std/container.d

Implementation 2:
Reference semantics. However, Ranges can still have an out of
bounds view of the chain. This is because functions like
"removeFront" don't actually detach nodes, merelly advance the
get pointer.
The *real* advantage of this approach is that Range are virtually
never invalidated. Also had to change a few implementations.

This is kind of a hybrid soltuion, as the DLists themselves keep
the traditional semantics, but the ranges themselves are more of
a "can see anywhere in the chain" object.
https://github.com/monarchdodra/phobos/commit/2b08b46bf552f208deb897ec0f8c5b3327225c8c#std/container.d

Implementation 3:
Classic implementation. No surprises. Robust, safe, expected.
Boring. Boring is good. IMO, this is what people expect. Not a
lot of changes here actually.
https://github.com/monarchdodra/phobos/commit/8ce21e1a9a0107e74ff86a9acc736595ef73362c#std/container.d

Could I get any kind of feedback?
```
Sep 13 2012
"monarch_dodra" <monarchdodra gmail.com> writes:
```On Thursday, 13 September 2012 at 19:37:41 UTC, monarch_dodra
wrote:
On Wednesday, 12 September 2012 at 17:24:57 UTC, monarch_dodra
wrote:
[SNIP]

[SNIP]

I'll just copy paste here the 3 headers which contain examples,
to give a quicker idea of what I went for:

111 CHAIN IMPLEMENTATION 111
/**

\$(D DList) uses neither reference nor value semantics. They
can be seen as
several different handles into an external chain of nodes.
Several different
\$(D DList)s can all reference different points in a same chain.

\$(D DList.Range) is, for all intents and purposes, a DList
with range
semantics. The \$(D DList.Range) has a view directly into the
chain itself.
It is not tied to its parent \$(D DList), and may be used to
operate on
other lists (that point to the same chain).

The ONLY operation that can invalidate a \$(D DList) or \$(D
DList.Range), but
which will invalidate BOTH, is the \$(D remove) operation, if
the cut Range
overlaps with the boundaries of another DList or DList.Range.

Example:
----
auto a = DList!int([3, 4]); //Create a new chain
auto b = a; //Point to the same chain
// (3 - 4)
assert(a[].equal([3, 4]));
assert(b[].equal([3, 4]));

b.stableInsertFront(1); //insert before of b
b.stableInsertBack(5);  //insert after of b
// (2 - (3 - 4) - 5)
assert(a[].equal([3, 4])); //a is not changed
assert(b[].equal([1, 3, 4, 5])); // but a is

a.stableInsertFront(2); //insert in front of a, this will insert
"inside" the chain
// (1 - (2 - 3 - 4) - 5)
assert(a[].equal([2, 3, 4])); //a is modified
assert(b[].equal([1, 2, 3, 4, 5])); //and so is b;

a.remove(a[]); //remove all the elements of a;
// (1 - 5)
assert(a[].empty); //a is empty
assert(b[].equal([1, 5])); //b has lost some of its elements;

a.insert(2); //insert in a. This will create a new chain
// (2)
// (1 - 5)
assert(a[].equal([2])); //a is empty
assert(b[].equal([1, 5])); //b has lost some of its elements;
----
*/

222 HYBRID REFERENCE IMPLEMENATION 222
/**

\$(D DList) use reference semantics. They give a view into a
bidirectional
chain of nodes.

\$(D DList.Range) can be used to have \$(D BidirectionalRange)
view of the
chain of nodes. \$(D DList.Ranges) are almost never
invalidated, but they
may have access to nodes that are past the end of original \$(D
DList)'s,
if that \$(D DList) was shortened.

Example:
----
auto a = DList!int([1, 2, 3]); //Create a new DList
auto b = a; //A reference to that DList
auto r = a[]; //A range view of the list

a.removeFront();
a.removeBack();
assert(a[].equal([2])); // both a and b have been affected
assert(b[].equal([2])); // both a and b have been affected
assert(r.equal([1, 2, 3])); //But the range remains unchanged

a.insertFront(1);
assert(a[].equal([1, 2])); // both a and b have been affected
assert(b[].equal([1, 2])); // both a and b have been affected
assert(r.equal([1, 1, 2, 3])); //But so has the range

auto c = a; //Create a third reference
a.clear();  //Detaches from the chain
assert(a[].empty); //a is now empty
assert(b[].equal([1, 2])); //But a and b are unchanged
assert(c[].equal([1, 2])); //But a and b are unchanged
assert(r.equal([1, 1, 2, 3]));

b.remove(b[]); //Do a hard removal of the actual elements
assert(b[].empty); //b is empty of course
assert(c[].empty); //but so is c
assert(r.equal([1, 3])); //and the range has lost an element: It
was removed from the chain

b.insert(1); //Insert an element into b
assert(a[].empty); //a remains unchanged
assert(b[].equal([1])); //b has the new element
assert(c[].equal([1])); //and so does c
assert(r.equal([1, 3])); //r points to the old chain, so does not
see this new element
----
*/

333 CLASSIC IMPLEMENTATION 333
/**

\$(D DList) use reference semantics.

\$(D DList.Range) may be used to have a \$(D BidirectionalRange)
view into
the DList.

Example:
----
auto a = DList!int([1, 2, 3]); //Create a new DList
auto b = a; //A reference to that DList

a.removeFront();
a.removeBack();
assert(a[].equal([2])); // both a and b have been affected
assert(b[].equal([2])); // both a and b have been affected

a.insertFront(1);
assert(a[].equal([1, 2])); // both a and b have been affected
assert(b[].equal([1, 2])); // both a and b have been affected

auto c = a; //Create a third reference
a.clear();  //Detaches from the chain
assert(a[].empty); //a is now empty
assert(b[].equal([1, 2])); //But a and b are unchanged
assert(c[].equal([1, 2])); //But a and b are unchanged

b.remove(b[]); //Do a hard removal of the actual elements
assert(b[].empty); //b is empty of course
assert(c[].empty); //but so is c

b.insert(1); //Insert an element into b
assert(a[].empty); //a remains unchanged
assert(b[].equal([1])); //b has the new element
assert(c[].equal([1])); //and so does c
----
*/
/*
There is not much to show here actually: Ranges are *immediatly*
invalidated, and attempt to use an invalid range will assert
pretty quickly.
*/
```
Sep 13 2012