
digitalmars.D - Ruling out arbitrary cost copy construction?

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I think ranges are a great thing, having simplicity as one of their 
advantages. In the beginning they were indeed rather simple:

struct MyRange {
     bool empty();
     ref Widget front();
     void popFront();
}

with the appropriate refinements for bidirectional and random.

Then there was the need to distinguish one-pass, "input" ranges (e.g. 
files) from many-pass, "forward" ranges. So the "save" function got born 
for forward ranges and above:

struct MyRange {
     bool empty();
     ref Widget front();
     void popFront();
     MyRange save();
}

Then @property came about, which required extra adornments:

struct MyRange {
     @property bool empty();
     @property ref Widget front();
     void popFront();
     @property MyRange save();
}

Then some ranges were unable to return ref, but they could receive 
assignment. I call these /sealed ranges/ because they "seal" the address 
of their elements making it inaccessible:

struct MyRange {
     @property bool empty();
     @property Widget front();
     @property void front(Widget);
     void popFront();
     @property MyRange save();
}

This made the bidirectional and random-access range interfaces quite big, 
because now we're talking about back() (two versions), opIndex(), 
opIndexAssign(), and opIndexOpAssign().
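Spelled out under the conventions above (with Widget as the placeholder
element type), a full sealed random-access range could look roughly like
this - a sketch, not a normative interface:

```d
struct MySealedRange {
     @property bool empty();
     @property Widget front();            // getter: returns by value
     @property void front(Widget value);  // setter: accepts by value
     void popFront();
     @property Widget back();
     @property void back(Widget value);
     void popBack();
     @property MySealedRange save();
     Widget opIndex(size_t i);                    // read:  auto x = r[i];
     void opIndexAssign(Widget value, size_t i);  // write: r[i] = x;
     void opIndexOpAssign(string op)(Widget value, size_t i); // r[i] += x;
}
```

Note how the sealed interface carries two declarations per property where a
ref-returning range needs only one.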

Until now I think we're solid on design decisions. The discussion starts 
here.

And then there was this nagging thing which is well understood in the 
C++ world. A sealed range returns by value and accepts by value. But in 
C++, an object's copy constructor is arbitrarily expensive. The library 
must be designed in accordance with that and carefully navigate around 
the issue.

For example, swap is a fundamental primitive used in many algorithms. It 
should swap objects at constant cost. Indeed, for references, swap is 
easy to implement:

void swap(T)(ref T lhs, ref T rhs)
{
     assert(!pointsTo(lhs, rhs) && !pointsTo(rhs, lhs)
         && !pointsTo(lhs, lhs) && !pointsTo(rhs, rhs));
     T tmp = move(lhs);
     lhs = move(rhs);
     rhs = move(tmp);
}

or similar. However, a sealed range does not offer references, so trying 
e.g.

swap(r1.front, r2.front);

will not work. This is a problem.

To solve that problem, I introduced moveFront(), moveBack(), and 
moveAt(size_t), all of which destructively read the front, back, or an 
indexed element respectively off the range. Then you can swap r1.front 
with r2.front at constant cost like this:

     T tmp = r1.moveFront();
     r1.front = r2.moveFront();
     r2.front = move(tmp);
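Wrapped in a generic helper, the same three-line pattern might look like
this (swapFronts is a hypothetical name; move is from std.algorithm):

```d
import std.algorithm : move;

// Hypothetical helper: swap the front elements of two sealed ranges at
// constant cost using the destructive moveFront primitive.
void swapFronts(R1, R2)(ref R1 r1, ref R2 r2)
{
    auto tmp = r1.moveFront();  // destructively read r1's front
    r1.front = r2.moveFront();  // destructively read r2's front into r1
    r2.front = move(tmp);       // move the temporary into r2
}
```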

All of this works and is rock-solid, but it does load the range 
interface considerably. To a newcomer without the background above, a 
full-fledged range definition may look quite loaded.

One simplification is to decree that Phobos (and D in general) shuns 
objects with expensive eager copying: any this(this) could be considered 
constant cost. That would have several consequences:

1. It would simplify all ranges and many algorithms because there's no 
more need for moveXxx and swapping can be done the old-fashioned way 
(with assignments in inline code):

     T tmp = r1.front;
     r1.front = r2.front;
     r2.front = tmp;

In general many things become considerably easier if you can simply 
assume that copying objects around won't be part of your complexity 
requirements or the practical costs of your algorithm.

2. It would force certain types (such as BigInt) that allocate resources 
and have value semantics to resort to reference counting.

3. It would give more legitimacy to sealed objects that return data by 
value (as opposed to reference). I believe sealed objects are very 
important for safety.

4. It would be a definite departure from C++, where all value copies are 
considered of arbitrary cost. This would provide a convenient straw-man 
for naysayers (e.g. "Hey, D calls the copy constructor even more often 
than C++! No thanks, I'll stick with C++0x which solves it all with 
rvalue references").

5. It would make things look and feel familiar to people coming from 
languages other than C++, who seldom need to define a costly this(this) 
at all.

Please discuss.


Andrei
Oct 06 2010
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
Vote++.  IMHO the problem with arbitrary cost copy construction is that
abstractions that are this leaky don't actually make people's lives simpler.
Abstractions (like value semantics) are supposed to make the code easier to
reason about.  When an abstraction forces you to think hard about things as
trivial as variable assignments, I think it's better to either scrap the
abstraction and require all copying to be explicit, or use a less leaky
abstraction like reference counting/COW.

With regard to using reference counting/COW instead of eager copying, I
understand the objection that seemingly innocent actions can throw.
However, I assume when this has been mentioned in the past, the exception
in question was out of memory.  Few programs are actually designed to
handle out of memory errors anyhow, except maybe in a very generic way very
far up the call stack.  That's why D declares out of memory to be an Error
instead of an Exception.  Therefore, this issue is IMHO a non-issue.

To play Devil's Advocate a little, though, I do think we've substantially
mitigated the bloat issue by defining default behavior for moveFront() and
friends in cases where either there's no postblit or the range returns by
reference.  This way you only need to care about these functions if you're
writing a sealed range that you plan to put structs w/ postblits in.
Oct 06 2010
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
dsimcha wrote:
 Vote++.  IMHO the problem with arbitrary cost copy construction is that
 abstractions that are this leaky don't actually make people's lives simpler.
 Abstractions (like value semantics) are supposed to make the code easier to
reason
 about.  When an abstraction forces you to think hard about things as trivial as
 variable assignments, I think it's better to either scrap the abstraction and
 require all copying to be explicit, or use a less leaky abstraction like
reference
 counting/COW.

I agree. I was very reluctant to even put in things like this(this), and the only thing that forced the issue was the need to support reference counting.
Oct 06 2010
prev sibling next sibling parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 06/10/2010 17:57, dsimcha wrote:
 Vote++.  IMHO the problem with arbitrary cost copy construction is that
 abstractions that are this leaky don't actually make people's lives simpler.
 Abstractions (like value semantics) are supposed to make the code easier to
reason
 about.  When an abstraction forces you to think hard about things as trivial as
 variable assignments, I think it's better to either scrap the abstraction and
 require all copying to be explicit, or use a less leaky abstraction like
reference
 counting/COW.

I agree with this as well. But I'm still wondering if the original copy 
construction problem couldn't be solved in some other way.

-- 
Bruno Medeiros - Software Engineer
Oct 29 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/29/10 7:55 CDT, Bruno Medeiros wrote:
 On 06/10/2010 17:57, dsimcha wrote:
 Vote++. IMHO the problem with arbitrary cost copy construction is that
 abstractions that are this leaky don't actually make people's lives
 simpler.
 Abstractions (like value semantics) are supposed to make the code
 easier to reason
 about. When an abstraction forces you to think hard about things as
 trivial as
 variable assignments, I think it's better to either scrap the
 abstraction and
 require all copying to be explicit, or use a less leaky abstraction
 like reference
 counting/COW.

I agree with this as well. But I'm still wondering if the original copy construction problem couldn't be solved in some other way.

FWIW this matter is still tormenting me. It gridlocks work on ranges, 
algorithms, and containers.

To recap:

1. Arbitrary cost copy construction:

+ Makes value types easy to define: just hook the copying code into 
this(this)
+ Is familiar to programmers coming from C++
+ If it fails, fails early, not surprisingly during a later operation
- Creates inefficiencies for innocuous-looking code
- Contorts code that wants to be efficient
- Makes containers, ranges and other abstractions bulkier and more 
difficult to define, implement, and use
- Forces exposure of raw addresses, which hurts abstraction power and safety

2. Constant-cost copy construction (via mandatory refcounting for value 
types):

+ Simplifies life virtually everywhere in client code
+ Makes code easy to reason about (no hidden inefficiencies, failure paths 
are more visible)
+ Makes generic code easier to define and more efficient for many types
- Makes value types difficult to define: a value type must store all state 
and do all manipulation indirectly, via a reference-counted pointer; all 
methods must check the pointer against null; all mutating methods must use 
a copy-on-write approach
- Needs copy-on-write (COW) approaches in defining object types, and COW is 
error-prone because it relies on the programmer remembering to always 
ensure a unique copy prior to changing the object
- Can become a notable difference from C++ (and consequently a contention 
point for C++ programmers who are considering migrating to D): arbitrary 
cost of copy construction hurts C++ in many places, but C++ has contained 
the situation with a combination of complexity, discipline, and acceptance 
of inefficiencies.

Any tie-breaking arguments, I'm all ears.

Andrei
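The COW pitfall - forgetting to ensure a unique copy before mutating - can
be illustrated with a minimal sketch (CowValue and ensureUnique are
illustrative names, and a plain slice stands in for refcounted state):

```d
// Minimal copy-on-write sketch. D slices alias their data on copy, so the
// postblit stays O(1); correctness then hinges on calling ensureUnique()
// before every mutation - exactly the discipline that is easy to forget.
struct CowValue
{
    private int[] payload;  // stands in for shared, refcounted state
    private bool  unique;   // a real design would consult the ref count

    this(int[] data) { payload = data; unique = true; }
    this(this) { unique = false; }  // copies now alias the payload: O(1)

    private void ensureUnique()
    {
        if (!unique) { payload = payload.dup; unique = true; }
    }

    void setFirst(int x)
    {
        ensureUnique();  // omit this call and the write leaks into copies
        payload[0] = x;
    }
}
```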
Oct 29 2010
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s

 Any tie-breaking arguments, I'm all ears.
 Andrei

Uh...How about that if people want C++, they know where to find it? I think
familiarity to C++ programmers is The Wrong Reason (TM) to allow arbitrary
cost copy construction. Furthermore, I don't see crufty old C++ programmers
as being more important to D than people from other backgrounds. I see D
users coming from a variety of backgrounds:

1. Crufty old C/C++ programmers.

2. People who like dynamic languages but need more speed and ability to do
low-level work. D is about the most flexible
close-to-the-metal/efficient/statically typed language out there.

3. Java/C# programmers who want a language that isn't absurdly verbose.

4. New programmers who don't have much already invested in any other
language and want something advanced, modern and w/o tons of legacy cruft.

The first **may** want eager copying. The latter three almost certainly
won't.
Oct 29 2010
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/29/10 12:15 CDT, dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s

 Any tie-breaking arguments, I'm all ears.
 Andrei

Uh...How about that if people want C++, they know where to find it? I think
familiarity to C++ programmers is The Wrong Reason (TM) to allow arbitrary
cost copy construction. Furthermore, I don't see crufty old C++ programmers
as being more important to D than people from other backgrounds. I see D
users coming from a variety of backgrounds:

1. Crufty old C/C++ programmers.

2. People who like dynamic languages but need more speed and ability to do
low-level work. D is about the most flexible
close-to-the-metal/efficient/statically typed language out there.

3. Java/C# programmers who want a language that isn't absurdly verbose.

4. New programmers who don't have much already invested in any other
language and want something advanced, modern and w/o tons of legacy cruft.

The first **may** want eager copying. The latter three almost certainly
won't.

Not all C++ programmers are crufty and old :o). Anyhow, there are other
advantages to arbitrary cost copy construction, as I specified. For what
it's worth, if eliminating it made things overall easier, C++ programmers
would have loved it. It's the liabilities I'm worried about.

Andrei
Oct 29 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/29/10 12:53 CDT, Jonathan M Davis wrote:
 On Friday, October 29, 2010 10:15:09 dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s

 Any tie-breaking arguments, I'm all ears.
 Andrei

Uh...How about that if people want C++, they know where to find it? I think
familiarity to C++ programmers is The Wrong Reason (TM) to allow arbitrary
cost copy construction. Furthermore, I don't see crufty old C++ programmers
as being more important to D than people from other backgrounds. I see D
users coming from a variety of backgrounds:

1. Crufty old C/C++ programmers.

2. People who like dynamic languages but need more speed and ability to do
low-level work. D is about the most flexible
close-to-the-metal/efficient/statically typed language out there.

3. Java/C# programmers who want a language that isn't absurdly verbose.

4. New programmers who don't have much already invested in any other
language and want something advanced, modern and w/o tons of legacy cruft.

The first **may** want eager copying. The latter three almost certainly
won't.

Except that how many of those other languages _have_ copy construction?
Java doesn't. C#'s classes obviously don't, and IIRC, their structs don't
either (though they may - it's been a while since I used C#). If we're
talking about a feature that's pretty much only in C++, then it's a
question of whether the C++ programmers are going to be confused and/or
surprised and whether someone who has never dealt with such a feature would
be confused and/or surprised. Also, for the non-C++ folks who aren't
familiar with a feature, they're not going to know about it anyway, so
whether you follow C++ or not is irrelevant to them while it is _not_
irrelevant to the C++ programmers.

By no means do I think that we should do everything in a particular way
just because C++ did it, but particularly when dealing with a feature that
is C++-centric, we need to be aware of how what we do will affect C++
programmers.

If the feature had nothing else going for it except being in C++, it would
have been eliminated. But value types are a very solid abstraction. We
definitely want to support them in D.

All - please do not take only one isolated argument from those I listed and
focus discussion on it alone. It can only lead to stilted views.
 As for eager-copying, it's a _value type_, of _course_ it's going to be copied
 eagerly. ints, floats, bools, etc. _all_ copy eagerly. How can you expect
 anything else? It's quite explicit that that's how value types work. So, if you
 write a struct where copying is expensive, you're asking for trouble and should
 know it. The only real issue with it that I see is the non-obvious places where
 copying takes place. But if you tried to make it so that structs didn't copy
 eagerly, then you're making it so that they're not value types anymore, which
 will increase overhead and goes against what structs are meant for.

In fact reference counting allows you to define value types with cheap copy construction, so the above is incorrect.
 C++ doesn't stop you or help you from being stupid about expensive copy-
 construction. Why should D? Sure, it sucks when your program is slow, but we
 have classes which are reference types and don't have this problem. So, if you
 want a reference type, there it is. I don't see how anyone could expect
 expensive copying not to result in poor performance for a _value type_.

That means ranges need to define moveFront, moveBack, moveAt, and countless
other contortions in library code and client code alike. I think the view
you're conveying is overly simplistic.

Andrei
Oct 29 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/29/10 13:19 CDT, Jonathan M Davis wrote:
 I'm arguing that you do _not_ define those and that you let user code fend for
 itself. If a programmer is foolish enough to write code with overly expensive
 postblit constructors, then they're just shooting themselves in the foot.

I think it would be hasty to classify them as foolish. Expensive copy
constructors are meant to protect an abstraction, not some questionable
practice.

In brief, your stance gets back to refcounting: a programmer must choose to
use refcounting whenever they have expensive state to manipulate.

Andrei
Oct 29 2010
prev sibling next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 To recap:
 1. Arbitrary cost copy construction:
 + Makes value types easy to define: just hook the copying code into
 this(this)
 + Is familiar to programmers coming from C++
 + If it fails, fails early, not surprisingly during a later operation

BTW, I don't see why failing during a variable assignment is any less bad than failing during any other seemingly innocuous operation.
Oct 29 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/29/10 12:18 CDT, dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 To recap:
 1. Arbitrary cost copy construction:
 + Makes value types easy to define: just hook the copying code into
 this(this)
 + Is familiar to programmers coming from C++
 + If it fails, fails early, not surprisingly during a later operation

BTW, I don't see why failing during a variable assignment is any less bad than failing during any other seemingly innocuous operation.

One problem is that copying is often implicit and as such more difficult to
see with the naked eye.

Andrei
Oct 29 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/29/10 12:21 CDT, Jonathan M Davis wrote:
 Honestly, what I expect to happen if constant-copy cost is expected is that
 code won't be written that way, and the code that expects a constant-copy
 cost is going to be slow. Really, I'd say that in the general case, either
 arbitrary copy construction is going to be expected, and there will be code
 which assumes constant-copy cost and so is slow, or constant-copy cost
 construction is going to be expected, and any postblits which don't have a
 constant-copy cost are going to be inefficient in various places.

 I really don't think that you have any hope of enforcing either scheme.

That is correct. I don't want to enforce so much as to choose one stance
and stick with it throughout the standard library. The STL consistently
makes the following stated assumptions:

1. Functors and iterators are cheap to copy

2. General objects (including containers) are arbitrarily expensive to copy

Once the STL chose a way, everyone using it could figure out a way to use
it gainfully, or to debug performance problems if there were any.
 Programmers just won't think about it in the general case. So, unless the
 "mandatory refcounting for value type" (and I have no clue how you'd do
 that) is actually enforced, you can't enforce either, and you _will_ have
 inefficiencies. It's just a question of where and what kind.

Walter and I have been seriously discussing introducing refcounted, which
would automate part of the process. The problem is you can't automate all
or even most of it. Either you must assume client code uses const
consistently, or that client code inserts manual calls whenever they plan
to change the object.
 If anything, I'm inclined to say that we assume that the postblit is O(1)
 and let the programmer worry about any inefficiencies. We can point out
 that anything worse than O(1) will be a performance problem, but it seems
 to me that any attempt to either accommodate arbitrary cost postblit
 constructors or to try and use any kind of scheme which forces programmers
 to write postblits in a certain way is too complicated and doomed to
 failure. And even if it works, it will be highly annoying to deal with.

It sure is annoying, but it does work.

Don, can you estimate how difficult it would be to convert BigInt to a
refcounted implementation?

Andrei
Oct 29 2010
parent reply Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 If anything, I'm inclined to say that we assume that the postblit is 
 O(1) and
 let the programmer worry about any inefficiencies. We can point out 
 that anything
 worse that O(1) will be a performance problem, but it seems to me that 
 any
 attempt to either accomodate arbitrary cost postblit constructors or 
 to try and
 use any kind of scheme which forces programmers to write postblits in 
 a certain
 way is too complicated and doomed to failure. And even if it works, it 
 will be
 highly annoying to deal with.

It sure is annoying, but it does work. Don, can you estimate how difficult it would be to convert BigInt to a refcounted implementation?

At the moment, I think it's impossible. Has anyone successfully implemented
refcounting in D? As long as bug 3516 (Destructor not called on
temporaries) remains open, it doesn't seem to be possible. Is that the only
blocker, or are there others?
Oct 30 2010
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Don (nospam nospam.com)'s article
 Andrei Alexandrescu wrote:
 If anything, I'm inclined to say that we assume that the postblit is
 O(1) and
 let the programmer worry about any inefficiencies. We can point out
 that anything
 worse that O(1) will be a performance problem, but it seems to me that
 any
 attempt to either accomodate arbitrary cost postblit constructors or
 to try and
 use any kind of scheme which forces programmers to write postblits in
 a certain
 way is too complicated and doomed to failure. And even if it works, it
 will be
 highly annoying to deal with.

It sure is annoying, but it does work. Don, can you estimate how difficult it would be to convert BigInt to a refcounted implementation?

Has anyone succesfully implemented refcounting in D? As long as bug 3516 (Destructor not called on temporaries) remains open, it doesn't seem to be possible. Is that the only blocker, or are there others?

Can someone please clarify something for me? I thought BigInt was already COW (though I guess COW w/o ref counting is still pretty inefficient).
Oct 30 2010
parent reply Don <nospam nospam.com> writes:
dsimcha wrote:
 == Quote from Don (nospam nospam.com)'s article
 Andrei Alexandrescu wrote:
 If anything, I'm inclined to say that we assume that the postblit is
 O(1) and
 let the programmer worry about any inefficiencies. We can point out
 that anything
 worse that O(1) will be a performance problem, but it seems to me that
 any
 attempt to either accomodate arbitrary cost postblit constructors or
 to try and
 use any kind of scheme which forces programmers to write postblits in
 a certain
 way is too complicated and doomed to failure. And even if it works, it
 will be
 highly annoying to deal with.

Don, can you estimate how difficult it would be to convert BigInt to a refcounted implementation?

Has anyone succesfully implemented refcounting in D? As long as bug 3516 (Destructor not called on temporaries) remains open, it doesn't seem to be possible. Is that the only blocker, or are there others?

Can someone please clarify something for me? I thought BigInt was already COW (though I guess COW w/o ref counting is still pretty inefficient).

Yes, it's COW, but without refcounting. Which isn't really too terrible,
because very few BigInt operations can be performed in-place. Even ++x can
trigger a memory allocation.

The most important issues would be fixed with a small-value optimisation
(values x where -long.max <= x <= long.max stored in the struct).
Oct 30 2010
parent reply Walter Bright <newshound2 digitalmars.com> writes:
Don wrote:
 The most important issues would be fixed with a small-value optimisation 
 (values x where -long.max <= x <= long.max stored in the struct).

Drat, I thought it already did that.

I think it's also important to make BigInt.sizeof == ulong.sizeof. To that
end, you can use the least significant bit to mean "this is not a pointer",
as in a 63 bit integer << 1. (Pointers will always be aligned.)
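A sketch of the tagging scheme Walter describes, under the assumption of a
64-bit target where heap pointers are at least 2-byte aligned (SmallOrBig
and its members are illustrative names, not BigInt's actual layout):

```d
// One machine word holds either (value << 1) | 1 for small integers, or a
// pointer to heap-allocated big-number data (whose low bit is always 0
// thanks to alignment). The shift costs one bit, hence the 63-bit range.
struct SmallOrBig
{
    size_t bits;

    static SmallOrBig fromSmall(long v)
    {
        SmallOrBig r;
        r.bits = (cast(size_t) v << 1) | 1;  // tag the low bit
        return r;
    }

    @property bool isSmall() const { return (bits & 1) != 0; }

    @property long smallValue() const
    {
        assert(isSmall());
        return cast(long) bits >> 1;  // arithmetic shift restores the sign
    }
}
```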
Oct 30 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Walter Bright:

 I think it's also important to make BigInt.sizeof == ulong.sizeof. To that
end, 
 you can use the least significant bit to mean "this is not a pointer", as in a 
 63 bit integer << 1.

A long time ago I instead suggested BigInt.sizeof == size_t.sizeof, because
in most programs you need integers smaller than 63 bits; 31 bits are
enough. This increases performance a little on 32 bit systems.

Bye,
bearophile
Oct 30 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
 Lot of time ago I have instead suggest BigInt.sizeof == size_t.sizeof, because
in most programs you need integers smaller than 63 bit, 31 bits are enough.
This increases the performance a little on 32 bit systems.

The compiler also has to make sure the tests for out-of-smallint-range are
always inlined, otherwise you are mostly wasting your time optimizing
BigInt for small integers.

Bye,
bearophile
Oct 30 2010
parent Don <nospam nospam.com> writes:
bearophile wrote:
 Lot of time ago I have instead suggest BigInt.sizeof == size_t.sizeof, because
in most programs you need integers smaller than 63 bit, 31 bits are enough.
This increases the performance a little on 32 bit systems.

The compiler also has to make sure the tests for out-of-smallint-range are always inlined, otherwise you are mostly wasting your time optimizing BigInt for small integers. Bye, bearophile

The big savings come from avoiding memory allocation.
Oct 30 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/30/10 2:24 CDT, Don wrote:
 Andrei Alexandrescu wrote:
 If anything, I'm inclined to say that we assume that the postblit is
 O(1) and
 let the programmer worry about any inefficiencies. We can point out
 that anything
 worse that O(1) will be a performance problem, but it seems to me
 that any
 attempt to either accomodate arbitrary cost postblit constructors or
 to try and
 use any kind of scheme which forces programmers to write postblits in
 a certain
 way is too complicated and doomed to failure. And even if it works,
 it will be
 highly annoying to deal with.

It sure is annoying, but it does work. Don, can you estimate how difficult it would be to convert BigInt to a refcounted implementation?

At the moment, I think it's impossible. Has anyone succesfully implemented refcounting in D? As long as bug 3516 (Destructor not called on temporaries) remains open, it doesn't seem to be possible. Is that the only blocker, or are there others?

I managed to define and use RefCounted in Phobos. File also uses hand-made
reference counting. I think RefCounted is a pretty good abstraction (unless
it hits the bug you mentioned).

Andrei
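For reference, a minimal usage sketch of std.typecons.RefCounted; the exact
API details here are from memory and may differ between Phobos versions:

```d
import std.typecons : RefCounted;

void main()
{
    auto a = RefCounted!int(42);  // payload allocated once, count = 1
    auto b = a;                   // postblit only bumps the count: O(1) copy
    b.refCountedPayload = 5;      // the payload is shared (no COW), so the
                                  // write is visible through 'a' as well
    assert(a.refCountedPayload == 5);
}
```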
Oct 30 2010
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-10-30 20:49:38 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 10/30/10 2:24 CDT, Don wrote:
 At the moment, I think it's impossible.
 Has anyone succesfully implemented refcounting in D? As long as bug 3516
 (Destructor not called on temporaries) remains open, it doesn't seem to
 be possible.
 Is that the only blocker, or are there others?

I managed to define and use RefCounted in Phobos. File also uses hand-made reference counting. I think RefCounted is a pretty good abstraction (unless it hits the bug you mentioned.)

I like the idea of RefCounted as a way to automatically make things
reference counted. But like File and many similar ref-counted structs, it
has this race condition (bug 4624) when stored inside the GC heap.
Currently, most of Phobos's ref-counted structs are race-free only when
they reside on the stack or if your program has only one thread (because
the GC doesn't spawn threads, if I'm correct).

It's a little sad that the language doesn't prevent races in destructors
(bug 4621).

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Oct 30 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/30/2010 09:40 PM, Michel Fortin wrote:
 On 2010-10-30 20:49:38 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 On 10/30/10 2:24 CDT, Don wrote:
 At the moment, I think it's impossible.
 Has anyone succesfully implemented refcounting in D? As long as bug 3516
 (Destructor not called on temporaries) remains open, it doesn't seem to
 be possible.
 Is that the only blocker, or are there others?

I managed to define and use RefCounted in Phobos. File also uses hand-made reference counting. I think RefCounted is a pretty good abstraction (unless it hits the bug you mentioned.)

I like the idea of RefCounted as a way to automatically make things reference counted.

Unfortunately it's only a semi-automated mechanism.
 But like File and many similar ref-counted structs, it has this race
 condition (bug 4624) when stored inside the GC heap. Currently, most of
 Phobos's ref-counted structs are race-free only when they reside on the
 stack or if your program has only one thread (because the GC doesn't
 spawn threads if I'm correct).

 It's a little sad that the language doesn't prevent races in destructors
 (bug 4621).

I hope we're able to solve these implementation issues, which can be seen
as independent from the decision at hand.

Walter and I discussed the matter again today and we're on the brink of
deciding that cheap copy construction is to be assumed. This simplifies the
language and the library a great deal, and makes it perfectly good for 95%
of the cases. For a minority of types, code would need to go through extra
hoops (e.g. COW, refcounting) to be compliant.

I'm looking for more feedback from the larger D community. This is a very
important decision that marks one of the largest departures from the C++
style. Taking the wrong turn here could alienate many programmers coming
from C++.

So, everybody - this is really the time to speak up or forever be silent.

Andrei
Oct 30 2010
next sibling parent Rainer Deyke <rainerd eldwood.com> writes:
On 10/30/2010 21:56, Andrei Alexandrescu wrote:
 Walter and I discussed the matter again today and we're on the brink of
 deciding that cheap copy construction is to be assumed. This simplifies
 the language and the library a great deal, and makes it perfectly good
 for 95% of the cases. For a minority of types, code would need to go
 through extra hoops (e.g. COW, refcounting) to be compliant.

For what it's worth, I've used the ref-counting/COW style even in C++. C++
might not assume that copy construction is cheap, but it certainly performs
better when it is.

-- 
Rainer Deyke - rainerd eldwood.com
Oct 30 2010
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 31.10.2010 6:56, Andrei Alexandrescu wrote:
[snip]
 Walter and I discussed the matter again today and we're on the brink 
 of deciding that cheap copy construction is to be assumed. This 
 simplifies the language and the library a great deal, and makes it 
 perfectly good for 95% of the cases. For a minority of types, code 
 would need to go through extra hoops (e.g. COW, refcounting) to be 
 compliant.

 I'm looking for more feedback from the larger D community. This is a 
 very important decision that marks one of the largest departures from 
 the C++ style. Taking the wrong turn here could alienate many 
 programmers coming from C++.

 So, everybody - this is really the time to speak up or forever be silent.


 Andrei

The only things I've ever seen in C++ _correctly_ using a _costly_ copy
constructor are containers and some bignums, but then most of the time one
needs to define swap for it to perform well in std::sort, for instance, and
that is oftentimes forgotten. That indicates that containers are better off
being reference types, which is what they are in D anyway, and bignums I do
believe could use COW/small-number optimization.

I'd go with cheap constructors assumed, given that it simplifies matters a
lot.

-- 
Dmitry Olshansky
Oct 31 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/31/10 3:38 AM, Dmitry Olshansky wrote:
 On 31.10.2010 6:56, Andrei Alexandrescu wrote:
 [snip]
 Walter and I discussed the matter again today and we're on the brink
 of deciding that cheap copy construction is to be assumed. This
 simplifies the language and the library a great deal, and makes it
 perfectly good for 95% of the cases. For a minority of types, code
 would need to go through extra hoops (e.g. COW, refcounting) to be
 compliant.

 I'm looking for more feedback from the larger D community. This is a
 very important decision that marks one of the largest departures from
 the C++ style. Taking the wrong turn here could alienate many
 programmers coming from C++.

 So, everybody - this is really the time to speak up or forever be silent.


 Andrei

The only things I've ever seen in C++ _correctly_ using a _costly_ copy constructor are containers and some bignums, but then most of the time one needs to define swap for it to perform well in std::sort for instance, and that is oftentimes forgotten. That indicates that containers are better off being reference types, which is what they are in D anyway, and bignums I do believe could use COW/small-number optimization. I'd go with cheap constructors assumed, given that it simplifies matters a lot.

I've been thinking of this too - costly copy construction requires constant attention in client and library code alike (e.g. STL containers) or is flat out in bad taste (e.g. the mythical Matrix class that returns copies from operators). Andrei
Oct 31 2010
Michel Fortin <michel.fortin michelf.com> writes:
On 2010-10-30 23:56:24 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 10/30/2010 09:40 PM, Michel Fortin wrote:
 
 But like File and many similar ref-counted structs, it has this race
 condition (bug 4624) when stored inside the GC heap. Currently, most of
 Phobos's ref-counted structs are race-free only when they reside on the
 stack or if your program has only one thread (because the GC doesn't
 spawn threads if I'm correct).
 
 It's a little sad that the language doesn't prevent races in destructors
 (bug 4621).

I hope we're able to solve these implementation issues that can be seen as independent from the decision at hand.

Whether they're independent or not depends on the assumptions behind your question. Since everywhere you propose ruling out arbitrary-cost copy construction you also suggest COW with a reference counter, my understanding is that you assume COW is usable and efficient.

Now, perhaps this isn't bothering you, but keep in mind that std::string in C++ was originally specified in a way that allows COW, and this ability was removed in C++0x because it isn't efficient in a multithreaded environment. What makes you think it'll work differently for D?

Fixing the bugs above will either 1) affect the performance of the ref-counted things (need to use atomic ops on the counter) or 2) restrict usage of RefCounted and others to non-GC memory for multithreaded programs (as the current design does, even though the compiler doesn't warn you). This is a drawback you should consider before asking everyone to use copy on write with reference counting. I haven't seen you acknowledge it yet.
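The cost Michel alludes to can be seen by putting the two refcount flavours side by side (a sketch with invented names, not Phobos code): a provably thread-local counter gets away with a plain increment, while a counter that any other thread might touch - say, a GC finalizer thread running the destructor - needs an atomic read-modify-write on every copy and release, which is markedly more expensive on most hardware.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Thread-local flavour: plain ops, valid only when no sharing can occur.
struct LocalCount {
    std::size_t n = 1;
    void retain()  { ++n; }
    bool release() { return --n == 0; }    // true when the owner count hits zero
};

// Shareable flavour: every operation is an atomic read-modify-write.
struct SharedCount {
    std::atomic<std::size_t> n{1};
    void retain()  { n.fetch_add(1, std::memory_order_relaxed); }
    bool release() {
        // acq_rel so the thread that sees the count reach zero also sees
        // all writes made by the other owners before they released.
        return n.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }
};
```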
 Walter and I discussed the matter again today and we're on the brink of 
 deciding that cheap copy construction is to be assumed. This simplifies 
 the language and the library a great deal, and makes it perfectly good 
 for 95% of the cases. For a minority of types, code would need to go 
 through extra hoops (e.g. COW, refcounting) to be compliant.

I take note that disabling postblit and forcing the use of a separate copy function would also be compliant. :-)
 I'm looking for more feedback from the larger D community. This is a 
 very important decision that marks one of the largest departures from 
 the C++ style. Taking the wrong turn here could alienate many 
 programmers coming from C++.

As far as I understand, nothing would prevent me from writing a value type with arbitrary-cost copy construction if I wanted to. And someone could have good reasons to do this (porting a C++ app for instance). I'm not opposed to the idea of ruling out arbitrary-cost postblits, but I fear it might be futile. If std::string is any indication, COW doesn't scale well with multithreading. There is a way to make COW efficient for thread-local objects in the GC-heap, but that would require thread-local GC heaps, and this has been ruled out in the design of 'shared' and 'immutable'. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Oct 31 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/31/10 5:40 AM, Michel Fortin wrote:
 On 2010-10-30 23:56:24 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 On 10/30/2010 09:40 PM, Michel Fortin wrote:

 But like File and many similar ref-counted structs, it has this race
 condition (bug 4624) when stored inside the GC heap. Currently, most of
 Phobos's ref-counted structs are race-free only when they reside on the
 stack or if your program has only one thread (because the GC doesn't
 spawn threads if I'm correct).

 It's a little sad that the language doesn't prevent races in destructors
 (bug 4621).

I hope we're able to solve these implementation issues that can be seen as independent from the decision at hand.

Whether they're independent or not depends on the assumptions behind your question. Since everywhere you propose ruling out arbitrary-cost copy construction you also suggest COW with a reference counter, my understanding is that you assume COW is usable and efficient.

Now, perhaps this isn't bothering you, but keep in mind that std::string in C++ was originally specified in a way that allows COW, and this ability was removed in C++0x because it isn't efficient in a multithreaded environment. What makes you think it'll work differently for D?

Fixing the bugs above will either 1) affect the performance of the ref-counted things (need to use atomic ops on the counter) or 2) restrict usage of RefCounted and others to non-GC memory for multithreaded programs (as the current design does, even though the compiler doesn't warn you). This is a drawback you should consider before asking everyone to use copy on write with reference counting. I haven't seen you acknowledge it yet.
 Walter and I discussed the matter again today and we're on the brink
 of deciding that cheap copy construction is to be assumed. This
 simplifies the language and the library a great deal, and makes it
 perfectly good for 95% of the cases. For a minority of types, code
 would need to go through extra hoops (e.g. COW, refcounting) to be
 compliant.

I take note that disabling postblit and forcing the use of a separate copy function would also be compliant. :-)
 I'm looking for more feedback from the larger D community. This is a
 very important decision that marks one of the largest departures from
 the C++ style. Taking the wrong turn here could alienate many
 programmers coming from C++.

As far as I understand, nothing would prevent me from writing a value type with arbitrary-cost copy construction if I wanted to. And someone could have good reasons to do this (porting a C++ app for instance). I'm not opposed to the idea of ruling out arbitrary-cost postblits, but I fear it might be futile. If std::string is any indication, COW doesn't scale well with multithreading. There is a way to make COW efficient for thread-local objects in the GC-heap, but that would require thread-local GC heaps, and this has been ruled out in the design of 'shared' and 'immutable'.

D's approach to concurrency is predicated on distinguishing shared data from unshared data statically (by means of the shared qualifier). As such, unshared types can always use non-atomic reference count manipulation in confidence that there is no undue sharing going on.

Andrei
Oct 31 2010
Michel Fortin <michel.fortin michelf.com> writes:
On 2010-10-31 12:39:10 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 10/31/10 5:40 AM, Michel Fortin wrote:
 I'm not opposed to the idea of ruling out arbitrary-cost postblits, but
 I fear it might be futile. If std::string is any indication, COW doesn't
 scale well with multithreading. There is a way to make COW efficient for
 thread-local objects in the GC-heap, but that would require thread-local
 GC heaps, and this has been ruled out in the design of 'shared' and
 'immutable'.

D's approach to concurrency is predicated by distinguishing shared data from unshared data statically (by means of the shared qualifier).

Yes, and that was a very good idea.
 As such, unshared types can always use non-atomic reference count 
 manipulation in confidence that there is no undue sharing going on.

And it works, except in a destructor when that destructor is called by the GC (because the GC can call a destructor from another thread). The discussion of 4621 shows that the compiler can't detect the races that might arise if you dereference a member in a destructor. If the compiler could be made to prevent those races, all instances of bug 4624 would be caught at compile-time and you'd have to either:

1. use a shared reference counter with atomic operations (and possibly cast your way around), or
2. restrict those structs to the stack and non-GC allocated memory.

That's the conclusion from the discussion of bug 4621.

A third way would be to make the GC call a destructor in the thread that owns the memory block, but how should the GC determine which thread owns which block? Is memory always owned by the thread that allocated it? And does that mean the destructor will be postponed until that thread asks for a new memory allocation? Perhaps that's what we should do.

-- Michel Fortin michel.fortin michelf.com http://michelf.com/
Oct 31 2010
Michel Fortin <michel.fortin michelf.com> writes:
On 2010-10-30 23:56:24 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Walter and I discussed the matter again today and we're on the brink of 
 deciding that cheap copy construction is to be assumed. This simplifies 
 the language and the library a great deal, and makes it perfectly good 
 for 95% of the cases. For a minority of types, code would need to go 
 through extra hoops (e.g. COW, refcounting) to be compliant.

A simple question: can a reference counter work with const and immutable?

const(S) func(ref const(S) s) {
    return s; // this should increment the reference counter, but can we bypass const?
}

Bypassing const/immutable could work, but if your data is immutable and resides in read-only memory you'll get a crash.

-- Michel Fortin michel.fortin michelf.com http://michelf.com/
Oct 31 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/31/10 8:04 AM, Michel Fortin wrote:
 On 2010-10-30 23:56:24 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 Walter and I discussed the matter again today and we're on the brink
 of deciding that cheap copy construction is to be assumed. This
 simplifies the language and the library a great deal, and makes it
 perfectly good for 95% of the cases. For a minority of types, code
 would need to go through extra hoops (e.g. COW, refcounting) to be
 compliant.

A simple question: can a reference counter work with const and immutable?

const(S) func(ref const(S) s) {
    return s; // this should increment the reference counter, but can we bypass const?
}

Bypassing const/immutable could work, but if your data is immutable and resides in read-only memory you'll get a crash.

There are several solutions possible, some that require the compiler knowing about the idiom, and some relying on trusted code. One in the latter category is to create immutable objects with an unattainable reference count (e.g. size_t.max) and then increment the reference count only if it's not equal to that value. That adds one more test for code that copies a const object, but I think it's acceptable.

Andrei
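This trusted-code idiom can be sketched in C++ terms (names invented; D's size_t.max becomes SIZE_MAX here): immutable objects are created with the unattainable count, and the copy and release paths skip any object holding that sentinel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sentinel meaning "immutable, never counted" - the unattainable value.
constexpr std::size_t kImmutableMark = SIZE_MAX;

struct Counted {
    std::size_t refs;
    explicit Counted(bool immutable) : refs(immutable ? kImmutableMark : 1) {}

    void addRef() {
        if (refs != kImmutableMark)      // the extra test mentioned above
            ++refs;
    }
    // True when the last mutable reference goes away; sentinel objects are
    // never reported as dead, so they are never reclaimed through this path.
    bool release() {
        return refs != kImmutableMark && --refs == 0;
    }
};
```

Note that the sentinel also illustrates Rainer's objection downthread: an object marked immutable never observes its count reach zero, so it has no deterministic destruction point.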
Oct 31 2010
Rainer Deyke <rainerd eldwood.com> writes:
On 10/31/2010 10:42, Andrei Alexandrescu wrote:
 There are several solutions possible, some that require the compiler
 knowing about the idiom, and some relying on trusted code. One in the
 latter category is to create immutable objects with an unattainable
 reference count (e.g. size_t.max) and then incrementing the reference
 count only if it's not equal to that value. That adds one more test for
 code that copies const object, but I think it's acceptable.

This means that immutable objects do not have deterministic destructors. I don't think this is acceptable. -- Rainer Deyke - rainerd eldwood.com
Oct 31 2010
Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 31/10/2010 16:42, Andrei Alexandrescu wrote:
 On 10/31/10 8:04 AM, Michel Fortin wrote:
 On 2010-10-30 23:56:24 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 Walter and I discussed the matter again today and we're on the brink
 of deciding that cheap copy construction is to be assumed. This
 simplifies the language and the library a great deal, and makes it
 perfectly good for 95% of the cases. For a minority of types, code
 would need to go through extra hoops (e.g. COW, refcounting) to be
 compliant.

A simple question: can a reference counter work with const and immutable?

const(S) func(ref const(S) s) {
    return s; // this should increment the reference counter, but can we bypass const?
}

Bypassing const/immutable could work, but if your data is immutable and resides in read-only memory you'll get a crash.

There are several solutions possible, some that require the compiler knowing about the idiom, and some relying on trusted code. One in the latter category is to create immutable objects with an unattainable reference count (e.g. size_t.max) and then increment the reference count only if it's not equal to that value. That adds one more test for code that copies a const object, but I think it's acceptable. Andrei

Ehh? So here we would have logical const instead of strict const? Feels a bit like this could be an opening of Pandora's box... -- Bruno Medeiros - Software Engineer
Nov 01 2010
Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 31/10/2010 03:56, Andrei Alexandrescu wrote:
 On 10/30/2010 09:40 PM, Michel Fortin wrote:
 On 2010-10-30 20:49:38 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 On 10/30/10 2:24 CDT, Don wrote:
 At the moment, I think it's impossible.
 Has anyone succesfully implemented refcounting in D? As long as bug
 3516
 (Destructor not called on temporaries) remains open, it doesn't seem to
 be possible.
 Is that the only blocker, or are there others?

I managed to define and use RefCounted in Phobos. File also uses hand-made reference counting. I think RefCounted is a pretty good abstraction (unless it hits the bug you mentioned.)

I like the idea of RefCounted as a way to automatically make things reference counted.

Unfortunately it's only a semi-automated mechanism.
 But like File and many similar ref-counted structs, it has this race
 condition (bug 4624) when stored inside the GC heap. Currently, most of
 Phobos's ref-counted structs are race-free only when they reside on the
 stack or if your program has only one thread (because the GC doesn't
 spawn threads if I'm correct).

 It's a little sad that the language doesn't prevent races in destructors
 (bug 4621).

I hope we're able to solve these implementation issues that can be seen as independent from the decision at hand.

Walter and I discussed the matter again today and we're on the brink of deciding that cheap copy construction is to be assumed. This simplifies the language and the library a great deal, and makes it perfectly good for 95% of the cases. For a minority of types, code would need to go through extra hoops (e.g. COW, refcounting) to be compliant.

I'm looking for more feedback from the larger D community. This is a very important decision that marks one of the largest departures from the C++ style. Taking the wrong turn here could alienate many programmers coming from C++.

So, everybody - this is really the time to speak up or forever be silent.

Andrei

I would also go for "2. Constant-cost copy construction", but I can't really make a case for it. I can only state that it seems to me that the benefits in library simplification (both in implementation and API) would be far more valuable than the drawbacks ("Makes value types difficult to define").

It should be considered that those 5% of types (large value types) will not strictly need refcounting+COW support to be usable in containers and algorithms: just store pointers-to-the-value-type in the container, instead of using the value type directly. (Basically, use them as reference types.) So it's not the end of the world for some value type if it doesn't implement refcounting+COW.

-- Bruno Medeiros - Software Engineer
Nov 01 2010
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, October 29, 2010 10:15:09 dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s
 
 Any tie-breaking arguments, I'm all ears.
 Andrei

Uh...How about that if people want C++, they know where to find it? I think familiarity to C++ programmers is The Wrong Reason (TM) to allow arbitrary cost copy construction. Furthermore, I don't see crufty old C++ programmers as being more important to D than people from other backgrounds. I see D users coming from a variety of backgrounds:

1. Crufty old C/C++ programmers.

2. People who like dynamic languages but need more speed and ability to do low-level work. D is about the most flexible close-to-the-metal/efficient/statically typed language out there.

3. Java/C# programmers who want a language that isn't absurdly verbose.

4. New programmers who don't have much already invested in any other language and want something advanced, modern and w/o tons of legacy cruft.

The first **may** want eager copying. The latter three almost certainly won't.

Except that how many of those other languages _have_ copy construction? Java doesn't. C#'s classes obviously don't, and IIRC, their structs don't either (though they may - it's been a while since I used C#). If we're talking about a feature that's pretty much only in C++, then it's a question of whether the C++ programmers are going to be confused and/or surprised and whether someone who has never dealt with such a feature would be confused and/or surprised. Also, for the non-C++ folks who aren't familiar with a feature, they're not going to know about it anyway, so whether you follow C++ or not is irrelevant to them while it is _not_ irrelevant to the C++ programmers. By no means do I think that we should do everything in a particular way just because C++ did it, but particularly when dealing with a feature that is C++-centric, we need to be aware of how what we do will affect C++ programmers.

As for eager copying, it's a _value type_, of _course_ it's going to be copied eagerly. ints, floats, bools, etc. _all_ copy eagerly. How can you expect anything else? It's quite explicit that that's how value types work. So, if you write a struct where copying is expensive, you're asking for trouble and should know it. The only real issue with it that I see is the non-obvious places where copying takes place. But if you tried to make it so that structs didn't copy eagerly, then you're making it so that they're not value types anymore, which will increase overhead and goes against what structs are meant for.

C++ doesn't stop you from being stupid about expensive copy construction, nor does it help you avoid it. Why should D? Sure, it sucks when your program is slow, but we have classes, which are reference types and don't have this problem. So, if you want a reference type, there it is. I don't see how anyone could expect expensive copying not to result in poor performance for a _value type_.

- Jonathan M Davis
Oct 29 2010
Michel Fortin <michel.fortin michelf.com> writes:
On 2010-10-29 11:59:51 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 FWIW this matter is still tormenting me. It gridlocks work on ranges, 
 algorithms, and containers.
 
 To recap:
 
 1. Arbitrary cost copy construction:
 
 + Makes value types easy to define: just hook the copying code into this(this)
 + Is familiar to programmers coming from C++
 + If it fails, fails early, not surprisingly during a later operation
 - Creates inefficiencies for innocuous-looking code
 - Contorts code that wants to be efficient
 - Makes containers, ranges and other abstractions bulkier and more 
 difficult to define, implement, and use
 - Forces exposure of raw addresses, which hurt abstraction power and safety
 
 2. Constant-cost copy construction (via mandatory refcounting for value types):
 
 + Simplifies life virtually everywhere in client code
 + Makes code easy to reason about (no hidden inefficiencies, failure 
 paths are more visible)
 + Makes generic code easier to define and more efficient for many types
 - Makes value types difficult to define: a value type must store all 
 state and do all manipulation indirectly, via a reference-counted 
 pointer; all methods must check the pointer against null; all mutating 
 methods must use a copy-on-write approach
 - Needs copy-on-write (COW) approaches in defining object types and COW 
 is error-prone because it relies on the programmer remembering to 
 always ensure a unique copy prior to changing the object
 - Can become a notable difference from C++ (and consequently a 
 contention point for C++ programmers who are considering migrating to 
 D): arbitrary cost of construction hurts C++ in many places, but it has 
 contained the situation with a combination of complexity, discipline, 
 and acceptance of inefficiencies.

Most instances of ref-counted structs in Phobos contain race conditions when put on the GC heap. ;-( That's actually a big drawback. Perhaps you should look at what fixing this implies before making this advantage/disadvantage list, as it might get a new negative point (atomic op during copy).

<http://d.puremagic.com/issues/show_bug.cgi?id=4624>

Anyway, my preferred solution to this problem is this third option:

3. Types with arbitrary-cost copying disable this(this) and define an explicit dup() function.

+ Can easily implement either 2 (ref counting) or 1 (arbitrary cost copy constructor) on top of it with a standard template, so you can choose which tradeoff is best for you.
+ Can live on the stack (ref counting prohibits that).
+ If it fails, it fails either at compile time when trying to copy implicitly or early at runtime where you call dup explicitly. At this point you can either wrap it in solution 2 (ref counted template) or 1 (arbitrary copy template), or you can change your code to just bypass the problem.
- Too much flexibility might be harder to grasp.

-- Michel Fortin michel.fortin michelf.com http://michelf.com/
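Michel's option 3 can be sketched in C++ terms, with a deleted copy constructor standing in for a disabled this(this) (the type is invented for illustration): implicit copies become compile errors, and the arbitrary-cost copy must be requested by name.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Copying is disabled outright; the arbitrary-cost copy is explicit.
struct Buffer {
    std::vector<int> data;

    explicit Buffer(std::vector<int> d) : data(std::move(d)) {}
    Buffer(const Buffer&) = delete;              // implicit copy: compile error
    Buffer& operator=(const Buffer&) = delete;
    Buffer(Buffer&&) noexcept = default;         // moves stay constant-cost
    Buffer& operator=(Buffer&&) noexcept = default;

    Buffer dup() const { return Buffer(data); }  // explicit deep copy
};
```

Wrapping such a type in a ref-counting template then yields option 2, while a template whose copy operation calls dup() yields option 1 - the flexibility Michel claims for this design.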
Oct 29 2010
Pillsy <pillsbury gmail.com> writes:
Andrei Alexandrescu Wrote:
[...]
 FWIW this matter is still tormenting me. It gridlocks work on ranges, 
 algorithms, and containers.

Here's my take: it seems in both cases you're arguing about contorting something in the name of a relatively niche construction. In one case, it's designing libraries around the possibility that you might have an expensive-to-copy value type, and in the other you seem to be seriously contorting the very idea of "value type" itself.

I'm sort of wondering where all these expensive-to-copy value types are going to come from. It's not like writing an expensive-to-copy struct is likely to be something you do by accident, right? I can't think of a case where someone just does it because they know better. A programmer coming from C is likely to naively treat D structs like C structs, and be fine. A programmer coming from Java# or a dynamic scripting language is probably going to be happier sticking to classes and their comforting and familiar reference semantics when they're getting started. That leaves programmers coming from C++, who are likely to know what they're getting themselves into, or more experienced D programmers, who again are likely to know what they're getting themselves into.

I think this is a definite worse-is-better situation. Just assume that value types don't have randomly expensive copies in the standard library code, while letting people who want to take the risk of shooting themselves in the foot shoot themselves in the foot by having this(this) open an HTTP connection to Alpha Centauri if that's what they really want.

Cheers, Pillsy
Oct 29 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/29/10 15:18 CDT, Pillsy wrote:
 Andrei Alexandrescu Wrote: [...]
 FWIW this matter is still tormenting me. It gridlocks work on
 ranges, algorithms, and containers.

Here's my take: it seems in both cases you're arguing about contorting something in the name of a relatively niche construction. In one case, it's designing libraries around the possibility that you might have an expensive-to-copy value type, and in the other you seem to be seriously contorting the very idea of "value type" itself.

Very well put. (Why don't you post more often?)
 I'm sort of wondering where all these expensive-to-copy value types
 are going to come from. It's not like writing an expensive-to-copy
 struct is likely to be something you do by accident, right? I can't
 think of a case where someone just does it because they know better.

The typical case is value types of variable length: strings (the built-in offering notwithstanding), BigInt, BigFloat, STL-style containers, possibly even vectors and matrices (though that's almost always a design mistake). And then of course come types that contain such. So "expensive-to-copy" is a transitive property that rots everything upwards.

In fact let me just add this to the list of pros and cons:

1. Arbitrary cost copy construction:

- Propagates through the membership relation, i.e. a struct with at least one costly-to-copy member becomes itself costly-to-copy even though it wasn't planning to or wasn't aware of it.

2. Constant-cost copy construction (via mandatory refcounting for value types):

+ Transparent for the member-of relation, i.e. a constant-copy member keeps its containing struct also constant-copy.
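The transitivity Andrei describes is mechanical, as a small sketch shows (types invented for the example): the outer struct writes no copying code at all, yet its compiler-generated copy inherits the member's arbitrary cost.

```cpp
#include <cassert>
#include <vector>

// Stand-in for a BigInt-style variable-length value type:
// copying it clones the whole digit buffer.
struct FakeBigInt {
    std::vector<unsigned> digits;
};

// Account never declares a copy constructor, but its implicit
// memberwise copy deep-copies `balance` - so Account is now
// costly-to-copy too, whether it intended to be or not.
struct Account {
    FakeBigInt balance;
    int id = 0;
};
```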
 A programmer coming from C is likely to naively treat D structs like
 C structs, and be fine.

Except as noted above. A C programmer might plop a couple of BigInts in a struct and think that they're in good shape.
 A programmer coming from Java# or a dynamic scripting language is
 probably going to be happier sticking to classes and their comforting
 and familiar reference semantics when they're getting started.

 That leaves programmers coming from C++, who are likely to know what
 they're getting themselves into, or more experienced D programmers,
 who again are likely to know what they're getting themselves into.

 I think this is a definite worse-is-better situation. Just assume
 that value types don't have randomly expensive copies in the standard
 library code, while letting people who want to take the risk of
 shooting themslves in the foot shoot themselves in the foot by having
 this(this) open an HTTP collection to Alpha Centauri if that's what
 they really want.

 Cheers, Pillsy

Thanks for the vote. Andrei
Oct 29 2010
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, October 29, 2010 08:59:51 Andrei Alexandrescu wrote:
 On 10/29/10 7:55 CDT, Bruno Medeiros wrote:
 On 06/10/2010 17:57, dsimcha wrote:
 Vote++. IMHO the problem with arbitrary cost copy construction is that
 abstractions that are this leaky don't actually make people's lives
 simpler.
 Abstractions (like value semantics) are supposed to make the code
 easier to reason
 about. When an abstraction forces you to think hard about things as
 trivial as
 variable assignments, I think it's better to either scrap the
 abstraction and
 require all copying to be explicit, or use a less leaky abstraction
 like reference
 counting/COW.

I agree with this as well. But I'm still wondering if the original copy construction problem couldn't be solved in some other way.

FWIW this matter is still tormenting me. It gridlocks work on ranges, algorithms, and containers.

To recap:

1. Arbitrary cost copy construction:

+ Makes value types easy to define: just hook the copying code into this(this)
+ Is familiar to programmers coming from C++
+ If it fails, fails early, not surprisingly during a later operation
- Creates inefficiencies for innocuous-looking code
- Contorts code that wants to be efficient
- Makes containers, ranges and other abstractions bulkier and more difficult to define, implement, and use
- Forces exposure of raw addresses, which hurt abstraction power and safety

2. Constant-cost copy construction (via mandatory refcounting for value types):

+ Simplifies life virtually everywhere in client code
+ Makes code easy to reason about (no hidden inefficiencies, failure paths are more visible)
+ Makes generic code easier to define and more efficient for many types
- Makes value types difficult to define: a value type must store all state and do all manipulation indirectly, via a reference-counted pointer; all methods must check the pointer against null; all mutating methods must use a copy-on-write approach
- Needs copy-on-write (COW) approaches in defining object types and COW is error-prone because it relies on the programmer remembering to always ensure a unique copy prior to changing the object
- Can become a notable difference from C++ (and consequently a contention point for C++ programmers who are considering migrating to D): arbitrary cost of construction hurts C++ in many places, but it has contained the situation with a combination of complexity, discipline, and acceptance of inefficiencies.

Any tie-breaking arguments, I'm all ears.

Andrei

Honestly, what I expect to happen if constant copy cost is expected is that code won't be written that way, and the code that expects a constant copy cost is going to be slow. Really, I'd say that in the general case, either arbitrary copy construction is going to be expected, and there will be code which assumes constant copy cost and so is slow, or constant-cost copy construction is going to be expected, and any postblits which don't have a constant copy cost are going to be inefficient in various places.

I really don't think that you have any hope of enforcing either scheme. Programmers just won't think about it in the general case. So, unless the "mandatory refcounting for value types" (and I have no clue how you'd do that) is actually enforced, you can't enforce either, and you _will_ have inefficiencies. It's just a question of where and what kind.

If anything, I'm inclined to say that we assume that the postblit is O(1) and let the programmer worry about any inefficiencies. We can point out that anything worse than O(1) will be a performance problem, but it seems to me that any attempt to either accommodate arbitrary-cost postblit constructors or to try and use any kind of scheme which forces programmers to write postblits in a certain way is too complicated and doomed to failure. And even if it works, it will be highly annoying to deal with.

- Jonathan M Davis
Oct 29 2010
so <so so.do> writes:
On Fri, 29 Oct 2010 20:15:09 +0300, dsimcha <dsimcha yahoo.com> wrote:

 1.  Crufty old C/C++ programmers.

 2.  People who like dynamic languages but need more speed and ability to  
 do
 low-level work.  D is about the most flexible
 close-to-the-metal/efficient/statically typed language out there.

 3.  Java/C# programmers who want a language that isn't absurdly verbose.

 4.  New programmers who don't have much already invested in any other  
 language and
 want something advanced, modern and w/o tons of legacy cruft.

 The first **may** want eager copying.  The latter three almost certainly  
 won't.

If you believe that, you have no valid reason to follow or use D; with a community like you described you can't get anywhere. I can't speak for other languages or other people, but for me, someone coming from C/C++ does so not because he sucks but because the language is not enough. If you think D is easier by a wide margin, you are delusional. And if D weren't so much easier, why would a C user want this nonsense transition? The answer is "quality of life". C/C++ has everything a "crufty old" programmer needs; he doesn't need a transition. The ones who need this transition are the ones pushing the language's limits.

Some people should really get rid of this C/C++ complex, this hate. I really hate to say this, but you have to accept that an average C/C++ programmer is much better than the best of any higher-level-language programmers. If one argues against this, I have nothing else to say; he needs to wake up and check the games he plays, the OS he uses, anything but "simple string processing".

Thank you!
-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
Oct 29 2010
prev sibling next sibling parent "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 Walter and I discussed the matter again today and we're on the brink of  
 deciding that cheap copy construction is to be assumed. This simplifies  
 the language and the library a great deal, and makes it perfectly good  
 for 95% of the cases. For a minority of types, code would need to go  
 through extra hoops (e.g. COW, refcounting) to be compliant.

 I'm looking for more feedback from the larger D community. This is a  
 very important decision that marks one of the largest departures from  
 the C++ style. Taking the wrong turn here could alienate many  
 programmers coming from C++.

 So, everybody - this is really the time to speak up or forever be silent.

I am almost certain you're doing the right thing by assuming cheap copy construction, but I have no good arguments to back me up. Hence my holding my breath and waiting, thus far.

-- 
Simen
Oct 31 2010
prev sibling parent Emil Madsen <sovende gmail.com> writes:

Make the syntax ugly? - so you cant avoid seeing it?

On 29 October 2010 19:40, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org
 wrote:

 On 10/29/10 12:18 CDT, dsimcha wrote:

 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s
 article

 To recap:
 1. Arbitrary cost copy construction:
 + Makes value types easy to define: just hook the copying code into
 this(this)
 + Is familiar to programmers coming from C++
 + If it fails, fails early, not surprisingly during a later operation

BTW, I don't see why failing during a variable assignment is any less bad than failing during any other seemingly innocuous operation.

One problem is that copying is often implicit and as such more difficult to see with the naked eye. Andrei

-- 
// Yours sincerely
// Emil 'Skeen' Madsen
Nov 10 2010
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 06 Oct 2010 12:34:54 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 I think ranges are a great thing, having simplicity as one of their  
 advantages. In the beginning they were indeed rather simple:

 struct MyRange {
      bool empty();
      ref Widget front();
      void popFront();
 }

 with the appropriate refinements for bidirectional and random.

 Then there was the need to distinguish one-pass, "input" ranges (e.g.  
 files) from many-pass, "forward" ranges. So the "save" function got born  
 for forward ranges and above:

 struct MyRange {
      bool empty();
      ref Widget front();
      void popFront();
      MyRange save();
 }

 Then  property came about which required extra adornments:

 struct MyRange {
       property bool empty();
       property ref Widget front();
      void popFront();
       property MyRange save();
 }

or more succinctly:

struct MyRange {
     @property {
         bool empty();
         ref Widget front();
         MyRange save();
     }
     void popFront();
}
 Then some ranges were unable to return ref, but they could receive  
 assignment. I call these /sealed ranges/ because they "seal" the address  
 of their elements making it inaccessible:

 struct MyRange {
       property bool empty();
       property Widget front();
       property void front(Widget);
      void popFront();
       property MyRange save();
 }

 This made bidirectional and random-access range interfaces quite big  
 because now we're talking about back() (two versions), opIndex(),  
 opIndexAssign() and opIndexOpAssign().

 Until now I think we're solid on design decisions. The discussion starts  
 here.

I agree with all of this except for save() (no secret to anyone who's read my posts).

WRT sealed ranges, that is one of those things where you have a tradeoff between simplicity and control. Note that sealed ranges are used almost exactly like unsealed ranges except for a couple of restrictions. This is important for users because those restrictions can be considered hidden details (see my note later).
 And then there was this nagging thing which is well-understood in the  
 C++ world. A sealed range returns by value and accepts by value. But in  
 C++, an object's copy constructor is arbitrarily expensive. The library  
 must be designed in accordance to that and carefully navigate around  
 that issue.

 To solve that problem, I introduced moveFront(), moveBack(), and  
 moveAt(size_t), all of which destructively read the front, back, or an  
 indexed element respectively off the range. Then you can swap r1.front  
 with r2.front at constant cost like this:

      T tmp = r1.moveFront();
      r1.front = r2.moveFront();
      r2.front = move(tmp);

I'll also note, another unsolved issue with this is iteration -- iterating a sealed range via foreach is not going to use moveFront.
 All of this works and is rock-solid, but it does load the range  
 interface considerably. To a newcomer coming without the background  
 above, a full-fledged range definition may look quite loaded.

 One simplification is to simply decree that Phobos (and D in general)  
 shuns objects with eager copy. Any this(this) could be considered  
 constant cost.

Yes, I like this plan. I recently ran into this problem and, just to simplify things, unsealed the range in question rather than define moveX. I'll note that moveX is more trouble than simply sealing ranges by not returning ref, because the usage changes -- you have to start using moveX when it's defined, or just X when it's not. And if you don't, there is no warning or error from the compiler.
 2. It would force certain types (such as BigInt) that allocate resources  
 and have value semantics to resort to reference counting.

Or to be considered immutable. I believe Java does it this way. Of course, Java's GC is much better at dealing with this...
 4. It would be a definite departure from C++, where all value copies are  
 considered of arbitrary cost. This would provide a convenient straw-man  
 for naysayers (e.g. "Hey, D calls the copy constructor even more often  
 than C++! No thanks, I'll stick with C++0x which solves it all with  
 rvalue references").

Wait, shouldn't auto ref solve the rvalue references thing? I mean if you are copying an rvalue, there is no reason to keep the original around, so no need to call the copy constructor, right? -Steve
Oct 06 2010
next sibling parent Pillsy <pillsbury gmail.com> writes:
Steven Schveighoffer Wrote:

 On Wed, 06 Oct 2010 12:34:54 -0400, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:

 2. It would force certain types (such as BigInt) that allocate
 resources and have value semantics to resort to reference 
 counting.


 Or to be considered immutable.  I believe Java does it this way.  Of  
 course, Java's GC is much better at dealing with this...

I would vote pretty strongly for immutability for things in the standard library. I think the abstraction isn't as leaky, and designing library interfaces around current GC QoI issues strikes me as a bad plan.

Cheers, Pillsy
Oct 06 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/6/10 12:50 CDT, Steven Schveighoffer wrote:
 On Wed, 06 Oct 2010 12:34:54 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 I think ranges are a great thing, having simplicity as one of their
 advantages. In the beginning they were indeed rather simple:

 struct MyRange {
 bool empty();
 ref Widget front();
 void popFront();
 }

 with the appropriate refinements for bidirectional and random.

 Then there was the need to distinguish one-pass, "input" ranges (e.g.
 files) from many-pass, "forward" ranges. So the "save" function got
 born for forward ranges and above:

 struct MyRange {
 bool empty();
 ref Widget front();
 void popFront();
 MyRange save();
 }

 Then  property came about which required extra adornments:

 struct MyRange {
  property bool empty();
  property ref Widget front();
 void popFront();
  property MyRange save();
 }

or more succinctly:

struct MyRange {
     @property {
         bool empty();
         ref Widget front();
         MyRange save();
     }
     void popFront();
}
 Then some ranges were unable to return ref, but they could receive
 assignment. I call these /sealed ranges/ because they "seal" the
 address of their elements making it inaccessible:

 struct MyRange {
  property bool empty();
  property Widget front();
  property void front(Widget);
 void popFront();
  property MyRange save();
 }

 This made bidirectional and random-access range interfaces quite big
 because now we're talking about back() (two versions), opIndex(),
 opIndexAssign() and opIndexOpAssign().

 Until now I think we're solid on design decisions. The discussion
 starts here.

I agree all of this except for save() (not a secret to anyone who's read my posts).

What was your alternative design that obviates save()?
 To solve that problem, I introduced moveFront(), moveBack(), and
 moveAt(size_t), all of which destructively read the front, back, or an
 indexed element respectively off the range. Then you can swap r1.front
 with r2.front at constant cost like this:

 T tmp = r1.moveFront();
 r1.front = r2.moveFront();
 r2.front = move(tmp);

I'll also note, another unsolved issue with this is iteration -- iterating a sealed range via foreach is not going to use moveFront.

Well, it shouldn't, because the user doesn't want to destroy the stuff being iterated.
 4. It would be a definite departure from C++, where all value copies
 are considered of arbitrary cost. This would provide a convenient
 straw-man for naysayers (e.g. "Hey, D calls the copy constructor even
 more often than C++! No thanks, I'll stick with C++0x which solves it
 all with rvalue references").

Wait, shouldn't auto ref solve the rvalue references thing? I mean if you are copying an rvalue, there is no reason to keep the original around, so no need to call the copy constructor, right?

auto ref has its uses, but designing an interface with front() but not moveFront() will make it impossible to move the front out of the range. Andrei
Oct 06 2010
next sibling parent Rainer Deyke <rainerd eldwood.com> writes:
On 10/7/2010 07:24, Steven Schveighoffer wrote:
 First, let me discuss why I don't like save.

...
 So my question is, what is the point of save?  The whole point is for
 this last class of ranges, so they can implement a way to copy the
 iteration position in a way that isn't doable via simple assignment. 
 But there *AREN'T ANY* of these ranges in existence.  Why do we have a
 feature that is littered all over phobos and any range that wants to be
 more than a basic imput range when the implementation is return this; ?

Let me add two reasons to that list.

First, I expect that forgetting to call 'save' is or will be a very common bug. There's no way to detect it at compile time, and when the code is used with a range with a trivial 'save', the code will work as expected. The bug will only be triggered by a range with a non-trivial 'save'. Therefore ranges with non-trivial 'save' should be considered error-prone and should not be used.

Second, it is in fact fairly easy to wrap a range with a non-trivial 'save' in a range with a trivial 'save', using a copy-on-write strategy. So if there /were/ any ranges with non-trivial 'save', it would be easy enough to rewrite them to eliminate the non-trivial 'save'.

Eliminating 'save' makes ranges a lot easier and safer for range users, at a minor cost for range writers. Since range users greatly outnumber range writers, this seems like an overwhelmingly good trade-off.

-- 
Rainer Deyke - rainerd eldwood.com
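The copy-on-write wrapping Rainer describes might look like this, transposed to C++ (all names here -- `CountUp`, `CowRange` -- are hypothetical stand-ins, not code from the thread): a reference-type range with a non-trivial clone is hidden behind a handle whose plain copy is trivial.

```cpp
#include <cassert>
#include <memory>

// Hypothetical reference-type range whose "save" is a real clone.
struct CountUp {
    int cur, last;
    bool empty() const { return cur >= last; }
    int front() const { return cur; }
    void popFront() { ++cur; }
    std::shared_ptr<CountUp> clone() const { return std::make_shared<CountUp>(*this); }
};

// Wrapper with a trivial "save": its copy constructor is a refcount bump.
// The non-trivial clone happens lazily, only when advancing a shared range.
class CowRange {
    std::shared_ptr<CountUp> r_;
public:
    explicit CowRange(CountUp r) : r_(std::make_shared<CountUp>(r)) {}
    bool empty() const { return r_->empty(); }
    int front() const { return r_->front(); }
    void popFront() {
        if (r_.use_count() > 1) r_ = r_->clone();  // copy-on-write step
        r_->popFront();
    }
};
```

A plain copy of a `CowRange` then behaves like an independent saved position, which is the point: the wrapper's users never need to remember to call anything.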
Oct 07 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/7/10 8:24 CDT, Steven Schveighoffer wrote:
 On Wed, 06 Oct 2010 19:09:15 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 10/6/10 12:50 CDT, Steven Schveighoffer wrote:
 I agree all of this except for save() (not a secret to anyone who's read
 my posts).

What was your alternative design that obviates save()?

First, let me discuss why I don't like save.

Let's classify ranges into two types. Primitive ranges are ones that provide a range interface to data that otherwise does not have a range interface. A good example is a container range. Compound ranges are ranges that combine or alter range functionality from a wrapped range or set of ranges. A good example is Zip or Chain.

Let's also define the trivial save implementation as a single line: return this;

We can assume that compound ranges don't contain a trivial solution, but only because the range(s) they wrap could implement a non-trivial solution. So we can qualify those as trivial, even though they don't have the canonical trivial solution.

Most ranges that are not compound implement the trivial solution. This includes all dcollections ranges and *ALL* non-compound ranges in Phobos. Then there are ranges which cannot implement save whatsoever, even if they wanted to. A range that provides the range interface for a network stream would be a good example. That leaves us with ranges which cannot implement the trivial solution but can implement save. A class-based range would be a good example. Currently, there are zero examples of this type of range. That's right, zero (grep 'save()' std/*.d). I contend that we have no place for such ranges in D.

So my question is, what is the point of save? The whole point is for this last class of ranges, so they can implement a way to copy the iteration position in a way that isn't doable via simple assignment. But there *AREN'T ANY* of these ranges in existence. Why do we have a feature that is littered all over Phobos and any range that wants to be more than a basic input range, when the implementation is return this;?
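A C++ analogue of the value-type ranges Steven is talking about (the `IntSlice` name and layout are illustrative only): when the iteration state lives by value in the struct itself, "save" really is the canonical trivial solution, a plain copy of the struct.

```cpp
#include <cassert>

// Hypothetical primitive range over an array. All iteration state (two
// pointers) is stored by value, so copying the struct copies the position.
struct IntSlice {
    const int* p;
    const int* end;
    bool empty() const { return p == end; }
    int front() const { return *p; }
    void popFront() { ++p; }
    IntSlice save() const { return *this; }  // the trivial "return this;" solution
};
```

Advancing the original leaves the saved copy at its old position, with no cleverness required -- which is exactly why Steven asks what the extra primitive buys.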

I knew my retort for a while now, but neglected to write it down. David Simcha added dynamic interfaces for std.range, in which save() does work and is actually instrumental. That suggests save() is useful. Now, dynamic interfaces are not just an obscure corner case, so I think they're here to stay and provide a compelling use case. Andrei
Oct 25 2010
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-10-06 12:34:54 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 To solve that problem, I introduced moveFront(), moveBack(), and 
 moveAt(size_t), all of which destructively read the front, back, or an 
 indexed element respectively off the range. Then you can swap r1.front 
 with r2.front at constant cost like this:
 
      T tmp = r1.moveFront();
      r1.front = r2.moveFront();
      r2.front = move(tmp);
 
 All of this works and is rock-solid, but it does load the range 
 interface considerably. To a newcomer coming without the background 
 above, a full-fledged range definition may look quite loaded.

I think, like you, that this does not belong in the range interface. You are illustrating quite well that it's not the right layer at which to solve the problem.

But I don't think saying "move semantics are too complicated to implement, let's forget them for ranges" is a solution either. If you start this with ranges, then it'll continue for other things that suffer from the same problem, and move semantics will not be very usable anywhere. But improving move semantics would probably require some language changes, and there seems to be little interest in that currently, so I'll leave it at that and go to the second point...
 One simplification is to simply decree that Phobos (and D in general) 
 shuns objects with eager copy. Any this(this) could be considered 
 constant cost.

Not a bad idea in itself. But while it makes "move" less needed, a dependence on the ability to copy in algorithms will prevent them from working with non-copyable structs.

Could algorithms use "move" when possible (when "front" returns a ref) and do a copy when move isn't available? That'd be a good compromise, I think:

T tmp = moveOrCopy(r1.front);
r1.front = moveOrCopy(r2.front);
r2.front = moveOrCopy(tmp);

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
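A minimal C++ sketch of the dispatch Michel's hypothetical moveOrCopy would do: move when handed a mutable lvalue (the analogue of front() returning ref), copy otherwise. The `Tracker` type exists only to observe which path was taken.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical moveOrCopy: steals the contents of a mutable lvalue,
// falls back to a plain copy for const or rvalue arguments.
template <class T>
std::remove_const_t<std::remove_reference_t<T>> moveOrCopy(T&& x) {
    if constexpr (std::is_lvalue_reference_v<T> &&
                  !std::is_const_v<std::remove_reference_t<T>>)
        return std::move(x);        // "front() returns ref": move
    else
        return std::forward<T>(x);  // no mutable access: copy
}

// Small type that records whether it was moved from.
struct Tracker {
    bool movedFrom = false;
    Tracker() = default;
    Tracker(const Tracker&) {}
    Tracker(Tracker&& o) noexcept { o.movedFrom = true; }
};
```

This is the "moves when it can move and copies when it cannot move" behavior, expressed with C++ value categories instead of D's ref-returning front().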
Oct 06 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/6/10 13:59 CDT, Michel Fortin wrote:
 Could algorithms use "move" when possible (when "front" returns a ref)
 and do a copy when move isn't available? That'd be a good compromise I
 think.

 T tmp = moveOrCopy(r1.front);
 r1.front = moveOrCopy(r2.front);
 r2.front = moveOrCopy(tmp);

The problem is, people sort ranges of their own objects and are amazed at the unbearable performance. Andrei
Oct 06 2010
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-10-06 16:14:58 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 10/6/10 13:59 CDT, Michel Fortin wrote:
 Could algorithms use "move" when possible (when "front" returns a ref)
 and do a copy when move isn't available? That'd be a good compromise I
 think.
 
 T tmp = moveOrCopy(r1.front);
 r1.front = moveOrCopy(r2.front);
 r2.front = moveOrCopy(tmp);

Then problem is, people sort ranges of their own objects and are amazed at the unbearable performance.

I'll repeat myself in clearer terms:

I agree with you that the copy constructor should be of linear complexity. I agree that you should use a copy when you can't do a move (when front doesn't return a ref). I disagree that you should use a copy even when you can do a move (when front *does* return a ref).

Therefore, I suggest that you use "move" when you can (when front returns a ref) and fall back to a copy otherwise (when front doesn't return a ref). Hence "moveOrCopy", which moves when it can move and copies when it cannot move.

How can this result in poorer performance than always copying? It can only increase performance because you're copying less. (I understand that it was probably just a misunderstanding of my suggestion.)

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Oct 06 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/6/10 16:38 CDT, Michel Fortin wrote:
 On 2010-10-06 16:14:58 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 On 10/6/10 13:59 CDT, Michel Fortin wrote:
 Could algorithms use "move" when possible (when "front" returns a ref)
 and do a copy when move isn't available? That'd be a good compromise I
 think.

 T tmp = moveOrCopy(r1.front);
 r1.front = moveOrCopy(r2.front);
 r2.front = moveOrCopy(tmp);

Then problem is, people sort ranges of their own objects and are amazed at the unbearable performance.

I'll repeat myself in clearer terms: I agree with you that copy-constructor should be of linear complexity.

But I'm asking for constant complexity. We can only simplify if copy construction has no impact on overall complexity.
 I agree that you should use a copy when you can't do a move (when front
 doesn't return a ref).

But that makes it impossible for an algorithm to guarantee good complexities.
 I disagree that you should use a copy even when you can do a move (when
 front *does* return a ref).

Alright.
 Therefore, I suggest that you use "move" when you can (when front
 returns a ref) and fallback to a copy otherwise (when front doesn't
 return a ref). Hence "moveOrCopy", which moves when it can move and
 copies when it cannot move.

 How can this result in poorer performances than to always copy? It can
 only increase performance because you're copying less. (I understand
 that it was probably just a misunderstanding of my suggestion.)

It doesn't lead to poorer performance than copying. The only issue is that now you need to carry cost of copy everywhere as a possible term/factor. Also, many algorithms would approach things differently if the copy were O(n). This complicates, not simplifies, things. Andrei
Oct 06 2010
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-10-06 19:03:45 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 10/6/10 16:38 CDT, Michel Fortin wrote:
 On 2010-10-06 16:14:58 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:
 
 On 10/6/10 13:59 CDT, Michel Fortin wrote:
 Could algorithms use "move" when possible (when "front" returns a ref)
 and do a copy when move isn't available? That'd be a good compromise I
 think.
 
 T tmp = moveOrCopy(r1.front);
 r1.front = moveOrCopy(r2.front);
 r2.front = moveOrCopy(tmp);

Then problem is, people sort ranges of their own objects and are amazed at the unbearable performance.

I'll repeat myself in clearer terms: I agree with you that copy-constructor should be of linear complexity.

But I'm asking for constant complexity. We can only simplify if copy construction has no impact on overall complexity.

Sorry, I was just confused and said linear when I meant constant.
 I agree that you should use a copy when you can't do a move (when front
 doesn't return a ref).

But that makes it impossible for an algorithm to guarantee good complexities.

Given my "by linear I meant constant" correction above, that should be the same as you're proposing?
 I disagree that you should use a copy even when you can do a move (when
 front *does* return a ref).

Alright.
 Therefore, I suggest that you use "move" when you can (when front
 returns a ref) and fallback to a copy otherwise (when front doesn't
 return a ref). Hence "moveOrCopy", which moves when it can move and
 copies when it cannot move.
 
 How can this result in poorer performances than to always copy? It can
 only increase performance because you're copying less. (I understand
 that it was probably just a misunderstanding of my suggestion.)

It doesn't lead to poorer performance than copying. The only issue is that now you need to carry cost of copy everywhere as a possible term/factor. Also, many algorithms would approach things differently if the copy were O(n). This complicates, not simplifies, things.

Does this still apply given my correction above? -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Oct 06 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Reprise with the correction in:

On 10/6/10 16:38 CDT, Michel Fortin wrote:
 On 2010-10-06 16:14:58 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 On 10/6/10 13:59 CDT, Michel Fortin wrote:
 Could algorithms use "move" when possible (when "front" returns a ref)
 and do a copy when move isn't available? That'd be a good compromise I
 think.

 T tmp = moveOrCopy(r1.front);
 r1.front = moveOrCopy(r2.front);
 r2.front = moveOrCopy(tmp);

Then problem is, people sort ranges of their own objects and are amazed at the unbearable performance.

I'll repeat myself in clearer terms:

I agree with you that the copy constructor should be of *constant* complexity. I agree that you should use a copy when you can't do a move (when front doesn't return a ref). I disagree that you should use a copy even when you can do a move (when front *does* return a ref).

Therefore, I suggest that you use "move" when you can (when front returns a ref) and fall back to a copy otherwise (when front doesn't return a ref). Hence "moveOrCopy", which moves when it can move and copies when it cannot move.

How can this result in poorer performance than always copying? It can only increase performance because you're copying less. (I understand that it was probably just a misunderstanding of my suggestion.)

Once the copy constructor has constant complexity, moving vs. copying becomes a minor optimization. Andrei
Oct 06 2010
next sibling parent Rainer Deyke <rainerd eldwood.com> writes:
On 10/6/2010 19:58, Andrei Alexandrescu wrote:
 Once the copy constructor has constant complexity, moving vs. copying
 becomes a minor optimization.

this(this)
{
    sleep(1000000000); // <-- Constant amount.
}

-- 
Rainer Deyke - rainerd eldwood.com
Oct 06 2010
prev sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-10-06 21:58:42 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Once the copy constructor has constant complexity, moving vs. copying 
 becomes a minor optimization.

But if your algorithm can work using only moves, it can support types that are non-copyable, so it's not simply an optimization. One reason a type might be non-copyable is to make it simpler (not having to write some reference-counted copy-on-write logic).

Also, even though you call it a minor optimization when it doesn't affect complexity, copying by incrementing/decrementing a reference counter requires atomic operations, so it'll be noticeably more efficient to move rather than copy, even though both have the same algorithmic complexity. So I think it's important that move be used each time it's possible to use it.

Or perhaps it's just me not liking the added requirement of reference counters everywhere in a GC-managed environment. As I mentioned in my other post, they're very tricky to implement correctly, at least as long as you're allowed to put one of these ref-counting structs in the GC-managed heap.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Oct 06 2010
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-10-06 12:34:54 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 2. It would force certain types (such as BigInt) that allocate 
 resources and have value semantics to resort to reference counting.

Also, before asking for more reference counting, perhaps you should check that bug and find a nice way to do reference counting correctly with no race conditions (unlike what's done in Phobos currently): <http://d.puremagic.com/issues/show_bug.cgi?id=4624>

The most problematic thing with reference-counted memory is that the GC can call your struct's destructor from any thread, which can cause low-level races if the reference counter isn't manipulated with atomic operations. And atomic operations slow things down; are you sure you want to force this in BigInt?

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
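The constraint Michel raises can be sketched in C++ (hypothetical `RefCount` helper): once a destructor may run on a different thread, every retain/release must be an atomic read-modify-write, and that cost is paid on every copy.

```cpp
#include <atomic>
#include <cassert>

// A destructor-safe reference count: the counter is only ever touched via
// atomic read-modify-write operations, so a finalizer thread decrementing
// concurrently with user threads cannot corrupt it.
struct RefCount {
    std::atomic<long> n{1};
    void retain() { n.fetch_add(1, std::memory_order_relaxed); }
    // Returns true when this release dropped the last reference, i.e. the
    // caller is now responsible for destroying the shared state.
    bool release() { return n.fetch_sub(1, std::memory_order_acq_rel) == 1; }
};
```

A plain `++count;` would not be safe here precisely because, as the post notes, the decrement may happen on whatever thread runs the destructor.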
Oct 06 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/6/10 14:09 CDT, Michel Fortin wrote:
 On 2010-10-06 12:34:54 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 2. It would force certain types (such as BigInt) that allocate
 resources and have value semantics to resort to reference counting.

Also, before asking for more reference counting, perhaps you should check that bug and find a nice way to do reference counting correctly with no race conditions (unlike what's done in Phobos currently). <http://d.puremagic.com/issues/show_bug.cgi?id=4624>

I'm not sure the bug report is valid, but I agree it's a matter to look into.
 The most problematic thing with reference-counted memory is that the GC
 can call your struct's destructor from any thread which can cause
 low-level races if the reference counter isn't manipulated with atomic
 operations. And atomic operations slow things down, are you sure you
 want to force this in BigInt?

The GC shouldn't be able to destroy things naively concurrently with other threads. Currently all threads are frozen; I'm not sure whether a frozen thread commits its writes, so that needs to be verified. Andrei
Oct 06 2010
next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-10-06 16:19:45 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 10/6/10 14:09 CDT, Michel Fortin wrote:
 The most problematic thing with reference-counted memory is that the GC
 can call your struct's destructor from any thread which can cause
 low-level races if the reference counter isn't manipulated with atomic
 operations. And atomic operations slow things down, are you sure you
 want to force this in BigInt?

The GC shouldn't be able to destroy things naively concurrently with other threads. Currently all threads are frozen; I'm not sure whether a frozen thread commits its writes, so that needs to be verified.

My current understanding (after looking at the GC's code) is that all threads are frozen during the mark phase, while it builds a list of objects to destroy. Then, threads are unfrozen while objects on the list are destroyed.

And even if all threads were frozen while objects are being destroyed, how do you know that one thread hasn't been frozen in the middle of a read-modify-write operation on the reference counter? Bad timing could result in an invalid reference count. Again, you'd need an atomic operation to make sure the GC doesn't stop a thread in the middle of a read-modify-write.

Another solution would be to make the GC call destructors in the same thread that owns the object, but that would require thread-local memory pools. Related is the discussion for this bug: <http://d.puremagic.com/issues/show_bug.cgi?id=4621>

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Oct 06 2010
prev sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 The GC shouldn't be able to destroy things naively concurrently with
 other threads. Currently all threads are frozen; I'm not sure whether a
 frozen thread commits its writes, so that needs to be verified.
 Andrei

How could freezing a thread not commit all its writes? If it didn't, then pointer updates wouldn't be committed and things might get GC'd when they shouldn't.
Oct 06 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/6/10 17:01 CDT, dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 The GC shouldn't be able to destroy things naively concurrently with
 other threads. Currently all threads are frozen; I'm not sure whether a
 frozen thread commits its writes, so that needs to be verified.
 Andrei

How could freezing a thread not commit all its writes? If it didn't, then pointer updates wouldn't be committed and things might get GC'd when they shouldn't.

That's what I thought! Andrei
Oct 06 2010
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 06 Oct 2010 16:19:45 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 10/6/10 14:09 CDT, Michel Fortin wrote:
 On 2010-10-06 12:34:54 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 2. It would force certain types (such as BigInt) that allocate
 resources and have value semantics to resort to reference counting.

Also, before asking for more reference counting, perhaps you should check that bug and find a nice way to do reference counting correctly with no race conditions (unlike what's done in Phobos currently). <http://d.puremagic.com/issues/show_bug.cgi?id=4624>

I'm not sure the bug report is valid, but I agree it's a matter to look into.

It is valid. When you use GC-allocated memory in a destructor, it's bad news. And structs that are members of classes can have their destructors called from the GC.
 The most problematic thing with reference-counted memory is that the GC
 can call your struct's destructor from any thread which can cause
 low-level races if the reference counter isn't manipulated with atomic
 operations. And atomic operations slow things down, are you sure you
 want to force this in BigInt?

The GC shouldn't be able to destroy things naively concurrently with other threads. Currently all threads are frozen; I'm not sure whether a frozen thread commits its writes, so that needs to be verified.

I think Michel is right. Let's say thread A is copying a File struct to the stack, and is on this line:

this(this)
{
    if (!p) return;
    assert(p.refs);
    ++p.refs;
}

And thread B is allocating memory. This triggers a GC collection cycle, and in the heap is a class object which contains the same File struct as a member. That File's destructor is called, and --refs is called. Note that the object being destroyed does not have to be shared. This is a classic race.

-Steve
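[Editor's note: the race above can be sketched concretely. The names below (Payload, Handle) are hypothetical stand-ins, not Phobos code; the point is Michel's earlier one, that the counter must be manipulated through core.atomic so a destructor the GC runs on another thread cannot interleave with this postblit.]

```d
import core.atomic;

struct Payload { shared size_t refs = 1; }

struct Handle {
    Payload* p;

    this(this) {
        // atomic increment: safe even if the GC runs ~this on another thread
        if (p) atomicOp!"+="(p.refs, 1);
    }

    ~this() {
        // atomic decrement; the last owner would free the payload here
        if (p && atomicOp!"-="(p.refs, 1) == 0) { /* free p */ }
    }
}
```

With a plain ++p.refs, thread A's read-modify-write can interleave with a --refs from a GC-run destructor, which is exactly the race described above.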
Oct 07 2010
prev sibling next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:i8i8g3$2rof$1 digitalmars.com...
 And then there was this nagging thing which is well-understood in the C++ 
 world. A sealed range returns by value and accepts by value. But in C++, 
 an object's copy constructor is arbitrarily expensive. The library must be 
 designed in accordance to that and carefully navigate around that issue.

 For example, swap is a fundamental primitive used in many algorithms. It 
 should swap objects at constant cost. Indeed, for references, swap is easy 
 to implement:

 void swap(T)(ref T lhs, ref T rhs)
 {
     assert(!pointsTo(lhs, rhs) && !pointsTo(rhs, lhs)
         && !pointsTo(lhs, lhs) && !pointsTo(rhs, rhs));
     T tmp = move(lhs);
     lhs = move(rhs);
     rhs = move(tmp);
 }

 or similar. However, a sealed range does not offer references, so trying 
 e.g.

 swap(r1.front, r2.front);

 will not work. This is a problem.

 To solve that problem, I introduced moveFront(), moveBack(), and 
 moveAt(size_t), all of which destructively read the front, back, or an 
 indexed element respectively off the range. Then you can swap r1.front 
 with r2.front at constant cost like this:

     T tmp = r1.moveFront();
     r1.front = r2.moveFront();
     r2.front = move(tmp);

 All of this works and is rock-solid, but it does load the range interface 
 considerably. To a newcomer coming without the background above, a 
 full-fledged range definition may look quite loaded.

 One simplification is to simply decree that Phobos (and D in general) 
 shuns objects with eager copy. Any this(this) could be considered constant 
 cost. That would have two consequences:

 1. It would simplify all ranges and many algorithms because there's no 
 more need for moveXxx and swapping can be done the old-fashioned way (with 
 assignments in inline code):

     T tmp = r1.front;
     r1.front = r2.front;
     r2.front = tmp;

 In general many things become considerably easier if you can simply assume 
 that copying objects around won't be part of your complexity requirements 
 or the practical costs of your algorithm.

 2. It would force certain types (such as BigInt) that allocate resources 
 and have value semantics to resort to reference counting.

 3. It would give more legitimacy to sealed objects that return data by 
 value (as opposed to reference). I believe sealed objects are very 
 important for safety.

 4. It would be a definite departure from C++, where all value copies are 
 considered of arbitrary cost. This would provide a convenient straw-man 
 for naysayers (e.g. "Hey, D calls the copy constructor even more often 
 than C++! No thanks, I'll stick with C++0x which solves it all with rvalue 
 references").

 5. It would make things look and feel familiar to people coming from any 
 other languages than C++, who seldom need to define a costly this(this) at 
 all.

 Please discuss.

So you want to simplify usage of sealed ranges by forcing all classes that allocate in their ctor to be reference counted classes instead of GCed?
Oct 06 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/6/10 16:02 CDT, Nick Sabalausky wrote:
 So you want to simplify usage of sealed ranges by forcing all classes that
 allocate in their ctor to be reference counted classes instead of GCed?

Almost. Rephrased just a bit: "So you want to simplify usage of sealed ranges by forcing all structs that allocate in their this(this) to use reference counting internally instead of eager copying?" Yes :o).

Andrei
Oct 06 2010
parent reply "Nick Sabalausky" <a a.a> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:i8ioif$1rl3$1 digitalmars.com...
 On 10/6/10 16:02 CDT, Nick Sabalausky wrote:
 So you want to simplify usage of sealed ranges by forcing all classes 
 that
 allocate in their ctor to be reference counted classes instead of GCed?

Almost. Rephrased just a bit: "So you want to simplify usage of sealed ranges by forcing all structs that allocate in their this(this) to use reference counting internally instead of eager copying?" Yes :o).

Ok. And that means reference counting on the memory the struct allocates in this(this) (and not on all uses of the struct itself)?
Oct 06 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/6/10 16:30 CDT, Nick Sabalausky wrote:
 "Andrei Alexandrescu"<SeeWebsiteForEmail erdani.org>  wrote in message
 news:i8ioif$1rl3$1 digitalmars.com...
 On 10/6/10 16:02 CDT, Nick Sabalausky wrote:
 So you want to simplify usage of sealed ranges by forcing all classes
 that
 allocate in their ctor to be reference counted classes instead of GCed?

Almost. Rephrased just a bit: "So you want to simplify usage of sealed ranges by forcing all structs that allocate in their this(this) to use reference counting internally instead of eager copying?" Yes :o).

Ok. And that means reference counting on the memory the struct allocates in this(this) (and not on all uses of the struct itself)?

That is correct.

I should emphasize that this is in all likelihood going to be a contentious point for C++ programmers used to eager copy semantics. Reference counting is robust and often economical, but adds hoops to the implementation - each method of the struct must redirect through the refcounted handle and duplicate it if necessary, etc. I fear that it would be a windfall for nitpickers, who in turn will influence anyone who likes to worry.

Andrei
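[Editor's note: a rough copy-on-write sketch of the redirection described above, with hypothetical names and a deliberately non-atomic counter (which, per the subthread with Michel Fortin, is itself the contentious part):]

```d
// Hypothetical value type with a refcounted payload; every mutating
// method must go through ensureUnique() before touching the data.
struct Value {
    static struct Payload { size_t refs; int[] data; }
    Payload* p;

    this(int[] data) { p = new Payload(1, data); }
    this(this) { if (p) ++p.refs; }                  // cheap copy: bump the count
    ~this() { if (p && --p.refs == 0) { /* free payload */ } }

    int opIndex(size_t i) { return p.data[i]; }      // read: no duplication

    void opIndexAssign(int v, size_t i) {
        ensureUnique();                              // duplicate if shared
        p.data[i] = v;
    }

    private void ensureUnique() {
        // the "duplicate it if necessary" hoop; assumes p was initialized
        if (p.refs > 1) {
            --p.refs;
            p = new Payload(1, p.data.dup);
        }
    }
}
```

Copying a Value is O(1); only the first mutation of a shared payload pays the duplication cost.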
Oct 06 2010
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
This is a meta comment Andrei and I discussed earlier today.

C++ tries to make 99.9% of the use cases doable, but they do it at the cost of 
90% of C++ programmers being unable to implement a usable iterator or container. 
If you've ever peeked under the hood at the implementation of STL iterators and 
containers, you know what I mean. If you've ever wondered what an rvalue 
reference was and why, you know what I mean. If you've noticed the dearth of 
C++ reusable libraries, while such grow like weeds for Java/C#/Python, you know 
what I mean.

I think what we're doing with struct ctors, dtors, postblit and opAssign is in 
the right direction. In C++, if you provide a copy constructor, but forget the 
opAssign, almost certainly the default opAssign generated by the compiler will 
be wrong (it's a memcpy). Even worse, this error will often generate corruption 
at runtime.

D, on the other hand, builds a copy constructor out of a postblit, meaning that 
the compiler generates for you most of the mechanics of the copy constructor. 
The postblit only deals with where it differs from a memcpy. Best of all, a 
*correct* default opAssign is automatically generated out of the postblit and 
destructor.

You might be able to hand write a more efficient opAssign, but if you don't, at 
least you'll get a logically correct one, rather than the C++ default one which 
is almost certainly wrong.
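[Editor's note: Walter's point in miniature, using a hypothetical Buf struct - writing only the postblit is enough, because the compiler synthesizes both the copy and a correct assignment from postblit plus destructor:]

```d
struct Buf {
    int[] data;
    this(this) { data = data.dup; }   // only the part that differs from memcpy
    // No opAssign written: the compiler builds a logically correct one
    // out of the postblit and the destructor.
}
```

Assigning one Buf to another therefore deep-copies, and mutating the copy leaves the original untouched - whereas in C++, forgetting operator= after writing a copy constructor silently leaves you with the memberwise default.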

Applying the same idea to ranges suggests that missing functionality should be 
either automatically and correctly generated, or if that is not possible, issue 
a compile time error.

Additionally, if D range/container implementations start looking anything like 
C++ STL implementations in terms of complexity, we've failed.
Oct 06 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/6/10 22:39 CDT, Walter Bright wrote:
 This is a meta comment Andrei and I discussed earlier today.

 C++ tries to make 99.9% of the use cases doable, but they do it at the
 cost of 90% of C++ programmers being unable to implement a usable
 iterator or container. If you've ever peeked under the hood at the
 implementation of STL iterators and containers, you know what I mean. If
 you've ever wondered what an rvalue reference was and why, you know what
 I mean. If you've noticed the dearth of C++ reusable libraries, while
 such grow like weeds for Java/C#/Python, you know what I mean.

 I think what we're doing with struct ctors, dtors, postblit and opAssign
 is in the right direction. In C++, if you provide a copy constructor,
 but forget the opAssign, almost certainly the default opAssign generated
 by the compiler will be wrong (it's a memcpy). Even worse, this error
 will often generate corruption at runtime.

 D, on the other hand, builds a copy constructor out of a postblit,
 meaning that the compiler generates for you most of the mechanics of the
 copy constructor. The postblit only deals with where it differs from a
 memcpy. Best of all, a *correct* default opAssign is automatically
 generated out of the postblit and destructor.

 You might be able to hand write a more efficient opAssign, but if you
 don't, at least you'll get a logically correct one, rather than the C++
 default one which is almost certainly wrong.

 Applying the same idea to ranges suggests that missing functionality
 should be either automatically and correctly generated, or if that is
 not possible, issue a compile time error.

 Additionally, if D range/container implementations start looking
 anything like C++ STL implementations in terms of complexity, we've failed.

I agree with all of the above.

After all has been said and done, it looks like uniform function call syntax is a pivotal feature for simplifying ranges. Most ranges can simply define the basic operations, and std.range takes care of defining boilerplate defaults for a host of cases. For example, you just call r.moveFront() and that becomes moveFront(r), which is defined by std.range. Whenever the range is defined in a way that makes it impossible to generate e.g. moveFront() appropriately, the user would have to define it.

Would this be acceptable?

Andrei
Oct 06 2010
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-10-07 02:09:04 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 I agree with all of the above. After all has been said and done, it 
 looks like uniform function call syntax is a pivotal feature for 
 simplifying ranges. Most ranges can simply define the basic operations, 
 and std.range takes care of defining boilerplate defaults for a host of 
 cases. For example, you just call r.moveFront() and that becomes 
 moveFront(r) which is defined by std.range.

That's good. I'm glad to see that using move semantics is still on the table.

Another note: you don't really need to wait for the uniform function call syntax for this to work. The moveFront function template in std.range could check for the presence of moveFront in the range and call it when available. This means you have to write moveFront(r) everywhere instead of r.moveFront(), which might be an annoyance, but at least it works.
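[Editor's note: Michel's suggestion amounts to something like the following free-function template - a sketch only; the actual std.range implementation may differ. It prefers a member moveFront when the range defines one, and otherwise falls back to an ordinary, possibly arbitrary-cost, copy:]

```d
// Call as moveFront(r) until uniform function call syntax allows r.moveFront().
auto moveFront(R)(ref R r)
{
    static if (is(typeof(r.moveFront())))
        return r.moveFront();   // range supplies its own destructive read
    else
        return r.front;         // fallback: plain copy of front
}

// A toy range with a member moveFront, for illustration only.
struct Once {
    int value;
    bool used;
    @property bool empty() { return used; }
    @property int front() { return value; }
    void popFront() { used = true; }
    int moveFront() { auto v = value; value = 0; return v; }
}
```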
 Whenever the range is defined in a way that makes it impossible to 
 generate e.g. moveFront() appropriately, the user would have to. Would 
 this be acceptable?

Seems good to me.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Oct 07 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/7/10 7:34 CDT, Michel Fortin wrote:
 On 2010-10-07 02:09:04 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 I agree with all of the above. After all has been said and done, it
 looks like uniform function call syntax is a pivotal feature for
 simplifying ranges. Most ranges can simply define the basic
 operations, and std.range takes care of defining boilerplate defaults
 for a host of cases. For example, you just call r.moveFront() and that
 becomes moveFront(r) which is defined by std.range.

That's good. I'm glad to see that using move semantics is still on the table. Another note, you don't really need to wait for the uniform function call syntax for this to work. The moveFront function template in std.range could check for the presence of moveFront in the range and call it when available. This means you have to write moveFront(r) everywhere instead of r.moveFront(), which might be an annoyance but at least it works.

Turns out it's a lot more annoying in practice than I had imagined. I think I need to wait for uniform function call syntax. Andrei
Oct 07 2010
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday 06 October 2010 23:09:04 Andrei Alexandrescu wrote:
 I agree with all of the above. After all has been said and done, it
 looks like uniform function call syntax is a pivotal feature for
 simplifying ranges. Most ranges can simply define the basic operations,
 and std.range takes care of defining boilerplate defaults for a host of
 cases. For example, you just call r.moveFront() and that becomes
 moveFront(r) which is defined by std.range.
 
 Whenever the range is defined in a way that makes it impossible to
 generate e.g. moveFront() appropriately, the user would have to. Would
 this be acceptable?

It certainly sounds good to me. The only downside is that anyone writing algorithms that use ranges still has to worry about using moveFront(), but for the most part, not using it just means that their algorithm is less efficient in the same way that it would be if moveFront() didn't exist, so you don't really lose anything. I do think that we need to avoid complicating ranges any further, however.

- Jonathan M Davis
Oct 06 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 06 Oct 2010 19:09:15 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 10/6/10 12:50 CDT, Steven Schveighoffer wrote:
 I agree all of this except for save() (not a secret to anyone who's read
 my posts).

What was your alternative design that obviates save()?

First, let me discuss why I don't like save.

Let's classify ranges into two types. Primitive ranges are ones that provide a range interface to data that otherwise does not have a range interface. A good example is a container range. Compound ranges are ranges that combine or alter range functionality from a wrapped range or set of ranges. A good example is Zip or Chain.

Let's also define the trivial save implementation as a single line:

return this;

We can assume that compound ranges don't contain a trivial solution, but only because the range(s) they contain could implement a non-trivial solution. So we can qualify those as trivial, even though they don't have the canonical trivial solution.

Most ranges that are not compound implement the trivial solution. This includes all dcollections ranges and *ALL* non-compound ranges in phobos.

Then there are ranges which cannot implement save whatsoever, even if they wanted to. A range that provides the range interface for a network stream would be a good example.

That leaves us with ranges which cannot implement the trivial solution, but can implement save. A class-based range would be a good example. Currently, there are zero examples of this type of range. That's right, zero (grep 'save()' std/*.d). I contend that we have no place for such ranges in D.

So my question is, what is the point of save? The whole point is for this last class of ranges, so they can implement a way to copy the iteration position in a way that isn't doable via simple assignment. But there *AREN'T ANY* of these ranges in existence. Why do we have a feature that is littered all over phobos and any range that wants to be more than a basic input range, when the implementation is return this; ?

Now, this isn't quite fair -- save is important not only for what it provides but for what it excludes. It correctly excludes the ranges that cannot implement it. And these types of ranges do exist. So if we get rid of save, we have to compensate for this as well.

--------

So now you know why I don't like save. I'll move on to my alternative.

My solution is this. The class of ranges that cannot implement the trivial save will define an enum refIterator to be true. And that's it. It shouldn't hurt currently since no ranges implement a non-trivial save. If we come across a range where it makes sense to implement save, we can figure that out later. But our solution works easily in phobos:

size_t bringToFront(Range1, Range2)(Range1 front, Range2 back)
    if (isForwardRange!Range1 && isForwardRange!Range2)

Look, there's already a template constraint for isForwardRange. Just define isForwardRange to disqualify ranges which define the refIterator enum, and we have to change no code (except get rid of all the save litter).
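[Editor's note: the "trivial solution" referred to above looks like this in a minimal hand-written range - a hypothetical counting range, not Phobos code:]

```d
struct Counter {
    int cur, end;
    @property bool empty() { return cur >= end; }
    @property int front() { return cur; }
    void popFront() { ++cur; }
    @property Counter save() { return this; }  // the canonical trivial save
}
```

For a struct range like this, `auto s = r.save;` and `auto s = r;` do exactly the same thing - which is the point: the member only earns its keep for ranges where plain assignment would share iteration state.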
 I'll also note, another unsolved issue with this is iteration --
 iterating a sealed range via foreach is not going to use moveFront.

Well it shouldn't, because the user doesn't want to destroy the stuff it is iterating.

Yeah, but I also don't think by default he wants to make an arbitrary-cost copy. Imagine a situation like this:

foreach(i; arr)
{
    if(i > 5) return i;
}

If arr is an array of BigInt, you are copying all of them out of the array just to read them. All I'm saying is that moveFront simply solves the swapping problem; there are other problems with arbitrary-cost copy construction that are not solved by moveFront.
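[Editor's note: the concern above can be made concrete with a stand-in for BigInt - a hypothetical type; the counter exists only to make the hidden copies visible:]

```d
int copies;  // counts postblit runs, to make the hidden cost visible

struct Big {
    int value;
    int[] payload;
    this(this) { payload = payload.dup; ++copies; }  // arbitrary-cost copy
}

int firstOver5(Big[] arr) {
    foreach (b; arr) {             // each element is copied just to be read
        if (b.value > 5) return b.value;
    }
    return -1;
}
```

With `foreach (ref b; arr)` the copies disappear for a plain array, but a sealed range by definition cannot hand out that ref - which is why moveFront alone doesn't cover this case.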
 Andrei wrote:
 4. It would be a definite departure from C++, where all value copies
 are considered of arbitrary cost. This would provide a convenient
 straw-man for naysayers (e.g. "Hey, D calls the copy constructor even
 more often than C++! No thanks, I'll stick with C++0x which solves it
 all with rvalue references").

Wait, shouldn't auto ref solve the rvalue references thing? I mean if you are copying an rvalue, there is no reason to keep the original around, so no need to call the copy constructor, right?

auto ref has its uses, but designing an interface with front() but not moveFront() will make it impossible to move the front out of the range.

But your point was about how C++ has rvalue references, how does rvalue references solve the moveFront problem? FWIW, I thought C++ supports passing rvalue references only by const, no? -Steve
Oct 07 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 25 Oct 2010 19:06:13 -0400, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 On 10/7/10 8:24 CDT, Steven Schveighoffer wrote:
 First, let me discuss why I don't like save.

 Let's classify ranges into two types. Primitive ranges are ones that
 provide a range interface to data that otherwise does not have a range
 interface. A good example is a container range.

 Compound ranges are ranges that combine or alter range functionality
 from a wrapped range or set of ranges. A good example is Zip or Chain.


 We can assume that compound ranges don't contain a trivial solution, but
 only because the range(s) they contain could implement a non-trivial
 solution. So we can qualify those trivial, even though they don't have
 the canonical trival solution.


 I knew my retort for a while now, but neglected to write it down.

 David Simcha added dynamic interfaces for std.range, in which save()  
 does work and is actually instrumental. That suggests save() is useful.  
 Now, dynamic interfaces are not just an obscure corner case, so I think  
 they're here to stay and provide a compelling use case.

You mean std.range.InputRange(R) and friends? The interfaces themselves are not proof they are useful. I see zero uses in phobos of InputRange; I didn't look for the others. Can you show places they are used?

The InputRangeObject and OutputRangeObject are compound ranges as I outlined above. That is, they wrap another range, and as I said before, the fact they implement a non-trivial save is not significant. In fact, I don't see them useful anywhere: you are creating an object around a non-object in order to satisfy an interface that nobody uses. Again, these have zero uses in phobos. This just furthers the case that save is an eager solution for an unproven need. No offense to the work done, David.

Actually, there is a bug there. InputRangeObject.save is implemented like this:

static if(isForwardRange!R) {
    @property typeof(this) save() {
        return new typeof(this)(_range);
    }
}

Should be:

return new typeof(this)(_range.save);

I'm willing to bet that 90% of code outside phobos does not use save; they just use simple assignment, knowing that it is the same as save (or not knowing they have to use save).

And that reminds me of another point I failed to mention: implementing save does not guarantee it will be used -- assignment is not disallowed, and works the same as save in all currently implemented ranges. It is very easy to copy a range using = and not see any bugs, until some person eventually decides object-based ranges would be a good idea and uses them, and then decides to use your function. That is, the compiler makes the range-implementer implement save, but does not make the range-user use save. So on top of being useless, it's bug-prone. The good news is that bugs are extremely unlikely to pop up since nobody (yet) implements the non-trivial save.

-Steve
Oct 27 2010
prev sibling next sibling parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 06/10/2010 17:34, Andrei Alexandrescu wrote:
 or similar. However, a sealed range does not offer references, so trying
 e.g.

 swap(r1.front, r2.front);

 will not work. This is a problem.

Why doesn't a sealed range offer references? Is it to prevent modifying the elements being iterated? (I searched google and TDPL but couldn't find any info on sealed ranges) -- Bruno Medeiros - Software Engineer
Oct 29 2010
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 29 Oct 2010 09:06:24 -0400, Bruno Medeiros  
<brunodomedeiros+spam com.gmail> wrote:

 On 06/10/2010 17:34, Andrei Alexandrescu wrote:
 or similar. However, a sealed range does not offer references, so trying
 e.g.

 swap(r1.front, r2.front);

 will not work. This is a problem.

Why doesn't a sealed range offer references? Is it to prevent modifying the elements being iterated? (I searched google and TDPL but couldn't find any info on sealed ranges)

Because a sealed range then has complete power over memory allocation of its elements (really, I think sealed ranges is a misnomer; it's really ranges on sealed containers).

Example: dcollections has a default allocator that allocates nodes in the collection as chunks (large numbers of nodes that fit into a page) to avoid over-using the GC. It also uses free-lists to further avoid GC allocations. If a range on such a container returned a reference, then someone could keep that reference indefinitely. When the container removed that element and re-used it for another, the reference would now refer to the new element.

Although the range essentially is still a pointer, you can use mechanisms to prevent abuse, such as using a mutation counter on the container/range to track changes (dcollections doesn't do this, but it is possible). These kinds of things are impossible if the range offers references. Once a reference gets out, you have lost control over the memory.

-Steve
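[Editor's note: the mutation-counter mechanism mentioned above (which dcollections does not actually implement) could look roughly like this - all names hypothetical:]

```d
struct List {
    int[] elems;
    size_t mutation;                 // bumped on every structural change

    void removeAt(size_t i) {
        elems = elems[0 .. i] ~ elems[i + 1 .. $];
        ++mutation;
    }

    SealedRange opSlice() { return SealedRange(&this, 0, mutation); }
}

struct SealedRange {
    List* owner;
    size_t idx, snapshot;            // snapshot of mutation at range creation

    @property bool empty() { return idx >= owner.elems.length; }
    @property int front() {
        // by-value return keeps element addresses sealed;
        // the check catches ranges made stale by container changes
        assert(snapshot == owner.mutation, "container mutated during iteration");
        return owner.elems[idx];
    }
    void popFront() { ++idx; }
}
```

After a removeAt, any outstanding range fails its snapshot check on the next front access instead of silently reading re-used memory.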
Nov 02 2010
parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 02/11/2010 15:16, Steven Schveighoffer wrote:
 On Fri, 29 Oct 2010 09:06:24 -0400, Bruno Medeiros
 <brunodomedeiros+spam com.gmail> wrote:

 On 06/10/2010 17:34, Andrei Alexandrescu wrote:
 or similar. However, a sealed range does not offer references, so trying
 e.g.

 swap(r1.front, r2.front);

 will not work. This is a problem.

Why doesn't a sealed range offer references? Is it to prevent modifying the elements being iterated? (I searched google and TDPL but couldn't find any info on sealed ranges)

Because a sealed range then has complete power over memory allocation of its elements (really, I think sealed ranges is a misnomer, it's really ranges on sealed containers).

Ah, my mistake. I thought sealed ranges were simply ranges that did not allow modifying the underlying container (a "logical const" range), and similarly for sealed containers (= an unmodifiable container). I see why escaping references is not allowed then. -- Bruno Medeiros - Software Engineer
Nov 09 2010
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, October 29, 2010 11:05:02 Andrei Alexandrescu wrote:
 As for eager-copying, it's a _value type_, of _course_ it's going to be
 copied eagerly. ints, floats, bools, etc. _all_ copy eagerly. How can
 you expect anything else? It's quite explicit that that's how value
 types work. So, if you write a struct where copying is expensive, you're
 asking for trouble and should know it. The only real issue with it that
 I see is the non-obvious places where copying takes place. But if you
 tried to make it so that structs didn't copy eagerly, then you're making
 it so that they're not value types anymore, which will increase overhead
 and go against what structs are meant for.

In fact reference counting allows you to define value types with cheap copy construction, so the above is incorrect.

I think I misunderstood what you meant and was thinking reference type rather than ref-counted.
 
 C++ doesn't stop you or help you from being stupid about expensive copy-
 construction. Why should D? Sure, it sucks when your program is slow, but
 we have classes which are reference types and don't have this problem.
 So, if you want a reference type, there it is. I don't see how anyone
 could expect expensive copying not to result in poor performance for a
 _value type_.

That means ranges need to define moveFront, moveBack, moveAt, and countless other contortions in library code and client code alike. I think the view you're conveying is overly simplistic.

I'm arguing that you do _not_ define those and that you let user code fend for itself. If a programmer is foolish enough to write code with overly expensive postblit constructors, then they're just shooting themselves in the foot. - Jonathan M Davis
Oct 29 2010
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, October 29, 2010 10:51:17 Andrei Alexandrescu wrote:
 On 10/29/10 12:21 CDT, Jonathan M Davis wrote:
 Honestly, what I expect to happen if constant-copy cost is expected is
 that code won't be written that way and the code that expects a
 constant-copy cost is going to be slow. Really, I'd say that in the
 general case, either arbitrary copy construction is going to be
 expected, and there will be code which assumes constant-copy cost and so
 is slow, or constant-copy cost construction is going to be expected and
 any postplits which don't have a constant-copy cost will are going to be
 inefficient in various places.
 
 I really don't think that you have any hope of enforcing either scheme.

That is correct. I don't want to enforce as much as to choose one stance and stick with it throughout the standard library. The STL consistently makes the following stated assumptions:

1. Functors and iterators are cheap to copy.

2. General objects (including containers) are arbitrarily expensive to copy.

Once STL chose a way, everyone using it could figure out a way to use it gainfully, or to debug performance problems if there were any.
 Programmers just won't think about it in the general case. So, unless the
 "mandatory refcounting for value type" (and I have no clue how you'd do
 that) is actually enforced, you can't enforce either, and you _will_
 have inefficiencies. It's just a question of where and what kind.

Walter and I have been seriously discussing introducing @refcounted, which would automate part of the process. The problem is you can't automate all or even most of it. Either you must assume client code uses const consistently, or that client code inserts manual calls whenever they plan to change the object.
 If anything, I'm inclined to say that we assume that the postblit is O(1)
 and let the programmer worry about any inefficiencies. We can point out
 that anything worse that O(1) will be a performance problem, but it
 seems to me that any attempt to either accomodate arbitrary cost
 postblit constructors or to try and use any kind of scheme which forces
 programmers to write postblits in a certain way is too complicated and
 doomed to failure. And even if it works, it will be highly annoying to
 deal with.

It sure is annoying, but it does work. Don, can you estimate how difficult it would be to convert BigInt to a refcounted implementation?

Okay, then essentially what you're suggesting is that Phobos either 1. Attempts to avoid copying and has constructs such as moveFront() to avoid it. 2. Or makes all of its structs which have arbitrary-cost postblits into reference types which are refcounted and leave user code to fend for itself. If that's the case, then neither really strikes me as much of a departure from C++. In the first case, we are - for better or worse - facilitating code which has expensive postblit constructors. In the second, we're letting them fend for themselves (like C++ does), but are within Phobos using ref-counting for structs with expensive postblits and potentially suggesting that that's what D programmers in general should use when they have structs which are expensive to copy. Unless I'm misunderstanding something here, neither solution is really a departure from C++. It's just that in one case we make what a C++ programmer would likely have done more efficient whereas in the other we're suggesting an idiom which a C++ programmer would likely not have used, but if the C++ programmer programs using an arbitrary-cost postblit like he would have done with an arbitrary-cost copy constructor in C++, the situation is essentially the same as in C++ - except perhaps if it's a expensive to copy a range, which really shouldn't be the case usually. And if both situations aren't all that much of a departure from C++, then it's really a question of which is best for D. The first solution has the advantage that user code doesn't have to care, but it complicates Phobos and the code of anyone looking to write a container type, a range-type, or a range-based algorithm (the worst of those likely being the algorithm since that's the one of the 3 that people are most likely to be doing). So, it hides the issues but imperfectly. In the second solution, anyone writing structs which are expensive to copy needs to care (like they probably should anyway), but Phobos is simplified. 
In the general case, I'd say that complicating Phobos in favor of simplifying user code would be the correct decision, since it affects the fewest programmers, but ranges are far-reaching enough that it's probably better to go with the second solution (or just outright ignore the issue like I suggested, though if there are any structs in Phobos with arbitrary-cost postblit constructors where this could be an issue, we probably should deal with them - be it by ref-counting or whatever makes the most sense).

- Jonathan M Davis
Oct 29 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/6/10 11:34 AM, Andrei Alexandrescu wrote:
 I think ranges are a great thing, having simplicity as one of their
 advantages. In the beginning they were indeed rather simple:

 struct MyRange {
 bool empty();
 ref Widget front();
 void popFront();
 }

 with the appropriate refinements for bidirectional and random.

 Then there was the need to distinguish one-pass, "input" ranges (e.g.
 files) from many-pass, "forward" ranges. So the "save" function got born
 for forward ranges and above:

 struct MyRange {
 bool empty();
 ref Widget front();
 void popFront();
 MyRange save();
 }

 Then @property came about which required extra adornments:

 struct MyRange {
  @property bool empty();
  @property ref Widget front();
 void popFront();
  @property MyRange save();
 }

 Then some ranges were unable to return ref, but they could receive
 assignment. I call these /sealed ranges/ because they "seal" the address
 of their elements making it inaccessible:

 struct MyRange {
  @property bool empty();
  @property Widget front();
  @property void front(Widget);
 void popFront();
  @property MyRange save();
 }

 This made bidirectional and random-access range interfaces quite big
 because now we're talking about back() (two versions), opIndex(),
 opIndexAssign() and opIndexOpAssign().

 Until now I think we're solid on design decisions. The discussion starts
 here.

 And then there was this nagging thing which is well-understood in the
 C++ world. A sealed range returns by value and accepts by value. But in
 C++, an object's copy constructor is arbitrarily expensive. The library
 must be designed in accordance to that and carefully navigate around
 that issue.

 For example, swap is a fundamental primitive used in many algorithms. It
 should swap objects at constant cost. Indeed, for references, swap is
 easy to implement:

 void swap(T)(ref T lhs, ref T rhs)
 {
 assert(!pointsTo(lhs, rhs) && !pointsTo(rhs, lhs)
 && !pointsTo(lhs, lhs) && !pointsTo(rhs, rhs));
 T tmp = move(lhs);
 lhs = move(rhs);
 rhs = move(tmp);
 }

 or similar. However, a sealed range does not offer references, so trying
 e.g.

 swap(r1.front, r2.front);

 will not work. This is a problem.

 To solve that problem, I introduced moveFront(), moveBack(), and
 moveAt(size_t), all of which destructively read the front, back, or an
 indexed element respectively off the range. Then you can swap r1.front
 with r2.front at constant cost like this:

 T tmp = r1.moveFront();
 r1.front = r2.moveFront();
 r2.front = move(tmp);

 All of this works and is rock-solid, but it does load the range
 interface considerably. To a newcomer coming without the background
 above, a full-fledged range definition may look quite loaded.

 One simplification is to simply decree that Phobos (and D in general)
 shuns objects with eager copy. Any this(this) could be considered
 constant cost. That would have two consequences:

 1. It would simplify all ranges and many algorithms because there's no
 more need for moveXxx and swapping can be done the old-fashioned way
 (with assignments in inline code):

 T tmp = r1.front;
 r1.front = r2.front;
 r2.front = tmp;

 In general many things become considerably easier if you can simply
 assume that copying objects around won't be part of your complexity
 requirements or the practical costs of your algorithm.

 2. It would force certain types (such as BigInt) that allocate resources
 and have value semantics to resort to reference counting.

 3. It would give more legitimacy to sealed objects that return data by
 value (as opposed to reference). I believe sealed objects are very
 important for safety.

 4. It would be a definite departure from C++, where all value copies are
 considered of arbitrary cost. This would provide a convenient straw-man
 for naysayers (e.g. "Hey, D calls the copy constructor even more often
 than C++! No thanks, I'll stick with C++0x which solves it all with
 rvalue references").

 5. It would make things look and feel familiar to people coming from any
 other languages than C++, who seldom need to define a costly this(this)
 at all.

 Please discuss.


 Andrei

Thanks to all for a very illuminating discussion. The dialog revealed that the topic of cheap copy construction is oddly linked with the topic of garbage collection: garbage collection calls destructors concurrently with other code, which means reference count code must be interlocked, which makes it not-so-cheap.

This is only a manifestation of a deeper problem: destructors can execute arbitrary code, and, if run in an arbitrary thread, can cause all sorts of deadlocks and essentially render "shared" inoperant.

Walter and I discussed two possibilities this morning:

1. Never call class destructors (or never call them if more than one thread is running).

2. Each destructor should be called in the thread that created the object.

The second solution is quite attractive. It can be implemented by associating each thread with a worklist - a list of objects to destroy. The GC can run in any thread, but it will not call destructors. Instead, it adds objects to be destroyed to the worklists of the threads that created them (this means we need to add a field to each object with the thread ID of the creating thread). When client code calls new, the allocation routine will first inspect the worklist and call the destructors if needed.

I think this can be made to work and has good properties, although a fair amount of details need to be figured out. Please share any thoughts you may have.

Andrei
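A minimal sketch of how such a per-thread worklist might look. All names here are illustrative assumptions, not an actual druntime design; the runtime would have to wire add() into the GC and drain() into each thread's allocation path:

```d
import core.sync.mutex : Mutex;

// Sketch only: each thread owns one of these. The GC appends deferred
// destructors; the owning thread drains the list on its next allocation,
// so destructors always run in the creating thread.
class WorkList
{
    private void delegate()[] pending;
    private Mutex lock;

    this() { lock = new Mutex; }

    // Called by the GC, possibly from a thread other than the owner.
    void add(void delegate() dtor)
    {
        synchronized (lock) pending ~= dtor;
    }

    // Called by the owning thread at the top of its allocation routine.
    void drain()
    {
        void delegate()[] work;
        synchronized (lock)
        {
            work = pending;
            pending = null;
        }
        foreach (dtor; work)
            dtor(); // runs in the creating thread, no interlocking needed
    }
}
```

Note that drain() copies the list out under the lock and runs the destructors outside it, since a destructor may itself allocate and re-enter drain().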
Nov 04 2010
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s
 Thanks to all for a very illuminating discussion. The dialog revealed
 that the topic of cheap copy construction is oddly linked with the topic
 of garbage collection: garbage collection calls destructors concurrently
 with other code, which means reference count code must be interlocked,
 which makes it not-so-cheap.
 This is only a manifestation of a deeper problem: destructors can
 execute arbitrary code, and, if ran in an arbitrary thread, can cause
 all sorts of deadlocks and essentially render "shared" inoperant.
 Walter and I discussed two possibilities this morning:
 1. Never call class destructors (or never call them if more than one
 thread is running)
 2. Each destructor should be called in the thread that created the object.
 The second solution is quite attractive. It can be implemented by
 associating each thread with a worklist - a list of objects to destroy.
 The GC can run in any thread but it will not call destructors. Instead,
 it adds objects to be destroyed to the threads that created them (this
 means we need to add a field to each object with the thread ID of the
 creating thread). When client code calls new, the allocation routine
 will first inspect the worklist and call the destructors if needed.
 I think this can be made to work and has good properties, although a
 fair amount of details need to be figured out. Please share any thoughts
 you may have.
 Andrei

Interesting, though I feel like the baseline cruft for Object is getting a little big. I think we could store the thread ID and the monitor object in the same memory. If a class is being synchronized on, then it's shared rather than being owned by a single thread. Therefore, at an offset of 1, a class either has a pointer to a monitor (if it's shared, and therefore can be destroyed in any thread) or a thread ID (if it's unshared). To tell them apart, just see if the classinfo pointer points to a Monitor or a Thread.
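The slot-reuse idea above could be sketched roughly as follows. Monitor here is a stand-in class and the discriminate-by-dynamic-type test is an assumption for illustration, not the actual druntime object layout:

```d
import core.thread : Thread;

// Stand-in for the runtime's monitor type; the real layout differs.
class Monitor {}

// Suppose `slot` is whatever object reference the reused monitor
// field of a class instance currently holds.
bool canFinalizeInAnyThread(Object slot)
{
    // A Monitor in the slot means the object was synchronized on,
    // i.e. it's shared, so any thread may run its destructor.
    return cast(Monitor) slot !is null;
}

Thread ownerOf(Object slot)
{
    // Otherwise the slot holds the owning Thread of an unshared object.
    return cast(Thread) slot;
}
```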
Nov 04 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/4/10 2:06 PM, dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s
 2. Each destructor should be called in the thread that created the object.
 The second solution is quite attractive. It can be implemented by
 associating each thread with a worklist - a list of objects to destroy.
 The GC can run in any thread but it will not call destructors. Instead,
 it adds objects to be destroyed to the threads that created them (this
 means we need to add a field to each object with the thread ID of the
 creating thread). When client code calls new, the allocation routine
 will first inspect the worklist and call the destructors if needed.
 I think this can be made to work and has good properties, although a
 fair amount of details need to be figured out. Please share any thoughts
 you may have.
 Andrei

Interesting, though I feel like the baseline cruft for Object is getting a little big. I think we could store the thread ID and the monitor object in the same memory. If a class is being synchronized on, then it's shared rather than being owned by a single thread. Therefore, at an offset of 1, a class either has a pointer to a monitor (if it's shared, and therefore can be destroyed in any thread) or a thread ID (if it's unshared). To tell them apart, just see if the classinfo pointer points to a Monitor or a Thread.

Yah, my thoughts exactly. My feeling is that this is one of those areas where it's difficult to find a solution that's at the same time simple and good. Sean, any thoughts? Andrei
Nov 04 2010
parent Sean Kelly <sean invisibleduck.org> writes:
Andrei Alexandrescu Wrote:

 On 11/4/10 2:06 PM, dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s
 2. Each destructor should be called in the thread that created the object.
 The second solution is quite attractive. It can be implemented by
 associating each thread with a worklist - a list of objects to destroy.
 The GC can run in any thread but it will not call destructors. Instead,
 it adds objects to be destroyed to the threads that created them (this
 means we need to add a field to each object with the thread ID of the
 creating thread). When client code calls new, the allocation routine
 will first inspect the worklist and call the destructors if needed.
 I think this can be made to work and has good properties, although a
 fair amount of details need to be figured out. Please share any thoughts
 you may have.
 Andrei

Interesting, though I feel like the baseline cruft for Object is getting a little big. I think we could store the thread ID and the monitor object in the same memory. If a class is being synchronized on, then it's shared rather than being owned by a single thread. Therefore, at an offset of 1, a class either has a pointer to a monitor (if it's shared, and therefore can be destroyed in any thread) or a thread ID (if it's unshared). To tell them apart, just see if the classinfo pointer points to a Monitor or a Thread.

Yah, my thoughts exactly. My feeling is that this is one of those areas where it's difficult to find a solution that's at the same time simple and good. Sean, any thoughts?

Right now monitors are structs, typically allocated with malloc. I think they could be transformed to classes in conjunction with some porting of old runtime C code to D, but that's more than a few line fix. It also doesn't address the issue of dynamically allocated structs, but perhaps those will be ruled as not-to-be-finalized-by-the-GC? An alternative would be for the GC to store this info in a lookup table of some sort, though I do like the dual use of the monitor slot in objects.
Nov 04 2010
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 04 Nov 2010 14:38:59 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Thanks to all for a very illuminating discussion. The dialog revealed  
 that the topic of cheap copy construction is oddly linked with the topic  
 of garbage collection: garbage collection calls destructors concurrently  
 with other code, which means reference count code must be interlocked,  
 which makes it not-so-cheap.

 This is only a manifestation of a deeper problem: destructors can  
 execute arbitrary code, and, if ran in an arbitrary thread, can cause  
 all sorts of deadlocks and essentially render "shared" inoperant.

 Walter and I discussed two possibilities this morning:

 1. Never call class destructors (or never call them if more than one  
 thread is running)

 2. Each destructor should be called in the thread that created the  
 object.

 The second solution is quite attractive. It can be implemented by  
 associating each thread with a worklist - a list of objects to destroy.  
 The GC can run in any thread but it will not call destructors. Instead,  
 it adds objects to be destroyed to the threads that created them (this  
 means we need to add a field to each object with the thread ID of the  
 creating thread). When client code calls new, the allocation routine  
 will first inspect the worklist and call the destructors if needed.

 I think this can be made to work and has good properties, although a  
 fair amount of details need to be figured out. Please share any thoughts  
 you may have.

What if a thread no longer exists? Would there be a way to mark such dtors that need this feature? I don't know, because it seems to me like a lot of band-aiding for something that is infrequently used (I've gotten along fine in D without reference counting for as long as I've used it). You also need to provide a pointer in the object for a linked list for the work-list.

I'm thinking I would prefer to have the destructor and finalizer split, so the 'destructor' can tell whether it's being called synchronously or from the GC. Then the dtor itself can handle the whole worklist implementation.

I guess the only extra thing you would need in addition to that is a hook so memory allocation from a given thread can be hooked to process the worklist. Or maybe the worklist becomes a standard feature of the thread class, and you manually add the task from the destructor instead of the GC doing it for you.

-Steve
Nov 04 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/4/10 2:45 PM, Steven Schveighoffer wrote:
 On Thu, 04 Nov 2010 14:38:59 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 I think this can be made to work and has good properties, although a
 fair amount of details need to be figured out. Please share any
 thoughts you may have.

What if a thread no longer exists? Would there be a way to mark such dtors that need this feature?

Probably a thread that dies (by crashing) will never call its destructors.
 I don't know, because it seems to me like
 a lot of band-aiding for something that is infrequently used (I've
 gotten along fine in D without reference counting for as long as I've
 used it).

If you use File, you use reference counting. We do need to define a policy for copying expensive objects.
 You also need to provide a pointer in the object for a linked list for
 the work-list.

I think it's enough if the thread has the list, but I may well be wrong.
 I'm thinking I would prefer to have the destructor and finalizer split,
 so the 'destructor' can tell whether it's being called synchronously or
 from the GC. Then the dtor itself can handle the whole worklist
 implementation.

 I guess the only extra thing you would need in addition to that is a
 hook so memory allocation from a given thread can be hooked to process
 the worklist. Or maybe the worklist becomes a standard feature of the
 thread class, and you manually add the task from the destructor instead
 of the GC doing it for you.

These are all interesting options. At the end of the day, what we want to guarantee is no races in non-shared objects. Andrei
Nov 04 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/4/10 3:09 PM, Steven Schveighoffer wrote:
 On Thu, 04 Nov 2010 15:55:07 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 11/4/10 2:45 PM, Steven Schveighoffer wrote:
 On Thu, 04 Nov 2010 14:38:59 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 I think this can be made to work and has good properties, although a
 fair amount of details need to be figured out. Please share any
 thoughts you may have.

What if a thread no longer exists? Would there be a way to mark such dtors that need this feature?

Probably a thread that dies (by crashing) will never call its destructors.

It is quite possible that a thread exits while there are active, non-GC'd objects owned by it in the heap. There's no way to determine this when the thread exits without running a GC collection, which is unacceptable.

Oh, I see. Indeed, thread cleanup code should also go down the worklist and call destructors. By definition, no unshared objects should be visible by other threads. [snip]
 I just was saying that I prefer solutions where we don't bend the entire
 language to cater to reference counting, if we can help it. I'd rather
 bend the reference counting implementation to fit the language.
 Something in the middle might be good too.

Absolutely. Andrei
Nov 04 2010
next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-11-04 16:22:27 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Oh, I see. Indeed, thread cleanup code should also go down the worklist 
 and call destructors. By definition, no unshared objects should be 
 visible by other threads.

Just a silly question... which thread will own immutable objects? And how does the GC know we've cast an object to immutable? Or what happens if we just have one immutable member in the middle of a thread-local struct? Immutable implicitly means shared, so you can leak references to immutable members at will. The same problem arises if you have a shared member as a member of your struct or class.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Nov 04 2010
prev sibling parent Sean Kelly <sean invisibleduck.org> writes:
Steven Schveighoffer Wrote:
 
 So the object isn't in the worklist when the thread exits, and it doesn't  
 get marked for destruction until some other thread runs a GC cycle.

Right.
Nov 04 2010
prev sibling next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-11-04 16:09:26 -0400, "Steven Schveighoffer" 
<schveiguy yahoo.com> said:

 I just was saying that I prefer solutions where we don't bend the 
 entire  language to cater to reference counting, if we can help it.  
 I'd rather  bend the reference counting implementation to fit the 
 language.  Something  in the middle might be good too.

I just want to drop a note at this. I know it's not exactly the same issue, but... If you've read my plans for D/Objective-C, you'll have seen there's automatic reference counting for Objective-C objects on the list. There's obviously no plan to implement that for other things than Objective-C objects in that document, but once it's implemented for one type it'll be trivial to allow it for other types.

I've thought thoroughly about the issue before choosing that direction. There are two reasons to implement it in the compiler: ease of use and efficiency of the generated code.

Cocoa traditionally has used explicit calls to retain/release (with some help from autorelease). This is tedious and error-prone and looks out of place in D. The D way would be to create a ref-counting struct that wraps Objective-C objects and calls retain/release for you. But the problem with that is you'd have to use this struct for each and every Objective-C object reference and never forget to use it. We could make the compiler forbid not wrapping Objective-C objects in this struct, but at this point, why not make the compiler wrap the reference in that struct transparently for you? And if it comes to this, it's probably easier to just skip that struct-wrapping scheme entirely and just issue calls to the relevant function upon assignment and destruction of an Objective-C object reference.

And this will probably open the door to optimization opportunities. Changing an atomic reference count is expensive, but by noting that any paired call to retain and release on the same object has no net effect, you can avoid a lot of these calls. How many can be avoided? Well, in theory, if the reference count when the object enters the function is the same as the reference count when the object leaves the function's code, then you don't have to touch the reference counter at all. Any function that doesn't leak an object received as an argument doesn't have to touch the reference count. That's a lot of functions!
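The elision argument can be made concrete with a small, purely illustrative sketch. Widget is a placeholder type, and retain/release are hypothetical compiler-inserted calls, not an existing D API:

```d
class Widget { void draw() {} }

// A compiler inserting retain/release around this parameter can
// observe that useWidget neither stores nor returns w: the paired
// calls cancel out and may be elided entirely - something a library
// wrapper struct, which must run its postblit on every copy, cannot do.
void useWidget(Widget w)
{
    // compiler would insert: retain(w);   // +1
    w.draw();
    // compiler would insert: release(w);  // -1 => net zero; both
    // calls are removable, so no atomic operation is ever issued.
}
```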
So in the end, I think that if reference counting is going to be used, even just moderately, it'd better be in the compiler for efficiency reasons. That's only an advantage if you use atomic reference counting, but it's the kind of thing that prevents ridiculous "optimizations" like making function parameters 'ref' just to avoid touching the reference counter... see:
<http://stackoverflow.com/questions/2502394/the-cost-of-passing-by-shared-ptr>

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Nov 04 2010
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
Steven Schveighoffer Wrote:
 
 It is quite possible that a thread exits while there are active, non-GC'd  
 objects owned by it in the heap.  There's no way to determine this when  
 the thread exits without running a GC collection, which is unacceptable.

Right. But it doesn't matter who finalizes this data, since it won't be competing with the only other thread that actually referenced it.
Nov 04 2010
parent Sean Kelly <sean invisibleduck.org> writes:
Sean Kelly Wrote:

 Steven Schveighoffer Wrote:
 
 It is quite possible that a thread exits while there are active, non-GC'd  
 objects owned by it in the heap.  There's no way to determine this when  
 the thread exits without running a GC collection, which is unacceptable.

Right. But it doesn't matter who finalizes this data, since it won't be competing with the only other thread that actually referenced it.

What I meant was that it won't be competing with the owner of any unshared data referenced by this object, since that thread has terminated.
Nov 04 2010
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
Steven Schveighoffer Wrote:
 
 What if a thread no longer exists?  Would there be a way to mark such  
 dtors that need this feature?  I don't know, because it seems to me like a  
 lot of band-aiding for something that is infrequently used (I've gotten  
 along fine in D without reference counting for as long as I've used it).

If a thread no longer exists then it doesn't matter who calls the finalizers, so long as they aren't finalized concurrently by more than one thread. To simplify things, the thread could notify the GC on termination so it could eagerly do whatever maintenance it wants to.

The only time this dtor wouldn't be called is for threads that terminate abnormally via pthread_exit or Windows TerminateThread, but because such threads have been left in an invalid state (uncalled dtors or other scoped code may have left mutexes locked, etc.), finalizing any known data risks a deadlock or worse. However, since these thread termination routines aren't supported in D (or C++ for that matter, as far as I know), not finalizing the data would simply be another effect of the resulting undefined behavior.
 You also need to provide a pointer in the object for a linked list for the  
 work-list.

The work-list could be a list of delegates. Then the footprint of objects doesn't need to be altered.
 I'm thinking I would prefer to have the destructor and finalizer split, so  
 the 'destructor' can tell whether it's being called synchronously or from  
 the GC.  Then the dtor itself can handle the whole worklist implementation.

This is actually very easy, as druntime already differentiates between the two (the 'det' flag in rt_finalize). I haven't thought enough about where the worklist code should live though. All the GC knows about finalization is that it calls rt_finalize() on blocks with the FINALIZE bit set, but at the same time it's a perfect choke-point for iterating on the work-list when allocations are performed. There are three situations that have to be dealt with:

1. The GC finalizes a shared object.
2. The GC finalizes an unshared object, but the GC was invoked by the object's owner thread.
3. The GC finalizes an unshared object, but the GC was invoked by another thread.

The GC would probably be the best place to track which thread owned what object, so it would probably be the best place to store the work-list as well. This is somewhat similar to how HOARD handles freed memory, so there is certainly a precedent for an allocator taking care of this sort of thing.
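The three situations might be dispatched along these lines. Only rt_finalize corresponds to an existing runtime hook; isShared, ownerOf, WorkList, and worklistOf are hypothetical helpers standing in for whatever bookkeeping the GC would actually keep:

```d
import core.thread : Thread;

// Existing runtime hook (signature simplified for the sketch).
extern (C) void rt_finalize(void* p);

// Hypothetical helpers - assumptions for illustration only.
bool isShared(Object obj);
Thread ownerOf(Object obj);
class WorkList { void add(void delegate() dtor) { /* enqueue */ } }
WorkList worklistOf(Thread t);

void finalizeFromGC(Object obj)
{
    if (isShared(obj) || ownerOf(obj) is Thread.getThis())
    {
        // Cases 1 and 2: a shared object, or an unshared object whose
        // owner is the collecting thread - safe to finalize right here.
        rt_finalize(cast(void*) obj);
    }
    else
    {
        // Case 3: another thread owns it; defer to that thread's
        // worklist, drained the next time it allocates.
        worklistOf(ownerOf(obj)).add(() => rt_finalize(cast(void*) obj));
    }
}
```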
 I guess the only extra thing you would need in addition to that is a hook  
 so memory allocation from a given thread can be hooked to process the  
 worklist.  Or maybe the worklist becomes a standard feature of the thread  
 class, and you manually add the task from the destructor instead of the GC  
 doing it for you.

Hm... this would delay the finalization of shared objects as well, but perhaps that doesn't matter since they can be safely finalized by any thread. I'm still inclined to say that the GC should take care of the grunt work though, unless someone wants to argue that the owner thread ID of every object should be stored in a new hidden field in the object itself.
Nov 04 2010
parent Sean Kelly <sean invisibleduck.org> writes:
Steven Schveighoffer Wrote:

 On Thu, 04 Nov 2010 18:35:51 -0400, Sean Kelly <sean invisibleduck.org>  
 wrote:
 
 Steven Schveighoffer Wrote:
 What if a thread no longer exists?  Would there be a way to mark such
 dtors that need this feature?  I don't know, because it seems to me  
 like a
 lot of band-aiding for something that is infrequently used (I've gotten
 along fine in D without reference counting for as long as I've used it).

If a thread no longer exists then it doesn't matter who calls the finalizers so long as they aren't finalized concurrently by more than one thread. To simplify things, the thread could notify the GC on termination so it could eagerly do whatever maintenance it wants to.

Right, but in Andrei's idea (from what I can tell), the GC does not call destructors, it just adds to thread-specific worklists. But I think it's a non-issue. Just stick it on the worklist of the current thread, since the dead thread doesn't care anymore. I'm assuming the current thread's worklist is cleaned at the end of the GC cycle.

The beginning, before a collection is triggered. It would be like a mini-collection.
 The GC would probably be the best place to track which thread owned what  
 object so it would probably be the best place to store the work-list as  
 well.  This is somewhat similar to how HOARD handles freed memory, so  
 there is certainly a precedent for an allocator taking care of this sort  
 of thing.

It could be in the Thread object, or in a thread's TLS. I feel the Thread object is much easier to get at from another thread than another thread's TLS, or is there an easy way to do this? Hm... I just had a thought -- what is going to be the performance of continuously looking up the right worklist to add to based on the thread ID?

That's one reason I thought the work list should be in the GC. It could store the per-block field in a parallel block of memory to each page, and this would be an index into an array. When a thread terminates the work list could be processed and released for re-use. No opaque function calls for lookup or anything like that.
 I'm still inclined to say that the GC should take care of the grunt work  
 though, unless someone wants to argue that owner thread ID of every  
 object should be stored in a new hidden field in the object itself.

I think that is what Andrei is suggesting. Unless we want to change the language to be able to flag reference counted types so the GC can see that.

I'm just not sure it's worth changing the ABI for this. Particularly when the "thread ID" is implementation-defined.
 Can we encapsulate all this function into a member?  So for instance you  
 just add a RefCount type in your struct, and it contains the code to deal  
 with this problem?

Probably.
Nov 04 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 04 Nov 2010 15:55:07 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 11/4/10 2:45 PM, Steven Schveighoffer wrote:
 On Thu, 04 Nov 2010 14:38:59 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 I think this can be made to work and has good properties, although a
 fair amount of details need to be figured out. Please share any
 thoughts you may have.

What if a thread no longer exists? Would there be a way to mark such dtors that need this feature?

Probably a thread that dies (by crashing) will never call its destructors.

It is quite possible that a thread exits while there are active, non-GC'd objects owned by it in the heap. There's no way to determine this when the thread exits without running a GC collection, which is unacceptable. A simple example of a thread function that exits properly but leaves data in the heap:

void mythread()
{
    auto c = new ClassObject();
}
 I don't know, because it seems to me like
 a lot of band-aiding for something that is infrequently used (I've
 gotten along fine in D without reference counting for as long as I've
 used it).

If you use File, you use reference counting. We do need to define a policy for copying expensive objects.

hehe, most of my heavy d work has been developing dcollections or from my previous job (where I used Tango). So I haven't yet been using File :) For the times I have used File, the reference counting aspect of it has not been important (i.e. short utilities that exit quickly).
 You also need to provide a pointer in the object for a linked list for
 the work-list.

I think it's enough if the thread has the list, but I may as well be wrong.

No, I mean if you have a spot for a pointer in every object, then the object memory itself can act as the linked list element that is in the worklist. Otherwise, you have to allocate extra memory to be put into the list and which points at the objects to be destroyed (which I think would be overkill). The Thread object would still contain the list head pointer.
 I'm thinking I would prefer to have the destructor and finalizer split,
 so the 'destructor' can tell whether it's being called synchronously or
 from the GC. Then the dtor itself can handle the whole worklist
 implementation.

 I guess the only extra thing you would need in addition to that is a
 hook so memory allocation from a given thread can be hooked to process
 the worklist. Or maybe the worklist becomes a standard feature of the
 thread class, and you manually add the task from the destructor instead
 of the GC doing it for you.

These are all interesting options. At the end of the day, what we want to guarantee is no races in non-shared objects.

Understood. I think we should explore all options, to see what makes the most sense. I did like the description of the way Cocoa does it from Michel Fortin on the mailing list. It might also be worth looking into.

I just was saying that I prefer solutions where we don't bend the entire language to cater to reference counting, if we can help it. I'd rather bend the reference counting implementation to fit the language. Something in the middle might be good too.

-Steve
Nov 04 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 04 Nov 2010 16:22:27 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 11/4/10 3:09 PM, Steven Schveighoffer wrote:
 On Thu, 04 Nov 2010 15:55:07 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 11/4/10 2:45 PM, Steven Schveighoffer wrote:
 On Thu, 04 Nov 2010 14:38:59 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 I think this can be made to work and has good properties, although a
 fair amount of details need to be figured out. Please share any
 thoughts you may have.

What if a thread no longer exists? Would there be a way to mark such dtors that need this feature?

Probably a thread that dies (by crashing) will never call its destructors.

It is quite possible that a thread exits while there are active, non-GC'd objects owned by it in the heap. There's no way to determine this when the thread exits without running a GC collection, which is unacceptable.

Oh, I see. Indeed, thread cleanup code should also go down the worklist and call destructors. By definition, no unshared objects should be visible by other threads.

Well, yes, but not exactly :) You see, an object could not have been targeted for destruction by the GC until *after* the thread that allocated it has exited. I suspect that the right answer here is for the GC to just destroy it, but if it's not equipped to run those dtors, it might be a sticky thing to implement. So the object isn't in the worklist when the thread exits, and it doesn't get marked for destruction until some other thread runs a GC cycle. -Steve
Nov 04 2010
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
Andrei Alexandrescu Wrote:

 On 10/6/10 11:34 AM, Andrei Alexandrescu wrote:

 Until now I think we're solid on design decisions. The discussion starts
 here.

 And then there was this nagging thing which is well-understood in the
 C++ world. A sealed range returns by value and accepts by value. But in
 C++, an object's copy constructor is arbitrarily expensive. The library
 must be designed in accordance to that and carefully navigate around
 that issue.

 For example, swap is a fundamental primitive used in many algorithms. It
 should swap objects at constant cost. Indeed, for references, swap is
 easy to implement:

 void swap(T)(ref T lhs, ref T rhs)
 {
     assert(!pointsTo(lhs, rhs) && !pointsTo(rhs, lhs)
         && !pointsTo(lhs, lhs) && !pointsTo(rhs, rhs));
     T tmp = move(lhs);
     lhs = move(rhs);
     rhs = move(tmp);
 }

 or similar. However, a sealed range does not offer references, so trying
 e.g.

 swap(r1.front, r2.front);

 will not work. This is a problem.


I probably missed it, but I imagine there was discussion of sealed ranges returning a wrapper struct for front() that's implicitly convertible to the underlying type? Then a specialization of swap could do what's needed.
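One way the wrapper idea above could look, as a sketch only (the `FrontProxy` name and the sealed range used here are invented for illustration; they are not part of Phobos):

```d
// Hypothetical sketch: front() would return a proxy that converts
// implicitly to the element type, and swap() is specialized for it.
struct FrontProxy(R)
{
    R* range;

    auto get() { return range.front; }       // read through the getter
    alias get this;                          // implicit conversion to the element

    void opAssign(typeof(range.front) v)     // write through the setter
    {
        range.front = v;
    }
}

// Specialized swap: three element copies through the sealed interface,
// never an address of the element.
void swap(R)(FrontProxy!R a, FrontProxy!R b)
{
    auto tmp = a.get;
    a = b.get;
    b = tmp;
}
```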
 Thanks to all for a very illuminating discussion. The dialog revealed 
 that the topic of cheap copy construction is oddly linked with the topic 
 of garbage collection: garbage collection calls destructors concurrently 
 with other code, which means reference count code must be interlocked, 
 which makes it not-so-cheap.
 
 This is only a manifestation of a deeper problem: destructors can 
 execute arbitrary code and, if run in an arbitrary thread, can cause 
 all sorts of deadlocks and essentially render "shared" inoperative.
 
 Walter and I discussed two possibilities this morning:
 
 1. Never call class destructors (or never call them if more than one 
 thread is running)
 
 2. Each destructor should be called in the thread that created the object.
 
 The second solution is quite attractive. It can be implemented by 
 associating each thread with a worklist - a list of objects to destroy. 
 The GC can run in any thread but it will not call destructors. Instead, 
 it adds objects to be destroyed to the threads that created them (this 
 means we need to add a field to each object with the thread ID of the 
 creating thread). When client code calls new, the allocation routine 
 will first inspect the worklist and call the destructors if needed.
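The scheme quoted above might be sketched as follows (the `creatorTid` field, the `worklists` table, and both function names are invented for illustration; a real implementation would keep the table inside the runtime, guarded by the GC lock):

```d
import core.thread : Thread, ThreadID;

// Hypothetical per-object field: set by the allocator at construction time.
class Obj
{
    ThreadID creatorTid;
}

// Illustration only: maps each creating thread to its list of dead objects.
Obj[][ThreadID] worklists;

// GC side: never call the destructor, just hand the object to its owner.
void deferFinalize(Obj o)
{
    worklists[o.creatorTid] ~= o;
}

// Mutator side: every call to `new` would first drain the caller's list,
// so destructors always run in the creating thread.
void drainOwnWorklist()
{
    auto tid = Thread.getThis.id;
    if (auto list = tid in worklists)
    {
        foreach (o; *list)
            destroy(o);
        *list = null;
    }
}
```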

This should only be necessary for non-shared instances. Which, admittedly, are the majority of allocated objects. So either an optional ID per memory block or instead per-thread pools to allocate from. The ID approach would be easier to implement with the current GC though. The same would be required for dynamically allocated structs once the GC bug where they currently aren't finalized has been fixed (or a proclamation were made that dynamic structs won't be finalized by the GC).
Nov 04 2010
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 04 Nov 2010 18:35:51 -0400, Sean Kelly <sean invisibleduck.org>  
wrote:

 Steven Schveighoffer Wrote:
 What if a thread no longer exists?  Would there be a way to mark such
 dtors that need this feature?  I don't know, because it seems to me  
 like a
 lot of band-aiding for something that is infrequently used (I've gotten
 along fine in D without reference counting for as long as I've used it).

If a thread no longer exists then it doesn't matter who calls the finalizers so long as they aren't finalized concurrently by more than one thread. To simplify things, the thread could notify the GC on termination so it could eagerly do whatever maintenance it wants to.

Right, but in Andrei's idea (from what I can tell), the GC does not call destructors, it just adds to thread-specific worklists. But I think it's a non-issue. Just stick it on the worklist of the current thread, since the dead thread doesn't care anymore. I'm assuming the current thread's worklist is cleaned at the end of the GC cycle.
 You also need to provide a pointer in the object for a linked list for  
 the
 work-list.

The work-list could be a list of delegates. Then the footprint of objects doesn't need to be altered.
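The delegate variant quoted above could be sketched like this (per-thread list, invented names): object layout stays untouched, at the cost of one closure allocation per deferred object, which is the trade-off debated in the reply below.

```d
// Hypothetical sketch: the worklist holds cleanup delegates instead of
// threading an intrusive pointer through the objects themselves.
void delegate()[] worklist; // one per thread, illustration only

// GC side: capture the object in a closure (this allocates).
void deferFinalize(Object o)
{
    worklist ~= { destroy(o); };
}

// Owner-thread side: run and clear all deferred destructors.
void drain()
{
    foreach (dg; worklist)
        dg();
    worklist = null;
}
```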

Yes, but then you need to allocate memory to hold the delegates. I don't know if this is an efficient use of memory. Why not just bump up the size of an object by one pointer?
 I'm thinking I would prefer to have the destructor and finalizer split,  
 so
 the 'destructor' can tell whether it's being called synchronously or  
 from
 the GC.  Then the dtor itself can handle the whole worklist  
 implementation.

This is actually very easy, as druntime already differentiates between the two (the 'det' flag in rt_finalize). I haven't thought enough about where the worklist code should live though. All the GC knows about finalization is that it calls rt_finalize() on blocks with the FINALIZE bit set, but at the same time it's a perfect choke-point for iterating on the work-list when allocations are performed. There are three situations that have to be dealt with:

1. The GC finalizes a shared object.
2. The GC finalizes an unshared object, but the GC was invoked by the object's owner thread.
3. The GC finalizes an unshared object, but the GC was invoked by another thread.

4. A thread exits with a non-empty worklist. I think this might reduce to case 2, but it's not a case where you're already taking the global GC lock (vs. cleaning up when you call new).
 The GC would probably be the best place to track which thread owned what  
 object so it would probably be the best place to store the work-list as  
 well.  This is somewhat similar to how HOARD handles freed memory, so  
 there is certainly a precedent for an allocator taking care of this sort  
 of thing.

It could be in the Thread object, or in a thread's TLS. I feel the Thread object is much easier to get at from another thread than another thread's TLS, or is there an easy way to do this? Hm... I just had a thought -- what is going to be the performance of continuously looking up the right worklist to add to based on the thread ID?
 I guess the only extra thing you would need in addition to that is a  
 hook
 so memory allocation from a given thread can be hooked to process the
 worklist.  Or maybe the worklist becomes a standard feature of the  
 thread
 class, and you manually add the task from the destructor instead of the  
 GC
 doing it for you.

Hm... this would delay the finalization of shared objects as well, but perhaps that doesn't matter since they can be safely finalized by any thread.

There might be no point in using this weird scheme for shared objects -- you need to use locking to do reference counting anyway, so you might as well just do that.
 I'm still inclined to say that the GC should take care of the grunt work  
 though, unless someone wants to argue that owner thread ID of every  
 object should be stored in a new hidden field in the object itself.

I think that is what Andrei is suggesting. Unless we want to change the language to be able to flag reference counted types so the GC can see that. Can we encapsulate all this function into a member? So for instance you just add a RefCount type in your struct, and it contains the code to deal with this problem? -Steve
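The encapsulation asked about above might look like this sketch: a `RefCount` member whose postblit and destructor do the bookkeeping automatically (the type, its `make` factory, and the worklist hook mentioned in the comment are all invented here, not an existing Phobos facility):

```d
// Hypothetical sketch: embed this in a struct and copying/destruction of
// the outer struct drive the reference count with no extra code.
struct RefCount
{
    size_t* count;

    static RefCount make()
    {
        RefCount rc;
        rc.count = new size_t;
        *rc.count = 1;
        return rc; // moved out, so the count stays at 1
    }

    this(this) // postblit: a copy of the enclosing struct was made
    {
        if (count)
            ++*count;
    }

    ~this()
    {
        if (count && --*count == 0)
        {
            // last owner: release the payload here -- or, when invoked
            // from the GC, push the cleanup onto the creating thread's
            // worklist instead of running it in the collector's thread
        }
    }
}
```

A user-defined struct would then just declare a `RefCount` field and initialize it from a factory function; every copy and every scope exit adjusts the count through the postblit and destructor above.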
Nov 04 2010