www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - moveFront() and friends: Request for comment

reply dsimcha <dsimcha yahoo.com> writes:
Andrei and I were talking on the Phobos list and deep in a newsgroup thread
about whether Phobos should make a serious effort to efficiently support
structs with arbitrarily complex, expensive postblits.  Such support includes
the moveFront(), moveBack() and moveAt() range primitives, which are designed
to allow a struct with arbitrarily expensive copying to be efficiently moved
(rather than copied) from the range to another variable.

My general feeling is ranges are becoming too complex, that they already work
well enough for the vast majority of cases, and that any more complexity would
make them nearly impossible to write correctly, especially in the case of
higher order ranges.  I think the amount of time it took to get std.range and
the range parts of std.algorithm into a reasonably bug-free state supports
this assessment.  In general, for reasons I won't detail to avoid driving this
post off topic, I think D has been complicating common cases to fix corner
cases too much lately.

With regard to supporting postblits specifically, to me there are two points
to having value semantics instead of reference semantics:

1.  For small primitive types value semantics are more efficient to implement.

2.  Value semantics avoid uncontrolled aliasing and make programs easier to
reason about.

However, I think number 2 is a red herring when it comes to structs,
especially if you care about performance or handling out of memory exceptions.
 The net effect of hiding arbitrarily complex, possibly throwing logic behind
an innocent looking assignment statement, parameter passing, etc. is to make
programs harder to reason about than if everything just had reference
semantics and explicit cloning.  Therefore, I consider arbitrarily expensive,
non-O(1) postblits to be a terrible programming practice that Phobos should
not go out of the way to handle efficiently, and a throwback to non-GC'd
languages where everything must have a clear owner who is responsible for
freeing it.  I realize that there are a few odd cases like BigInt that would
be very strange with reference semantics.  However, in these cases copy on
write semantics work pretty well.

I'd like any comments anyone might have on to what extent arbitrarily
expensive postblits should be considered in the design of Phobos.
Aug 26 2010
parent reply Pillsy <pillsbury gmail.com> writes:
dsimcha Wrote:
[...]
 I'd like any comments anyone might have on to what extent 
 arbitrarily expensive postblits should be considered in the 
 design of Phobos.

I agree with you: expensive post-blits just don't seem sufficiently necessary in D to warp the design of the standard library around them, and have a distinctly anti-patternish feel to them. You have too many other options, like straight reference semantics, copy-on- write, and (with immutability and GC) safely shared structure. The last can be an incredibly useful technique for reducing the cost of copying, because it allows you to treat an immutable reference type exactly like a value type. Cheers, Pillsy
Aug 27 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/27/10 7:44 PDT, Pillsy wrote:
 dsimcha Wrote:
 [...]
 I'd like any comments anyone might have on to what extent
 arbitrarily expensive postblits should be considered in the
 design of Phobos.

I agree with you: expensive post-blits just don't seem sufficiently necessary in D to warp the design of the standard library around them, and have a distinctly anti-patternish feel to them. You have too many other options, like straight reference semantics, copy-on- write, and (with immutability and GC) safely shared structure. The last can be an incredibly useful technique for reducing the cost of copying, because it allows you to treat an immutable reference type exactly like a value type. Cheers, Pillsy

Clearly immutable sharing helps, and clearly reference counting and COW are valid techniques. However, the situation is not as cut and dried. The problem with RC/COW is that they reduce exception safety essentially _everywhere_ else but the copy constructor. Once you have a type using RC/COW, any mutation of that object, even one that ostensibly doesn't cause resource allocation, might have arbitrary cost or fail. This was the experience with std::string in C++ - its creators did everything they could to enable reference counting, and the outcome was quite unpleasant. Add to this the implementation annoyance of checking for aliasing in _every_ single method of the type. RefCounted in phobos can help with that, but not without a cost. So I'm not sure it's as simple a decision as it might sound. Andrei
Aug 27 2010
next sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 On 8/27/10 7:44 PDT, Pillsy wrote:
 dsimcha Wrote:
 [...]
 I'd like any comments anyone might have on to what extent
 arbitrarily expensive postblits should be considered in the
 design of Phobos.

I agree with you: expensive post-blits just don't seem sufficiently necessary in D to warp the design of the standard library around them, and have a distinctly anti-patternish feel to them. You have too many other options, like straight reference semantics, copy-on- write, and (with immutability and GC) safely shared structure. The last can be an incredibly useful technique for reducing the cost of copying, because it allows you to treat an immutable reference type exactly like a value type. Cheers, Pillsy

are valid techniques. However, the situation is not as cut and dried. The problem with RC/COW is that they reduce exception safety essentially _everywhere_ else but the copy constructor. Once you have a type using RC/COW, any mutation of that object, even one that ostensibly doesn't cause resource allocation, might have arbitrary cost or fail. This was the experience with std::string in C++ - its creators did everything they could to enable reference counting, and the outcome was quite unpleasant. Add to this the implementation annoyance of checking for aliasing in _every_ single method of the type. RefCounted in phobos can help with that, but not without a cost. So I'm not sure it's as simple a decision as it might sound. Andrei

But with RC/COW at least you know that if something becomes arbitrarily expensive it's for a good reason, i.e. because doing anything else would break the abstraction you're trying to create **right now**. With eager copying copies get made very often when a human reader reasoning only locally about the code could easily prove they're not needed. Furthermore, I think my point still generally holds about reference vs. value semantics: When you have to bend over backwards to get value semantics, they're no longer easier to reason about.
Aug 27 2010
prev sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-08-27 14:04:50 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Add to this the implementation annoyance of checking for aliasing in 
 _every_ single method of the type. RefCounted in phobos can help with 
 that, but not without a cost.

Also, to get reference counting right when the struct can be stored on the garbage-collected heap you must use atomic operations to manipulate the reference count. Atomic operations add some more overhead on multi-core and multi-processor systems. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Aug 27 2010