digitalmars.D - what to do with postblit on the heap?

"Steven Schveighoffer" <schveiguy yahoo.com> writes:
I have submitted a fix for bug 5272,  
http://d.puremagic.com/issues/show_bug.cgi?id=5272 "Postblit not called on  
copying due to array append"

However, I am starting to realize that one of the major reasons for  
postblit is to match it with an equivalent dtor.

This works well when the struct is on the stack -- the postblit, for
instance, increments a reference counter, then the dtor decrements the ref
counter.
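
A minimal sketch of that stack pairing, for illustration only. `Payload`, `Handle` and `countWithTwoStackCopies` are hypothetical names, not code from the bug report or from Phobos:

```d
import std.stdio : writeln;

// Hypothetical payload shared by every copy of a handle.
struct Payload
{
    int refs;
}

// Hypothetical ref-counted handle.
struct Handle
{
    Payload* payload;

    this(Payload* p)
    {
        payload = p;
        if (payload) ++payload.refs;
    }

    // Postblit: runs on the new copy after its bits have been blitted.
    this(this)
    {
        if (payload) ++payload.refs;
    }

    ~this()
    {
        if (payload) --payload.refs;
    }
}

// Returns the count observed while two stack copies are alive.
int countWithTwoStackCopies(Payload* p)
{
    auto a = Handle(p);   // constructor: refs == 1
    auto b = a;           // postblit:    refs == 2
    return p.refs;        // value is read before a and b are destroyed
}

void main()
{
    auto p = new Payload;
    writeln(countWithTwoStackCopies(p)); // 2
    writeln(p.refs);                     // 0, both destructors have run
}
```

Every increment made by the constructor or the postblit is matched by a destructor call at scope exit, which is the balance the rest of this post is concerned with.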

But when the data is on the heap, the destructor is *not* called.  So what  
happens to any ref-counted data that is on the heap?  It's never  
decremented.  Currently though, it might still work, because postblit  
isn't called when the data is on the heap!  So no increment, no decrement.

I think this is an artificial "success".  However, if the pull request I  
initiated is accepted, then postblit *will* be called on heap allocation,  
for instance if you append data.  This will further highlight the fact  
that the destructor is not being called.

So is it worth adding calls to postblit, knowing that the complementary
destructor is not going to be called?  I can see some cases where it
would be expected, and I can see other cases where it will be difficult to
deal with.  IMO, the difficult cases are already broken anyways, but it
just seems like they are not.

The other part of this puzzle that is missing is array assignment, for  
example a[] = b[] does not call postblits.  I cannot fix this because  
_d_arraycopy does not give me the typeinfo.

Anyone else have any thoughts?  I have mixed feelings as to whether this patch
should be accepted without more comprehensive GC/compiler reform.  I feel it's a
step in the right direction, but that it will upset the balance in a few
places (particularly ref-counting).

-Steve
Jun 20 2011
bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

> The other part of this puzzle that is missing is array assignment, for
> example a[] = b[] does not call postblits.  I cannot fix this because
> _d_arraycopy does not give me the typeinfo.

This seems fixable. Is it possible to rewrite _d_arraycopy?

> Anyone else have any thoughts?

I think the current situation is not acceptable. This is a considerably worse problem than _d_arraycopy because here some information is missing. Isn't this the same problem as with struct destructors?

A solution is to add this information at runtime: a type tag for structs that have a postblit and/or destructor. But then structs aren't PODs any more. There are other places to store this information, like in some kind of associative array.

Another solution is to forbid what the compiler can't guarantee. If a struct is going to be used only where its type is known, then it's allowed to have a postblit and destructor. Is it possible to enforce this? I think it is. Here an annotation is useful to better manage this contract between programmer and compiler.

Bye,
bearophile
Jun 20 2011
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 20 Jun 2011 11:03:27 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

> Steven Schveighoffer:
>
>> The other part of this puzzle that is missing is array assignment, for
>> example a[] = b[] does not call postblits.  I cannot fix this because
>> _d_arraycopy does not give me the typeinfo.
>
> This seems fixable. Is it possible to rewrite _d_arraycopy?

The compiler is the one passing the parameters to _d_arraycopy, so even if I change _d_arraycopy to accept a TypeInfo, the compiler needs to be fixed to send the TypeInfo. I think this is really a no-brainer, because currently what is passed is the element size, which is contained within the TypeInfo. I will be filing a bug on that. But currently, I can't fix it.

>> Anyone else have any thoughts?
>
> I think the current situation is not acceptable. This is a problem quite worse than _d_arraycopy because here some information is missing. Isn't this is the same problem with struct destructors?

This is an easy fix -- the typeinfo contains information of whether or not and how to run the postblit. The larger problem is the GC not calling the destructor. But my immediate question is -- is it better to half-fix the problem by committing my changes, or leave the issue alone?
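
To make the TypeInfo point concrete, here is a rough sketch of a copy routine that is handed only a runtime `TypeInfo`. `copyWithTypeInfo` and `Counted` are made-up names, and this is not the actual `_d_arraycopy`; it only assumes the `tsize` and `postblit` members that `TypeInfo` exposes in druntime:

```d
import core.stdc.string : memcpy;
import std.stdio : writeln;

struct Counted
{
    static int postblits;   // counts postblit calls, for illustration
    int value;
    this(this) { ++postblits; }
}

// Hypothetical helper: copies `count` elements described only by a
// runtime TypeInfo, then runs the postblit on each fresh copy.
void copyWithTypeInfo(const TypeInfo ti, void* dst, const(void)* src, size_t count)
{
    immutable size = ti.tsize;          // element size comes from the TypeInfo
    memcpy(dst, src, size * count);     // the "blit"
    foreach (i; 0 .. count)
        ti.postblit(dst + i * size);    // the "post" part; a no-op for plain types
}

int copyAndCountPostblits()
{
    Counted[3] from = [Counted(1), Counted(2), Counted(3)];
    Counted[3] to;
    Counted.postblits = 0;              // ignore copies made during setup
    copyWithTypeInfo(typeid(Counted), to.ptr, from.ptr, from.length);
    assert(to[2].value == 3);
    return Counted.postblits;
}

void main()
{
    writeln(copyAndCountPostblits()); // 3
}
```

The element size never has to be passed separately, which is why replacing the size parameter with the TypeInfo loses nothing.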

> A solution is to add this information at runtime, a type tag to structs
> that have a postblit and/or destructor. But then structs aren't PODs any
> more. There are other places to store this information, like in some
> kind of associative array.

Any solution that fixes the GC problem will have to store the typeinfo somehow associated with the block. I think we may have more traction for this problem with a precise GC. I don't think the right route is to store type info inside the struct itself. This added overhead is not necessary when the struct is stored on the stack.
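
One way to picture "typeinfo associated with the block" is a small side record kept per allocation. The sketch below is only an illustration under that assumption: `BlockInfo`, `finalizeBlock` and `Tracked` are hypothetical, the memory comes from `malloc` rather than the GC, and nothing here reflects how druntime actually lays out its blocks:

```d
import core.stdc.stdlib : free, malloc;
import std.conv : emplace;
import std.stdio : writeln;

struct Tracked
{
    static int live;        // constructed but not yet destroyed instances
    int id;
    this(int id) { this.id = id; ++live; }
    ~this() { --live; }
}

// Hypothetical per-block record a collector could keep next to the "used" length.
struct BlockInfo
{
    TypeInfo ti;            // element type recorded at allocation time
    void* base;             // start of the element storage
    size_t used;            // number of constructed elements
}

// What a finalizing sweep could do with that record.
void finalizeBlock(ref BlockInfo info)
{
    foreach (i; 0 .. info.used)
        info.ti.destroy(info.base + i * info.ti.tsize);
    info.used = 0;
}

int liveAfterFinalize(int n)
{
    auto mem = cast(Tracked*) malloc(Tracked.sizeof * n);
    scope (exit) free(mem);             // raw free, runs no destructors
    foreach (i; 0 .. n)
        emplace(&mem[i], i);            // construct each element in place
    auto info = BlockInfo(typeid(Tracked), mem, n);
    assert(Tracked.live == n);
    finalizeBlock(info);                // destructors run via the TypeInfo
    return Tracked.live;
}

void main()
{
    writeln(liveAfterFinalize(4)); // 0
}
```

The struct itself carries no extra field, so instances on the stack pay nothing for the bookkeeping.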

> Another solution is to forbid what the compiler can't guarantee. If a
> struct is going to be used only where its type is known, then it's
> allowed to have postblit and destructor. Is it possible to enforce this?
> I think it is. Here an annotation is useful to better manage this
> contract between programmer and compiler.

This is a possibility, making a struct only usable if it's inside another such struct or inside a class, or on the stack.

-Steve
Jun 20 2011
Jonathan M Davis <jmdavisProg gmx.com> writes:
On 2011-06-20 11:56, Jose Armando Garcia wrote:
> On Mon, Jun 20, 2011 at 12:03 PM, bearophile <bearophileHUGS lycos.com> wrote:
>
>> A solution is to add this information at runtime, a type tag to structs
>> that have a postblit and/or destructor. But then structs aren't PODs any
>> more. There are other places to store this information, like in some
>> kind of associative array.
>
> What are PODs?

Plain Old Datatype. It's a user-defined data type with member variables but no functions. It just holds data.

- Jonathan M Davis
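
For reference, the compiler can be asked directly through `__traits(isPOD, T)`. A small sketch with three made-up structs; it only demonstrates that a postblit or a destructor is enough to make a struct non-POD in the compiler's eyes:

```d
import std.stdio : writeln;

struct Plain        { int x; double y; }
struct WithPostblit { int x; this(this) {} }
struct WithDtor     { int x; ~this() {} }

// Checked at compile time.
static assert( __traits(isPOD, Plain));
static assert(!__traits(isPOD, WithPostblit));
static assert(!__traits(isPOD, WithDtor));

void main()
{
    writeln(__traits(isPOD, Plain), " ",
            __traits(isPOD, WithPostblit), " ",
            __traits(isPOD, WithDtor)); // true false false
}
```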
Jun 20 2011
Michel Fortin <michel.fortin michelf.com> writes:
My feeling is that array appending and array assignment should be considered a compiler issue first and foremost. The compiler needs to be fixed, and once that's done the runtime will need to be updated anyway to match the changes in the compiler. Your proposed fix for array assignment is a good start for when the compiler will provide the necessary info to the runtime, but applying it at this time will just fix some cases by breaking a few others: net improvement zero.

As for the issue that destructors aren't called for arrays on the heap, it's a serious problem. But it's also a separate problem that concerns purely the runtime, as far as I am aware. Is there someone working on it?

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Jun 20 2011
Michel Fortin <michel.fortin michelf.com> writes:
On 2011-06-20 18:12:11 -0400, "Steven Schveighoffer" 
<schveiguy yahoo.com> said:

> On Mon, 20 Jun 2011 16:45:44 -0400, Michel Fortin
> <michel.fortin michelf.com> wrote:
>
>> My feeling is that array appending and array assignment should be
>> considered a compiler issue first and foremost. The compiler needs to
>> be fixed, and once that's done the runtime will need to be updated
>> anyway to match the changes in the compiler. Your proposed fix for
>> array assignment is a good start for when the compiler will provide
>> the necessary info to the runtime, but applying it at this time will
>> just fix some cases by breaking a few others: net improvement zero.
>
> BTW, I now feel that your request to make a distinction between move and copy is not required. The compiler currently calls the destructor of temporaries, so it should also call postblit. I don't think it can make the distinction between array appending and simply calling some other function.

Well, if

	a ~= S();

does result in a temporary which gets copied and then destroyed, why have move semantics at all? Move semantics are not just an optimization, they actually change the semantics. If you have a struct with a disabled postblit, should it still be appendable?

> If the issue of array assignment is fixed, do you think it's worth
> putting the change in, and then filing a bug against the GC?  I still
> think the current cases that "work" are fundamentally broken anyways.

That depends. I'm not too sure currently whether the S destructor is called for this code:

	a ~= S();

If the compiler currently calls the destructor on the temporary S struct, then your patch is actually a fix because it balances constructors and destructors correctly for the appending part (the bug is then that the compiler should use move semantics but is using copy instead). If it doesn't call the destructor then your patch does introduce a bug for this case.

All in all, I don't think it's important enough to justify wasting hours debating in what order we should fix those bugs. Do what you think is right. If it becomes a problem or it introduces a bug here or there, we'll adjust; at worst that means a revert of your commit.

>> As for the issue that destructors aren't called for arrays on the heap,
>> it's a serious problem. But it's also a separate problem that concerns
>> purely the runtime, as far as I am aware of. Is there someone working
>> on it?
>
> I think we need precise scanning to get a complete solution. Another option is to increase the information the array runtime stores in the memory block (currently it only stores the "used" length) and then hook the GC to call the dtors. This might be a quick fix that doesn't require precise scanning, but it also fixes the most common case of allocating a single struct or an array of structs on the heap.

The GC calling the destructor doesn't require precise scanning. Although it's true that both problems require adding type information to memory blocks, beyond that requirement they're both independent.

It'd be really nice if struct destructors were called correctly.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Jun 20 2011
Michel Fortin <michel.fortin michelf.com> writes:
On 2011-06-21 07:34:24 -0400, "Steven Schveighoffer" 
<schveiguy yahoo.com> said:

> On Mon, 20 Jun 2011 21:59:49 -0400, Michel Fortin
> <michel.fortin michelf.com> wrote:
>
>> Well, if
>>
>> 	a ~= S();
>>
>> does result in a temporary which get copied and then destroyed, why
>> have move semantics at all? Move semantics are not just an
>> optimization, they actually change the semantics. If you have a struct
>> with a disabled postblit, should it still be appendable?
>
> Good question. I don't even know how the runtime could avoid calling postblit, there is no flag saying the postblit is disabled in the typeinfo (that I know of).
>
> But think about it this way, if you have a function foo:
>
> 	void foo(S)(ref S s, S[] arr)
> 	{
> 	    arr[0] = s;
> 	}
>
> Isn't this copy semantics? This is exactly how the D runtime gets the data. The only difference is, the runtime function is allowed to accept a temporary as a reference (not possible in a normal function).

... and in the special case where the reference is an rvalue, then it should have move semantics. See below.

> Now, you could force move semantics, if you know the argument is an
> rvalue, but I don't know enough about what postblit is used for in
> order to say it's fine to use move semantics to move the struct into
> the heap.
>
> The reason I say move semantics are an optimization is because:
>
> 	{
> 	    S tmp;
> 	    arr ~= tmp;
> 	}
>
> is essentially equivalent to:
>
> 	arr ~= S();
>
> But the former is copy semantics, the latter can be considered move.
> It seems like a smart compiler during optimization could rewrite the
> former as the latter, unless the semantics truly are different.  Which
> is why I'm trying to figure out how postblit can be used ;)

Actually, this should be the equivalent:

	import std.algorithm;

	S tmp;
	arr ~= move(tmp);

While there is no doubt that 'moving' a struct can often be used as an optimization without changing the semantics, if you want the disabled attribute to be useful on the postblit constructor then the language needs to define when its semantics require 'moving' data and when they require 'copying' data; it can't leave that solely to the choice of the optimizer.

Things might be clearer if we had a move operator, but instead we have a 'move' function. There is only one case where I think we can assume to have move semantics: when a temporary (an rvalue) is assigned to somewhere. That's also all that's needed for the 'move' function to work. And that is broken currently when it comes to array appending.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Jun 21 2011
Michel Fortin <michel.fortin michelf.com> writes:
On 2011-06-21 08:38:05 -0400, "Steven Schveighoffer" 
<schveiguy yahoo.com> said:

> On Tue, 21 Jun 2011 08:25:40 -0400, Michel Fortin
> <michel.fortin michelf.com> wrote:
>
>> While there is no doubt that 'moving' a struct can often be used as an
>> optimization without changing the semantics, if you want the disabled
>> attribute to be useful on the postblit constructor then the language
>> needs to define when its semantics require 'moving' data and whey then
>> require 'copying' data, it can't let that only to the choice of the
>> optimizer.
>
> Another issue with appending a disabled-postblit struct, what happens when you have to reallocate a block to get more space? This cannot possibly be a move, because the compiler has no idea at the time of appending whether anything else has a reference to the original data. So should it just be a runtime error?

That's indeed a problem.

> I'm starting to think that disabled postblit structs *shouldn't* be
> able to be appended.

That would make sense. It should be a compile-time error. It would also turn appending using move into an optimization, because all the types you can append will be guaranteed to be copyable.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
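
A compile-time probe along these lines can show what a given compiler accepts. `NoCopy`, `canCopy` and `canAppend` are illustrative names; the sketch asserts only that a plain copy is rejected and merely reports the append result, since that answer is assumed to vary between compiler releases:

```d
import std.stdio : writeln;

struct NoCopy
{
    int value;
    @disable this(this);    // copying is forbidden
}

// Copying an lvalue needs the postblit, so this must be rejected.
enum canCopy = __traits(compiles, { NoCopy a; NoCopy b = a; });

// Whether appending an lvalue is accepted is left for the compiler to answer.
enum canAppend = __traits(compiles, { NoCopy[] arr; NoCopy s; arr ~= s; });

static assert(!canCopy);

void main()
{
    writeln("copy: ", canCopy, ", append: ", canAppend);
}
```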
Jun 21 2011
Michel Fortin <michel.fortin michelf.com> writes:
On 2011-06-21 09:24:29 -0400, so <so so.so> said:

> On Tue, 21 Jun 2011 15:25:40 +0300, Michel Fortin
> <michel.fortin michelf.com> wrote:
>
>> Actually, this should be the equivalent:
>>
>> 	import std.algorithm;
>>
>> 	S tmp;
>> 	arr ~= move(tmp);
>>
>> While there is no doubt that 'moving' a struct can often be used as an
>> optimization without changing the semantics, if you want the disabled
>> attribute to be useful on the postblit constructor then the language
>> needs to define when its semantics require 'moving' data and whey then
>> require 'copying' data, it can't let that only to the choice of the
>> optimizer.
>>
>> Things might be clearer if we had a move operator, but instead we have
>> a 'move' function. There is only one case where I think we can assume
>> to have move semantics: when a temporary (a rvalue) is assigned to
>> somewhere. That's also all that's needed for the 'move' function to
>> work. And that is broken currently when it comes to array appending.
>
> It should be something else because move(tmp) in std.algorithm takes by reference and returns by value by actually moving it, because of the value semantics in D, that the ability to differentiate value from reference it doesn't need any other syntax because this is much better. I think it is pretty neat, yet i still have some trouble understanding its effect here.
>
> 	S tmp;
> 	arr ~= move(tmp); // would make an unnecessary copy.
>
> Move should do some kind of a magic there and treat its argument like a value, and return it.

Actually, no copy is needed. Move takes the argument by ref so it can obliterate it. Obliteration consists of replacing its bytes with those in S.init. That way if you have a smart pointer, it gets returned without having to update the reference count (since the source's content has been destroyed). It has effectively been moved, not copied.

Note 1: Currently 'move' obliterates the source only if the type has a destructor or a postblit. I think it should always do it, but without inlining that might be a performance bottleneck.

Note 2: Making move efficient in the case of appending might require a total rework of how the compiler interacts with the runtime. And I don't think you can optimize away all blitting unless the move function was treated specially by the compiler (or became a special operator).

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
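
The smart-pointer case can be sketched as follows. `Payload`, `Handle` and `refsAfterMove` are hypothetical names; the only library call assumed is Phobos' single-argument `move`, which is documented to reset the source to `.init` for types with a destructor or postblit:

```d
import std.algorithm.mutation : move;
import std.stdio : writeln;

struct Payload { int refs; }

// Hypothetical ref-counted handle.
struct Handle
{
    Payload* payload;
    this(Payload* p) { payload = p; if (payload) ++payload.refs; }
    this(this) { if (payload) ++payload.refs; }
    ~this() { if (payload) --payload.refs; }
}

// Returns the reference count observed right after moving a into b.
int refsAfterMove(Payload* p, out bool sourceCleared)
{
    auto a = Handle(p);     // refs == 1
    auto b = move(a);       // bits go to b, a is reset to Handle.init
    sourceCleared = a.payload is null;
    return p.refs;          // still 1: no postblit ran, no destructor yet
}

void main()
{
    auto p = new Payload;
    bool cleared;
    writeln(refsAfterMove(p, cleared)); // 1
    writeln(cleared);                   // true
    writeln(p.refs);                    // 0, b's destructor has run
}
```

Because the source is left in its `.init` state, its destructor finds a null payload and does nothing, so the count is decremented exactly once.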
Jun 21 2011
Michel Fortin <michel.fortin michelf.com> writes:
On 2011-06-21 12:13:32 -0400, so <so so.so> said:

> On Tue, 21 Jun 2011 18:18:26 +0300, Michel Fortin
> <michel.fortin michelf.com> wrote:
>
>> Actually, no copy is needed. Move takes the argument by ref so it can
>> obliterates it. Obliteration consists of replacing its bytes with those
>> in S.init. That way if you have a smart pointer, it gets returned
>> without having to update the reference count (since the source's
>> content has been destroyed). It was effectively be moved, not copied.
>
> 	T move(T)(ref T a) {
> 	    T b;
> 	    move(a, b);
> 	    return b;
> 	}
>
> 	T a;
> 	whatever = move(a);
>
> If T is a struct, i don't see how a copy is not needed looking at the current state of move.

Actually, that depends on how you look at this. The essence of a move operation is that you just copy the bits and then obliterate the old ones. So yes, there's indeed a copy to do, but there's no need to call a copy constructor or a destructor because no new instance has been created, it has just been moved. If you don't call the copy constructor (postblit) then it's a move operation, not a copy operation, even though there's still a bitwise copy inside the move operation.

In the return statement above, 'b' gets copied to 'whatever', then disappears along with the stack frame belonging to the function. So it becomes a move operation. (And it's even more direct than that with the named return value optimization.)

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Jun 21 2011
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
(resending)

On 6/21/11 11:13 AM, so wrote:
> On Tue, 21 Jun 2011 18:18:26 +0300, Michel Fortin
> <michel.fortin michelf.com> wrote:
>
>> Actually, no copy is needed. Move takes the argument by ref so it can
>> obliterates it. Obliteration consists of replacing its bytes with
>> those in S.init. That way if you have a smart pointer, it gets
>> returned without having to update the reference count (since the
>> source's content has been destroyed). It was effectively be moved, not
>> copied.
>
> 	T move(T)(ref T a) {
> 	    T b;
> 	    move(a, b);
> 	    return b;
> 	}
>
> 	T a;
> 	whatever = move(a);
>
> If T is a struct, i don't see how a copy is not needed looking at the current state of move.

The rule that move and TDPL rely on but is not fully implemented is that returning a nonstatic local value never does a postblit nor a destructor - it just copies the bits.

Andrei
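
A small probe for that rule, as a sketch only. `Traced` and `make` are made-up names. On a compiler that implements the rule, the run would be expected to report zero postblits and a single destructor call (for `t` in `main`), but the counts are printed rather than promised here:

```d
import std.stdio : writeln;

struct Traced
{
    static int postblits;   // copies made via the postblit
    static int dtors;       // destructor calls
    int value;
    this(this) { ++postblits; }
    ~this() { ++dtors; }
}

Traced make(int v)
{
    Traced local;
    local.value = v;
    return local;           // a nonstatic local: should be relocated, not copied
}

void main()
{
    {
        auto t = make(7);
        writeln(t.value);   // 7
    }
    writeln("postblits: ", Traced.postblits, ", dtors: ", Traced.dtors);
}
```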
Jun 21 2011
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/21/11 4:24 PM, Sean Kelly wrote:
> On Jun 21, 2011, at 11:26 AM, Andrei Alexandrescu wrote:
>
>> The rule that move and TDPL rely on but is not fully implemented is that
>> returning a nonstatic local value never does a postblit nor a destructor - it
>> just copies the bits.
>
> So it's effectively illegal to have a struct containing a pointer that references itself, correct?

Illegal. All D structs must be transparently relocatable without breaking their invariant.

Andrei
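
Why a self-referencing struct cannot survive relocation, shown with a deliberately broken type. `SelfRef` and the two helper functions are hypothetical:

```d
import std.stdio : writeln;

struct SelfRef
{
    int x;
    int* p;     // intended invariant: p points at this.x
}

bool pointsToOwnField(ref const SelfRef s)
{
    return s.p is &s.x;
}

// True when the original satisfies the invariant and the bitwise copy does not.
bool relocationBreaksInvariant()
{
    SelfRef a;
    a.x = 1;
    a.p = &a.x;
    SelfRef b = a;          // bits are copied: b.p still holds the address of a.x
    return pointsToOwnField(a) && !pointsToOwnField(b);
}

void main()
{
    writeln(relocationBreaksInvariant()); // true
}
```

With no postblit run on relocation, nothing gets a chance to repoint `p`, so the invariant holds only for the instance at the original address.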
Jun 21 2011
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 20 Jun 2011 16:45:44 -0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

> My feeling is that array appending and array assignment should be considered a compiler issue first and foremost. The compiler needs to be fixed, and once that's done the runtime will need to be updated anyway to match the changes in the compiler. Your proposed fix for array assignment is a good start for when the compiler will provide the necessary info to the runtime, but applying it at this time will just fix some cases by breaking a few others: net improvement zero.

BTW, I now feel that your request to make a distinction between move and copy is not required. The compiler currently calls the destructor of temporaries, so it should also call postblit. I don't think it can make the distinction between array appending and simply calling some other function.

If the issue of array assignment is fixed, do you think it's worth putting the change in, and then filing a bug against the GC? I still think the current cases that "work" are fundamentally broken anyways. For instance, in a ref-counted struct, if you appended it to an array, then removed all the stack-based references, the ref count goes to zero, even though the array still has a reference (I think someone filed a bug against std.stdio.File for this).
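
The ref-count-goes-to-zero scenario can be reproduced deterministically by simulating the append with a raw byte copy. This is a sketch under that assumption: `memcpy` into `malloc`ed storage stands in for an append that skips the postblit, and `Payload`, `Handle` and `refsSeenByHeapCopy` are made-up names:

```d
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy;
import std.stdio : writeln;

struct Payload { int refs; }

// Hypothetical ref-counted handle.
struct Handle
{
    Payload* payload;
    this(Payload* p) { payload = p; if (payload) ++payload.refs; }
    this(this) { if (payload) ++payload.refs; }
    ~this() { if (payload) --payload.refs; }
}

// Blits a handle to heap storage without running the postblit,
// then lets the stack handle go out of scope.
int refsSeenByHeapCopy(Payload* p)
{
    auto heap = cast(Handle*) malloc(Handle.sizeof);
    scope (exit) free(heap);                    // raw free: no destructor runs
    {
        auto onStack = Handle(p);               // refs == 1
        memcpy(heap, &onStack, Handle.sizeof);  // blit only, count unchanged
    }                                           // onStack destroyed: refs == 0
    assert(heap.payload is p);                  // heap copy still refers to p
    return p.refs;
}

void main()
{
    auto p = new Payload;
    writeln(refsSeenByHeapCopy(p)); // 0
}
```

The count reaches zero while a live copy still points at the payload, which is the sense in which the cases that appear to work are already broken.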

> As for the issue that destructors aren't called for arrays on the heap,
> it's a serious problem. But it's also a separate problem that concerns
> purely the runtime, as far as I am aware of. Is there someone working on
> it?

I think we need precise scanning to get a complete solution. Another option is to increase the information the array runtime stores in the memory block (currently it only stores the "used" length) and then hook the GC to call the dtors. This might be a quick fix that doesn't require precise scanning, but it also fixes the most common case of allocating a single struct or an array of structs on the heap.

-Steve
Jun 20 2011
"Jonathan M Davis" <jmdavisProg gmx.com> writes:
On 2011-06-20 15:12, Steven Schveighoffer wrote:
> On Mon, 20 Jun 2011 16:45:44 -0400, Michel Fortin <michel.fortin michelf.com> wrote:

>> My feeling is that array appending and array assignment should be considered a compiler issue first and foremost. The compiler needs to be fixed, and once that's done the runtime will need to be updated anyway to match the changes in the compiler. Your proposed fix for array assignment is a good start for when the compiler will provide the necessary info to the runtime, but applying it at this time will just fix some cases by breaking a few others: net improvement zero.
>
> BTW, I now feel that your request to make a distinction between move and copy is not required. The compiler currently calls the destructor of temporaries, so it should also call postblit. I don't think it can make the distinction between array appending and simply calling some other function.

If an object is moved, neither the postblit nor the destructor should be called. The object is moved, not copied and destroyed. I believe that TDPL is very specific on that.

- Jonathan M Davis
Jun 20 2011
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 20 Jun 2011 18:43:30 -0400, Jonathan M Davis <jmdavisProg gmx.com>
wrote:

> On 2011-06-20 15:12, Steven Schveighoffer wrote:
>
>> BTW, I now feel that your request to make a distinction between move and
>> copy is not required. The compiler currently calls the destructor of
>> temporaries, so it should also call postblit. I don't think it can make
>> the distinction between array appending and simply calling some other
>> function.
>
> If an object is moved, neither the postblit nor the destructor should be called. The object is moved, not copied and destroyed. I believe that TDPL is very specific on that.

Well, I think in this case it is being copied. It's put on the stack, and then copied to the heap inside the runtime function. The runtime could be passed a flag indicating the append is really a move, but I'm not sure it's a good choice.

To me, not calling the postblit and dtor on a moved struct is an optimization, no? And you can't re-implement these semantics for a normal function. The one case I can think of is when an rvalue is allowed to be passed by reference (which is exactly what's happening here).

Is there anything a postblit is allowed to do that would break a struct if you disabled the postblit in this case? I'm pretty sure internal pointers are not supported, especially if move semantics do not call the postblit.

-Steve
Jun 20 2011
Jonathan M Davis <jmdavisProg gmx.com> writes:
On 2011-06-20 16:07, Steven Schveighoffer wrote:
 On Mon, 20 Jun 2011 18:43:30 -0400, Jonathan M Davis <jmdavisProg gmx.com>
 
 wrote:
 On 2011-06-20 15:12, Steven Schveighoffer wrote:
 BTW, I now feel that your request to make a distinction between move and
 copy is not required. The compiler currently calls the destructor of
 temporaries, so it should also call postblit. I don't think it can make
 the distinction between array appending and simply calling some other
 function.

If an object is moved, neither the postblit nor the destructor should be called. The object is moved, not copied and destroyed. I believe that TDPL is very specific on that.

Well, I think in this case it is being copied. It's put on the stack, and then copied to the heap inside the runtime function. The runtime could be passed a flag indicating the append is really a move, but I'm not sure it's a good choice. To me, not calling the postblit and dtor on a moved struct is an optimization, no? And you can't re-implement these semantics for a normal function. The one case I can think of is when an rvalue is allowed to be passed by reference (which is exactly what's happening here).

Well, going from the stack to the heap probably is a copy. But moves shouldn't be calling the postblit or the destructor, and you seemed to be saying that they should. The main place that a move would occur that I can think would be when returning a value from a function, which is very different. And I don't think that avoiding the postblit is necessarily just an optimization. If the postblit really is skipped, then it's probably possible to return an object which cannot legally be copied (presumably due to some combination of reference or pointer member variables and const or immutable), though that wouldn't exactly be a typical situation, even if it actually is possible. It _is_ primarily an optimization to move rather than copy and destroy, but I'm not sure that it's _just_ an optimization.
 Is there anything a postblit is allowed to do that would break a struct if
 you disabled the postblit in this case?  I'm pretty sure internal pointers
 are not supported, especially if move semantics do not call the postblit.

If the struct had a pointer to a local member variable which the postblit would have deep-copied, then sure, not calling the postblit would screw with the struct. But that would screw with a struct which was returned from a function as well, and that's the prime place for the move semantics. That sort of struct is just plain badly designed, so I don't think that it's really something to worry about. I can't think of any other cases where it would be a problem though. Structs don't usually care where they live (aside from the issue of structs being designed to live on the stack and then not getting their destructor called because they're on the heap).

- Jonathan M Davis
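To make the internal-pointer case concrete, here is a minimal sketch. The names `SelfRef` and `survivesBlit` are made up for illustration, and `memcpy` merely stands in for the bitwise relocation a move performs. It is not how the compiler or runtime actually implements moves.

```d
import core.stdc.string : memcpy;
import std.stdio : writeln;

// Hypothetical struct holding a pointer to its own field.
struct SelfRef
{
    int value;
    int* ptr; // intended to point at this.value

    // True only while ptr still refers to this instance's own field.
    bool pointsToSelf() const { return ptr is &value; }
}

// Blits `a` into fresh storage, as a move would, and reports whether
// the relocated struct still points at its own field.
bool survivesBlit()
{
    SelfRef a;
    a.value = 42;
    a.ptr = &a.value;

    SelfRef b = void;               // raw storage, no initialization
    memcpy(&b, &a, SelfRef.sizeof); // bitwise relocation, no postblit runs
    return b.pointsToSelf();        // b.ptr still refers to a.value
}

void main()
{
    writeln("internal pointer survives blit: ", survivesBlit());
}
```

The relocated copy keeps pointing at the old location, which is why such a struct breaks as soon as it is moved without a postblit to fix it up.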
Jun 20 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On 2011-06-20 18:59, Michel Fortin wrote:
 On 2011-06-20 18:12:11 -0400, "Steven Schveighoffer"
 
 <schveiguy yahoo.com> said:
 On Mon, 20 Jun 2011 16:45:44 -0400, Michel Fortin
 
 <michel.fortin michelf.com> wrote:
 My feeling is that array appending and array assignment should be
 considered a compiler issue first and foremost. The compiler needs to
 be  fixed, and once that's done the runtime will need to be updated
 anyway  to match the changes in the compiler. Your proposed fix for
 array  assignment is a good start for when the compiler will provide
 the  necessary info to the runtime, but applying it at this time will
 just  fix some cases by breaking a few others: net improvement zero.

BTW, I now feel that your request to make a distinction between move and copy is not required. The compiler currently calls the destructor of temporaries, so it should also call postblit. I don't think it can make the distinction between array appending and simply calling some other function.

Well, if a ~= S(); does result in a temporary which gets copied and then destroyed, why have move semantics at all? Move semantics are not just an optimization, they actually change the semantics. If you have a struct with a disabled postblit, should it still be appendable?

I would expect that to have move semantics. There's no need to create and destroy a temporary. It's completely wasteful. A copy should only be happening when a copy _needs_ to happen. It doesn't need to happen here. Now, depending on what ~= did internally (assuming that it were an overloaded operator), then a copy may end up occurring inside of the function, but that shouldn't happen for the built-in ~= operator, and a well-written overloaded ~= should avoid the need to copy as well.

- Jonathan M Davis
Jun 20 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 20 Jun 2011 21:59:49 -0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2011-06-20 18:12:11 -0400, "Steven Schveighoffer"  
 <schveiguy yahoo.com> said:

 On Mon, 20 Jun 2011 16:45:44 -0400, Michel Fortin   
 <michel.fortin michelf.com> wrote:

 My feeling is that array appending and array assignment should be   
 considered a compiler issue first and foremost. The compiler needs to  
 be  fixed, and once that's done the runtime will need to be updated  
 anyway  to match the changes in the compiler. Your proposed fix for  
 array  assignment is a good start for when the compiler will provide  
 the  necessary info to the runtime, but applying it at this time will  
 just  fix some cases by breaking a few others: net improvement zero.

BTW, I now feel that your request to make a distinction between move and copy is not required. The compiler currently calls the destructor of temporaries, so it should also call postblit. I don't think it can make the distinction between array appending and simply calling some other function.

Well, if a ~= S(); does result in a temporary which gets copied and then destroyed, why have move semantics at all? Move semantics are not just an optimization, they actually change the semantics. If you have a struct with a disabled postblit, should it still be appendable?

Good question. I don't even know how the runtime could avoid calling postblit, there is no flag saying the postblit is disabled in the typeinfo (that I know of).

But think about it this way, if you have a function foo:

	void foo(S)(ref S s, S[] arr)
	{
	    arr[0] = s;
	}

Isn't this copy semantics? This is exactly how the D runtime gets the data. The only difference is, the runtime function is allowed to accept a temporary as a reference (not possible in a normal function).

Now, you could force move semantics, if you know the argument is an rvalue, but I don't know enough about what postblit is used for in order to say it's fine to use move semantics to move the struct into the heap.

The reason I say move semantics are an optimization is because:

	{
	    S tmp;
	    arr ~= tmp;
	}

is essentially equivalent to:

	arr ~= S();

But the former is copy semantics, the latter can be considered move. It seems like a smart compiler during optimization could rewrite the former as the latter, unless the semantics truly are different. Which is why I'm trying to figure out how postblit can be used ;)
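As a rough illustration of the copy-semantics point, the sketch below counts postblit calls when an element is assigned from a `ref` parameter. `Counted`, `store` and `copiesOnStore` are hypothetical names, and the sketch assumes the compiler-generated assignment for a struct with a postblit takes its right-hand side by value, which is what triggers the copy.

```d
import std.stdio : writeln;

// Hypothetical struct that counts how often its postblit runs.
struct Counted
{
    static int copies;
    int payload;
    this(this) { ++copies; }
}

// Same shape as the foo in the post: the argument arrives by
// reference and is then assigned into the array.
void store(ref Counted s, Counted[] arr)
{
    arr[0] = s; // assigning from an lvalue copies it
}

// Returns the number of postblit calls caused by one store().
int copiesOnStore()
{
    Counted.copies = 0;
    auto arr = new Counted[1];
    Counted s;
    s.payload = 7;
    store(s, arr);
    return Counted.copies;
}

void main()
{
    writeln("postblit calls: ", copiesOnStore());
}
```

The source `s` stays alive and usable after the call, so a copy (and therefore a postblit) is the only correct choice here; nothing tells the callee that the caller is done with the value.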
 If the issue of array assignment is fixed, do you think it's worth  
 putting  the change in, and then filing a bug against the GC?  I still  
 think the  current cases that "work" are fundamentally broken anyways.

That depends. I'm not too sure currently whether the S destructor is called for this code: a ~= S();

It is, I tested it. I ran this code:

	import std.stdio;

	struct Test
	{
	    this(this) { writeln("copy done"); }
	    void opAssign(Test rhs) { writeln("assignment done"); }
	    ~this() { writeln("destructor called"); }
	}

	void main()
	{
	    Test[] tests = new Test[1];
	    {
	        // Test test;
	        // tests ~= test;
	        tests ~= Test();
	    }
	    writeln("done");
	}

and saw "destructor called" in the output, no matter which option was commented out.
 All in all, I don't think it's important enough to justify we waste  
 hours debating in what order we should fix those bugs. Do what you think  
 is right. If it becomes a problem or it introduces a bug here or there,  
 we'll adjust, at worst that means a revert of your commit.

OK, then I'll push the change. I already filed a bug against _d_arraycopy.
 As for the issue that destructors aren't called for arrays on the  
 heap,  it's a serious problem. But it's also a separate problem that  
 concerns  purely the runtime, as far as I am aware of. Is there  
 someone working on  it?

option is to increase the information the array runtime stores in the memory block (currently it only stores the "used" length) and then hook the GC to call the dtors. This might be a quick fix that doesn't require precise scanning, but it also fixes the most common case of allocating a single struct or an array of structs on the heap.

The GC calling the destructor doesn't require precise scanning. Although it's true that both problems require adding type information to memory blocks, beyond that requirement they're both independent. It'd be really nice if struct destructors were called correctly.

Yes, the more I think about it, the more this solution looks attractive. All that is required is to flag the block as having a finalizer, store the TypeInfo pointer somewhere, and the GC should call it. I'll put in a bugzilla enhancement so it's not forgotten.

-Steve
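The idea of storing a TypeInfo with the block and letting a finalizer walk the elements could look roughly like the sketch below. `BlockInfo`, `finalizeBlock` and `finalizedCount` are invented for illustration and do not reflect the real GC or array runtime layout; `malloc` stands in for a GC allocation. Only `TypeInfo.tsize` and `TypeInfo.destroy` are existing druntime members.

```d
import core.stdc.stdlib : free, malloc;
import std.stdio : writeln;

struct Tracked
{
    static int destroyed; // counts destructor runs
    int id;
    ~this() { ++destroyed; }
}

// Hypothetical per-block metadata: used length plus the element type.
struct BlockInfo
{
    void* base;         // start of the element storage
    size_t used;        // bytes currently in use
    const(TypeInfo) ti; // null would mean "no finalizer needed"
}

// What a collector could do before releasing the block: run the
// element destructor on every used slot via the stored TypeInfo.
void finalizeBlock(BlockInfo blk)
{
    if (blk.ti is null)
        return;
    immutable size = blk.ti.tsize;
    for (size_t off = 0; off + size <= blk.used; off += size)
        blk.ti.destroy(blk.base + off);
}

// Builds a three-element block, finalizes it, returns the dtor count.
int finalizedCount()
{
    Tracked.destroyed = 0;
    enum count = 3;
    void* raw = malloc(count * Tracked.sizeof);
    assert(raw !is null);
    auto items = cast(Tracked*) raw;
    foreach (int i; 0 .. count)
        items[i].id = i;

    finalizeBlock(BlockInfo(raw, count * Tracked.sizeof, typeid(Tracked)));
    free(raw);
    return Tracked.destroyed;
}

void main()
{
    writeln("destructors run: ", finalizedCount());
}
```

Note that nothing here needs to know where the pointers inside `Tracked` are, which matches the remark that calling destructors does not depend on precise scanning.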
Jun 21 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 21 Jun 2011 08:25:40 -0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

 While there is no doubt that 'moving' a struct can often be used as an  
 optimization without changing the semantics, if you want the  disabled  
 attribute to be useful on the postblit constructor then the language  
 needs to define when its semantics require 'moving' data and when they  
 require 'copying' data, it can't leave that only to the choice of the  
 optimizer.

Another issue with appending a disabled-postblit struct, what happens when you have to reallocate a block to get more space? This cannot possibly be a move, because the compiler has no idea at the time of appending whether anything else has a reference to the original data. So should it just be a runtime error? I'm starting to think that disabled postblit structs *shouldn't* be able to be appended.

-Steve
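A small sketch of why reallocation on append cannot be treated as a move: another slice may still refer to the original elements. `Item` and `reallocKeepsOriginal` are hypothetical names. Appending to a slice that does not end at the used end of its block is used here only because it forces a reallocation, so the effect does not depend on spare capacity.

```d
import std.stdio : writeln;

struct Item { int id; }

// Appends through a prefix slice and checks that the original block
// is still intact and still referenced afterwards.
bool reallocKeepsOriginal()
{
    Item[] original = [Item(1), Item(2), Item(3)];
    Item[] prefix = original[0 .. 2];

    // Growing in place would overwrite original[2], so the runtime
    // has to allocate a new block and copy the two elements over.
    prefix ~= Item(10);

    return prefix.ptr !is original.ptr // elements now live in two places
        && original[0].id == 1
        && prefix[0].id == 1
        && original[2].id == 3;        // untouched by the append
}

void main()
{
    writeln("original still valid after realloc: ", reallocKeepsOriginal());
}
```

After the append both blocks hold the first two elements, so for a type with a postblit those elements have effectively been copied, not moved.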
Jun 21 2011
prev sibling next sibling parent so <so so.so> writes:
On Tue, 21 Jun 2011 04:59:49 +0300, Michel Fortin  
<michel.fortin michelf.com> wrote:

 Well, if

 	a ~= S();

 does result in a temporary which gets copied and then destroyed, why have  
 move semantics at all? Move semantics are not just an optimization, they  
 actually change the semantics.

There was a similar discussion about struct constructors, which ended with roughly the same conclusion: that it is an optimization. I fully agree with you that it is not; move exists precisely for reasons like this.
Jun 21 2011
prev sibling next sibling parent so <so so.so> writes:
On Tue, 21 Jun 2011 15:25:40 +0300, Michel Fortin  
<michel.fortin michelf.com> wrote:

 Actually, this should be the equivalent:

 	import std.algorithm;

 	S tmp;
 	arr ~= move(tmp);

 While there is no doubt that 'moving' a struct can often be used as an  
 optimization without changing the semantics, if you want the  disabled  
 attribute to be useful on the postblit constructor then the language  
 needs to define when its semantics require 'moving' data and when they  
 require 'copying' data, it can't leave that only to the choice of the  
 optimizer.

 Things might be clearer if we had a move operator, but instead we have a  
 'move' function. There is only one case where I think we can assume to  
 have move semantics: when a temporary (a rvalue) is assigned to  
 somewhere. That's also all that's needed for the 'move' function to  
 work. And that is broken currently when it comes to array appending.

It should be something else, because move(tmp) in std.algorithm takes its argument by reference and returns by value, actually moving it. Because D's value semantics already let you tell a value from a reference, it doesn't need any other syntax, which is much better. I think it is pretty neat, yet I still have some trouble understanding its effect here.

	S tmp;
	arr ~= move(tmp); // would make an unnecessary copy

Move should do some kind of magic there: treat its argument like a value and return it. Something like:

	move(ref T a) return cast(T)a;

Maybe it makes no sense at all, but I tried!
Jun 21 2011
prev sibling next sibling parent so <so so.so> writes:
On Tue, 21 Jun 2011 18:18:26 +0300, Michel Fortin  
<michel.fortin michelf.com> wrote:

 Actually, no copy is needed. Move takes the argument by ref so it can  
 obliterate it. Obliteration consists of replacing its bytes with those  
 in S.init. That way if you have a smart pointer, it gets returned  
 without having to update the reference count (since the source's content  
 has been destroyed). It would effectively be moved, not copied.

	T move(ref T a)
	{
	    T b;
	    move(a, b);
	    return b;
	}

	T a;
	whatever = move(a);

If T is a struct, I don't see how a copy is not needed, looking at the current state of move.
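For reference, the behaviour being debated can be observed with a small sketch like the following. `Handle` and `moveSkipsPostblit` are made-up names; `move` is the existing Phobos function (importable from `std.algorithm.mutation` in current releases). The sketch relies on the documented behaviour that the source is reset to its `.init` value when the type has a postblit or destructor.

```d
import std.algorithm.mutation : move;
import std.stdio : writeln;

// Hypothetical handle type that counts postblit calls.
struct Handle
{
    static int copies;
    int* payload;
    this(this) { ++copies; }
    ~this() {}
}

// Moves a into b and checks that no postblit ran, that the source
// was reset to Handle.init, and that the payload ended up in b.
bool moveSkipsPostblit()
{
    Handle.copies = 0;
    auto cell = new int;
    *cell = 5;

    Handle a;
    a.payload = cell;
    Handle b = move(a);

    return Handle.copies == 0
        && a.payload is null
        && b.payload is cell;
}

void main()
{
    writeln("move avoided the postblit: ", moveSkipsPostblit());
}
```

The bits are still transferred from one location to another, but no postblit or destructor pair runs for that transfer, which is the distinction between a move and a copy that the thread is arguing about.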
Jun 21 2011
prev sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Jun 21, 2011, at 11:26 AM, Andrei Alexandrescu wrote:
 The rule that move and TDPL rely on but is not fully implemented is

 destructor - it just copies the bits.

So it's effectively illegal to have a struct containing a pointer that references itself, correct?
Jun 21 2011