D - Copy on Write

"Scott Egan" <scotte tpg.com.aux> writes:

I've read a bit about copy-on-write semantics being necessary.

That's understandable, since a slice points to the same array data.

I gather that it has to be done by the programmer. Is this such a good
thing?

Could the compiler/runtime not determine when this is necessary and
automatically make a copy?

Am I missing something?

Apr 16 2004
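
For readers following along, here is a minimal sketch of the aliasing the question is about, and of copy-on-write done by hand. It is written against present-day D (the `const(int)[]` syntax postdates this thread), and the function names `firstTwo` and `withFirstSet` are made up for illustration. `withFirstSet` assumes a non-empty array.

```d
// A slice is a view: it shares storage with the array it was taken from.
int[] firstTwo(int[] a)
{
    return a[0 .. 2];          // no copy; points into a's data
}

// Copy-on-write by convention: duplicate before modifying, so the
// caller's data is left alone. Assumes a is not empty.
int[] withFirstSet(const(int)[] a, int value)
{
    int[] copy = a.dup;        // explicit copy made by the programmer
    copy[0] = value;
    return copy;
}

unittest
{
    int[] data = [1, 2, 3];

    int[] view = firstTwo(data);
    view[0] = 99;              // writes through to data
    assert(data[0] == 99);

    int[] changed = withFirstSet(data, 7);
    assert(changed[0] == 7);
    assert(data[0] == 99);     // the original is untouched
}
```

Nothing in the language forces the `.dup`; it is a convention the function author chooses to follow.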

Ilya Minkov <minkov cs.tum.edu> writes:

Scott Egan wrote:
 I've read a bit about copy-on-write semantics being necessary.
 
 That's understandable, since a slice points to the same array data.

It should be the standard strategy for libraries. In your own code, you
may do as you please. The same goes for library code that is only used
internally and never called by the user.

 I gather that it has to be done by the programmer. Is this such a good
 thing?

Yes, it is. Walter decided to leave it as it is because, for example, in
functional-style processing you want to avoid copying redundantly at
every call. On the other hand, one could let "in" for arrays mean
constant or copy semantics, as you suggest below, and require "inout"
when the result should propagate back to the caller, since array resizes
and the like don't work without inout anyway.

 Could the compiler/runtime not determine when this is necessary and
 automatically make a copy?

Perhaps, if it detects that an array is being written to. But even then
there are different strategies. One is: if the function contains code
where the array is an lvalue, always copy at the beginning of the
function. Another is: copy at the first actual assignment, which might
never happen (a win), and maintain a boolean that has to be checked at
every assignment (a loss). As you see, each performs well or badly
depending on the task, and only the programmer can decide which is
optimal and implement it; it cannot be done automatically.

 Am I missing something?

I don't think you are any longer. :>

-eye

PS. I think this is one of the fundamental questions - would it be worth 
adding that to a Wiki?

Apr 16 2004
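
The two strategies described in the post above can be sketched roughly as follows. This is an illustration only, in present-day D; `clampEager` and `clampLazy` are hypothetical names, and clamping values to a limit is just a stand-in for "a function that may or may not write to its array".

```d
// Strategy 1: copy once at entry, then write freely.
int[] clampEager(const(int)[] src, int limit)
{
    int[] result = src.dup;            // always pays for the copy
    foreach (ref x; result)
        if (x > limit) x = limit;
    return result;
}

// Strategy 2: copy at the first actual write; a flag is consulted
// before every write.
const(int)[] clampLazy(const(int)[] src, int limit)
{
    int[] copy;
    bool copied = false;
    foreach (i, x; src)
    {
        if (x > limit)
        {
            if (!copied) { copy = src.dup; copied = true; }  // per-write check
            copy[i] = limit;
        }
    }
    return copied ? copy : src;        // no write happened: hand back the input
}

unittest
{
    int[] small = [1, 2, 3];
    assert(clampLazy(small, 10).ptr == small.ptr);   // nothing written, nothing copied
    assert(clampEager(small, 10).ptr != small.ptr);  // copied regardless

    int[] big = [1, 50, 3];
    assert(clampLazy(big, 10) == [1, 10, 3]);
    assert(big[1] == 50);                            // input left alone
}
```

Which one wins depends on how often a write actually occurs, which is the point being made: the lazy version saves a copy when no element exceeds the limit, and pays for a flag test on every write otherwise.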

"Scott Egan" <scotte tpg.com.aux> writes:

I just dragged this from the MSDN Library, just for info.  Obviously MS
thought about this too.

Strings are immutable
One of the hard things to get used to in the .NET Framework is that String
objects are immutable, meaning once they're created their values cannot be
changed. (However, you can reassign the string reference to refer to another
string, freeing up the first string for garbage collection if no other
references to it exist.)

The methods of String that appear to manipulate the string do not change the
current string; instead, they create a new string and return it. Even
changing, inserting, or deleting a single character causes a new string to
be created and the old one to be thrown away.

Note that the process of repeatedly creating and throwing away strings can
be slow. But making strings immutable has a number of advantages, in that
ownership, aliasing, and threading issues are all much simpler with
immutable objects. For instance, strings are always safe for multithreaded
programming, since there's nothing a thread can do that would mess up
another thread by modifying a string, since strings cannot be modified.

Apr 17 2004

"Walter" <walter digitalmars.com> writes:

"Scott Egan" <scotte tpg.com.aux> wrote in message
news:c5r9ke$13o2$1 digitaldaemon.com...
 Note that the process of repeatedly creating and throwing away strings can
 be slow. But making strings immutable has a number of advantages, in that
 ownership, aliasing, and threading issues are all much simpler with
 immutable objects. For instance, strings are always safe for multithreaded
 programming, since there's nothing a thread can do that would mess up
 another thread by modifying a string, since strings cannot be modified.

It's a perfectly valid approach, except that it's slow and therefore not
suitable for high performance programming applications.

Apr 17 2004

"Walter" <walter digitalmars.com> writes:

"Scott Egan" <scotte tpg.com.aux> wrote in message
news:c5ong8$ajv$1 digitaldaemon.com...
 I've read a bit about copy-on-write semantics being necessary.

 That's understandable, since a slice points to the same array data.

 I gather that it has to be done by the programmer. Is this such a good
 thing?

 Could the compiler/runtime not determine when this is necessary and
 automatically make a copy?

 Am I missing something?

In order to make copy-on-write work automatically, one has to detect when a
write is done to an array. Without hardware support for this (hardware
support does exist, but only at the 4K page level), the compiler would
have to either insert 'write barrier' checks on every write to an array,
or take the conservative approach and copy the array for every write to
it. The latter is how JavaScript and (I heard) Delphi work. But it makes
multiple sequential writes to an array incredibly slow and memory hoggish,
rendering it completely unacceptable for a performance-oriented language
like D.

Apr 16 2004
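
To put a rough number on the "copy the array for every write" approach, here is a small sketch in present-day D. The helper names `setCopying` and `elementsCopiedFilling` are invented for this illustration, and the count is a simple tally of elements duplicated, not a benchmark.

```d
// Conservative approach: every write yields a fresh copy of the whole array.
int[] setCopying(const(int)[] a, size_t i, int value)
{
    int[] copy = a.dup;
    copy[i] = value;
    return copy;
}

// Fill an array of length n one element at a time and tally how many
// elements were duplicated along the way.
size_t elementsCopiedFilling(size_t n)
{
    int[] a = new int[n];
    size_t copied = 0;
    foreach (i; 0 .. n)
    {
        a = setCopying(a, i, cast(int) i);
        copied += n;               // each write duplicated all n elements
    }
    return copied;
}

unittest
{
    // 1000 sequential writes duplicate a million elements in total.
    assert(elementsCopiedFilling(1000) == 1_000_000);
}
```

Writing in place would duplicate nothing, so the work grows with the square of the array length instead of linearly, and every intermediate copy is garbage for the collector.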

"Walter" <walter digitalmars.com> writes:

"J Anderson" <REMOVEanderson badmama.com.au> wrote in message
news:c5qq8b$fb3$1 digitaldaemon.com...
 Walter wrote:

  In order to make copy-on-write work automatically, one has to detect when a
  write is done to an array. Without hardware support for this (hardware
  support does exist for such, but only at the 4k page level), the compiler
  will have to either insert 'write barrier' checks on every write to an
  array, or would have to take the conservative approach and copy the array
  for every write to it. The latter is how JavaScript and (I heard) how Delphi
  works. But it makes multiple sequential writes to an array incredibly slow
  and memory hoggish, rendering it completely unacceptable for a performance
  oriented language like D.

 Couldn't something like this be enabled only in debug mode, and perhaps
 requested explicitly by the programmer?

It could, but I think it's better to just lay out the way it works and stick
with it.

Apr 17 2004
