digitalmars.D - xxxInPlace or xxxCopy?

Andrei Alexandrescu (23/23) Jan 19 2011 I'm consolidating some routines from std.string into std.array. They are...

bearophile (6/11) Jan 19 2011 Strings are meant to be immutable, and the functional style is simpler t...

so (4/4) Jan 19 2011 Strange, we are again on the opposite sides...

bearophile (4/8) Jan 19 2011 In the meantime the world is going more functional... :-)

so (2/3) Jan 20 2011 I love how they solve this problem, but if you go on that path while

Simen kjaeraas (4/8) Jan 19 2011 Nope. (s1 == "Mary has a li'l lamb.") && (s == "Mary has a lil lamb.").

Jesse Phillips (4/6) Jan 19 2011 Do what sort does. On another thought what about:
Jonathan M Davis (21/47) Jan 19 2011 --vote;

Andrei Alexandrescu (5/35) Jan 19 2011 Problem is, even though the example uses strings, the functions apply to...

Jonathan M Davis (12/54) Jan 19 2011 True. But I would expect a string to be by far the most used type of arr...

so (3/9) Jan 19 2011 Isn't simplicity and understandability favors the in-place style on thes...

bearophile (6/9) Jan 19 2011 You have to think of the normal sort as a performance hack, something th...

so (3/8) Jan 20 2011 I didn't know that, this solution is what i meant.

bearophile (4/7) Jan 20 2011 Python was designed lot of time ago by Guido that I think didn't know mu...

Jonathan M Davis (48/59) Jan 19 2011 No. I'd argue that it's clearer to see stuff like

Andrei Alexandrescu (3/33) Jan 19 2011 This is a good argument, thanks Jonathan.

Andrej Mitrovic (7/11) Jan 19 2011 [1, 2, 3] < sorted returned a new list
Andrej Mitrovic (18/29) Jan 19 2011 What I meant by the first sentence is that due to the interpreter

bearophile (10/27) Jan 20 2011 Such bugs are common enough. GNU C has the warn_unused_result attribute ...

Trass3r (3/3) Jan 20 2011 If such an annotation was introduced, it should be the other way around.

Jonathan M Davis (13/16) Jan 20 2011 Actually, there are plenty of cases where you throw away the return valu...

foobar (4/23) Jan 20 2011 You brought up an interesting idea:

Steven Schveighoffer (4/38) Jan 20 2011 Pure functions no longer have that requirement. You can pass mutable

Don (4/47) Jan 20 2011 If you don't use the return value of a strongly pure, nothrow function,

bearophile (4/7) Jan 20 2011 I have added this at the end of the enhancement request 5464 (but the er...

Andrej Mitrovic (8/13) Jan 20 2011 Yeah. There are functions that can return a value that also have

spir (17/42) Jan 20 2011 attribute (that is like your @nodiscard if you use -Werror to turn

bearophile (4/7) Jan 20 2011 http://d.puremagic.com/issues/show_bug.cgi?id=5464

Trass3r (5/10) Jan 20 2011 second one does it in place.
so (11/22) Jan 20 2011 I don't understand how the first two are clear and the last two are not ...

bearophile (4/8) Jan 20 2011 In Python I am used to immutable strings, so string methods like replace...
Jonathan M Davis (26/53) Jan 20 2011 replace is clearer in the first case, because you're getting the return ...

so (10/16) Jan 21 2011 I am really trying hard to understand this, but your reasons for first i...

bearophile (5/7) Jan 21 2011 You will find D1 string functions are much more from here than from Boos...
spir (12/29) Jan 21 2011 Without any additional information, I would necessirily assume replace
Jonathan M Davis (48/67) Jan 21 2011 The issue is when you don't look at the documentation or trying to avoid...
Jonathan M Davis (29/59) Jan 21 2011 The fact that a function performs an action has nothing do to with wheth...
spir (22/33) Jan 21 2011 I don't follow you here. You use in your reasoning the particularity of

Andrei Alexandrescu (4/14) Jan 19 2011 We also have toupperInPlace and tolowerInPlace as precedents pointing

bearophile (4/6) Jan 19 2011 Important general rule: if converting string functions into generic func...
Jerry Quinn (4/22) Jan 19 2011 The big difference is operating on immutable arrays vs mutable ones. Fo...

Jonathan M Davis (5/39) Jan 19 2011 I'd say that yes, it's too ugly to contemplate. The reason is simple: th...

Justin Johansson (7/12) Jan 20 2011 Though your question has already prompted a number of answers, are you
spir (41/63) Jan 20 2011 I have thought at these issues (there are several playing together) in
Akakima (7/7) Jan 20 2011 Is it ok to use:
foobar (10/40) Jan 20 2011 Like bearophile and others, I too would prefer the default behavior to b...
Jonathan M Davis (19/55) Jan 21 2011 Sure, you can always come up with more exotic stuff that the return valu...
spir (16/19) Jan 21 2011 Same for me. I don't find having this version as the normal case odd at

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

I'm consolidating some routines from std.string into std.array. They are 
specialized for operating on arrays, and include the likes of insert, 
remove, replace.

One question is whether operations should be performed in place or on a 
copy. For example:

string s = "Mary has a lil lamb.";
// Implicit copy
auto s1 = replace(s, "lil", "li'l");
assert(s == "Mary has a lil lamb.");
// Explicit in-place
replaceInPlace(s, "lil", "li'l");
assert(s == "Mary has a li'l lamb.");

So that would make copying the default behavior. Alternatively, we could 
make in-place the default behavior and ask for the Copy suffix:

string s = "Mary has a lil lamb.";
// Explicit copy
auto s1 = replaceCopy(s, "lil", "li'l");
assert(s == "Mary has a lil lamb.");
// Implicit in-place
replace(s, "lil", "li'l");
assert(s == "Mary has a li'l lamb.");


Thoughts?

Andrei

Jan 19 2011

bearophile <bearophileHUGS lycos.com> writes:

Andrei:

 One question is whether operations should be performed in place or on a 
 copy. For example:

Strings are meant to be immutable, and the functional style is simpler to
understand and safer to use, so I firmly suggest the default (with shorter
names) functions to create a new string/array, and the versions that work in
place with a longer name.

In some languages the versions that work in-place have a bang (!) suffix, like
replace and replace!. I guess a name like "replaceBang" is too cryptic.


 auto s1 = replace(s, "lil", "li'l");
 assert(s == "Mary has a lil lamb.");

You probably meant:
 assert(s1 == "Mary has a lil lamb.");

Bye,
bearophile

Jan 19 2011

so <so so.do> writes:

Strange, we are on opposite sides again...
The second one looks much better to me.
I think most of the time we need in-place, and it deserves the better
syntax.

Jan 19 2011

bearophile <bearophileHUGS lycos.com> writes:

so:

 Strange, we are again on the opposite sides...
 Second one looks much better to me.
 I think, most of the time we need inplace, and it deserves the better  
 syntax.

In the meantime the world is going more functional... :-)

Bye,
bearophile

Jan 19 2011

so <so so.do> writes:

 In the meantime the world is going more functional... :-)

I love how they solve this problem, but if you go down that path while
ignoring reality, there wouldn't be much of a reason to use D, no? :)

Jan 20 2011

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

bearophile <bearophileHUGS lycos.com> wrote:

 auto s1 = replace(s, "lil", "li'l");
 assert(s == "Mary has a lil lamb.");

 You probably meant:
 assert(s1 == "Mary has a lil lamb.");


Nope. (s1 == "Mary has a li'l lamb.") && (s == "Mary has a lil lamb.").


-- 
Simen

Jan 19 2011

Jesse Phillips <jessekphillips+D gmail.com> writes:

Andrei Alexandrescu Wrote:

 So that would make copying the default behavior. Alternatively, we could 
 make in-place the default behavior and ask for the Copy suffix:

Do what sort does. On another thought what about:

auto s = replace(s1[], "lil", "li'l");

Isn't the empty [] the notation for saving a range in its current form?
It just seems like this is how we'd want to do things.

Jan 19 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday, January 19, 2011 15:33:16 Andrei Alexandrescu wrote:
 I'm consolidating some routines from std.string into std.array. They are
 specialized for operating on arrays, and include the likes of insert,
 remove, replace.
 
 One question is whether operations should be performed in place or on a
 copy. For example:
 
 string s = "Mary has a lil lamb.";
 // Implicit copy
 auto s1 = replace(s, "lil", "li'l");
 assert(s == "Mary has a lil lamb.");
 // Explicit in-place
 replaceInPlace(s, "lil", "li'l");
 assert(s == "Mary has a li'l lamb.");

++vote;

 So that would make copying the default behavior. Alternatively, we could
 make in-place the default behavior and ask for the Copy suffix:
 
 string s = "Mary has a lil lamb.";
 // Explicit copy
 auto s1 = replaceCopy(s, "lil", "li'l");
 assert(s == "Mary has a lil lamb.");
 // Implicit in-place
 replace(s, "lil", "li'l");
 assert(s == "Mary has a li'l lamb.");

--vote;

 
 Thoughts?

Haven't we been using the approach that string operations generally make copies 
(in many cases slices) and marking functions that do it in place with InPlace? 
That's certainly the approach that I'd prefer. And considering that strings 
(which would be the most common use of arrays, I would think) have immutable 
elements and generally _can't_ do anything in place, that would imply that 
copying/slicing would be the default rather than doing operations in place.
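
A minimal sketch of the point about immutable elements. It assumes a current Phobos in which std.array.replace returns a new array; the variable names are only for illustration:

```d
import std.array : replace;

void main()
{
    string s = "Mary has a lil lamb.";

    // Elements of a string are immutable(char), so writing through s is rejected.
    static assert(!__traits(compiles, s[0] = 'm'));

    // The copying form always works and leaves the original untouched.
    auto s1 = replace(s, "lil", "li'l");
    assert(s == "Mary has a lil lamb.");
    assert(s1 == "Mary has a li'l lamb.");

    // A mutable char[] can be edited element by element.
    char[] buf = s.dup;
    buf[0] = 'm';
    assert(buf == "mary has a lil lamb.");
}
```

So for string, the copying form is the only one that can work without first duplicating the data.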

Also, if you're looking to minimize code breakage, you're going to have to go
with using copy by default and in place for functions marked for it, because the
existing versions of functions like replace have been making copies. So,
switching to in place by default would break more code.

Being forced to use functions with copy in the name would make dealing with
strings more annoying, since they would _have_ to be using the copy versions,
and it would be the versions with copy in the name which would be used the most,
which seems really backwards.

So, I really think that copying should be the default and in place functions 
should be marked with InPlace. It's more consistent with current behavior and 
would generally result in less typing.

- Jonathan M Davis

Jan 19 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/19/11 6:53 PM, Jonathan M Davis wrote:
 On Wednesday, January 19, 2011 15:33:16 Andrei Alexandrescu wrote:
 I'm consolidating some routines from std.string into std.array. They are
 specialized for operating on arrays, and include the likes of insert,
 remove, replace.

 One question is whether operations should be performed in place or on a
 copy. For example:

 string s = "Mary has a lil lamb.";
 // Implicit copy
 auto s1 = replace(s, "lil", "li'l");
 assert(s == "Mary has a lil lamb.");
 // Explicit in-place
 replaceInPlace(s, "lil", "li'l");
 assert(s == "Mary has a li'l lamb.");

 ++vote;

 So that would make copying the default behavior. Alternatively, we could
 make in-place the default behavior and ask for the Copy suffix:

 string s = "Mary has a lil lamb.";
 // Explicit copy
 auto s1 = replaceCopy(s, "lil", "li'l");
 assert(s == "Mary has a lil lamb.");
 // Implicit in-place
 replace(s, "lil", "li'l");
 assert(s == "Mary has a li'l lamb.");

 --vote;

So I guess vote stays unchanged :o).

 Thoughts?

 Haven't we been using the approach that string operations generally make copies
 (in many cases slices) and marking functions that do it in place with InPlace?

Problem is, even though the example uses strings, the functions apply to 
all arrays.

Andrei

Jan 19 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday, January 19, 2011 17:10:07 Andrei Alexandrescu wrote:
 On 1/19/11 6:53 PM, Jonathan M Davis wrote:
 On Wednesday, January 19, 2011 15:33:16 Andrei Alexandrescu wrote:
 I'm consolidating some routines from std.string into std.array. They are
 specialized for operating on arrays, and include the likes of insert,
 remove, replace.
 
 One question is whether operations should be performed in place or on a
 copy. For example:
 
 string s = "Mary has a lil lamb.";
 // Implicit copy
 auto s1 = replace(s, "lil", "li'l");
 assert(s == "Mary has a lil lamb.");
 // Explicit in-place
 replaceInPlace(s, "lil", "li'l");
 assert(s == "Mary has a li'l lamb.");

 
 ++vote;
 
 So that would make copying the default behavior. Alternatively, we could
 make in-place the default behavior and ask for the Copy suffix:
 
 string s = "Mary has a lil lamb.";
 // Explicit copy
 auto s1 = replaceCopy(s, "lil", "li'l");
 assert(s == "Mary has a lil lamb.");
 // Implicit in-place
 replace(s, "lil", "li'l");
 assert(s == "Mary has a li'l lamb.");

 
 --vote;

 
 So I guess vote stays unchanged :o).
 
 Thoughts?

 
 Haven't we been using the approach that string operations generally make
 copies (in many cases slices) and marking functions that do it in place
 with InPlace?

 
 Problem is, even though the example uses strings, the functions apply to
 all arrays.

True. But I would expect a string to be by far the most used type of array. So, 
unless you want to specialize the functions so that they work one way for 
strings and another way for other arrays (which sounds like a really bad idea), 
it would make the most sense to pick the way that's most likely to be used as 
the default. And since strings are the most likely case, choosing what works 
best for strings seems like the best idea IMHO.

And honestly, from the standpoint of code simplicity and understandability, 
there's a lot to be said for making copies being the default rather than 
mutation. You can then use the InPlace versions if you need the boost in 
efficiency.

- Jonathan M Davis

Jan 19 2011

so <so so.do> writes:

 And honestly, from the standpoint of code simplicity and  
 understandability,
 there's a lot to be said for making copies being the default rather than
 mutation. You can then use the InPlace versions if you need the boost in
 efficiency.

 - Jonathan M Davis

Don't simplicity and understandability favor the in-place style for these
types of algorithms?
As Jesse Phillips said, it is the same as sort.

Jan 19 2011

bearophile <bearophileHUGS lycos.com> writes:

so:

 Isn't simplicity and understandability favors the in-place style on these  
 type of algorithms?

Nope, functional-style code is what you are looking for :-)


 As Jesse Phillips said, it is same as sort.

You have to think of the normal sort as a performance hack, something that is
good because copying data wastes a lot of time if the array is large or if you
have to sort many small arrays. Normally in Python you prefer sorted(), which
returns a sorted copy, unless performance is important. I'd like something like
sorted() in D too.
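
Something along these lines, as a rough sketch. The name sorted is borrowed from Python and is a hypothetical helper here, not a Phobos function:

```d
import std.algorithm.sorting : sort;

// Hypothetical helper named after Python's sorted(); not part of Phobos.
T[] sorted(T)(T[] input)
{
    auto copy = input.dup;   // duplicate first so the caller's array is untouched
    sort(copy);              // std.algorithm's sort works in place on the copy
    return copy;
}

void main()
{
    int[] x = [3, 2, 1];
    auto y = sorted(x);
    assert(y == [1, 2, 3]);  // the result is sorted
    assert(x == [3, 2, 1]);  // the original is unchanged
}
```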

In a program there is code that's performance-critical, and other code that's
not changing the total runtime much. Often the second kind of code is a good
part of the whole program. In this part you want very short, readable, safer
code, even functional-style :-)

Bye,
bearophile

Jan 19 2011

so <so so.do> writes:

 You have to think of the normal sort as a performance hack, something  
 that is good because copying data wastes a lot of time, if the array is  
 large or if you have to sort an many small arrays. Normally in Python  
 you prefer sorted(), that returns a sorted copy, unless performance is  
 important. I'd like something like sorted() in D too.

I didn't know that; this solution is what I meant.
So, they didn't blindly impose functional-language rules on a
non-functional language.

Jan 20 2011

bearophile <bearophileHUGS lycos.com> writes:

so:

 I didn't know that, this solution is what i meant.
 So, they didn't blindly enforce functional language rules to a  
 non-functional language.

Python was designed a long time ago by Guido, who I think didn't know much
about functional programming. So they first added an in-place sort() and
later added a more functional sorted(). D2 is more functional than
Python2, and I think the behaviour of sorted() is the better default
in D2 :-)

Bye,
bearophile

Jan 20 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday 19 January 2011 18:36:55 so wrote:
 And honestly, from the standpoint of code simplicity and
 understandability,
 there's a lot to be said for making copies being the default rather than
 mutation. You can then use the InPlace versions if you need the boost in
 efficiency.
 
 - Jonathan M Davis

 
 Isn't simplicity and understandability favors the in-place style on these
 type of algorithms?
 As Jesse Phillips said, it is same as sort.

No. I'd argue that it's clearer to see stuff like

auto newStr = replace(str, "hello", "world");
auto sorted = sort(newStr);

than to see stuff like

replace(str, "hello", "world");
sort(str);

If you have

replace(str, "hello", "world");

you don't know whether it's changed the value in place or if you're throwing 
away a return value. However, if you have

auto newStr = replace(str, "hello", "world");
replaceInPlace(newStr, "world", "hello");

it's quite clear that the first one returns a value and the second one does
it in place. Whereas if you have

auto newStr = replaceCopy(str, "hello", "world");
replace(newStr, "world", "hello");

the first one is clear, but the second one is only clear because seeing the first
one makes it obvious that the second one must be doing something different. And
even then, I'd argue that the name replaceCopy is more ambiguous than
replaceInPlace. I think that it's far more likely that a function xCopy is going
to have possible alternate meanings than xInPlace would, since not only is copy
both a verb and a noun, but it can be used in a lot more situations, whereas
InPlace is pretty limited and thus clear. Not to mention, if a function says
copy, that implies that it is actually _copying_, whereas many xCopy functions
would really be slicing rather than copying.
So, using Copy in the name is actually ambiguous _regardless_ of what the first
part of the function name is.

In functional languages, it's _required_ that a function return the changed 
value instead of changing the one passed in. You're far less likely to 
accidentally mutate stuff if you program that way, even if you're not dealing 
with immutable or const values. I think that code is much cleaner if you program
in a functional style. The problem is, of course, that you can't constantly be
copying everything all the time, because there's a definite performance hit for 
doing that. So, you have functions which make changes in place when you need to 
do that.

So, I'd argue that it's generally better to program using a functional style if 
you can and then use mutation if necessary for performance. x and xInPlace 
support that while xCopy and x do not.
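
To make the x / xInPlace pairing concrete for arrays other than strings, here is a small sketch. The names replaceValue and replaceValueInPlace are hypothetical and chosen only for illustration; they are not the Phobos API:

```d
// In-place form (hypothetical name): mutates the caller's array, returns nothing.
void replaceValueInPlace(T)(T[] arr, T from, T to)
{
    foreach (ref e; arr)
    {
        if (e == from)
            e = to;
    }
}

// Copying form (hypothetical name): the shorter name returns a new array.
T[] replaceValue(T)(T[] arr, T from, T to)
{
    auto result = arr.dup;
    replaceValueInPlace(result, from, to);
    return result;
}

void main()
{
    int[] a = [1, 2, 1, 3];

    auto b = replaceValue(a, 1, 9);
    assert(a == [1, 2, 1, 3]);   // original untouched
    assert(b == [9, 2, 9, 3]);

    replaceValueInPlace(a, 1, 9);
    assert(a == [9, 2, 9, 3]);   // mutated only when the name says so
}
```

The copying version is written in terms of the in-place one, so the faster variant is still there when it is needed.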

However, I think that the biggest argument in favor of using x and xInPlace is
that strings are by far the most used type of array, and they _need_ to use the
version which makes a copy or slices the array. So, the x / xInPlace naming
scheme would result in x being used more than xInPlace, whereas xCopy / x would
result in xCopy being used the most. And I really think that the shorter version
should be the one which is going to be used the most. Not to mention, that's the
way that the string functions have been done thus far, so sticking to x /
xInPlace will break less code.

- Jonathan M Davis

Jan 19 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/19/11 9:11 PM, Jonathan M Davis wrote:
 On Wednesday 19 January 2011 18:36:55 so wrote:
 And honestly, from the standpoint of code simplicity and
 understandability,
 there's a lot to be said for making copies being the default rather than
 mutation. You can then use the InPlace versions if you need the boost in
 efficiency.

 - Jonathan M Davis

 Isn't simplicity and understandability favors the in-place style on these
 type of algorithms?
 As Jesse Phillips said, it is same as sort.

 No. I'd argue that it's clearer to see stuff like

 auto newStr = replace(str, "hello", "world");
 auto sorted = sort(newStr);

 than to see stuff like

 replace(str, "hello", "world");
 sort(newStr);

 If you have

 replace(str, "hello", "world");

 you don't know whether it's changed the value in place or if you're throwing
 away a return value. However, if you have

 auto newStr = replace(str, "hello", "world");
 replaceInPlace(newStr, "world", "hello");

 it's quite clear that the first one returns a value and the second one does
 it in place. Whereas if you have

 auto newStr = replaceCopy(str, "hello", "world");
 replace(newStr, "world", "hello");

 the first one is clear, but the second one is only clear because seeing the first
 one makes it obvious that the second one must be doing something different.

This is a good argument, thanks Jonathan.

Andrei

Jan 19 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

One common mistake newbies make in Python is calling the sorted function
and expecting it to sort in place:

>>> x = [3, 2, 1]
>>> sorted(x)
[1, 2, 3]    < sorted returned a new list
>>> x
[3, 2, 1]    < x stayed the same

There are a few functions in the Python lib that have "InPlace" added
to their names to avoid confusion, so it's not a new convention and it
seems like a good way to go.

Jan 19 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 1/20/11, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:
 One common mistake newbies make in Python is calling the sorted method
 and expecting it to sort in place:

 >>> x = [3, 2, 1]
 >>> sorted(x)
 [1, 2, 3]    < sorted returned a new list
 >>> x
 [3, 2, 1]    < x stayed the same

 There are a few functions in the Python lib that have "InPlace" added
 to their names to avoid confusion, so it's not a new convention and it
 seems like a good way to go.

What I meant by the first sentence is that due to the interpreter
outputting the sorted list, a newbie might think that x was sorted, so
he uses it in his own code until he notices the bug.

I think what might help out in D is if we had a way to mark some
functions so the compiler guarantees that their return values *are
not* to be discarded. For example, this code will compile:

import std.stdio;
import std.string;
void main()
{
    string s = "Mary has a lil lamb.";
    replace(s, "lil", "li'l");  // returns a copy, but discards it
}

If the replace function is marked with some kind of @nodiscard
annotation, then this would be a compile error since it doesn't make
sense to construct a new string, return it, and discard it.

But maybe that's going overboard. How often do these kinds of bugs creep in?

Jan 19 2011

bearophile <bearophileHUGS lycos.com> writes:

Andrej Mitrovic:

 I think what might help out in D is if we had a way to mark some
 functions so the compiler guarantees that their return values *are
 not* to be discarded. For example, this code will compile:
 
 import std.stdio;
 import std.string;
 void main()
 {
     string s = "Mary has a lil lamb.";
     replace(s, "lil", "li'l");  // returns a copy, but discards it
 }
 
 If the replace function is marked with some kind of @nodiscard
 annotation, then this would be a compile error since it doesn't make
 sense to construct a new string, return it, and discard it.
 
 But maybe that's going overboard. How often do these kinds of bugs creep in?

Such bugs are common enough. GNU C has the warn_unused_result attribute (that
is like your @nodiscard if you use -Werror to turn warnings into errors):
http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html

Some C lints require a void cast where you don't want to use a function result:
cast(void)replace(s, "lil", "li'l");

In a language where the default is different, you have to add an annotation
wherever you don't want to use a function result:
unused replace(s, "lil", "li'l");

Something like @nodiscard is more useful in C than D because in C there are no
true built-in exceptions, so error return values are common, and ignoring them
is a mistake. In some cases like replace() or the C realloc() ignoring a result
is always a programmer error. So something like @nodiscard is useful in D too.

Bye,
bearophile

Jan 20 2011

Trass3r <un known.com> writes:

If such an annotation was introduced, it should be the other way around.
But imo discarding a return value should always result in a warning,
the function returns something for a reason.

Jan 20 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday 20 January 2011 03:51:48 Trass3r wrote:
 If such an annotation was introduced, it should be the other way around.
 But imo discarding a return value should always result in a warning,
 the function returns something for a reason.

Actually, there are plenty of cases where you throw away the return value. A
number of overloaded operators are prime examples - such as opAssign.
std.algorithm.sort both sorts in place _and_ returns a sorted range (so that
other algorithms can then know that the range is sorted). It's really quite easy
to get legitimate cases where throwing away the return value makes perfect
sense. Now, if you call a strongly pure function and throw away
its return value, then yes, that's definitely a bug, since the only effect of the
function is its return value. Frequently, however, that's not the case.

Yes, you can have bugs because you didn't actually use the return value of a
function, but it's not necessarily uncommon to have function calls which
legitimately throw away their return value.
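
As a small sketch of the sort case, assuming current Phobos, where sort returns a SortedRange with a contains method:

```d
import std.algorithm.sorting : sort;

void main()
{
    int[] a = [3, 1, 2];

    // Discarding the result is fine: the array itself is now sorted.
    sort(a);
    assert(a == [1, 2, 3]);

    // Keeping the result gives a SortedRange, which offers binary-search lookups.
    int[] b = [9, 7, 8];
    auto r = sort(b);
    assert(r.contains(8));
    assert(!r.contains(5));
}
```

Both calls are legitimate, so a blanket warning on every discarded result would flag the first one for no good reason.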

- Jonathan M Davis

Jan 20 2011

foobar <foo bar.com> writes:

Jonathan M Davis Wrote:

 On Thursday 20 January 2011 03:51:48 Trass3r wrote:
 If such an annotation was introduced, it should be the other way around.
 But imo discarding a return value should always result in a warning,
 the function returns something for a reason.

 
 Actually, there are plenty of cases where you throw away the return value. A
 number of overloaded operators are prime examples - such as opAssign.
 std.algorithm.sort both sorts in place _and_ returns a sorted range (so that
 other algorithms can then know that the range is sorted). It's really quite easy
 to get legitimate cases where throwing away the return value makes perfect
 sense. Now, if you call a strongly pure function and throw away
 its return value, then yes, that's definitely a bug, since the only effect of the
 function is its return value. Frequently, however, that's not the case.

 Yes, you can have bugs because you didn't actually use the return value of a
 function, but it's not necessarily uncommon to have function calls which
 legitimately throw away their return value.
 
 - Jonathan M Davis

You brought up an interesting idea:
D already supports purity and as you said it doesn't make sense to discard
return values of such functions. 
Therefore, it makes sense that for pure functions, this would result in a
compile time error.

Jan 20 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 20 Jan 2011 10:36:00 -0500, foobar <foo bar.com> wrote:

 Jonathan M Davis Wrote:

 On Thursday 20 January 2011 03:51:48 Trass3r wrote:
 If such an annotation was introduced, it should be the other way
 around.
 But imo discarding a return value should always result in a warning,
 the function returns something for a reason.

 Actually, there are plenty of cases where you throw away the return  
 value. A
 number of overloaded operators are prime examples - such as opAssign.
 std.algorithm.sort both sorts in place _and_ returns a sorted range (so  
 that
 other algorithms can then know that the range is sorted). It's really  
 quite easy
 to get legitimate cases where throwing away the return value makes  
 perfect
 sense. Now, if you're dealing with a strongly pure function which  
 throws away
 its return value, then yes, that's definitely bug, since the only  
 effect of the
 function is its return value. Frequently however, that's not the case.

 Yes, you can have bugs because you didn't actually use the return value  
 of a
 function, but it's that necessarily uncommon to have function calls  
 which
 legitimately throw away their return value.

 - Jonathan M Davis

 You brought up an interesting idea:
 D already supports purity and as you said it doesn't make sense to  
 discard return values of such functions.
 Therefore, it makes sense that for pure functions, this would result in  
 a compile time error.

Pure functions no longer have that requirement.  You can pass mutable  
references to pure functions, which makes them weak-pure.
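
A rough illustration of the difference; the function names zeroNegatives and sum are made up for this example:

```d
// Weakly pure: no global state is touched, but the mutable argument is modified.
size_t zeroNegatives(int[] data) pure nothrow
{
    size_t changed = 0;
    foreach (ref v; data)
    {
        if (v < 0)
        {
            v = 0;
            ++changed;
        }
    }
    return changed;
}

// Strongly pure: immutable input, so the return value is the only effect.
int sum(immutable(int)[] data) pure nothrow
{
    int total = 0;
    foreach (v; data)
        total += v;
    return total;
}

void main()
{
    int[] xs = [1, -2, 3, -4];
    zeroNegatives(xs);            // return value ignored, side effect kept
    assert(xs == [1, 0, 3, 0]);

    immutable(int)[] ys = [1, 2, 3];
    assert(sum(ys) == 6);         // ignoring this result would make the call pointless
}
```

So a rule based on pure alone would wrongly reject the first call; it would have to be limited to strongly pure functions.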

-Steve

Jan 20 2011

Don <nospam nospam.com> writes:

Steven Schveighoffer wrote:
 On Thu, 20 Jan 2011 10:36:00 -0500, foobar <foo bar.com> wrote:
 
 Jonathan M Davis Wrote:

 On Thursday 20 January 2011 03:51:48 Trass3r wrote:
 If such an annotation was introduced, it should be the other way
 around.
 But imo discarding a return value should always result in a warning,
 the function returns something for a reason.

 Actually, there are plenty of cases where you throw away the return 
 value. A
 number of overloaded operators are prime examples - such as opAssign.
 std.algorithm.sort both sorts in place _and_ returns a sorted range 
 (so that
 other algorithms can then know that the range is sorted). It's really 
 quite easy
 to get legitimate cases where throwing away the return value makes 
 perfect
 sense. Now, if you're dealing with a strongly pure function which 
 throws away
 its return value, then yes, that's definitely bug, since the only 
 effect of the
 function is its return value. Frequently however, that's not the case.

 Yes, you can have bugs because you didn't actually use the return 
 value of a
 function, but it's that necessarily uncommon to have function calls 
 which
 legitimately throw away their return value.

 - Jonathan M Davis

 You brought up an interesting idea:
 D already supports purity and as you said it doesn't make sense to 
 discard return values of such functions.
 Therefore, it makes sense that for pure functions, this would result 
 in a compile time error.

 
 Pure functions no longer have that requirement.  You can pass mutable 
 references to pure functions, which makes them weak-pure.
 
 -Steve

If you don't use the return value of a strongly pure, nothrow function,
you could be given an 'expression has no effect' error.
Currently the function call is silently dropped.

Jan 20 2011

bearophile <bearophileHUGS lycos.com> writes:

Don:

 If you don't use the return value of a strongly pure, nothrow function,
 you could be given an 'expression has no effect' error.
 Currently the function call is silently dropped.

I have added this at the end of the enhancement request 5464 (but the error
message is different).

Bye,
bearophile

Jan 20 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 1/20/11, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 On Thursday 20 January 2011 03:51:48 Trass3r wrote:
 If such an annotation was introduced, it should be the other way around.
 But imo discarding a return value should always result in a warning,
 the function returns something for a reason.

 Actually, there are plenty of cases where you throw away the return value.

Yeah. There are functions that can return a value that also have
side-effects. An example might be a class method that modifies its
private fields and might return the number of fields that were
affected. While you might not need the return value in most cases, you
do want the side-effects to happen. That's why forcing an error on
functions that return values which aren't used would not be a good
idea, and where the annotation idea comes from.

Jan 20 2011

spir <denis.spir gmail.com> writes:

On 01/20/2011 11:31 AM, bearophile wrote:
 Andrej Mitrovic:

 I think what might help out in D is if we had a way to mark some
 functions so the compiler guarantees that their return values *are
 not* to be discarded. For example, this code will compile:

 import std.stdio;
 import std.string;
 void main()
 {
      string s = "Mary has a lil lamb.";
      replace(s, "lil", "li'l");  // returns a copy, but discards it
 }

 If the replace function is marked with some kind of @nodiscard
 annotation, then this would be a compile error since it doesn't make
 sense to construct a new string, return it, and discard it.

 But maybe that's going overboard. How often do these kinds of bugs
 creep in?

 Such bugs are common enough. GNU C has the warn_unused_result
 attribute (that is like your @nodiscard if you use -Werror to turn
 warnings into errors):
 http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html

 Some C lints require a void cast where you don't want to use a
 function result:
 cast(void)replace(s, "lil", "li'l");

 In a language the default is different and where you don't want to
 use a function result you have to add an annotation:
 unused replace(s, "lil", "li'l");

 Something like @nodiscard is more useful in C than D because in C
 there are no true built-in exceptions, so error return values are
 common, and ignoring them is a mistake. In some cases like replace() or
 the C realloc() ignoring a result is always a programmer error. So
 something like @nodiscard is useful in D too.

But I thought D had such a feature already. I'm probably confused, but
I think I've had compiler warnings in precisely such cases (ignoring a
func result).

denis
_________________
vita es estrany
spir.wikidot.com

Jan 20 2011

bearophile <bearophileHUGS lycos.com> writes:

Andrej Mitrovic:

 If the replace function is marked with some kind of @nodiscard
 annotation, then this would be a compile error since it doesn't make
 sense to construct a new string, return it, and discard it.

http://d.puremagic.com/issues/show_bug.cgi?id=5464

Bye,
bearophile

Jan 20 2011

Trass3r <un known.com> writes:

 If you have replace(str, "hello", "world");
 you don't know whether it's changed the value in place or if you're
 throwing away a return value. However, if you have
 auto newStr = replace(str, "hello", "world");
 replaceInPlace(newStr, "world", "hello");
 it's quite clear that the first one returns a value and the
 second one does it in place.

Very true. Imho function names would also be more understandable this
way cause xInPlace is unambiguous while xCopy might lead to confusion
(at least I could imagine a stranger misinterpreting replaceCopy etc.)

Jan 20 2011

so <so so.do> writes:

 auto newStr = replace(str, "hello", "world");
 replaceInPlace(newStr, "world", "hello");

 it's quite clear that the first one returns a value and the second
 one does it in place. Whereas if you have

 auto newStr = replaceCopy(str, "hello", "world");
 replace(newStr, "world", "hello");

 the first one is clear, but the second one is only clear because seeing
 the first one makes it obvious that the second one must be doing
 something different.

I don't understand how the first two are clear and the last two are not.
Both use the name "replace" for different things, and replace to me
means "replace in place".
With this in mind, how is the first "replace" quite clear?

I am sure this is the case for many people. Problem is the naming here.
If you have named it something like "replaced" and return a copy, it would  
be obvious and clear.
Here, aren't you just dictating functional language rules to a  
multi-paradigm language, implicitly?
In a fully functional language "replace(something)" might mean "replace  
and give me a copy", but it is not what we have.

Jan 20 2011

bearophile <bearophileHUGS lycos.com> writes:

so:

 I don't understand how the first two are clear and the last two are not so.
 Where both have the name "replace" for different things, and replace to me  
 means "replace in place".
 With this in hand, how is the first "replace" is quite clear?

In Python I am used to immutable strings, so string methods like replace return
a modified copy. D1 string functions are similar. I'd like D2 to be like Python
here, but in practice an in-place replace procedure and a strongly-pure replace
function that returns a modified copy are about equally clear :-) Yet, if you
perform many in-place operations on strings you may get confused (it happened
to me), such confusion is less common with functional-style string functions.

Bye,
bearophile

Jan 20 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday, January 20, 2011 05:48:12 so wrote:
 auto newStr = replace(str, "hello", "world");
 replaceInPlace(newStr, "world", "hello");
 
 it's quite clear that the first one returns a value and the second
 one does it in place. Whereas if you have
 
 auto newStr = replaceCopy(str, "hello", "world");
 replace(newStr, "world", "hello");
 
 the first one is clear, but the second one is only clear because seeing
 the first one makes it obvious that the second one must be doing
 something different.

 
 I don't understand how the first two are clear and the last two are not so.
 Where both have the name "replace" for different things, and replace to me
 means "replace in place".
 With this in hand, how is the first "replace" is quite clear?
 
 I am sure this is the case for many people. Problem is the naming here.
 If you have named it something like "replaced" and return a copy, it would
 be obvious and clear.
 Here, aren't you just dictating functional language rules to a
 multi-paradigm language, implicitly?
 In a fully functional language "replace(something)" might mean "replace
 and give me a copy", but it is not what we have.

replace is clearer in the first case, because you're getting the return value.
If you don't get the return value, then it's not immediately clear whether it's
replacing "world" with "hello" in the return value or whether the function is
void and "world" is being replaced in the original string (though the fact
that we're dealing with strings here means that it _can't_ alter the original
string - it's more of a question when dealing with arrays with mutable
elements).

Also, replaced would just be downright confusing to me, since it's not a verb.
I'd expect it to be some sort of boolean test for whether something had been
replaced, though that doesn't make a whole lot of sense in the context. I
expect functions to be verbs unless checking state. Now, as I understand it,
Python uses past participles such as replaced and sorted, but having never
programmed in Python, I'm not particularly familiar with that naming scheme and
it would really throw me off at first.

Regardless, I don't see anything wrong with naming functions in a manner that
implies that a functional style is the default - particularly when we're
talking about arrays, and they pretty much _have_ to be used in a functional
style, because their elements are immutable.

Andrei is essentially asking us whether the default behavior of an array
function should typically be to return the changed value or to change it in
place, with the longer name going to the function which has the other behavior.
And since strings _have_ to be copied/sliced, and strings are generally going
to be the most common type of array used, then it would make sense to make the
default behavior be copying/slicing, making the functions which alter arrays in
place have InPlace in their name.
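
A rough sketch of why strings force the copying style while mutable arrays do not. It uses std.array.replace for the copying case; replaceAllInPlace is a hypothetical helper written here for illustration and is not a Phobos function.

```d
import std.array : replace;

// Hypothetical helper (not part of Phobos): overwrite every element equal
// to `from` with `to`, writing directly into the caller's array.
void replaceAllInPlace(T)(T[] arr, T from, T to)
{
    foreach (ref e; arr)
    {
        if (e == from)
            e = to;
    }
}

void main()
{
    // string is immutable(char)[]: its elements cannot be overwritten,
    // so the only option is to build and return a new string.
    string s = "Mary has a lil lamb.";
    auto s1 = replace(s, "lil", "li'l");
    assert(s == "Mary has a lil lamb.");
    assert(s1 == "Mary has a li'l lamb.");

    // With mutable elements an in-place variant becomes possible.
    int[] a = [1, 2, 1, 3];
    replaceAllInPlace(a, 1, 9);
    assert(a == [9, 2, 9, 3]);
}
```

The in-place helper cannot be instantiated for a string at all, since its loop writes to elements that are immutable.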

- Jonathan M Davis

Jan 20 2011

so <so so.do> writes:

 replace is clearer in the first case, because you're getting the return  
 value.
 ...

I am really trying hard to understand this, but your reasons for why the first
is clearer than the second make no sense to me, I am sorry.
I still think second is clearer, but whatever, as long as i can see the  
interface or the doc, i am fine.

string replace(string, ...);
void replace(ref string, ...);

 Regardless, I don't see anything wrong with naming functions in a manner  
 that
 implies that a functional style is the default

I am not against enforcing such a rule, i am against doing it implicitly  
and work with assumptions.
Just check boost/string/replace, they have in place replaces default too.  
You might not like boost (some don't) but it is the closest example to D.

Jan 21 2011

bearophile <bearophileHUGS lycos.com> writes:

so:

 Just check boost/string/replace, they have in place replaces default too.  
 You might not like boost (some don't) but it is the closest example to D.

You will find D1 string functions are much more from here than from Boost:
http://docs.python.org/release/2.5.2/lib/string-methods.html

Bye,
bearophile

Jan 21 2011

spir <denis.spir gmail.com> writes:

On 01/21/2011 07:47 PM, so wrote:
 replace is clearer in the first case, because you're getting the
 return value.
 ...

 I am really trying hard to understand this, but your reasons for first
 is clearer then the second makes no sense to me i am sorry.
 I still think second is clearer, but whatever, as long as i can see the
 interface or the doc, i am fine.

 string replace(string, ...);
 void replace(ref string, ...);

 Regardless, I don't see anything wrong with naming functions in a
 manner that
 implies that a functional style is the default

 I am not against enforcing such a rule, i am against doing it implicitly
 and work with assumptions.
 Just check boost/string/replace, they have in place replaces default
 too. You might not like boost (some don't) but it is the closest example
 to D.

Without any additional information, I would necessarily assume replace
performs an /action/ because it's an action verb: meaning it changes the
argument. Like 'so', I cannot understand the converse reasoning.
If you want people to guess that a true function returns a result, just
name it according to its result: replacedString, or just replaced.
Nobody, I guess, would ever think that a routine called replacedString
acts in-place.

Denis
_________________
vita es estrany
spir.wikidot.com

Jan 21 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Friday, January 21, 2011 10:47:01 so wrote:
 replace is clearer in the first case, because you're getting the return
 value.
 ...

 
 I am really trying hard to understand this, but your reasons for first is
 clearer then the second makes no sense to me i am sorry.
 I still think second is clearer, but whatever, as long as i can see the
 interface or the doc, i am fine.
 
 string replace(string, ...);
 void replace(ref string, ...);

The issue is when you don't look at the documentation or are trying to avoid
having to look at the documentation. If you see

auto result = replace(str, "hello", "goodbye");

it's quite clear that a copy is taking place. And if a copy/slice is taking 
place, then that is what you would normally see when replace is used. However, 
if replace alters the array in place, then

replace(str, "hello", "goodbye");

would be what you would normally see. And without looking at the documentation, 
it's not clear whether that is doing it in-place or if you're throwing away the 
return value. However, in the case where replace does a copy/slice, it _is_ 
clear, because the return value is saved.

So, if copying/slicing is the default, then you won't _need_ to read the
documentation to know whether a copy/slice is happening or whether it's
happening in-place, because the code itself will make it obvious (unless you
screwed up and forgot to assign the return value to a variable or pass it to a
function). But in the case where in-place is the default, it is _not_ obvious
by reading the code. Sure, once you read the documentation, you'll know. But
you have to read the documentation. So, copying/slicing by default is
immediately obvious whereas in-place is not.

 Regardless, I don't see anything wrong with naming functions in a manner
 that
 implies that a functional style is the default

 
 I am not against enforcing such a rule, i am against doing it implicitly
 and work with assumptions.
 Just check boost/string/replace, they have in place replaces default too.
 You might not like boost (some don't) but it is the closest example to D.

If you want consistency among your functions, then you have to pick either
copying or in place as the default. That doesn't necessarily mean that _all_
functions must _always_ be named that way (e.g. the current behavior of sort is
an interesting example since it does _both_). However, if you're going for
consistency, then you have to pick one or the other. Unless you want to
explicitly put Copy and InPlace in all of the array functions and not have any
without it, you're going to _have_ to deal with the fact that a function
without Copy or InPlace in its name is still going to have to do one or the
other (unless you're talking about a function which is just querying something
about an array rather than manipulating it - like cmp).

So, when you have a function like replace, you have to choose whether it's
going to do it in place or copy/slice the array. A different version of the
function with a different name (such as replaceCopy or replaceInPlace) then
deals with the other case. Phobos has already been going for the default of
copying/slicing rather than doing it in-place. Given that strings _have_ to be
copied or sliced and that strings are the most common type of array, making
copying/slicing the default makes good sense.

It's fine if Boost wants to pick in-place as the default. That's their choice.
They're also dealing with a different programming language with different pros
and cons. Personally, I prefer that copying/slicing be the default if it's
efficient enough to do so, since that promotes a functional style of
programming, which is going to tend to be more straightforward and less
error-prone, but if in-place mutation was going to be the normal use case (like
is probably the case with Boost), then it's probably better to make in-place
the norm, because that's the way that's going to be used most. However, since
that's _not_ the way that's likely to be used most in D (due to strings having
immutable elements), I really don't think that in-place as the default makes
the most sense for D.

- Jonathan M Davis

Jan 21 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Friday, January 21, 2011 12:15:42 spir wrote:
 On 01/21/2011 07:47 PM, so wrote:
 replace is clearer in the first case, because you're getting the
 return value.
 ...

 
 I am really trying hard to understand this, but your reasons for first
 is clearer then the second makes no sense to me i am sorry.
 I still think second is clearer, but whatever, as long as i can see the
 interface or the doc, i am fine.
 
 string replace(string, ...);
 void replace(ref string, ...);
 
 Regardless, I don't see anything wrong with naming functions in a
 manner that
 implies that a functional style is the default

 
 I am not against enforcing such a rule, i am against doing it implicitly
 and work with assumptions.
 Just check boost/string/replace, they have in place replaces default
 too. You might not like boost (some don't) but it is the closest example
 to D.

 
 Without any additional information, I would necessirily assume replace
 performs an /action/ because it's an action verb: meaning it changes the
 argument. Like 'so', I cannot understand the converse reasoning.
 I you want people to guess that a true function returns a result, just
 name it according to its result: replacedString, ot just replaced.
 Nobody, I guess, would ever think that a routine called replacedString
 acts in-place.

The fact that a function performs an action has nothing to do with whether it
alters its arguments or just returns a value. It could be either.

Functions in functional languages _must_ return a result and _can't_ alter
their arguments. Many, many functions are written that way in pretty much _all_
languages. In fact, I'd argue that the _normal_ case is that you pass arguments
to a function, and it returns a result without altering the arguments. It's
only when you get into reference types that that changes.

And, of course, arrays are reference types (albeit somewhat special ones). But
since non-reference type arguments never get altered, and many functions with
reference type arguments don't alter their arguments (in fact, I'd argue that
_most_ functions don't alter their arguments - regardless of whether they're
reference types or not), _not_ altering the arguments would be what you would
typically expect of a function unless the name made it obvious that that wasn't
the case, or what the function did made it obvious, or if you read the
documentation and then _knew_ what it did.

And honestly, I find the whole Python thing of using the past participle for
indicating that the result is returned rather than done in place just weird.
I'd expect a function like sorted to give me a boolean result telling me
whether a range is sorted, _not_ that it would return a sorted version of the
range that you gave it. I expect function names to be verbs, not past
participles. Now, as bizarre as that convention may be, it could make functions
clearer if you know about the convention and it is followed. However, as
someone who has never dealt with code written that way, reading code that was
written that way would be rather confusing at first.

In any case, I'd argue that having a function _not_ alter its arguments is the
typical default case of functions in general, so assuming that a function
altered in place just because you passed it an array seems odd to me.

- Jonathan M Davis

Jan 21 2011

spir <denis.spir gmail.com> writes:

On 01/21/2011 09:21 PM, Jonathan M Davis wrote:
 The issue is when you don't look at the documentation or trying to avoid having
 to look at the documentation. If you see

 auto result = replace(str, "hello", "goodbye");

 it's quite clear that a copy is taking place. And if a copy/slice is taking
 place, then that is what you would normally see when replace is used. However,
 if replace alters the array in place, then

 replace(str, "hello", "goodbye");

 would be what you would normally see. And without looking at the documentation,
 it's not clear whether that is doing it in-place or if you're throwing away the
 return value. However, in the case where replace does a copy/slice, it _is_
 clear, because the return value is saved.

I don't follow you here. You use in your reasoning the particularity of 
C-like funcs which can be both proper functions and action routines. 
Indeed, as you say, one can throw away a result after calling a routine 
which is mainly a function, but for a side-effect; right. But the same 
applies conversely: one can well call a routine which is mainly an 
action (in this case, that operates in-place) and returns whatever 
outcome flag, so that:
	auto result = replace(str, "hello", "goodbye");
actually operates in-place. Which is consistent with its name, an action 
verb suggesting an action. Replace could eg return the number of 
replacements performed (actually useful, what do you think?) Without 
more information, and guessing from the name, that is precisely what I 
would think (and try to imagine what meta-info replace returns).

Do not misinterpret: I actually support the choice of making return/copy 
the default (where both would make sense), because it's safer. But since 
we are changing many names, why not avoid misleading ones, precisely for 
the default case?


Denis
_________________
vita es estrany
spir.wikidot.com

Jan 21 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/19/11 8:36 PM, so wrote:
 And honestly, from the standpoint of code simplicity and
 understandability,
 there's a lot to be said for making copies being the default rather than
 mutation. You can then use the InPlace versions if you need the boost in
 efficiency.

 - Jonathan M Davis

 Isn't simplicity and understandability favors the in-place style on
 these type of algorithms?
 As Jesse Phillips said, it is same as sort.

We also have toupperInPlace and tolowerInPlace as precedents pointing 
the other way.

Andrei

Jan 19 2011

bearophile <bearophileHUGS lycos.com> writes:

Andrei:

 Problem is, even though the example uses strings, the functions apply to 
 all arrays.

Important general rule: if converting string functions into generic functions
makes them worse string functions, then don't move them to the algorithm
module, or create special string functions for the string module.

Bye,
bearophile

Jan 19 2011

Jerry Quinn <jlquinn optonline.net> writes:

Andrei Alexandrescu Wrote:

 On 1/19/11 6:53 PM, Jonathan M Davis wrote:
 On Wednesday, January 19, 2011 15:33:16 Andrei Alexandrescu wrote:
 I'm consolidating some routines from std.string into std.array. They are
 specialized for operating on arrays, and include the likes of insert,
 remove, replace.

 One question is whether operations should be performed in place or on a
 copy. For example:


 
 So I guess vote stays unchanged :o).
 
 Thoughts?

 Haven't we been using the approach that string operations generally make copies
 (in many cases slices) and marking functions that do it in place with InPlace?

 
 Problem is, even though the example uses strings, the functions apply to 
 all arrays.

The big difference is operating on immutable arrays vs mutable ones.  For
immutable arrays, you  have to do copies.  But mutable ones allow in-place
editing.  If I'm working with mutable arrays of ints, I don't want to have to
type InPlace after every function and I *really* don't want the array to be
copied or efficiency will go down the tubes.

Nor do I want to add Copy to every string operation.  This might be an argument
to leave the string functions where they are.  To a certain extent, strings are
special, even though they really aren't.

Is it too ugly to contemplate algorithms doing in-place operations on mutable
arrays and return a copy instead for immutable ones?

Jan 19 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday 19 January 2011 18:23:14 Jerry Quinn wrote:
 Andrei Alexandrescu Wrote:
 On 1/19/11 6:53 PM, Jonathan M Davis wrote:
 On Wednesday, January 19, 2011 15:33:16 Andrei Alexandrescu wrote:
 I'm consolidating some routines from std.string into std.array. They
 are specialized for operating on arrays, and include the likes of
 insert, remove, replace.
 
 One question is whether operations should be performed in place or on
 a copy. For example:


 So I guess vote stays unchanged :o).
 
 Thoughts?

 
 Haven't we been using the approach that string operations generally
 make copies (in many cases slices) and marking functions that do it in
 place with InPlace?

 
 Problem is, even though the example uses strings, the functions apply to
 all arrays.

 
 The big difference is operating on immutable arrays vs mutable ones.  For
 immutable arrays, you  have to do copies.  But mutable ones allow in-place
 editing.  If I'm working with mutable arrays of ints, I don't want to have
 to type InPlace after every function and I *really* don't want the array
 to be copied or efficiency will go down the tubes.
 
 Nor do I want to add Copy to every string operation.  This might be an
 argument to leave the string functions where they are.  To a certain
 extent, strings are special, even though they really aren't.
 
 Is it too ugly to contemplate algorithms doing in-place operations on
 mutable arrays and return a copy instead for immutable ones?

I'd say that yes, it's too ugly to contemplate. The reason is simple: the
behavior of the function then changes drastically depending on whether the
array you give it is immutable or not.
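
To illustrate the concern, here is a small sketch of what such an overload set could look like. The function name bump is hypothetical and chosen only for the example; nothing like it is being proposed for Phobos here.

```d
// Hypothetical overload for mutable arrays: increments every element
// in place and returns the same slice.
int[] bump(int[] a)
{
    foreach (ref e; a)
        ++e;
    return a;
}

// Hypothetical overload for immutable arrays: cannot touch the input,
// so it works on a mutable duplicate and returns an immutable copy.
immutable(int)[] bump(immutable(int)[] a)
{
    auto r = a.dup;     // mutable int[] copy of the input
    foreach (ref e; r)
        ++e;
    return r.idup;
}

void main()
{
    int[] m = [1, 2, 3];
    immutable(int)[] im = [1, 2, 3];

    auto m2 = bump(m);      // identical-looking call...
    auto im2 = bump(im);    // ...with very different effects

    assert(m == [2, 3, 4]);     // the caller's data was changed
    assert(m2 is m);            // and the result aliases it
    assert(im == [1, 2, 3]);    // the caller's data is untouched
    assert(im2 == [2, 3, 4]);   // the result is a fresh array
}
```

With auto in use, nothing at the call site tells the reader which of the two behaviors they are getting.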

- Jonathan M Davis

Jan 19 2011

Justin Johansson <jj nospam.com> writes:

On 20/01/11 10:33, Andrei Alexandrescu wrote:
 I'm consolidating some routines from std.string into std.array. They are
 specialized for operating on arrays, and include the likes of insert,
 remove, replace.

 One question is whether operations should be performed in place or on a
 copy. For example:

Though your question has already prompted a number of answers, are you 
sure that your question *saliently* poses the problem to be answered?

In short, work on stating the problem as succinctly as you can, rather 
than asking for answers that shoot from the hip.

Cheers,
Justin Johansson

Jan 20 2011

spir <denis.spir gmail.com> writes:

On 01/20/2011 12:33 AM, Andrei Alexandrescu wrote:
 I'm consolidating some routines from std.string into std.array. They are
 specialized for operating on arrays, and include the likes of insert,
 remove, replace.

 One question is whether operations should be performed in place or on a
 copy. For example:

 string s = "Mary has a lil lamb.";
 // Implicit copy
 auto s1 = replace(s, "lil", "li'l");
 assert(s == "Mary has a lil lamb.");
 // Explicit in-place
 replaceInPlace(s, "lil", "li'l");
 assert(s == "Mary has a li'l lamb.");

 So that would make copying the default behavior. Alternatively, we could
 make in-place the default behavior and ask for the Copy suffix:

 string s = "Mary has a lil lamb.";
 // Explicit copy
 auto s1 = replaceCopy(s, "lil", "li'l");
 assert(s == "Mary has a lil lamb.");
 // Implicit in-place
 replace(s, "lil", "li'l");
 assert(s == "Mary has a li'l lamb.");


 Thoughts?

I have thought about these issues (there are several playing together) in
other languages.

The first problem is indeed that both operations may often be useful. If
you define it to operate in-place, then when the user instead wants a
new element, they need to copy first:
	col2 = col1;
	col2.sort();
If instead you define it to create a new element, then conversely when
the user wants it to operate in-place, they need to reassign:
	col1 = col1.sorted;
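
In D terms, the "copy first" route could look like the sketch below. It assumes std.algorithm's sort, which sorts the array it is given in place; note that for D arrays a plain assignment only aliases the same data, so an explicit .dup is needed to get an independent copy.

```d
import std.algorithm : sort;

void main()
{
    int[] col1 = [3, 1, 2];

    // Copy first, then sort the copy in place.
    int[] col2 = col1.dup;
    sort(col2);

    assert(col1 == [3, 1, 2]);  // original order preserved
    assert(col2 == [1, 2, 3]);  // sorted copy

    // Plain assignment would not have copied anything:
    int[] aliasOfCol1 = col1;
    sort(aliasOfCol1);
    assert(col1 == [1, 2, 3]);  // col1 was sorted through the alias
}
```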

The second point is how to hint the user to the actual semantics, and
avoid possibly naughty bugs. It's mainly a question of naming. I have
decided to follow once and for all the below guideline:
* In-place modification is an action, thus its name is an action verb,
like "sort" (indeed, English is very often ambiguous; in such cases, the
verbal sense takes precedence, else add some more words).
* Creating a new element is a function in the pure, math, sense of the word
(not the C sense); name it after what it creates. Usually, a simple adjective
does the job, else add a noun: "sorted", "sortedTable", "sortedList".
* Never mix both action & function in the same routine (except for
signaling errors in a language without any exception system).

It is often worth having both operations; difference of naming makes 
this easy to manage. When having both is overkill, I decided to return a 
new element for methods operating globally, and modify in-place for 
methods operating at the level of element(s). The reason is the first 
ones are usually costly, so it's worth using the safer functional scheme 
(and copying sometimes allows faster algo). While creating a whole new 
collection after any minimal change on element(s) is obviously not very 
efficient.

These questions, as taken implicitly in this thread, mostly concern
collections. Now, the case of strings, chosen as the initial example, is as
always very particular. For this reason I'm not a fan of the policy of
using the same methods as for (other) arrays, except in cases where it's
obvious. D strings are even more particular by having immutable
elements. Well...
My 2 cents.

Denis
_________________
vita es estrany
spir.wikidot.com

Jan 20 2011

"Akakima" <akakima33 gmail.com> writes:

Is it ok to use:

In place:

trim( string )
replace( string, from, to )

or Copy:

trim( string, outstring )
replace( string, from, to, outstring )

Jan 20 2011

foobar <foo bar.com> writes:

Andrei Alexandrescu Wrote:

 I'm consolidating some routines from std.string into std.array. They are 
 specialized for operating on arrays, and include the likes of insert, 
 remove, replace.
 
 One question is whether operations should be performed in place or on a 
 copy. For example:
 
 string s = "Mary has a lil lamb.";
 // Implicit copy
 auto s1 = replace(s, "lil", "li'l");
 assert(s == "Mary has a lil lamb.");
 // Explicit in-place
 replaceInPlace(s, "lil", "li'l");
 assert(s == "Mary has a li'l lamb.");
 
 So that would make copying the default behavior. Alternatively, we could 
 make in-place the default behavior and ask for the Copy suffix:
 
 string s = "Mary has a lil lamb.";
 // Explicit copy
 auto s1 = replaceCopy(s, "lil", "li'l");
 assert(s == "Mary has a lil lamb.");
 // Implicit in-place
 replace(s, "lil", "li'l");
 assert(s == "Mary has a li'l lamb.");
 
 
 Thoughts?
 
 Andrei

Like bearophile and others, I too would prefer the default behavior to be the
functional option and return a copy by default. As already mentioned this
agrees with the immutable d string types.

Regarding the naming scheme we have several options:
1. overload based on immutability. The type system will do the right thing for
you but this may be confusing to read, especially if one uses auto frequently.
2. Use past tense a-la python (sort vs. sorted). This reads more naturally for
native English speakers but has the same issues as English itself (all those
language exceptions such as split). 
3. use "artificial" scheme such as Ruby's bang (sort vs. sort!). This is my
preferred option. Benefits are consistency and that it is easier for non-native
English speakers.
Unfortunately, D doesn't allow '!' in function names. "__InPlace" is clear but
also verbose. Perhaps we could use some other, more terse, notion? 
something like:


My two cents...

Jan 20 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Friday, January 21, 2011 12:48:57 spir wrote:
 On 01/21/2011 09:21 PM, Jonathan M Davis wrote:
 The issue is when you don't look at the documentation or trying to avoid
 having to look at the documentation. If you see
 
 auto result = replace(str, "hello", "goodbye");
 
 it's quite clear that a copy is taking place. And if a copy/slice is
 taking place, then that is what you would normally see when replace is
 used. However, if replace alters the array in place, then
 
 replace(str, "hello", "goodbye");
 
 would be what you would normally see. And without looking at the
 documentation, it's not clear whether that is doing it in-place or if
 you're throwing away the return value. However, in the case where
 replace does a copy/slice, it _is_ clear, because the return value is
 saved.

 
 I don't follow you here. You use in your reasoning the particularity of
 C-like funcs which can be both proper functions and action routines.
 Indeed, as you say, one can throw away a result after calling a routine
 which is mainly a function, but for a side-effect; right. But the same
 applies conversely: one can well call a routine which is mainly an
 action (in this case, that operates in-place) and returns whatever
 outcome flag, so that:
 	auto result = replace(str, "hello", "goodbye");
 actually operates in-place. Which is consistent with its name, an action
 verb suggesting an action. Replace could eg return the number of
 replacements performed (actually useful, what do you think?) Without
 more information, and guessing from the name, that is precisely what I
 would think (and try to imagine what meta-info replace returns).
 
 Do not misinterpret: I actually support the choice of making return/copy
 the default (where both would make sense), because it's safer. But since
 we are changing many names, why not avoid misleading ones, precisely for
 the default case?

Sure, you can always come up with more exotic stuff that the return value could 
do, but I would expect that your average programmer would think that

auto result = replace(str, "hello", "goodbye");

made a copy of the string with "hello" having been replaced with "goodbye" in
the return value rather than in-place. Stuff like returning the number of
replacements made is less typical, and I wouldn't expect that to be what a
programmer would initially expect the function to do. Obviously, you're going
to have to look at the documentation to be sure regardless of what the function
actually does, but in this case, the obvious answer would be the correct one.

I really don't find having functions returning results without altering their 
arguments as the normal case to be odd at all, let alone misleading, since 
that's what most functions actually do. True, it becomes more ambiguous once 
you're dealing with reference types like arrays, and ultimately, you have to 
read the documentation to be sure, but the most typical case is for a function 
to take in a set of arguments and return a result without altering those 
arguments. I see no reason to change that just because you're dealing with a 
reference type. I find replace to be perfectly clear as it is.

- Jonathan M Davis

Jan 21 2011

spir <denis.spir gmail.com> writes:

On 01/21/2011 10:03 PM, Jonathan M Davis wrote:
 I really don't find having functions returning results without altering their
 arguments as the normal case to be odd at all, let alone misleading, since
 that's what most functions actually do.

Same for me. I don't find having this version as the normal case odd at
all, either.
I just find using action-verbs to denote that misleading; eg sort(array)
so-to-say "naturally" means "_sort_ this array", not "gimme a new
_sorted_ array". Many programmers name a function whose (main) purpose is
to construct a new element after said element (not only me, I
copied this practice from others). They are right on this, it's highly
informative and never misleading (except for issues inherent to English).
Then, using action-verbs for action-functions is, in contrast, also
sensible:
	writeReport(reportData);

Denis
_________________
vita es estrany
spir.wikidot.com

Jan 21 2011
