digitalmars.D - in-parameter

spir <denis.spir gmail.com> writes:

Hello,

I'd like to know, aside user-side semantics, whether the compiler uses the
"in" qualifier for efficiency (pass arrays & structs by ref under the hood?).
Well, seems obvious, but there may be some hidden constraint I'm unable to
realise.


Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com

Nov 07 2010

"Daniel Murphy" <yebblies nospamgmail.com> writes:

"spir" <denis.spir gmail.com> wrote in message 
news:mailman.157.1289146124.21107.digitalmars-d puremagic.com...
I'd like to know, aside user-side semantics, whether the compiler uses the 
"in" qualifier for efficiency (pass arrays & structs by ref under the 
hood?). Well, seems obvious, but there may be some hidden constraint I'm 
unable to realise.

The spec states: "The in storage class is equivalent to const scope."

So, no, the compiler never implicitly uses ref to pass in parameters.

It could be possible with a rule like "pass by const ref if param.sizeof > x 
bytes", but I think this would require an abi change.

Nov 08 2010
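A minimal sketch of the distinction discussed above, assuming the by-value behaviour described in the post. The struct `Big` and all function names are hypothetical and exist only for illustration. It shows that a by-value parameter works on a copy, while `ref` (or `ref const` for read-only access) works on the caller's variable.

```d
import std.stdio;

// Hypothetical large struct, used only for illustration.
struct Big
{
    int[1024] data;
}

// `in` parameter: read-only inside the function.
int firstIn(in Big b)
{
    // b.data[0] = 1;  // would not compile: b is const
    return b.data[0];
}

// Plain by-value parameter: the callee mutates its own copy.
void touchByValue(Big b)
{
    b.data[0] = 1;
}

// `ref` parameter: no copy, and writes reach the caller's variable.
void touchByRef(ref Big b)
{
    b.data[0] = 1;
}

// `ref const` parameter: no copy, and the callee cannot write through it.
int firstRefConst(ref const Big b)
{
    return b.data[0];
}

void main()
{
    Big x;                          // data is zero-initialised
    touchByValue(x);
    assert(x.data[0] == 0);         // only the copy was changed
    assert(firstIn(x) == 0);

    touchByRef(x);
    assert(x.data[0] == 1);         // the caller's struct was changed
    assert(firstRefConst(x) == 1);

    writeln("Big.sizeof = ", Big.sizeof);
}
```

If avoiding the copy of a large struct is the goal and the function only reads, `ref const` is the explicit way to ask for it.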

spir <denis.spir gmail.com> writes:

On Mon, 8 Nov 2010 18:13:54 +1000
"Daniel Murphy" <yebblies nospamgmail.com> wrote:

 "spir" <denis.spir gmail.com> wrote in message
 news:mailman.157.1289146124.21107.digitalmars-d puremagic.com...
 I'd like to know, aside user-side semantics, whether the compiler uses the
 "in" qualifier for efficiency (pass arrays & structs by ref under the
 hood?). Well, seems obvious, but there may be some hidden constraint I'm
 unable to realise.

 The spec states: "The in storage class is equivalent to const scope."

 So, no, the compiler never implicitly uses ref to pass in parameters.

 It could be possible with a rule like "pass by const ref if param.sizeof > x
 bytes", but I think this would require an abi change.

Then, if I pass a huge array/string or struct as "in", it is copied, right?
Is the only way to avoid the copy then to pass it as ref?
I take the opportunity to ask about dynamic arrays. There is an internally
pointed element-array, but is the array's interface (the kind of struct
holding ptr & length) itself implicitly referenced?

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com

Nov 08 2010

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday 08 November 2010 00:03:55 spir wrote:
 On Mon, 8 Nov 2010 18:13:54 +1000
 "Daniel Murphy" <yebblies nospamgmail.com> wrote:
 "spir" <denis.spir gmail.com> wrote in message
 news:mailman.157.1289146124.21107.digitalmars-d puremagic.com...
 I'd like to know, aside user-side semantics, whether the compiler uses
 the "in" qualifier for efficiency (pass arrays & structs by ref under
 the hood?). Well, seems obvious, but there may be some hidden constraint
 I'm unable to realise.

 The spec states: "The in storage class is equivalent to const scope."

 So, no, the compiler never implicitly uses ref to pass in parameters.

 It could be possible with a rule like "pass by const ref if param.sizeof
 > x bytes", but I think this would require an abi change.

 Then, if I pass a huge array/string or struct as "in", it is copied, right?
 Is the only way to avoid copy then to pass as ref? I take the opportunity
 to ask about dynamic arrays. There is an internally pointed element-array,
 but is the array's interface (the kind of struct holding ptr & length)
 itself implicitly referenced?

 Denis
 -- -- -- -- -- -- --
 vit esse estrany ☣

 spir.wikidot.com

Dynamic arrays are reference types. A dynamic array holds the pointer to its
internal C-style array and the length of that array. It's essentially a struct.
That struct - being a struct - is copied by value. But since internally, it
holds a pointer, the new dynamic array struct has a pointer to the same data.
So, if you alter the elements of that array, it alters the elements of the array
that was passed in. However, if you alter the array's size, causing it to have to
re-allocate memory, then that array is going to be pointing to a different block
of memory, and it will no longer affect the original array. Dynamic arrays are
shallow-copied when passed to functions, not deep-copied, so you have to use dup
or idup if you want to get a full copy which will no longer affect the original
(or the parameter has to be const or in, so that the function cannot alter the
array at all).

Now, static arrays _do_ get copied when passed as arguments to functions because
they're value types, but dynamic arrays are reference types, so they don't.

- Jonathan M Davis

Nov 08 2010
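The behaviour described in the post above can be sketched as follows. This is an illustrative example only; the function names `setFirst`, `appendOne` and `setFirstStatic` are made up for the purpose. It checks that element writes through a dynamic array parameter reach the caller, that an append inside the callee does not, and that a static array is copied.

```d
import std.stdio;

// Writes through the slice are visible to the caller: both slices
// point at the same elements.
void setFirst(int[] a)
{
    a[0] = 42;
}

// Appending changes only the callee's local ptr/length pair; the
// caller's slice keeps its old length.
void appendOne(int[] a)
{
    a ~= 99;
}

// A static array is a value type, so the callee gets its own copy.
void setFirstStatic(int[3] a)
{
    a[0] = 42;
}

void main()
{
    int[] dyn = [1, 2, 3];
    setFirst(dyn);
    assert(dyn[0] == 42);       // element change is visible
    appendOne(dyn);
    assert(dyn.length == 3);    // length change is not

    int[3] fixed = [1, 2, 3];
    setFirstStatic(fixed);
    assert(fixed[0] == 1);      // only the copy was modified

    int[] copy = dyn.dup;       // independent copy of the elements
    copy[1] = -1;
    assert(dyn[1] == 2);        // the original is unaffected

    writeln(dyn, " ", fixed, " ", copy);
}
```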

Pillsy <pillsbury gmail.com> writes:

Jonathan M Davis Wrote:
[...]
 So, if you alter the elements of that array, it alters the elements of 
 the array that was passed in. However, if you alter the arrays size, 
 causing it to have to re-allocate memory, then that array is going to 
 be pointing to a different block  of memory, and it will no longer 
 affect the original array.

This behavior, IMO, is a real misfeature. The length property of an array
shouldn't be directly mutable, and you shouldn't be able to append onto the end
of a dynamic array, because it can cause some surprising behavior and adds a
lot of cruft to the interface in the form of, well, most of std.array. The
ability to use an array as a replacement for an ArrayList or std::vector
clashes badly with the ability to use arrays as lightweight slices.

Since lightweight slices are such a win[1], and people coming from any of

to use as a list, I think separating the two concepts and moving the flexible
array into a library would be a notable improvement to the language. Sure, it
may surprise Pythonistas, but they'll have a lot to learn anyway, given how
much lower level and more static D is as a language. 

[1] From a "marketing" perspective, they're also a great way to show off how
using a GCed language can actually improve performance and memory use.

Cheers,
Pillsy

Nov 08 2010

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday, November 08, 2010 08:43:20 Pillsy wrote:
 Jonathan M Davis Wrote:
 [...]
 
 So, if you alter the elements of that array, it alters the elements of
 the array that was passed in. However, if you alter the arrays size,
 causing it to have to re-allocate memory, then that array is going to
 be pointing to a different block  of memory, and it will no longer
 affect the original array.

 
 This behavior, IMO, is a real misfeature. The length property of an array
 shouldn't be directly mutable, and you shouldn't be able to append onto
 the end of a dynamic array, because it can cause some surprising behavior
 and adds a lot of cruft to the interface in the form of, well, most of
 std.array. The ability to use an array as a replacement for an
 ArrayList or std::vector clashes badly with the ability to use arrays as
 lightweight slices.
 
 Since lightweight slices are such a win[1], and people coming from any of

 array to use as a list, I think separating the two concepts and moving the
 flexible array into a library would be a notable improvement to the
 language. Sure, it may surprise Pythonistas, but they'll have a lot to
 learn anyway, given how much lower level and more static D is as a
 language.
 
 [1] From a "marketing" perspective, they're also a great way to show off how
 using a GCed language can actually improve performance and memory use.
 
 Cheers,
 Pillsy

Concatenation for arrays is huge, and given that strings are arrays, it's that
much more important. I agree that messing with the size of an array can be
confusing, but in practice, I've never seen it be a problem. If you want to
ensure that an array can't be altered, then either you pass it as const or dup
it. If it has const or immutable elements (such as string does), then you don't
even have to worry about that. As long as you realize that resizing an array
_can_ cause re-allocation, then you don't resize an array that you don't want
to re-allocate. And if you want to _force_ re-allocation, then you simply use
dup or idup.

I really don't think that this is generally an issue: if you want to avoid it,
all you have to do is not resize arrays. And I'd hate to have arrays hamstrung
by disallowing concatenation and the like. I just don't see any real benefit in
trying to do so. What we have works quite well.

- Jonathan M Davis

Nov 08 2010

Pillsy <pillsbury gmail.com> writes:

Jonathan M Davis wrote:

 On Monday, November 08, 2010 08:43:20 Pillsy wrote:

 The length property of an array shouldn't be directly mutable,
 and you shouldn't be able to append onto the end of a dynamic
 array, because it can cause some surprising behavior and adds a 
 lot of cruft to the interface in the form of, well, most of
 std.array. The ability to use an array as a replacement for an
 ArrayList or std::vector clashes badly with the ability to use 
 arrays as lightweight slices.


[...]
 Concatenation for arrays is huge, and given that strings are arrays, 
 it's that much more important.

Well, there's catenating arrays and producing a new (and freshly allocated)
array, and there's catenating arrays *in place* to reduce 
the allocation overhead. That's not useless by any means, but I think
it can be handled by container classes or the moral equivalent of something
like StringBuilder. Since D has operator overloading, you can 
even continue to use the same pleasant syntax.
 
Besides, isn't catenating or appending in place impossible with D's (immutable)
strings anyway?
[...]
 I just don't see any real benefit in trying to do so. What we have 
 works quite well.

The way mutable arrays may or may not share structure with each other in ways
that are hard to predict gives me the screaming willies, and I think container
and "builder" classes are entirely sufficient for covering the other use cases.
In any event, I'm pretty certain that even if there were wide-spread support
for this change, it would have to wait for the largely-hypothetical D3.

Cheers,
Pillsy

Nov 08 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 08 Nov 2010 13:46:52 -0500, Pillsy <pillsbury gmail.com> wrote:

 Besides, isn't catenating or appending in place impossible with D's  
 (immutable) strings anyway?

This is a misconception: a string is not immutable; the data it points to
is immutable. You can append to a string just like a mutable array.

-Steve

Nov 08 2010

Pillsy <pillsbury gmail.com> writes:

Steven Schveighoffer Wrote:

 On Mon, 08 Nov 2010 13:46:52 -0500, Pillsy <pillsbury gmail.com> 
 wrote:

 Besides, isn't catenating or appending in place impossible with D's  
 (immutable) strings anyway?


 This is a misconception, a string is not immutable, the data it points 
 to is immutable.  You can append to a string just like a mutable array.

So, wait, if I have a program like this:

void appendSailor (string s) {
   s ~= "Sailor";
}

void main () {
   auto s = "Hello World!";

   appendSailor(s[0 .. 6]);

   writefln(s);
}

I should expect to get "Hello Sailor" as output? Or is it just that a new array
of characters will be allocated and that will be appended into, so
`appendSailor()` becomes a slightly expensive no-op?

The former behavior would be really horrible, while the latter behavior doesn't
seem to provide an overwhelming advantage over not allowing append-in-place for
arrays.

Cheers,
Pillsy

Nov 08 2010

Pillsy <pillsbury gmail.com> writes:

Pillsy Wrote:
[...]
 So, wait, if I have a program like this:

 
 void appendSailor (string s) {
    s ~= "Sailor";
 }

 
 void main () {
    auto s = "Hello World!";
 
    appendSailor(s[0 .. 6]);
 
    writefln(s);
 }

 
 I should expect to get "Hello Sailor" as output? Or is it just that a 
 new array of characters will be allocated and that will be appended 
 into, so `appendSailor()` becomes a slightly expensive no-op?

No, wait, I'm a moron.

Having

s = "foo"
s ~= "bar"

mean that `s` now holds "foobar" is obviously pretty useful. But as useful as
it is, I assume it doesn't mean manipulating the length freely, which is what
concerns me. Since strings are immutable arrays, the question of what structure
is being shared is mostly academic, and when it's not academic, it's a
performance issue.

Cheers,
Pillsy

Nov 08 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 08 Nov 2010 14:19:47 -0500, Pillsy <pillsbury gmail.com> wrote:

 Steven Schveighoffer Wrote:

 On Mon, 08 Nov 2010 13:46:52 -0500, Pillsy <pillsbury gmail.com>
 wrote:

 Besides, isn't catenating or appending in place impossible with D's
 (immutable) strings anyway?


 This is a misconception, a string is not immutable, the data it points
 to is immutable.  You can append to a string just like a mutable array.

 So, wait, if I have a program like this:

 void appendSailor (string s) {
    s ~= "Sailor";
 }

 void main () {
    auto s = "Hello World!";

    appendSailor(s[0 .. 6]);

    writefln(s);
 }

 I should expect to get "Hello Sailor" as output? Or is it just that a  
 new array of characters will be allocated and that will be appended  
 into, so `appendSailor()` becomes a slightly expensive no-op?

The latter. appendSailor does nothing of significance since it throws away  
its result.

 The former behavior would be really horrible, while the latter behavior  
 doesn't seem to provide an overwhelming advantage over not allowing  
 append-in-place for arrays.

It provides huge value, because given a string, I don't have to care where  
it came from or who built it, I know I can just append to it and the  
runtime takes care of the details.  When it can be optimized, it will be.   
Less things to care about = less things to prove.  By disallowing  
appending, you are wasting huge amounts of possible optimizations (didn't  
follow this thread fully, so I'm not sure exactly what you proposed).

For instance, if you had:

string appendSailor (string s) {
    s ~= "Sailor";
    return s;
}

void main()
{
    auto s = "Hello World!";
    auto s2 = "Hello ";
    auto s3 = s2.idup;
    auto s4 = s.idup;

    s = appendSailor(s[0..6]);
    s2 = appendSailor(s2);
    s3 = appendSailor(s3);
    s4 = appendSailor(s4[0..6]);
}

All four cases are valid. The only one that "extends into existing memory"
is s3, but it's the only one where you *could* extend into existing
memory. s and s2 are in ROM, so you can't extend there, and extending
s4[0..6] would overwrite immutable data, so that is illegal. So
basically, appending optimizes where it possibly can, and everywhere
else it does the right thing to complete the operation.

So without knowing how appendSailor is called, I can judge it just by  
looking at appendSailor, and reason that it will always return a  
consistent result, with no memory violations.  In addition, I need not  
specify any extra requirements for appendSailor like "only call this with  
newly allocated memory!".  It just works, the same way, every time.

This of course is only true for immutable or const.  For mutable data,  
there is the possibility that a function may create a confusing situation,  
but it's very rare for this to happen, since most code is either concerned  
with appending data or modifying data, but not both.

-Steve

Nov 08 2010
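A small sketch of the point made in the post above, reusing its `appendSailor` function. The rest of the example (the variable names and the mutable `int[]` case) is added here for illustration and is not from the thread. It checks that appending to a slice never overwrites data that other slices can still see. Whether a freshly allocated array is extended in place depends on the runtime, so that part is only printed, not asserted.

```d
import std.stdio;

string appendSailor(string s)
{
    s ~= "Sailor";
    return s;
}

void main()
{
    string s = "Hello World!";
    string r = appendSailor(s[0 .. 6]);
    assert(r == "Hello Sailor");
    assert(s == "Hello World!");    // the original text was not overwritten

    // Same idea with mutable data: appending to a slice that ends in the
    // middle of a larger array reallocates instead of overwriting a[2].
    int[] a = [1, 2, 3, 4];
    int[] b = a[0 .. 2];
    b ~= 9;
    assert(a == [1, 2, 3, 4]);
    assert(b == [1, 2, 9]);
    assert(b.ptr !is a.ptr);        // b now lives in a different block

    // A freshly allocated array may be extended in place; this is an
    // optimisation, so the result is reported rather than asserted.
    string fresh = "Hello ".idup;
    auto before = fresh.ptr;
    fresh = appendSailor(fresh);
    writeln("extended in place: ", fresh.ptr is before);
}
```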

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday, November 08, 2010 10:46:52 Pillsy wrote:
 Jonathan M Davis wrote:
 On Monday, November 08, 2010 08:43:20 Pillsy wrote:
 The length property of an array shouldn't be directly mutable,
 and you shouldn't be able to append onto the end of a dynamic
 array, because it can cause some surprising behavior and adds a
 lot of cruft to the interface in the form of, well, most of
 std.array. The ability to use an array as a replacement for an
 ArrayList or std::vector clashes badly with the ability to use
 arrays as lightweight slices.


 
 [...]
 
 Concatenation for arrays is huge, and given that strings are arrays,
 it's that much more important.

 
 Well, there's catenating arrays and producing a new (and freshly allocated)
 array, and there's catenating arrays *in place* to reduce the allocation
 overhead. That's not useless by any means, but I think it can be handled
 by container classes or the moral equivalent of something like
 StringBuilder. Since D has operator overloading, you can even continue to
 use the same pleasant syntax.
 
 Besides, isn't catenating or appending in place impossible with D's
 (immutable) strings anyway? [...]

It's perfectly legal. You just can't alter any of the elements in the array.
Appending or concatenating to an array alters the array, not its elements. And
in a string, it's its elements which are immutable, not the array itself.

 I just don't see any real benefit in trying to do so. What we have
 works quite well.

 
 The way mutable arrays may or may not share structure with each other in
 ways that are hard to predict gives me the screaming willies, and I think
 container and "builder" classes are entirely sufficient for covering the
 other use cases. In any event, I'm pretty certain that even if there were
 wide-spread support for this change, it would have to wait for the
 largely-hypothetical D3.

It's certainly true that if you want to be altering mutable arrays and you want 
guarantees that two arrays don't refer to the same memory, you're going to have 
to take extra steps - such as using dup. However, a container class such as 
Array isn't necessarily going to help with that. The main change there would be 
that a range over the container and the container would no longer be the same 
type, but that can be detrimental as well depending on what you're doing.

A container solution would result in extraneous dups in many situations and 
could be detrimental to performance. The current situation does not disallow a 
container solution, but it doesn't force it either, so you're free to use 
containers such as Array, if you'd prefer.

Really, I think that using dynamic arrays safely comes down to four rules:

1. Don't resize an array if you want a guarantee that any references to it or a 
portion of it continue to point to the same memory.

2. dup or idup an array if you want a guarantee that it does not point to the
same memory as any other array.

3. If you don't care whether two arrays point to the same memory or not, then 
feel free to resize them. They may or may not end up pointing to the same 
memory, but since you don't care if they do, it doesn't matter.

4. If you want guarantees that the elements of an array aren't altered but
don't want any reallocations to take place if they don't need to, then use
arrays with const or immutable elements. That way, the arrays themselves can be
messed with as much as you like, but the elements won't be changed, and so it
doesn't matter one whit whether two arrays point to the same memory.

True, the situation is not as straightforward as it would be if resizing arrays
wasn't legal, but it would sure be limiting to disallow the resizing of arrays
(I think that it's a great feature and a definite improvement over C). It
really isn't all that hard to avoid problems with it, and of course, there's
nothing stopping anyone from using library solutions instead if they'd prefer.

- Jonathan M Davis

Nov 08 2010
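The rules in the post above can be illustrated with a short sketch. The functions `sum` and `doubled` are hypothetical names chosen for this example. It shows const elements protecting shared data, dup producing an independent copy, and a resize breaking the link between two slices.

```d
import std.stdio;

// Const elements: the callee cannot write through the slice, so it does
// not matter whether the memory is shared with the caller.
int sum(const(int)[] values)
{
    int total = 0;
    foreach (v; values)
        total += v;
    return total;
}

// dup before mutating: the result never aliases the argument.
int[] doubled(const(int)[] values)
{
    int[] result = values.dup;
    foreach (ref v; result)
        v *= 2;
    return result;
}

void main()
{
    int[] data = [1, 2, 3];
    int[] view = data[0 .. 2];      // shares memory with data
    assert(view.ptr is data.ptr);

    int[] twice = doubled(data);
    assert(twice == [2, 4, 6]);
    assert(data == [1, 2, 3]);      // untouched, thanks to dup
    assert(sum(data) == 6);

    // After a resize, view no longer overlaps data here: appending to a
    // slice that ends inside a larger array forces a reallocation.
    view ~= 10;
    assert(data == [1, 2, 3]);
    assert(view == [1, 2, 10]);

    writeln(data, " ", view, " ", twice);
}
```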
