digitalmars.D - in-parameter

reply spir <denis.spir gmail.com> writes:
Hello,

I'd like to know, aside user-side semantics, whether the compiler uses the
"in" qualifier for efficiency (pass arrays & structs by ref under the hood?).
Well, seems obvious, but there may be some hidden constraint I'm unable to
realise.


Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
Nov 07 2010
parent reply "Daniel Murphy" <yebblies nospamgmail.com> writes:
"spir" <denis.spir gmail.com> wrote in message 
news:mailman.157.1289146124.21107.digitalmars-d puremagic.com...
I'd like to know, aside user-side semantics, whether the compiler uses the 
"in" qualifier for efficiency (pass arrays & structs by ref under the 
hood?). Well, seems obvious, but there may be some hidden constraint I'm 
unable to realise.

The spec states: "The in storage class is equivalent to const scope."

So, no, the compiler never implicitly uses ref to pass in parameters.

It could be possible with a rule like "pass by const ref if param.sizeof > x 
bytes", but I think this would require an abi change. 
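
A minimal sketch of the difference between a by-value and a ref parameter, assuming a current D2 compiler with default flags. The struct `Big` and both function names are made up for this illustration. Since the spec quote above equates `in` with `const scope`, the by-value case is written with plain `const` here.

```d
// Hypothetical struct and function names, chosen only for this illustration.
struct Big
{
    int[256] payload; // 1 KiB of data stored inline
}

// By-value parameter: the callee receives its own copy of the struct.
bool sharesStorageByValue(const Big b, const(Big)* original)
{
    return &b is original;
}

// ref parameter: the callee works on the caller's object, no copy is made.
bool sharesStorageByRef(ref const Big b, const(Big)* original)
{
    return &b is original;
}

void main()
{
    Big big;
    assert(!sharesStorageByValue(big, &big)); // a copy was passed
    assert(sharesStorageByRef(big, &big));    // the original was passed
}
```

So for a large struct, `ref` (or `ref const`) is the explicit way to avoid the copy.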
Nov 08 2010
next sibling parent spir <denis.spir gmail.com> writes:
On Mon, 8 Nov 2010 18:13:54 +1000
"Daniel Murphy" <yebblies nospamgmail.com> wrote:

 "spir" <denis.spir gmail.com> wrote in message
 news:mailman.157.1289146124.21107.digitalmars-d puremagic.com...
 I'd like to know, aside user-side semantics, whether the compiler uses the
 "in" qualifier for efficiency (pass arrays & structs by ref under the
 hood?). Well, seems obvious, but there may be some hidden constraint I'm
 unable to realise.

 The spec states: "The in storage class is equivalent to const scope."

 So, no, the compiler never implicitly uses ref to pass in parameters.

 It could be possible with a rule like "pass by const ref if param.sizeof > x
 bytes", but I think this would require an abi change.

Then, if I pass a huge array/string or struct as "in", it is copied, right? Is the only way to avoid copy then to pass as ref?

I take the opportunity to ask about dynamic arrays. There is an internally pointed element-array, but is the array's interface (the kind of struct holding ptr & length) itself implicitly referenced?

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
Nov 08 2010
prev sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday 08 November 2010 00:03:55 spir wrote:
 On Mon, 8 Nov 2010 18:13:54 +1000

 "Daniel Murphy" <yebblies nospamgmail.com> wrote:
 "spir" <denis.spir gmail.com> wrote in message
 news:mailman.157.1289146124.21107.digitalmars-d puremagic.com...
 I'd like to know, aside user-side semantics, whether the compiler uses
 the "in" qualifier for efficiency (pass arrays & structs by ref under
 the hood?). Well, seems obvious, but there may be some hidden constraint
 I'm unable to realise.

 The spec states: "The in storage class is equivalent to const scope."

 So, no, the compiler never implicitly uses ref to pass in parameters.

 It could be possible with a rule like "pass by const ref if param.sizeof
 > x bytes", but I think this would require an abi change.

 Then, if I pass a huge array/string or struct as "in", it is copied, right?
 Is the only way to avoid copy then to pass as ref? I take the opportunity
 to ask about dynamic arrays. There is an internally pointed element-array,
 but is the array's interface (the kind of struct holding ptr & length)
 itself implicitly referenced?

 Denis
 -- -- -- -- -- -- --
 vit esse estrany ☣

 spir.wikidot.com
Dynamic arrays are reference types. A dynamic array holds the pointer to its internal C-style array and the length of that array. It's essentially a struct. That struct - being a struct - is copied by value. But since internally, it holds a pointer, the new dynamic array struct has a pointer to the same data. So, if you alter the elements of that array, it alters the elements of the array that was passed in. However, if you alter the array's size, causing it to have to re-allocate memory, then that array is going to be pointing to a different block of memory, and it will no longer affect the original array.

Dynamic arrays are shallow-copied when passed to functions, not deep-copied, so you have to use dup or idup if you want to get a full copy which will no longer affect the original (or the parameter has to be const or in, so that the function cannot alter the array at all).

Now, static arrays _do_ get copied when passed as arguments to functions because they're value types, but dynamic arrays are reference types, so they don't.

- Jonathan M Davis
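
The behavior described above can be sketched as follows, assuming a current D2 compiler. The helper names `setFirst`, `appendOne` and `setFirstOnCopy` are made up for this illustration.

```d
// Hypothetical helper names, used only for this illustration.

// Writes through the slice's pointer, so the caller's elements change.
void setFirst(int[] a)
{
    a[0] = 42;
}

// Appending changes only the callee's copy of the (pointer, length) pair.
void appendOne(int[] a)
{
    a ~= 99;
}

// .dup allocates a fresh copy, so the caller's elements are untouched.
void setFirstOnCopy(int[] a)
{
    auto copy = a.dup;
    copy[0] = -1;
}

void main()
{
    int[] arr = [1, 2, 3];

    setFirst(arr);
    assert(arr == [42, 2, 3]);   // element write is visible to the caller

    appendOne(arr);
    assert(arr.length == 3);     // caller's length is unchanged

    setFirstOnCopy(arr);
    assert(arr == [42, 2, 3]);   // the dup'ed copy was modified, not arr

    int[3] fixed = [1, 2, 3];
    auto other = fixed;          // static arrays are value types: full copy
    other[0] = 7;
    assert(fixed[0] == 1);
}
```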
Nov 08 2010
parent reply Pillsy <pillsbury gmail.com> writes:
Jonathan M Davis Wrote:
[...]
 So, if you alter the elements of that array, it alters the elements of 
 the array that was passed in. However, if you alter the arrays size, 
 causing it to have to re-allocate memory, then that array is going to 
 be pointing to a different block  of memory, and it will no longer 
 affect the original array.
This behavior, IMO, is a real misfeature. The length property of an array shouldn't be directly mutable, and you shouldn't be able to append onto the end of a dynamic array, because it can cause some surprising behavior and adds a lot of cruft to the interface in the form of, well, most of std.array. The ability to use an array as a replacement for an ArrayList or std::vector clashes badly with the ability to use arrays as lightweight slices.

Since lightweight slices are such a win[1], and people coming from any of [...] to use as a list, I think separating the two concepts and moving the flexible array into a library would be a notable improvement to the language. Sure, it may surprise Pythonistas, but they'll have a lot to learn anyway, given how much lower level and more static D is as a language.

[1] From a "marketing" perspective, they're also a great way to show off how using a GCed language can actually improve performance and memory use.

Cheers,
Pillsy
Nov 08 2010
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, November 08, 2010 08:43:20 Pillsy wrote:
 Jonathan M Davis Wrote:
 [...]
 
 So, if you alter the elements of that array, it alters the elements of
 the array that was passed in. However, if you alter the arrays size,
 causing it to have to re-allocate memory, then that array is going to
 be pointing to a different block  of memory, and it will no longer
 affect the original array.
 This behavior, IMO, is a real misfeature. The length property of an array
 shouldn't be directly mutable, and you shouldn't be able to append onto the
 end of a dynamic array, because it can cause some surprising behavior and
 adds a lot of cruft to the interface in the form of, well, most of
 std.array. The ability to use an array as a replacement for an ArrayList or
 std::vector clashes badly with the ability to use arrays as lightweight
 slices.

 Since lightweight slices are such a win[1], and people coming from any of
 [...] array to use as a list, I think separating the two concepts and moving
 the flexible array into a library would be a notable improvement to the
 language. Sure, it may surprise Pythonistas, but they'll have a lot to learn
 anyway, given how much lower level and more static D is as a language.

 [1] From a "marketing" perspective, they're also a great way to show off how
 using a GCed language can actually improve performance and memory use.

 Cheers,
 Pillsy

Concatenation for arrays is huge, and given that strings are arrays, it's that much more important. I agree that messing with the size of an array can be confusing, but in practice, I've never seen it be a problem. If you want to ensure that an array can't be altered, then either you pass it as const or dup it. If it has const or immutable elements (such as string does), then you don't even have to worry about that.

As long as you realize that resizing an array _can_ cause re-allocation, then you don't resize an array that you don't want to re-allocate. And if you want to _force_ re-allocation, then you simply use dup or idup.

I really don't think that this is generally an issue, and if you want to avoid it, all you have to do is just not resize arrays, so I really don't think that this is much of a problem in reality. And I'd hate to have arrays hamstrung by disallowing concatenation and the like. I just don't see any real benefit in trying to do so. What we have works quite well.

- Jonathan M Davis
Nov 08 2010
parent reply Pillsy <pillsbury gmail.com> writes:
Jonathan M Davis wrote:

 On Monday, November 08, 2010 08:43:20 Pillsy wrote:
 The length property of an array shouldn't be directly mutable,
 and you shouldn't be able to append onto the end of a dynamic
 array, because it can cause some surprising behavior and adds a 
 lot of cruft to the interface in the form of, well, most of
 std.array. The ability to use an an array as a replacement for an
 ArrayList or std::vector clashes badly with the ability to use 
 arrays as lightweight slices.
[...]
 Concatenation for arrays is huge, and given that strings are arrays, 
 it's that much more important.
Well, there's catenating arrays and producing a new (and freshly allocated) array, and there's catenating arrays *in place* to reduce the allocation overhead. That's not useless by any means, but I think it can be handled by container classes or the moral equivalent of something like StringBuilder. Since D has operator overloading, you can even continue to use the same pleasant syntax.

Besides, isn't catenating or appending in place impossible with D's (immutable) strings anyway?

[...]
 I just don't see any real benefit in trying to do so. What we have 
 works quite well.
The way mutable arrays may or may not share structure with each other in ways that are hard to predict gives me the screaming willies, and I think container and "builder" classes are entirely sufficient for covering the other use cases.

In any event, I'm pretty certain that even if there were wide-spread support for this change, it would have to wait for the largely-hypothetical D3.

Cheers,
Pillsy
Nov 08 2010
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 08 Nov 2010 13:46:52 -0500, Pillsy <pillsbury gmail.com> wrote:

 Besides, isn't catenating or appending in place impossible with D's  
 (immutable) strings anyway?
This is a misconception: a string is not immutable; the data it points to is immutable. You can append to a string just like a mutable array.

-Steve
Nov 08 2010
parent reply Pillsy <pillsbury gmail.com> writes:
Steven Schveighoffer Wrote:

 On Mon, 08 Nov 2010 13:46:52 -0500, Pillsy <pillsbury gmail.com> 
 wrote:
 Besides, isn't catenating or appending in place impossible with D's  
 (immutable) strings anyway?
 This is a misconception, a string is not immutable, the data it points 
 to is immutable.  You can append to a string just like a mutable array.
So, wait, if I have a program like this:

import std.stdio;

void appendSailor (string s) {
   s ~= "Sailor";
}

void main () {
   auto s = "Hello World!";

   appendSailor(s[0 .. 6]);

   writefln(s);
}

I should expect to get "Hello Sailor" as output? Or is it just that a new array of characters will be allocated and that will be appended into, so `appendSailor()` becomes a slightly expensive no-op?

The former behavior would be really horrible, while the latter behavior doesn't seem to provide an overwhelming advantage over not allowing append-in-place for arrays.

Cheers,
Pillsy
Nov 08 2010
next sibling parent Pillsy <pillsbury gmail.com> writes:
Pillsy Wrote:
[...]
 So, wait, if I have a program like this:
 import std.stdio;
 
 void appendSailor (string s) {
    s ~= "Sailor";
 }
 void main () {
    auto s = "Hello World!";
 
    appendSailor(s[0 .. 6]);
 
    writefln(s);
 }
 I should expect to get "Hello Sailor" as output? Or is it just that a 
 new array of characters will be allocated and that will be appended 
 into, so `appendSailor()` becomes a slightly expensive no-op?
No, wait, I'm a moron. Having

s = "foo"
s ~= "bar"

mean that `s` now holds "foobar" is obviously pretty useful. But as useful as it is, I assume it doesn't mean manipulating the length freely, which is what concerns me. Since strings are immutable arrays, the question of what structure is being shared is mostly academic, and when it's not academic, it's a performance issue.

Cheers,
Pillsy
Nov 08 2010
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 08 Nov 2010 14:19:47 -0500, Pillsy <pillsbury gmail.com> wrote:

 Steven Schveighoffer Wrote:

 On Mon, 08 Nov 2010 13:46:52 -0500, Pillsy <pillsbury gmail.com>
 wrote:
 Besides, isn't catenating or appending in place impossible with D's
 (immutable) strings anyway?
 This is a misconception, a string is not immutable, the data it points
 to is immutable.  You can append to a string just like a mutable array.
 So, wait, if I have a program like this:
 
 import std.stdio;
 
 void appendSailor (string s) {
    s ~= "Sailor";
 }
 
 void main () {
    auto s = "Hello World!";
 
    appendSailor(s[0 .. 6]);
 
    writefln(s);
 }
 
 I should expect to get "Hello Sailor" as output? Or is it just that a
 new array of characters will be allocated and that will be appended
 into, so `appendSailor()` becomes a slightly expensive no-op?

The latter. appendSailor does nothing of significance since it throws away its result.
 The former behavior would be really horrible, while the latter behavior  
 doesn't seem to provide an overwhelming advantage over not allowing  
 append-in-place for arrays.
It provides huge value, because given a string, I don't have to care where it came from or who built it, I know I can just append to it and the runtime takes care of the details. When it can be optimized, it will be. Less things to care about = less things to prove.

By disallowing appending, you are wasting huge amounts of possible optimizations (didn't follow this thread fully, so I'm not sure exactly what you proposed).

For instance, if you had:

string appendSailor (string s) {
   s ~= "Sailor";
   return s;
}

void main() {
   auto s = "Hello World!";
   auto s2 = "Hello ";
   auto s3 = s2.idup;
   auto s4 = s.idup;

   s = appendSailor(s[0..6]);
   s2 = appendSailor(s2);
   s3 = appendSailor(s3);
   s4 = appendSailor(s4[0..6]);
}

all four cases are valid, the only one that "extends into existing memory" is s3, but it's the only one where you *could* extend into existing memory. s and s2 are in ROM, so you can't extend there, extending s4[0..6] would overwrite immutable data, so that is illegal.

So basically, appending optimizes where it possibly can optimize, and everywhere else it does the right thing to complete the operation. So without knowing how appendSailor is called, I can judge it just by looking at appendSailor, and reason that it will always return a consistent result, with no memory violations. In addition, I need not specify any extra requirements for appendSailor like "only call this with newly allocated memory!". It just works, the same way, every time.

This of course is only true for immutable or const. For mutable data, there is the possibility that a function may create a confusing situation, but it's very rare for this to happen, since most code is either concerned with appending data or modifying data, but not both.

-Steve
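
As a minimal sketch of the guarantee described above, assuming a current D2 compiler and runtime: appending to a slice of a longer immutable string yields the expected result and leaves the original untouched. Whether a given append extends in place is left to the runtime, so the sketch does not assert on it.

```d
string appendSailor(string s)
{
    s ~= "Sailor";
    return s;
}

void main()
{
    string s = "Hello World!";
    string r = appendSailor(s[0 .. 6]);

    assert(r == "Hello Sailor");
    assert(s == "Hello World!"); // the immutable original is not overwritten

    // Same check on heap-allocated data (the s4 case in the post above).
    string s4 = s.idup;
    string r4 = appendSailor(s4[0 .. 6]);
    assert(r4 == "Hello Sailor");
    assert(s4 == "Hello World!");
}
```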
Nov 08 2010
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, November 08, 2010 10:46:52 Pillsy wrote:
 Jonathan M Davis wrote:
 On Monday, November 08, 2010 08:43:20 Pillsy wrote:
 The length property of an array shouldn't be directly mutable,
 and you shouldn't be able to append onto the end of a dynamic
 array, because it can cause some surprising behavior and adds a
 lot of cruft to the interface in the form of, well, most of
 std.array. The ability to use an an array as a replacement for an
 ArrayList or std::vector clashes badly with the ability to use
 arrays as lightweight slices.
[...]
 Concatenation for arrays is huge, and given that strings are arrays,
 it's that much more important.
 Well, there's catenating arrays and producing a new (and freshly allocated)
 array, and there's catenating arrays *in place* to reduce the allocation
 overhead. That's not useless by any means, but I think it can be handled by
 container classes or the moral equivalent of something like StringBuilder.
 Since D has operator overloading, you can even continue to use the same
 pleasant syntax.

 Besides, isn't catenating or appending in place impossible with D's
 (immutable) strings anyway?

 [...]

It's perfectly legal. You just can't alter any of the elements in the array. Appending or concatenating to an array alters the array, not its elements. And in a string, it's its elements which are immutable, not the array itself.
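
A short sketch of that distinction, assuming a current D2 compiler. `__traits(compiles, ...)` is used only to show that the element write is rejected at compile time.

```d
void main()
{
    string s = "foo";

    s ~= "bar";                // legal: the slice itself is mutable
    assert(s == "foobar");

    s = s[0 .. 3];             // legal: reslicing changes only the slice
    assert(s == "foo");

    // Illegal: the elements are immutable(char), so this does not compile.
    static assert(!__traits(compiles, s[0] = 'F'));

    // string is an alias for immutable(char)[].
    static assert(is(string == immutable(char)[]));
}
```
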
 I just don't see any real benefit in trying to do so. What we have
 works quite well.
 The way mutable arrays may or may not share structure with each other in
 ways that are hard to predict gives me the screaming willies, and I think
 container and "builder" classes are entirely sufficient for covering the
 other use cases.

 In any event, I'm pretty certain that even if there were wide-spread support
 for this change, it would have to wait for the largely-hypothetical D3.

It's certainly true that if you want to be altering mutable arrays and you want guarantees that two arrays don't refer to the same memory, you're going to have to take extra steps - such as using dup. However, a container class such as Array isn't necessarily going to help with that. The main change there would be that a range over the container and the container would no longer be the same type, but that can be detrimental as well depending on what you're doing. A container solution would result in extraneous dups in many situations and could be detrimental to performance. The current situation does not disallow a container solution, but it doesn't force it either, so you're free to use containers such as Array, if you'd prefer.

Really, I think that using dynamic arrays safely comes down to four rules:

1. Don't resize an array if you want a guarantee that any references to it or a portion of it continue to point to the same memory.

2. dup or idup an array if you want a guarantee that it does not point to the same memory as any other array.

3. If you don't care whether two arrays point to the same memory or not, then feel free to resize them. They may or may not end up pointing to the same memory, but since you don't care if they do, it doesn't matter.

4. If you want guarantees that the elements of an array aren't altered but don't want any reallocations to take place if they don't need to, then use arrays with const or immutable elements. That way, the arrays themselves can be messed with as much as you like, but the elements won't be changed, and so it doesn't matter one whit whether two arrays point to the same memory.

True, the situation is not as straightforward as it would be if resizing arrays wasn't legal, but it would sure be limiting to disallow the resizing of arrays (I think that it's a great feature and a definite improvement over C).

It really isn't all that hard to avoid problems with it, and of course, there's nothing stopping anyone from using library solutions instead if they'd prefer.

- Jonathan M Davis
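
The rules above can be sketched as follows, assuming a current D2 compiler and runtime. The variable names are made up for this illustration, and the sketch only asserts on behavior that holds regardless of whether a particular append happens in place.

```d
void main()
{
    int[] a = [1, 2, 3];

    // dup gives a copy that is guaranteed not to share memory with a.
    int[] b = a.dup;
    b[0] = 100;
    assert(a[0] == 1);
    assert(a.ptr !is b.ptr);

    // Plain slicing shares memory: writes through one are seen by the other.
    int[] c = a[0 .. 2];
    c[1] = 20;
    assert(a[1] == 20);

    // After a resize, sharing is no longer guaranteed either way, so code
    // must not depend on it. Appending to c here must not clobber a[2].
    c ~= 30;
    assert(a == [1, 20, 3]);

    // With const elements, sharing is harmless because writes through the
    // view are rejected at compile time.
    const(int)[] view = a;
    static assert(!__traits(compiles, view[0] = 5));
    view ~= 4; // resizing the view itself is still allowed
    assert(view.length == 4 && a.length == 3);
}
```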
Nov 08 2010