
digitalmars.D.learn - "Best" way of passing in a big struct to a function?

reply "Val Markovic" <val markovic.io> writes:
TL;DR: what should I use if I need C++'s const& for a param?

Long version: I have a big struct provided by a library and I'm 
trying to pass instances of it to a function that will never need 
to modify the passed in value. Naturally I want to pass it 
efficiently, without incurring a copy. I know that I can use 
"const ref" in D, but is this the preferred way of doing it? How 
about "in ref"? Or something else?

Related background: I'm a D newbie, I've read TDPL and I loved it 
and I'm now working on a Markdown processor as a D learning 
exercise. I've written hundreds of thousands of C++ LOC and this 
is the perspective from which I look at D (and I love what I see).
Oct 09 2012
next sibling parent reply "Val Markovic" <val markovic.io> writes:
Oh, and a related question: what is the best way to pass in an 
associative array like CustomStruct[string]? I can't say I'm too 
clear on how AA's are managed/implemented. Do they have value 
semantics or reference semantics? What about lists?
Oct 09 2012
next sibling parent reply "Val Markovic" <val markovic.io> writes:
On Wednesday, 10 October 2012 at 04:55:48 UTC, Val Markovic wrote:
 Oh, and a related question: what is the best way to pass in an 
 associative array like CustomStruct[string]? I can't say I'm 
 too clear on how AA's are managed/implemented. Do they have 
 value semantics or reference semantics? What about lists?
Ok, feel free to disregard this question; I just checked TDPL (should have done that first) and it clearly says that AA's follow reference semantics. Dynamic arrays passed to functions actually pass in a light-weight object referring to the same underlying data (I'm guessing that a dynamic array is internally "nothing more" than a struct holding a pointer and a length, right?).
Oct 09 2012
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, October 10, 2012 06:39:51 Val Markovic wrote:
 On Wednesday, 10 October 2012 at 04:55:48 UTC, Val Markovic wrote:
 Oh, and a related question: what is the best way to pass in an
 associative array like CustomStruct[string]? I can't say I'm
 too clear on how AA's are managed/implemented. Do they have
 value semantics or reference semantics? What about lists?
 Ok, feel free to disregard this question; I just checked TDPL (should have done that first) and it clearly says that AA's follow reference semantics. Dynamic arrays passed to functions actually pass in a light-weight object referring to the same underlying data (I'm guessing that a dynamic array is internally "nothing more" than a struct holding a pointer and a length, right?).
A dynamic array is effectively

    struct DynamicArray(T) { T* ptr; size_t length; }

So, they're sort of reference types, sort of not. Passing a dynamic array by value will slice it, allowing the elements to still be mutated (because both slices point to the same memory) if they're mutable. But if you alter the array itself, it won't alter the original, and if you alter it enough, it could end up copying the array so that it's not a slice of the original anymore (e.g. appending could require reallocating the array to make room for the new elements, thereby changing the ptr value from what it was originally). You should read this article:

http://dlang.org/d-array-article.html

Associative arrays, on the other hand, are entirely reference types. The one thing that you need to watch out for is that if you pass one to a function and it's null, then when you add elements to it, it will create a new AA for the local variable in that function but not affect the one passed in (which is still null). But if it's non-null when it's passed in, then anything done to it in the function that it was passed to will affect the original (since they're one and the same).

- Jonathan M Davis
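The slice behavior described above can be sketched as follows. This is only an illustration: the function names (setFirst, appendOne, appendOneRef) are made up for the example and do not come from the thread.

```d
import std.stdio;

// Writes through the slice: the caller sees the change, because both
// slices refer to the same elements.
void setFirst(int[] arr)
{
    arr[0] = 42;
}

// Appends to the local copy of the ptr/length pair only, so the
// caller's slice keeps its original length.
void appendOne(int[] arr)
{
    arr ~= 99;
}

// With ref, the function operates on the caller's own slice variable.
void appendOneRef(ref int[] arr)
{
    arr ~= 99;
}

void main()
{
    int[] a = [1, 2, 3];

    setFirst(a);
    assert(a[0] == 42);       // element mutation is shared

    appendOne(a);
    assert(a.length == 3);    // the caller's length is untouched

    appendOneRef(a);
    assert(a.length == 4 && a[3] == 99);

    writeln(a);
}
```

Whether a by-value append reallocates or extends in place depends on the array's spare capacity, which is why the example only checks the caller's length and not the memory layout.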
Oct 09 2012
prev sibling parent reply "thedeemon" <dlang thedeemon.com> writes:
On Wednesday, 10 October 2012 at 04:55:48 UTC, Val Markovic wrote:
 Oh, and a related question: what is the best way to pass in an 
 associative array like CustomStruct[string]? I can't say I'm 
 too clear on how AA's are managed/implemented. Do they have 
 value semantics or reference semantics?
Good question, I'd like to get some clarification on it too, because it doesn't behave like, for example, a class, which surely has reference semantics. When I've got a class

    class C { int m; }

and pass an object of this class to a function,

    void mutate_C(C c) { c.m = 5; }

it follows reference semantics and its contents get changed. However, if I pass an assoc. array to a function which changes its contents,

    void mutate_AA(string[int] aa) { foreach(i; 0..10) aa[i*10] = "hi"; }

then this code

    string[int] aa;
    mutate_AA(aa);
    writeln(aa);

outputs "[]": the changes are not applied. It's only after I change the parameter to "ref string[int] aa" that its value gets changed successfully.
Oct 09 2012
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, October 10, 2012 08:59:54 thedeemon wrote:
 On Wednesday, 10 October 2012 at 04:55:48 UTC, Val Markovic wrote:
 Oh, and a related question: what is the best way to pass in an
 associative array like CustomStruct[string]? I can't say I'm
 too clear on how AA's are managed/implemented. Do they have
 value semantics or reference semantics?
 Good question, I'd like to get some clarification on it too. Because it doesn't behave like, for example, class which surely has reference semantics. When I've got a class
     class C { int m; }
 and pass an object of this class to a function,
     void mutate_C(C c) { c.m = 5; }
 it follows reference semantics and its contents gets changed. However if I pass an assoc. array to a function which changes its contents
     void mutate_AA(string[int] aa) { foreach(i; 0..10) aa[i*10] = "hi"; }
 Then this code
     string[int] aa; mutate_AA(aa); writeln(aa);
 outputs "[]" - changes are not applied. It's only after I change parameter to "ref string[int] aa" its value get changed successfully.
The exact same thing would happen with a class reference that was null when it was passed in. The problem is that the aa that you pass in is null, so when the function adds to it, a new AA is created for the function's local variable, and that doesn't affect the original.

Making it ref fixes the problem, because then anything which affects the AA variable inside of the called function is operating on a reference to the original AA variable rather than just operating on what the original AA variable pointed to.

Making sure that the aa has been properly initialized before passing it to a function (which would mean giving it at least one value) would make the ref completely unnecessary.

- Jonathan M Davis
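A small sketch of the three cases discussed here (null AA by value, non-null AA by value, null AA by ref), plus the class analogy. The names fill, fillRef and replace are hypothetical and chosen only for this illustration.

```d
import std.stdio;

// By value: a null AA gets a fresh allocation that only the local
// variable refers to.
void fill(string[int] aa)
{
    foreach (i; 0 .. 3)
        aa[i * 10] = "hi";
}

// By ref: the caller's variable itself is updated.
void fillRef(ref string[int] aa)
{
    foreach (i; 0 .. 3)
        aa[i * 10] = "hi";
}

class C { int m; }

// The class analogy: rebinding a by-value class reference does not
// affect the caller's reference.
void replace(C c)
{
    c = new C;
    c.m = 5;
}

void main()
{
    string[int] empty;                  // null, nothing allocated yet
    assert(empty is null);
    fill(empty);
    assert(empty.length == 0);          // still empty in the caller

    string[int] seeded = [-1: "seed"];  // non-null before the call
    fill(seeded);
    assert(seeded.length == 4);         // changes are visible

    string[int] viaRef;
    fillRef(viaRef);
    assert(viaRef.length == 3);         // ref makes the null case work

    C c;
    replace(c);
    assert(c is null);                  // same effect as the null AA

    writeln(seeded.length, " ", viaRef.length);
}
```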
Oct 10 2012
parent reply "thedeemon" <dlang thedeemon.com> writes:
On Wednesday, 10 October 2012 at 07:28:55 UTC, Jonathan M Davis 
wrote:
 Making sure that the aa has been properly initialized before 
 passing it to a function (which would mean giving it at least 
 one value) would make the ref completely unnecessary.

 - Jonathan M Davis
Ah, thanks a lot! This behavior of a fresh AA being null and then silently converted to a non-null when being filled confused me.
Oct 10 2012
parent reply Don Clugston <dac nospam.com> writes:
On 10/10/12 09:12, thedeemon wrote:
 On Wednesday, 10 October 2012 at 07:28:55 UTC, Jonathan M Davis wrote:
 Making sure that the aa has been properly initialized before passing
 it to a function (which would mean giving it at least one value) would
 make the ref completely unnecessary.

 - Jonathan M Davis
 Ah, thanks a lot! This behavior of a fresh AA being null and then silently converted to a non-null when being filled confused me.
Yes, it's confusing and annoying. This is something in the language that we keep talking about fixing, but to date it hasn't happened.
Oct 10 2012
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Oct 10, 2012 at 10:40:24AM +0200, Don Clugston wrote:
 On 10/10/12 09:12, thedeemon wrote:
 On Wednesday, 10 October 2012 at 07:28:55 UTC, Jonathan M Davis wrote:
 Making sure that the aa has been properly initialized before passing
 it to a function (which would mean giving it at least one value)
 would make the ref completely unnecessary.

 - Jonathan M Davis
 Ah, thanks a lot! This behavior of a fresh AA being null and then silently converted to a non-null when being filled confused me.
 Yes, it's confusing and annoying. This is something in the language that we keep talking about fixing, but to date it hasn't happened.
How would it be fixed, though?

T

--
Let X be the set not defined by this sentence...
Oct 10 2012
prev sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, October 10, 2012 06:27:52 Val Markovic wrote:
 TL;DR: what should I use if I need C++'s const& for a param?
 
 Long version: I have a big struct provided by a library and I'm
 trying to pass instances of it to a function that will never need
 to modify the passed in value. Naturally I want to pass it
 efficiently, without incurring a copy. I know that I can use
 "const ref" in D, but is this the preferred way of doing it? How
 about "in ref"? Or something else?
 
 Related background: I'm a D newbie, I've read TDPL and I loved it
 and I'm now working on a Markdown processor as a D learning
 exercise. I've written hundreds of thousands of C++ LOC and this
 is the perspective from which I look at D (and I love what I see).
Unlike in C++, const ref in D requires that the argument be an lvalue, just like with ref, so if you declare a parameter as const ref, it won't work with rvalues. You'd need to create an overload which isn't ref to handle those, e.g.

    auto foo(const S s) { return foo(s); }
    auto foo(const ref S s) { ... }

And if you do that, make sure that the constness of the two overloads matches, or you'll get infinite recursion, because the constness gets matched before the refness of the type when the overload is selected when calling the function.

If your function is templated, then you can use auto ref:

    auto foo(S)(auto ref S s) { ... }

or

    auto foo(S)(auto ref const S s) { ... }

If you want the function templated (to be able to use auto ref) but not its arguments, then do

    auto foo()(auto ref const S s) { ... }

auto ref does essentially the same as the first example, but the compiler generates the overloads for you. But again, it only works with templated functions.

This whole topic is a bit of a thorny one in that D's design is trying to avoid some of the problems that allowing const T& to take rvalues in C++ causes, but it makes a situation like what you're trying to do annoying to handle. And auto ref doesn't really fix it (even though that's the whole reason that it was added), because it only works with templated functions. There have been some discussions on how to adjust how ref works in order to fix the problem without introducing the problems that C++ has with it, but nothing has actually been decided on yet, let alone implemented.

- Jonathan M Davis
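To make the two approaches concrete, here is a minimal sketch. The struct Big and the functions first, firstAuto, takenByRef and makeBig are hypothetical names invented for the example. It reflects the rules as described in this post; other compiler versions or preview switches may treat rvalues and ref differently.

```d
import std.stdio;

struct Big
{
    int[256] payload;
}

// Lvalues bind here: no copy is made.
int first(const ref Big b)
{
    return b.payload[0];
}

// Rvalues bind here. Inside the function b is an lvalue, so the call
// below selects the ref overload. Both overloads take const, so the
// call cannot resolve back to this function.
int first(const Big b)
{
    return first(b);
}

// Templated on nothing but still a template, so auto ref is allowed;
// the compiler generates the ref and non-ref versions as needed.
int firstAuto()(auto ref const Big b)
{
    return b.payload[0];
}

// Reports which version was instantiated for a given argument.
bool takenByRef()(auto ref const Big b)
{
    return __traits(isRef, b);
}

Big makeBig(int v)
{
    Big b;
    b.payload[0] = v;
    return b;
}

void main()
{
    Big big;
    big.payload[0] = 7;

    assert(first(big) == 7);            // lvalue
    assert(first(makeBig(9)) == 9);     // rvalue

    assert(firstAuto(big) == 7);
    assert(firstAuto(makeBig(9)) == 9);

    assert(takenByRef(big));            // lvalue: passed by ref
    assert(!takenByRef(makeBig(1)));    // rvalue: passed by value

    writeln("ok");
}
```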
Oct 09 2012
parent reply "Val Markovic" <val markovic.io> writes:
 This whole topic is a bit of a thorny one in that D's design is trying to
 avoid some of the problems that allowing const T& to take rvalues in C++
 causes, but it makes a situation like what you're trying to do annoying to
 handle. And auto ref doesn't really fix it (even though that's the whole
 reason that it was added), because it only works with templated functions.
 There have been some discussions on how to adjust how ref works in order
 to fix the problem without introducing the problems that C++ has with it,
 but nothing has actually been decided on yet, let alone implemented.
So if I don't need to support accepting rvalues, is there an argument for "in ref" over "const ref"? "in ref" looks superior: it's more descriptive and from what the docs say, it gives even more guarantees about the behavior of the function.
Oct 09 2012
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, October 10, 2012 06:51:50 Val Markovic wrote:
 So if I don't need to support accepting rvalues, is there an
 argument for "in ref" over "const ref"? "in ref" looks superior:
 it's more descriptive and from what the docs say, it gives even
 more guarantees about the behavior of the function.
In general, I'd advise against using in. in is an alias for const scope. scope is supposed to make it so that no references to that parameter can escape the function. This makes it utterly pointless for value types (as structs typically are). To make matters worse, it's not even properly implemented for anything beyond delegates. So, while something like

    int[] foo(scope int[] arr) { return arr; }

is supposed to be illegal, the compiler currently allows it. So, if you use scope (or in) very much, you're going to get all kinds of compilation errors once scope has been fixed.

If you want const, use const, but aside from delegates, I wouldn't use scope at this point, so I wouldn't use in either. And for many cases, even if scope worked correctly, using in instead of const would be pointless, because the scope portion wouldn't be applicable and would be ignored.

Personally, I wish that in didn't exist at all. It just causes future bugs at this point and adds no value (since you can use const scope if that's what you want). But D1 had it (albeit with slightly different semantics), so it's still around in D2 for transitional purposes if nothing else.

- Jonathan M Davis
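Following that advice, a function that only reads its argument can spell out const directly and reserve scope for delegate parameters. The sketch below uses hypothetical names (sum, forEachValue); the exact meaning of in and scope depends on the compiler version, so treat this as an illustration of the recommendation and not as a statement about current semantics.

```d
import std.stdio;

// Read-only access: plain const, no scope and no in.
int sum(const int[] arr)
{
    int total = 0;
    foreach (x; arr)
        total += x;
    return total;
}

// scope on a delegate parameter: the function promises not to keep
// dg around after it returns.
void forEachValue(const int[] arr, scope void delegate(int) dg)
{
    foreach (x; arr)
        dg(x);
}

void main()
{
    int[] data = [1, 2, 3];
    assert(sum(data) == 6);

    int total = 0;
    forEachValue(data, (int x) { total += x; });
    assert(total == 6);

    writeln(total);
}
```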
Oct 09 2012