
digitalmars.D.learn - Confusion over what types have value vs reference semantics

Neurone <dtvictorious gmail.com> writes:
Hi,

Are there universal rules that I can apply to determine which 
types have reference or value semantics? I know that the basic 
primitive C types (int, bool, etc.) have value semantics.

In particular, I'm still trying to understand stack vs 
GC-managed arrays, and slices.

Finally, I have an associative array, and I want to pass it into 
a function. The function only reads data. Would putting ref on 
the function parameter pass it by reference?
Sep 11 2016
rikki cattermole <rikki cattermole.co.nz> writes:
On 12/09/2016 3:15 AM, Neurone wrote:
 Hi,

 Are there universal rules that I can apply to determine what types have
 reference or value semantics? I know that for the basic primitive C
 types (int, bool, etc) has value semantics.

 In particular, I'm still trying to understand  stack vs GC-managed
 arrays, and slices.

 Finally, I have an associative array, and I want to pass it into a
 function. The function is only reading data. Would putting ref on in
 function parameter pass it by reference?
OK, there are two questions here:

1) What constitutes value vs reference passing
2) Allocation location

Basically, there are two "locations" where something can be stored: the stack or the heap. I put that in quotes because both are in RAM somewhere, so it doesn't matter too much. The main difference is that only a limited part of the stack is readily available for allocation in any single function call.

As for what gets passed by value, that is simple. If it is a class, you're passing the reference (a pointer to the object), not the object itself. All primitive types, including pointers, are passed as values. If you use ref, the parameter automatically becomes a pointer to the caller's variable behind the scenes (most likely a location on the stack).

Just so you're aware, arrays are slices in D (excluding static arrays). They are simply a pointer + length.

So I suppose you can think of references and slices as containers of sorts for other values which get passed in.
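A minimal sketch of the three cases described above (plain value, ref parameter, class reference). The names Counter, byValue, byRef and viaClass are made up for illustration only:

```d
import std.stdio : writeln;

class Counter { int n; }                 // hypothetical class for the demo

void byValue(int x)      { x = 99; }     // changes only the local copy
void byRef(ref int x)    { x = 99; }     // changes the caller's variable
void viaClass(Counter c) { c.n = 99; }   // c is a copy of the reference, same object

void main()
{
    int a = 1, b = 1;
    auto c = new Counter;

    byValue(a);
    byRef(b);
    viaClass(c);

    assert(a == 1);     // untouched
    assert(b == 99);    // modified through ref
    assert(c.n == 99);  // modified through the class reference
    writeln(a, " ", b, " ", c.n); // prints "1 99 99"
}
```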
Sep 11 2016
Mike Parker <aldacron gmail.com> writes:
On Sunday, 11 September 2016 at 15:15:09 UTC, Neurone wrote:
 Hi,

 Are there universal rules that I can apply to determine what 
 types have reference or value semantics? I know that for the 
 basic primitive C types (int, bool, etc) has value semantics.
Primitive types (int, bool, etc.) and structs are passed by value unless a function parameter is annotated with 'ref'. Classes are reference types, so given instance foo of class Foo, foo itself is a reference.

For arrays, it's easiest to think of one as a struct with two fields: length and ptr. The ptr field points to the memory where the array data is stored. When you pass an array to a function, the length & ptr are passed by value (it's a "slice"). That means that modifying the length of the array by adding or removing elements, or attempting to change where ptr is pointing, will only modify the local copy. You can modify the array elements in the original array (manipulate their fields, overwrite them, and such), but you can't modify the structure (length or pointer) of the original array unless the parameter is annotated with ref.

Although AAs aren't laid out as a length & ptr pair, they are similar in one respect: the AA variable is a reference to a table managed by the runtime, and that reference is passed by value. You can modify the contents of an existing AA without ref, but you need ref if the function must change what the caller's variable refers to, e.g. when the AA passed in is still uninitialized (null).
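To make the struct and slice rules concrete, here is a small sketch. Point and the helper functions are hypothetical names invented for this example, not anything from Phobos:

```d
struct Point { int x; }                          // hypothetical struct for the demo

void movePoint(Point p)        { p.x = 42; }     // struct is copied: caller unaffected
void movePointRef(ref Point p) { p.x = 42; }     // ref: caller's struct is modified

void setFirst(int[] arr)       { arr[0] = 42; }  // writes through ptr: visible to caller
void grow(int[] arr)           { arr ~= 7; }     // only the local length/ptr change
void growRef(ref int[] arr)    { arr ~= 7; }     // caller's slice itself is updated

void main()
{
    auto p = Point(1);
    movePoint(p);
    assert(p.x == 1);
    movePointRef(p);
    assert(p.x == 42);

    int[] data = [1, 2, 3];
    setFirst(data);
    assert(data == [42, 2, 3]);    // element change is visible
    grow(data);
    assert(data.length == 3);      // length change is not
    growRef(data);
    assert(data == [42, 2, 3, 7]); // with ref, it is
}
```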
 In particular, I'm still trying to understand  stack vs 
 GC-managed arrays, and slices.
Steven's article on slices should help [1]. It also helps if you just think of all dynamic arrays as slices.

    int[] foo;    // slice with length & ptr, no memory allocated for elements
    int[3] bar;   // static array with length & ptr, three ints allocated on the stack

The memory for foo needs to be allocated somewhere. It might be the GC-managed heap, it might be malloc, it might even be stack memory. Doesn't matter. You cannot modify the length of bar, but you can slice it:

    auto barSlice = bar[];

And here, no memory is allocated. barSlice.ptr is the same as bar.ptr and barSlice.length is the same as bar.length. However, if you append a new element:

    barSlice ~= 10;

The GC will allocate memory for a new array and barSlice will no longer point to bar. It will now have four elements.
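The behaviour described above can be checked with a few asserts. This is only a sketch; the element values are arbitrary:

```d
void main()
{
    int[3] bar = [1, 2, 3];     // static array: storage lives in this stack frame
    int[] barSlice = bar[];     // slice over bar's memory, nothing allocated

    assert(barSlice.ptr is bar.ptr);
    assert(barSlice.length == bar.length);

    barSlice[0] = 100;          // writes through to bar
    assert(bar[0] == 100);

    barSlice ~= 10;             // cannot grow in place: the GC allocates and copies
    assert(barSlice.ptr !is bar.ptr);
    assert(barSlice.length == 4);

    barSlice[0] = -1;           // no longer touches bar
    assert(bar[0] == 100);
}
```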
 Finally, I have an associative array, and I want to pass it 
 into a function. The function is only reading data. Would 
 putting ref on in function parameter pass it by reference?
You can easily test this:

```
void addElem(int[string] aa, string key, int val)
{
    aa[key] = val;
}

void main()
{
    import std.stdio : writeln;

    int[string] map;   // not yet initialized, i.e. null
    map.addElem("key", 10);
    writeln("The aa was modified: ", ("key" in map) !is null);
}
```

AAs behave somewhat like arrays here: the handle to the AA is passed by value, the data by reference. The above should print false, because map is still null when it is passed, and the table allocated inside addElem is only assigned to the local copy. Add ref to the function parameter and it will print true.

However, once the AA has been initialized, you can modify it without ref, because caller and callee then refer to the same table. The following will print 10 whether the aa parameter is ref or not.

```
void addElem(int[string] aa, string key, int val)
{
    aa[key] = val;
}

void main()
{
    import std.stdio : writeln;

    int[string] map;
    map["key"] = 5;    // first insertion allocates the table
    map.addElem("key", 10);
    writeln("The value of key is ", map["key"]);
}
```

[1] https://dlang.org/d-array-article.html
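One subtlety worth checking for yourself: as far as I can tell, what matters is whether the AA has been initialized before the call, not whether the particular key already exists. A null AA gets a fresh table inside the callee, while an initialized one shares its table, so even new keys show up in the caller. A sketch, with insert as a made-up helper name:

```d
void insert(int[string] aa, string key, int val) { aa[key] = val; }

void main()
{
    int[string] empty;               // null: no table allocated yet
    insert(empty, "a", 1);           // callee allocates a table only it can see
    assert("a" !in empty);

    int[string] filled = ["x": 0];   // already refers to a table
    insert(filled, "b", 2);          // new key, same table: visible to the caller
    assert(filled["b"] == 2);
    assert(filled.length == 2);
}
```

If the function only reads from the AA, as in the original question, none of this matters and ref is not needed.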
Sep 11 2016
Mike Parker <aldacron gmail.com> writes:
On Sunday, 11 September 2016 at 16:10:04 UTC, Mike Parker wrote:

 And here, no memory is allocated. barSlice.ptr is the same as 
 bar.ptr and barSlice.length is the same as bar.length. However, 
 if you append a new element:

 barSlice ~= 10;

 The GC will allocate memory for a new array and barSlice will 
 no longer point to bar. It will now have four elements.
I should clarify that this holds true for all slices, not just slices of static arrays. The key point is that appending to a slice allocates only when the slice cannot grow in place, that is, when its .capacity property is 0 or the new length would exceed .capacity. Slices of static arrays will always have a capacity of 0. Slices of slices might not, i.e. there may be room in the memory block for more elements.
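A rough way to watch .capacity at work. The reserve size of 16 is arbitrary, and the exact capacity values depend on the runtime, so the asserts below only check the parts that should hold in general:

```d
void main()
{
    int[3] fixed = [1, 2, 3];
    int[] a = fixed[];
    assert(a.capacity == 0);      // stack memory is not an appendable GC block

    int[] b = [1, 2, 3];
    b.reserve(16);                // ask the GC for room for at least 16 elements
    assert(b.capacity >= 16);
    auto before = b.ptr;
    b ~= 4;
    assert(b.ptr is before);      // grew in place, no reallocation

    int[] head = b[0 .. 2];       // slice that stops before the end of the used data
    assert(head.capacity == 0);   // appending in place would overwrite b[2]
    head ~= 99;
    assert(head.ptr !is b.ptr);   // so the append reallocated
    assert(b[2] == 3);            // and b is intact
}
```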
Sep 11 2016
Neurone <dtvictorious gmail.com> writes:
On Sunday, 11 September 2016 at 16:14:59 UTC, Mike Parker wrote:
 On Sunday, 11 September 2016 at 16:10:04 UTC, Mike Parker wrote:

 And here, no memory is allocated. barSlice.ptr is the same as 
 bar.ptr and barSlice.length is the same as bar.length. 
 However, if you append a new element:

 barSlice ~= 10;

 The GC will allocate memory for a new array and barSlice will 
 no longer point to bar. It will now have four elements.
 I should clarify that this holds true for all slices, not just 
 slices of static arrays. The key point is that appending to a 
 slice will only allocate if the .capacity property of the slice 
 is 0. Slices of static arrays will always have a capacity of 0. 
 Slices of slices might not, i.e. there may be room in the memory 
 block for more elements.
Thanks for the detailed answer. I still don't see the advantage of slices being passed into functions by value while still allowing modification of the original array's elements. Is there a way to specify that a truly independent copy of an array should be passed into the function? E.g., in C++, func(std::vector<int> v) causes a copy of the argument to be passed in.
Sep 13 2016
Jonathan M Davis via Digitalmars-d-learn writes:
On Tuesday, September 13, 2016 15:27:07 Neurone via Digitalmars-d-learn wrote:
 On Sunday, 11 September 2016 at 16:14:59 UTC, Mike Parker wrote:
 On Sunday, 11 September 2016 at 16:10:04 UTC, Mike Parker wrote:
 And here, no memory is allocated. barSlice.ptr is the same as
 bar.ptr and barSlice.length is the same as bar.length.
 However, if you append a new element:

 barSlice ~= 10;

 The GC will allocate memory for a new array and barSlice will
 no longer point to bar. It will now have four elements.
 I should clarify that this holds true for all slices, not just
 slices of static arrays. The key point is that appending to a
 slice will only allocate if the .capacity property of the slice
 is 0. Slices of static arrays will always have a capacity of 0.
 Slices of slices might not, i.e. there may be room in the memory
 block for more elements.

 Thanks for the detailed answer. I still don't get the advantage
 of passing slices into functions by value allowing modification
 to elements of the original array. Is there a way to specify
 that a true independent copy of an array should be passed into
 the function? E.g., in C++, func(std::vector<int> v) causes a
 copy of the argument to be passed in.
Slices are a huge performance boost (e.g. the fact that we have slicing like this for strings makes parsing code _way_ more efficient by default than would ever be the case for something like std::string). If you're worried about a function mutating the elements of an array that it's given, then you can always mark them with const. e.g.

    auto foo(const(int)[] arr) {...}

But there is no way to force a naked dynamic array to do a deeper copy when passed. If you're worried about it, you can explicitly call dup to create a copy of the array rather than slice it - e.g. foo(arr.dup) - but the function itself can't enforce that behavior. The closest that it could do would be to explicitly dup the parameter itself - though if you were going to do that, you'd want to make it clear in the documentation, since that's not a typical thing to do, and if someone wanted to ensure that an array that they were passing to the function didn't get mutated, they'd dup it themselves, which would result in two dups if your function did the dup.

If you want a dynamic array to be duped every time it's passed to a function or otherwise copied, you'd need to create a wrapper struct with a postblit constructor that called dup. That would generally make for unnecessarily inefficient code though. The few containers in std.container are all reference types for the same reason - containers which copy by default make it way too easy to accidentally copy them and are arguably a bad default (though obviously, there are cases where that would be the best behavior).

So, the typical thing to do with dynamic arrays is to use const or immutable elements when you want to ensure that they don't get mutated when passing them around and duping or iduping a dynamic array when you want to ensure that you have a copy of the array rather than a slice.

- Jonathan M Davis
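For illustration, the options above might look roughly like this. The names sum, zeroFirst, CopiedArray and zeroFirstCopy are hypothetical, and the wrapper struct is only a sketch of the postblit idea, not a recommended design:

```d
// Read-only view: the function cannot write to the caller's elements.
int sum(const(int)[] arr)
{
    int total = 0;
    foreach (x; arr) total += x;
    // arr[0] = 1;   // would not compile: elements are const
    return total;
}

void zeroFirst(int[] arr) { arr[0] = 0; }   // mutates whatever it is given

// Hypothetical wrapper that dups its array whenever the struct is copied.
struct CopiedArray
{
    int[] data;
    this(this) { data = data.dup; }         // postblit runs after each copy
}

void zeroFirstCopy(CopiedArray c) { c.data[0] = 0; }

void main()
{
    int[] nums = [1, 2, 3];
    assert(sum(nums) == 6);

    zeroFirst(nums.dup);                    // caller makes the copy explicitly
    assert(nums == [1, 2, 3]);

    auto wrapped = CopiedArray(nums);
    zeroFirstCopy(wrapped);                 // passing by value triggers the postblit
    assert(wrapped.data[0] == 1);

    immutable(int)[] frozen = nums.idup;    // independent, immutable copy
    nums[0] = 42;
    assert(frozen[0] == 1);
}
```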
Sep 13 2016