
digitalmars.D.learn - Efficient way to pass struct as parameter

Tim Hsu <tim37021 gmail.com> writes:
I am creating a Vector3 structure. I use a struct to avoid the GC. 
However, a struct will be copied when passed as a parameter to a 
function:


struct Ray {
     Vector3f origin;
     Vector3f dir;

     @nogc @system
     this(Vector3f *origin, Vector3f *dir) {
         this.origin = *origin;
         this.dir = *dir;
     }
}

How can I pass a struct more efficiently?
Jan 02
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, January 02, 2018 18:21:13 Tim Hsu via Digitalmars-d-learn wrote:
> I am creating Vector3 structure. I use struct to avoid GC.
> However, struct will be copied when passed as parameter to
> function
>
>
> struct Ray {
>      Vector3f origin;
>      Vector3f dir;
>
>      @nogc @system
>      this(Vector3f *origin, Vector3f *dir) {
>          this.origin = *origin;
>          this.dir = *dir;
>      }
> }
>
> How can I pass struct more efficiently?
When passing a struct to a function, if the argument is an rvalue, it will be moved rather than copied, but if it's an lvalue, it will be copied. If the parameter is marked with ref, then the lvalue will be passed by reference and not copied, but rvalues will not be accepted (and unlike with C++, tacking on const doesn't affect that).

Alternatively, if the function is templated (and you can add empty parens to templatize a function if you want to), then an auto ref parameter will result in different template instantiations depending on whether the argument is an lvalue or rvalue. If it's an lvalue, then the template will be instantiated with that parameter as ref, so the argument will be passed by ref and no copy will be made, whereas if it's an rvalue, then the parameter will end up without having ref, so the argument will be moved.

If the function isn't templated and can't be templated (e.g. if it's a member function of a class and you want it to be virtual), then you'd need to overload the function with overloads that have ref and don't have ref in order to get the same effect (though the non-ref overload can simply forward to the ref overload). That does get a bit tedious though if you have several parameters.

If you want to guarantee that no copy will ever be made, then you will have to either use ref or a pointer, which could get annoying with rvalues (since you'd have to assign them to a variable) and could actually result in more copies, because it would restrict the compiler's abilities to use moves instead of copies. In general, the best way is likely going to be to use auto ref where possible and overload functions where not.

Occasionally, there is talk of adding something similar to C++'s const& to D, but Andrei does not want to add rvalue references to the language, and D's const is restrictive enough that requiring const to avoid the copy would arguably be overly restrictive. It may be that someone will eventually propose a feature with semantics that Andrei will accept that acts similarly to const&, but it has yet to happen. auto ref works for a lot of cases though, and D's ability to do moves without a move constructor definitely reduces the number of unnecessary copies.

See also: https://stackoverflow.com/questions/35120474/does-d-have-a-move-constructor

- Jonathan M Davis
Jan 02
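
The two options described in the post above (auto ref on a templated function, and a ref / non-ref overload pair on a non-templated one) might look roughly like the sketch below. The Vector3f layout and the function names lengthSquared, passedByRef, and sum are assumptions made for illustration; they do not come from the thread.

```d
// Assumed layout; the thread never shows the real Vector3f.
struct Vector3f { float x, y, z; }

// Empty template parens make this a template, so `auto ref` is allowed:
// lvalue arguments are taken by ref, rvalue arguments by value (moved).
float lengthSquared()(auto ref const Vector3f v)
{
    return v.x * v.x + v.y * v.y + v.z * v.z;
}

// Helper (hypothetical) that reports which instantiation was selected.
bool passedByRef(T)(auto ref T v)
{
    return __traits(isRef, v);
}

// Non-templated alternative: overload on ref-ness by hand.
float sum(ref const Vector3f v)
{
    return v.x + v.y + v.z;
}

// Accepts rvalues. Inside the body `v` is an lvalue, so this call
// resolves to the ref overload above rather than recursing.
float sum(const Vector3f v)
{
    return sum(v);
}
```

Calling passedByRef with a named variable should select the ref instantiation, while passing a temporary such as Vector3f(1, 2, 3) should select the by-value one.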
Igor Shirkalin <mathsoft inbox.ru> writes:
On Tuesday, 2 January 2018 at 18:45:48 UTC, Jonathan M Davis wrote:
> [...]
A smart optimizer should do this for you without any special "auto" keywords if the function is inlined. I mean the LDC compiler first of all.
Jan 02
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, January 02, 2018 19:27:50 Igor Shirkalin via Digitalmars-d-learn wrote:
> On Tuesday, 2 January 2018 at 18:45:48 UTC, Jonathan M Davis wrote:
>> [...]
> Smart optimizer should think for you without any "auto" private words if function is inlined. I mean LDC compiler first of all.
A smart optimizer may very well optimize out a number of copies. The fact that D requires that structs be moveable opens up all kinds of optimization opportunities - even more so when stuff gets inlined. However, if you want to guarantee that unnecessary copies aren't happening, you have to ensure that ref gets used with lvalues and does not get used with rvalues, and that tends to mean either using auto ref or overloading functions on ref. - Jonathan M Davis
Jan 02
Seb <seb wilzba.ch> writes:
On Tuesday, 2 January 2018 at 18:21:13 UTC, Tim Hsu wrote:
> I am creating Vector3 structure. I use struct to avoid GC. 
> However, struct will be copied when passed as parameter to 
> function
>
>
> struct Ray {
>      Vector3f origin;
>      Vector3f dir;
>
>      @nogc @system
>      this(Vector3f *origin, Vector3f *dir) {
>          this.origin = *origin;
>          this.dir = *dir;
>      }
> }
>
> How can I pass struct more efficiently?
If you want the compiler to ensure that a struct doesn't get copied, you can disable its postblit:

    @disable this(this);

Now, there are a couple of goodies in std.typecons like RefCounted or Unique that allow you to pass a struct around without needing to worry about memory allocation:

https://dlang.org/phobos/std_typecons.html#RefCounted
https://dlang.org/phobos/std_typecons.html#Unique

Example: https://run.dlang.io/is/3rbqpn

Of course, you can always roll your own allocator: https://run.dlang.io/is/uNmn0d
Jan 02
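
As a rough illustration of the disabled-postblit suggestion above, the sketch below uses a hypothetical Buffer struct (not from the thread, and not the code behind the run.dlang.io links). With the postblit disabled, an accidental copy is rejected at compile time, while passing by ref or explicitly moving with std.algorithm.mutation.move still works.

```d
import std.algorithm.mutation : move;

// Hypothetical non-copyable type, for illustration only.
struct Buffer
{
    int[4] data;
    @disable this(this); // any attempt to copy is now a compile error
}

// Lvalues can still be passed without a copy by taking them by ref.
int first(ref const Buffer b)
{
    return b.data[0];
}

// By-value parameters still accept rvalues and explicitly moved lvalues.
int consume(Buffer b)
{
    return b.data[0];
}

// An ordinary copy of an lvalue no longer compiles.
static assert(!__traits(compiles, { Buffer a; Buffer b = a; }));
```

A caller holding a named Buffer would write first(buf) to read it in place, or consume(move(buf)) to hand it over.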
Johan Engelen <j j.nl> writes:
On Tuesday, 2 January 2018 at 18:21:13 UTC, Tim Hsu wrote:
> I am creating Vector3 structure. I use struct to avoid GC. 
> However, struct will be copied when passed as parameter to 
> function
>
>
> struct Ray {
>      Vector3f origin;
>      Vector3f dir;
>
>      @nogc @system
>      this(Vector3f *origin, Vector3f *dir) {
>          this.origin = *origin;
>          this.dir = *dir;
>      }
> }
>
> How can I pass struct more efficiently?
Pass the Vector3f by value.

There is not one best solution here: it depends on what you are doing with the struct, and how large the struct is. It depends on whether the function will be inlined. It depends on the CPU. And probably 10 other things.

Vector3f is a small struct (I'm guessing it's 3 floats?), pass it by value and it will be passed in registers. This "copy" costs nothing on x86, the CPU will have to load the floats from memory and store them in a register anyway, before it can write it to the target Vector3f, regardless of how you pass the Vector3f.

You can play with some code here: https://godbolt.org/g/w56jmA

Passing by pointer (ref is the same) has large downsides and is certainly not always fastest. For small structs and if copying is not semantically wrong, just pass by value.

More important: measure what bottlenecks your program has and optimize there.

- Johan
Jan 02
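
A minimal sketch of the by-value approach recommended above, assuming (as the post guesses) that Vector3f is three floats. The dot function is hypothetical, and the Ray here simply relies on D's built-in struct literal rather than the pointer-taking constructor from the original question.

```d
// Assumed layout: three 32-bit floats, 12 bytes in total.
struct Vector3f { float x, y, z; }

static assert(Vector3f.sizeof == 12);

// Small structs are simply taken by value; no pointers, no ref.
float dot(Vector3f a, Vector3f b) pure nothrow @nogc @safe
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

struct Ray
{
    Vector3f origin;
    Vector3f dir;
    // No explicit constructor: the built-in Ray(origin, dir) literal
    // already takes both fields by value.
}
```

Whether the arguments really end up in registers depends on the target ABI and on the compiler, so the generated code is worth checking for the platform in question.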
Adam D. Ruppe <destructionator gmail.com> writes:
On Tuesday, 2 January 2018 at 22:17:14 UTC, Johan Engelen wrote:
> Pass the Vector3f by value.
This is very frequently the correct answer to these questions! Never assume ref is faster if speed matters - it may not be.
Jan 02
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, January 02, 2018 22:49:20 Adam D. Ruppe via Digitalmars-d-learn wrote:
> On Tuesday, 2 January 2018 at 22:17:14 UTC, Johan Engelen wrote:
>> Pass the Vector3f by value.
> This is very frequently the correct answer to these questions! Never assume ref is faster if speed matters - it may not be.
It also makes for much cleaner code if you pretty much always pass by value and then only start dealing with ref or auto ref when you know that you need it - especially if you're going to need to manually overload the function on refness. But for better or worse, a lot of this sort of thing ultimately depends on what the optimizer does to a particular piece of code, and that's far from easy to predict given everything that an optimizer can do these days - especially if you're using ldc rather than dmd. - Jonathan M Davis
Jan 02
Tim Hsu <tim37021 gmail.com> writes:
On Tuesday, 2 January 2018 at 22:49:20 UTC, Adam D. Ruppe wrote:
> On Tuesday, 2 January 2018 at 22:17:14 UTC, Johan Engelen wrote:
>> Pass the Vector3f by value.
> This is very frequently the correct answer to these questions! Never assume ref is faster if speed matters - it may not be.
However, speed really matters for me. I am writing a path tracing program. A Ray will be constructed millions of times during computation, and will be passed to functions to test intersection billions of times. After reading the comments here, it seems the ray will be passed by value to the intersection testing function. I am not sure if Ray is small enough to be passed by value. It needs some experimentation.
Jan 02
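
One way to start the experiment mentioned above is a micro-benchmark with std.datetime.stopwatch.benchmark. The sketch below is only that: hitByValue and hitByRef are hypothetical stand-ins for a real intersection test, the struct layouts are assumed, and an optimizing compiler may inline both calls and make the two timings indistinguishable. It does not replace profiling the actual path tracer.

```d
import std.datetime.stopwatch : benchmark;
import core.time : Duration;

// Assumed layouts, for illustration only.
struct Vector3f { float x, y, z; }
struct Ray { Vector3f origin; Vector3f dir; }

// Hypothetical stand-ins for an intersection test.
float hitByValue(Ray r)
{
    return r.origin.x * r.dir.x + r.origin.y * r.dir.y + r.origin.z * r.dir.z;
}

float hitByRef(ref const Ray r)
{
    return r.origin.x * r.dir.x + r.origin.y * r.dir.y + r.origin.z * r.dir.z;
}

// Runs each variant `iterations` times and returns the two total times:
// index 0 is by value, index 1 is by ref.
Duration[2] compare(uint iterations)
{
    auto ray = Ray(Vector3f(1, 2, 3), Vector3f(4, 5, 6));
    float sink = 0; // accumulates results so the calls have a visible effect

    void byValue() { sink += hitByValue(ray); }
    void byRef() { sink += hitByRef(ray); }

    return benchmark!(byValue, byRef)(iterations);
}
```

Calling compare(10_000_000) and printing both durations gives a first impression; the numbers should be taken with the same compiler and optimization flags as the real program.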
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jan 03, 2018 at 07:02:28AM +0000, Tim Hsu via Digitalmars-d-learn wrote:
> On Tuesday, 2 January 2018 at 22:49:20 UTC, Adam D. Ruppe wrote:
>> On Tuesday, 2 January 2018 at 22:17:14 UTC, Johan Engelen wrote:
>>> Pass the Vector3f by value.
>> This is very frequently the correct answer to these questions! Never assume ref is faster if speed matters - it may not be.
> However speed really matters for me.
That's why you need to use a profiler to find out where the hotspots are. It may not be where you think it is.
> I am writing a path tracing program.  Ray will be constructed million
> of times during computation.  And will be passed to functions to test
> intersection billion of times.  After Reading comments here, it seems
> ray will be passed by value to the intersection testing function. I am
> not sure if ray is small enough to be passed by value. It needs some
> experiment.
With modern CPUs with advanced caching, it may not always be obvious whether passing by value or passing by reference is better. Always use a profiler to be sure.

T

-- 
If blunt statements had a point, they wouldn't be blunt...
Jan 03
Jacob Carlborg <doob me.com> writes:
On 2018-01-03 08:02, Tim Hsu wrote:
> It needs some experiment.
This is the correct answer. Never assume anything about performance before having tested it.

-- 
/Jacob Carlborg
Jan 03
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jan 02, 2018 at 10:17:14PM +0000, Johan Engelen via Digitalmars-d-learn wrote:
[...]
> Passing by pointer (ref is the same) has large downsides and is
> certainly not always fastest. For small structs and if copying is not
> semantically wrong, just pass by value.
+1.
> More important: measure what bottlenecks your program has and optimize
> there.
[...]

It cannot be said often enough: premature optimization is the root of all evil. It makes your code less readable, less maintainable, more bug-prone, and makes you spend far too much time and energy fiddling with details that ultimately may not even matter, and worst of all, it may not even be a performance win in the end, e.g., if you end up with CPU cache misses / excessive RAM roundtrips because of too much indirection, where you could have passed the entire struct in registers.

When it comes to optimization, there are 3 rules: profile, profile, profile. I used to heavily hand-"optimize" my code a lot (I come from a strong C/C++ background -- premature optimization seems to be a common malady among us in that crowd). Then I started using a profiler, and I suddenly had that sinking realization that all those countless hours of tweaking my code to be "optimal" were wasted, because the *real* bottleneck was somewhere else completely.

From many such experiences, I've learned that (1) the real bottleneck is rarely where you predict it to be, and (2) most real bottlenecks can be fixed with very simple changes (sometimes even a 1-line change) with very big speed gains, whereas (3) fixing supposed "inefficiencies" that aren't founded on real evidence (i.e., using a profiler) usually costs many hours of time, adds tons of complexity, and rarely gives you more than 1-2% speedups (and sometimes can actually make your code perform *worse*: your code can become so complicated the compiler's optimizer is unable to generate optimal code for it).

T

-- 
MSDOS = MicroSoft's Denial Of Service
Jan 02
Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 2 January 2018 at 23:27:22 UTC, H. S. Teoh wrote:
> When it comes to optimization, there are 3 rules: profile, 
> profile, profile.  I used to heavily hand-"optimize" my code a 
> lot (I come from a strong C/C++ background -- premature 
> optimization seems to be a common malady among us in that 
> crowd).
That's why I always say that C++ is premature-optimization-oriented programming, aka POOP.
Jan 03
Ali Çehreli <acehreli yahoo.com> writes:
On 01/03/2018 10:40 AM, Patrick Schluter wrote:
> On Tuesday, 2 January 2018 at 23:27:22 UTC, H. S. Teoh wrote:
>> When it comes to optimization, there are 3 rules: profile, profile,
>> profile.  I used to heavily hand-"optimize" my code a lot (I come from
>> a strong C/C++ background -- premature optimization seems to be a
>> common malady among us in that crowd).
> That's why I always tell that C++ is premature optimization oriented programming, aka as POOP.
In my earlier C++ days I embarrassed myself by insisting that strings should be passed by reference for performance reasons. (No, I had not profiled.) Then I learned more and always returned vectors (and maps) by value from producer functions:

 vector<int> makeInts(some param) {
      // ...
 }

That's how it should be! :)

I used the same function when interviewing candidates (apologies to all; I don't remember good things about my interviewing other people; I hope I will never interview people like that anymore). They would invariably write a function something like this:

 void makeInts(vector<int> & result, some param) {
      // ...
 }

And that's wrong because there are the big questions of what you require of, or do with, the reference parameter 'result'. Would you clear it first? If not, shouldn't the function be named appendInts? If you cleared it upfront, would you still be happy if an exception was thrown inside the function, etc.

That's why I like producer functions that return values:

 vector<int> makeInts(some param) {
      // ...
 }

And if they can be 'pure', D allows them to be used to initialize immutable variables as well. Pretty cool! :)

Ali
Jan 03
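
The closing remark in the post above, that a pure producer function can initialize an immutable variable, can be sketched in D as follows. The names makeInts and firstSquares, the count parameter, and the choice to fill the array with squares are all assumptions for illustration.

```d
// A producer that returns its result by value. Because it is pure and
// its only parameter has no mutable indirections, the returned array is
// known to be unique, so it may convert implicitly to immutable.
int[] makeInts(size_t count) pure
{
    auto result = new int[count];
    foreach (i, ref e; result)
        e = cast(int) (i * i);
    return result;
}

immutable(int)[] firstSquares()
{
    // No cast and no .idup: the mutable int[] result converts directly.
    immutable int[] squares = makeInts(4);
    return squares;
}
```

Without pure on makeInts, the initialization of squares would be rejected, since the compiler could no longer prove that nothing else holds a mutable reference to the array.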
Marco Leise <Marco.Leise gmx.de> writes:
On Wed, 3 Jan 2018 10:57:13 -0800,
Ali Çehreli <acehreli yahoo.com> wrote:

> On 01/03/2018 10:40 AM, Patrick Schluter wrote:
> > On Tuesday, 2 January 2018 at 23:27:22 UTC, H. S. Teoh wrote:
> >>
> >> When it comes to optimization, there are 3 rules: profile, profile,
> >> profile.  I used to heavily hand-"optimize" my code a lot (I come from
> >> a strong C/C++ background -- premature optimization seems to be a
> >> common malady among us in that crowd).
> >
> > That's why I always tell that C++ is premature optimization oriented
> > programming, aka as POOP.
>
> […]
>
> That's why I like producer functions that return values:
>
> vector<int> makeInts(some param) {
>      // ...
> }
>
> And if they can be 'pure', D allows them to be used to initialize
> immutable variables as well. Pretty cool! :)
>
> Ali
May I add, this is also optimal performance-wise. The result variable will be allocated on the caller stack and the callee writes directly to it. So even POOPs like me do it.

-- 
Marco
Jan 28