
digitalmars.D.learn - Some performance questions

reply Lars Kyllingstad <public kyllingen.NOSPAMnet> writes:
I have some functions for which I want to find the nicest possible 
combination of performance and usability. I have two suggestions as to 
how they should be defined.

"Classic" style:

     real myFunction(real arg, int someParam, inout real aReturnValue)
     {
         declare temporary variables;
         do some calculations;
         store a return value in aReturnValue;
         return main return value;
     }

The user-friendly way, where the function is encapsulated in a class:

     class MyFunctionWorkspace
     {
         declare private temporary variables;

         real anotherReturnValue;

         this (int someParam)
         { ... }

         real myFunction(real arg)
         {
             do some calculations;
             store a return value in aReturnValue;
             return main return value;
         }
     }

I'm sure a lot of people will disagree with me on this, but let me first 
say why I think the last case is more user-friendly. For one thing, the 
same class can be used over and over again with the same parameter(s). 
Also, the user only has to retrieve aReturnValue if it is needed. If 
there are many such "additional" inout parameters which are seldom 
needed, it gets tedious to declare variables for them every time the 
function is called. I could overload the function, but this also has 
drawbacks if there are several inout parameters with the same type.

My questions are:

- If I do as in the second example above, and reuse temporary 
variables instead of allocating them every time the function is called, 
could this approach also give the best performance? (Yes, I know this is 
bad form...)

...or, if not...

- If I (again in the second example) move the temporary variables inside 
the function, so they are allocated on the stack instead of the heap 
(?), will this improve or reduce performance?

I could write both types of code and test them against each other, but I 
am planning to use the same style for several different functions in 
several modules, and want to find the solution which is generally the 
best one.

-Lars
Feb 02 2009
next sibling parent Lars Kyllingstad <public kyllingen.NOSPAMnet> writes:
Lars Kyllingstad wrote:
         real anotherReturnValue;
Correction: real aReturnValue;
Feb 02 2009
prev sibling next sibling parent reply Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Mon, Feb 2, 2009 at 8:31 AM, Lars Kyllingstad
<public kyllingen.nospamnet> wrote:
 [snip]
Any gains you get from skipping the initial calculations will be swiftly cut down by the cost of heap allocation and cache misses, if you allocate this object several times.

A much better way to get the usability of the latter with the better performance of the former is to use a struct instead of a class. I highly doubt you'll be needing to inherit these "operation objects" anyway. The struct will be allocated on the stack, and you still get all the usability.
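To make that concrete, a minimal sketch of the struct version might look like this (using the placeholder names from the original post; the body of myFunction is only a stand-in):

     struct MyFunctionWorkspace
     {
         int someParam;       // parameter cached across calls
         real aReturnValue;   // secondary result, read only when needed

         real myFunction(real arg)
         {
             real tmp = arg * someParam;   // stand-in for the real calculations
             aReturnValue = tmp / 2;
             return tmp;
         }
     }

     void main()
     {
         MyFunctionWorkspace w;   // lives on the stack, no heap allocation
         w.someParam = 3;
         real y = w.myFunction(1.5);
     }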
Feb 02 2009
next sibling parent Lars Kyllingstad <public kyllingen.NOSPAMnet> writes:
Jarrett Billingsley wrote:
 On Mon, Feb 2, 2009 at 8:31 AM, Lars Kyllingstad
 <public kyllingen.nospamnet> wrote:
 [snip]
Any gains you get from skipping the initial calculations will be swiftly cut down by the cost of heap allocation and cache misses, if you allocate this object several times.
OK. But if the object is allocated once (or seldom, at least), and I allocate any working variables on the stack, then the second case may not be half bad?
 A much better way to get the usability of the latter with the better
 performance of the former is to use a struct instead of a class.  I
 highly doubt you'll be needing to inherit these "operation objects"
 anyway.  The struct will be allocated on the stack, and you still get
 all the usability.
Thanks, I hadn't even thought of that! :) This could certainly be a solution. There are two problems, however:

1) In D1, structs don't have constructors, which could again make the initial parameter setting a tedious task. But this is not a big problem, as I could just define a static opCall for each struct as a kind of constructor.

2) Bigger problem: I was kinda hoping that all the functions could implement a common interface, so I can use them in generic algorithms. This could possibly be done with structs using templates, but plain old interfaces would be a cleaner solution.

-Lars
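A minimal sketch of the static opCall workaround mentioned in 1), again with the thread's placeholder names and a stand-in calculation:

     struct MyFunctionWorkspace
     {
         int someParam;
         real aReturnValue;

         // D1 structs have no constructors; a static opCall is the usual workaround
         static MyFunctionWorkspace opCall(int p)
         {
             MyFunctionWorkspace w;
             w.someParam = p;
             return w;
         }

         real myFunction(real arg)
         {
             aReturnValue = arg / someParam;   // stand-in calculation
             return arg * someParam;
         }
     }

     void main()
     {
         auto w = MyFunctionWorkspace(42);   // runs the static opCall
         real y = w.myFunction(1.5);
     }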
Feb 02 2009
prev sibling parent reply grauzone <none example.net> writes:
Jarrett Billingsley wrote:
 On Mon, Feb 2, 2009 at 8:31 AM, Lars Kyllingstad
 <public kyllingen.nospamnet> wrote:
 [snip]
Any gains you get from skipping the initial calculations will be swiftly cut down by the cost of heap allocation and cache misses, if you allocate this object several times. A much better way to get the usability of the latter with the better performance of the former is to use a struct instead of a class. I highly doubt you'll be needing to inherit these "operation objects" anyway. The struct will be allocated on the stack, and you still get all the usability.
Why not use scope to allocate the class on the stack? For everything else, I agree with Donald Knuth (if he really said that...)
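A minimal sketch of the scope suggestion, using the placeholder names from the original post (with DMD, a scope instance is typically placed on the stack and destroyed when the enclosing scope ends):

     class MyFunctionWorkspace
     {
         private int someParam;
         real aReturnValue;

         this(int p) { someParam = p; }

         real myFunction(real arg)
         {
             aReturnValue = arg / someParam;   // stand-in calculation
             return arg * someParam;
         }
     }

     void main()
     {
         // 'scope' lets the compiler allocate the instance on the stack
         scope MyFunctionWorkspace w = new MyFunctionWorkspace(42);
         real y = w.myFunction(1.5);
     }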
Feb 02 2009
parent reply Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Mon, Feb 2, 2009 at 1:27 PM, grauzone <none example.net> wrote:
 Why not use scope to allocate the class on the stack?
 For everything else, I agree with Donald Knuth (if he really said that...)
That's fine too, and would fit in with his needs to implement interfaces. But again, if he's worried about caching some parameters but not worried about the overhead of virtual calls... something's off.
Feb 02 2009
next sibling parent reply Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:
Jarrett Billingsley wrote:
 On Mon, Feb 2, 2009 at 1:27 PM, grauzone <none example.net> wrote:
 Why not use scope to allocate the class on the stack?
 For everything else, I agree with Donald Knuth (if he really said that...)
That's fine too, and would fit in with his needs to implement interfaces. But again, if he's worried about caching some parameters but not worried about the overhead of virtual calls.. something's off.
Or he's caching some very big/complex parameters in the code he's actually writing... maybe. That said: do we have any assurance that, were the functor class tagged as 'final', the call would cease to be virtual? If so, then the only extra cost on the call is that of the hidden "this" sitting in ESI. I still don't care for the memory allocation involved, personally, but if these are long-lived functors that may not be a major problem. (Ie, if he calls foo(?,X) a million times, the cost of allocating one object is amortized into nearly nothing.) -- Chris Nicholson-Sauls
Feb 02 2009
next sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Mon, Feb 2, 2009 at 3:11 PM, Chris Nicholson-Sauls
<ibisbasenji gmail.com> wrote:

 do we have any assurance that, were the functor
 class tagged as 'final', the call would cease to be virtual?
http://d.puremagic.com/issues/show_bug.cgi?id=1909
Feb 02 2009
prev sibling parent reply Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Mon, Feb 2, 2009 at 3:11 PM, Chris Nicholson-Sauls
<ibisbasenji gmail.com> wrote:
 Or he's caching some very big/complex parameters in the code he's actually
 writing... maybe. That said: do we have any assurance that, were the functor
 class tagged as 'final', the call would cease to be virtual?  If so, then
 the only extra cost on the call is that of the hidden "this" sitting in ESI.
  I still don't care for the memory allocation involved, personally, but if
 these are long-lived functors that may not be a major problem.  (Ie, if he
 calls foo(?,X) a million times, the cost of allocating one object is
 amortized into nearly nothing.)
Oh, I suppose I should also point out that if you made these functors' methods final, they wouldn't be able to implement interfaces, since interface implementations must be virtual. So, at that point, you're using a final scope class - might as well use a struct anyway.
Feb 02 2009
parent reply grauzone <none example.net> writes:
Jarrett Billingsley wrote:
 On Mon, Feb 2, 2009 at 3:11 PM, Chris Nicholson-Sauls
 <ibisbasenji gmail.com> wrote:
 Or he's caching some very big/complex parameters in the code he's actually
 writing... maybe. That said: do we have any assurance that, were the functor
 class tagged as 'final', the call would cease to be virtual?  If so, then
 the only extra cost on the call is that of the hidden "this" sitting in ESI.
  I still don't care for the memory allocation involved, personally, but if
 these are long-lived functors that may not be a major problem.  (Ie, if he
 calls foo(?,X) a million times, the cost of allocating one object is
 amortized into nearly nothing.)
Oh, I suppose I should also point out that if you made these functors' methods final, they wouldn't be able to implement interfaces, since interface implementations must be virtual. So, at that point, you're using a final scope class - might as well use a struct anyway.
As far as I know, interface methods can still be final methods in a class. final methods are only disallowed from being overridden further; it's perfectly fine to mark a method final that overrides a method from a super class. final, so to speak, only works in one direction.

Then the compiler can optimize calls if they are statically known to be final. If not, it still has to do a vtable lookup on a method call, even if the actually called method is final. So it can still make sense to use a class instead of a struct.
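A minimal sketch of that point, with hypothetical names: the final override is legal, and only calls made through the class type are candidates for a direct call; calls through the interface still go through the vtable.

     interface Func
     {
         real eval(real x);
     }

     class Square : Func
     {
         // final only forbids further overriding; implementing an interface method is fine
         final real eval(real x) { return x * x; }
     }

     void main()
     {
         Square s = new Square;
         Func   f = s;

         real a = s.eval(2.0);   // static type Square + final: the compiler may call directly
         real b = f.eval(2.0);   // through the interface: still a vtable lookup
     }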
Feb 02 2009
parent reply Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Mon, Feb 2, 2009 at 3:37 PM, grauzone <none example.net> wrote:

 As far as I know, interface methods can still be final methods in a class.
 final methods are only disallowed to be overridden further. But it's
 perfectly fine to mark a method final, that overrides a method from a super
 class. final so to say only works in one direction.
Sure, the method will be final, but it will still be virtual. The way interfaces work is by basically giving you a slice of the vtable.
 Then the compiler can optimize calls, if they are statically known to be
 final. If not, it still has to do a vtable lookup on a method call, even if
 the actually called method is final.
The compiler can't optimize calls on interface references away. The function that's using the interface reference only knows as much as the interface tells it. If some class implements the interface and marks its implementation of the interface as final, it doesn't matter, since the method is not marked final in the interface (and can't be!).

Okay, so *if* the compiler inlined the call to the function that took the interface reference, *and* it was smart enough to recognize that that interface reference did not escape, *and* it was smart enough to realize that the interface really pointed to a class, *and* it knew that the implementation of the method was final, it could inline it. But that seems like an incredibly smart compiler and an incredibly rare situation.

I also don't believe in relying on optimizations that are not enforced, as it makes for nonportable code.
Feb 02 2009
parent reply grauzone <none example.net> writes:
I agree. Of course using an interface to call a method always requires a 
virtual method call. It's even slower than a virtual method call, 
because it needs to convert the interface reference into an object 
reference.

But he still could call the method in question directly. Implementing an 
interface can be useful to enforce a contract. You can't do that with 
structs.

Code compiled in debug mode (or was it not-release mode) also calls the 
code to check the invariant, even if you didn't define one. I guess this 
can make calling struct methods much faster than object methods.
Feb 02 2009
parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Mon, Feb 2, 2009 at 4:55 PM, grauzone <none example.net> wrote:
 I agree. Of course using an interface to call a method always requires a
 virtual method call. It's even slower than a virtual method call, because it
 needs to convert the interface reference into an object reference.

 But he still could call the method in question directly. Implementing an
 interface can be useful to enforce a contract. You can't do that with
 structs.
What's the point of implementing an interface unless you plan on passing instances of that class to something that expects an interface reference? ;)
 Code compiled in debug mode (or was it not-release mode) also calls the code
 to check the invariant, even if you didn't define one. I guess this can make
 calling struct methods much faster than object methods.
Invariants (as well as in/out contracts and assertions) are turned off in release mode. FWIW, struct methods also do an "assert(this !is null);" in debug mode, so they're sort of doing an invariant check. But struct methods are never virtual, so yes, they will in general be faster.
Feb 02 2009
prev sibling parent reply Lars Kyllingstad <public kyllingen.NOSPAMnet> writes:
Jarrett Billingsley wrote:
 On Mon, Feb 2, 2009 at 1:27 PM, grauzone <none example.net> wrote:
 Why not use scope to allocate the class on the stack?
 For everything else, I agree with Donald Knuth (if he really said that...)
That's fine too, and would fit in with his needs to implement interfaces. But again, if he's worried about caching some parameters but not worried about the overhead of virtual calls.. something's off.
You're assuming too much programming knowledge and carelessness on my part. I merely wanted to know if the second solution would be significantly slower than the first one. Caching of the parameters would be a bonus, as would caching of additional output and the ability to use interfaces. -Lars
Feb 02 2009
parent bearophile <bearophileHUGS lycos.com> writes:
Lars Kyllingstad:
 I merely wanted to know if the second solution would be significantly slower
than the first one.<
No amount of theory can replace actual timings of your code snippets :-) (It's often true the other way too, practice doesn't replace theory. But here there isn't too much theory, so a lot of practice suffices if you don't know the theory). Bye, bearophile
Feb 02 2009
prev sibling parent reply Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:
Lars Kyllingstad wrote:
 [snip]
If I understand right that your main concern is with parameters that are used over and over and over again -- which I can empathize with -- you could also look into function currying. Assuming you are using Phobos, the module you want to look at is std.bind, usage of which is pretty straightforward. Given a function:

     real pow (real base, real exp);

You could emulate a square() function via std.bind like so:

     square = bind(&pow, _0, 2.0);
     square(42.0); // same as: pow(42.0, 2.0)

If you are using Tango, I'm honestly not sure off the top of my head what the relevant module is, but you could always install Tangobos and use std.bind just fine.

All that being said, I have no experience with currying functions with inout parameters. If my understanding of how std.bind works its magic is right, it should be fine. I believe it wraps the call up in a structure, which means the actual parameter will be from a field of said structure... which, actually, means it could also store state. That in itself could be an interesting capability.

-- Chris Nicholson-Sauls
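For reference, a self-contained version of that snippet might look roughly like this; it assumes D1 Phobos' std.bind with the _0 placeholder, as used above, and defines its own power function so it doesn't depend on std.math:

     import std.bind;

     real power(real base, real exp)
     {
         // stand-in implementation, only meaningful for whole-number exponents
         real r = 1.0;
         for (int i = 0; i < cast(int) exp; i++)
             r *= base;
         return r;
     }

     void main()
     {
         // bind the second argument to 2.0 and leave the first open (_0)
         auto square = bind(&power, _0, 2.0);
         real y = square(42.0);   // same as power(42.0, 2.0)
     }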
Feb 02 2009
parent reply Lars Kyllingstad <public kyllingen.NOSPAMnet> writes:
Chris Nicholson-Sauls wrote:
 Lars Kyllingstad wrote:
 [snip]
If I understand right that your main concern is with parameters that are used over and over and over again -- which I can empathize with -- you could also look into function currying. Assuming you are using Phobos, the module you want to look at is std.bind, usage of which is pretty straightforward. Given a function: real pow (real base, real exp); You could emulate a square() function via std.bind like so: square = bind(&pow, _0, 2.0); square(42.0); // same as: pow(42.0, 2.0) If you are using Tango, I'm honestly not sure off the top of my head what the relevant module is, but you could always install Tangobos and use std.bind just fine. All that being said, I have no experience with currying functions with inout parameters. If my understanding of how std.bind works its magic is right, it should be fine. I believe it wraps the call up in a structure, which means the actual parameter will be from a field of said structure... which, actually, means it could also store state. That in itself could be an interesting capability. -- Chris Nicholson-Sauls
Most of the time I use Tango, but in this particular case I don't want my code to depend on either library. Also, I'm not sure whether the std.bind functionality is even present in Tango. I could always write my.own.bind, though.

Your solution is nice from a usability perspective, in that it reuses function arguments -- possibly even inout ones. From a performance perspective, however, it carries with it the overhead of an extra function call, which I'm not sure I want.

-Lars
Feb 02 2009
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Lars Kyllingstad wrote:
 [snip]
 From a performance
 perspective, however, it carries with it the overhead of an extra
 function call, which I'm not sure I want.
 
 -Lars
You're worried about a second function call which could potentially be inlined, yet you're seemingly not worried about the overhead of virtual calls or heap allocations... Allow me to quote Donald Knuth:
 We should forget about small efficiencies, say about 97% of the time:
 premature optimization is the root of all evil.
Unless you're doing something where you *know* you're going to need every last cycle, just go with whichever design works best. Your response to Jarrett implies that you've already got a design in mind, and are just fishing for a magic "make it go faster" button. Believe me, if Walter had invented such a thing, he wouldn't be wasting his time putting up with us; he'd be too busy smoking $100 bills from the comfort of his SPACE FORTRESS. :D

In any case, I'm willing to bet that if there *are* inefficiencies you're not going to know exactly where until you've written the code, anyway. :P

If classes work, and make for an elegant design, go for it.

-- Daniel
Feb 02 2009
parent reply Lars Kyllingstad <public kyllingen.NOSPAMnet> writes:
Daniel Keep wrote:
 
 Lars Kyllingstad wrote:
 [snip]
 From a performance
 perspective, however, it carries with it the overhead of an extra
 function call, which I'm not sure I want.

 -Lars
You're worried about a second function call which could potentially be inlined, yet you're seemingly not worried about the overhead of virtual calls or heap allocations...
But that's the problem, you see. I don't know how expensive these operations are, hence my initial question(s). (This was also why I posted my question in D.learn.)

For instance, I didn't know (not sure I still do) what the cost is of frequent allocation/deallocation/access of stack memory vs. infrequent allocation/deallocation and frequent access of heap memory. From the replies I've got, it seems heap variables make for significantly slower code. Nor was I sure, as you pointed out, how expensive a virtual function call is vs. an extra non-virtual function call.

I'm a physicist, not a computer scientist. :)
 Allow me to quote Donald Knuth:
 
 We should forget about small efficiencies, say about 97% of the time:
 premature optimization is the root of all evil.
Unless you're doing something where you *know* you're going to need every last cycle, just go with whichever design works best. Your response to Jarrett implies that you've already got a design in mind, and are just fishing for a magic "make it go faster button."
I want that button, yes. :) But seriously, I am doing numerical computations, so performance is absolutely an issue. The main thing I wanted to know was: can I have both performance and usability, or do I have to choose? With Jarrett's suggestion I can, to some degree, have both.
 Believe me, if Walter had invented such a thing, he wouldn't be wasting
 his time putting up with us; he'd be too busy smoking $100 bills from
 the comfort of his SPACE FORTRESS.  :D
What are you implying, that he wouldn't make it open-source? :)
 In any case, I'm willing to bet that if there *are* inefficiencies
 you're not going to know exactly where until you've written the code,
 anyway.  :P
 
 If classes work, and make for an elegant design, go for it.
 
   -- Daniel
Feb 02 2009
parent reply Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:
Lars Kyllingstad wrote:
 Daniel Keep wrote:
 Lars Kyllingstad wrote:
 [snip]
 From a performance
 perspective, however, it carries with it the overhead of an extra
 function call, which I'm not sure I want.

 -Lars
You're worried about a second function call which could potentially be inlined, yet you're seemingly not worried about the overhead of virtual calls or heap allocations...
But that's the problem, you see. I don't know how expensive these operations are, hence my initial question(s). (This was also why I posted my question in D.learn.) For instance, I didn't know (not sure I still do) what the cost is of frequent allocation/deallocation/access of stack memory vs. infrequent allocation/deallocation and frequent access of heap memory. From the replies I've got, it seems heap variables make for significantly slower code.
Allocating stack memory is very cheap, because essentially the only thing that has to be done is to offset a stack pointer. Some stack variables are even optimized away if only used as temporaries (that is, their value is retained in a register until it isn't needed) and for short durations.

Allocating heap memory, on the other hand, is expensive for two reasons. The first is that the heap may have to grow, which means negotiating more memory from the operating system, which means switching the CPU back and forth between modes, sometimes for several iterations. Of course, this doesn't happen on every allocation, or even very often if you're careful. The second is that before every allocation the garbage collector will perform a collection run. This can actually be disabled (at least in theory) if you plan on doing several allocations in a short period of time, and thereafter re-enabled. For the latter case, see Phobos 'std.gc' or Tango 'tango.core.Memory'.

Once you have memory allocated, the cost of access is generally about the same, except that the stack is more likely to be cached by the CPU (since it is inevitably accessed often).
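For reference, the disable/re-enable pattern might look roughly like this with Phobos (D1) std.gc; the loop is just a stand-in for an allocation-heavy burst:

     import std.gc;

     void main()
     {
         std.gc.disable();            // postpone collections during the burst
         real[][] rows;
         for (int i = 0; i < 1000; i++)
             rows ~= new real[100];   // many small heap allocations
         std.gc.enable();             // allow collections again
         std.gc.fullCollect();        // optionally collect once at a convenient time
     }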
 Nor was I sure, as you pointed out, how expensive a virtual function 
 call is vs. an extra non-virtual function call.
It adds an additional step. You start with an index into the object's vtable (a list of pointers) rather than the function's actual address. It's essentially the same as the difference between assigning to an 'int**' versus an 'int*'.
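As a tiny illustration of that analogy:

     void main()
     {
         int   x  = 42;
         int*  p  = &x;    // one indirection: like calling a known function address
         int** pp = &p;    // two indirections: like looking the address up in a vtable first

         **pp = 7;         // the extra step: load p from pp, then reach x
     }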
 I'm a physicist, not a computer scientist. :)
 
Which is a good thing, since D could use more experience from non-programmers who need to program. That's a demographic that occasionally (but never completely!) gets forgotten. I'm not exactly a thirty-years guru, myself. -- Chris Nicholson-Sauls
Feb 03 2009
next sibling parent Lars Kyllingstad <public kyllingen.NOSPAMnet> writes:
Chris Nicholson-Sauls wrote:
 [snip]
Thank you for a very informative reply. :) -Lars
Feb 03 2009
prev sibling parent reply Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Tue, Feb 3, 2009 at 3:44 PM, Chris Nicholson-Sauls
<ibisbasenji gmail.com> wrote:
 The
 second reason, is that before every allocation the garbage collector will
 perform a collection run.  This can actually be disabled (at least in
 theory) if you plan on doing several allocations in a short period of time,
 and thereafter re-enabled.
It should be "before every allocation the garbage collector *may* perform a collection run." If it collected on every allocation it would make your program's execution speed next to useless ;)
Feb 03 2009
parent Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:
Jarrett Billingsley wrote:
 On Tue, Feb 3, 2009 at 3:44 PM, Chris Nicholson-Sauls
 <ibisbasenji gmail.com> wrote:
 The
 second reason, is that before every allocation the garbage collector will
 perform a collection run.  This can actually be disabled (at least in
 theory) if you plan on doing several allocations in a short period of time,
 and thereafter re-enabled.
It should be "before every allocation the garbage collector *may* perform a collection run." If it collected on every allocation it would make your program's execution speed next to useless ;)
Well okay, yes, it *may*. I was in a hurry and trying to be general. ;) Chances are, though, that if you are doing so many allocations in a short period as to be worried about it, it probably will. If I remember right, the current GC runs a collection just before requesting more heap, so it's actually related to the first issue. (I may well remember wrong, it's been a very long time since I dove into the GC code.) -- Chris Nicholson-Sauls
Feb 03 2009