digitalmars.D - [Optimization] Speculatively not calling invariant on class objects

reply "Iain Buclaw" <ibuclaw gdcproject.org> writes:
This post got me thinking: 
http://forum.dlang.org/post/mpo71n$22ma$1 digitalmars.com

We know at compile time, for a given class, whether or not it 
has any invariants. Since invariants are not polymorphic and are 
disallowed in interfaces, this means that for the given:

   class NoInvariants { }
   NoInvariants obj;
   assert(obj);

It's only a case of checking each base class for any invariant 
functions, and if none are found, then we can make an (almost) 
reasonable assumption that calling _d_invariant will result in 
nothing but wasted cycles.

However, these calls can't be omitted completely at compile-time, 
given that we can't guarantee that the object isn't really an 
instance of a derived class that has an invariant.

But we should be able to speculatively test at runtime whether or 
not a call to _d_invariant may be required by doing a simple 
pointer test on the classinfo.

So, given a scenario where we *know* that in a given method 
'func', the class of the 'this' object, NoInvariants, provably has 
no invariants anywhere in its vtable, we can turn calls to 
_d_invariant into:

   void func(NoInvariants this)
   {
     if (typeid(this) == typeid(NoInvariants))
     {
       /* Nothing */
     }
     else
     {
       _d_invariant(this);
     }
   }
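
The same test can be tried from user code. This is only a minimal sketch and not the GDC patch itself: canSkipInvariantWalk, NoInv and WithInv are hypothetical names, the object is assumed to be non-null, and the compiler would emit the comparison inline rather than call a helper.

```d
import std.stdio : writeln;

class NoInv
{
    int calls;
    void func() { ++calls; }
}

// A derived class that adds an invariant; a NoInv reference may point at one.
class WithInv : NoInv
{
    invariant() { assert(calls >= 0); }
}

// Hypothetical helper: true when the dynamic type is exactly NoInv, which
// (like Object) declares no invariant, so the runtime walk has nothing to do.
bool canSkipInvariantWalk(NoInv obj)
{
    // `is` compares the two ClassInfo references by identity (a pointer test).
    return typeid(obj) is typeid(NoInv);
}

void main()
{
    NoInv plain = new NoInv();
    NoInv derived = new WithInv();
    writeln(canSkipInvariantWalk(plain));
    writeln(canSkipInvariantWalk(derived));
}
```

The helper accepts only the exact type, so any derived class falls back to the full _d_invariant call even when it adds no invariant of its own. That keeps the check to a single pointer comparison.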

A similar tactic is done in C++ when it comes to speculative 
de-virtualization. [1]

Giving this a try on some very contrived benchmarks:

   void test()
   {
       NoInv obj = new NoInv();
       obj.func();
   }
   auto bench = benchmark!(test)(10_000_000);
   writeln("Total time: ", to!Duration(bench[0]));
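
For anyone wanting to reproduce this, a self-contained version of the harness might look like the sketch below. It assumes the newer std.datetime.stopwatch.benchmark, which returns Duration directly, so the to!Duration conversion above is not needed. The body of NoInv, the sink variable and runBench are assumptions added for illustration and are not the code that was actually measured.

```d
import core.time : Duration;
import std.datetime.stopwatch : benchmark;
import std.stdio : writeln;

class NoInv
{
    int counter;
    // Public method: per the discussion, non-release builds wrap it in
    // invariant checks even though the class declares no invariant.
    void func() { ++counter; }
}

// Global sink so the optimizer cannot discard the call as dead code.
__gshared int sink;

void test()
{
    NoInv obj = new NoInv();
    obj.func();
    sink += obj.counter;
}

// Hypothetical wrapper so the iteration count can be varied.
Duration runBench(uint iterations)
{
    auto bench = benchmark!(test)(iterations);
    return bench[0];
}

void main()
{
    writeln("Total time: ", runBench(10_000_000));
}
```

Each iteration still allocates, so the timing includes the cost of the GC allocation as well as the invariant check.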


I found that the patched codegen actually managed to consistently 
squeeze out an extra 2% or more in runtime performance over just 
turning off invariants, and in tests where the check was made to 
fail, it was pretty much penalty-free in comparison to always 
calling _d_invariant.

always_inv(-O2 w/o patch):
- Total time: 592 ms, 430 μs, and 6 hnsecs

always_inv(final, -O2 w/o patch):
- Total time: 572 ms, 495 μs, and 1 hnsec

no_inv(-O2 -fno-invariants):
- Total time: 526 ms, 696 μs, and 3 hnsecs

no_inv(final, -O2 -fno-invariants):
- Total time: 514 ms, 477 μs, and 3 hnsecs

spec_inv(-O2 w/ patch):
- Total time: 513 ms, 90 μs, and 6 hnsecs

spec_inv(final, -O2 w/ patch):
- Total time: 503 ms, 343 μs, and 9 hnsecs

This surprised me. I would have thought that both no_inv and 
spec_inv would be the same, but then again maybe I'm just no good 
at writing tests (very likely).

I'm raising a PR [2]. Provided that no one can see a hole in my 
thought process, I'd be looking to get it merged in and let 
people try it out to see if they get a similar improvement for 
general applications in non-release builds.


Regards
Iain


[1]: 
http://hubicka.blogspot.de/2014/02/devirtualization-in-c-part-4-analyzing.html
[2]: https://github.com/D-Programming-GDC/GDC/pull/132
Aug 12 2015
next sibling parent reply "Kagamin" <spam here.lot> writes:
Remove allocation?
Aug 12 2015
parent reply "Iain Buclaw" <ibuclaw gdcproject.org> writes:
On Wednesday, 12 August 2015 at 12:48:53 UTC, Kagamin wrote:
 Remove allocation?
My test isn't good enough for that. With scoped classes, the backend can actually devirtualize, inline and DCE pretty much everything except the one side effect I added. This is because it knows how the vtable is initialized, unlike with GC'd classes with _d_newclass.

There is another optimization opportunity here...

Regards
Iain
Aug 12 2015
parent "Kagamin" <spam here.lot> writes:
static __gshared NoInv obj;

   void test()
   {
       obj.func();
   }

   obj = new NoInv();
   auto bench = benchmark!(test)(10_000_000);
   writeln("Total time: ", to!Duration(bench[0]));
Aug 12 2015
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 8/12/15 8:22 AM, Iain Buclaw wrote:
 This post got me thinking:
 http://forum.dlang.org/post/mpo71n$22ma$1 digitalmars.com

 We know at compile time for a given object whether or not there are any
 invariants, lack of any polymorphism, along with disallowing invariants
 in interfaces means that for the given:

    class NoInvariants { }
    NoInvariants obj;
    assert(obj);

 It's only a case of checking each base class for any invariant
 functions, and if none are found, then we can make an (almost)
 reasonable assumption that calling _d_invariant will result in nothing
 but wasted cycles.

 However, these can't be omitted completely at compile-time given that we
 can't guarantee if there are any up-cast classes that have an invariant.

 But we should be able to speculatively test at runtime whether or not a
 call to _d_invariant may be required by doing a simple pointer test on
 the classinfo.
My thought was that you could just set the default invariant pointer to null. Then when you load the invariant function to call, if it's null, don't call it.

You could probably get rid of calls to _d_invariant by just calling the invariant directly, no?

-Steve
Aug 13 2015
parent reply Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 13 August 2015 at 19:12, Steven Schveighoffer via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 On 8/12/15 8:22 AM, Iain Buclaw wrote:

 This post got me thinking:
 http://forum.dlang.org/post/mpo71n$22ma$1 digitalmars.com

 We know at compile time for a given object whether or not there are any
 invariants, lack of any polymorphism, along with disallowing invariants
 in interfaces means that for the given:

    class NoInvariants { }
    NoInvariants obj;
    assert(obj);

 It's only a case of checking each base class for any invariant
 functions, and if none are found, then we can make an (almost)
 reasonable assumption that calling _d_invariant will result in nothing
 but wasted cycles.

 However, these can't be omitted completely at compile-time given that we
 can't guarantee if there are any up-cast classes that have an invariant.

 But we should be able to speculatively test at runtime whether or not a
 call to _d_invariant may be required by doing a simple pointer test on
 the classinfo.
 My thought was that you could just set the default invariant pointer to
 null. Then when you load the invariant function to call, if it's null,
 don't call it.

That is what's done at compile time with structs.

 You could probably get rid of calls to _d_invariant by just calling the
 invariant directly, no?

 -Steve

Not with classes, because you need to walk over all interfaces in the vtable, which more likely than not is unknown at compile-time.

Regards
Iain.
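
The kind of runtime walk being described can be approximated from user code. The sketch below assumes that druntime's TypeInfo_Class exposes the base and classInvariant fields and that classInvariant is null for a class declaring no invariant in a build with invariants enabled. hasInvariantInChain, Plain, A and B are hypothetical names, and the function only mimics the idea behind _d_invariant; it is not the runtime's implementation.

```d
import std.stdio : writeln;

class Plain { }

class A
{
    int x;
    invariant() { assert(x >= 0); }
}

class B : A { }   // declares no invariant itself, but inherits A's obligation

// Hypothetical helper: walks from the dynamic type up to Object and reports
// whether any class along the way has an invariant function registered.
bool hasInvariantInChain(Object o)
{
    for (TypeInfo_Class c = typeid(o); c !is null; c = c.base)
    {
        if (c.classInvariant !is null)
            return true;
    }
    return false;
}

void main()
{
    writeln(hasInvariantInChain(new Plain()));
    writeln(hasInvariantInChain(new A()));
    writeln(hasInvariantInChain(new B()));
}
```

The starting point of the walk is the dynamic type, which is why the result cannot in general be computed from the static type of a reference.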
Aug 13 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 8/13/15 1:39 PM, Iain Buclaw via Digitalmars-d wrote:
 On 13 August 2015 at 19:12, Steven Schveighoffer via Digitalmars-d
     You could probably get rid of calls to _d_invariant by just calling
     the invariant directly, no?



 Not with classes, because you need to walk over all interfaces in the
 vtable, which more likely than not is unknown at compile-time.
I guess my understanding of the vtable population isn't very good. I thought there was one invariant entry, period. I don't understand why you'd have multiple invariants in an object that you have to cycle through. Why wouldn't the fully derived object know how to call them (from one entry point)? Surely, it knows the interfaces it uses.

I thought invariant was like ctor/dtor, the most derived automatically calls the base version.

-Steve
Aug 13 2015
parent reply Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 13 August 2015 at 20:03, Steven Schveighoffer via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 On 8/13/15 1:39 PM, Iain Buclaw via Digitalmars-d wrote:

 On 13 August 2015 at 19:12, Steven Schveighoffer via Digitalmars-d
You could probably get rid of calls to _d_invariant by just calling
     the invariant directly, no?



 Not with classes, because you need to walk over all interfaces in the
 vtable, which more likely than not is unknown at compile-time.
 I guess my understanding of the vtable population isn't very good. I
 thought there was one invariant entry, period. I don't understand why
 you'd have multiple invariants in an object that you have to cycle
 through, why wouldn't the fully derived object know how to call them
 (from one entry point)? Surely, it knows the interfaces it uses.

   class A { invariant { } }
   class B : A { }
   class C : B { invariant { } }

   B b = new C();  // We can only discover that 'b' is a C object at runtime.

 I thought invariant was like ctor/dtor, the most derived automatically
 calls the base version.

Nope, it only calls its own invariants. Calling all derived invariants is what _d_invariant is for.
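
A small experiment can make the A/B/C case observable. This is a sketch under assumptions: the counters, touch() and checksForOneCall are hypothetical additions, and the build must have invariants enabled (no -release).

```d
import std.stdio : writeln;

// Counters live at module scope because an invariant cannot modify the
// fields of the object it is checking.
__gshared int aChecks, cChecks;

class A
{
    int touched;
    invariant() { ++aChecks; }
    void touch() { ++touched; }   // public, so invariant checks surround it
}

class B : A { }

class C : B
{
    invariant() { ++cChecks; }
}

// Hypothetical demo: counts how often each invariant runs for one call made
// through a reference whose static type (B) declares no invariant.
int[2] checksForOneCall()
{
    B b = new C();
    aChecks = 0;
    cChecks = 0;                  // ignore any checks run during construction
    b.touch();
    return [aChecks, cChecks];
}

void main()
{
    auto r = checksForOneCall();
    writeln("A checks: ", r[0], ", C checks: ", r[1]);
}
```

Both counters should come out non-zero, even though touch() is defined in A and called through a B reference.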
Aug 13 2015
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 8/13/15 2:25 PM, Iain Buclaw via Digitalmars-d wrote:

 class A { invariant { } }
 class B : A { }
 class C : B { invariant { } }

 B b = new C();  // We can only discover that 'b' is a C object at runtime.

     I thought invariant was like ctor/dtor, the most derived
     automatically calls the base version.


 Nope, it only calls its own invariants.  Calling all derived invariants
 is what _d_invariant is for.

I envisioned C.invariant would inject a call to A.invariant, and that invariant would occupy a vtable slot.

-Steve
Aug 13 2015