digitalmars.D - emplace, scope, enforce [Was: Re: Manual...]

bearophile (180/183) Jul 20 2010 Thank you, I have used this, and later I have done few tests too.

Rory McGuire (3/214) Jul 21 2010 Takes 18m27.720s in PHP :)

Rory Mcguire (29/314) Jul 21 2010 added

bearophile (6/8) Jul 21 2010 (I suggest you to round away milliseconds, they are never significant in...

Rory Mcguire (7/18) Jul 21 2010 On ubuntu 10.04 64 I'm using `time` to get the timing.

Rory Mcguire (3/26) Jul 21 2010 Perhaps the slow times are because I'm reporting the real timings not th...

bearophile (27/29) Jul 21 2010 This was Andrei's code before dsource went down (ddoc and unittest remov...

Dmitry Olshansky (21/211) Jul 21 2010 Well, I'm using this for structs, very straightforward:

bearophile (6/15) Jul 21 2010 If the class is allocated on the stack it's much better if the destructo...

Dmitry Olshansky (56/74) Jul 21 2010 Uh, yes I guess I should have read your post to the end :). Stack

bearophile (9/13) Jul 21 2010 If you really want a class to be used as scope only you can do this, see...

Dmitry Olshansky (40/58) Jul 21 2010 Going further with library implementation as opposed to language

Andrei Alexandrescu (13/80) Jul 21 2010 Nice work. To avoid name clashes with alias this, you may want to use a

Andrei Alexandrescu (5/70) Jul 21 2010 s/Test/T/ I suppose.

Rory Mcguire (11/24) Jul 21 2010 With your code I `time` reports the below timings on my machine:

Andrei Alexandrescu (56/80) Jul 21 2010 I compiled and ran the tests myself with -O -release -inline and got

Rory Mcguire (2/95) Jul 21 2010 Thanks Andrei!!!

Andrei Alexandrescu (4/6) Jul 21 2010 Don't mention it. Thanks for not over-quoting in the future :o).

Rory Mcguire (19/21) Jul 21 2010 My timings with your code, I included the one without -inline because I

Dmitry Olshansky (43/69) Jul 21 2010 Thanks for kind feedback (and showing some optimization tricks). Also

Andrei Alexandrescu (10/17) Jul 21 2010 [snip]

Philippe Sigaud (11/18) Jul 22 2010 Hmm, so the struct Scoped is implicitly parametrized by T and Args... Co...

Andrei Alexandrescu (5/35) Jul 22 2010 Great point. I simplified the implementation:

Dmitry Olshansky (7/24) Jul 22 2010 A cool idiom indeed!

Andrei Alexandrescu (5/32) Jul 22 2010 It doesn't, but just because it uses some heavy-handed tricks such as

bearophile <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:

 emplace(), defined in std.conv, is relatively new. I haven't yet added
 emplace() for class objects, and this is as good an opportunity as any:
 http://www.dsource.org/projects/phobos/changeset/1752

Thank you, I have used this, and later I have done a few tests too.

The "scope" for class instantiations can be deprecated once there is an
acceptable alternative. You can't deprecate features before you have found a
good enough alternative.

---------------------

A first problem is the syntax, to allocate an object on the stack you need
something like:

// is testbuf correctly aligned?
ubyte[__traits(classInstanceSize, Test)] testbuf = void;
Test t = emplace!(Test)(cast(void[])testbuf, arg1, arg2);


That looks much worse, and is hairier and more error-prone, than:
scope Test t = new Test(arg1, arg2);


I have tried to build a helper to improve the situation, something that
looks like:
Test t = StackAlloc!(Test, arg1, arg2);

But failing that, my second try was this, not good enough:
mixin(stackAlloc!(Test, Test)("t", "arg1, arg2"));
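
For readers on a current compiler: Phobos now ships a helper with roughly the one-line syntax wished for here, std.typecons.scoped. A minimal sketch, assuming a recent dmd and Phobos (sumOnStack and the static counter are illustrative names only):

```d
import std.typecons : scoped;

final class Test {
    static int destroyed;   // illustrative counter of destructor runs
    int x, y;
    this(int xx, int yy) {
        this.x = xx;
        this.y = yy;
    }
    ~this() { ++destroyed; }
}

// The instance lives inside the value returned by scoped!Test, that is,
// in this function's stack frame; its destructor runs when t goes out of scope.
int sumOnStack(int a, int b) {
    auto t = scoped!Test(a, b);
    return t.x + t.y;
}

void main() {
    const before = Test.destroyed;
    assert(sumOnStack(1, 2) == 3);
    assert(Test.destroyed == before + 1);
}
```

Like the emplace version, this does not stop a reference from escaping; it only tidies the syntax and runs the destructor.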

---------------------

A second problem is that this program compiles with no errors:

import std.conv: emplace;

final class Test {
    int x, y;
    this(int xx, int yy) {
        this.x = xx;
        this.y = yy;
    }
}

Test foo(int x, int y) {
    ubyte[__traits(classInstanceSize, Test)] testbuf = void;
    Test t = emplace!(Test)(cast(void[])testbuf, x, y);
    return t;
}

void main() {
    foo(1, 2);
}



While the following one gives:
test.d(13): Error: escaping reference to scope local t


import std.conv: emplace;

final class Test {
    int x, y;
    this(int xx, int yy) {
        this.x = xx;
        this.y = yy;
    }
}

Test foo(int x, int y) {
    scope t = new Test(x, y);
    return t;
}

void main() {
    foo(1, 2);
}


So the compiler is aware that the scoped object can't escape, while using
emplace things become more bug-prone. "scope" can cause other bugs; some time ago I
filed a bug report about one problem, but it avoids the most common bug.
(I am not sure the emplace solves that problem with scope, I think it shares
the same problem, plus adds new ones).
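
One way to keep the emplace version out of trouble is to let the caller own the storage, so the reference never outlives its buffer. A sketch under that assumption, written for a current compiler (core.lifetime.emplace; TestBuf and makeIn are names made up for the example). The compiler still does not check any of this:

```d
import core.lifetime : emplace;

final class Test {
    int x, y;
    this(int xx, int yy) {
        this.x = xx;
        this.y = yy;
    }
}

// Pointer-aligned storage, rounded up so one Test instance fits.
alias TestBuf = size_t[(__traits(classInstanceSize, Test) + size_t.sizeof - 1) / size_t.sizeof];

// The caller passes the buffer in, so the returned reference stays valid
// for as long as the caller's buffer is alive.
Test makeIn(ref TestBuf buf, int x, int y) {
    return emplace!Test(cast(void[]) buf[], x, y);
}

void main() {
    TestBuf buf;
    Test t = makeIn(buf, 1, 2);
    assert(t.x == 1 && t.y == 2);
}
```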

---------------------

A third problem is that the ctor doesn't get called:


import std.conv: emplace;
import std.c.stdio: puts;

final class Test {
    this() {
    }
    ~this() { puts("killed"); }
}

void main() {
    ubyte[__traits(classInstanceSize, Test)] testbuf = void;
    Test t = emplace!(Test)(cast(void[])testbuf);
}


That prints nothing. Using scope it gets called (even if it's not present!).
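
With emplace the destructor has to be run by hand. A minimal sketch for a current compiler, where destroy() has taken over the role clear() had in 2010 (TestBuf, useOnStack and the counter are illustrative names):

```d
import core.lifetime : emplace;

final class Test {
    static int killed;   // illustrative counter of destructor runs
    ~this() { ++killed; }
}

// Pointer-aligned stack storage big enough for one Test.
alias TestBuf = size_t[(__traits(classInstanceSize, Test) + size_t.sizeof - 1) / size_t.sizeof];

void useOnStack() {
    TestBuf buf;
    Test t = emplace!Test(cast(void[]) buf[]);
    scope (exit) destroy(t);   // runs ~this() when the scope ends
    // ... use t here ...
}

void main() {
    const before = Test.killed;
    useOnStack();
    assert(Test.killed == before + 1);
}
```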

---------------------

I have also done a few performance tests. What follows is not a problem of
emplace(); it's a problem of the dmd optimizer. I have used this basic
pseudocode:

while (i < Max)
{
   create testObject(i, i, i, i, i, i)
   testObject.doSomething(i, i, i, i, i, i)
   testObject.doSomething(i, i, i, i, i, i)
   testObject.doSomething(i, i, i, i, i, i)
   testObject.doSomething(i, i, i, i, i, i)
   destroy testObject
   i++
}


Coming from here:
http://www.drdobbs.com/java/184401976
And its old timings:
http://www.ddj.com/java/184401976?pgno=9


The Java version of the code is simple:

final class Obj {
    int i1, i2, i3, i4, i5, i6;

    Obj(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
        this.i1 = ii1;
        this.i2 = ii2;
        this.i3 = ii3;
        this.i4 = ii4;
        this.i5 = ii5;
        this.i6 = ii6;
    }

    void doSomething(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
    }
}

class Test {
    public static void main(String args[]) {
        final int N = 100_000_000;
        int i = 0;
        while (i < N) {
            Obj testObject = new Obj(i, i, i, i, i, i);
            testObject.doSomething(i, i, i, i, i, i);
            testObject.doSomething(i, i, i, i, i, i);
            testObject.doSomething(i, i, i, i, i, i);
            testObject.doSomething(i, i, i, i, i, i);
            // testObject = null; // makes no difference
            i++;
        }
    }
}



This is a D version that uses emplace() (if you don't use emplace here the
performance of the D code is very bad compared to the Java one):


import std.conv: emplace;

final class Test { // 32 bytes each instance
    int i1, i2, i3, i4, i5, i6;
    this(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
        this.i1 = ii1;
        this.i2 = ii2;
        this.i3 = ii3;
        this.i4 = ii4;
        this.i5 = ii5;
        this.i6 = ii6;
    }
    void doSomething(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
    }
}

void main() {
    enum int N = 100_000_000;

    int i;
    while (i < N) {
        ubyte[__traits(classInstanceSize, Test)] buf = void;
        Test testObject = emplace!(Test)(cast(void[])buf, i, i, i, i, i, i);
        // Test testObject = new Test(i, i, i, i, i, i);
        // scope Test testObject = new Test(i, i, i, i, i, i);        
        testObject.doSomething(i, i, i, i, i, i);
        testObject.doSomething(i, i, i, i, i, i);
        testObject.doSomething(i, i, i, i, i, i);
        testObject.doSomething(i, i, i, i, i, i);
        testObject = null;
        i++;
    }
}


The Java code (server) runs in about 0.25 seconds here.
The D code (that doesn't do heap allocations at all) runs in about 3.60 seconds.

With a few experiments I have seen that emplace() doesn't get inlined, and
the cause is that it contains enforce(). enforce contains a throw, and it seems dmd
doesn't inline functions that can throw; you can test it with a little test
program like this:


import std.c.stdlib: atoi;
void foo(int b) {
    if (b)
        throw new Throwable(null);
}
void main() {
    int b = atoi("0");
    foo(b);
}


So if you comment out the two enforce() inside emplace() dmd inlines emplace()
and the running time becomes about 2.30 seconds, less than ten times slower
than Java.
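
A common way around an inliner that gives up on functions containing a throw is to move the throw into a separate function, so the checked function itself holds only a branch and a call. This is a sketch of that idea, not what Phobos does; check and failCheck are hypothetical names, and whether a given dmd version then inlines check has to be confirmed from the generated code:

```d
import std.exception : assertThrown;

// Cold path: the only place that contains a throw statement.
pragma(inline, false)
void failCheck(string msg) {
    throw new Exception(msg);
}

// Hot path: a branch and a call, nothing else.
void check(bool cond, string msg = "check failed") {
    if (!cond)
        failCheck(msg);
}

void main() {
    check(true);                  // passes silently
    assertThrown(check(false));   // the failing case still throws
}
```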

If emplace() doesn't contain calls to enforce() then the loop in main() becomes
(dmd 2.047, optimized build):


L1A:		push	dword ptr 02Ch[ESP]
		mov	EDX,_D10test6_good4Test7__ClassZ[0Ch]
		mov	EAX,_D10test6_good4Test7__ClassZ[08h]
		push	EDX
		push	ESI
		call	near ptr _memcpy
		mov	ECX,03Ch[ESP]
		mov	8[ECX],EBX
		mov	0Ch[ECX],EBX
		mov	010h[ECX],EBX
		mov	014h[ECX],EBX
		mov	018h[ECX],EBX
		mov	01Ch[ECX],EBX
		inc	EBX
		add	ESP,0Ch
		cmp	EBX,05F5E100h
		jb	L1A


(The memcpy is done by emplace to initialize the object before calling its
ctor. You must perform the initialization because it needs the pointer to the
virtual table and monitor. The monitor here was null. I think a future LDC2 can
optimize away more stuff in that loop, so it's not so bad).
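
The two steps described above (copy the initializer image, then call the constructor) can be written out by hand. A sketch for a current compiler, where TypeInfo_Class.initializer is the modern spelling of the .init used in 2010; manualEmplace and TestBuf are illustrative names, and the code skips the size and alignment checks that emplace performs:

```d
final class Test {
    int a = 7;   // this default comes from the initializer image
    int b;
    this(int bb) { this.b = bb; }
}

// Pointer-aligned storage, rounded up so one Test instance fits.
alias TestBuf = size_t[(__traits(classInstanceSize, Test) + size_t.sizeof - 1) / size_t.sizeof];

Test manualEmplace(void[] chunk, int bb) {
    // Step 1: copy the pre-constructor image: vtable pointer, null monitor,
    // and the default values of the fields.
    const(void)[] image = typeid(Test).initializer();
    (cast(ubyte[]) chunk)[0 .. image.length] = (cast(const(ubyte)[]) image)[];
    auto t = cast(Test) chunk.ptr;
    assert(t.a == 7 && t.b == 0);
    // Step 2: run the constructor on the prepared memory.
    t.__ctor(bb);
    return t;
}

void main() {
    TestBuf buf;
    Test t = manualEmplace(cast(void[]) buf[], 42);
    assert(t.a == 7 && t.b == 42);
}
```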



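With scope:
scope Test testObject = new Test(i, i, i, i, i, i);
It runs in about 6 seconds (also because the ctor is called even if it's missing).

With a normal heap allocation (new Test(i, i, i, i, i, i) without scope) it runs in about
27.2 seconds, about 110 times slower than Java.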

Bye,
bearophile

Jul 20 2010

"Rory McGuire" <rmcguire neonova.co.za> writes:

On Wed, 21 Jul 2010 03:58:33 +0200, bearophile <bearophileHUGS lycos.com>  
wrote:

 scope Test testObject = new Test(i, i, i, i, i, i);
 It runs in about 6 seconds (also because the ctor is called even if it's
 missing).


 27.2 seconds, about 110 times slower than Java.

 Bye,
 bearophile

Takes 18m27.720s in PHP :)

Jul 21 2010

Rory Mcguire <rjmcguire gm_no_ail.com> writes:

Rory McGuire wrote:

 
 Takes 18m27.720s in PHP :)

Takes 5m26.776s in Python.
Takes 0m1.008s in Java.

I can't test the D version: I don't have emplace, and dsource is ignoring me.

Jul 21 2010

bearophile <bearophileHUGS lycos.com> writes:

Rory Mcguire:
 Takes 5m26.776s in Python.
 Takes 0m1.008s in Java.

(I suggest you round away the milliseconds; they are never significant in such
benchmarks.)
Python 2.7 uses its GC a bit better, so it can be a bit faster.
Your Java code ran four times slower than on my slow PC; that's a lot. In Java,
did you use the -server switch?

Bye,
bearophile

Jul 21 2010

Rory Mcguire <rjmcguire gm_no_ail.com> writes:

bearophile wrote:

 Rory Mcguire:
 Takes 5m26.776s in Python.
 Takes 0m1.008s in Java.

 
 (I suggest you to round away milliseconds, they are never significant in
 such benchmarks.) Python 2.7 uses its GC a bit better, so it can be a bit
 faster. Your Java code has run four times slower than my slow PC, that's a
 lot. In Java have you used the -server switch?
 
 Bye,
 bearophile

On Ubuntu 10.04 (64-bit) I'm using `time` to get the timings.
I wasn't using -server; with it I get 0m1.047s.

The D version takes 0m8.162s in a 32-bit chroot environment.

The processor is a Core i7 (1.6 GHz * 8), with 6 GB of RAM.

An interesting thing about the Python one is that it used 3 GB of RAM most of the
time.

Jul 21 2010

Rory Mcguire <rjmcguire gm_no_ail.com> writes:

Rory Mcguire wrote:

 On Ubuntu 10.04 (64-bit) I'm using `time` to get the timings.
 I wasn't using -server; with it I get 0m1.047s.
 
 The D version takes 0m8.162s in a 32-bit chroot environment.
 
 The processor is a Core i7 (1.6 GHz * 8), with 6 GB of RAM.
 
 An interesting thing about the Python one is that it used 3 GB of RAM most of the
 time.

Perhaps the slow times are because I'm reporting the real timings not the 
user/sys time.
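
To separate the loop's cost from process startup, which `time` includes in its real figure, the measurement can also be taken inside the program. A sketch using std.datetime.stopwatch from current Phobos; Obj and run are cut-down stand-ins for the benchmark, and StopWatch reports elapsed wall-clock time, not user/sys CPU time:

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

final class Obj {
    int i1;
    this(int a) { i1 = a; }
    int doSomething(int a) { return i1 + a; }
}

// Allocates n objects on the GC heap and returns a checksum, so the
// work in the loop has an observable result.
long run(int n) {
    long sum;
    foreach (i; 0 .. n) {
        auto o = new Obj(i);
        sum += o.doSomething(i);
    }
    return sum;
}

void main() {
    auto sw = StopWatch(AutoStart.yes);
    const sum = run(1_000_000);
    sw.stop();
    writefln("sum=%s elapsed=%s ms", sum, sw.peek.total!"msecs");
}
```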

Jul 21 2010

bearophile <bearophileHUGS lycos.com> writes:

Rory McGuire:
 Takes 18m27.720s in PHP :)

You have a lot of patience :-)


 can't test D version I don't have emplace and dsource is ignoring me.

This was Andrei's code before dsource went down (ddoc and unittest removed):

T emplace(T, Args...)(void[] chunk, Args args) if (is(T == class)) {
    enforce(chunk.length >= __traits(classInstanceSize, T));
    auto a = cast(size_t) chunk.ptr;
    enforce(a % real.alignof == 0);
    auto result = cast(typeof(return)) chunk.ptr;

    // Initialize the object in its pre-ctor state
    (cast(byte[]) chunk)[] = typeid(T).init[];

    // Call the ctor if any
    static if (is(typeof(result.__ctor(args))))
    {
        // T defines a genuine constructor accepting args
        // Go the classic route: write .init first, then call ctor
        result.__ctor(args);
    }
    else
    {
        static assert(args.length == 0 && !is(typeof(&T.__ctor)),
                "Don't know how to initialize an object of type "
                ~ T.stringof ~ " with arguments " ~ Args.stringof);
    }
    return result;
}
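
The two enforce() calls are what answer the earlier "is testbuf correctly aligned?" question at run time. The same test can be made up front. A sketch for a current compiler, using std.traits.classInstanceAlignment where the code above uses real.alignof; fits and the buffer layout are illustrative:

```d
import std.traits : classInstanceAlignment;

final class Test {
    int x, y;
}

// True when chunk is large enough and suitably aligned to hold a T.
bool fits(T)(const(void)[] chunk) if (is(T == class)) {
    return chunk.length >= __traits(classInstanceSize, T)
        && (cast(size_t) chunk.ptr) % classInstanceAlignment!T == 0;
}

void main() {
    enum size = __traits(classInstanceSize, Test);
    // A size_t array is pointer-aligned, unlike a plain ubyte array,
    // whose alignment is only guaranteed to be 1.
    size_t[(size + size_t.sizeof - 1) / size_t.sizeof] buf;
    assert(fits!Test(buf[]));
    // One byte in, the chunk is both misaligned and too short.
    assert(!fits!Test((cast(void[]) buf[])[1 .. $]));
}
```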

Bye,
bearophile

Jul 21 2010

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 21.07.2010 5:58, bearophile wrote:
 Andrei Alexandrescu:

    
 emplace(), defined in std.conv, is relatively new. I haven't yet added
 emplace() for class objects, and this is as good an opportunity as any:
 http://www.dsource.org/projects/phobos/changeset/1752
      

 Thank you, I have used this, and later I have done few tests too.

 The "scope" for class instantiations can be deprecated once there is an
acceptable alternative. You can't deprecate features before you have found a
good enough alternative.

 ---------------------

 A first problem is the syntax, to allocate an object on the stack you need
something like:

 // is testbuf correctly aligned?
 ubyte[__traits(classInstanceSize, Test)] testbuf = void;
 Test t = emplace!(Test)(cast(void[])testbuf, arg1, arg2);


    
 That is too much worse looking, hairy and error prone than:
 scope Test t = new Test(arg1, arg2);


 I have tried to build a helper to improve the situation, like something that
looks:
 Test t = StackAlloc!(Test, arg1, arg2);
    

Well, I'm using this for structs, very straightforward:

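// Imports under the module names Phobos used at the time.
import std.conv: emplace;
import std.c.stdlib: malloc, free;

T* create(T, Args...)(Args args)
if ( !is(T == class) ){
     return emplace!T(malloc(T.sizeof)[0..T.sizeof], args);
}

void destroy(T)(T* ptr) if ( !is(T == class) ){
     assert(ptr);
     // Clear the pointee so its destructor runs; clear(ptr) would only
     // reset the pointer itself and free() would then receive null.
     clear(*ptr);
     free(ptr);
}
//then
auto a =  create!T(params);

I guess one could easily patch it for classes.

 But failing that, my second try was this, not good enough: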
 mixin(stackAlloc!(Test, Test)("t", "arg1, arg2"));

 ---------------------

 A second problem is that this program compiles with no errors:

 import std.conv: emplace;

 final class Test {
      int x, y;
      this(int xx, int yy) {
          this.x = xx;
          this.y = yy;
      }
 }

 Test foo(int x, int y) {
      ubyte[__traits(classInstanceSize, Test)] testbuf = void;
      Test t = emplace!(Test)(cast(void[])testbuf, x, y);
      return t;
 }

 void main() {
      foo(1, 2);
 }
    

This is just a pitfall of any stack allocation, and emplace is, in fact,
about custom allocation, not scoped variables.

 While the following one gives:
 test.d(13): Error: escaping reference to scope local t


 import std.conv: emplace;

 final class Test {
      int x, y;
      this(int xx, int yy) {
          this.x = xx;
          this.y = yy;
      }
 }

 Test foo(int x, int y) {
      scope t = new Test(x, y);
      return t;
 }

 void main() {
      foo(1, 2);
 }


 So the compiler is aware that the scoped object can't escape, while using
emplace things become more bug-prone. "scope" can cause other bugs, time ago I
have filed a bug report about one problem, but it avoids the most common bug.
(I am not sure the emplace solves that problem with scope, I think it shares
the same problem, plus adds new ones).

 ---------------------

 A third problem is that the ctor doesn't get called:


 import std.conv: emplace;
 import std.c.stdio: puts;

 final class Test {
      this() {
      }
      ~this() { puts("killed"); }
 }

 void main() {
      ubyte[__traits(classInstanceSize, Test)] testbuf = void;
      Test t = emplace!(Test)(cast(void[])testbuf);
 }
    

It is the dtor that doesn't get called, and that's because emplace is a library
replacement for placement new (no pun intended).
Sure enough with manual memory management you need to call clear(t) at exit.
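
Patched for classes, the create/destroy pair plus the explicit cleanup could look like this on a current compiler. A sketch only: it uses core.lifetime.emplace and destroy() in place of 2010's clear(), the helper is named dispose to avoid clashing with the built-in destroy, and the live counter is just for illustration:

```d
import core.lifetime : emplace;
import core.stdc.stdlib : free, malloc;

final class Test {
    static int live;   // illustrative: objects constructed and not yet destroyed
    int x;
    this(int x) { this.x = x; ++live; }
    ~this() { --live; }
}

// Construct a class instance in memory taken from the C heap.
T create(T, Args...)(Args args) if (is(T == class)) {
    enum size = __traits(classInstanceSize, T);
    void* p = malloc(size);
    assert(p !is null, "malloc failed");
    return emplace!T(p[0 .. size], args);
}

// Run the destructor, then hand the block back to the C heap.
void dispose(T)(T obj) if (is(T == class)) {
    assert(obj !is null);
    destroy(obj);            // runs ~this(); the memory is still allocated
    free(cast(void*) obj);
}

void main() {
    const before = Test.live;
    auto t = create!Test(5);
    assert(t.x == 5 && Test.live == before + 1);
    dispose(t);
    assert(Test.live == before);
}
```

Memory obtained this way is not scanned by the GC, so such an object should not hold the only reference to GC-allocated data.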

 That prints nothing. Using scope it gets called (even if it's not present!).

 ---------------------

 This is not a problem of emplace(), it's a problem of the dmd optimizer.
 I have done few tests for the performance too. I have used this basic
pseudocode:

 while (i<  Max)
 {
     create testObject(i, i, i, i, i, i)
     testObject.doSomething(i, i, i, i, i, i)
     testObject.doSomething(i, i, i, i, i, i)
     testObject.doSomething(i, i, i, i, i, i)
     testObject.doSomething(i, i, i, i, i, i)
     destroy testObject
     i++
 }


 Coming from here:
 http://www.drdobbs.com/java/184401976
 And its old timings:
 http://www.ddj.com/java/184401976?pgno=9


 The Java version of the code is simple:

 final class Obj {
      int i1, i2, i3, i4, i5, i6;

      Obj(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
          this.i1 = ii1;
          this.i2 = ii2;
          this.i3 = ii3;
          this.i4 = ii4;
          this.i5 = ii5;
          this.i6 = ii6;
      }

      void doSomething(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
      }
 }

 class Test {
      public static void main(String args[]) {
          final int N = 100_000_000;
          int i = 0;
          while (i<  N) {
              Obj testObject = new Obj(i, i, i, i, i, i);
              testObject.doSomething(i, i, i, i, i, i);
              testObject.doSomething(i, i, i, i, i, i);
              testObject.doSomething(i, i, i, i, i, i);
              testObject.doSomething(i, i, i, i, i, i);
              // testObject = null; // makes no difference
              i++;
          }
      }
 }



 This is a D version that uses emplace() (if you don't use emplace here the
performance of the D code is very bad compared to the Java one):


 import std.conv: emplace;

 final class Test { // 32 bytes each instance
      int i1, i2, i3, i4, i5, i6;
      this(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
          this.i1 = ii1;
          this.i2 = ii2;
          this.i3 = ii3;
          this.i4 = ii4;
          this.i5 = ii5;
          this.i6 = ii6;
      }
      void doSomething(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
      }
 }

 void main() {
      enum int N = 100_000_000;

      int i;
      while (i < N) {
          ubyte[__traits(classInstanceSize, Test)] buf = void;
          Test testObject = emplace!(Test)(cast(void[])buf, i, i, i, i, i, i);
          // Test testObject = new Test(i, i, i, i, i, i);
          // scope Test testObject = new Test(i, i, i, i, i, i);
          testObject.doSomething(i, i, i, i, i, i);
          testObject.doSomething(i, i, i, i, i, i);
          testObject.doSomething(i, i, i, i, i, i);
          testObject.doSomething(i, i, i, i, i, i);
          testObject = null;
          i++;
      }
 }


 The Java code (server) runs in about 0.25 seconds here.
 The D code (that doesn't do heap allocations at all) runs in about 3.60 seconds.

 With a bit of experimentation I have seen that emplace() doesn't get inlined, and
the cause is that it contains enforce(). enforce() contains a throw, and it seems dmd
doesn't inline functions that can throw; you can test it with a little test
program like this:


 import std.c.stdlib: atoi;
 void foo(int b) {
      if (b)
          throw new Throwable(null);
 }
 void main() {
      int b = atoi("0");
      foo(b);
 }


 So if you comment out the two enforce() inside emplace() dmd inlines emplace()
and the running time becomes about 2.30 seconds, less than ten times slower
than Java.

 If emplace() doesn't contain calls to enforce() then the loop in main()
becomes (dmd 2.047, optimized build):


 L1A:		push	dword ptr 02Ch[ESP]
 		mov	EDX,_D10test6_good4Test7__ClassZ[0Ch]
 		mov	EAX,_D10test6_good4Test7__ClassZ[08h]
 		push	EDX
 		push	ESI
 		call	near ptr _memcpy
 		mov	ECX,03Ch[ESP]
 		mov	8[ECX],EBX
 		mov	0Ch[ECX],EBX
 		mov	010h[ECX],EBX
 		mov	014h[ECX],EBX
 		mov	018h[ECX],EBX
 		mov	01Ch[ECX],EBX
 		inc	EBX
 		add	ESP,0Ch
 		cmp	EBX,05F5E100h
 		jb	L1A


 (The memcpy is done by emplace to initialize the object before calling its
ctor. You must perform the initialization because it needs the pointer to the
virtual table and monitor. The monitor here was null. I think a future LDC2 can
optimize away more stuff in that loop, so it's not so bad).
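
To make the two steps in that parenthesis concrete, here is a hedged sketch that performs them by hand: first copy the class's init image (which carries the vtable pointer, a null monitor and the field defaults), then run the constructor. This is not the Phobos source; Point and constructIn are hypothetical names, and the sketch assumes a current druntime where the init image is available as typeid(T).initializer.

```d
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy;

final class Point {
    int x = 7;   // this default comes from the init image, not from the ctor
    int y;
    this(int y) { this.y = y; }
}

// Hypothetical helper mirroring the two steps by hand.
Point constructIn(void[] chunk, int y) {
    const(void)[] image = typeid(Point).initializer;
    assert(chunk.length >= image.length);
    // Step 1: copy the init image (vtable pointer, null monitor, field defaults).
    memcpy(chunk.ptr, image.ptr, image.length);
    auto p = cast(Point) chunk.ptr;
    // Step 2: run the constructor proper on the initialized memory.
    p.__ctor(y);
    return p;
}

void main() {
    enum size = __traits(classInstanceSize, Point);
    void[] chunk = malloc(size)[0 .. size];
    assert(chunk.ptr !is null);
    scope (exit) free(chunk.ptr);

    auto p = constructIn(chunk, 3);
    assert(p.x == 7 && p.y == 3);
}
```

Skipping step 1 would leave the vtable pointer and x uninitialized, which is why the memcpy cannot simply be dropped from the loop.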



 scope Test testObject = new Test(i, i, i, i, i, i);
 It runs in about 6 seconds (also because the ctor is called even if it's missing).


seconds, about 110 times slower than Java.

 Bye,
 bearophile
    


-- 
Dmitry Olshansky

Jul 21 2010

bearophile <bearophileHUGS lycos.com> writes:

Dmitry Olshansky:
 Well, I'm using this for structs, very straightforward:
 
 T* create(T, Args...)(Args args)
 if ( !is(T == class) ){
      return emplace!T(malloc(T.sizeof)[0..T.sizeof], args);
 }

That's not good enough, you are allocating on the (C) heap. If you use that in
the D benchmark I have shown you probably can get bad timing results.


 Here the dtor does not get called, and that's because emplace is a library 
 replacement for placement new (no pun intended).
 Sure enough, with manual memory management you need to call clear(t) at exit.

If the class is allocated on the stack it's much better if the destructor is
called when the class gets out of scope. Otherwise it's like C programming.
(I suggest editing your post to remove the unneeded parts of the original post.)

Bye,
bearophile

Jul 21 2010

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 21.07.2010 14:20, bearophile wrote:
 Dmitry Olshansky:
    
 Well, I'm using this for structs, very straightforward:

 T* create(T, Args...)(Args args)
 if ( !is(T == class) ){
       return emplace!T(malloc(T.sizeof)[0..T.sizeof], args);
 }
      

 That's not good enough, you are allocating on the (C) heap. If you use that in
the D benchmark I have shown you probably can get bad timing results.

    

Uh, yes, I guess I should have read your post to the end :). Stack 
allocation is risky business, to say the least. Some kind of memory pool 
would be handy.
 Here the dtor does not get called, and that's because emplace is a library
 replacement for placement new (no pun intended).
 Sure enough, with manual memory management you need to call clear(t) at exit.
      

 If the class is allocated on the stack it's much better if the destructor is
called when the class gets out of scope. Otherwise it's like C programming.
 (I suggest to edit your post, to remove useless parts of the original post.)

    

To that end one should prefer vanilla structs with destructors over final 
classes and scope. That, of course, means losing inheritance and such.
The problem is that designing such classes and then documenting "you should 
always use it as 'scope'" is awkward.
Moreover, the functions to which you pass the stack-allocated class instance 
are unaware of that clever trick.

Slight modification of your benchmark:

import std.conv: emplace;
import std.contracts;

import std.stdio;
final class Test { // 32 bytes each instance
     int i1, i2, i3, i4, i5, i6;
     this(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
         this.i1 = ii1;
         this.i2 = ii2;
         this.i3 = ii3;
         this.i4 = ii4;
         this.i5 = ii5;
         this.i6 = ii6;
     }
     void doSomething(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
     }
}

Test hidden;

void fun(Test t){
     hidden = t;
}

void bench(){
     enum int N = 10_000_000;
     int i;

     while (i < N) {
         scope Test testObject = new Test(i, i, i, i, i, i);
         fun(testObject);
         testObject.doSomething(i, i, i, i, i, i);
         testObject.doSomething(i, i, i, i, i, i);
         testObject.doSomething(i, i, i, i, i, i);
         testObject.doSomething(i, i, i, i, i, i);
         i++;
     }

}

void main() {
     int a,b,c;//
     bench();
//what's the hidden now?
     writefln("%d %d", hidden.i1,hidden.i2);
     writefln("%d %d", hidden.i1,hidden.i2);
}

The second writefln prints garbage. I guess it's because of a pointer to 
the long-gone stack frame, which is overwritten by the first writefln.

-- 
Dmitry Olshansky

Jul 21 2010

bearophile <bearophileHUGS lycos.com> writes:

Dmitry Olshansky:
 The problem is designing such classes and then documenting: "you should 
 always use it as 'scope' ", is awkward.

If you really want a class to be used as scope only you can do this, see the
error message:

scope class Foo {}
void main() {
  Foo f = new Foo;
}



 The second writefln prints garbage. I guess it's because of a pointer to 
 the long-gone stack frame, which is overwritten by the first writefln.

Yes scope has this and other problems (and I think two of them can be fixed),
but I don't think emplace() is a big improvement.

Bye,
bearophile

Jul 21 2010

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 21.07.2010 16:26, bearophile wrote:
 Dmitry Olshansky:
    
 The problem is designing such classes and then documenting: "you should
 always use it as 'scope' ", is awkward.
      

 If you really want a class to be used as scope only you can do this, see the
error message:

 scope class Foo {}
 void main() {
    Foo f = new Foo;
 }


    
 The second writefln prints garbage. I guess it's because of pointer to
 the long gone stackframe, which is ovewritten by the first writeln.
      

 Yes scope has this and other problems (and I think two of them can be fixed),
but I don't think emplace() is a big improvement.

 Bye,
 bearophile
    

Going further with a library implementation as opposed to a language 
feature, I made a (somewhat) successful attempt at implementing scoped classes:

struct Scoped(T){
     ubyte[__traits(classInstanceSize, Test)] _payload;
     T getPayload(){
         return cast(T)(_payload.ptr);
     }
     alias getPayload this;
     static Scoped opCall(Args...)(Args args)
     if (is(typeof(T.init.__ctor(args)))) {
         // TODO: should also provide decent error message
         Scoped!T s;
         emplace!T(cast(void[])s._payload,args);
         return s;
     }
     ~this(){
         clear(getPayload);
     }
}

Now replace the original while loop with this:
while (i < N) {
         auto testObject = Scoped!Test(i, i, i, i, i, i);
         // assuming we have the aforementioned evil function fun(Test t)
         // that keeps a global reference to t
         //fun(testObject); // uncomment to get a compile error: type mismatch
         testObject.doSomething(i, i, i, i, i, i);
         testObject.doSomething(i, i, i, i, i, i);
         testObject.doSomething(i, i, i, i, i, i);
         testObject.doSomething(i, i, i, i, i, i);
         i++;
     }

and all works just the same as with the deprecated scope storage class.
Even better, it disallows passing the variable to functions expecting 
vanilla Test; it's limiting, but for a good reason.
There are still issues that should be solved (name clashes for one, plus 
the ability to default-construct Scoped!T) but overall it's OK to me.

-- 
Dmitry Olshansky

Jul 21 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Dmitry Olshansky wrote:
 [snip]
 

Nice work. To avoid name clashes with alias this, you may want to use a
trick invented by Shin Fujishiro:

struct ExWhyZee {
    template ExWhyZee() {
       // implementation goes here
    }
    alias ExWhyZee!().whatever this;
}

This way you never have a confusion between the symbols defined by the
wrapped type and your own type. I use a variant of the same trick in
RefCounted.


Andrei

Jul 21 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Dmitry Olshansky wrote:
 Going further with library implementation as opposed to language 
 feature, I made a (somewhat) successful try at implementing scoped classes:

I salute this approach.

 struct Scoped(T){
     ubyte[__traits(classInstanceSize, Test)] _payload;

s/Test/T/ I suppose.

     T getPayload(){
         return cast(T)(_payload.ptr);
     }
     alias getPayload this;
     static Scoped opCall(Args...)(Args args) if ( 
 is(typeof(T.init.__ctor(args))) ){// TODO: should also provide decent 
 error message
         Scoped!T s;
         emplace!T(cast(void[])s._payload,args);
         return s;
     }
     ~this(){
         clear(getPayload);
     }
 }
 
 Now replace the original while loop with this:
 while (i < N) {
         auto testObject = Scoped!Test(i, i, i, i, i, i);
         // assuming we have the aforementioned evil function fun(Test t)
         // that keeps a global reference to t
         //fun(testObject); // uncomment to get a compile error: type mismatch
         testObject.doSomething(i, i, i, i, i, i);
         testObject.doSomething(i, i, i, i, i, i);
         testObject.doSomething(i, i, i, i, i, i);
         testObject.doSomething(i, i, i, i, i, i);
         i++;
     }
 
 and all works just the same as with deprecated scope storage class.
 Even better it disallows passing the variable to functions expecting 
 vanilla Test, it's limiting but for a good reason.
 There are still issues that should be solved (name clash for one, plus 
 the ability to define default construct Scoped!T) but overall it's OK to 
 me.

I agree Scope has a rightful place in the standard library.


Andrei

Jul 21 2010

Rory Mcguire <rjmcguire gm_no_ail.com> writes:

Dmitry Olshansky wrote:

 Now replace the original while loop with this:
 while (i < N) {
 auto testObject = Scoped!Test(i, i, i, i, i, i);
 // assuming we have the aforementioned evil function fun(Test t)
 // that keeps a global reference to t
 //fun(testObject); // uncomment to get a compile error: type mismatch
 testObject.doSomething(i, i, i, i, i, i);
 testObject.doSomething(i, i, i, i, i, i);
 testObject.doSomething(i, i, i, i, i, i);
 testObject.doSomething(i, i, i, i, i, i);
 i++;
 }

With your code, `time` reports the timings below on my machine:
real	0m19.658s
user	0m19.590s
sys	0m0.010s

compared to:
real	0m9.122s
user	0m9.090s
sys	0m0.000s

with bearophile's original version. With -O -release it's about 4 seconds 
faster for each.

Jul 21 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 07/21/2010 01:59 PM, Rory Mcguire wrote:
 With your code, `time` reports the timings below on my machine:
 real	0m19.658s
 user	0m19.590s
 sys	0m0.010s

 compared to:
 real	0m9.122s
 user	0m9.090s
 sys	0m0.000s

 with bearophile's original version. With -O -release it's about 4 seconds
 faster for each.

I compiled and ran the tests myself with -O -release -inline and got 
1.95s for Dmitry's implementation and 1.55s for bearophile's.

I optimized Dmitry's implementation in two ways: I replaced the call to 
clear() with a straight call to the destructor and added = void in two 
places to avoid double initialization. I got 1.11s, which significantly 
undercuts the implementation using scope.

Here's the code I used for testing:

struct Scoped(T) {
     ubyte[__traits(classInstanceSize, T)] _payload = void;
     T getPayload(){
         return cast(T)(_payload.ptr);
     }
     alias getPayload this;

     static Scoped opCall(Args...)(Args args)
     if (is(typeof(T.init.__ctor(args)))) {
         // TODO: should also provide decent error message
         Scoped!T s = void;
         emplace!T(cast(void[])s._payload,args);
         return s;
     }
     ~this() {
         static if (is(typeof(getPayload.__dtor()))) {
             getPayload.__dtor();
         }
     }
}

final class Test { // 32 bytes each instance
     int i1, i2, i3, i4, i5, i6;
     this(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
         this.i1 = ii1;
         this.i2 = ii2;
         this.i3 = ii3;
         this.i4 = ii4;
         this.i5 = ii5;
         this.i6 = ii6;
     }
     void doSomething(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
     }
}

void main(string[] args)
{
     enum int N = 10_000_000;
     int i;
     while (i < N) {
         auto testObject = Scoped!Test(i, i, i, i, i, i);
         //scope testObject = new Test(i, i, i, i, i, i);
         testObject.doSomething(i, i, i, i, i, i);
         testObject.doSomething(i, i, i, i, i, i);
         testObject.doSomething(i, i, i, i, i, i);
         testObject.doSomething(i, i, i, i, i, i);
         i++;
     }
}


Andrei

Jul 21 2010

Rory Mcguire <rjmcguire gm_no_ail.com> writes:

Andrei Alexandrescu wrote:

 On 07/21/2010 01:59 PM, Rory Mcguire wrote:
 Dmitry Olshansky wrote:

 now replace the orignal while loop with this:
 while (i<  N) {
 auto testObject = Scoped!Test(i, i, i, i, i, i);
 //assuming we have aforementioned evil function func(Test t),
 that keeps global reference to t.
 //fun(testObject); //uncoment to get an compile error - type
 mismatch
 testObject.doSomething(i, i, i, i, i, i);
 testObject.doSomething(i, i, i, i, i, i);
 testObject.doSomething(i, i, i, i, i, i);
 testObject.doSomething(i, i, i, i, i, i);
 i++;
 }

 With your code I `time` reports the below timings on my machine:
 real	0m19.658s
 user	0m19.590s
 sys	0m0.010s

 compared to:
 real	0m9.122s
 user	0m9.090s
 sys	0m0.000s

 with bearofiles original version. With -O -release its about 4 seconds
 faster for each.

 
 I compiled and ran the tests myself with -O -release -inline and got
 1.95s for Dmitry's implementation and 1.55s for bearophile's.
 
 I optimized Dmitry's implementation in two ways: I replaced the call to
 clear() with a straight call to the destructor and added = void in two
 places to avoid double initialization. I got 1.11s, which significantly
 undercuts the implementation using scope.
 
 Here's the code I used for testing:
 
 struct Scoped(T) {
      ubyte[__traits(classInstanceSize, Test)] _payload = void;
      T getPayload(){
          return cast(T)(_payload.ptr);
      }
      alias getPayload this;
 
      static Scoped opCall(Args...)(Args args)
      if (is(typeof(T.init.__ctor(args)))) {
          // TODO: should also provide decent error message
          Scoped!T s = void;
          emplace!T(cast(void[])s._payload,args);
          return s;
      }
      ~this() {
          static if (is(typeof(getPayload.__dtor()))) {
              getPayload.__dtor();
          }
      }
 }
 
 final class Test { // 32 bytes each instance
      int i1, i2, i3, i4, i5, i6;
      this(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
          this.i1 = ii1;
          this.i2 = ii2;
          this.i3 = ii3;
          this.i4 = ii4;
          this.i5 = ii5;
          this.i6 = ii6;
      }
      void doSomething(int ii1, int ii2, int ii3, int ii4, int ii5, int
 ii6) {
      }
 }
 
 void main(string[] args)
 {
      enum int N = 10_000_000;
      int i;
      while (i < N) {
          auto testObject = Scoped!Test(i, i, i, i, i, i);
          //scope testObject = new Test(i, i, i, i, i, i);
          testObject.doSomething(i, i, i, i, i, i);
          testObject.doSomething(i, i, i, i, i, i);
          testObject.doSomething(i, i, i, i, i, i);
          testObject.doSomething(i, i, i, i, i, i);
          i++;
      }
 }
 
 
 Andrei


Thanks Andrei!!!

Jul 21 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Rory Mcguire wrote:
 Andrei Alexandrescu wrote:

[snip]

 Thanks Andrei!!!

Don't mention it. Thanks for not over-quoting in the future :o).

Andrei

Jul 21 2010

Rory Mcguire <rjmcguire gm_no_ail.com> writes:

Andrei Alexandrescu wrote:

 Here's the code I used for testing:
 


Here are my timings with your code. I included the one without -inline because I 
haven't been using -inline in my other tests:

$ dmd -O -release -inline program1_2.d
$ time ./program1_2 

real	0m0.526s
user	0m0.520s
sys	0m0.000s
$ dmd -O -release program1_2.d
$ time ./program1_2 

real	0m0.820s
user	0m0.810s
sys	0m0.000s

and bearophile's with -inline:
$ time ./program1

real	0m2.267s
user	0m2.260s
sys	0m0.000s



Nice improvement...so is this in phobos or druntime yet? :D

Jul 21 2010

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 21.07.2010 23:16, Andrei Alexandrescu wrote:
 I compiled and ran the tests myself with -O -release -inline and got 
 1.95s for Dmitry's implementation and 1.55s for bearophile's.

 I optimized Dmitry's implementation in two ways: I replaced the call 
 to clear() with a straight call to the destructor and added = void in 
 two places to avoid double initialization. I got 1.11s, which 
 significantly undercuts the implementation using scope.

 Here's the code I used for testing:

 struct Scoped(T) {
     ubyte[__traits(classInstanceSize, Test)] _payload = void;
     T getPayload(){
         return cast(T)(_payload.ptr);
     }
     alias getPayload this;

     static Scoped opCall(Args...)(Args args)
     if (is(typeof(T.init.__ctor(args)))) {
         // TODO: should also provide decent error message
         Scoped!T s = void;
         emplace!T(cast(void[])s._payload,args);
         return s;
     }
     ~this() {
         static if (is(typeof(getPayload.__dtor()))) {
             getPayload.__dtor();
         }
     }
 }

Thanks for the kind feedback (and for showing some optimization tricks). 
However, this implementation still has an issue: it calls the dtor twice. Not a 
good trait for an RAII technique! :)
Since it's now considered useful I feel obliged to enhance and 
correct it. Sadly, I still haven't managed to apply the inner 
template trick.
Here's the end result along with a simple unittest:

import std.conv : emplace;
import std.stdio : writeln;

struct Scoped(T){
     ubyte[__traits(classInstanceSize, T)] _scopedPayload = void;
     T getScopedPayload(){
         return cast(T)(_scopedPayload.ptr);
     }
     alias getScopedPayload this;
     this(Args...)(Args args){
         static if (!is(typeof(getScopedPayload.__ctor(args)))) {
             static assert(false,"Scoped: wrong arguments passed to ctor");
         }else {
             emplace!T(cast(void[])_scopedPayload,args);
         }
     }
     ~this() {
         static if (is(typeof(getScopedPayload.__dtor()))) {
             getScopedPayload.__dtor();
         }
     }
}
//also track down destruction/construction
class A{
     this(int a){ writeln("A with ",a); }
     this(real r){ writeln("A with ",r); }
     ~this(){ writeln("A destroyed");  }
}

unittest{
     {
         auto a = Scoped!A(42);
     }
     {
         auto b = Scoped!A(5.5);
     }
}

-- 
Dmitry Olshansky

Jul 21 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 07/21/2010 03:35 PM, Dmitry Olshansky wrote:
 Thanks for kind feedback (and showing some optimization tricks). Also
 this implementation still has issues with it: it calls dtor twice. Not a
 good trait for RAII technique ! :)
 Since it's now considered useful I feel myself obliged to enhance and
 correct it. Sadly enough I still haven't managed to apply an inner
 template trick.
 Here's the end result along with a simple unittest:

[snip]

The double destructor hits a bug in the compiler implementation. Anyway, 
here's the committed code:

http://www.dsource.org/projects/phobos/changeset/1774

It uses a new idiom that is enabled by auto returns - defines a struct 
inside the function and returns it. That's a veritable existential type! 
(http://stackoverflow.com/questions/292274/what-is-an-existential-type) 
I expect more of that idiom in the upcoming commits.
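
For readers who have not seen the idiom, here is a minimal sketch of a function that defines a struct in its own body and returns it through an auto return type. It is only an illustration, not the committed Phobos code; makeCounter and Counter are hypothetical names.

```d
// Hypothetical example: Counter is declared inside makeCounter, so callers
// cannot spell its name and have to rely on auto (or typeof).
auto makeCounter(int start) {
    static struct Counter {
        private int value;
        int next() { return value++; }
    }
    return Counter(start);
}

void main() {
    auto c = makeCounter(5);
    assert(c.next() == 5);
    assert(c.next() == 6);
}
```

The caller can use everything the returned value offers but has no way to construct one except through the function, which is what makes the type behave like an existential.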


Andrei

Jul 21 2010

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Thu, Jul 22, 2010 at 00:27, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 The double destructor hits a bug in the compiler implementation. Anyway,
 here's the committed code:

 http://www.dsource.org/projects/phobos/changeset/1774

 It uses a new idiom that is enabled by auto returns - defines a struct
 inside the function and returns it. That's a veritable existential type! (
 http://stackoverflow.com/questions/292274/what-is-an-existential-type) I
 expect more of that idiom in the upcoming commits.


Hmm, so the struct Scoped is implicitly parametrized by T and Args... Cool.

Why do you put a second layer of (Args...) in Scoped constructor? Why not
just

this(Args args) if (etc) {...}

And, in your case, if you used Args inside Scoped (which you don't do),
would that be the ctor's Args which'd be used?


Philippe

PS: too bad that, if bug 2581 is not squashed, your scoped won't show in the
docs :(

Jul 22 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Philippe Sigaud wrote:
 On Thu, Jul 22, 2010 at 00:27, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
     The double destructor hits a bug in the compiler implementation.
     Anyway, here's the committed code:
 
     http://www.dsource.org/projects/phobos/changeset/1774
 
     It uses a new idiom that is enabled by auto returns - defines a
     struct inside the function and returns it. That's a veritable
     existential type!
     (http://stackoverflow.com/questions/292274/what-is-an-existential-type)
     I expect more of that idiom in the upcoming commits.
 
 
 Hmm, so the struct Scoped is implicitly parametrized by T and Args... Cool.
 
 Why do you put a second layer of (Args...) in Scoped constructor? Why 
 not just
 
 this(Args args) if (etc) {...}
 
 And, in your case, if you used Args inside Scoped (which you don't do), 
 would that be the ctor's Args which'd be used?
 
 
 Philippe

Great point. I simplified the implementation:

http://www.dsource.org/projects/phobos/changeset/1776

 PS: too bad that, if bug 2581 is not squashed, your scoped won't show in 
 the docs :(

I'll fix that later with a version(ddoc).


Andrei

Jul 22 2010

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 22.07.2010 2:27, Andrei Alexandrescu wrote:
 On 07/21/2010 03:35 PM, Dmitry Olshansky wrote:
 Thanks for kind feedback (and showing some optimization tricks). Also
 this implementation still has issues with it: it calls dtor twice. Not a
 good trait for RAII technique ! :)
 Since it's now considered useful I feel myself obliged to enhance and
 correct it. Sadly enough I still haven't managed to apply an inner
 template trick.
 Here's the end result along with a simple unittest:

 [snip]

 The double destructor hits a bug in the compiler implementation. 
 Anyway, here's the committed code:

 http://www.dsource.org/projects/phobos/changeset/1774

 It uses a new idiom that is enabled by auto returns - defines a struct 
 inside the function and returns it. That's a veritable existential 
 type! 
 (http://stackoverflow.com/questions/292274/what-is-an-existential-type) I 
 expect more of that idiom in the upcoming commits.

A cool idiom indeed!

I guess I should clarify the double destruction problem: it was caused 
by static opCall; the current implementation in Phobos does not suffer 
from it.
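
For comparison, here is a hedged usage sketch of the library version, assuming a current Phobos where it is exposed as std.typecons.scoped. Test, alive and sumFirst are hypothetical names chosen for the example.

```d
import std.typecons : scoped;

int alive; // number of live Test instances, for illustration only

final class Test {
    int i1;
    this(int a) { i1 = a; ++alive; }
    ~this() { --alive; }
}

// Hypothetical benchmark-style loop: one scoped object per iteration.
int sumFirst(int n) {
    int total;
    foreach (i; 0 .. n) {
        auto obj = scoped!Test(i); // the instance lives in obj's own storage
        total += obj.i1;
    } // obj goes out of scope here and ~Test runs exactly once
    return total;
}

void main() {
    assert(sumFirst(4) == 6);
    assert(alive == 0);
}
```

The alive counter returning to zero is a quick way to check that each object is destroyed once and only once.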

-- 
Dmitry Olshansky

Jul 22 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Dmitry Olshansky wrote:
 On 22.07.2010 2:27, Andrei Alexandrescu wrote:
 On 07/21/2010 03:35 PM, Dmitry Olshansky wrote:
 Thanks for kind feedback (and showing some optimization tricks). Also
 this implementation still has issues with it: it calls dtor twice. Not a
 good trait for RAII technique ! :)
 Since it's now considered useful I feel myself obliged to enhance and
 correct it. Sadly enough I still haven't managed to apply an inner
 template trick.
 Here's the end result along with a simple unittest:

 [snip]

 The double destructor hits a bug in the compiler implementation. 
 Anyway, here's the committed code:

 http://www.dsource.org/projects/phobos/changeset/1774

 It uses a new idiom that is enabled by auto returns - defines a struct 
 inside the function and returns it. That's a veritable existential 
 type! 
 (http://stackoverflow.com/questions/292274/what-is-an-existential-type) 
 I expect more of that idiom in the upcoming commits.

 A cool idiom indeed!
 
 I guess I should clarify the double destruction problem: it was caused 
 by static opCall, the current implementation in Phobos does not suffers 
 from it.

It doesn't, but just because it uses some heavy-handed tricks such as 
casting from an untyped buffer to Scoped. I found two distinct bugs by 
working on Scoped, which I'll submit soon.

Andrei

Jul 22 2010
