digitalmars.D.bugs - [Issue 14912] New: Move initialisation of GC'd struct and class data

https://issues.dlang.org/show_bug.cgi?id=14912

          Issue ID: 14912
           Summary: Move initialisation of GC'd struct and class data from
                    the callee to the caller
           Product: D
           Version: D2
          Hardware: All
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P1
         Component: dmd
          Assignee: nobody puremagic.com
          Reporter: ibuclaw gdcproject.org

Currently, druntime initialises all GC'd data in the callee.

Examples:

_d_newclass():
  p[0 .. ci.init.length] = ci.init[];

_d_newitemT():
  memset(p, 0, _ti.tsize);

_d_newitemiT():
  memcpy(p, init.ptr, init.length);


Each of these examples results in a library call (memset, memcpy, or an
equivalent block copy).  And because the implementation is always hidden away,
the optimizer (or an optimizing backend) cannot assume anything about the
contents of the pointer returned by these calls.
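
As a rough illustration of what the three runtime functions above do to the
freshly allocated memory, the following D sketch picks between memset and
memcpy based on the type's default initializer. The helper `initInPlace` and
the structs `Zero` and `NonZero` are hypothetical names invented for this
example, not part of druntime, and the sketch assumes a druntime recent enough
to spell the initializer property `TypeInfo.initializer`.

```d
import core.stdc.string : memcpy, memset;

struct Zero    { int x; }      // default initializer is all zero bits
struct NonZero { int x = 7; }  // default initializer has non-zero bits

// Hypothetical helper (not a druntime function): initialises the memory at p
// the way the callee currently does, choosing memset or memcpy by type.
void initInPlace(void* p, const TypeInfo ti)
{
    const(void)[] init = ti.initializer;
    if (init.ptr is null)
        memset(p, 0, init.length);         // all-zero type: the _d_newitemT path
    else
        memcpy(p, init.ptr, init.length);  // otherwise: the _d_newitemiT path
}

void main()
{
    Zero z = Zero(123);
    NonZero n = NonZero(0);

    initInPlace(&z, typeid(Zero));
    initInPlace(&n, typeid(NonZero));

    assert(z.x == 0);
    assert(n.x == 7);
}
```

When this logic lives behind an opaque runtime call, the compiler sees only a
pointer of unknown contents; emitting the same stores at the call site is what
would expose them to the optimizer.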

For instance, in a very simple case:

  class A
  {
    int foo () { return 42; }
  }

  int test()
  {
    A a = new A(), b = a;
    return b.foo();
  }

If the contents of 'a' were instead set in the caller, by code the compiler
generates, we would have the following codegen (pseudo-code):

  int test()
  {
    struct A *a;
    struct A *b;

    a = new A();
    *a = A.init;
    b = a;

    return b.__vptr.foo(b);
  }
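
The caller-side scheme in this pseudo-code can be approximated by hand in
today's D. This is only a sketch under stated assumptions: `rawAlloc` is a
hypothetical stand-in for an allocation-only runtime entry point, and
`typeid(A).initializer` is the newer druntime spelling of the `ci.init` image
shown in the report.

```d
import core.memory : GC;

class A
{
    int foo() { return 42; }
}

// Hypothetical stand-in for a runtime call that only allocates and leaves
// initialisation to the caller.
void* rawAlloc(size_t size)
{
    return GC.malloc(size);
}

int test()
{
    enum size = __traits(classInstanceSize, A);
    void* p = rawAlloc(size);

    // Caller-side equivalent of "*a = A.init": copy the class's static
    // initializer image (which sets __vptr and __monitor) into the block.
    const(void)[] init = typeid(A).initializer;
    (cast(ubyte*) p)[0 .. size] = cast(const(ubyte)[]) init;

    A a = cast(A) p, b = a;
    return b.foo();
}

void main()
{
    assert(test() == 42);
}
```

Because the copy of the initializer image is now visible inside `test`, a
backend is free to split it into individual field stores, which is the
starting point for the optimizations the report goes on to describe.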

From that pseudo-code, an optimizer can break down and inline the default
initializer without the need for memset/memcpy:

  // ...
  a = new A();
  a.__vptr = typeid(A).vtbl.ptr;
  a.__monitor = null;
  // ...

Perform constant propagation to replace all occurrences of b with a:

  // ...
  return (*(a.__vptr + 40))(a);
  // ...

Global value numbering to resolve the lookup in the vtable, and de-virtualize
the call:

  // ...
  return A.foo(a);
  // ...

After some dead code removal, the inliner now sees the direct call and is ready
to inline A.foo:

  int test()
  {
    struct A *a = new A();
    a.__vptr = typeid(A).vtbl.ptr;
    a.__monitor = null;
    return 42;
  }

There is another challenge here: removing the dead GC allocation (that will
have to wait for another bug report).  But I think that this simple change is
justified by the opportunity to produce much better code when classes are used
in at least simple ways.  I haven't even considered the possibilities that
open up with LTO.

If there are no objections, I suggest that we make a push for this.  It will
require dmd to update its own NewExp::toElem, and the memcpy/memset parts to be
removed from druntime.

--
Aug 12 2015