digitalmars.D.bugs - [Issue 14912] New: Move initialisation of GC'd struct and class data
- via Digitalmars-d-bugs (79/80) Aug 12 2015 https://issues.dlang.org/show_bug.cgi?id=14912
          Issue ID: 14912
           Summary: Move initialisation of GC'd struct and class data from the callee to the caller
           Product: D
           Version: D2
          Hardware: All
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P1
         Component: dmd
          Assignee: nobody puremagic.com
          Reporter: ibuclaw gdcproject.org

Currently, druntime initialises all GC'd data in the callee. Examples:

    _d_newclass():   p[0 .. ci.init.length] = ci.init[];
    _d_newitemT():   memset(p, 0, _ti.tsize);
    _d_newitemiT():  memcpy(p, init.ptr, init.length);

Each example results in a call to memset/memcpy inside the runtime. And because the implementation is always hidden away, the optimizer (or an optimizing backend) cannot assume anything about the contents of the memory behind the pointer returned by these calls.

For instance, in a very simple case:

    class A
    {
        int foo() { return 42; }
    }

    int test()
    {
        A a = new A(), b = a;
        return b.foo();
    }

If the contents of 'a' were instead set by the caller, in code emitted by the compiler, we would have the following codegen (pseudo-code):

    int test()
    {
        struct A *a;
        struct A *b;

        a = new A();
        *a = A.init;
        b = a;
        return b.__vptr.foo(b);
    }

From that, an optimizer can break down and inline the default initializer without the need for memset/memcpy:

    // ...
    a = new A();
    a.__vptr = typeid(A).vtbl.ptr;
    a.__monitor = null;
    // ...

Perform constant propagation to replace all occurrences of b with a:

    // ...
    return *(a.__vptr + 40)(a);
    // ...

Global value numbering to resolve the lookup in the vtable, and de-virtualize the call:

    // ...
    return A.foo(a);
    // ...

After some dead code removal, the inliner now sees the direct call and is ready to inline A.foo:

    int test()
    {
        struct A *a = new A();
        a.__vptr = typeid(A).vtbl.ptr;
        a.__monitor = null;
        return 42;
    }

There is another challenge here: removing the dead GC allocation (that will have to wait for another bug report).
But I think that this simple change is justified by the opportunity to produce much better code when classes are used in at least simple ways, and I haven't even considered the possibilities with LTO.

If there are no objections, I suggest that we make a push for this. It will require dmd to update its own NewExp::toElem, and the memcpy/memset parts to be removed from druntime.
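To make the proposal concrete, here is a minimal hand-written sketch in D of what caller-side initialisation amounts to. It is an illustration under stated assumptions, not the proposed compiler change itself: allocRaw is a hypothetical allocate-only helper (no such druntime hook is implied), the struct S is invented for the example, and typeid(A).initializer is the current druntime spelling of what the report calls ci.init.

```d
import core.memory : GC;
import core.stdc.string : memcpy;

class A
{
    int foo() { return 42; }
}

// Hypothetical struct, used only to illustrate the struct case.
struct S
{
    int x = 7;
    int y;
}

// Hypothetical allocate-only helper (not a druntime function): returns GC
// memory and leaves initialisation to the caller.
void* allocRaw(size_t size)
{
    return GC.malloc(size);
}

int test()
{
    // Class case: the default initialiser is copied at the call site,
    // so the code that fills in __vptr and __monitor is visible here.
    const(void)[] initData = typeid(A).initializer;
    void* p = allocRaw(initData.length);
    memcpy(p, initData.ptr, initData.length);

    A a = cast(A) p, b = a;
    return b.foo();
}

S* makeS()
{
    // Struct case: the caller-side equivalent of "*a = A.init".
    S* s = cast(S*) allocRaw(S.sizeof);
    *s = S.init;
    return s;
}

unittest
{
    assert(test() == 42);
    S* s = makeS();
    assert(s.x == 7 && s.y == 0);
}
```

Built with -unittest -main, the assertions exercise both paths. In the actual proposal the copy would be emitted by the compiler at each new expression, so no user code would change; the sketch only shows which stores become visible to the optimizer.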