www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 3463] New: Integrate Precise Heap Scanning Into the GC

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463

           Summary: Integrate Precise Heap Scanning Into the GC
           Product: D
           Version: 2.035
          Platform: Other
        OS/Version: Windows
            Status: NEW
          Keywords: patch
          Severity: enhancement
          Priority: P2
         Component: druntime
        AssignedTo: sean invisibleduck.org
        ReportedBy: dsimcha yahoo.com


--- Comment #0 from David Simcha <dsimcha yahoo.com> 2009-11-01 10:45:26 PST ---
Created an attachment (id=487)
Patches to the GC

I've created patches that allow for precise heap scanning in the GC by storing
a pointer to pointer offset information in the last (void*).sizeof bytes of
each allocated memory block that is to be scanned.  The attached patch patches
gcx.d to do this, and fixes a few other minor issues in the runtime to make
everything compatible.

By default, if no bitmask is provided, a conservative bitmask is used to
replicate the old behavior.  The bitmask format is documented in
bitmaskTempl.d, which also provides templates for generating the masks, some
basic tests to make sure the precise heap scanning works, and prototypes of
functions for creating precisely scanned arrays and class instances.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Nov 01 2009
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #1 from David Simcha <dsimcha yahoo.com> 2009-11-01 10:46:03 PST ---
Created an attachment (id=488)
Templates to generate bit masks, documentation of format.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Nov 01 2009
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


David Simcha <dsimcha yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #487 is|0                           |1
           obsolete|                            |


--- Comment #2 from David Simcha <dsimcha yahoo.com> 2009-11-01 10:48:54 PST ---
Created an attachment (id=489)
Correct patch.  Accidentally attached the wrong one.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Nov 01 2009
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


Leandro Lucarella <llucax gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |llucax gmail.com


--- Comment #3 from Leandro Lucarella <llucax gmail.com> 2009-11-01 11:38:16
PST ---
The patch looks nice. I have some questions:

* Why did you choose to store the bitmask after the SENTINEL_POST and not
before? I think that storing the bitmask before the SENTINEL could let you
detect a corrupted bitmaks when version SENTINEL is compiled.

* In the bitMaskMixin string mixin you have a nested function setBitMask()
that's used only once. I wonder if you reused that function before or if you
put that code in a nested function just because you you think it's more clear
that way. It kind of confused me at first.

* Why is the bitMaskMixin a mixin and not just a plain function? I can't see
any reason to make it a string mixin, I am missing something? I find this very
confusing and makes the code harder to follow, since some variables appear from
nowhere.

Thanks for the good work.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Nov 01 2009
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #4 from David Simcha <dsimcha yahoo.com> 2009-11-01 11:49:58 PST ---
1.  I chose to store the bitmask after SENTINEL_POST so that none of the
assumptions of the sentinel code (such as that the sentinel is immediately
after the data) changes.

2.  The fact that setBitMask() is a nested function is a minor holdover from
when the design was a little different.  If anyone really hates it a lot, it
can be refactored.

3.  The mixin is because I needed a lot of the same logic in realloc() and
extend() and it was complicated enough that I felt it was the lesser of two
evils to use a mixin, even with the "variables appearing out of nowhere" magic,
rather than duplicate that logic.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Nov 01 2009
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #5 from Leandro Lucarella <llucax gmail.com> 2009-11-01 12:31:57
PST ---
(In reply to comment #4)
 1.  I chose to store the bitmask after SENTINEL_POST so that none of the
 assumptions of the sentinel code (such as that the sentinel is immediately
 after the data) changes.

Seems reasonable, the SENTINEL version is not very used anyway.
 2.  The fact that setBitMask() is a nested function is a minor holdover from
 when the design was a little different.  If anyone really hates it a lot, it
 can be refactored.

I agree is not terrible, but since it's a pretty trivial change I guess it could be nice to remove it, to improve readability (I don't think is a performance problem, readability and complexity is my only concern). If you don't feel like changing it yourself I can upload an amended patch.
 3.  The mixin is because I needed a lot of the same logic in realloc() and
 extend() and it was complicated enough that I felt it was the lesser of two
 evils to use a mixin, even with the "variables appearing out of nowhere" magic,
 rather than duplicate that logic.

Sure, duplicating code is never a good idea. The question is, why it can't be done with a plain-old function? -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 01 2009
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #6 from David Simcha <dsimcha yahoo.com> 2009-11-01 12:36:22 PST ---
 3.  The mixin is because I needed a lot of the same logic in realloc() and
 extend() and it was complicated enough that I felt it was the lesser of two
 evils to use a mixin, even with the "variables appearing out of nowhere" magic,
 rather than duplicate that logic.

Sure, duplicating code is never a good idea. The question is, why it can't be done with a plain-old function?

Because I needed to dump a whole bunch of variables (not just 1) into the stack frames of realloc() and extend() and the only way this could have been done with a plain old function would be to create a struct, create a function that returns the struct, etc. or to use lots and lots of out paramters. I really felt the mixin was the least unclear way that this logic could be injected into both extend() and realloc(). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 01 2009
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #7 from Sean Kelly <sean invisibleduck.org> 2009-11-03 07:52:44 PST
---
Nice work!  It may be preferable to store the pointer elsewhere however.  I
believe all blocks returned by the allocator must be 16 byte-aligned, so
tacking a pointer onto the end of a block either screws this up or uses up a
lot more space than necessary.  I also kind of wish that the pointer didn't
have to be stored at all for small block sizes, since simply storing the mask
itself would take up less space (admittedly, at the expense of more complicated
logic).

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Nov 03 2009
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #8 from David Simcha <dsimcha yahoo.com> 2009-11-03 08:06:43 PST ---
(In reply to comment #7)
 Nice work!  It may be preferable to store the pointer elsewhere however.  I
 believe all blocks returned by the allocator must be 16 byte-aligned, so
 tacking a pointer onto the end of a block either screws this up or uses up a
 lot more space than necessary.  

I don't understand. If someone requests, for example, a 12-byte allocation, the pointer is stored in the last 4 bytes (on 32-bit) of a 16-byte block. I don't increase the block capacity unless I have to. Yes, occasionally it will result in a doubling of the required capacity, but unless you request an allocation within 4 bytes of a full block size, it uses no extra space. The expected value under pseudo-random allocation sizes is probably (I haven't worked this out formally) only 4 bytes on 32-bit. Furthermore, if the NO_SCAN bit is set, no bit mask information at all is stored. This optimization was part of the reason I chose to use the end of the block: Otherwise I probably would have had to reserve space somewhere else before I knew the status of the NO_SCAN bit, meaning that this optimization would have been unimplementable.
 I also kind of wish that the pointer didn't
 have to be stored at all for small block sizes, since simply storing the mask
 itself would take up less space (admittedly, at the expense of more complicated
 logic).

I thought about this, but the problem I kept coming up with was that tracking the size of the mask would require a couple bytes anyhow. IMHO it was important to keep this patch relatively simple and stupid and easy to debug. It's clearly not a long-term solution to our GC woes because it leaves unaddressed so many unrelated issues that will eventually require a full redesign. It's more of an incremental improvement to make the GC "good enough" until D is popular enough that some GC expert implements generational, moving, parallel, etc. GC. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 03 2009
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


Sean Kelly <sean invisibleduck.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED


--- Comment #9 from Sean Kelly <sean invisibleduck.org> 2009-11-03 08:15:26 PST
---
My apologies.  Those comments were based on your description of what you were
doing, and I came to the wrong conclusion.  I'll give the patch a try!  I'm
also kind of curious what impact this will have on collection times.  Seems
like it should be faster overall.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Nov 03 2009
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #10 from David Simcha <dsimcha yahoo.com> 2009-11-03 08:18:35 PST
---
Just one thing to keep in mind:  Heap scanning is only precise if the mask
information is passed to GC.malloc.  This only happens if you use one of my
templates or call GC.malloc directly.  I couldn't get it to work with new. 
(This is where the idea of templating TypeInfo and ClassInfo originated.)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Nov 03 2009
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nfxjfg gmail.com


--- Comment #11 from nfxjfg gmail.com 2010-05-06 04:28:59 PDT ---
What is the status of this?

Also, did the newest changes in the runtime (for array appending) trash this
patch?

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 06 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #12 from Sean Kelly <sean invisibleduck.org> 2010-06-08 12:22:20
PDT ---
Yeah, the patch doesn't work any more.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jun 08 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #13 from nfxjfg gmail.com 2010-06-27 14:07:33 PDT ---
Created an attachment (id=680)
D1 - patch for dmd for creating pointer bitmasks

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jun 27 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #14 from nfxjfg gmail.com 2010-06-27 14:08:34 PDT ---
Created an attachment (id=681)
D1 - patch for Tango's runtime to enable precise GC scanning

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jun 27 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #15 from nfxjfg gmail.com 2010-06-27 14:21:58 PDT ---
I posted two patches to enable precise GC heap scanning under D1/Tango. All
user programs will make use of the precise scanning; no modifications required.

The dmd patch makes dmd generate an additional field in TypeInfo_Struct and
ClassInfo. These fields point to pointer bitmasks for the referenced type. The
patch is written in such a way, that it won't break unpatched Tango or Phobos1
runtimes. In particular, this means the patched compiler should be useable with
an unpatched Phobos1. (And I don't intend to write a Phobos1 patch. Phobos1 is
deaaaad.)

The tango patch makes use of the generated bitmasks. It is based on dsimcha's
patch. The mark phase and the bitmask format are completely different, but the
brain-drilling changes to add bitmask pointers to the memory blocks are the
same. This patch is also designed in such a way, that the patched runtime can
be used with an unpatched compiler (in this case, it defaults to conservative
scanning). The patch also contains a test program (pm.d) for the compiler
generated bitmasks.

The patch should be able to handle all D types. The internal nodes for
associative arrays are still scanned conservatively. You would have to
dynamically allocate and fill pointer maps for each AA, and I'll just say:
sorry, but no.

Note that the bitmask format is different compared to dsimcha's patch. Now it's
a bitmap, where each bit represents an aligned, pointer sized chunk of memory.

Also, I don't use templated TypeInfos as dsimcha was suggesting. I'm convinced
this approach would slow down compile times and generate masses of bloat. It
also would be hard to implement. The question what the TypeInfo of a TypeInfo
template instance should be was unsolved, too.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jun 27 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #680 is|0                           |1
           obsolete|                            |


--- Comment #16 from nfxjfg gmail.com 2010-06-27 14:36:30 PDT ---
Created an attachment (id=682)
D1 - patch for dmd for creating pointer bitmasks

(forgot some minor stuff - also, this patch is loosely against dmd 1.062;
Walter forgot to tag the 1.062 release in svn)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jun 27 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #17 from nfxjfg gmail.com 2010-06-27 14:49:50 PDT ---
PS: I forgot to handle TypeInfo_Typedef. Apply this change in object_.d:

   -370,6 +370,7    class TypeInfo_Typedef : TypeInfo

     override TypeInfo next() { return base; }
     override uint flags() { return base.flags(); }
+    override PointerMap pointermap() { return base.pointermap(); }
     override void[] init() { return m_init.length ? m_init : base.init(); }

     TypeInfo base;

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jun 27 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #682 is|0                           |1
           obsolete|                            |


--- Comment #18 from nfxjfg gmail.com 2010-07-02 02:39:32 PDT ---
Created an attachment (id=689)
D1 - patch for dmd for creating pointer bitmasks

Fix writing of ClassInfo for interfaces.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 02 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #19 from nfxjfg gmail.com 2010-07-02 03:02:40 PDT ---
I also wanted to make scanning of AA nodes precise, but it turned out the
compiler never passes the TypeInfo of AA values to the runtime functions. I
guess I could hack the compiler to fix this, but I'd rather not.

The idea I wanted to implement was to dynamically allocate & build a PointerMap
for the aaA type each time an associative array is instantiated. As an
alternative solution, one could expose the aaA type to the compiler (would have
to move aaA to object.d), and then let the compiler put a PointerMap into
TypeInfo_AssociativeArray. Any opinions what would be preferable?

PS: does anyone care about this stuff, or am I wasting my time?

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 02 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


bearophile_hugs eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bearophile_hugs eml.cc


--- Comment #20 from bearophile_hugs eml.cc 2010-07-02 05:28:18 PDT ---
I think they are busy doing other things. It's good to find a way to make
built-in AAs too precise.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 02 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #21 from Leandro Lucarella <llucax gmail.com> 2010-07-02 05:41:22
PDT ---
I care! But I guess that if I'm the only one you are wasting your time :)

I'd suggest to bring it up in the DMD (or even druntime, but I guess that one
is D2-only) devel list:
http://lists.puremagic.com/mailman/listinfo/dmd-internals

Other lists:
http://lists.puremagic.com/mailman/listinfo

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 02 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


Rob Jacques <sandford jhu.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandford jhu.edu


--- Comment #22 from Rob Jacques <sandford jhu.edu> 2010-07-02 07:00:20 PDT ---
I also care. Keep up the good work :)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 02 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


Andrei Alexandrescu <andrei metalanguage.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrei metalanguage.com


--- Comment #23 from Andrei Alexandrescu <andrei metalanguage.com> 2010-07-02
09:40:25 PDT ---
I just voted for it. It would be great if you could define some benchmarks by
which you assess the improvements your approach is bringing.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 02 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #24 from David Simcha <dsimcha yahoo.com> 2010-07-02 18:29:07 PDT
---
I'm thoroughly impressed.  Now that someone wrote a better patch than I did,
with some of the plumbing issues resolved, I wish I could just use all 10 votes
on it.  Then again, I've gotten to the point where I'm good at working around
the conservativeness of the GC.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 02 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #25 from Leandro Lucarella <llucax gmail.com> 2010-07-04 08:05:15
PDT ---
(In reply to comment #23)
 I just voted for it. It would be great if you could define some benchmarks by
 which you assess the improvements your approach is bringing.

I will try to do a couple of benchmarks when I have the time, but I think the DMD-part of the patch should be merged, even if the current GC is slower with the precise scanning, because having info on the pointers locations open a lot of new possibilities GC-wise, like moving collectors. This can be explored in the feature, once the compiler provide the necessary information about types to the GC. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 04 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #26 from nfxjfg gmail.com 2010-07-18 14:20:24 PDT ---
 Sean Kelly: you said something about different ways of storing the mask. Do
you have any more concrete suggestions?

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 18 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #27 from Leandro Lucarella <llucax gmail.com> 2010-07-20 19:33:05
PDT ---
I'm trying to test this patch but I'm having some problems compiling Tango (I'm
using 0.99.9, not trunk). With the patched DMD, I get this error:

dmd: mtype.c:5671: void PointerMap::pointer(size_t): Assertion `offset <
m_size' failed.

Compiling the file: tango/util/digest/MerkleDamgard.d

Here is some output from a GDB session:

(gdb) bt
#0  0x00002b421bf3c175 in *__GI_raise (sig=<value optimized out>) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00002b421bf3ef80 in *__GI_abort () at abort.c:92
#2  0x00002b421bf352b1 in *__GI___assert_fail (assertion=0x58be62 "offset <
m_size", file=<value optimized out>, line=5671, function=0x58bea0 "void
PointerMap::pointer(size_t)") at assert.c:81
#3  0x00000000004f5d55 in PointerMap::pointer (this=0x7fff15974fc0, offset=20)
at mtype.c:5671
#4  0x00000000004eaf15 in TypeDArray::fillPointerMap (this=0x11735e0,
pm=0x7fff15974fc0, offset=12) at mtype.c:2241
#5  0x00000000004679d4 in VarDeclaration::fillPointerMap (this=0x1130700,
pm=0x7fff15974fc0, a_offset=0) at declaration.c:1379
#6  0x000000000040488b in AttribDeclaration::fillPointerMap (this=0x11307c0,
pm=0x7fff15974fc0, offset=0) at attrib.c:289
#7  0x0000000000542327 in ClassDeclaration::toObjFile (this=0x1130150,
multiobj=0) at toobj.c:484
#8  0x0000000000404689 in AttribDeclaration::toObjFile (this=0x113afd0,
multiobj=0) at attrib.c:240
#9  0x00000000004c0da1 in Module::genobjfile (this=0x112bfd0, multiobj=0) at
glue.c:267
#10 0x00000000004e1560 in main (argc=13, argv=0x111f930) at mars.c:1285


(gdb) list
5666     * Actually does nothing if the offset isn't aligned.
5667     */
5668
5669    void PointerMap::pointer(size_t offset)
5670    {
5671        assert(offset < m_size);
5672        //reject unaligned pointers
5673        if (offset % sizeof(size_t))
5674            return;
5675        size_t bitpos = offset / sizeof(size_t);
(gdb) print offset
$1 = 20
(gdb) print m_size
$2 = 20


(gdb) up
#4  0x00000000004eaf15 in TypeDArray::fillPointerMap (this=0x11735e0,
pm=0x7fff15974fc0, offset=12) at mtype.c:2241
2241        pm->pointer(offset + sizeof(size_t));
(gdb) list
2236    }
2237
2238    void TypeDArray::fillPointerMap(PointerMap *pm, size_t offset)
2239    {
2240        // like struct Array { size_t length; byte* data; }
2241        pm->pointer(offset + sizeof(size_t));
2242    }
2243
2244    /***************************** TypeAArray
*****************************/
2245
(gdb) print *pm
$3 = {
  m_bits = {
    <Object> = {
      _vptr.Object = 0x5939d0
    }, 
    members of Bits: 
    bitdim = 3, 
    allocdim = 1, 
    data = 0x11e4c70
  }, 
  m_size = 20
}
(gdb) print offset
$4 = 12


I don't know enough about DMD internals to debug this myself, so any help will
be very much appreciated.

I'd like to run my test suite to the GC with precise scanning to see how it
goes. I've noticed that false pointers can add a lot of variance in the time a
program can take in Linux, where the addresses returned by mmap() is
randomized, so there are times where the address range returned by mmap() is
much more prone to receive false pointers. See this for the full story:
http://www.llucax.com.ar/blog/blog/post/-7a56a111

Running dil to generate the full Tango documentation can take from 50 to 80
seconds depending on the address range returned by the OS (I suspect because of
false pointers; which I hope to prove trying this patch :)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 20 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #28 from Leandro Lucarella <llucax gmail.com> 2010-07-20 19:51:19
PDT ---
Some extra info:

(gdb) up
#5  0x00000000004679d4 in VarDeclaration::fillPointerMap (this=0x1130700,
pm=0x7fff15974fc0, a_offset=0) at declaration.c:1379
1379            type->fillPointerMap(pm, offset + a_offset);
(gdb) list
1374
1375    void VarDeclaration::fillPointerMap(PointerMap *pm, size_t a_offset)
1376    {
1377        //printf("VarDeclaration::fillPointerMap() %s, ty = %d, offs=%d\n",
toChars(), type->ty, (int)offset);
1378        if (!isDataseg())
1379            type->fillPointerMap(pm, offset + a_offset);
1380    }
1381
1382    /******************************************
1383     * If a variable has an auto destructor call, return call for it.

I have commented out that printf() and this is what I got:

VarDeclaration::fillPointerMap() bytes, ty = 19, offs=8
VarDeclaration::fillPointerMap() buffer, ty = 0, offs=12
dmd: mtype.c:5671: void PointerMap::pointer(size_t): Assertion `offset <
m_size' failed.

And this is the piece of D code triggering the assertion:

package class MerkleDamgard : Digest
{
        private uint    bytes;
        private ubyte[] buffer;  <---- This one!

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 20 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #29 from Leandro Lucarella <llucax gmail.com> 2010-07-20 20:01:30
PDT ---
Woops! I'm sorry about the noise, I just found out what the problem was. I was
trying building DMD as a 64 bits app, so sizeof(size_t) was 8, but the
generated code was 32 bits, so it should be 4.

Anyway, the lesson learned is that this patch doesn't look right for cross
compilation. I don't even know if DMD is supposed to work as a cross-compiler,
though.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 20 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #30 from Sean Kelly <sean invisibleduck.org> 2010-07-21 07:21:06
PDT ---
Hm... I'd love to get this into D2, but the diffs are a bit large to apply by
hand.  I don't suppose you'd be inclined to provide a D2 patch as well?  It
should be pretty similar to the Tango patch.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 21 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #31 from Leandro Lucarella <llucax gmail.com> 2010-07-21 08:38:06
PDT ---
(In reply to comment #30)
 Hm... I'd love to get this into D2, but the diffs are a bit large to apply by
 hand.  I don't suppose you'd be inclined to provide a D2 patch as well?  It
 should be pretty similar to the Tango patch.

I would be nice to split the patch in smaller changes too, because there are a couple of extra optimizations not related to the precise scanning (for example, making findPool() use a binary search, I was just about to implement the same optimization in my GC :). There are a few changes of structure not related to the patch too, which improves the code, but makes harder to port the patch to other GCs. Thanks! -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 21 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #689 is|0                           |1
           obsolete|                            |


--- Comment #32 from nfxjfg gmail.com 2010-07-21 11:49:11 PDT ---
Created an attachment (id=692)
D1 - patch for dmd for creating pointer bitmasks

- don't use the dmd provided bit array type anymore to make the patch more 64
bit friendly
- slightly change PointerMap format

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 21 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #681 is|0                           |1
           obsolete|                            |


--- Comment #33 from nfxjfg gmail.com 2010-07-21 11:51:13 PDT ---
Created an attachment (id=693)
D1 - patch for Tango's runtime to enable precise GC scanning

-removed the unrelated optimization stuff such as doing a binary search for
pools (all that was backported from D2 to Tango to bring gcx.d on the same
level; original author was probably Walter)
- add some helper methods for manually building PointerMaps (unused/untested;
originally intended to deal with associative arrays)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 21 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #34 from nfxjfg gmail.com 2010-07-21 11:57:02 PDT ---
I don't really use D2 (all my code is in D1). Porting it to D2 will require
dealing with the recently added array append stuff. Not sure how hard that
would be, but currently I have no intention doing the port myself.

Also note that the GC mark code is unoptimized. I'm not sure how much
optimization potential there is, but my own attempts have been unsuccessful so
far.

About the 64 bit issue: to me it looked like dmd can never cross compile, and
if you want to compile it as 64 bit app, it must generate 64 bit code. Or is
that wrong?

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 21 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #35 from Leandro Lucarella <llucax gmail.com> 2010-07-21 11:59:40
PDT ---
(In reply to comment #34)
 I don't really use D2 (all my code is in D1). Porting it to D2 will require
 dealing with the recently added array append stuff. Not sure how hard that
 would be, but currently I have no intention doing the port myself.
 
 Also note that the GC mark code is unoptimized. I'm not sure how much
 optimization potential there is, but my own attempts have been unsuccessful so
 far.
 
 About the 64 bit issue: to me it looked like dmd can never cross compile, and
 if you want to compile it as 64 bit app, it must generate 64 bit code. Or is
 that wrong?

I don't know how it should finally be, but right now, even if you compile DMD as a 64 bits app, it generates 32 bits code (I didn't even find any options to make it generate 64 bits code :). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 21 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #36 from nfxjfg gmail.com 2010-07-21 12:05:08 PDT ---
I guess we will have to see how Walter's 64 bit port will look like.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 21 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #37 from Leandro Lucarella <llucax gmail.com> 2010-07-22 17:54:38
PDT ---
BTW, I just realized that this type information scheme doesn't enable the GC to
move data around (i.e. it doesn't open the door to moving collectors) since the
distinction between *is* a pointer and *may be* a pointer is not present.

This difference is necessary for knowing when you can move things around, since
a block pointed to by a *may be* a pointer should be pinned, because you can't
update the word suspected to be a pointer (because in case is not, you are
corrupting user's memory). Only block pointed to by *is* a pointer words are
candidates for moving.

Maybe considering that this involves a change in the compiler, it would be a
good opportunity to finally open the door to moving collectors. Unfortunately
this means more type information is necessary. Maybe 2 bitmasks can be used, a
"scan" mask (that can take over the NO_SCAN attribute) and a "pointer" mask. A
word marked as "scan" (i.e. "scan" bit == 1) should be scanned but might be a
pointer or not (like the current bitmask proposed in this patch); a word marked
"pointer (i.e. "pointer bit == 1) is guaranteed to be a pointer and the GC is
allowed to update it if the pointed to block is moved.

So a real pointer should have both scan and pointer set to 1 (yeah, scan is
redundant, but I can't think of a better representation), and an union should
have only scan set to 1. I don't know what to do with void* though, since is
very common to use it as a "safe-buffer" instead of a real pointer array. The
safe bet is to keep it only with the "scan" bit, I guess.

I didn't think a lot if this organization of the information is the best for
the GC implementator, but it's simple and opens the door to experimentation, so
I think it's better to have a possibility even when is not the better solution
than not having anything; the interface can be refined in the future.

Since you are already playing with the compiler, do you think it's possible to
change the compiler to provide this information?

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 22 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #38 from Leandro Lucarella <llucax gmail.com> 2010-07-22 19:26:36
PDT ---
If I'm understanding the patch right, I think I found a bug. At the end of
reallocNoSync():

+            if (psize < size ||             // if new size is bigger
+                psize > size * 2)           // or less than half
+            {
+                p2 = mallocNoSync(size, bits, bitMask);
+
+                psize -= bitMaskSize;
+                size -= bitMaskSize;


Shouldn't size and psize be updated before mallocNoSync() is called? Otherwise
a block a word larger than needed would be allocated.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 22 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #39 from Leandro Lucarella <llucax gmail.com> 2010-07-22 19:27:29
PDT ---
(In reply to comment #38)
 If I'm understanding the patch right, I think I found a bug. At the end of
 reallocNoSync():
 
 +            if (psize < size ||             // if new size is bigger
 +                psize > size * 2)           // or less than half
 +            {
 +                p2 = mallocNoSync(size, bits, bitMask);
 +
 +                psize -= bitMaskSize;
 +                size -= bitMaskSize;
 
 
 Shouldn't size and psize be updated before mallocNoSync() is called? Otherwise
 a block a word larger than needed would be allocated.

Same for the code inside version (SENTINEL). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 22 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #40 from Leandro Lucarella <llucax gmail.com> 2010-07-22 20:26:48
PDT ---
Also, I think queryNoSync() subtract the bitmask size twice, because getInfo()
already subtract it from the block size.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 22 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #692 is|0                           |1
           obsolete|                            |


--- Comment #41 from nfxjfg gmail.com 2010-07-23 11:33:38 PDT ---
Created an attachment (id=695)
D1 - patch for dmd for creating pointer bitmasks

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 23 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #693 is|0                           |1
           obsolete|                            |


--- Comment #42 from nfxjfg gmail.com 2010-07-23 11:35:23 PDT ---
Created an attachment (id=696)
D1 - patch for Tango's runtime to enable precise GC scanning

- fix bugs mentioned by comments 38-40
- add a bitmask for moveable pointers

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 23 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #43 from nfxjfg gmail.com 2010-07-23 11:39:19 PDT ---
You're right, there seem to be some places where the bitmask size is added or
substracted twice. I don't really know; I took that code over from dsimcha's
patch without modification. I was just thankful that I hadn't to write this
code. I fixed the three issues you found. It seems to work, though I'm not 100%
sure if it's correct now.

Regarding moving collectors:

I think this would be an interesting experiment and worth trying. My patch now
has a second bitmask, where each bit tells whether a word is a moveable pointer
or not.

Instead of a second bitmap, you could just have a per-memory block flag that
tells whether a memory block is a void[] and/or contains unions with pointers.
(If the flag is set, the GC won't change any pointers inside the block.) You
could maintain that flag as additional bitmap (along NOSCAN etc). This way
storing small bitmaps inline would be simpler. Unions or fixed-size void[] are
probably seldom enough to justify this simplification. gcx.d can choose either
way to implement it using the compiler generated bitmasks.

Before actually implementing a moving GC, one should experiment how many pinned
pointers the stack/manually added ranges/unmovable pointers/datasegment
generate and how they would fragment the heap.

Also note that some runtime parts are not ready yet for a moving GC, such as
Object.toHash.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 23 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #44 from Leandro Lucarella <llucax gmail.com> 2010-07-23 12:15:44
PDT ---
(In reply to comment #43)
 You're right, there seem to be some places where the bitmask size is added or
 substracted twice. I don't really know; I took that code over from dsimcha's
 patch without modification. I was just thankful that I hadn't to write this
 code. I fixed the three issues you found. It seems to work, though I'm not 100%
 sure if it's correct now.

Thanks!
 Regarding moving collectors:
 
 I think this would be an interesting experiment and worth trying. My patch now
 has a second bitmask, where each bit tells whether a word is a moveable
 pointer or not.
 
 Instead of a second bitmap, you could just have a per-memory block flag that
 tells whether a memory block is a void[] and/or contains unions with pointers.
 (If the flag is set, the GC won't change any pointers inside the block.) You
 could maintain that flag as additional bitmap (along NOSCAN etc). This way
 storing small bitmaps inline would be simpler. Unions or fixed-size void[] are
 probably seldom enough to justify this simplification. gcx.d can choose either
 way to implement it using the compiler generated bitmasks.

Yes, that can be a nice optimization if void[] and unions are rare.
 Before actually implementing a moving GC, one should experiment how many
 pinned pointers the stack/manually added ranges/unmovable pointers/datasegment
 generate and how they would fragment the heap.

Yup! This patch opens that posibility too! Which is great.
 Also note that some runtime parts are not ready yet for a moving GC, such as
 Object.toHash.

Yes, this is just a step in the right direction, not a final solution. Thanks again! A bikeshed suggestion, maybe it would be better to call PointerMap methods this way: isPointerAt() -> mustScanWordAt() isMoveablePointerAt() -> isPointerAt() hasUnmoveablePointers() -> canUpdatePointers() Because the pointers are not "moveable", the memory they point at is the one moveable or not. The pointer, if really pointer, are updateable, and if they are not guaranteed to be pointers, we shouldn't call them that. That's why I think is clearer to call a "may be pointer" a word that should be scanned and a guaranteed pointer (which the GC could update) simply a pointer. By the same means, I'd call pointer_bits and mpointer_bits scan_bits and pointer_bits respectively. BTW, there is already a GC attribute called NO_MOVE, which follows this nomenclature (is supposed to be used for blocks that can't be moved, not for blocks which pointers can't be updated). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 23 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #45 from Leandro Lucarella <llucax gmail.com> 2010-07-24 20:08:30
PDT ---
Well, I've made a little benchmark for the patch.

I'm using the voronoi[1] benchmark, since I think is a good GC benchmark,
because it exercises the GC a lot, but it does real work with the data (unlike
the typical "tree" benchmark). I find it particularly interesting too because
it seems to have a lot of false positives, making the runtime very bound to the
addresses returned by mmap(), as shown in the post[2] mentioned earlier in this
comments. This benchmarks confirmed that the high variance in runtime presented
by this test is due to false positives.

[1] http://codepad.org/xGDCS3KO
[2] http://www.llucax.com.ar/blog/blog/post/-7a56a111

I've compared 3 binaries, one compiled without any of this patches
(voronoi-dnp-tnp, i.e. Dmd Not Precise, to be honest, this is DMD 1.062
distributed by DigitalMars), one compiled with a patched DMD, but unpatched
Tango to see if the PointerMaps created by the compiler has any effect on
performance (voronoi-dp-tnp) and one compiled with both patched DMD and Tango
(voronoi-dp-tp). Patched DMD is svn r580 (D1 branch), Tango is svn trunk r5505,
and the patches are not the latest ones, are the previous (sorry, I was working
on this before you came up with the new patch, I'll try the new patch when I
have some time).

Anyway, here are the result:
$ for i in {1..10}; do /usr/bin/time -f%e ./voronoi-dnp-tnp -n 30000; done
3.88
3.71
4.28
3.89
3.78
3.63
3.81
4.36
3.82
3.89

min  = 3.63
mean = 3.905, std = 0.234343053378
max  = 4.36

$ for i in {1..10}; do /usr/bin/time -f%e ./voronoi-dp-tnp -n 30000; done
3.74
3.87
3.69
3.91
4.33
4.07
4.26
4.08
3.78
4.32

min  = 3.69
mean = 4.005, std = 0.241994031148
max  = 4.33

$ for i in {1..10}; do /usr/bin/time -f%e ./voronoi-dp-tp -n 30000; done
4.05
4.03
4.04
4.03
4.02
4.02
4.02
4.02
4.02
4.05

min  = 4.02
mean = 4.03, std = 0.0124721912892
max  = 4.05

The differences between the patched an unpatched DMD doesn't look relevant,
specially taking into account the high variance and that they are not even the
same DMD version.

Unfortunately the patched GC fall a little (~3%) behind the unpatched one,
mean-wise, and a little more if we look at the min (about 10%), but the max
drops about the same (~7%) and the variance drops dramatically (~240%) with the
patched GC. Considering the precise scanning patch is the first try, and not
very optimized, it looks like a tradeoff that worth paying while trying to
improve this timings.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 24 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #46 from David Simcha <dsimcha yahoo.com> 2010-07-25 09:07:00 PDT
---
As far as the (small) drop in mean performance, I say "who cares?".  In
practice, once false pointers start eating you alive and the heap gets
unnecessarily huge, everything memory management-related becomes appallingly
slow.  This fact is probably not shown in the synthetic benchmarks.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 25 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #47 from Leandro Lucarella <llucax gmail.com> 2010-07-25 13:43:10
PDT ---
(In reply to comment #46)
 As far as the (small) drop in mean performance, I say "who cares?".  In
 practice, once false pointers start eating you alive and the heap gets
 unnecessarily huge, everything memory management-related becomes appallingly
 slow.  This fact is probably not shown in the synthetic benchmarks.

Yup, as I said in another comment: Running dil to generate the full Tango documentation can take from 50 to 80 seconds depending on the address range returned by the OS (I suspect because of false pointers; which I hope to prove trying this patch :) I'll try to do some testing with dil, but I don't think it will be in the near future. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 25 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #48 from Leandro Lucarella <llucax gmail.com> 2010-07-25 17:04:56
PDT ---
Well timings for dil are much worse :(

This is dil generating the Tango docs, without precise scanning (dmd with the
last patch, Tango unpatched):
52.01
53.73
52.21
51.79
50.58
52.62
53.01
48.49
51.51
49.68

min  = 48.49
mean = 51.563, std = 1.58236847795
max  = 53.73

And this is with precise scanning:
79.47
69.81
72.27
67.12
68.34
83.85
65.28
75.17
77.76
74.73

min  = 65.28
mean = 73.38, std = 5.91698309013
max  = 83.85

I can't explain why is even less stable with precise scanning, maybe is
something to do with it being less cache-friendly. I don't know either why is
so stable without precise scanning, when in other tests I've done it wasn't
like that. Maybe the majority of false positives are from the static memory and
the binary was placed in an address that make them less probable (the binary
address doesn't change between runs).

It would be nice if other people give this patches a try, specially with
programs that have problems with false positives...

Here are the profiling results (using perf[1]):
Non-precise scanning: http://pastebin.com/zRKyc9mW
Precise scanning:     http://pastebin.com/jPJZLL8p

It looks like findPool() might be used much more often than before? For
example, I noticed the mixin code calls findPool() very early, so maybe it's
being called in some situations where it's not necessary. Also, with the patch
sizeOf() needs to call findPool(), but I don't think that should make a lot of
difference (unless is used by the runtime very often for array manipulation).

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 25 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #49 from nfxjfg gmail.com 2010-07-25 17:21:10 PDT ---
 Leandro:

Thanks for doing these tests!

In D1, the runtime calls gc_query on _every_ array append operation. Though the
cache in Gcx.getInfo() speeds it up a bit. But it doesn't look like
gc_query/getInfo() needs to do more work because of this patch?

You also could try to replace the code in the mark function by the old,
non-precise one. This way you can see how much overhead the bitmask
manipulation code causes.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 25 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #50 from Leandro Lucarella <llucax gmail.com> 2010-07-25 17:30:20
PDT ---
(In reply to comment #48)
 It looks like findPool() might be used much more often than before? For
 example, I noticed the mixin code calls findPool() very early, so maybe it's
 being called in some situations where it's not necessary. Also, with the patch
 sizeOf() needs to call findPool(), but I don't think that should make a lot of
 difference (unless is used by the runtime very often for array manipulation).

Seeing the callgraph, 95% of the findPool() calls are being made directly from mark(), 4.5% from fullcollectshell(). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 25 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #51 from Leandro Lucarella <llucax gmail.com> 2010-07-25 18:10:01
PDT ---
(In reply to comment #48)
 Here are the profiling results (using perf[1]):
 Non-precise scanning: http://pastebin.com/zRKyc9mW
 Precise scanning:     http://pastebin.com/jPJZLL8p
 
 It looks like findPool() might be used much more often than before? For
 example, I noticed the mixin code calls findPool() very early, so maybe it's
 being called in some situations where it's not necessary. Also, with the patch
 sizeOf() needs to call findPool(), but I don't think that should make a lot of
 difference (unless is used by the runtime very often for array manipulation).

OTOH, note that all 3 functions have a *lot* of more samples in the precise case, they are doing a lot more of work. It doesn't look like the code added for precise scanning should make *that much* extra work, so that made me think, what if the overhead for storing the type information was significant enough to make the program eat more memory and those functions were doing more work because there is more memory to scan and not because scanning takes longer? And it looks like that might be the problem: Precise scanning: Maximum resident set size, in Kilobytes = 1289984 Elapsed real time, in seconds = 68.56 Non precise scanning: Maximum resident set size, in Kilobytes = 950400 Elapsed real time, in seconds = 49.74 Note this pattern: memory: 950400/1289984 = 0.7367533 time: 49.74/68.56 = 0.7254959 Scary close :) Maybe dil uses a lot of small objects, and/or a lot of 16/32/etc objects, that doubles the memory needed to store them because of the bitmask pointer. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 25 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #52 from Leandro Lucarella <llucax gmail.com> 2010-07-25 18:45:10
PDT ---
OK, here is the other side of the coin, a small fabricated benchmark (actually
stolen from the NG, sightly modified by me to make false pointers much more
likely):

// Written by Babele Dunnit <babele.dunnit gmail.com>
// Found at
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=54084
// Sightly modified by Leandro Lucarella <llucax gmail.com>
// (some readability improvements and output removed)

const IT = 300;
const N1 = 20_000;
const N2 = 40_000;

class Individual
{
        Individual[20] children;
        int[100] data = void;
        this() {
                foreach (i; data)
                        data[i] += (cast(int)&children) - i;
        }
}

class Population
{

        void grow()
        {
                foreach(inout individual; individuals)
                {
                        individual = new Individual;
                }
        }

        Individual[N1] individuals;
}

version = loseMemory;

int main(char[][] args)
{

        Population testPop1 = new Population;
        Population testPop2 = new Population;

        Individual[] indi = new Individual[N1];

        for(int i = 0; i < IT; i++)
        {
                testPop1.grow();
                testPop2.grow();

                version (loseMemory){
                        indi[] = testPop1.individuals ~ testPop2.individuals;
                }

                version (everythingOk){
                        indi[0..N1] = testPop1.individuals;
                        indi[N1..N2] = testPop2.individuals;
                }
        }

        return 0;
}

Here are the results:

Precise:
Maximum resident set size, in Kilobytes = 160192
Elapsed real time, in seconds           = 31.72
(seeing the memory consumption, it was very stable all the time)

Non-precise:
Maximum resident set size, in Kilobytes = 2202400
Elapsed real time, in seconds           = 97.02
(I have to kill it because it was eating up all my memory)

So maybe precise scanning should be configurable (but when not enabled, it
shouldn't store the bitmask in the memory block to avoid the extra heap
space)...

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 25 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #695 is|0                           |1
           obsolete|                            |


--- Comment #53 from nfxjfg gmail.com 2010-07-25 19:08:48 PDT ---
Created an attachment (id=698)
D1 - patch for dmd for creating pointer bitmasks

Some changes to make the patch behave under 64 bit mode. Under 32 bits, the
compiler should produce exactly the same as the patch before. Under 64 bit,
both native 64 bit mode and cross compiling to 32 bit should work. The trick is
that PTRSIZE is a variable, not a #define.


 Leandro:

(to comment #52)
I'm glad that this patch actually seems to help!
What do you think, is compile time configurability enough, or should it be
possible to chose at runtime?

(to comment #51)
I'm not sure what to do about this. One could try to store the bitmap inside
the memory block with just 1-2 bytes to save space. But in most cases, size
alignment would make that useless. Also possible: use the same bitmask for each
page that is subdivided into bins. It's well possible that due to additional
fragmentation this would use much more memory than it saves, though.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 25 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #54 from Leandro Lucarella <llucax gmail.com> 2010-07-25 19:47:26
PDT ---
(In reply to comment #53)
 Created an attachment (id=698) [details]
 D1 - patch for dmd for creating pointer bitmasks
 
 Some changes to make the patch behave under 64 bit mode. Under 32 bits, the
 compiler should produce exactly the same as the patch before. Under 64 bit,
 both native 64 bit mode and cross compiling to 32 bit should work. The trick is
 that PTRSIZE is a variable, not a #define.

Nice! I really hope the DMD patch is accepted :)
  Leandro:
 
 (to comment #52)
 I'm glad that this patch actually seems to help!
 What do you think, is compile time configurability enough, or should it be
 possible to chose at runtime?

I think something in between is the best choice. In the GC I'm working on, I've implemented runtime configurability but at startup-time, using environment variables. I find that being a good tradeoff, since you can make assumptions that would be true for all the program's lifetime and you don't have to recompile (the runtime!) to play with GC settings. But I thinks that doesn't belong to this patch, it can be added in the future :) For now, I think that compile-time configurability is acceptable, but it would be better to not store type information at all if precise scanning is not enabled (otherwise I think it would hurt more than not disabling it).
 (to comment #51)
 I'm not sure what to do about this. One could try to store the bitmap inside
 the memory block with just 1-2 bytes to save space. But in most cases, size
 alignment would make that useless. Also possible: use the same bitmask for each
 page that is subdivided into bins. It's well possible that due to additional
 fragmentation this would use much more memory than it saves, though.

Yes, I think it's hard to improve things, at least under the current GC design, but the important thing (for me) is the compiler patch, not the runtime, because it open the doors for serious GC research. I think the runtime can be rethought and a good design takes advantage of this extra information should be possible in the future. Thanks a lot for the good patches! -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 25 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #55 from Sean Kelly <sean invisibleduck.org> 2010-07-25 20:45:32
PDT ---
If/when this makes it into D2 I'm going to store the info in a parallel array
kind of like the bits are.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 25 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #56 from Leandro Lucarella <llucax gmail.com> 2010-07-26 06:40:21
PDT ---
(In reply to comment #55)
 If/when this makes it into D2 I'm going to store the info in a parallel array
 kind of like the bits are.

Will you store a pointer to the pointer bitmask? In that case you'll have an overhead of 4096/16 = 256, 256*4 = 1KiB (32 bits; 2KiB for 64) for each allocated page, even when a page might have all its data marked as NO_SCAN. That's 25% overhead (50% for 64 bits). Even if that overhead is non-GC'ed memory, so it won't get scanned or anything, it looks like too much. If you plan to store it only when necessary and being dependent on the bin size, it would be very hard to address, I think. I'll try storing the bits directly for small bins (16..128 for 32 bits, ..256 for 64) first, it seems easier and it looks like it could help. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 26 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #57 from Leandro Lucarella <llucax gmail.com> 2010-07-27 21:19:01
PDT ---
I think there is a not-so-important bug in the DMD patch, the bits.length value
looks like it needs to be divided by size_t.sizeof (which is odd, since in the
patch it looks like it's already done in setSize()).

See this test program:

---
extern (C) int printf(char*, ...);

struct Test
{
        int x;
        void* p;
        char[15] s;
        Test* t;
        long l;
        union U {
                int* pi;
                int  i;
        }
        U u;
}

void main()
{
        Test* t = new Test;
        printf("sizeof = %zu\n", Test.sizeof);
        auto pm = typeid(Test).pointermap.bits;
        printf("PointerMap: ptr = %p, length = %zu, T words = %zu, scan bits =
%zx, ptr bits = %zx\n",
                        pm.ptr, pm.length, pm[0], pm[1], pm[2]);
}
---

The output is:

PointerMap: ptr = 0x805ab1c, length = 12, T words = 10, scan bits = 242, ptr
bits = 42

All looks nice except for the length value, which should be 3 instead of 12.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 27 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #58 from Leandro Lucarella <llucax gmail.com> 2010-07-27 21:23:34
PDT ---
And another small comment about the Tango runtime patch, you add a binSize()
function, but there is already a binsize[] array for the same purpose, you can
remove the binSize(bin) function and replace it with binsize[bin].

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 27 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #698 is|0                           |1
           obsolete|                            |


--- Comment #59 from nfxjfg gmail.com 2010-07-27 22:49:51 PDT ---
Created an attachment (id=700)
D1 - patch for dmd for creating pointer bitmasks

You're right, I stupidly write the byte length, not the array length.
This patch fixes it and also eliminates two warnings about void* arithmetic.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 27 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #696 is|0                           |1
           obsolete|                            |


--- Comment #60 from nfxjfg gmail.com 2010-07-28 08:40:37 PDT ---
Created an attachment (id=701)
D1 - patch for Tango's runtime to enable precise GC scanning

- lots of nasty refactoring in gcx.d:
   - kill the string mixin
   - replace binSize by binsize as suggested by Leandro (that was from
dsimcha's patch; no idea why it was there)
   - realloc is now "emulated" (this simplifies things a lot - untested because
neither the runtime, Tango, or any of my code actually use realloc); note that
realloc doesn't shrink anymore, but that could be added back
   - replace some the implementation of sizeOf and addrOf functions by using
Gcx.getInfo (the high level of code duplication in gcx.d sure is horrible)
   - explicitly support SENTINEL (I have no idea why the code apparently worked
with SENTINEL enabled; at least it should have messed up the bitmask scanning
offset)
   - now you can control per environment variable if bitmasks will be stored in
memory blocks (see Memory.d GC ddoc comment)
- update method names and documentation as suggested by Leandro in comment 44


Also, shouldn't functions like freeNoSync check for interior pointers? What
happens if you call it with such a pointer?

There was also some awkwardness about sentinels. For example getInfo() doesn't
substract the size of the sentinel from the returned size. (I fixed that
specific one because not fixing it would have meant to introduce a special
case.)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 28 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #61 from Leandro Lucarella <llucax gmail.com> 2010-07-28 12:23:01
PDT ---
(In reply to comment #60)
 Created an attachment (id=701) [details]
 D1 - patch for Tango's runtime to enable precise GC scanning
 
 - lots of nasty refactoring in gcx.d:

Even when I agree that the GC needs a lot of refactoring, I don't think it's a good idea to include it in this patch, it makes much harder to understand it and it might introduce some subtle bugs that will be very hard to track down in so many changes. I'm working on a GC that was based on the Tango (0.99.9) GC and doing a lot of refactoring myself. The refactoring of things introduced by the patch looks really nice though.
 Also, shouldn't functions like freeNoSync check for interior pointers? What
 happens if you call it with such a pointer?

I don't think so, it should be undefined behavior (based on C's free()). Thanks! -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 28 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #62 from nfxjfg gmail.com 2010-07-28 12:49:49 PDT ---
(In reply to comment #61)
 Even when I agree that the GC needs a lot of refactoring, I don't think it's a
 good idea to include it in this patch, it makes much harder to understand it
 and it might introduce some subtle bugs that will be very hard to track down in
 so many changes.

I have to agree, but I did these changes in order to make storing a bitmask configurable, and to properly support SENTINEL. The changes the original patch did weren't very small either. I see 4 options: 1. keep this anyway 2. keep the old gcx.d around and apply the changes to a new incarnation of gcx.d, and let the user choose the GC implementation at startup or compile time 3. only accept the compiler patch, and wait for Leandro's new GC 4. revert to the previous version of my patch (of course I wouldn't like this at all) Which is it? I also wonder about D2. The array appending issue makes porting the Tango patch non-trivial.
 Also, shouldn't functions like freeNoSync check for interior pointers? What
 happens if you call it with such a pointer?

I don't think so, it should be undefined behavior (based on C's free()).

Then the code should probably check whether the pointer is valid, and abort() or throw an exception if not. I don't think intentionally introducing random behavior when you could avoid it is a good idea. Also the documentation comments in Memory.d seem to disagree with your assessment. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 28 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #63 from Leandro Lucarella <llucax gmail.com> 2010-07-28 12:57:35
PDT ---
(In reply to comment #60)
    - explicitly support SENTINEL (I have no idea why the code apparently worked
 with SENTINEL enabled; at least it should have messed up the bitmask scanning
 offset)

BTW, I think MEMSTOMP is broken too (but in a non-destructive way, things work anyways). Is very hard to keep those features functional because they are so hard to use. I've made those 2 features configurable via env variables in my GC. I'll try to publish my repo in case anyone want to steal some ideas :) -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 28 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #64 from Leandro Lucarella <llucax gmail.com> 2010-07-28 13:04:07
PDT ---
(In reply to comment #62)
 (In reply to comment #61)
 Even when I agree that the GC needs a lot of refactoring, I don't think it's a
 good idea to include it in this patch, it makes much harder to understand it
 and it might introduce some subtle bugs that will be very hard to track down in
 so many changes.

I have to agree, but I did these changes in order to make storing a bitmask configurable, and to properly support SENTINEL. The changes the original patch did weren't very small either. I see 4 options: 1. keep this anyway 2. keep the old gcx.d around and apply the changes to a new incarnation of gcx.d, and let the user choose the GC implementation at startup or compile time 3. only accept the compiler patch, and wait for Leandro's new GC 4. revert to the previous version of my patch (of course I wouldn't like this at all) Which is it?

I think this should be replied by whoever have the authority to merge the patches, my comments were just wishes (and you made them true in a much higher proportion that I was expecting :). I'd say 1 is OK. About my GC, is a research work and, even when my goal is something realistic and I'm doing performance tests all the time, don't expect much. :) -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 28 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #65 from Leandro Lucarella <llucax gmail.com> 2010-08-07 22:19:43
PDT ---
This blog post might be of interesting for this bug report:
http://www.llucax.com.ar/blog/blog/post/1490c03e

Specially for comment 55 and comment 56.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 07 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #66 from bearophile_hugs eml.cc 2010-08-08 06:03:55 PDT ---
 http://www.llucax.com.ar/blog/blog/post/1490c03e

That page shows me a spectacularly complex page of Python error messages. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 08 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #67 from Leandro Lucarella <llucax gmail.com> 2010-08-08 08:10:19
PDT ---
Sor(In reply to comment #66)
 http://www.llucax.com.ar/blog/blog/post/1490c03e

That page shows me a spectacularly complex page of Python error messages.

Sorry about that, fixed! -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 08 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #68 from Leandro Lucarella <llucax gmail.com> 2010-08-15 07:24:36
PDT ---
More analysis on wasted space (for the current GC and for the precise patch)
here:
http://www.llucax.com.ar/blog/blog/post/098ceae8

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 15 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #701 is|0                           |1
           obsolete|                            |


--- Comment #69 from nfxjfg gmail.com 2010-08-24 01:37:05 PDT ---
Created an attachment (id=737)
 D1 - patch for Tango's runtime to enable precise GC scanning

Same as old patch, updated to newer tango svn.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 24 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #70 from nfxjfg gmail.com 2010-08-24 01:40:52 PDT ---
Created an attachment (id=738)
experiment: use ClassInfo to get bitmask for object allocations

objbitmask.patch is a patch on top of tango_precise_gc.patch, which makes
storing the pointer bitmap inline unnecessary. Instead, it assumes that all
memory blocks flagged with BlkAttr.FINALIZE are D objects, and uses their
ClassInfo to get the bitmask. Seems to be slightly slower, but also reduces
space overhead.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 24 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #700 is|0                           |1
           obsolete|                            |


--- Comment #71 from nfxjfg gmail.com 2010-08-24 06:26:56 PDT ---
Created an attachment (id=739)
D1 - patch for dmd for creating pointer bitmasks

don't emit pointer maps for NO_SCAN types (breaks the test program pm.d
included in the other post)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 24 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #739 is|0                           |1
           obsolete|                            |


--- Comment #72 from nfxjfg gmail.com 2010-09-09 06:36:26 PDT ---
Created an attachment (id=751)
D1 - patch for dmd for creating pointer bitmasks

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 09 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #737 is|0                           |1
           obsolete|                            |


--- Comment #73 from nfxjfg gmail.com 2010-09-09 06:37:13 PDT ---
Created an attachment (id=752)
D1 - patch for Tango's runtime to enable precise GC scanning

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 09 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #74 from nfxjfg gmail.com 2010-09-09 06:41:46 PDT ---
Created an attachment (id=753)
dmd: enable precise scanning for AAs

AAs are special because they use some runtime mechanism. dmd didn't allow
precise scanning because not all type information was available (usually only
the key).

This patch completely changes the runtime interface for AAs in order to allow
precise scanning. It also does much much more. (For example I didn't like that
dmd's runtime interface was so incredibly whacky, such as passing values of
arbitrary runtime type per vararg, and expecting the value next to the argument
before on the stack. Walter probably has to change this anyway for the dmd 64
bit port...)

The patched dmd is backwards compatible and compiles Tango svn just fine. The
patched Tango tells dmd to use the new ABI by declaring a magical member
variable somewhere in object.di.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 09 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #75 from nfxjfg gmail.com 2010-09-09 06:43:43 PDT ---
Created an attachment (id=754)
tango: enable precise scanning for AAs

This is the Tango patch that goes with the dmd patch (attachment 753). The AA
implementation is duplicated because the Tango patch is as well compatible with
an unpatched dmd. An unpatched dmd will compile the "old" AA implementation
instead of the new one.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 09 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #488 is|0                           |1
           obsolete|                            |


--- Comment #76 from nfxjfg gmail.com 2010-09-09 06:45:13 PDT ---
(From update of attachment 488)
marking dsimcha's patches as obsolete because there are way too many
attachments already, and it seems it's not going to be used (even less than my
patches anyway)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 09 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #77 from nfxjfg gmail.com 2010-09-15 02:22:03 PDT ---
There's a small bug in the memory clearing in mallocNoSync(), that could cause
memory leaks. Will post patch on request (nobody is using this anyway, right?).

Also, I found out that that SENTINEL is dead code. Uncommenting the "debug =
SENTINEL;" line on start of gcx.d will not include the code in
"version(SENTINEL)" blocks, because "debug" identifiers and "version"
identifiers live in different namespaces. Makes me wonder, does anyone ever use
the SENTINEL stuff for testing? Or just thought he'd be using it? Btw. someone
said 16 bytes is the minimum alignment on OSX, so SENTINEL is broken there
anyway.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 15 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #78 from Leandro Lucarella <llucax gmail.com> 2010-09-15 05:51:08
PDT ---
(In reply to comment #77)
 There's a small bug in the memory clearing in mallocNoSync(), that could cause
 memory leaks. Will post patch on request (nobody is using this anyway, right?).

I ported it to my GC, so I'm interested in knowing what the bug is about to check if I have it too.
 Also, I found out that that SENTINEL is dead code. Uncommenting the "debug =
 SENTINEL;" line on start of gcx.d will not include the code in
 "version(SENTINEL)" blocks, because "debug" identifiers and "version"
 identifiers live in different namespaces. Makes me wonder, does anyone ever use
 the SENTINEL stuff for testing? Or just thought he'd be using it? Btw. someone
 said 16 bytes is the minimum alignment on OSX, so SENTINEL is broken there
 anyway.

Yeah, I think most of the versioned code is dead or close to it, because I doubt anybody test it. In my GC, which makes SENTINEL and MEMSTOMP run-time configurations, SENTINEL is broken and I didn't had the time to fix it yet. MEMSTOMP seems to work though. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 15 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #79 from nfxjfg gmail.com 2010-09-15 06:33:39 PDT ---
Incremental patch:

diff --git a/tango/core/rt/gc/basic/gcx.d b/tango/core/rt/gc/basic/gcx.d
index 93c8078..0f049d7 100644
--- a/tango/core/rt/gc/basic/gcx.d
+++ b/tango/core/rt/gc/basic/gcx.d
   -589,9 +589,9    class GC
             // Return next item from free list
             gcx.bucket[bin] = (cast(List*)p).next;
             if( !(bits & BlkAttr.NO_SCAN) )
-                cstring.memset(p + size, 0, binsize[bin] - size);
+                cstring.memset(p + request_size, 0, capacity - request_size);
             //debug(PRINTF) printf("\tmalloc => %x\n", p);
-            debug (MEMSTOMP) cstring.memset(p, 0xF0, size);
+            debug (MEMSTOMP) cstring.memset(p, 0xF0, request_size);
         }
         else
         {

request_size is the size the user requested, while size is request_size + the
pointermap's size. But size can be much smaller than memory block's real size
(= capacity), so you really want to start clearing the memory right after the
user data.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 15 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #80 from nfxjfg gmail.com 2010-09-16 11:26:43 PDT ---
By the way, if the patch is going to be accepted, it would probably be good to
get rid of the NO_SCAN flags. Instead, NO_SCAN should be detected by examining
the PointerMap. A PointerMap with length 0 could take over the NO_SCAN meaning. 

E.g.:

Old code:
bool is_NO_SCAN = !(typeinfo.flags() & 1);

New code:
bool is_NO_SCAN = typeinfo.pointermap().bits.length == 0;

Then you don't need to provide/pass around the TypeInfo.flags() anymore. Would
get rid of some uglyness.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 16 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #81 from nfxjfg gmail.com 2010-11-14 18:06:05 PST ---
I obsoleted all the patches because they were outdated (too old dmd/Tango
versions). I don't think it's very efficient to make new patches and post them
here (I mean, there are already 21 obsoleted patches, and this issue has
probably the most comments on bugzilla, despite not much going on).

Please send me an email to request newer patches. These patches include:
* dmd patch for PointerMap output (backwards compatible to old Phobos etc.)
* Tango patch to include PointerMap in object.d, and to make use of them in
gcx.d
* additional dmd patch to make associative array scanning precise (changes ABI
to include more type info; again completely backwards compatible to Phobos)
* additional Tano patch to make associative array scanning precise (backwards
compatible)

The obsoleted patches will only apply to old dmd versions, and what's worse,
only if you run unix2dos on all dmd source files beforehand. (The line endings
in the dmd.zip are inconsistent, and the dmd svn is unusable because releases
are not tagged properly.)

Btw. I didn't really appreciate Walter's 64 bit changes to the AA ABI. Really,
the ABI should pass all TypeInfos to the AA functions. If my patch is applied,
dmd supports roughly 3 AA ABIs, which is very silly.

If someone has a better idea how to keep track of the patches and so on (that
are better than flooding bugzilla or emailing me), please make a suggestions.

Last but not least:

[02:55:20] <wm4> AndreiAlexandres: also, I'm about to post new patches for
precise gc scanning (update to newer dmd versions) to issue 2463 on bugzilla;
is there some better way to participate on D/dmd development than to
continuously post patches to bugzilla that are going to be ignored
[02:57:07] * AndreiAlexandres has quit (Quit: AndreiAlexandres)

It could be a bad coincidence, or show how fabulously awesome the D development
model is.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Nov 14 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #82 from Leandro Lucarella <llucax gmail.com> 2010-11-14 19:17:59
PST ---
Maybe you should try with LDC's or GDC's issues trackers, as this is an
implementation detail maybe it gets better reception there (but it would be
hard to get accepted there since both uses DMDFE and it would mean to fork it).

When I want to keep patches easily updated I use git (git svn in this case).
But then, the lack of tagging in DMD's svn can be annoying.

Anyway, unfortunately DMD development model still sucks, it sucks much less
than... let's say 2 years ago, but...

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Nov 14 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #83 from bearophile_hugs eml.cc 2010-11-15 04:43:23 PST ---
(In reply to comment #82)

 Anyway, unfortunately DMD development model still sucks, it sucks much less
 than... let's say 2 years ago, but...

Walter is willing to slowly improve his practices. Do you have a list of suggestions? If you show this list on the main D newsgroup he might pick something from it. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 15 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #84 from Leandro Lucarella <llucax gmail.com> 2010-11-15 04:47:48
PST ---
(In reply to comment #83)
 (In reply to comment #82)
 
 Anyway, unfortunately DMD development model still sucks, it sucks much less
 than... let's say 2 years ago, but...

Walter is willing to slowly improve his practices. Do you have a list of suggestions? If you show this list on the main D newsgroup he might pick something from it.

I (and others) already suggested him how to improve things, a few really basic examples that come to my mind right now: * Tag releases in svn * Give feedback to people wanting to contribute -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 15 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #85 from bearophile_hugs eml.cc 2010-11-15 09:33:56 PST ---
(In reply to comment #84)

 I (and others) already suggested him how to improve things,

Keep suggesting those things. Sometimes you have to say something five times to Walter.
 * Tag releases in svn

Explain him why this is useful.
 * Give feedback to people wanting to contribute

Maybe he doesn't fully understand why contributions from other people are important. Showing him some examples is useful. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 15 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #86 from Leandro Lucarella <llucax gmail.com> 2010-11-15 09:44:18
PST ---
(In reply to comment #85)
 (In reply to comment #84)
 
 I (and others) already suggested him how to improve things,

Keep suggesting those things. Sometimes you have to say something five times to Walter.
 * Tag releases in svn

Explain him why this is useful.
 * Give feedback to people wanting to contribute

Maybe he doesn't fully understand why contributions from other people are important. Showing him some examples is useful.

This is getting a little off-topic here, but I think I'll focus on other projects, is too much effort. I agree that Walter has improved a little, but the ratio results/efforts is *really* low still, and it doesn't worth it for me any more. I hate sounding like a broken record... -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 15 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


Walter Bright <bugzilla digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugzilla digitalmars.com


--- Comment #87 from Walter Bright <bugzilla digitalmars.com> 2011-04-13
21:51:35 PDT ---
The idea, as I understand it, is to supply a bit mask of where the pointers
are. For me, the difficulties are:

1. distinguishing real pointers from might-be-a-pointer (such as you might get
from union { int a; void* p; }).

2. large static arrays, large structs/classes, structs with large static array
members, etc.

The amount of static data dedicated to such bit arrays could get very large.

I see two solutions:

1. if it is case (1) or (2), give up for that type and revert to the current
method of scanning that object

2. devise a 'state machine' instead that the gc executes for a type. The state
machine has instructions like "advance n bytes to the next pointer" and "the
next pointer is ambiguous" and "execute the following sequence n times."

I don't see an obvious choice which is better.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Apr 13 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #88 from bearophile_hugs eml.cc 2011-04-14 02:06:33 PDT ---
(In reply to comment #87)

 1. distinguishing real pointers from might-be-a-pointer (such as you might get
 from union { int a; void* p; }).

In C unions are not tagged, but in many C programs the programmer has introduced some mean to tell apart at runtime what's inside the union, like a first bit set in the pointer, a tag field in a struct where the union is, or a tag value elsewhere in the program. So time ago I have suggested an optional standard method useful fur unions, named something like onGC(...). It gets called at runtime by the GC during when it scans the union instance, and in some way tells the GC if a field is a pointer to follow (to call that onGC() the GC has to know that something is a specific union in the first place). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #89 from Walter Bright <bugzilla digitalmars.com> 2011-04-14
02:23:28 PDT ---
(In reply to comment #88)

In order to support C compatibility, untagged unions must be supported by the
type system and the GC.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #90 from bearophile_hugs eml.cc 2011-04-14 03:58:52 PDT ---
(In reply to comment #89)
 (In reply to comment #88)
 
 In order to support C compatibility, untagged unions must be supported by the
 type system and the GC.

Right, but this doesn't touch much what I have said. Many programs that use unions contain some mean at runtime to tell what's inside the union. This information is what onGC() uses. Where the onGC() union method is not present, or the union contents are otherwise unknown, the GC may fall back to being not precise and conservative. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


Steven Schveighoffer <schveiguy yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |schveiguy yahoo.com


--- Comment #91 from Steven Schveighoffer <schveiguy yahoo.com> 2011-04-14
05:16:41 PDT ---
(In reply to comment #87)
 The idea, as I understand it, is to supply a bit mask of where the pointers
 are. For me, the difficulties are:
 
 1. distinguishing real pointers from might-be-a-pointer (such as you might get
 from union { int a; void* p; }).

This is only a problem for a moving GC. In practice, unions are not very common, so the conservative scanning for unions should be very small compared to the other cases.
 
 2. large static arrays, large structs/classes, structs with large static array
 members, etc.
 
 The amount of static data dedicated to such bit arrays could get very large.

Yes and no. Consider right now (although I think David fixed this), we allocate a bit for every 16 bytes of a page, even if the whole page is a block. Even for that, the ratio is very small (1 bit per 16 bytes == 1/128). I think in reality, the only common case for potentially large static data is arrays. I think the code needs some way of saying "here is an array of X elements with this bitmask". I think that should cover most cases.
 I see two solutions:
 
 1. if it is case (1) or (2), give up for that type and revert to the current
 method of scanning that object

This is exactly the opposite of what you want -- large arrays and structs are the biggest problem for the conservative GC. Whether this translates to static arrays, I'm not sure.
 2. devise a 'state machine' instead that the gc executes for a type. The state
 machine has instructions like "advance n bytes to the next pointer" and "the
 next pointer is ambiguous" and "execute the following sequence n times."

This could be a viable solution, but I think all we need would be one descriptor type: offset x has bitmask Y repeated N times. where the offset is optional, and describes the offset from the previous bitmask. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #92 from David Simcha <dsimcha yahoo.com> 2011-04-14 06:04:04 PDT
---
(In reply to comment #91)
 Yes and no.  Consider right now (although I think David fixed this), we
 allocate a bit for every 16 bytes of a page, even if the whole page is a block.

Yes, I did fix this. Now it's every 16 bytes for small (<1 page) allocations and every page for large (>= 1 page) allocations.
 2. devise a 'state machine' instead that the gc executes for a type. The state
 machine has instructions like "advance n bytes to the next pointer" and "the
 next pointer is ambiguous" and "execute the following sequence n times."

This could be a viable solution, but I think all we need would be one descriptor type: offset x has bitmask Y repeated N times. where the offset is optional, and describes the offset from the previous bitmask.

This is roughly what I did in my initial patches. The comments at the top of the file describe it in more detail. See http://d.puremagic.com/issues/attachment.cgi?id=488&action=edit -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #93 from Leandro Lucarella <llucax gmail.com> 2011-04-14 06:21:29
PDT ---
You can take a look at my concurrent D GC (CDGC), which is also precise. It is
based on the work done by nfxjfg gmail.com (which is based on the work done by
David Simcha).

Here is the very modest (and not so useful) project page:
http://www.llucax.com.ar/proj/dgc/

Remember Druntime use to have a branch with CDGC merged:
http://www.dsource.org/projects/druntime/browser/branches/CDGC

And here is a mini HOWTO to try CDGC (in Tango), but now that is merged, it's
outdated if you use Tango trunk (in which case you just have to call bob with
-g=cdgc), but in that post you can find the compiler patches too (which I think
are the exact same nfxjfg gmail.com did):
http://llucax.com.ar/blog/blog/post/-4d1fefb4

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #94 from Walter Bright <bugzilla digitalmars.com> 2011-04-14
11:54:00 PDT ---
(In reply to comment #92)
 This is roughly what I did in my initial patches.  The comments at the top of
 the file describe it in more detail.  See
 http://d.puremagic.com/issues/attachment.cgi?id=488&action=edit

I think that covers things, except for handling ambiguous pointers. But I think it still consumes a lot of static storage. I suggest a state machine, which will consume roughly 1 byte per pointer in the type. The state machine is a list of instructions, one byte each: 00 end of state machine 01 pointer at this offset, advance offset by size_t 02 advance offset by size_t, pointer here, advance by size_t 03 advance offset by 2*size_t, pointer here, advance by size_t .. 0F advance offset by 0F*size_t, pointer here, advance by size_t 10 advance offset by the following two bytes 11 ambiguous pointer at this offset, advance offset by size_t 12 execute loop nn times, where nn is the next byte. loop starts after nn 13 same as 12, but nn is the value in the next 2 bytes 14 end of loop (loops can nest) 15 the following size_t bytes are a pointer to another state machine to call like a function Or something like that. I think the state machine would be very compact, and would be fast to execute. A function can be added to TypeInfo to return the state machine for that type. The compiler would generate the data for the state machine. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #95 from Steven Schveighoffer <schveiguy yahoo.com> 2011-04-14
12:25:48 PDT ---
(In reply to comment #94)
 
 I think that covers things, except for handling ambiguous pointers.

Can you explain why we care about ambiguous pointers? That is, shouldn't we just always consider that an ambiguous pointer is a pointer? Why do we need a separate designation?
 But I think
 it still consumes a lot of static storage. I suggest a state machine, which
 will consume roughly 1 byte per pointer in the type.
 

I think a state machine would work. I especially like the ability to call another state machine program, that helps tremendously where types include other types, and inheritance. To optimize the state machine a bit, it might be good to include a "compression" feature where if you have just a dense bunch of pointer/value types, you give the state machine a bitmask. probably the worst case is an array of array references (which alternate between pointer and length). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #96 from Leandro Lucarella <llucax gmail.com> 2011-04-14 12:32:49
PDT ---
(In reply to comment #95)
 (In reply to comment #94)
 
 I think that covers things, except for handling ambiguous pointers.

Can you explain why we care about ambiguous pointers? That is, shouldn't we just always consider that an ambiguous pointer is a pointer? Why do we need a separate designation?

To allow moving collectors (you can't update a pointer that's not really a pointer). Anyway, this problem is taking care of in the implementation by nfxjfg, which Walter seem to keep ignoring, as he always did with this bug report and all the useful insight it have. I'm not saying the solution is the best, but it's completely ignored for no (publicly know reason). I'll unsubscribe now from this bug report, as it really makes me sick =) If you want to contact me, your know where to find me... -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #97 from Andrei Alexandrescu <andrei metalanguage.com> 2011-04-14
13:00:36 PDT ---
(In reply to comment #96)
 (In reply to comment #95)
 (In reply to comment #94)
 
 I think that covers things, except for handling ambiguous pointers.

Can you explain why we care about ambiguous pointers? That is, shouldn't we just always consider that an ambiguous pointer is a pointer? Why do we need a separate designation?

To allow moving collectors (you can't update a pointer that's not really a pointer). Anyway, this problem is taking care of in the implementation by nfxjfg, which Walter seem to keep ignoring, as he always did with this bug report and all the useful insight it have. I'm not saying the solution is the best, but it's completely ignored for no (publicly know reason). I'll unsubscribe now from this bug report, as it really makes me sick =) If you want to contact me, your know where to find me...

That's a fairly random thing to say, particularly now that this bug report is again attracting interest. Walter has expressed a concern about the proposed implementation, which implies he's definitely not ignoring it. If you wanted to release some pent-up frustration accumulated from the past, fine. Otherwise, the reaction is rather irrational. It would be in everybody's interest (and it would definitely make you feel better) if you continued adding value to this discussion and code. Generally, after having accumulated more information on conservative vs. precise collectors, I think D must aim at being as precise as possible - no ifs and buts. So this report and patch must be taken to completion. I'll get back with some more comments. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #98 from Andrei Alexandrescu <andrei metalanguage.com> 2011-04-14
13:45:23 PDT ---
Though I understand its attractiveness, I oppose the state machine approach
because it is hopelessly specific. After taking to completion this nontrivial
task we'll have (a) one DSL to worry about, (b) a significant piece of
infrastructure that only does one thing.

The ideal approach would be to improve introspection. At the end of the day
each type (including array types) has a specific topology, which should be
accessible to the garbage collector and not only. It would be great if the
garbage collector would get the TypeInfo for each chunk of memory and use its
introspection information for navigating pointers.

The work on improving introspection should be done anyway.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #99 from Walter Bright <bugzilla digitalmars.com> 2011-04-14
13:50:21 PDT ---
(In reply to comment #98)
 The work on improving introspection should be done anyway.

The trouble with using runtime introspection for this is it'll be slow, and the scanning of data needs to be really fast. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #100 from Andrei Alexandrescu <andrei metalanguage.com> 2011-04-14
14:03:26 PDT ---
(In reply to comment #99)
 (In reply to comment #98)
 The work on improving introspection should be done anyway.

The trouble with using runtime introspection for this is it'll be slow, and the scanning of data needs to be really fast.

I understand. That still means (and please correct me if I'm wrong) that the right place for specialized code should be at TypeInfo level. Also, I wonder if there's a need for a DSL. Wouldn't automatically-generated Typeinfo methods suffice? -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #101 from Walter Bright <bugzilla digitalmars.com> 2011-04-14
14:46:52 PDT ---
(In reply to comment #96)
 Anyway, this problem is taking care of in the implementation by nfxjfg,

His bitmap implementation consists of two bitmaps, one for all pointers and the other for the subset that are ambiguous. There is one bit for each PTRSIZE bytes. Although that works, the difficulty with it is the potentially large amount of static data this requires: 1. ambiguous pointers are pretty rare, so the second bitmap will be largely 0 2. non-pointer fields still consume space in the bitmaps 3. static arrays are fully fleshed out, so a large static array will have a correspondingly large bitmap 4. large static arrays embedded in a struct with another pointer in them will be fully fleshed out, even if there are no pointers in the array 5. there's no provision for reusing bitmaps for types embedded in aggregate types, everything is repeated This also applies at compile time, where one byte per pointer size for every type is allocated. I think the static size is important - take a look at the .map file of a typical D app, and there's a lot of space consumed by the typeinfo data. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #102 from Walter Bright <bugzilla digitalmars.com> 2011-04-14
14:49:49 PDT ---
(In reply to comment #100)
 (In reply to comment #99)
 (In reply to comment #98)
 The work on improving introspection should be done anyway.

The trouble with using runtime introspection for this is it'll be slow, and the scanning of data needs to be really fast.

right place for specialized code should be at TypeInfo level. Also, I wonder if there's a need for a DSL. Wouldn't automatically-generated Typeinfo methods suffice?

I don't really understand your comment. Are you saying the TypeInfo should do introspection at runtime to generate specialized data tables for the gc (obviously caching it so it is done only once per type)? This is certainly a possible approach. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #103 from Andrei Alexandrescu <andrei metalanguage.com> 2011-04-14
15:08:15 PDT ---
(In reply to comment #102)
 (In reply to comment #100)
 (In reply to comment #99)
 (In reply to comment #98)
 The work on improving introspection should be done anyway.

The trouble with using runtime introspection for this is it'll be slow, and the scanning of data needs to be really fast.

right place for specialized code should be at TypeInfo level. Also, I wonder if there's a need for a DSL. Wouldn't automatically-generated Typeinfo methods suffice?

I don't really understand your comment. Are you saying the TypeInfo should do introspection at runtime to generate specialized data tables for the gc (obviously caching it so it is done only once per type)? This is certainly a possible approach.

I was thinking that the compiler could generate D code that does the scanning instead of us defining a DSL for that. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #104 from Walter Bright <bugzilla digitalmars.com> 2011-04-14
15:47:25 PDT ---
(In reply to comment #103)
 I was thinking that the compiler could generate D code that does the scanning
 instead of us defining a DSL for that.

That's a fascinating idea. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #105 from Andrei Alexandrescu <andrei metalanguage.com> 2011-04-14
16:00:24 PDT ---
(In reply to comment #104)
 (In reply to comment #103)
 I was thinking that the compiler could generate D code that does the scanning
 instead of us defining a DSL for that.

That's a fascinating idea.

I think it's just a simple idea. You do generate code for constructors etc. already... -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #106 from Walter Bright <bugzilla digitalmars.com> 2011-04-14
16:25:11 PDT ---
(In reply to comment #105)
 I think it's just a simple idea. You do generate code for constructors etc.
 already...

The main challenge would be finding the fastest interface for it with the gc. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


Vladimir <thecybershadow gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |thecybershadow gmail.com


--- Comment #107 from Vladimir <thecybershadow gmail.com> 2011-04-14 16:36:48
PDT ---
Am I the only one who is concerned with the performance implications of
complicating the garbage collector any further, especially considering that
precise heap scanning will not completely solve any problems?

 As far as the (small) drop in mean performance, I say "who cares?"

*I* care. I already have solutions for false pointer problems. If I'm going to write an action video game in D, I sure as hell don't want the GC taking its sweet time while the player gets fragged due to "screen lag". -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #108 from David Simcha <dsimcha yahoo.com> 2011-04-14 17:08:48 PDT
---
(In reply to comment #107)
 Am I the only one who is concerned with the performance implications of
 complicating the garbage collector any further, especially considering that
 precise heap scanning will not completely solve any problems?
 
 As far as the (small) drop in mean performance, I say "who cares?"

*I* care. I already have solutions for false pointer problems. If I'm going to write an action video game in D, I sure as hell don't want the GC taking its sweet time while the player gets fragged due to "screen lag".

That's why heap allocations in real-time code are a bad idea. This patch won't change that. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #109 from Vladimir <thecybershadow gmail.com> 2011-04-14 17:11:02
PDT ---
(In reply to comment #108)
 That's why heap allocations in real-time code are a bad idea.  This patch won't
 change that.

Um, no, the GC is currently fast enough to scan a small heap within the timeframe of a video frame. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #110 from bearophile_hugs eml.cc 2011-04-14 17:36:47 PDT ---
(In reply to comment #109)
 (In reply to comment #108)
 That's why heap allocations in real-time code are a bad idea.  This patch won't
 change that.

Um, no, the GC is currently fast enough to scan a small heap within the timeframe of a video frame.

Do you have a proof of this? Because currently the GC gets called when you allocate heap memory. And allocating heap memory (for objects, structs, dynamic arrays, closures, array concatenations, etc) between two frames of a fast video game is probably not a wise thing to do (today) in D. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #111 from Vladimir <thecybershadow gmail.com> 2011-04-14 17:44:29
PDT ---
(In reply to comment #110)
 Because currently the GC gets called when you allocate heap memory. 

Thanks for teaching me how garbage collectors work. I had no idea: http://thecybershadow.net/d/Memory_Management_in_the_D_Programming_Language.pdf
 probably not a wise thing to do

Empty words. Yes, I ran tests and benchmarks. The D GC is okay as it is now. There are efforts to make it faster, and I welcome them by all means. However, precise heap scanning will not solve any of my problems. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #112 from Walter Bright <bugzilla digitalmars.com> 2011-04-14
17:48:22 PDT ---
(In reply to comment #110)
 And allocating heap memory (for objects, structs, dynamic
 arrays, closures, array concatenations, etc) between two frames of a fast video
 game is probably not a wise thing to do (today) in D.

Anything with hard realtime requirements cannot do allocation - even in C/C++, malloc() does not have an upper limit on its time. What is done is to pre-allocate everything necessary before entering the hard realtime section. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #113 from Vladimir <thecybershadow gmail.com> 2011-04-14 18:00:14
PDT ---
(In reply to comment #112)
 Anything with hard realtime requirements cannot do allocation - even in C/C++,
 malloc() does not have an upper limit on its time.
 
 What is done is to pre-allocate everything necessary before entering the hard
 realtime section.

This does not apply to the discussion. Video games do not have hard-realtime requirements. Garbage collection speed dictates how much work must the programmer do to offset the GC's slowness. If the GC is fast enough that it does not cause performance issues on systems satisfying the game's minimum system requirements, the programmer doesn't need to care about it at all. Otherwise, they must do some work (the slower the GC, the more work) to reduce the size of the heap, or the number of allocations, or ultimately abandon heap allocation entirely. This might be easy in simple video games, but becomes increasingly painful in very large projects, with non-trivial user interfaces etc. So, by making the GC slower, you are creating more work for me. I am not happy about it. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #114 from David Simcha <dsimcha yahoo.com> 2011-04-14 18:23:13 PDT
---
(In reply to comment #113)
 (In reply to comment #112)
 Anything with hard realtime requirements cannot do allocation - even in C/C++,
 malloc() does not have an upper limit on its time.
 
 What is done is to pre-allocate everything necessary before entering the hard
 realtime section.

This does not apply to the discussion. Video games do not have hard-realtime requirements. Garbage collection speed dictates how much work must the programmer do to offset the GC's slowness. If the GC is fast enough that it does not cause performance issues on systems satisfying the game's minimum system requirements, the programmer doesn't need to care about it at all. Otherwise, they must do some work (the slower the GC, the more work) to reduce the size of the heap, or the number of allocations, or ultimately abandon heap allocation entirely. This might be easy in simple video games, but becomes increasingly painful in very large projects, with non-trivial user interfaces etc. So, by making the GC slower, you are creating more work for me. I am not happy about it.

How about this: I can virtually guarantee that any slowness caused by precise heap scanning is more than offset by my recent patches (in Git but not in any DMD release yet). See https://github.com/dsimcha/druntime/wiki/Druntime-GC-Optimization-Fork . -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #115 from Vladimir <thecybershadow gmail.com> 2011-04-14 18:29:21
PDT ---
(In reply to comment #114)
 How about this:  I can virtually guarantee that any slowness caused by precise
 heap scanning is more than offset by my recent patches (in Git but not in any
 DMD release yet).  See
 https://github.com/dsimcha/druntime/wiki/Druntime-GC-Optimization-Fork .

Yes, I saw that (I believe I even complimented you on your work on one occasion). Frankly I'm skeptical regarding that guarantee until I'll be able to run some benchmarks, especially in my use cases (I store large objects outside the heap, see https://github.com/CyberShadow/data.d). But even if it holds, I would like to ask to make it easy to turn off the generation of any additional information these GC changes require in the compiler (such as a #define option), so I can easily switch to a custom GC optimized for speed and nothing else. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #116 from Rob Jacques <sandford jhu.edu> 2011-04-14 19:00:38 PDT ---
(In reply to comment #113)
 (In reply to comment #112)
 Anything with hard realtime requirements cannot do allocation - even in C/C++,
 malloc() does not have an upper limit on its time.
 
 What is done is to pre-allocate everything necessary before entering the hard
 realtime section.

This does not apply to the discussion. Video games do not have hard-realtime requirements.

Yes, they do. It's called the frame rate. (Though I'd guess to be technical, this a soft-realtime requirement.)
 Garbage collection speed dictates how much work must the programmer do to
 offset the GC's slowness. If the GC is fast enough that it does not cause
 performance issues on systems satisfying the game's minimum system
 requirements, the programmer doesn't need to care about it at all. Otherwise,
 they must do some work (the slower the GC, the more work) to reduce the size of
 the heap, or the number of allocations, or ultimately abandon heap allocation
 entirely. This might be easy in simple video games, but becomes increasingly
 painful in very large projects, with non-trivial user interfaces etc.
 
 So, by making the GC slower, you are creating more work for me. I am not happy
 about it.

Why do you think the GC will get slower? One of the major points of a precise GC is that it does less total work than a conservative GC. Why? Because following a pointer, even if it goes nowhere, is an expensive operation (I'd guess O(log Heap_Size)) to say nothing of the cost of keeping around and tracing the excess garbage. And the more precision the GC has, the less pointers it has to trace per unit memory and the total memory traced is reduced as well. So it's win-win. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #117 from Vladimir <thecybershadow gmail.com> 2011-04-14 19:06:29
PDT ---
(In reply to comment #116)
 Yes, they do. It's called the frame rate. (Though I'd guess to be technical, 
 this a soft-realtime requirement.)

That's exactly what I meant. Hard realtime means that missing a deadline is equivalent to a complete system failure. Heap allocation is indeed not an option in that case.
 Why do you think the GC will get slower? One of the major points of a precise
 GC is that it does less total work than a conservative GC. Why? Because
 following a pointer, even if it goes nowhere, is an expensive operation (I'd
 guess O(log Heap_Size)) to say nothing of the cost of keeping around and
 tracing the excess garbage. And the more precision the GC has, the less
 pointers it has to trace per unit memory and the total memory traced is reduced
 as well. So it's win-win.

I hope it is as you say it is, but without benchmarks it's hard to say anything, and this talk of state machines etc. is disconcerting. Note that the current GC does a quick check for each possible pointer if it's between the low and high address range of all GC pages - for small heaps, this weeds out most false pointers. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #118 from Walter Bright <bugzilla digitalmars.com> 2011-04-14
19:34:40 PDT ---
(In reply to comment #117)
 I hope it is as you say it is, but without benchmarks it's hard to say
 anything, and this talk of state machines etc. is disconcerting.

Why? Also, even if the compiler emits the tables necessary for more precise gc, the gc implementation can ignore them and do it the old way. Emitting the tables makes it possible for people to experiment with various kinds of gc strategies.
 Note that the
 current GC does a quick check for each possible pointer if it's between the low
 and high address range of all GC pages - for small heaps, this weeds out most
 false pointers.

True, and it works tolerably well. To do a moving gc, however, you need more precise information. BTW, I just want to reiterate that a "hard" realtime constraint means your program fails if it doesn't meet it. A "soft" constraint means the program degrades in some way, but keeps functioning. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #119 from David Simcha <dsimcha yahoo.com> 2011-04-14 19:47:44 PDT
---
(In reply to comment #118)
 Also, even if the compiler emits the tables necessary for more precise gc, the
 gc implementation can ignore them and do it the old way. Emitting the tables
 makes it possible for people to experiment with various kinds of gc strategies.

I still think that, instead of hard coding this logic into the compiler, we should make TypeInfo templated. (See DIP 8 from ages ago, http://www.wikiservice.at/d/wiki.cgi?LanguageDevel/DIPs/DIP8). This would allow different runtime implementations to customize how the pointer offset info works without modifying the compiler. It would also make typeinfo extensible, which is generally a good thing. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #120 from Vladimir <thecybershadow gmail.com> 2011-04-14 19:50:08
PDT ---
(In reply to comment #118)
 I hope it is as you say it is, but without benchmarks it's hard to say
 anything, and this talk of state machines etc. is disconcerting.

Why?

It "sounds" slower. This is subjective and unscientific. That's why I said only a benchmark will show the real results.
 Also, even if the compiler emits the tables necessary for more precise gc, the
 gc implementation can ignore them and do it the old way. Emitting the tables
 makes it possible for people to experiment with various kinds of gc strategies.

I would like to avoid bloating executables any further, and giving reverse engineers more data about my code. (I know executable size doesn't affect performance, at least in theory, but it does matter and can't be completely neglected either.)
 True, and it works tolerably well. To do a moving gc, however, you need more
 precise information.

I don't want a moving GC. I want a fast GC. ("I" in this context means D users with the same requirements, mainly video game developers.) I understand the advantages of a moving GC - heap compaction allowing for an overall smaller managed heap etc., but I hope you understand that sacrificing speed for these goals is not an unilateral improvement for everyone. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #121 from David Simcha <dsimcha yahoo.com> 2011-04-14 19:59:28 PDT
---
(In reply to comment #120)
 I understand the advantages of a moving GC - heap compaction allowing for an
 overall smaller managed heap etc., but I hope you understand that sacrificing
 speed for these goals is not an unilateral improvement for everyone.

Since we don't have any hard benchmarks conveniently available, let's assume for the sake of argument that a precise/moving/etc. GC would be slower. Your case is a niche case and calls for a niche garbage collector implementation. D's GC is designed (IIRC) to allow selecting an implementation at link time. Eventually, someone could write a decent low-latency GC optimized for small heaps for game programmers. In the mean time, we could fork the current one. The stock GC, though, should handle the common cases, not a few game programmers' corner case. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


Leandro Lucarella <llucax gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |llucax gmail.com


--- Comment #122 from Leandro Lucarella <llucax gmail.com> 2011-04-14 20:06:11
PDT ---
(In reply to comment #121)
 (In reply to comment #120)
 I understand the advantages of a moving GC - heap compaction allowing for an
 overall smaller managed heap etc., but I hope you understand that sacrificing
 speed for these goals is not an unilateral improvement for everyone.

Since we don't have any hard benchmarks conveniently available, let's assume for the sake of argument that a precise/moving/etc. GC would be slower. Your case is a niche case and calls for a niche garbage collector implementation. D's GC is designed (IIRC) to allow selecting an implementation at link time. Eventually, someone could write a decent low-latency GC optimized for small heaps for game programmers. In the mean time, we could fork the current one. The stock GC, though, should handle the common cases, not a few game programmers' corner case.

You mean like this one (except it's not optimized for small heaps, just for small pauses and works)? http://llucax.com.ar/blog/blog/post/-1a4bdfba PS: Yeah, for some reason I still get the e-mails even when I removed myself from te Cc =/ -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #123 from Vladimir <thecybershadow gmail.com> 2011-04-14 20:09:13
PDT ---
(In reply to comment #121)
 Your case is a niche case and calls for a niche garbage collector
implementation. 

I would like to ask you to reconsider that opinion. Please take into consideration the size of the video game industry, including that for consoles and mobile devices. Also, keep in mind that a good amount of hype regarding D was generated by video games written in it - starting with Kenta Cho's old OpenGL ones, as well as that commercial D strategy game I can't find at the moment. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #124 from Walter Bright <bugzilla digitalmars.com> 2011-04-14
20:26:14 PDT ---
(In reply to comment #122)
 PS: Yeah, for some reason I still get the e-mails even when I removed myself
 from te Cc =/

Just when I thought I was out... they pull me back in! -- Michael Coreclone, The D Father Part III -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #125 from Walter Bright <bugzilla digitalmars.com> 2011-04-14
20:33:11 PDT ---
(In reply to comment #120)
 I understand the advantages of a moving GC - heap compaction allowing for an
 overall smaller managed heap etc., but I hope you understand that sacrificing
 speed for these goals is not an unilateral improvement for everyone.

In a gc, speed really is the paramount consideration. (The problem with excessive memory consumption is speed, after all.) Unfortunately, the speed of any gc strategy cannot be determined in advance. One has to try it, profile it, and tune it to see if it is an overall improvement. The theoretical speed improvement of more precise gc comes from: 1. the collector doing less work because it knows where the actual pointers are rather than having to scan for them 2. not following and scanning memory that is falsely pointed to by an integer 3. reducing the working set of memory, meaning less work for scanning The theoretical speed decrease comes from: 1. more work to read the pointer tables 2. possibly a large hit from using a virtual function How this will play out will require actually trying it. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 14 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


Jacob Carlborg <doob me.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |doob me.com


--- Comment #126 from Jacob Carlborg <doob me.com> 2011-04-15 00:27:53 PDT ---
Why not just add an additional garbage collector with this new implementation
and leave the old one as it is and then developers can choose which one to use
at link time.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Apr 15 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463


Sean Cavanaugh <WorksOnMyMachine gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |WorksOnMyMachine gmail.com


--- Comment #127 from Sean Cavanaugh <WorksOnMyMachine gmail.com> 2011-04-15
01:40:35 PDT ---
(In reply to comment #120)
 (In reply to comment #118)
 True, and it works tolerably well. To do a moving gc, however, you need more
 precise information.

I don't want a moving GC. I want a fast GC. ("I" in this context means D users with the same requirements, mainly video game developers.) I understand the advantages of a moving GC - heap compaction allowing for an overall smaller managed heap etc., but I hope you understand that sacrificing speed for these goals is not an unilateral improvement for everyone.

I am a game developer, and this thread is fairly fascinating to me, as memory management and good support for Intel SSE2(and AVX) or PowerPC VMX are two of the biggest issues to me when considering alternative languages or the question of 'will this language be suitable in the future'. The SSE problem seems workable with extern C'd C++ DLLs code to handle the heavy math, which leaves the GC as a big 'what does this mean' when evaluating the landscape. The reality is a lot of game engines allocate a surprising amount of memory at run time. The speed of malloc itself is rarely an issue as most searches take reasonably similar amount of time. The real problems with heavy use of malloc are thread lock contention in the allocator, and fragmentation. Fragmentation causes two problems: large allocation failures when memory is low (say 1 MB allocation when 30 MB is 'free'), and virtual pages are unable to be reclaimed due to a stray allocation or two within the page. Lock contention is solved by making separate heaps. Fragmentation is fought also fought by separating the heaps, but organizing the allocations coherently either time-wise or by allocation type where like sized objects pooled into a special pool for objects of that size. As a bonus fixed size object pools have const time for allocation, except when the pool has to grow, but we try real hard to pre-size these to the worst case values. On my last project we had about 8 dlmalloc based heaps and 15 fixed sized allocator pools, to solve these problems. I would greatly prefer a GC to compact the heap to keep the peak memory down, because in embeded (console) environments memory is a constant but time is fungible. VM might be available on the environments, but it isn't going to be backed by disk. Instead the idea of the VM is that it is a tool to fight fragmentation of the underlying physical pages, and to help you get contiguous space to work with. There is also pressure to use larger (64k, 1MB, 4MB pages) pages to keep the TLB lookups fast, which hurts even more with fragmentation. Tiny allocations holding onto these big pages prevents them from being reclaimed, which makes getting those allocations moved somewhere better pretty important. Now the good news is a huge amount of resources in a game do not need to be allocated into a garbage collected space. For the most part anything you send to the GPU data is far better off being written into its memory system and left alone. Physics data and Audio data have similar behaviors for the most part and can be allocated through malloc or aligned forms of malloc (for SSE friendlies). So from a game's developers point of I need to know when the GC will run either by configuration or by manually driving it. Both allow me to run a frame with most of the AI and physics disabled to give more of the time to the collector. A panic execution GC pass that I wasn't expecting is acceptable, provided I get notified of it, as I would expect this to be an indicator memory is getting tight to the point an Out of Memory is imminent. A panic GC is a QA problem as we can tell them where and how often the are occurring and they can in turn tell the designers making the art/levels that they need to start trimming the memory usage a bit. Ideally the GC would be able to run in less time than a single frame (say 10-15ms for a 30fps game). Taking away some amount of time every frame is also acceptable. For example spending 1ms of every frame to do 1ms worth of data movement or analysis for compacting would be a reasonable thing to allow, even if it was in addition to the multi-millisecond spikes at some time interval (30 frames, 30 seconds whatever). Making the whole thing friendly to having lots of CPU cores wouldn't hurt either. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 15 2011
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3463



--- Comment #128 from Vladimir <thecybershadow gmail.com> 2011-04-15 02:14:17
PDT ---
(In reply to comment #127)

Thank you for your insight!

 So from a game's developers point of I need to know when the GC will run either
 by configuration or by manually driving it.  

You can disable automatic garbage collection and manually invoke a collection right now.
 Both allow me to run a frame with 
 most of the AI and physics disabled to give more of the time to the collector. 

This won't work for multiplayer games where the game state must be kept in sync on all sides.
 Ideally the GC would be able to run in less time than a single frame (say
 10-15ms for a 30fps game).

Moving GCs are bound to be slower than the current one, but heap compaction probably doesn't need to happen as often as a simple GC run to reclaim memory.
 Taking away some amount of time every frame is also acceptable. 
 For example spending 1ms of every frame to do 1ms worth of data
 movement or analysis for compacting would be a reasonable thing to allow, 

The current GC doesn't support incremental runs. Jeremie Pelletier has written a garbage collector some time ago which can do a shallow scan and only collect objects with no immediate references: http://pastebin.com/f7a3b4c4a -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 15 2011