
digitalmars.D - Memory Management in D: Request for Comment

dsimcha <dsimcha@yahoo.com> writes:
During my occasional forays into the dark side of Python and Java, I am often
amazed at the extent to which memory management in these languages "just
works".  D should be like this for all but the most low-level programming
tasks, and was intended to be.  It seems like most of the other regulars here
have carved out niches that don't involve improving memory management.

My attempts at adding precise heap scanning to the GC got me thinking about
other ways to improve memory management in D.  I love most aspects of D, but
the constant memory management issues make it feel like much less of a
high-level language than it should.  I'm thinking of making this my
niche around here, as I already know more about the problem than I ever wanted
to know and I'm sick of having memory management not work properly.  Here are
some things that I'd like comments on:

1.  The Hans Boehm GC uses a blacklisting scheme whereby it avoids
allocating memory pages that currently have false pointers pointing into them.
 (If a page is not allocated but looks like it has a pointer into it, then we
can assume this is a false pointer.)  If I dip into the GC code again and
implement something like this, we'll be one step closer to making D memory
management "just work" and making false pointers a thing of the past.
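To make the blacklisting idea concrete, here's a rough sketch in C (all names are made up for illustration; the real Boehm implementation is considerably more involved): during marking, any word that points into an unallocated page is treated as a false pointer and that page is blacklisted, and the allocator then refuses to hand out blacklisted pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096
#define NPAGES 64

/* One flag per page: set when a value that looks like a pointer into an
   unallocated page is seen during a scan (i.e. a false pointer). */
static bool blacklisted[NPAGES];

static size_t page_index(uintptr_t addr, uintptr_t heap_base) {
    return (addr - heap_base) / PAGE_SIZE;
}

/* During marking: a word points into page p, but p is not allocated,
   so the word must be a false pointer -- blacklist the page. */
void note_false_pointer(uintptr_t addr, uintptr_t heap_base) {
    blacklisted[page_index(addr, heap_base)] = true;
}

/* The allocator skips blacklisted pages so new objects never land where
   stale integers would pin them. */
int find_free_page(const bool *allocated) {
    for (int i = 0; i < NPAGES; i++)
        if (!allocated[i] && !blacklisted[i])
            return i;
    return -1; /* no safe page available */
}
```

The effect is that false pointers can only pin memory that was already allocated when they appeared; they can never pin future allocations.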

2.  I've mentioned this a few times here before, but I wrote a second
stack-based memory allocator called TempAlloc, which allows stack-based memory
allocation that is not necessarily bound to function calls and automatically
falls back on heap allocation for very large objects, rather than killing your
program with a "stack overflow" error message.  I've also written some
implementations of common data structures (so far arrays, hash tables and
sets; I'll probably add binary trees at some point) that are optimized for it.
 (see http://svn.dsource.org/projects/dstats/docs/alloc.html).
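The core TempAlloc behavior described above (region allocation that isn't tied to function calls, with automatic heap fallback for oversized requests) looks roughly like this C sketch; `temp_alloc`/`in_region` and the fixed region size are hypothetical stand-ins, not the actual dstats API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define REGION_SIZE 1024

/* Bump-allocate from a fixed buffer; fall back to malloc for requests
   that don't fit, instead of overflowing. */
static unsigned char region[REGION_SIZE];
static size_t region_top = 0;

void *temp_alloc(size_t n) {
    n = (n + 15) & ~(size_t)15;          /* keep 16-byte alignment */
    if (region_top + n <= REGION_SIZE) { /* fast path: bump the pointer */
        void *p = &region[region_top];
        region_top += n;
        return p;
    }
    return malloc(n);                    /* too big: fall back to the heap */
}

/* Region allocations are freed LIFO; heap fallbacks need free(). */
int in_region(const void *p) {
    return (const unsigned char *)p >= region &&
           (const unsigned char *)p < region + REGION_SIZE;
}
```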

The biggest problem with TempAlloc is that it is not scanned by the GC,
meaning that you can't store heap-allocated data in it unless you have another
reference somewhere else.  I don't know how to remedy this, partly because the
stacks are thread-local and I don't know how to remove a range from the GC
upon a thread terminating, even if I hack the GC to give it the features it
needs to properly scan TempAlloc.  Advice would be appreciated.

Other than GC scanning, is there anything else you would like to see added to
TempAlloc?  Do you think it's general enough to be included in core.memory, or
is it too niche?

3.  This one is an order of magnitude less likely than the other two to
actually get implemented, at least by me, but how about thread-local
allocators so you can call malloc() without taking a lock?  I vaguely remember
Sean saying he was working on that a while back, but I never heard anything
about it again.  It's probably best to wait for shared to be implemented for
this so that unshared objects can also be collected w/o stopping the world,
but we should start at least discussing this now.


4.  I submitted a patch a while back to allow the GC to ignore interior
pointers for specific objects.
(http://d.puremagic.com/issues/show_bug.cgi?id=2927)  This would be useful if
you have, for example, a large array that never escapes a class and the class
always maintains a pointer to the head of the array as long as it's alive.
This way, when the class dies, the array dies too even if there are false
pointers to its interior.  Few people have commented on this.  Is there any
reason why it's not a good idea?  Yes, it's somewhat unsafe if you're not
careful, but when the alternative is horrible memory leaks, sometimes
unsafeness is a necessary evil.
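The proposed attribute amounts to a per-block rule in the mark phase: for a flagged block, only a pointer to the block's base keeps it alive. A minimal C model of that rule (the struct and names here are hypothetical, not the patch's actual interface):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per-block "no interior pointers" attribute. A pointer only keeps a
   flagged block alive if it points exactly at the block's base. */
typedef struct {
    uintptr_t base;
    uintptr_t size;
    bool no_interior;
} Block;

bool pointer_keeps_alive(uintptr_t p, const Block *b) {
    if (b->no_interior)
        return p == b->base;                       /* head pointer only */
    return p >= b->base && p < b->base + b->size;  /* any interior word */
}
```

Under this rule a random integer that happens to land in the middle of a flagged array can no longer pin it; only the class's head pointer does.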
Nov 02 2009
Jason House <jason.james.house@gmail.com> writes:
Please add weak references! I can email you a druntime compatible
implementation.

Nov 03 2009
dsimcha <dsimcha@yahoo.com> writes:
== Quote from Jason House (jason.james.house gmail.com)'s article
 Please add weak references! I can email you a druntime compatible
implementation.

It would probably be better to submit it as a patch to Bugzilla if it's
reasonably well-tested and working already. On the other hand, I am curious to
see how this was done. Did it involve hacking gcx.d or was there a more
graceful way?
Nov 03 2009
Jason House <jason.james.house@gmail.com> writes:
dsimcha Wrote:

 Probably be better to submit it as a patch to Bugzilla if it's reasonably
 well-tested and working already. On the other hand, I am curious to see how
 this was done. Did it involve hacking gcx.d or was there a more graceful way?

I did hack gcx.d, but only to make the global mutex public. The implementation
is relatively simple. My ticket submission failed when I tried to submit it, so
I emailed Sean directly. That was months ago, and I never heard back. I'll
resubmit a ticket... Theoretically, a weak ref might naturally tie into your
precise scanning changes. Admittedly, it's the dereferencing that is tricky to
get right without hacking gcx.d.
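For readers unfamiliar with the trick: a weak reference to a conservatively scanned heap works by storing the target address in a disguised form the GC won't recognize as a pointer, and only revealing it while holding the collector's mutex so the object can't be freed mid-check. A conceptual C sketch, where `target_live` is a hypothetical stand-in for querying the GC's liveness state and the lock calls are shown as comments:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for asking the collector whether the target is still live;
   a real implementation would consult the GC under its global mutex. */
static int target_live = 1;

typedef struct { uintptr_t hidden; } WeakRef;

/* Store the address bitwise-complemented so a conservative scan does
   not see a valid pointer and the reference does not keep the target
   alive. */
void weak_set(WeakRef *w, void *target) {
    w->hidden = ~(uintptr_t)target;
}

void *weak_get(WeakRef *w) {
    /* gc_lock();  -- must hold the GC's mutex so the object can't go away */
    void *p = target_live ? (void *)~w->hidden : NULL;
    /* gc_unlock(); */
    return p;   /* NULL once the target has been collected */
}
```

This matches the point above: the storage side is simple, but the dereference is the part that needs cooperation from (or hacks to) gcx.d.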
Nov 03 2009
downs <default_357-line@yahoo.de> writes:
I submitted a patch a while back for constant-time removeRange. Can we dredge
that one up and/or implement something similar? It's rather useful for things
like stackthreads that need to add and remove lots of ranges :)

(Search the NG for "Feature req.+Patch: O(1) removeRange")
Nov 03 2009
dsimcha <dsimcha@yahoo.com> writes:
== Quote from downs (default_357-line yahoo.de)'s article
 I submitted a patch a while back for constant-time removeRange. Can we dredge
 that one up and/or implement something similar? It's rather useful for things
 like stackthreads that need to add and remove lots of ranges :)
 (Search the NG for "Feature req.+Patch: O(1) removeRange")

I guess removeRange is currently O(N)? I haven't looked. Anyhow, I think the bigger problem in practice, i.e. until N is unrealistically large, is that removeRange requires taking a lock. O(1) would be nice, though. What does StackThreads (I assume this is equivalent to fibers) do that it requires so many add/remove ranges?
Nov 03 2009
downs <default_357-line@yahoo.de> writes:
dsimcha wrote:
 I guess removeRange is currently O(N)?  I haven't looked.  Anyhow, I think the
 bigger problem in practice, i.e. until N is unrealistically large, is that
 removeRange requires taking a lock.  O(1) would be nice, though.  What does
 StackThreads (I assume this is equivalent to fibers) do that it requires so
 many add/remove ranges?

The virtual stacks I use are being allocated via mmap. Every stack is a range. With every context switch, the relevant ranges need to be modified - the range that is being switched to needs to be removed (because it becomes the "main" stack which has special GC handling), and the stack range that's being switched from needs to be added.
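The per-switch bookkeeping described above can be modeled very simply: every virtual stack is a registered GC range except the one currently running, which gets the "main stack" treatment. A C sketch, with range registration modeled as a boolean per fiber (in the real code these would be the GC's addRange/removeRange calls):

```c
#include <assert.h>
#include <stdbool.h>

#define NFIBERS 4

/* registered[i] == true means fiber i's stack is a GC-scanned range.
   The currently running fiber is NOT a range: its live stack is scanned
   by the GC's normal stack-scanning machinery instead. */
static bool registered[NFIBERS];
static int current = 0;

void switch_fiber(int next) {
    registered[current] = true;   /* old stack becomes a scannable range */
    registered[next] = false;     /* new stack is scanned as the live stack */
    current = next;
}
```

Two range operations per context switch is exactly why the lock inside add/removeRange (and its O(N) cost) hurts here.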
Nov 03 2009
dsimcha <dsimcha@yahoo.com> writes:
== Quote from downs (default_357-line yahoo.de)'s article
 The virtual stacks I use are being allocated via mmap. Every stack is a range.
 With every context switch, the relevant ranges need to be modified - the range
 that is being switched to needs to be removed (because it becomes the "main"
 stack which has special GC handling), and the stack range that's being
 switched from needs to be added.

IDK, that sounds a bit expensive even with O(1) removeRange.  Are you using
some home-grown impl. of StackThreads/fibers, Tango's, or D2 druntime's?  If
it's Tango's or druntime's, I haven't looked, but I would think the GC
understands stuff about them.  If it's your own, have you considered switching
and maybe submitting whatever extra features your impl. has to Tango and/or
druntime?
Nov 03 2009
downs <default_357-line@yahoo.de> writes:
dsimcha wrote:
 IDK, that sounds a bit expensive even with O(1) removeRange.  Are you using
 some home-grown impl. of StackThreads/fibers, Tango's, or D2 druntime's?  If
 it's Tango's or druntime's, I haven't looked, but I would think the GC
 understands stuff about them.  If it's your own, have you considered
 switching and maybe submitting whatever extra features your impl. has to
 Tango and/or druntime?

Well .... D1 Phobos :) I have some philosophical disagreements with the Tango project, much as I admit their technical superiority, and D2 is still too beta. But at least I don't use tools.stackthreads that much these days, so it's fine :)
Nov 04 2009
Leandro Lucarella <llucax@gmail.com> writes:
dsimcha, on November 3 at 05:46, you wrote to me:
 1.  In the Hans Boehm GC, they use a blacklisting scheme whereby they avoid
 allocating memory pages that currently have false pointers pointing into them.
  (If a page is not allocated but looks like it has a pointer into it, then we
 can assume this is a false pointer.)  If I dip into the GC code again and
 implement something like this, we'll be one step closer to making D memory
 management "just work" and making false pointers a thing of the past.

The Boehm GC has a lot to learn from. I'm planning to add concurrent collection
in a slightly different way than the Boehm GC, which has at least a couple of
ways of doing so. I know it might look like vaporware at this point, but I have
to do it eventually; I need to finish it to finally get my degree ;). I've just
been a little busy lately.
 3.  This one is an order of magnitude less likely than the other two to
 actually get implemented, at least by me, but how about thread-local
 allocators so you can call malloc() without taking a lock?  I vaguely remember
 Sean saying he was working on that a while back, but I never heard anything
 about it again.  It's probably best to wait for shared to be implemented for
 this so that unshared objects can also be collected w/o stopping the world,
 but we should start at least discussing this now.

I think avoiding the lock might be possible using atomic ops. That should be
easier than using thread-local storage, but I'm not entirely sure it's really
feasible.
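The simplest version of the atomic-ops approach is a shared bump allocator where each thread claims space with a single compare-and-swap instead of taking a mutex. A C sketch (a real allocator would of course need free lists, size classes, etc.; this only shows the lock-free claim):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define ARENA_SIZE 4096

static unsigned char arena[ARENA_SIZE];
static _Atomic size_t arena_top = 0;

/* Claim n bytes from the shared arena without any lock: one CAS on the
   bump offset, retried if another thread won the race. */
void *cas_alloc(size_t n) {
    size_t old = atomic_load(&arena_top);
    for (;;) {
        if (old + n > ARENA_SIZE)
            return NULL;  /* arena exhausted */
        /* Try to claim [old, old+n); on failure, old is reloaded and we
           retry with the other thread's updated offset. */
        if (atomic_compare_exchange_weak(&arena_top, &old, old + n))
            return &arena[old];
    }
}
```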
 4.  I submitted a patch a while back to allow the GC to ignore interior
 pointers for specific objects.
 (http://d.puremagic.com/issues/show_bug.cgi?id=2927)  This would be useful if
 you have, for example, a large array that never escapes a class and the class
 always maintains a pointer to the head of the array as long as it's alive.
 This way, when the class dies, the array dies too even if there are false
 pointers to its interior.  Few people have commented on this.  Is there any
 reason why it's not a good idea?  Yes, it's somewhat unsafe if you're not
 careful, but when the alternative is horrible memory leaks, sometimes
 unsafeness is a necessary evil.

I think I commented on it, and I think it's a good idea to have. It's no more
unsafe than having NO_SCAN or casting a pointer to a size_t.

--
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
Nov 03 2009
BCS <none@anon.com> writes:
Hello dsimcha,

 3.  This one is an order of magnitude less likely than the other two
 to actually get implemented, at least by me, but how about
 thread-local allocators so you can call malloc() without taking a
 lock?  I vaguely remember Sean saying he was working on that a while
 back, but I never heard anything about it again.  It's probably best
 to wait for shared to be implemented for this so that unshared objects
 can also be collected w/o stopping the world, but we should start at
 least discussing this now.

I think there are general-purpose malloc implementations that are lock-free. On
the same note, I've been toying with an idea for a memory heap structure that
would be totally lock-free (think CAS) but has some downsides: it won't handle
a non-contiguous memory space, and expanding the memory could get very
expensive. Both of these could be non-issues in some cases; short-lived
job-specific heaps, memory-mapped (shared?) data(base) files,
resource-limited systems, etc. I've been meaning to get around to implementing
it but never had time. I'd be willing to collaborate or, if even that doesn't
get me moving, just hand over the whole project in exchange for a "coauthor"
line in the comments.
Nov 04 2009
dsimcha <dsimcha@yahoo.com> writes:
== Quote from BCS (none anon.com)'s article
 Hello dsimcha,

 I've been toying with an idea for a memory heap structure that would be
 totally lock-free (think CAS) but has some downsides: it won't handle a
 non-contiguous memory space, and expanding the memory could get very
 expensive.

But x86 atomic ops, while not as slow as locks, are pretty slow. Some benchmarks I did a while back indicated that, on Windows, atomic increment is only about twice as fast as an uncontested synchronized { var++; }. I think using thread-local storage and having each thread-local heap interact with the global heap only occasionally, for very large transactions, is a much better bet than using atomic CAS. I'm also a little bit confused about what you're suggesting, specifically about the non-continuous memory space. Does this mean that things would have to be freed LIFO? If so, what use cases would you be covering that TempAlloc doesn't already cover? If not, what does it mean?
Nov 04 2009
BCS <none@anon.com> writes:
Hello dsimcha,

 But x86 atomic ops, while not as slow as locks, are pretty slow.  Some
 benchmarks I did a while back indicated that, on Windows, atomic increment is
 only about twice as fast as an uncontested synchronized { var++; }.

IIRC a mutex/lock has to do a CAS at some point, so if you can get a function
to work with a single CAS, an uncontested action will at worst be as good as an
uncontested mutex/lock. Once contention starts happening, it depends on how
much your retry costs. Another thing to take into account is the overhead vs.
granularity tradeoff you get with locks, which can sometimes be skipped with
direct use of CAS.
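The point is easy to see in code: even the simplest lock's acquire path is itself a CAS, so a single-CAS operation can't do worse in the uncontended case. A C sketch contrasting the two:

```c
#include <assert.h>
#include <stdatomic.h>

/* Minimal spinlock: note that acquiring it costs one CAS even when
   nobody else holds it, plus the release store afterwards. */
static _Atomic int lockword = 0;
static int counter = 0;

void locked_increment(void) {
    int expected = 0;
    while (!atomic_compare_exchange_weak(&lockword, &expected, 1))
        expected = 0;            /* spin until we win the CAS */
    counter++;                   /* critical section */
    atomic_store(&lockword, 0);  /* release */
}

/* Lock-free version: the whole operation IS the one atomic op. */
static _Atomic int atomic_counter = 0;

void lockfree_increment(void) {
    atomic_fetch_add(&atomic_counter, 1);
}
```

Under contention the comparison changes: the lock-free version retries the cheap atomic op, while the lock version spins (or blocks) and then still pays for the critical section serially.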
 I think using thread-local storage and having each
 thread-local heap interact with the global heap only occasionally, for
 very large transactions, is a much better bet than using atomic CAS.

I got started thinking about this when I came across a comment that some particular malloc didn't have issues with freeing memory from a different thread than it was malloced from.
 
 I'm also a little bit confused about what you're suggesting,
 specifically about the non-continuous memory space.  Does this mean
 that things would have to be freed LIFO?  If so, what use cases would
 you be covering that TempAlloc doesn't already cover?  If not, what
 does it mean?

IIRC, when the current GC runs out of space, it uses mmap (on Linux) to get
more address space mapped to something. This wouldn't work for my idea because
you can't count on mmap returning a block of RAM at any given location; it is
free to leave gaping holes in the address space, and I wouldn't be able to
handle that.
Nov 04 2009
Sean Kelly <sean@invisibleduck.org> writes:
dsimcha Wrote:
 
 3.  This one is an order of magnitude less likely than the other two to
 actually get implemented, at least by me, but how about thread-local
 allocators so you can call malloc() without taking a lock?  I vaguely remember
 Sean saying he was working on that a while back, but I never heard anything
 about it again.  It's probably best to wait for shared to be implemented for
 this so that unshared objects can also be collected w/o stopping the world,
 but we should start at least discussing this now.

I actually started working on this a while back and then set it aside to take care of some other things. I'm planning to revisit it once the concurrency model is reasonably sorted out, since such a GC is pretty much useless until then anyway.
Nov 04 2009
dsimcha <dsimcha@yahoo.com> writes:
== Quote from Sean Kelly (sean invisibleduck.org)'s article
 I actually started working on this a while back and then set it aside to take
 care of some other things. I'm planning to revisit it once the concurrency
 model is reasonably sorted out, since such a GC is pretty much useless until
 then anyway.

Does this mean that this has officially, for all practical purposes, been put
off until D3?
Nov 04 2009
Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
dsimcha wrote:
 Does this mean that this has officially, for all practical purposes, been put
 off until D3?

Performance improvements are fortunately not tied to the language spec.

Andrei
Nov 04 2009
dsimcha <dsimcha@yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Performance improvements are fortunately not tied to the language spec.

 Andrei
But the concurrency model is.
Nov 04 2009
Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> writes:
dsimcha wrote:
 But the concurrency model is.

Sorry, I misunderstood a bit. Anyway, the process is a continuum, so you
shouldn't expect that the concurrency model will be available only on the day
of D2's release. Also, there will be many releases of D2 following the "gold"
release. It's not like once we deploy D2 we just start working on D3.

Andrei
Nov 04 2009