digitalmars.D - StackThreads and Phobos

mclysenk mtu.edu (67/67) Jun 30 2006 Hello fellow D users! I've been working diligently on improving StackTh...

Bruno Medeiros (10/34) Jul 02 2006 But isn't that the correct behavior? If one can return to that function

mclysenk mtu.edu (39/45) Jul 02 2006 It is not correct behavior. Here's an example which may illustrate this...

Bruno Medeiros (8/68) Jul 03 2006 Unless my understanding of this is wrong, the use of the word "never"

mclysenk mtu.edu (22/26) Jul 03 2006 Well, you are correct, that it may still get collected in the future -

Bruno Medeiros (10/44) Jul 03 2006 That was clear now, but for what I understand, then your problem is

mclysenk mtu.edu (3/9) Jul 03 2006 Correct. This problem will affect everyone who uses addRange/removeRang...

mclysenk mtu.edu writes:

Hello fellow D users!  I've been working diligently on improving StackThreads
(http://assertfalse.com), and so far D has worked wonderfully.  The same can not
be said for Phobos...  (This may get a bit long winded.)

---

1.  removeRange is really slow.

In order to prevent the garbage collector from accidentally deleting valid
objects, we need to mark the stack of each user-context.

One naive solution is to simply mark the entire stack region, however this isn't
a very good idea.

Say I create a user context which calls a function and then yields.  If that
function performed any allocations, then they would remain on the stack, and get
scanned by the GC.  Until I overwrite those values, the GC will always see them,
and therefore never collect their memory. 

This results in memory leaks, and a dramatic loss of performance.  The solution
is to dynamically resize the range used by the stack.  When we switch into a
context, we remove the range, and before we leave, we add it back.  This nicely
solves the problem, except for one thing - removeRange requires O(n) time.

This performance penalty applies to each Context, increasing the cost of
switching dramatically.

A simple solution is to use a hash set or another efficient data structure.
This would also be able to detect range address conflicts, such as adding a
range twice.

As a short term fix, I propose changing line 987 of internal/gcx.d from:

memmove(ranges + i, ranges + i + 1, (nranges - i) * ranges[0].sizeof);

to:

memmove(ranges + i, ranges + nranges, ranges[0].sizeof);

While this does not remove the sluggish linear search, it does eliminate a very
large memory copy.  The cost of this approach is that the ordering of ranges is
not preserved, which may increase look up time in some circumstances.

Another possibility is to allow for 'dynamic' ranges, which the user would be
able to resize.  In this situation, we could still use a linear search for
removal, however changing the range's dimensions is a constant time operation.
Thread safety could be insured via an atomic operation, ie CMPXCHG.


2.  It is impossible to safely handle a page fault.

Another problem with StackThreads right now is stack growth.  In a regular
context, the operating system automatically grows the stack once the program
hits the bottom.  It keeps on doing this until there is no more address
space/memory left, and then it throws an overflow.

In StackThreads, I'd like to do the same thing.  However, DMD transforms any
page faults into exceptions, which it then unwinds up the call stack.  This
makes it impossible to trap, since the exception handler will unwind the stack
until it hits a try block with a matching catch - and only then will it get
handled.  From this state, there is no way to recover the program, since we
cannot undo the unwind.  The only tenable option is a global fault and
termination of the context.

In windows, you could try to install a custom exception handler, and trap any
stack overflows yourself - except DMD automatically installs an SEH for each
try..catch block.  It is pretty much inevitable that a hapless user will eat
your page fault when he is trying to catch some other sort of exception.

Call backs are one solution, user programs can carefully handle page faults and
stack overflows without disturbing the state of the program.


---

There are a few more issues, but these are the only two I have encountered that
I couldn't hack around.

My personal view is that integrated user-context switching would fix most of
those issues, and provide a great deal of flexibility.  Once the GC issues are
resolved, it can be made very fast.  In fact, the only overhead (outside GC
management) is equivalent to a single function call.

Standard user level contexts make all sorts of things possible, like coroutines
or micro threads.  They can be used to iterate over complex data structures
trivially.  They are much simpler and faster than threads for GUIs, and
eliminate the need for many state machine objects.

In conjunction with standard preemptive threading, user-contexts provide an
elegant solution to many practical programming problems.  Standard library
integration will result in a much faster context switching and better future
support.


-Mikola Lysenko

Jun 30 2006

Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:

mclysenk mtu.edu wrote:
 Hello fellow D users!  I've been working diligently on improving StackThreads
 (http://assertfalse.com), and so far D has worked wonderfully.  The same can
not
 be said for Phobos...  (This may get a bit long winded.)
 
 ---
 
 1.  removeRange is really slow.
 
 In order to prevent the garbage collector from accidentally deleting valid
 objects, we need to mark the stack of each user-context.
 
 One naive solution is to simply mark the entire stack region, however this
isn't
 a very good idea.
 
 Say I create a user context which calls a function and then yields.  If that
 function performed any allocations, then they would remain on the stack, and
get
 scanned by the GC.  Until I overwrite those values, the GC will always see
them,
 and therefore never collect their memory. 
 
 This results in memory leaks, and a dramatic loss of performance.  The solution
 is to dynamically resize the range used by the stack.  When we switch into a
 context, we remove the range, and before we leave, we add it back.  This nicely
 solves the problem, except for one thing - removeRange requires O(n) time.
 

But isn't that the correct behavior? If one can return to that function 
instance(context), shouldn't all allocations made previously by that 
function remain valid?
(note: I haven't used your library, I just read your post and this 
question came up)
Also: Where do you keep the data from each context? In the heap?


-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Jul 02 2006

mclysenk mtu.edu writes:

In article <e88ltq$kvc$3 digitaldaemon.com>, Bruno Medeiros says...
But isn't that the correct behavior? If one can return to that function 
instance(context), shouldn't all allocations made previously by that 
function remain valid?
(note: I haven't used your library, I just read your post and this 
question came up)

It is not correct behavior.  Here's an example which may illustrate this better:

#void func1()





#void func2()




Here is a trace of the execution of func1 alongside it's stack















In this situation, it is doubtful that arr will ever get collected.  In order
for the GC to remove it, it must not find ANY references to arr in memory.
Period.  If the GC scans the entire stack, it will see a reference to arr, and
therefore it will never collect arr.  This is a trivial example, but it is not
difficult to see that these sorts of memory leaks can become very substantial
and very difficult to track down.


The problem is how to tell the garbage collector not to scan past the stack
pointer.  The only way at the moment is to use addRange and removeRange, or zero
out all of the unused stack memory every time you yield.  Either option has a
substantial overhead.

Also: Where do you keep the data from each context? In the heap?

Heap.  The next version of stackthreads will replace malloc/free with system
specific allocators, such as mmap and VirtualAlloc.

-Mikola Lysenko

Jul 02 2006

Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:

mclysenk mtu.edu wrote:
 In article <e88ltq$kvc$3 digitaldaemon.com>, Bruno Medeiros says...
 But isn't that the correct behavior? If one can return to that function 
 instance(context), shouldn't all allocations made previously by that 
 function remain valid?
 (note: I haven't used your library, I just read your post and this 
 question came up)

 
 It is not correct behavior.  Here's an example which may illustrate this
better:
 
 #void func1()





 #void func2()



 
 Here is a trace of the execution of func1 alongside it's stack
 













 
 In this situation, it is doubtful that arr will ever get collected.  In order
 for the GC to remove it, it must not find ANY references to arr in memory.
 Period.  If the GC scans the entire stack, it will see a reference to arr, and
 therefore it will never collect arr.  This is a trivial example, but it is not
 difficult to see that these sorts of memory leaks can become very substantial
 and very difficult to track down.
 
 
 The problem is how to tell the garbage collector not to scan past the stack
 pointer.  The only way at the moment is to use addRange and removeRange, or
zero
 out all of the unused stack memory every time you yield.  Either option has a
 substantial overhead.
 
 Also: Where do you keep the data from each context? In the heap?

 
 Heap.  The next version of stackthreads will replace malloc/free with system
 specific allocators, such as mmap and VirtualAlloc.
 
 -Mikola Lysenko
 
 

Unless my understanding of this is wrong, the use of the word "never" 
(gets collected) is too strong. If during normal program flow the stack 
is overwritten again, the reference becomes free. Or am I missing something?
Also, doesn't this happen with any regular function?


-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Jul 03 2006

mclysenk mtu.edu writes:

In article <e8artd$16e0$1 digitaldaemon.com>, Bruno Medeiros says...
Unless my understanding of this is wrong, the use of the word "never" 
(gets collected) is too strong. If during normal program flow the stack 
is overwritten again, the reference becomes free. Or am I missing something?
Also, doesn't this happen with any regular function?

Well, you are correct, that it may still get collected in the future -
regardless, its prospects are fairly dim.  You are also right in your statement
that this is not abnormal behavior for functions, and the GC knows this.  It has
a special case which handles the stack of a regular function.

When the GC does a full collect, it pauses all threads and marks the range from
the stack pointer to the bottom of the stack.  This prevents it from scanning
stuff which is no longer valid, since that range never gets marked.  This works
very well.

The bad part happens when we create multiple stacks for one thread.  In order
for the GC to properly scan them, we need to mark their stack when it is not in
use.  If the GC didn't scan their stack, then we'd accidentally over-collect
stuff they were using.  If we mark the entire stack range, then we'd
under-collect and get a bunch of memory leaks.  To fix this, we use addRange to
mark the active part of their stack before we switch out their context, and
removeRange to delete their old - and potentially invalid - stack range when we
switch into their context.

This part is tricky, and it burned me many times while I was working on stack
threads before I finally caught on.  It is not a big deal to fix once you know
what's happening.  The main disadvantage at the moment is removeRange - it's
just too slow!  It can be fixed, but it requires changing phobos.  

-Mikola Lysenko

Jul 03 2006

Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:

mclysenk mtu.edu wrote:
 In article <e8artd$16e0$1 digitaldaemon.com>, Bruno Medeiros says...
 Unless my understanding of this is wrong, the use of the word "never" 
 (gets collected) is too strong. If during normal program flow the stack 
 is overwritten again, the reference becomes free. Or am I missing something?
 Also, doesn't this happen with any regular function?

 
 Well, you are correct, that it may still get collected in the future -
 regardless, its prospects are fairly dim.  You are also right in your statement
 that this is not abnormal behavior for functions, and the GC knows this.  It
has
 a special case which handles the stack of a regular function.
 
 When the GC does a full collect, it pauses all threads and marks the range from
 the stack pointer to the bottom of the stack.  This prevents it from scanning
 stuff which is no longer valid, since that range never gets marked.  This works
 very well.
 
 The bad part happens when we create multiple stacks for one thread.  In order
 for the GC to properly scan them, we need to mark their stack when it is not in
 use.  If the GC didn't scan their stack, then we'd accidentally over-collect
 stuff they were using.  If we mark the entire stack range, then we'd
 under-collect and get a bunch of memory leaks.  To fix this, we use addRange to
 mark the active part of their stack before we switch out their context, and
 removeRange to delete their old - and potentially invalid - stack range when we
 switch into their context.
 
 This part is tricky, and it burned me many times while I was working on stack
 threads before I finally caught on.  It is not a big deal to fix once you know
 what's happening.  The main disadvantage at the moment is removeRange - it's
 just too slow!  It can be fixed, but it requires changing phobos.  
 
 -Mikola Lysenko
 
 

That was clear now, but for what I understand, then your problem is 
simply that you must use removeRange(and it is slow).
That is, it doesn't really have to do with function contexts and stacks. 
Anyone who allocates manually managed memory for any purpose, and then 
uses addRange on it, will have the same problem, right? (because they 
too will have to use removeRange)

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Jul 03 2006

mclysenk mtu.edu writes:

In article <e8baer$1ond$1 digitaldaemon.com>, Bruno Medeiros says...
That was clear now, but for what I understand, then your problem is 
simply that you must use removeRange(and it is slow).
That is, it doesn't really have to do with function contexts and stacks. 
Anyone who allocates manually managed memory for any purpose, and then 
uses addRange on it, will have the same problem, right? (because they 
too will have to use removeRange)


Correct.  This problem will affect everyone who uses addRange/removeRange.

-Mikola Lysenko

Jul 03 2006

D Programming

C/C++ Programming

Other

digitalmars.D - StackThreads and Phobos