www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.announce - DConf 2013 Day 1 Talk 6: Concurrent Garbage Collection for D by Leandro

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On reddit:

http://www.reddit.com/r/programming/comments/1eovfu/dconf_2013_day_1_talk_6_concurrent_garbage/


Enjoy! Discuss!! Vote!!!

Andrei
May 20 2013
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 http://www.reddit.com/r/programming/comments/1eovfu/dconf_2013_day_1_talk_6_concurrent_garbage/
This link on the YouTube page seems to not be online: http://dconf.org/2013/talks/lucarella.pdf Bye, bearophile
May 20 2013
prev sibling next sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Mon, 20 May 2013 13:50:25 +0100, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On reddit:
 http://www.reddit.com/r/programming/comments/1eovfu/dconf_2013_day_1_talk_6_concurrent_garbage/
This may be the Windows Copy On Write feature mentioned in the Q&A at the end: http://support.microsoft.com/kb/103858 .. but it's not clear to me how useful this is for fork emulation or similar. R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
May 20 2013
next sibling parent reply "Diggory" <diggsey googlemail.com> writes:
On Monday, 20 May 2013 at 13:55:05 UTC, Regan Heath wrote:
 On Mon, 20 May 2013 13:50:25 +0100, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:

 On reddit:
 http://www.reddit.com/r/programming/comments/1eovfu/dconf_2013_day_1_talk_6_concurrent_garbage/
This may be the Windows Copy On Write feature mentioned in the Q&A at the end: http://support.microsoft.com/kb/103858 .. but it's not clear to me how useful this is for fork emulation or similar. R
Fork isn't needed at all really in the technique described, this is all that's needed: - Map a copy of the memory using copy-on-write - Run some code concurrently It just happens that fork does both of these things, but you can equally well do the two things using separate calls. In fact you should be able to avoid the deadlock issue by not using fork but just remapping some shared memory using copy on write. The GC can exist in a separate thread which pauses itself after every run. To run the GC it's then just a case of: - stop the world - copy registers to stack - remap shared memory using COW - resume the world - resume the GC thread And that would work on all modern OSes, plus you don't have the overhead of creating a new process or even a new thread. Also immutable memory doesn't need to be mapped, the GC thread can access it directly.
May 20 2013
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Monday, 20 May 2013 at 22:52:33 UTC, Diggory wrote:
 And that would work on all modern OSes, plus you don't have the 
 overhead of creating a new process or even a new thread. Also 
 immutable memory doesn't need to be mapped, the GC thread can 
 access it directly.
Copy on WRITE usually don't happen on immutable memory.
May 20 2013
parent "Diggory" <diggsey googlemail.com> writes:
On Tuesday, 21 May 2013 at 00:00:13 UTC, deadalnix wrote:
 On Monday, 20 May 2013 at 22:52:33 UTC, Diggory wrote:
 And that would work on all modern OSes, plus you don't have 
 the overhead of creating a new process or even a new thread. 
 Also immutable memory doesn't need to be mapped, the GC thread 
 can access it directly.
Copy on WRITE usually don't happen on immutable memory.
I never said it did... I said that when using fork the immutable memory must be mapped into the new process, when using a thread it does not since threads share the same address space.
May 20 2013
prev sibling parent reply Leandro Lucarella <luca llucax.com.ar> writes:
Diggory, el 21 de May a las 00:52 me escribiste:
 On Monday, 20 May 2013 at 13:55:05 UTC, Regan Heath wrote:
On Mon, 20 May 2013 13:50:25 +0100, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

On reddit:
http://www.reddit.com/r/programming/comments/1eovfu/dconf_2013_day_1_talk_6_concurrent_garbage/
This may be the Windows Copy On Write feature mentioned in the Q&A at the end: http://support.microsoft.com/kb/103858 .. but it's not clear to me how useful this is for fork emulation or similar. R
Fork isn't needed at all really in the technique described, this is all that's needed: - Map a copy of the memory using copy-on-write - Run some code concurrently It just happens that fork does both of these things, but you can equally well do the two things using separate calls. In fact you should be able to avoid the deadlock issue by not using fork but just remapping some shared memory using copy on write. The GC can exist in a separate thread which pauses itself after every run. To run the GC it's then just a case of: - stop the world - copy registers to stack - remap shared memory using COW - resume the world - resume the GC thread And that would work on all modern OSes, plus you don't have the overhead of creating a new process or even a new thread. Also immutable memory doesn't need to be mapped, the GC thread can access it directly.
I'm interested in what you're describing, but I don't know how can you achieve this without fork()ing (or clone()ing in Linux). What does "remap shared memory using COW" in a context where fork() doesn't happen? Why do you even need shared memory if fork() doesn't happen? If "remap shared memory using COW" means get a different address for the same block of memory until a write happens in that block, then you can't scan the roots anymore. I'm *very* interested in your suggestion. -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- CONDUCTOR BORRACHO CASI PROVOCA UNA TRAGEDIA: BATMAN UNICO TESTIGO -- Crónica TV
May 21 2013
next sibling parent "Diggory" <diggsey googlemail.com> writes:
On Tuesday, 21 May 2013 at 20:08:16 UTC, Leandro Lucarella wrote:
 I'm interested in what you're describing, but I don't know how 
 can you
 achieve this without fork()ing (or clone()ing in Linux). What 
 does
 "remap shared memory using COW" in a context where fork() 
 doesn't
 happen? Why do you even need shared memory if fork() doesn't 
 happen? If
 "remap shared memory using COW" means get a different address 
 for the
 same block of memory until a write happens in that block, then 
 you can't
 scan the roots anymore.

 I'm *very* interested in your suggestion.
I do mean mapping the same physical block of memory into two virtual memory blocks, such that a write will cause the OS to make a copy and sever the link. Given that we're dealing with significantly large blocks of memory all in multiples of the page size, it should be a simple matter to map from one copy of the memory to the other and back again using basic pointer arithmetic, so scanning the roots should not be a problem. You need shared memory because a file handle is needed when calling "mmap" to enable the COW functionality, and shared memory lets you get a file handle to a block of memory.
May 21 2013
prev sibling parent reply "Diggory" <diggsey googlemail.com> writes:
On Tuesday, 21 May 2013 at 20:08:16 UTC, Leandro Lucarella wrote:
 I'm interested in what you're describing, but I don't know how 
 can you
 achieve this without fork()ing (or clone()ing in Linux). What 
 does
 "remap shared memory using COW" in a context where fork() 
 doesn't
 happen? Why do you even need shared memory if fork() doesn't 
 happen? If
 "remap shared memory using COW" means get a different address 
 for the
 same block of memory until a write happens in that block, then 
 you can't
 scan the roots anymore.

 I'm *very* interested in your suggestion.
After doing some further investigation I think I've found a fairly awesome way of doing garbage collection on windows, hopefully linux has a similar mechanism. It doesn't require memory mapped files or anything special, it can be done retroactively, and it allows a kind of reverse copy on write which is what's actually needed. When the GC is run: - Use VirtualProtect to mark all mutable memory pages as read-only - Add a vectored exception handler to handle the access violation exception - Resume the GC thread In the exception handler: - Copy the page being modified to a new page - Mark the original page as writable again - Tell the GC to use the new page instead - Continue The main problem I see is the need to synchronise between the exception handler and the GC, when it would be nice if the exception handler was as light-weight as possible. However this can be largely solved by doing finer grained synchronisation/some clever stuff. For example, the handler can add the new mapping first, then do the page copy, and only then wait for the GC to acknowledge the new mapping, so the GC has the whole time that the copying is taking place to see the change. The GC would also of course have to wait for the copying to complete before continuing but pausing the GC for a short time isn't really an issue at all.
May 23 2013
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 24 May 2013 at 06:52:19 UTC, Diggory wrote:
 On Tuesday, 21 May 2013 at 20:08:16 UTC, Leandro Lucarella 
 wrote:
 I'm interested in what you're describing, but I don't know how 
 can you
 achieve this without fork()ing (or clone()ing in Linux). What 
 does
 "remap shared memory using COW" in a context where fork() 
 doesn't
 happen? Why do you even need shared memory if fork() doesn't 
 happen? If
 "remap shared memory using COW" means get a different address 
 for the
 same block of memory until a write happens in that block, then 
 you can't
 scan the roots anymore.

 I'm *very* interested in your suggestion.
After doing some further investigation I think I've found a fairly awesome way of doing garbage collection on windows, hopefully linux has a similar mechanism. It doesn't require memory mapped files or anything special, it can be done retroactively, and it allows a kind of reverse copy on write which is what's actually needed. When the GC is run: - Use VirtualProtect to mark all mutable memory pages as read-only - Add a vectored exception handler to handle the access violation exception - Resume the GC thread
I've tried writing a generational GC for D that used page protection for write barriers a while ago. IIRC, I ran into performance issues (the page faults were rather expensive). This approach does have the benefit that it will not cause pages that have been moved to swap to be pulled out in order to be scanned every time, though.
May 24 2013
parent Leandro Lucarella <luca llucax.com.ar> writes:
Vladimir Panteleev, el 24 de May a las 09:55 me escribiste:
When the GC is run:
- Use VirtualProtect to mark all mutable memory pages as read-only
- Add a vectored exception handler to handle the access violation
exception
- Resume the GC thread
I've tried writing a generational GC for D that used page protection for write barriers a while ago. IIRC, I ran into performance issues (the page faults were rather expensive).
Yeah, using memory protection to do what fork does manually is a known approach, discussed even in the "Garbage Collection" book[1]. The good thing about fork is it's sooooo much easier to implement, and the OS is already highly tuned to do this for you. That's why, even when it might be good to explore, is not a very tempting approach for me (but I have it in mind as an alternative way to "fix" the potential deadlock caused by glibc internal mutex).
 This approach does have the benefit that it will not cause pages
 that have been moved to swap to be pulled out in order to be scanned
 every time, though.
[1] http://books.google.de/books/about/Garbage_Collection.html?id=UdtQAAAAMAAJ&redir_esc=y -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Se ha dicho tanto que las apariencias engañan Por supuesto que engañarán a quien sea tan vulgar como para creerlo
May 27 2013
prev sibling parent Artur Skawina <art.08.09 gmail.com> writes:
On 05/24/13 08:52, Diggory wrote:
 After doing some further investigation I think I've found a fairly awesome way
of doing garbage collection on windows, hopefully linux has a similar
mechanism. It doesn't require memory mapped files or anything special, it can
be done retroactively, and it allows a kind of reverse copy on write which is
what's actually needed.
 
 When the GC is run:
 - Use VirtualProtect to mark all mutable memory pages as read-only
 - Add a vectored exception handler to handle the access violation exception
 - Resume the GC thread
 
 In the exception handler:
 - Copy the page being modified to a new page
A page fault per every page written to between every GC run + a user space callback for every such fault + a page sized copy every time is not going to perform very well... (There are ways to avoid these costs, but no idea if it's easily doable on windows...) artur
May 24 2013
prev sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Monday, 20 May 2013 at 13:55:05 UTC, Regan Heath wrote:
 On Mon, 20 May 2013 13:50:25 +0100, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:

 On reddit:
 http://www.reddit.com/r/programming/comments/1eovfu/dconf_2013_day_1_talk_6_concurrent_garbage/
This may be the Windows Copy On Write feature mentioned in the Q&A at the end: http://support.microsoft.com/kb/103858
Yes, basically. I can't find where I've read that mapping COW memory from within the same process is only supported on NT versions. However, doing it from within the same process is not practical anyway, since that would imply at least halving the address space.
May 20 2013
parent reply "Diggory" <diggsey googlemail.com> writes:
On Tuesday, 21 May 2013 at 04:26:18 UTC, Vladimir Panteleev wrote:
 On Monday, 20 May 2013 at 13:55:05 UTC, Regan Heath wrote:
 On Mon, 20 May 2013 13:50:25 +0100, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:

 On reddit:
 http://www.reddit.com/r/programming/comments/1eovfu/dconf_2013_day_1_talk_6_concurrent_garbage/
This may be the Windows Copy On Write feature mentioned in the Q&A at the end: http://support.microsoft.com/kb/103858
Yes, basically. I can't find where I've read that mapping COW memory from within the same process is only supported on NT versions. However, doing it from within the same process is not practical anyway, since that would imply at least halving the address space.
Either way, at least on windows the separate process would have to be persistent as creating a new process has a lot more overhead attached to it than on posix systems. Implicitly creating a child process is also something a D programmer might not want, again this is more windows specific where processes do not have the same strict hierarchy as on posix. On 64-bit systems there shouldn't be a problem with address space, and even on 32-bit systems, the objects which take up the vast majority of the address space are usually arrays of primitive types which don't need to be scanned by the GC. Certainly for 64-bit systems I think it's worth at least doing some performance comparisons, and I think using a thread will turn out to be more efficient. If it turns out that there are merits to both options then having a GC option to use a process or a thread might make sense.
May 20 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 21 May 2013 at 04:52:25 UTC, Diggory wrote:
 Either way, at least on windows the separate process would have 
 to be persistent as creating a new process has a lot more 
 overhead attached to it than on posix systems. Implicitly 
 creating a child process is also something a D programmer might 
 not want, again this is more windows specific where processes 
 do not have the same strict hierarchy as on posix.
What would be some of the potential downsides of a long-running satellite process? A concurrent GC would probably be opt-in or opt-out, either way.
May 23 2013
parent reply "Diggory" <diggsey googlemail.com> writes:
On Thursday, 23 May 2013 at 12:39:04 UTC, Vladimir Panteleev 
wrote:
 On Tuesday, 21 May 2013 at 04:52:25 UTC, Diggory wrote:
 Either way, at least on windows the separate process would 
 have to be persistent as creating a new process has a lot more 
 overhead attached to it than on posix systems. Implicitly 
 creating a child process is also something a D programmer 
 might not want, again this is more windows specific where 
 processes do not have the same strict hierarchy as on posix.
What would be some of the potential downsides of a long-running satellite process? A concurrent GC would probably be opt-in or opt-out, either way.
You'd either have to distribute a separate .exe for the GC process or embed the second .exe inside the first and extract it at runtime to a temporary location, or call the same .exe with some flag which tells it to turn into a GC. None of those options are particularly nice, all of them require significant interference from the GC, it's not just a case of calling some function to start the GC. This is especially a problem in cases where the runtime can't insert the init handler such as when using WinMain, which you currently have to do unless you want a console window to pop up. Next there's the fact that you have a separate top level process for every D process. This is going to cause problems with security software since you now have to give both processes permissions in order to run. In addition you have to consider the case where one or other of the two processes is paused or killed unexpectedly - you don't want orphaned GC processes hanging around wasting resources.
May 23 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 23 May 2013 at 23:49:20 UTC, Diggory wrote:
 You'd either have to distribute a separate .exe for the GC 
 process or embed the second .exe inside the first and extract 
 it at runtime to a temporary location, or call the same .exe 
 with some flag which tells it to turn into a GC. None of those 
 options are particularly nice, all of them require significant 
 interference from the GC, it's not just a case of calling some 
 function to start the GC.
I don't know why you say that. The satellite process would be the same executable file as the main process, but started with a special switch (or environment variable), which will be handled by the D runtime. Since the two processes use the same image, the executable code will share the same pages in memory, resulting in a small overhead.
 This is especially a problem in cases where the runtime can't 
 insert the init handler such as when using WinMain,
When implementing WinMain, you have to call the runtime initialization function, which will initialize the GC. GC initialization for the satellite process need not exit.
 which you currently have to do unless you want a console window 
 to pop up.
I don't think this is true. Although the image subsystem is auto-detected by the entry point you're using (CONSOLE for main, WINDOWS for WinMain), you can specify it explicitly using the /SUBSYSTEM linker switch (/SUBSYSTEM:WINDOWS).
 Next there's the fact that you have a separate top level 
 process for every D process. This is going to cause problems 
 with security software since you now have to give both 
 processes permissions in order to run.
Sorry? Could you provide an example (with a specific security software package)?
 In addition you have to consider the case where one or other of 
 the two processes is paused or killed unexpectedly - you don't 
 want orphaned GC processes hanging around wasting resources.
Why is this an issue? Windows provides simple ways to wait or check for a process's termination.
May 24 2013
parent reply "Diggory" <diggsey googlemail.com> writes:
On Friday, 24 May 2013 at 07:48:49 UTC, Vladimir Panteleev wrote:
 I don't know why you say that. The satellite process would be 
 the same executable file as the main process, but started with 
 a special switch (or environment variable), which will be 
 handled by the D runtime. Since the two processes use the same 
 image, the executable code will share the same pages in memory, 
 resulting in a small overhead.
I'm fairly sure windows only shares memory between instances of the same DLL, not executables.
 When implementing WinMain, you have to call the runtime 
 initialization function, which will initialize the GC. GC 
 initialization for the satellite process need not exit.
You're still executing part of the user's code and it's going to be exceedingly hard to satisfactorily explain why the main function will be called twice except the second time it will mysteriously stop half-way through because the runtime init function doesn't return...
 I don't think this is true. Although the image subsystem is 
 auto-detected by the entry point you're using (CONSOLE for 
 main, WINDOWS for WinMain), you can specify it explicitly using 
 the /SUBSYSTEM linker switch (/SUBSYSTEM:WINDOWS).
The only time I've been able to get it to not show a console window is when using WinMain...
 Sorry? Could you provide an example (with a specific security 
 software package)?
Any security software with the active defense type stuff, Commodo does it for example. The things they ask you about vary and it's completely pointless IMO, but they do ask for permission for practically everything and with two processes that's twice as much to ask you about. Anyway, my main point is that it's not the kind of thing you want to impose as standard on all D applications, there may be other problems it causes depending on what the program is for. If it exists at all it should be opt-in and the default should still be as performant as possible.
 I've tried writing a generational GC for D that used page 
 protection for write barriers a while ago. IIRC, I ran into 
 performance issues (the page faults were rather expensive).
Hmm, it shouldn't really be much slower than the technique Leandro was using - both rely on the exact same number of page faults - and his results looked very promising.
May 24 2013
next sibling parent "Diggory" <diggsey googlemail.com> writes:
On Friday, 24 May 2013 at 09:56:25 UTC, Diggory wrote:
 The only time I've been able to get it to not show a console 
 window is when using WinMain...
OK, nvm, it seems it's just that the "SUBSYSTEM" setting in VisualD has no effect, you have to set it through a .def file.
May 24 2013
prev sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Friday, 24 May 2013 at 09:56:25 UTC, Diggory wrote:
 I'm fairly sure windows only shares memory between instances of 
 the same DLL, not executables.
Oh? I thought Windows treats all modules in the same way. Do you have a source?
 You're still executing part of the user's code and it's going 
 to be exceedingly hard to satisfactorily explain why the main 
 function will be called twice except the second time it will 
 mysteriously stop half-way through because the runtime init 
 function doesn't return...
I don't see that as a problem, considering such a GC would need to be opt-in.
 Any security software with the active defense type stuff, 
 Commodo does it for example. The things they ask you about vary 
 and it's completely pointless IMO, but they do ask for 
 permission for practically everything and with two processes 
 that's twice as much to ask you about.
The reason why I asked for a specific example is that I don't know of any actual case when this will be a problem in practice. For one, I've never seen security software that reacts differently on two instances of the same executable - usually, any permissions you set are for the .exe file, or more generic rules (like port numbers). For two, the satellite process does not need any kinds of permissions. Even the requirement of accessing other processes' memory that the straight-forward approach needs can be avoided by allowing the satellite process to inherit a handle to the main process.
 Anyway, my main point is that it's not the kind of thing you 
 want to impose as standard on all D applications, there may be 
 other problems it causes depending on what the program is for. 
 If it exists at all it should be opt-in and the default should 
 still be as performant as possible.
Yes, which is why I also think it should be opt-in.
 I've tried writing a generational GC for D that used page 
 protection for write barriers a while ago. IIRC, I ran into 
 performance issues (the page faults were rather expensive).
Hmm, it shouldn't really be much slower than the technique Leandro was using - both rely on the exact same number of page faults - and his results looked very promising.
Leandro wasn't using Windows ;)
May 24 2013
parent "Diggory" <diggsey googlemail.com> writes:
On Friday, 24 May 2013 at 10:20:12 UTC, Vladimir Panteleev wrote:
 Leandro wasn't using Windows ;)
True but assuming windows isn't completely degenerate the overhead of a page fault is going to be pretty close to identical on the same cpu architecture. On Friday, 24 May 2013 at 11:15:21 UTC, Artur Skawina wrote:
 A page fault per every page written to between every GC run + a 
 user
 space callback for every such fault + a page sized copy every 
 time is
 not going to perform very well...

 (There are ways to avoid these costs, but no idea if it's 
 easily doable on
 windows...)

 artur
It's only between the time a GC run is started and the time it finishes, exactly the same as in the talk. A generational GC should be able to do complete mark cycles fairly infrequently.
May 24 2013
prev sibling next sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Mon, 20 May 2013 08:50:25 -0400
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On reddit:
 
 http://www.reddit.com/r/programming/comments/1eovfu/dconf_2013_day_1_talk_6_concurrent_garbage/
 
Torrents up, as well as links: http://semitwist.com/download/misc/dconf2013/
May 20 2013
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-05-20 14:50, Andrei Alexandrescu wrote:
 On reddit:

 http://www.reddit.com/r/programming/comments/1eovfu/dconf_2013_day_1_talk_6_concurrent_garbage/
Great talk. What's up with the extra minute added in the beginning? -- /Jacob Carlborg
May 20 2013
parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Tue, 21 May 2013 08:27:23 +0200
Jacob Carlborg <doob me.com> wrote:

 On 2013-05-20 14:50, Andrei Alexandrescu wrote:
 On reddit:

 http://www.reddit.com/r/programming/comments/1eovfu/dconf_2013_day_1_talk_6_concurrent_garbage/
Great talk. What's up with the extra minute added in the beginning?
Suspense!
May 20 2013
prev sibling next sibling parent reply "Dicebot" <m.strashun gmail.com> writes:
Can't wait to see a prototype for D2 :) I have a feeling that 
this may solve at least some of vibe.d latency issues at high 
concurrency levels.
May 21 2013
parent Leandro Lucarella <luca llucax.com.ar> writes:
Dicebot, el 21 de May a las 09:55 me escribiste:
 Can't wait to see a prototype for D2 :) I have a feeling that this
 may solve at least some of vibe.d latency issues at high concurrency
 levels.
I hope I can start porting it to D2 at some (not so far in the future) point... -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Did you know the originally a Danish guy invented the burglar-alarm unfortunately, it got stolen
May 21 2013
prev sibling parent reply "Juan Manuel Cabo" <juanmanuel.cabo gmail.com> writes:
On Monday, 20 May 2013 at 12:50:23 UTC, Andrei Alexandrescu wrote:
 On reddit:

 http://www.reddit.com/r/programming/comments/1eovfu/dconf_2013_day_1_talk_6_concurrent_garbage/


 Enjoy! Discuss!! Vote!!!

 Andrei
I know that this is slightly offtopic, since the topic lately seems to be how to make the GC run generationally or with small footprint (don't stop the world, etc.). I'd like to know if there is interest in a precise garbage collector. Anyways, here is how .NET does it: http://blogs.msdn.com/b/abhinaba/archive/2009/03/03/back-to-basics-how-does-the-gc-find-object-references.aspx It uses a mask stored in the type information of a class. D doesn't have this kind of type info in runtime I guess, but since D is on the verge of supporting multiple dlls/so, the time is now for a small modification to be made in the ABI to support this (if it is ever going to be made). I know that in 64bits there is less of a problem with data as false pointers, but having a precise garbage collector would make two things possible: 1) Defragmenting the heap by being able to move references. 2) Easier to make a generational GC The following link explains (in the first comment) how .NET distinguishes its own stack frames from non-managed stack frames, by adding a "cookie": http://stackoverflow.com/questions/10669173/how-does-the-gc-update-references-after-compaction-occurs --jm
May 24 2013
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, May 24, 2013 20:30:54 Juan Manuel Cabo wrote:

 I know that this is slightly offtopic, since the topic lately
 seems to be how to make the GC run generationally or with small
 footprint (don't stop the world, etc.).
 
 I'd like to know if there is interest in a precise garbage
 collector.
There is interest in it, and Rainer Schütze did a talk on it at DConf. At the current pace (assuming that Andrei actually posts one on Monday even though it's a federal holiday in the US), it'll be posted on June 3rd (and if he skips Monday, then it'll probably be June 5th). And actually, the precise GC changes stand a much better chance of making it into druntime in the short term than any concurrency changes do. - Jonathan M Davis
May 24 2013
next sibling parent reply "Juan Manuel Cabo" <juanmanuel.cabo gmail.com> writes:
On Friday, 24 May 2013 at 19:44:19 UTC, Jonathan M Davis wrote:
 On Friday, May 24, 2013 20:30:54 Juan Manuel Cabo wrote:

 I know that this is slightly offtopic, since the topic lately
 seems to be how to make the GC run generationally or with small
 footprint (don't stop the world, etc.).
 
 I'd like to know if there is interest in a precise garbage
 collector.
There is interest in it, and Rainer Schütze did a talk on it at DConf. At the current pace (assuming that Andrei actually posts one on Monday even though it's a federal holiday in the US), it'll be posted on June 3rd (and if he skips Monday, then it'll probably be June 5th). And actually, the precise GC changes stand a much better chance of making it into druntime in the short term than any concurrency changes do. - Jonathan M Davis
Thanks, that is great news! --jm
May 24 2013
parent reply "Diggory" <diggsey googlemail.com> writes:
On 64-bit windows there is also the "GetWriteWatch" function 
which lets you access the dirty flag in the page table = no page 
faults = super efficient concurrent generational GC. Just a shame 
it doesn't exist on 32-bit systems for some reason.
May 24 2013
parent Sean Cavanaugh <WorksOnMyMachine gmail.com> writes:
On 5/24/2013 11:12 PM, Diggory wrote:
 On 64-bit windows there is also the "GetWriteWatch" function which lets
 you access the dirty flag in the page table = no page faults = super
 efficient concurrent generational GC. Just a shame it doesn't exist on
 32-bit systems for some reason.
There's all sorts of interesting stuff in 64 bit windows :) The user mode thread scheduler is pretty cool. On the flip side: 32 bit is in its twilight days, and I am reasonably confident the next game I work on will be the last one that even supports 32 bits. Then I can finally use all the new 64 bit goodies :) 32 bits will be reserved for phones and tablets (and even then tablets will probably be making the switch pretty soon-ish)
May 27 2013
prev sibling parent reply "Brian Rogoff" <brogoff gmail.com> writes:
On Friday, 24 May 2013 at 19:44:19 UTC, Jonathan M Davis wrote:
 On Friday, May 24, 2013 20:30:54 Juan Manuel Cabo wrote:
 I'd like to know if there is interest in a precise garbage
 collector.
There is interest in it, and Rainer Schütze did a talk on it at DConf. At the current pace (assuming that Andrei actually posts one on Monday even though it's a federal holiday in the US), it'll be posted on June 3rd (and if he skips Monday, then it'll probably be June 5th). And actually, the precise GC changes stand a much better chance of making it into druntime in the short term than any concurrency changes do. - Jonathan M Davis
That's very promising. The lack of precise garbage collection and the unclear story with regards to programming sans-GC (maybe it's clear to someone, but not to me) is far more of a "deal breaker" for me than the lack of non-nullable pointers. I hope that you're right and that this gets sorted out soon. -- Brian
May 27 2013
parent "Diggory" <diggsey googlemail.com> writes:
On Monday, 27 May 2013 at 17:56:10 UTC, Brian Rogoff wrote:
 On Friday, 24 May 2013 at 19:44:19 UTC, Jonathan M Davis wrote:
 On Friday, May 24, 2013 20:30:54 Juan Manuel Cabo wrote:
 I'd like to know if there is interest in a precise garbage
 collector.
There is interest in it, and Rainer Schütze did a talk on it at DConf. At the current pace (assuming that Andrei actually posts one on Monday even though it's a federal holiday in the US), it'll be posted on June 3rd (and if he skips Monday, then it'll probably be June 5th). And actually, the precise GC changes stand a much better chance of making it into druntime in the short term than any concurrency changes do. - Jonathan M Davis
That's very promising. The lack of precise garbage collection and the unclear story with regards to programming sans-GC (maybe it's clear to someone, but not to me) is far more of a "deal breaker" for me than the lack of non-nullable pointers. I hope that you're right and that this gets sorted out soon. -- Brian
It's actually possible to improve the precision of the GC without any additional type info. As long as you can give some unique ID to each type when you allocate it then the GC can learn the layout of that type on the fly. For example a simple algorithm would be: - When a new ID is first see create new type-info that is all pointers. - While scanning an instance of that type, if a pointer points to a value in the higher half, or a sufficiently low value which is not equal to zero, then remove this pointer from the type-info. You would have to disable this for unions, but for the rest it should work fine. Plus with more intelligent algorithms you can handle more cases. You could even save the type-info to a file and reuse it later to improve performance.
May 27 2013