digitalmars.D - Garbage Collector

Konstantin <soonts gmail.com> writes:
Started learning D. Like the language. However, found several 
people complaining about garbage collector’s reliability and 
performance. For me it’s a showstopper.

I don’t believe a community is capable of creating a good GC. 
It’s just too complex an engineering task. It’s been a known problem 
for years, and there is still no solution.

Microsoft’s .NET framework and runtime have recently been made open 
source under the MIT license. Here are the main parts of their GC:  
https://github.com/dotnet/coreclr/tree/master/src/gc As you can see, 
it’s in C++, and contains 10-20 times more code than D’s GC. 
Theoretically, it should be cross-platform: there are build 
instructions for Linux (including ARM), OSX, and BSD, but I 
haven’t tried building on those.

Has anyone thought about taking GC from .NET and reusing it in D?

That GC is very efficient for a wide range of applications. I’ve 
been using .NET since 3.0 on desktops, servers, embedded and 
mobile devices, and never had issues with the GC; it just works, and the 
performance is good.

I don’t know the architectural details of either GC.

I’m perfectly aware that they might turn out to be completely 
incompatible, or very hard to port: because of the differences 
between .NET’s System.Object and D’s Object; because of D’s fibers 
or just a very different threading model; because of a different 
CRT; many other reasons are possible.

But if we’re lucky, this GC could lead to a great improvement for 
the D ecosystem without costing too much time. IMO that is something 
doable by a single person in their spare time.

Thoughts?
Jun 15 2016
jmh530 <john.michael.hall gmail.com> writes:
On Wednesday, 15 June 2016 at 13:19:31 UTC, Konstantin wrote:
 Started learning D. Like the language. However, found several 
 people complaining about garbage collector’s reliability and 
 performance. For me it’s a showstopper.
Possible to disable.
 I don’t believe a community is capable of creating a good GC. 
 It’s just too complex engineering task. It’s been a known 
 problem for years, still no solution.
They've got a GSOC guy workin' on it now. I would hold off judgements until that effort is concluded.
 Has anyone thought about taking GC from .NET and reusing it in 
 D?
I don't think this would work, but I don't know enough to be able to explain why.
Jun 15 2016
Konstantin <soonts gmail.com> writes:
On Wednesday, 15 June 2016 at 13:27:47 UTC, jmh530 wrote:

 Possible to disable.
I don’t want to: for the last couple years I’ve been developing one of the reasons of that is GC.
 They've got a GSOC guy workin' on it now. I would hold off 
 judgements until that effort is concluded.
OK let's see.
Jun 15 2016
rikki cattermole <rikki cattermole.co.nz> writes:
I'm not sure how much you know about D but:

1. Somebody is working on improving D's GC as part of GSOC in the hopes 
of making it able to be precise (from memory, not 100% sure).
2. Only a few language features force you to use the GC.
3. For most uses you are not forced to use the GC in any form, especially 
with the help of std.experimental.allocator.
4. Our GC is based upon the Boehm GC. It's old. Even the more recent 
versions would be far better than what we have (we forked a long time ago).
5. The requirements for our GC are quite intricate. I.e. you can't just 
pop in one that doesn't understand our Thread Local Storage (TLS) 
and stuff.
6. As said by somebody else, we can disable the GC so it won't go ahead 
and scan upon allocation (the only time it does).
Jun 15 2016
rikki cattermole <rikki cattermole.co.nz> writes:
On 16/06/2016 1:33 AM, rikki cattermole wrote:
 I'm not sure how much you know about D but:

 1. Somebody is working on improving D's GC as part of GSOC in the hopes
 of making it able to be precise (from memory not 100% sure).
 2. Only a few language features forces you to use the GC.
 3. For most uses you are not forced to use the GC in any form especially
 with the help of std.experimental.allocator.
 4. Our GC is based upon the Boehm GC. Its old. Even the more recent
 versions would be far better then what we have (we forked a long time ago).
 5. The requirements for our GC is quite intricate. I.e. you can't just
 pop in one that doesn't understand about our Thread Local Storage (TLS)
 and stuff.
 6. As said by somebody else we can disable the GC so it won't go ahead
 and scan upon allocation (only time it does).
I forgot to mention, good D code is not the same as code in a higher level language like Java. Here you don't have the automagic behavior of arrays. If you append, it will have a high cost. All allocations have a large cost. Instead, allocate in one large block, which will of course be a whole lot faster than many tiny ones. So even if the GC is enabled, good D code won't cause too much slowdown unless you decide to write heavy OOP code.
Jun 15 2016
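The advice in the post above, to allocate one large block instead of growing piecemeal, is not specific to D. As a rough illustration only, here is a minimal sketch in C++ (chosen for illustration; the thread itself contains no code). `CountingAllocator`, `g_allocations` and `allocationsForAppends` are made-up names. The sketch counts how often a growable array asks for memory with and without reserving capacity up front.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical counter, incremented on every request for memory.
static std::size_t g_allocations = 0;

// Minimal allocator that forwards to std::allocator and counts the calls.
template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++g_allocations;
        return std::allocator<T>{}.allocate(n);
    }
    void deallocate(T* p, std::size_t n) { std::allocator<T>{}.deallocate(p, n); }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

// Appends `count` ints one by one and returns how many allocations that took.
std::size_t allocationsForAppends(std::size_t count, bool reserveFirst) {
    g_allocations = 0;
    std::vector<int, CountingAllocator<int>> v;
    if (reserveFirst) v.reserve(count);  // one block, sized up front
    for (std::size_t i = 0; i < count; ++i) v.push_back(static_cast<int>(i));
    return g_allocations;
}
```

Comparing `allocationsForAppends(100000, false)` with `allocationsForAppends(100000, true)` shows the difference: with common standard library implementations the reserved version asks for memory once, while the piecemeal version reallocates and copies several times as the array grows.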
Konstantin <soonts gmail.com> writes:
On Wednesday, 15 June 2016 at 13:40:11 UTC, rikki cattermole 
wrote:

 5. The requirements for our GC is quite intricate. I.e. you 
 can't just
 pop in one that doesn't understand about our Thread Local 
 Storage (TLS)
 and stuff.
Is D’s TLS that different from .NET's TLS? https://msdn.microsoft.com/en-us/library/system.threadstaticattribute(v=vs.110).aspx
 I forgot to mention, good D code is not the same as a higher 
 level language like Java.
 Here you don't have the automagick behavior of arrays. If you 
 append it will have a high cost. All allocations have a large 
 cost. Instead allocate in one large block which will of course 
 be a whole lot faster then small tiny ones.
You’re saying memory allocations in D are generally very expensive, but that’s not a problem, because it already functions as designed?
 So even if the GC is enabled, good D code won't cause too much 
 slow down unless you decide to write heavy OOP code.
I’ve been developing heavy OOP code in various languages (mostly very well for me.
Jun 15 2016
rikki cattermole <rikki cattermole.co.nz> writes:
On 16/06/2016 4:52 AM, Konstantin wrote:
 On Wednesday, 15 June 2016 at 13:40:11 UTC, rikki cattermole wrote:

 5. The requirements for our GC is quite intricate. I.e. you can't just
 pop in one that doesn't understand about our Thread Local Storage (TLS)
 and stuff.
 D’s TLS that different from .NET's TLS? https://msdn.microsoft.com/en-us/library/system.threadstaticattribute(v=vs.110).aspx
Yes it most definitely is. We roll our own for platforms that do not support it. There is an abstraction in druntime specifically to handle this problem.
 I forgot to mention, good D code is not the same as a higher level
 language like Java.
 Here you don't have the automagick behavior of arrays. If you append
 it will have a high cost. All allocations have a large cost. Instead
 allocate in one large block which will of course be a whole lot faster
 then small tiny ones.
 You’re saying memory allocations in D are generally very expensive, but that’s not a problem, because it already functions as designed?
No. Memory allocations are /always/ expensive. Higher level languages like Java have the benefit of using pools and optimizing for this usage pattern; D does not and never will have this. Keep in mind an allocation = usage of malloc + write to returned pointer.
 So even if the GC is enabled, good D code won't cause too much slow
 down unless you decide to write heavy OOP code.
 I’ve been developing heavy OOP code in various languages (mostly C++, for me.
Well, if you really insist on having a String class, don't be too surprised if for some reason it doesn't have the same performance as, say, Java's. In other words, don't go around creating and destroying class instances in huge numbers unless you have rolled some form of memory management policy, such as reserving memory for the GC to use. We have other tools for places where OOP would normally be used, such as templates, structs, function pointers and delegates.
Jun 15 2016
Konstantin <soonts gmail.com> writes:
On Wednesday, 15 June 2016 at 17:02:11 UTC, rikki cattermole 
wrote:

 Higher level languages like Java have the benefit of using 
 pools and optimizing for this usage pattern, D does and will 
 never have this.
Why don't you want the same for D?
 Well if you really insist to have a String class don't be too 
 surprised for some reason it doesn't have the same performance 
 to say Java.
Some areas, like compiling, or producing HTML/XML/JSON documents, manipulate strings a lot. Other areas, like GUI editors for sufficiently complex documents, or level editors for video games, need to efficiently manipulate huge trees of assorted small objects, not necessarily strings.
 Aka don't go around creating/destroying classes a huge amount 
 unless you have rolled some form of memory management policy 
 such as reserving memory for the GC to use.
Yeah, that’s what I regularly do in C++ when I need to efficiently create/destroy many small objects. Sure, this typically leads to the best performance, e.g. because I can make the memory layout as cache friendly as humanly possible. But not all projects need that. And even for very performance-demanding apps, not all components of the app need that. For such cases, a good GC (that just works well out of the box like .NET's GC does) can reduce development costs significantly.
Jun 15 2016
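The hand-rolled C++ approach mentioned in the post above (pooling many small objects so they sit next to each other in memory) can be sketched roughly as follows. This is only an illustrative bump-pointer arena under stated assumptions: `Arena` and `Node` are hypothetical names, nothing here comes from the poster's code, and destructors are never run, so the sketch restricts itself to trivially destructible types.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical small object, e.g. one node of a document tree.
struct Node {
    int value;
    Node* next;
};

// Hypothetical bump-pointer arena: hands out slices of one big block.
class Arena {
public:
    explicit Arena(std::size_t bytes) : buffer_(bytes), used_(0) {}

    // Constructs a T in the next free, suitably aligned slot.
    // Returns nullptr when the block is exhausted.
    template <class T, class... Args>
    T* create(Args&&... args) {
        static_assert(std::is_trivially_destructible<T>::value,
                      "this sketch never runs destructors");
        void* p = buffer_.data() + used_;
        std::size_t space = buffer_.size() - used_;
        if (std::align(alignof(T), sizeof(T), p, space) == nullptr) return nullptr;
        used_ = (buffer_.size() - space) + sizeof(T);
        return ::new (p) T(std::forward<Args>(args)...);
    }

    // Releases every object at once by rewinding the bump pointer.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    std::vector<unsigned char> buffer_;  // the single large allocation
    std::size_t used_;                   // bytes handed out so far
};
```

Objects created back to back (for example `arena.create<Node>(Node{1, nullptr})`) end up adjacent in memory, and `reset()` frees them all in one step. The cost is that the programmer has to decide up front how big the block is and when everything in it dies, which is the bookkeeping a general-purpose GC would otherwise take over.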
rikki cattermole <rikki cattermole.co.nz> writes:
On 16/06/2016 6:53 AM, Konstantin wrote:
 On Wednesday, 15 June 2016 at 17:02:11 UTC, rikki cattermole wrote:

 Higher level languages like Java have the benefit of using pools and
 optimizing for this usage pattern, D does and will never have this.
 Why don't you want the same for D?
Because we don't need them. Sprinkling of fairy dust is for stories, not reality.
 Well if you really insist to have a String class don't be too
 surprised for some reason it doesn't have the same performance to say
 Java.
 Some areas, like compiling, or producing HTML/XML/JSON documents, manipulate strings a lot. Other areas, like GUI editors for sufficiently complex documents, or level editors for videogame, need to efficiently manipulate huge trees of assorted small objects, not necessarily strings.
You're quite right, and that is why we have a GC to begin with. It's also part of the reason why std.experimental.allocator will allow you to create an allocator that is able to handle such a workload and then free everything when complete.
 Aka don't go around creating/destroying classes a huge amount unless
 you have rolled some form of memory management policy such as
 reserving memory for the GC to use.
 Yeah, that’s what I regularly do in C++ when I need to efficiently create/destroys many small objects. Sure, this typically leads to the best performance, e.g. because I can make the memory layout as cache friendly as humanly possible. But not all projects need that. And even for very performance demanding apps, not all components of the app need that. For such cases, a good GC (that just works well out of the box like .NET's GC does) can reduce development costs significantly.
So exactly like what our GC does. Unless you're doing real time development in any form (e.g. sound) you won't need to do much to work around the GC.
Jun 15 2016
Kagamin <spam here.lot> writes:
On Wednesday, 15 June 2016 at 13:19:31 UTC, Konstantin wrote:
 Has anyone thought about taking GC from .NET and reusing it in 
 D?
Fast GC for D was considered and rejected. What can be done is a precise and concurrent GC.
Jun 15 2016
ketmar <ketmar ketmar.no-ip.org> writes:
On Wednesday, 15 June 2016 at 13:19:31 UTC, Konstantin wrote:
 I don’t believe a community is capable of creating a good GC.
you are wrong. and you definitely know nothing about garbage collection, virtual machines and code generation. i wonder why people keep coming with "suggestions" and "solutions" without even a small knowledge in problem field.
Jun 15 2016
Jonathan Marler <johnnymarler gmail.com> writes:
On Wednesday, 15 June 2016 at 13:38:33 UTC, ketmar wrote:
 On Wednesday, 15 June 2016 at 13:19:31 UTC, Konstantin wrote:
 I don’t believe a community is capable of creating a good GC.
 you are wrong. and you definitely know nothing about garbage collection, virtual machines and code generation. i wonder why people keep coming with "suggestions" and "solutions" without even a small knowledge in problem field.
That's pretty harsh, Ketmar. It's obvious he knows the general ideas and was just wondering if using the .NET GC was a viable option. I think responding to others in such a demeaning way is harmful to the D community, as it isolates people. It doesn't encourage people to be curious or to want to start a discussion. Having people, especially newcomers to D, come in and make suggestions and offer solutions is a great thing for a community. It means they saw enough potential in the language to want to know more, and maybe even how they could contribute.
Jun 16 2016
Joerg Joergonson <JJoergonson gmail.com> writes:
On Friday, 17 June 2016 at 04:20:23 UTC, Jonathan Marler wrote:
 On Wednesday, 15 June 2016 at 13:38:33 UTC, ketmar wrote:
 On Wednesday, 15 June 2016 at 13:19:31 UTC, Konstantin wrote:
 [...]
 you are wrong. and you definitely know nothing about garbage collection, virtual machines and code generation. i wonder why people keep coming with "suggestions" and "solutions" without even a small knowledge in problem field.
 That's pretty harsh Ketmar. It's obvious he knows the general ideas and was just wondering if using the .NET GC was a viable option. I think responding to others in such a demeaning way is harmful to the D community as it isolates people. It doesn't encourage people to be curious or want to start a discussion. Having people, especially newcomers to D come in and make suggestions and solutions is a great thing for a community. It means they saw enough potential in the language to want to know more and maybe even how they could contribute.
It also makes ketmar look like a tard and childish. Konstantin said he 'believed' something then ketmar responded with a fallacious attack. Maybe ketmar needs to take his meds? ;)
Jun 17 2016
dewitt <dkdewitt gmail.com> writes:
On Friday, 17 June 2016 at 21:47:36 UTC, Joerg Joergonson wrote:
 On Friday, 17 June 2016 at 04:20:23 UTC, Jonathan Marler wrote:
 On Wednesday, 15 June 2016 at 13:38:33 UTC, ketmar wrote:
 On Wednesday, 15 June 2016 at 13:19:31 UTC, Konstantin wrote:
 [...]
 you are wrong. and you definitely know nothing about garbage collection, virtual machines and code generation. i wonder why people keep coming with "suggestions" and "solutions" without even a small knowledge in problem field.
 That's pretty harsh Ketmar. It's obvious he knows the general ideas and was just wondering if using the .NET GC was a viable option. I think responding to others in such a demeaning way is harmful to the D community as it isolates people. It doesn't encourage people to be curious or want to start a discussion. Having people, especially newcomers to D come in and make suggestions and solutions is a great thing for a community. It means they saw enough potential in the language to want to know more and maybe even how they could contribute.
 It also makes ketmar look like a tard and childish. Konstantin said he 'believed' something then ketmar responded with a fallacious attack. Maybe ketmar needs to take his meds? ;)
The idea that "a community" cannot create a GC is false. It is not too complex, as there are plenty of complex projects that are community driven. A better assumption would be that he doesn't believe THIS community can create a GC. Not that I believe that either, because there is no reason a community-driven GC would not be successful. There are plenty of good GCs out there, like Java's for instance, that work great for the ecosystem of their language, but even if we had the greatest, some D devs would still be upset, because a lot of D devs come from C/C++ and do not want the GC. Both statements made assumptions, and I do not think they are even close to the worst things said on this forum. People, chill a little. Light one up, drink one, put some rounds down range, whatever you gotta do to lighten up a little.
Jun 17 2016
Jack Stouffer <jack jackstouffer.com> writes:
On Wednesday, 15 June 2016 at 13:19:31 UTC, Konstantin wrote:
 I don’t believe a community is capable of creating a good GC. 
 It’s just too complex engineering task. It’s been a known 
 problem for years, still no solution.
GCs are a solved problem and the most common and fastest techniques have been known for more than 20 years. The GC implementation that D is using now came from the 70's, for example. One guy wrote the LuaJIT GC, which beat almost everyone else in performance when I last checked, so I think this is a massive exaggeration.
 Has anyone thought about taking GC from .NET and reusing it in 
 D?
Two words: write barriers.
Jun 15 2016
Konstantin <soonts gmail.com> writes:
On Wednesday, 15 June 2016 at 13:56:09 UTC, Jack Stouffer wrote:

 One guy wrote the LuaJIT GC, which beat almost everyone else in 
 performance when I last checked
“The current garbage collector is relatively slow compared to implementations for other language runtimes. It's not competitive with top-of-the-line GCs, especially for large workloads.” https://github.com/LuaJIT/LuaJIT/issues/38

They have planned something for 3.0 that may or may not work: http://wiki.luajit.org/New-Garbage-Collector But that’s merely a design; AFAIK there’s no implementation. They’re still looking for a sponsor for that.
 Has anyone thought about taking GC from .NET and reusing it in 
 D?
 Two words: write barriers.
What about them? You mean not all D’s target platforms support them?
Jun 15 2016
Jack Stouffer <jack jackstouffer.com> writes:
On Wednesday, 15 June 2016 at 17:03:21 UTC, Konstantin wrote:
 Two words: write barriers.
 What about them? You mean not all D’s target platforms support them?
They're not acceptable for a systems programming language, as they require you to pay for something that you might not use. According to our resident GC maintainer (among many other things), they would cause a 1%-5% slowdown in the language: https://github.com/dlang/druntime/pull/1081#issuecomment-69151660
Jun 15 2016
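For readers unfamiliar with the term used in the post above: a write barrier is extra bookkeeping executed on every pointer store, so that a generational or incremental collector knows which parts of the heap changed. One common textbook form is card marking. The following C++ sketch shows only the idea, with hypothetical names (`ToyHeap`, `storeRaw`, `storeWithBarrier`); it is not how .NET, Go, or any D proposal actually implements it.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical toy heap: 64 pointer-sized slots, grouped into "cards" of 8 slots.
constexpr std::size_t kSlots = 64;
constexpr std::size_t kSlotsPerCard = 8;

struct ToyHeap {
    std::array<void*, kSlots> slots{};                          // managed pointer fields
    std::array<std::uint8_t, kSlots / kSlotsPerCard> dirty{};   // one flag per card

    // Plain store: what a language without write barriers compiles to.
    void storeRaw(std::size_t slot, void* value) { slots[slot] = value; }

    // Store with a card-marking write barrier: the same store plus one
    // extra write recording which region of the heap was modified.
    void storeWithBarrier(std::size_t slot, void* value) {
        slots[slot] = value;
        dirty[slot / kSlotsPerCard] = 1;
    }

    // A collector would rescan only the dirty cards instead of the whole heap.
    std::size_t dirtyCardCount() const {
        std::size_t n = 0;
        for (std::uint8_t d : dirty) n += d;
        return n;
    }
};
```

The extra write in `storeWithBarrier` is paid on every pointer store, whether or not the program ever triggers a collection. That per-store cost is what the 1%-5% figure in this part of the thread is arguing about.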
Konstantin <soonts gmail.com> writes:
On Wednesday, 15 June 2016 at 18:23:52 UTC, Jack Stouffer wrote:

 They're not acceptable for a systems programming language as 
 they require you to pay for something that you might not use.

 According to our resident GC maintainer (among many other 
 things), they would cause a 1%-5% slow down in the language: 
 https://github.com/dlang/druntime/pull/1081#issuecomment-69151660
Well I’m not sure about the 5% (MS says their write barrier overhead is comparable to the cost of a simple method call, namely 6.4ns: https://msdn.microsoft.com/en-us/library/ms973852.aspx), but yeah, there’s some tradeoff, for having a good GC. By the way, Go implemented those barriers in version 1.5 a year ago: https://blog.golang.org/go15gc
Jun 15 2016
cym13 <cpicard openmailbox.org> writes:
On Wednesday, 15 June 2016 at 19:39:59 UTC, Konstantin wrote:
 On Wednesday, 15 June 2016 at 18:23:52 UTC, Jack Stouffer wrote:

 They're not acceptable for a systems programming language as 
 they require you to pay for something that you might not use.

 According to our resident GC maintainer (among many other 
 things), they would cause a 1%-5% slow down in the language: 
 https://github.com/dlang/druntime/pull/1081#issuecomment-69151660
 Well I’m not sure about the 5% (MS says their write barrier overhead is comparable to the cost of a simple method call, namely 6.4ns: https://msdn.microsoft.com/en-us/library/ms973852.aspx), but yeah, there’s some tradeoff, for having a good GC. By the way, Go implemented those barriers in version 1.5 a year ago: https://blog.golang.org/go15gc
May I point out that you do not seem to have any kind of experience of D's GC? Try it and see for yourself whether it actually stops you or not. It's right that not everyone is pleased with the current GC, but those users have specific expectations and I'm not certain at [...]

The point is, D doesn't have to have a GC. Not using it is way easier than in most other languages because all the tools to help you profile it and avoid it are provided by the compiler. Go without a good GC is a dead language. D without a good GC is just not as good as it could be. And btw we're generally faster than Go ;-)

The point is: while a better GC is a work in progress we'll *never* have a GC that can fit all needs, but it's not as critical as it is [...] limitations arise.
Jun 15 2016
Jack Stouffer <jack jackstouffer.com> writes:
On Wednesday, 15 June 2016 at 19:39:59 UTC, Konstantin wrote:
 Well I’m not sure about the 5% (MS says their write barrier 
 overhead is comparable to the cost of a simple method call, 
 namely 6.4ns: 
 https://msdn.microsoft.com/en-us/library/ms973852.aspx), but 
 yeah, there’s some tradeoff, for having a good GC.
Even 1% overhead is unacceptable. Again, it's not reasonable for a systems language to have people pay for things they're not using. Write barriers will come to D over Walter's dead body.
 By the way, Go implemented those barriers in version 1.5 a year 
 ago: https://blog.golang.org/go15gc
Go has no allocation strategies but the GC, so that point is moot.
Jun 15 2016
deadalnix <deadalnix gmail.com> writes:
On Wednesday, 15 June 2016 at 20:22:21 UTC, Jack Stouffer wrote:
 On Wednesday, 15 June 2016 at 19:39:59 UTC, Konstantin wrote:
 Well I’m not sure about the 5% (MS says their write barrier 
 overhead is comparable to the cost of a simple method call, 
 namely 6.4ns: 
 https://msdn.microsoft.com/en-us/library/ms973852.aspx), but 
 yeah, there’s some tradeoff, for having a good GC.
 Even 1% overhead is unacceptable. Again, it's not reasonable for a systems language to have people pay for things they're not using. Write barriers will come to D over Walter's dead body.
Simple exercise. You have 100 000 servers. Your application suddenly becomes 1% slower. How angry is your CFO when he discovers how many new machines he needs to buy?
Jun 15 2016
Observer <here inter.net> writes:
On Wednesday, 15 June 2016 at 20:43:55 UTC, deadalnix wrote:
 Simple exercise. You have 100 000 servers. Your application 
 suddenly become 1% slower. How angry is your CFO when he 
 discovers how many new machines he needs to buy ?
Probably not too angry at all. This is still just a 1% budget increase, which amounts to a rounding error. Say those 100K servers cost $2K each, meaning $200M for the lot. An extra $2M in capital costs doesn't mean much in that context. Perhaps a bigger issue might be the ongoing extra cost for energy, which applies to all the machines, not just the new ones.

Look at it another way. Anyone running 100_000 machines will certainly not be running them all flat-out, to the point where a 1% increase would create a requirement for more machines. One needs extra capacity anyway to handle the usual surges in the volume of business being handled by the servers.

Look at it yet another way. Sure, $2M is a big number in absolute terms, for most of us. But if I were that CFO, instead of yelling about the problem, I'd go to the CTO and tell him to take 100 machines out of service and have the developers use them to profile the application and find places where much more than 1% can be saved.
Jun 18 2016
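The back-of-the-envelope numbers in the post above can be reproduced with a few lines of C++. The fleet size and unit price are the ones quoted in the post. `FleetCost` is a hypothetical helper, and the sketch makes the same simplifying assumption the post does: an n% slowdown needs n% more machines, with every machine already at full load.

```cpp
#include <cstdint>

// Figures from the post: 100,000 servers at $2,000 each.
struct FleetCost {
    std::int64_t servers;
    std::int64_t dollarsPerServer;

    // Capital cost of the whole fleet.
    std::int64_t total() const { return servers * dollarsPerServer; }

    // Extra machines needed to absorb a slowdown given in percent,
    // pessimistically assuming no spare capacity anywhere (rounded up).
    std::int64_t extraServers(std::int64_t slowdownPercent) const {
        return (servers * slowdownPercent + 99) / 100;
    }

    // Capital cost of those extra machines.
    std::int64_t extraCapital(std::int64_t slowdownPercent) const {
        return extraServers(slowdownPercent) * dollarsPerServer;
    }
};
```

With `FleetCost{100000, 2000}`, `total()` is $200,000,000, `extraServers(1)` is 1,000 machines and `extraCapital(1)` is $2,000,000, matching the figures in the post. The energy cost and spare-capacity arguments are not modelled.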
Edwin van Leeuwen <edder tkwsping.nl> writes:
On Wednesday, 15 June 2016 at 17:03:21 UTC, Konstantin wrote:
 On Wednesday, 15 June 2016 at 13:56:09 UTC, Jack Stouffer wrote:
 Has anyone thought about taking GC from .NET and reusing it 
 in D?
 Two words: write barriers.
 What about them? You mean not all D’s target platforms support them?
I think he meant that the .NET GC (and most GC designs) rely on write barriers, but D does not have write barriers, since D is meant to be a proper systems language.
Jun 15 2016
jmh530 <john.michael.hall gmail.com> writes:
On Wednesday, 15 June 2016 at 18:28:42 UTC, Edwin van Leeuwen 
wrote:
 I think he meant that the .NET GC (and most GC designs) rely on 
 write barriers, but D does not have write barriers, since D is 
 meant to be a proper systems language.
My reading of that LuaJIT GC document is that it requires write barriers, but that they are very cheap.
Jun 15 2016
ketmar <ketmar ketmar.no-ip.org> writes:
On Wednesday, 15 June 2016 at 18:48:28 UTC, jmh530 wrote:
 My reading of that LuaJIT GC document is that it requires write 
 barriers, but that they are very cheap.
...for a language that was originally VM-based. yet they'll have a noticeable impact on a language like D -- especially when the programmer wants to opt out of the GC.
Jun 15 2016
deadalnix <deadalnix gmail.com> writes:
On Wednesday, 15 June 2016 at 18:48:28 UTC, jmh530 wrote:
 On Wednesday, 15 June 2016 at 18:28:42 UTC, Edwin van Leeuwen 
 wrote:
 I think he meant that the .NET GC (and most GC designs) rely 
 on write barriers, but D does not have write barriers, since D 
 is meant to be a proper systems language.
 My reading of that LuaJIT GC document is that it requires write barriers, but that they are very cheap.
Problem is, in D, many people want to NOT use a GC, and this is something we want to support. These people also do NOT want to pay for write barriers they do not use.

That being said, we can do write barriers leveraging the MMU on immutable data (it'd be too expensive to do it on mutable data) during collection only, so that people who do not want to use the GC do not pay for it. The technique has been used successfully in the ML family of languages for ages now, with great results.

Generally, I think the right path forward for D's GC is not to emulate managed languages' GCs, as this clearly won't be acceptable for many users. On the other hand, we should:

1/ Leverage D's type system to get info about mutability/thread locality and segregate the heap accordingly/use an adapted technique for each.
2/ Make sure the GC can deliver malloc-grade performance in an alloc/free scenario, to enable a hybrid approach and allow people to rely on the GC to the extent they are willing to pay for.

jemalloc's internal data structures are very amenable to building a GC. I started using this approach in SDC's GC. The only thing preventing me from moving faster here is simply the time I can allocate to solving that problem.
Jun 15 2016
thedeemon <dlang thedeemon.com> writes:
On Wednesday, 15 June 2016 at 13:19:31 UTC, Konstantin wrote:

 Has anyone thought about taking GC from .NET and reusing it in 
 D?
One significant point has already been mentioned: the cost of write barriers. I'd like to mention another factor: the .NET GC is a copying one, it moves data around. One good feature of current D is that it never moves data, so you can very easily call C and C++ code and pass pointers to your buffers and stuff, and the C/C++ code just takes these pointers and works with them as usual. No pinning, no marshaling, zero overhead.

If you take a moving GC like .NET's, you immediately make all C/C++ interaction much harder: now you need to worry about pinning stuff or copying "managed" data to "unmanaged" memory and back. This is all costly both in terms of CPU cycles and of programmer cycles. You'll need an "FFI", which most other GC-ed languages have to have, and D doesn't.
Jun 16 2016
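Why a moving collector complicates passing pointers to C code, as the post above describes, can be shown with a toy model. This is a minimal C++ sketch under stated assumptions: `CompactingHeap` and its handle table are hypothetical and far simpler than any real collector. Objects are reached through handles, `compact()` slides live objects together, and a raw address taken before compaction no longer refers to the same object afterwards.

```cpp
#include <cstddef>
#include <vector>

// Marker for a handle whose object has been released.
constexpr std::size_t kDead = static_cast<std::size_t>(-1);

// Hypothetical toy "managed heap" of ints addressed through handles.
class CompactingHeap {
public:
    using Handle = std::size_t;

    // Capacity is reserved up front so that only compact() moves data
    // (as long as no more than `capacity` objects are allocated).
    explicit CompactingHeap(std::size_t capacity) { cells_.reserve(capacity); }

    Handle allocate(int value) {
        cells_.push_back(value);
        index_.push_back(cells_.size() - 1);
        return index_.size() - 1;
    }

    void release(Handle h) { index_[h] = kDead; }

    // Current address of a live object; only valid until the next compact().
    int* addressOf(Handle h) { return &cells_[index_[h]]; }

    // Slide live cells towards the start and fix up the handle table.
    void compact() {
        std::vector<std::size_t> ownerOfCell(cells_.size(), kDead);
        for (Handle h = 0; h < index_.size(); ++h)
            if (index_[h] != kDead) ownerOfCell[index_[h]] = h;

        std::size_t next = 0;
        for (std::size_t i = 0; i < cells_.size(); ++i) {
            if (ownerOfCell[i] == kDead) continue;  // garbage: skip it
            cells_[next] = cells_[i];               // move the live object
            index_[ownerOfCell[i]] = next;          // handle follows the move
            ++next;
        }
        cells_.resize(next);
    }

private:
    std::vector<int> cells_;          // object storage
    std::vector<std::size_t> index_;  // handle -> current cell
};
```

Code that goes through handles keeps working after `compact()`. Foreign code that stored the result of `addressOf` before the compaction is left with a stale address, which is why runtimes with moving collectors need pinning or copying at the language boundary. A non-moving collector avoids this, at the price of not being able to compact.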