
digitalmars.D - The "no gc" crowd

reply "ponce" <contact gam3sfrommars.fr> writes:
At least on Internet forums, there seems to be an entire category 
of people dismissing D immediately because it has a GC.

http://www.reddit.com/r/programming/comments/1nxs2i/the_state_of_rust_08/ccne46t
http://www.reddit.com/r/programming/comments/1nxs2i/the_state_of_rust_08/ccnddqd
http://www.reddit.com/r/programming/comments/1nsxaa/when_performance_matters_comparing_c_and_go/cclqbqw

The subject inevitably comes up in every reddit thread as if it 
were some kind of show-stopper.

Now I know first-hand how much work avoiding a GC can take 
(http://blog.gamesfrommars.fr/2011/01/25/optimizing-crajsh-part-2-2/).

Yet with D the situation is different and I feel that criticism 
is way overblown:
- first of all, few people will have problems with the GC in D at all
- then, minimizing allocations can usually solve most of the 
problems
- if it's still a problem, the GC can be completely disabled and the 
relevant language features avoided, and there will be no GC pause 
(see the sketch below)
- this work of avoiding allocations would happen anyway in a C++ 
codebase
- I happen to have a job with a hardcore optimized C++ 
codebase and couldn't care less that a GC would run, provided 
there is a way to minimize GC usage (and there is)
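
For the "completely disabled" point, here is a minimal sketch using 
the real druntime API in core.memory (criticalLoop is a made-up name):

    import core.memory : GC;

    void criticalLoop()
    {
        GC.disable();             // no collection cycle may start from here
        scope(exit) GC.enable();  // re-enabled even on early exit or throw

        // hot path: also avoid `new`, `~`, array literals and escaping
        // closures; with the GC disabled, allocations still grow the heap,
        // and an out-of-memory condition can still force a collection
    }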

Whatever rational rebuttal we have, it's never heard.
The long answer is that it's not a real problem. But it seems 
people want a short answer. It's also an annoying fight to have 
since so much of it is based on zero data.

Is there a plan to have a standard counter-attack to this kind of 
overblown problem?
It could be just a solid blog post or an @nogc feature.
Oct 08 2013
next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
Since right now I have more time than money, I've been 
thinking about doing a kickstarter or something with the goal of 
making the necessary changes so that going without the gc is easy:

1) make sure all allocations in phobos are explicit and avoidable 
where possible
2) have a flag where you can make gc allocations throw an assert 
error at runtime for debugging critical sections
3) maybe the @nogc thing people are talking about

and whatever else comes up.


While I tend to agree that the gc arguments don't carry much 
water (a much bigger problem IMO is how intertwined phobos is... 
and even that is meh much of the time), having all this set up 
would be a nice slam-dunk counter-argument: the stdlib does not 
require the gc and there are ways to avoid it pretty easily.


The downside I expect with the kickstarter thing though is most 
of the people complaining about this probably aren't willing to 
pitch in cash - they have little to gain from it since they 
aren't D users already. But it'd still be interesting to try.
Oct 08 2013
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Adam D. Ruppe:

 3) maybe the @nogc thing people are talking about
Some people have even suggested supporting @nogc or @noheap as module attributes (besides being usable as function attributes).

Bye,
bearophile
Oct 08 2013
prev sibling parent reply "Tourist" <gravatar gravatar.com> writes:
On Tuesday, 8 October 2013 at 15:53:47 UTC, Adam D. Ruppe wrote:
 2) have a flag where you can make gc allocations throw an 
 assert error at runtime for debugging critical sections
Why handle it at runtime and not at compile time?
Oct 08 2013
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 8 October 2013 at 16:02:05 UTC, Tourist wrote:
 On Tuesday, 8 October 2013 at 15:53:47 UTC, Adam D. Ruppe wrote:
 2) have a flag where you can make gc allocations throw an 
 assert error at runtime for debugging critical sections
Why handle it at runtime and not at compile time?
One is I can implement a runtime check pretty easily, so it'd just be the first step because it would go quickly.

The other reason though is that D doesn't really have an attributed section. You can't do:

void foo() {
    // stuff

    @nogc {
        // your performance sensitive loop code
    }
}

But on the function level, that could be done at compile time if implemented.
Oct 08 2013
next sibling parent "Tourist" <gravatar gravatar.com> writes:
On Tuesday, 8 October 2013 at 16:18:50 UTC, Adam D. Ruppe wrote:
 On Tuesday, 8 October 2013 at 16:02:05 UTC, Tourist wrote:
 On Tuesday, 8 October 2013 at 15:53:47 UTC, Adam D. Ruppe 
 wrote:
 2) have a flag where you can make gc allocations throw an 
 assert error at runtime for debugging critical sections
Why handle it at runtime and not at compile time?
 One is I can implement a runtime check pretty easily, so it'd just be the first step because it would go quickly.

 The other reason though is that D doesn't really have an attributed section. You can't do:

 void foo() {
     // stuff

     @nogc {
         // your performance sensitive loop code
     }
 }

 But on the function level, that could be done at compile time if implemented.
IMO it would be much more effective if it were handled at compile time, both saving the dev's time and guaranteeing that there are no unwanted assert(0)'s lurking around in untested corners.
Oct 08 2013
prev sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Tuesday, 8 October 2013 at 16:18:50 UTC, Adam D. Ruppe wrote:
 One is I can implement a runtime check pretty easily so it'd 
 just be the first step because it would go quickly.
Runtime check is almost useless for this.
Oct 08 2013
parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 8 October 2013 at 16:24:05 UTC, Dicebot wrote:
 Runtime check is almost useless for this.
They're how I do most of the allocation checks now, and they could also be put in unit tests. But I agree the ideal would be a compile time check.

The way I want it to work is to define a new thing in the language, maybe via __traits, that can recursively check the call graph of a function for the presence of any UDA. Then we attach @gc to the druntime functions that use them, starting at gc_malloc and right up into aaDup, d_newclass, whatever concat is called. Ensure that the compiler considers these implicit function calls the same as any others and propagates the druntime UDA through the recursive trait.

Then we define @nogc to just be static assert(!hasAttribute!(gc, __traits(getAttributesRecursively, FUNCTION))). Maybe do a compiler or druntime change so something can be expanded to a template instantiation too, so this doesn't even have to be a specific builtin. (Though I've actually been able to do this by hacking druntime before.)

But whatever, that specific impl doesn't matter. Bottom line: @nogc works after just marking the core functions in druntime that may allocate, and the very same functionality can be used for other things too, like @noheap. Mark malloc with a UDA, then define @noheap as static assert(!gc && !malloc).

Then we have compile time checking, at least on the function level.
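
The non-recursive half of this already works in today's D. A minimal sketch, assuming a hypothetical gcAlloc marker UDA (the recursive call-graph walk is the part that would need compiler support):

    import std.meta : anySatisfy;

    struct gcAlloc {}  // hypothetical marker: "this function may GC-allocate"

    enum isGcMarker(alias A) = is(A == gcAlloc);

    // Direct (non-recursive) check of one function's attributes.
    enum mayAllocate(alias fn) =
        anySatisfy!(isGcMarker, __traits(getAttributes, fn));

    @gcAlloc void concatHelper() {}  // stand-in for a druntime hook
    void pureCompute() {}

    static assert(mayAllocate!concatHelper);
    static assert(!mayAllocate!pureCompute);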
Oct 08 2013
prev sibling next sibling parent reply "qznc" <qznc web.de> writes:
On Tuesday, 8 October 2013 at 15:43:46 UTC, ponce wrote:
 Whatever rational rebuttal we have, it's never heard.
 The long answer is that it's not a real problem. But it seems 
 people want a short answer. It's also an annoying fight to have 
 since so much of it is based on zero data.
Do not forget the followup: without GC you cannot use the standard library. I am not sure if there is a really good counter.

Some people think that a GC will never ever be possible in their domain. Either they are right, or no rational argument will help. Only a small percentage will read the reasonable long argumentation.

Maybe a killer app: something hard realtime and high performance packed into a commercially successful product of a well known company.
Oct 08 2013
parent Paulo Pinto <pjmlp progtools.org> writes:
Am 08.10.2013 17:55, schrieb qznc:
 On Tuesday, 8 October 2013 at 15:43:46 UTC, ponce wrote:
 Whatever rational rebuttal we have, it's never heard.
 The long answer is that it's not a real problem. But it seems people
 want a short answer. It's also an annoying fight to have since so much
 of it is based on zero data.

 Do not forget the followup: without GC you cannot use the standard library. I am not sure if there is a really good counter.

 Some people think that a GC will never ever be possible in their domain. Either they are right, or no rational argument will help. Only a small percentage will read the reasonable long argumentation.

 Maybe a killer app: something hard realtime and high performance packed into a commercially successful product of a well known company.
Having been a user of the Native Oberon OS in the 90's, I tend to think it is more of a mental barrier than anything else.

That said, in Oberon it is easier to control garbage than in D, as all GC-related allocations are done via NEW or string manipulations; there are no implicit allocations. Manual memory management is also available via the SYSTEM module.

--
Paulo
Oct 08 2013
prev sibling next sibling parent "evilrat" <evilrat666 gmail.com> writes:
On Tuesday, 8 October 2013 at 15:43:46 UTC, ponce wrote:
 Yet with D the situation is different and I feel that criticism 
 is way overblown:
 - first of all, few people will have problems with GC in D at 
 all
 - then minimizing allocations can usually solve most of the 
 problems
 - if it's still a problem, the GC can be completely disabled, 
 relevant language features avoided, and there will be no GC 
 pause
 - this work of avoiding allocations would happen anyway in a 
 C++ codebase
 - I happen to have a job with some hardcore optimized C++ 
 codebase and couldn't care less that a GC would run provided 
 there is a way to minimize GC usage (and there is)

 Whatever rational rebuttal we have, it's never heard.
 The long answer is that it's not a real problem. But it seems 
 people want a short answer. It's also an annoying fight to have 
 since so much of it is based on zero data.

 Is there a plan to have a standard counter-attack to this kind 
 of overblown problem?
 It could be just a solid blog post or an @nogc feature.
implementations), maybe that's just because I don't have such high memory/performance constraints, but still I definitely would start with GC and then work on minimizing allocations... just because it allows focusing on code more than memory management, and thus should give much better productivity and quality.
Oct 08 2013
prev sibling next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Tuesday, 8 October 2013 at 15:43:46 UTC, ponce wrote:
 Is there a plan to have a standard counter-attack to this kind 
 of overblown problem?
 It could be just a solid blog post or an @nogc feature.
It is not overblown. It is simply that "@nogc" is lacking but absolutely mandatory. The amount of hidden language allocations makes manually cleaning code of those via runtime asserts completely unreasonable for a real project.
Oct 08 2013
next sibling parent reply "ponce" <contact gam3sfrommars.fr> writes:
On Tuesday, 8 October 2013 at 16:22:25 UTC, Dicebot wrote:
 It is not overblown. It is simply that "@nogc" is lacking but 
 absolutely mandatory. The amount of hidden language allocations 
 makes manually cleaning code of those via runtime asserts 
 completely unreasonable for a real project.
Hidden language allocations:
- concatenation operator ~
- homogeneous variadic arguments: void foo(T[] args...)
- "real" closures that escape
- array literals
- some phobos calls

What else am I missing?

I don't see the big problem, and only a small fraction of projects will require a complete ban on GC allocation, right?
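
For illustration, a small program where every flagged line GC-allocates even though no `new` appears (names are made up for the example):

    void main()
    {
        import std.stdio : writeln;

        string a = "foo", b = "bar";
        string c = a ~ b;          // concatenation allocates a new array

        int n = 3;
        int[] arr = [1, 2, n];     // non-constant array literal allocates

        int captured = 10;
        int delegate() dg = () => ++captured; // capturing lambda is lowered
                                              // to a GC-allocated closure
        writeln(c, " ", arr, " ", dg());
    }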
Oct 08 2013
next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Tuesday, 8 October 2013 at 16:29:38 UTC, ponce wrote:
 Hidden language allocations:
 - concatenation operator ~
 - homogeneous variadic arguments: void foo(T[] args...)
 - "real" closures that escape
 - array literals
 - some phobos calls

 What else am I missing?
 I don't see the big problem, and a small fraction of projects 
 will require a complete ban on GC allocation, right?
Should be all I am aware of (though closures sometimes allocate even without escaping, AFAIK). This is more than enough.

Imagine stuff like vibe.d - for proper performance you don't want to make any allocations during request handling. Neither GC, nor malloc. It is still perfectly fine to run GC in background (well, assuming we will get a concurrent GC one day) for some persistent tasks, but how are you going to verify your request handling is clean? By tracking mentions of array literals in random places by hand? During every single pull review?

I have said on this topic several times - it does not matter what is _possible_ to do with the D memory model. It does matter what is _convenient_ to do. If something is possible but needs more attention than in C++, it will be considered by the crowd as impossible and no blog posts will change that.

(loud shouting "@noheap, @noheap, @noheap!")
Oct 08 2013
next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 8 October 2013 at 17:00:35 UTC, Dicebot wrote:
 Imagine stuff like vibe.d - for proper performance you don't 
 want to make any allocations during request handling.
That brings up another interesting advantage to my extensible scheme: we could also define @blocking in the library to put on I/O calls, and then vibe.d does a check for it and complains if you called one.

Though getting this kind of coverage would be hard; a third party lib might not use it, and then the custom check would miss the problem. But, in general, I'm sure if we had the capability, uses would come up beyond @nogc/@noheap.
 I have said on this topic several times - it does not matter 
 what is _possible_ to do with D memory model. It does matter 
 what is _convenient_ to do.
And, of course, "my function didn't compile because phobos uses the gc" is hardly convenient unless phobos offers the functionality (where possible) without allocating.

I think output ranges offer a pretty good solution here for a lot of cases, and could be added without breaking the current interfaces - just keep the functions that return new strings as alternatives, implemented in terms of the base output range function.
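
A minimal sketch of that split (formatGreeting is a made-up example, not an actual phobos function): the core writes into any output range the caller provides, and the allocating convenience overload is layered on top.

    import std.array : appender;
    import std.range.primitives : put;

    // Core: no allocation here; the caller owns the sink.
    void formatGreeting(Out)(ref Out sink, string name)
    {
        put(sink, "hello, ");
        put(sink, name);
    }

    // Convenience overload that allocates, implemented via the core.
    string formatGreeting(string name)
    {
        auto app = appender!string();
        formatGreeting(app, name);
        return app.data;
    }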
Oct 08 2013
parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 10/8/13, Adam D. Ruppe <destructionator gmail.com> wrote:
 That brings up another interesting advantage to my extensible
 scheme: we could also define @blocking in the library to put on
 I/O calls and then vibe.d does a check for it and complains if
 you called one.
Kind of relevant, I've recently filed this: http://d.puremagic.com/issues/show_bug.cgi?id=10979
Oct 08 2013
prev sibling next sibling parent "Elvis Zhou" <elvis.x.zhou gmail.com> writes:
On Tuesday, 8 October 2013 at 17:00:35 UTC, Dicebot wrote:
 On Tuesday, 8 October 2013 at 16:29:38 UTC, ponce wrote:
 Hidden language allocations:
 - concatenation operator ~
 - homogeneous variadic arguments: void foo(T[] args...)
 - "real" closures that escape
 - array literals
 - some phobos calls

 What else am I missing?
 I don't see the big problem, and a small fraction of projects 
 will require a complete ban on GC allocation, right?
 Should be all I am aware of (though closures sometimes allocate even without escaping, AFAIK). This is more than enough.

 Imagine stuff like vibe.d - for proper performance you don't want to make any allocations during request handling. Neither GC, nor malloc. It is still perfectly fine to run GC in background (well, assuming we will get a concurrent GC one day) for some persistent tasks, but how are you going to verify your request handling is clean? By tracking mentions of array literals in random places by hand? During every single pull review?

 I have said on this topic several times - it does not matter what is _possible_ to do with the D memory model. It does matter what is _convenient_ to do. If something is possible but needs more attention than in C++, it will be considered by the crowd as impossible and no blog posts will change that.

 (loud shouting "@noheap, @noheap, @noheap!")
+1
Oct 08 2013
prev sibling next sibling parent reply "Araq" <rumpf_a web.de> writes:
 Imagine stuff like vibe.d - for proper performance you don't 
 want to make any allocations during request handling. Neither 
 GC, nor malloc.
This is absurd. O(1) malloc implementations exist; it is a solved problem (http://www.gii.upv.es/tlsf/). TLSF executes a maximum of 168 processor instructions on an x86 architecture.

Saying that you can't use that during request handling is like saying that you can't afford a cache miss.
Oct 08 2013
parent reply "Dicebot" <public dicebot.lv> writes:
On Tuesday, 8 October 2013 at 17:55:33 UTC, Araq wrote:
 O(1) malloc implementations exist, it is a solved problem. 
 (http://www.gii.upv.es/tlsf/)
custom allocator != generic malloc

In such conditions you almost always want to use an incremental region allocator anyway. The problem is hidden automatic allocation.
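
A minimal sketch of such a region allocator, assuming a fixed caller-supplied buffer and wholesale reset instead of per-allocation free:

    struct Region
    {
        ubyte[] buf;   // caller-supplied backing storage
        size_t used;

        // O(1): bump a pointer; null when the region is exhausted.
        void[] allocate(size_t n)
        {
            immutable start = (used + 15) & ~cast(size_t) 15; // 16-byte align
            if (start + n > buf.length)
                return null;
            used = start + n;
            return buf[start .. used];
        }

        // Reclaim everything at once, e.g. at the end of a request.
        void reset() { used = 0; }
    }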
 TLSF executes a maximum of 168 processor instructions in a x86 
 architecture. Saying that you can't use that during request 
 handling is like saying that you can't afford a cache miss.
Some time ago I was working on a networking project where the request context was specifically designed to fit in a single cache line, and breaking this immediately resulted in a 30-40% performance penalty. There is nothing crazy about saying you can't afford an extra cache miss. It is just not that common. Same goes for avoiding heap allocations (but that is much more common).
Oct 08 2013
parent reply Paulo Pinto <pjmlp progtools.org> writes:
Am 08.10.2013 22:39, schrieb Dicebot:
 On Tuesday, 8 October 2013 at 17:55:33 UTC, Araq wrote:
 O(1) malloc implementations exist, it is a solved problem.
 (http://www.gii.upv.es/tlsf/)
 custom allocator != generic malloc

 In such conditions you almost always want to use an incremental region allocator anyway. The problem is hidden automatic allocation.
 TLSF executes a maximum of 168 processor instructions in a x86
 architecture. Saying that you can't use that during request handling
 is like saying that you can't afford a cache miss.
 Some time ago I was working on a networking project where the request context was specifically designed to fit in a single cache line, and breaking this immediately resulted in a 30-40% performance penalty. There is nothing crazy about saying you can't afford an extra cache miss. It is just not that common. Same goes for avoiding heap allocations (but that is much more common).
How did you manage to keep the request size portable across processors/motherboards?

Was the hardware design fixed?

--
Paulo
Oct 08 2013
parent "Dicebot" <public dicebot.lv> writes:
On Tuesday, 8 October 2013 at 20:55:39 UTC, Paulo Pinto wrote:
 How did you manage to keep the request size portable across 
 processors/motherboards?

 Was the hardware design fixed?
Yes, it was a tightly coupled h/w + s/w solution sold as a whole, and portability was out of the question. I am still under NDA for a few more years though, so I can't really tell the most interesting stuff.

(I was speaking about the request context struct though, not the request data itself)
Oct 08 2013
prev sibling parent reply Brad Roberts <braddr puremagic.com> writes:
On 10/8/13 10:00 AM, Dicebot wrote:
 proper performance
I apologize for picking out your post, Dicebot, as the illustrative example, but I see this pop up in various discussions and I've been meaning to comment on it for a while.

Please stop using words like 'proper', 'real', and other similar terms to describe a requirement. It's a horrible specifier and adds no useful detail. It tends to needlessly set up the conversation as confrontational or adversarial, and implies that anyone who disagrees is wrong or not working on a real system. There are lots of cases where pushing to the very bleeding edge isn't actually required.

Thanks,
Brad
Oct 08 2013
parent reply "Dicebot" <public dicebot.lv> writes:
On Tuesday, 8 October 2013 at 19:52:32 UTC, Brad Roberts wrote:
 On 10/8/13 10:00 AM, Dicebot wrote:
 proper performance
 I apologize for picking out your post, Dicebot, as the illustrative example, but I see this pop up in various discussions and I've been meaning to comment on it for a while.

 Please stop using words like 'proper', 'real', and other similar terms to describe a requirement. It's a horrible specifier and adds no useful detail. It tends to needlessly set up the conversation as confrontational or adversarial, and implies that anyone who disagrees is wrong or not working on a real system. There are lots of cases where pushing to the very bleeding edge isn't actually required.

 Thanks,
 Brad
What wording would you suggest to use? For me "proper" is pretty much equal to "meeting requirements / expectations as defined by similar projects written in C". It has nothing to do with "real" vs "toy" projects; it just implies that in some domains such expectations are more restrictive.
Oct 08 2013
parent Brad Roberts <braddr puremagic.com> writes:
On 10/8/13 1:41 PM, Dicebot wrote:
 On Tuesday, 8 October 2013 at 19:52:32 UTC, Brad Roberts wrote:
 On 10/8/13 10:00 AM, Dicebot wrote:
 proper performance
 I apologize for picking out your post, Dicebot, as the illustrative example, but I see this pop up in various discussions and I've been meaning to comment on it for a while.

 Please stop using words like 'proper', 'real', and other similar terms to describe a requirement. It's a horrible specifier and adds no useful detail. It tends to needlessly set up the conversation as confrontational or adversarial, and implies that anyone who disagrees is wrong or not working on a real system. There are lots of cases where pushing to the very bleeding edge isn't actually required.

 Thanks,
 Brad
 What wording would you suggest to use? For me "proper" is pretty much equal to "meeting requirements / expectations as defined by similar projects written in C". It has nothing to do with "real" vs "toy" projects; it just implies that in some domains such expectations are more restrictive.
Looking at the context you used it in: defining the performance of vibe and requiring that it never allocate during page processing to get 'proper performance'. Would you agree that, at least in this case and with that definition, it's an over-reach? That there are likely to be far more applications where allocation is perfectly acceptable, and quite possibly even required to achieve the goals of the application, than not? Yes, there are some where the performance or latency requirements are so high that allocations have to be avoided to achieve them, but that's more the exception than the rule. For those applications 'proper performance' allows a far greater latitude of implementation choices.

It's applying your, unspecified, requirements as the definition of valid. The biggest place it irks me isn't actually this type of case but rather when applied to docs or the language spec.
Oct 08 2013
prev sibling next sibling parent reply "Brad Anderson" <eco gnuk.net> writes:
On Tuesday, 8 October 2013 at 16:29:38 UTC, ponce wrote:
 On Tuesday, 8 October 2013 at 16:22:25 UTC, Dicebot wrote:
 It is not overblown. It is simply that "@nogc" is lacking but 
 absolutely mandatory. The amount of hidden language allocations 
 makes manually cleaning code of those via runtime asserts 
 completely unreasonable for a real project.
 Hidden language allocations:
 - concatenation operator ~
 - homogeneous variadic arguments: void foo(T[] args...)
 - "real" closures that escape
 - array literals
 - some phobos calls

 What else am I missing?

 I don't see the big problem, and only a small fraction of projects will require a complete ban on GC allocation, right?
Johannes Pfau's -vgc pull request[1] had a list of ones he was able to find. It's all allocations, not just hidden allocations:

COV          // Code coverage enabled
NEW          // User called new (and it's not placement new)
ASSERT_USER  // A call to assert. This usually throws, but can be overwritten by user
SWITCH_USER  // Called on switch error. This usually throws, but can be overwritten by user
HIDDEN_USER  // Called on hidden function error. This usually throws, but can be overwritten by user
CONCAT       // a ~ b
ARRAY        // array.length = value, literal, .dup, .idup, .sort
APPEND       // a ~= b
AALITERAL    // ["a":1]
CLOSURE

1. https://github.com/D-Programming-Language/dmd/pull/1886
Oct 08 2013
parent "Don" <x nospam.com> writes:
On Tuesday, 8 October 2013 at 17:47:54 UTC, Brad Anderson wrote:
 On Tuesday, 8 October 2013 at 16:29:38 UTC, ponce wrote:
 On Tuesday, 8 October 2013 at 16:22:25 UTC, Dicebot wrote:
 It is not overblown. It is simply that "@nogc" is lacking 
 but absolutely mandatory. The amount of hidden language 
 allocations makes manually cleaning code of those via runtime 
 asserts completely unreasonable for a real project.
 Hidden language allocations:
 - concatenation operator ~
 - homogeneous variadic arguments: void foo(T[] args...)
 - "real" closures that escape
 - array literals
 - some phobos calls

 What else am I missing?

 I don't see the big problem, and only a small fraction of projects 
 will require a complete ban on GC allocation, right?
 Johannes Pfau's -vgc pull request[1] had a list of ones he was able to find. It's all allocations, not just hidden allocations:

 COV          // Code coverage enabled
 NEW          // User called new (and it's not placement new)
 ASSERT_USER  // A call to assert. This usually throws, but can be overwritten by user
 SWITCH_USER  // Called on switch error. This usually throws, but can be overwritten by user
 HIDDEN_USER  // Called on hidden function error. This usually throws, but can be overwritten by user
 CONCAT       // a ~ b
 ARRAY        // array.length = value, literal, .dup, .idup, .sort
 APPEND       // a ~= b
 AALITERAL    // ["a":1]
 CLOSURE

 1. https://github.com/D-Programming-Language/dmd/pull/1886
The closure one is a problem. I think that returning a closure should use a different syntax from using a normal delegate. I doubt it's something you _ever_ want to do by accident.

It's a problem because you can't see at a glance if a function uses a closure or not. You have to inspect the entire function very carefully, checking all code paths.
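
For instance (makeCounter is a made-up name):

    int delegate() makeCounter()
    {
        int n;             // looks like a stack variable...
        return () => ++n;  // ...but the escaping closure silently moves it
                           // to a GC-allocated frame
    }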
Oct 09 2013
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, October 08, 2013 18:29:36 ponce wrote:
 On Tuesday, 8 October 2013 at 16:22:25 UTC, Dicebot wrote:
 It is not overblown. It is simply that "@nogc" is lacking but
 absolutely mandatory. The amount of hidden language allocations
 makes manually cleaning code of those via runtime asserts
 completely unreasonable for a real project.
 Hidden language allocations:
 - concatenation operator ~
 - homogeneous variadic arguments: void foo(T[] args...)
 - "real" closures that escape
 - array literals
 - some phobos calls

 What else am I missing?

 I don't see the big problem, and only a small fraction of projects will require a complete ban on GC allocation, right?
I think that it's clear that for some projects, it's critical to minimize the GC, and I think that it's clear that we need to do a better job of supporting the folks who want to minimize GC usage, but I also think that for the vast majority of cases, complaints about the GC are way overblown. It becomes an issue when you're doing a lot of heap allocations, but it's frequently easy to design D code so that heap allocations are relatively rare, such that they aren't going to be a serious problem outside of code which is performance critical to the point that it would be worrying about the cost of malloc (which most code isn't). Personally, the only time that I've run into issues with the GC is when trying to use RedBlackTree with a lot of items. That has a tendency to tank performance.

So, yes, it's a problem. Yes, we need to improve the situation. But for most situations, I think that the concern about the GC is way overblown.

- Jonathan M Davis
Oct 08 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/8/2013 12:34 PM, Jonathan M Davis wrote:
 I think that it's clear that for some projects, it's critical to minimize the
 GC, and I think that it's clear that we need to do a better job of supporting
 the folks who want to minimize GC usage, but I also think that for the vast
 majority of cases, complaints about the GC are way overblown. It becomes an
 issue when you're doing a lot of heap allocations, but it's frequently easy to
 design D code so that heap allocations are relatively rare such that they
 aren't going to be a serious problem outside of code which is performance
 critical to the point that it would be worrying about the cost of malloc
 (which most code isn't). Personally, the only time that I've run into issues
 with the GC is when trying to use RedBlackTree with a lot of items. That
 has a tendency to tank performance.

 So, yes, it's a problem. Yes, we need to improve the situation. But for most
 situations, I think that the concern about the GC is way overblown.
+1

Some years ago, a colleague of mine moonlighted teaching remedial algebra at the UW. She'd write on the board:

    x + 2 = 5

and call on a student to "solve for x". The student would collapse in a stuttering heap of jelly, emitting sparks and smoke like a Star Trek computer.

She discovered that if she wrote instead:

    _ + 2 = 5

and would ask the same student what goes in the blank spot, he'd say "3" without hesitation. In other words, the student would only see the words "solve", "x", and "algebra" which were a shortcut in his brain to "I can't do this" and "gee math is hard." She found she was a far more effective teacher by avoiding using those words.

I realized the same thing was happening with the word "template". I talked Andrei into avoiding all use of that word in "The D Programming Language", figuring that we could get people who were terrified of "templates" to use them successfully without realizing it (and I think this was very successful).

We have a similar problem with "GC". People hear that word, and they are instantly turned off. No amount of education will change that. We simply have to find a better way to deal with this issue.
Oct 08 2013
next sibling parent reply "Brad Anderson" <eco gnuk.net> writes:
On Tuesday, 8 October 2013 at 23:05:37 UTC, Walter Bright wrote:
 We have a similar problem with "GC". People hear that word, and 
 they are instantly turned off. No amount of education will 
 change that. We simply have to find a better way to deal with 
 this issue.
Time to replace the Garbage Collector with a Memory Recycler.
Oct 08 2013
next sibling parent reply "ponce" <contact gmsfrommars.fr> writes:
On Tuesday, 8 October 2013 at 23:32:51 UTC, Brad Anderson wrote:
 On Tuesday, 8 October 2013 at 23:05:37 UTC, Walter Bright wrote:
 We have a similar problem with "GC". People hear that word, 
 and they are instantly turned off. No amount of education will 
 change that. We simply have to find a better way to deal with 
 this issue.
Time to replace the Garbage Collector with a Memory Recycler.
Resource Guard? Area cleaner? Or the Graph Inspector (tm).
Oct 08 2013
parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 9 October 2013 at 00:00:09 UTC, ponce wrote:
 Resource Guard?
Actually, I've been coming to see the gc as an implementation detail for immutability, with memory safety as a secondary piece.

With immutable, you are guaranteeing that the contents never change. Never change means the memory must never be reused either - thus never freed. But we want to reclaim that resource if we can. The GC lets us do that without ever breaking the guarantee of immutability.

To this end, I really, really want to see the scope parameter (and return value!) thing implemented. All mutable and const slices and pointers would be assumed to be scope (or, if this breaks too much code, strongly recommended to be scope at all times). That way, you won't let the reference escape, meaning it is safe to manage those resources with destructors on the outer layer, while using them for cheap on inner layers by just passing the pointer, which can even be implicit, iff scope. Those wouldn't need the gc.

Immutable resources, however, are always safe to store, so scope immutable would be irrelevant. This illusion is maintained with the gc.
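
A minimal sketch of the scope idea (roughly the direction D later explored; sum is a made-up example):

    // `scope` promises the slice does not escape `sum`, so the caller is
    // free to back `xs` with stack or manually managed memory.
    @safe int sum(scope const(int)[] xs)
    {
        int total;
        foreach (x; xs)
            total += x;
        return total;
    }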
Oct 08 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/8/2013 4:32 PM, Brad Anderson wrote:
 Time to replace the Garbage Collector with a Memory Recycler.
"Soylent Green" ?
Oct 08 2013
parent reply "Brad Anderson" <eco gnuk.net> writes:
On Wednesday, 9 October 2013 at 00:01:30 UTC, Walter Bright wrote:
 On 10/8/2013 4:32 PM, Brad Anderson wrote:
 Time to replace the Garbage Collector with a Memory Recycler.
"Soylent Green" ?
"You've got to tell them... It's bar...foo is made out of bar"
Oct 08 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/8/2013 5:49 PM, Brad Anderson wrote:
 On Wednesday, 9 October 2013 at 00:01:30 UTC, Walter Bright wrote:
 On 10/8/2013 4:32 PM, Brad Anderson wrote:
 Time to replace the Garbage Collector with a Memory Recycler.
"Soylent Green" ?
"You've got to tell them... It's bar...foo is made out of bar"
That so-called "new" memory you've been promised ... it's been used before!
Oct 08 2013
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 9 October 2013 09:05, Walter Bright <newshound2 digitalmars.com> wrote:

 On 10/8/2013 12:34 PM, Jonathan M Davis wrote:

 I think that it's clear that for some projects, it's critical to minimize
 the
 GC, and I think that it's clear that we need to do a better job of
 supporting
 the folks who want to minimize GC usage, but I also think that for the
 vast
 majority of cases, complaints about the GC are way overblown. It becomes
 an
 issue when you're doing a lot of heap allocations, but it's frequently
 easy to
 design D code so that heap allocations are relatively rare such that they
 aren't going to be a serious problem outside of code which is performance
 critical to the point that it would be worrying about the cost of malloc
 (which most code isn't). Personally, the only time that I've run into
 issues
  with the GC is when trying to use RedBlackTree with a lot of items.
  That
  has a tendency to tank performance.

  So, yes, it's a problem. Yes, we need to improve the situation. But for most
 situations, I think that the concern about the GC is way overblown.
+1

Some years ago, a colleague of mine moonlighted teaching remedial algebra at the UW. She'd write on the board:

    x + 2 = 5

and call on a student to "solve for x". The student would collapse in a stuttering heap of jelly, emitting sparks and smoke like a Star Trek computer.

She discovered that if she wrote instead:

    _ + 2 = 5

and would ask the same student what goes in the blank spot, he'd say "3" without hesitation. In other words, the student would only see the words "solve", "x", and "algebra" which were a shortcut in his brain to "I can't do this" and "gee math is hard." She found she was a far more effective teacher by avoiding using those words.

I realized the same thing was happening with the word "template". I talked Andrei into avoiding all use of that word in "The D Programming Language", figuring that we could get people who were terrified of "templates" to use them successfully without realizing it (and I think this was very successful).

We have a similar problem with "GC". People hear that word, and they are instantly turned off. No amount of education will change that. We simply have to find a better way to deal with this issue.
I think there is certainly an element of this. When someone identifies something in their code that consistently causes issues, they tend to ban it, rather than try and understand it better. I will admit to being absolutely guilty of this in the past myself. We banned STL and the template keyword outright (not so uncommon in the games industry).

But the problem is not so simple as taking the time to understand something thoroughly... People often forget that programmers work in large teams, and typically, about 30-40% of those programmers are junior, and a further 30-50% are just average-joe programmers for whom it's a day job, and who don't really give a shit.

The take-away is, if a couple of guys grapple with something that's a consistent problem and find a very specific (possibly obscure) usage that doesn't violate their criteria, they then have a communication issue with a whole bunch of other inexperienced or apathetic programmers, which, I can assure you, is an endless up-hill battle. It's usually easier to ban it, and offer a slightly less pleasant but reliable alternative instead. Unless you feel it's a good use of time and money to have your few best senior programmers trawling through code finding violations of standard practise and making trivial (but hard to find/time consuming) fixes on build nights.

The (existing) GC definitely fits this description in my mind.
Oct 08 2013
prev sibling parent reply "Elvis Zhou" <elvis.x.zhou gmail.com> writes:
On Tuesday, 8 October 2013 at 23:05:37 UTC, Walter Bright wrote:
 On 10/8/2013 12:34 PM, Jonathan M Davis wrote:
 I think that it's clear that for some projects, it's critical 
 to minimize the
 GC, and I think that it's clear that we need to do a better 
 job of supporting
 the folks who want to minimize GC usage, but I also think that 
 for the vast
 majority of cases, complaints about the GC are way overblown. 
 It becomes an
 issue when you're doing a lot of heap allocations, but it's 
 frequently easy to
 design D code so that heap allocations are relatively rare 
 such that they
 aren't going to be a serious problem outside of code which is 
 performance
 critical to the point that it would be worrying about the cost 
 of malloc
 (which most code isn't). Personally, the only time that I've 
 run into issues
  with the GC is when trying to use RedBlackTree with a lot 
  of items. That
  has a tendency to tank performance.

  So, yes, it's a problem. Yes, we need to improve the situation. 
 But for most
 situations, I think that the concern about the GC is way 
 overblown.
 +1

 Some years ago, a colleague of mine moonlighted teaching remedial algebra at the UW. She'd write on the board:

     x + 2 = 5

 and call on a student to "solve for x". The student would collapse in a stuttering heap of jelly, emitting sparks and smoke like a Star Trek computer.

 She discovered that if she wrote instead:

     _ + 2 = 5

 and would ask the same student what goes in the blank spot, he'd say "3" without hesitation. In other words, the student would only see the words "solve", "x", and "algebra" which were a shortcut in his brain to "I can't do this" and "gee math is hard." She found she was a far more effective teacher by avoiding using those words.

 I realized the same thing was happening with the word "template". I talked Andrei into avoiding all use of that word in "The D Programming Language", figuring that we could get people who were terrified of "templates" to use them successfully without realizing it (and I think this was very successful).

 We have a similar problem with "GC". People hear that word, and they are instantly turned off. No amount of education will change that. We simply have to find a better way to deal with this issue.
You remind me of a well-known Chinese fable.

At the time when Fan, a nobleman of the state of Jin, became a fugitive, a commoner found a bell and wanted to carry it off on his back. But the bell was too big for him. When he tried to knock it into pieces with a hammer there was a loud clanging sound. He was afraid that someone would hear the noise and take the bell from him, so he immediately stopped his own ears.

To worry about other people hearing the noise is understandable, but to worry about himself hearing the noise (as if stopping his own ears would prevent other people from hearing) is absurd.
Oct 09 2013
parent "Froglegs" <barf barf.com> writes:
GC works for some cases, but the global one-size-fits-all GC 
that D uses is no good.
Oct 09 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/8/13 9:29 AM, ponce wrote:
 On Tuesday, 8 October 2013 at 16:22:25 UTC, Dicebot wrote:
 It is not overblown. It is simply that "@nogc" is lacking but
 absolutely mandatory. The amount of hidden language allocations makes
 manually cleaning code of those via runtime asserts completely
 unreasonable for a real project.
 Hidden language allocations:
 - concatenation operator ~
 - homogeneous variadic arguments: void foo(T[] args...)
 - "real" closures that escape
 - array literals
 - some phobos calls

 What else am I missing?

 I don't see the big problem, and only a small fraction of projects will require a complete ban on GC allocation, right?
It's clear that the perception of GC will not change soon, however good or not the arguments may be as applied to various situations and projects. It is also a reality that our GC is slow.

So we need to attack this problem from multiple angles:

* Make RefCounted work as immutable data. There will be a casting away of immutability in there, but since it's under the aegis of the standard library, it is admissible. All implementations must ensure RefCounted works.

* Add reference counted slice and associative array types that are as close as humanly possible to the built-in ones.

* Advertise all of the above in a top module such as std.refcounted. It's amazing how many D programmers have no idea RefCounted even exists.

* Review all of Phobos for hidden allocations and make appropriate decisions on how to give the user more control over allocation. I have some ideas on how that can be done, at various disruption costs.

* Get Robert Schadek's precise GC in. Walter and I have become 101% convinced a precise GC is the one way to go about GC.

I'm up to my neck in D work for Facebook so my time is well spent. We must definitely up the ante in terms of development speed, and pronto.


Andrei
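
On the RefCounted point, a quick usage sketch of the existing std.typecons.RefCounted (Payload is a made-up type; the payload lives on the malloc heap, not the GC heap):

    import std.typecons : RefCounted;

    struct Payload { int value; }

    void main()
    {
        auto a = RefCounted!Payload(42); // payload malloc'ed, count = 1
        auto b = a;                      // count = 2; no GC involved
        b.value = 7;                     // both handles share one payload
        assert(a.value == 7);
    } // count drops to 0: payload destroyed deterministically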
Oct 08 2013
next sibling parent "Brad Anderson" <eco gnuk.net> writes:
On Wednesday, 9 October 2013 at 02:22:35 UTC, Andrei Alexandrescu 
wrote:
 * Advertise all of the above in a top module such as 
 std.refcounted. It's amazing how many D programmers have no 
 idea RefCounted even exists.
std.typecons is a little treasure trove of stuff nobody can find or knows about. I think tuple should be promoted to its own module too (which would help alleviate some of the confusion new users have with std.typetuple as well).
 * Review all of Phobos for hidden allocations and make 
 appropriate decisions on how to give the user more control over 
 allocation. I have some idea on how that can be done, at 
 various disruption costs.
I ran Johannes Pfau's work-in-progress -vgc [1] while building Phobos back in June. Here are the now out-of-date results: http://goo.gl/HP78r

I'm sure plenty is missing in there (I think allocations inside templates that aren't instantiated aren't included).
 * Get Robert Schadek's precise GC in. Walter and I have become 
 101% convinced a precise GC is the one way to go about GC.

 I'm up to my neck in D work for Facebook so my time is well 
 spent. We must definitely up the ante in terms of development 
 speed, and pronto.


 Andrei
All of this sounds great.

1. https://github.com/D-Programming-Language/dmd/pull/1886
Oct 08 2013
prev sibling next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 October 2013 at 02:22:35 UTC, Andrei Alexandrescu 
wrote:
 * Get Robert Schadek's precise GC in. Walter and I have become 
 101% convinced a precise GC is the one way to go about GC.
This makes sense.

We also badly need to be able to use type qualifiers: right now we must stop the world even when collecting thread-local or immutable data, and that does not make any sense. I'm pretty sure that in most applications, shared data is a small set.

The first step is to get the compiler to generate the appropriate calls, even if they go to the same GC alloc function to begin with.
Oct 08 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/8/2013 8:18 PM, deadalnix wrote:
 We also badly need to be able to use type qualifiers: right now we must
 stop the world even when collecting thread-local or immutable data, and
 that does not make any sense.
Making this work is fraught with difficulty. It is normal behavior in D to create local data with new(), build a data structure, and then cast it to shared so it can be transferred to another thread. This will fail miserably if the data is allocated on a thread local heap.
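
The pattern being described, sketched (Node and buildAndShare are made-up names):

    class Node { int x; Node next; }

    shared(Node) buildAndShare()
    {
        auto head = new Node;      // built as thread-local data
        head.x = 1;
        head.next = new Node;
        return cast(shared) head;  // handed off to other threads; this
                                   // breaks if `new` used a per-thread heap
    }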
Oct 08 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/8/13 8:31 PM, Walter Bright wrote:
 On 10/8/2013 8:18 PM, deadalnix wrote:
 We also badly need to be able to use type qualifiers: right now we must
 stop the world even when collecting thread-local or immutable data, and
 that does not make any sense.
Making this work is fraught with difficulty. It is normal behavior in D to create local data with new(), build a data structure, and then cast it to shared so it can be transferred to another thread.
Stop right there. NO. That is NOT normal behavior, and if we require it in order to work we have made a mistake. There should be NO casting required to get work done in client code. Andrei
Oct 08 2013
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 October 2013 at 03:46:16 UTC, Andrei Alexandrescu 
wrote:
 On 10/8/13 8:31 PM, Walter Bright wrote:
 On 10/8/2013 8:18 PM, deadalnix wrote:
 We also badly need to be able to use type qualifiers: right now we
 must stop the world even when collecting thread-local or immutable
 data, and that does not make any sense.
Making this work is fraught with difficulty. It is normal behavior in D to create local data with new(), build a data structure, and then cast it to shared so it can be transferred to another thread.
Stop right there. NO. That is NOT normal behavior, and if we require it in order to work we have made a mistake. There should be NO casting required to get work done in client code.
I have to concur with Andrei on that one. The win is simply so big that we can't ignore it.

If you really want to do something like that, you can allocate as shared, cast to TL, build the structure, and cast back to shared. When you break the type system, all guarantees are thrown away, and breaking it twice won't make things worse at this point.

That being said, isolated is probably something we want to add in the future.
Oct 08 2013
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-09 04:07:53 +0000, "deadalnix" <deadalnix gmail.com> said:

 That being said, isolated is probably something we want to add in the future.
You'll hit a problem with immutable though. Immutable is implicitly shared, and immutable strings are everywhere!

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Oct 09 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 October 2013 at 13:03:33 UTC, Michel Fortin wrote:
 On 2013-10-09 04:07:53 +0000, "deadalnix" <deadalnix gmail.com> 
 said:

 That being said, isolated is probably something we want to add 
 in the future.
You'll hit a problem with immutable though. Immutable is implicitly shared, and immutable strings are everywhere!
Collecting immutable data does not require stopping the application.
Oct 09 2013
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-09 15:18:23 +0000, "deadalnix" <deadalnix gmail.com> said:

 On Wednesday, 9 October 2013 at 13:03:33 UTC, Michel Fortin wrote:
 On 2013-10-09 04:07:53 +0000, "deadalnix" <deadalnix gmail.com> said:
 
 That being said, isolated is probably something we want to add in the future.
You'll hit a problem with immutable though. Immutable is implicitly shared, and immutable strings are everywhere!
 Collecting immutable data does not require stopping the application.
All I'm pointing out is that as long as immutable is implicitly shared, immutable memory has to be allocated and collected the same way as shared memory, "just in case" someone passes it to another thread. The type system can't prevent you from sharing your immutable strings with other threads.

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
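
Which is easy to demonstrate with std.concurrency - no cast is needed to send immutable data to another thread (worker is a made-up name):

    import core.thread : thread_joinAll;
    import std.concurrency : receiveOnly, send, spawn;
    import std.stdio : writeln;

    void worker()
    {
        auto msg = receiveOnly!string(); // immutable(char)[] crosses freely
        writeln("got: ", msg);
    }

    void main()
    {
        auto tid = spawn(&worker);
        tid.send("hello from another thread"); // no cast required
        thread_joinAll();
    }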
Oct 09 2013
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, October 08, 2013 20:46:16 Andrei Alexandrescu wrote:
 On 10/8/13 8:31 PM, Walter Bright wrote:
 On 10/8/2013 8:18 PM, deadalnix wrote:
 We also badly need to be able to use type qualifiers: right now we must
 stop the world even when collecting thread-local or immutable data, and
 that does not make any sense.
Making this work is fraught with difficulty. It is normal behavior in D to create local data with new(), build a data structure, and then cast it to shared so it can be transferred to another thread.
Stop right there. NO. That is NOT normal behavior, and if we require it in order to work we have made a mistake. There should be NO casting required to get work done in client code.
Except that for the most part, that's the only way that immutable objects can be created - particularly if you're talking about arrays or AAs. It's _very_ common to do what Walter is describing. On top of that, we're forced to cast to immutable or shared to pass anything via std.concurrency, which causes pretty much the same problem.

We're not even vaguely set up at this point to have type qualifiers indicate how something was constructed or what thread it comes from.

- Jonathan M Davis
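
The cast in question, sketched for a mutable array handed through std.concurrency (producer is a made-up name):

    import std.concurrency : send, Tid;

    void producer(Tid consumer)
    {
        auto tmp = new int[](100);          // built thread-local, mutable
        foreach (i, ref x; tmp)
            x = cast(int) i;
        consumer.send(cast(immutable) tmp); // the required cast
        // tmp must never be written again here, or the immutability
        // guarantee is silently broken for the receiving thread
    }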
Oct 08 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/8/13 9:24 PM, Jonathan M Davis wrote:
 On Tuesday, October 08, 2013 20:46:16 Andrei Alexandrescu wrote:
 On 10/8/13 8:31 PM, Walter Bright wrote:
 On 10/8/2013 8:18 PM, deadalnix wrote:
 We also badly need to be able to use type qualifiers: right now we must
 stop the world even when collecting thread-local or immutable data, and
 that does not make any sense.
Making this work is fraught with difficulty. It is normal behavior in D to create local data with new(), build a data structure, and then cast it to shared so it can be transferred to another thread.
Stop right there. NO. That is NOT normal behavior, and if we require it in order to work we have made a mistake. There should be NO casting required to get work done in client code.
 Except that for the most part, that's the only way that immutable objects can be created - particularly if you're talking about arrays or AAs. It's _very_ common to do what Walter is describing. On top of that, we're forced to cast to immutable or shared to pass anything via std.concurrency, which causes pretty much the same problem.

 We're not even vaguely set up at this point to have type qualifiers indicate how something was constructed or what thread it comes from.

 - Jonathan M Davis
The way I see it we must devise a robust solution to that, NOT consider the state of the art immutable (heh, a pun).

Andrei
Oct 08 2013
next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 09/10/13 06:25, Andrei Alexandrescu wrote:
 The way I see it we must devise a robust solution to that, NOT consider the
 state of the art immutable (heh, a pun).
Must say I have had a miserable experience with immutability and any kind of complex data structure, particularly when concurrency is involved.

As a practical fact I've often found it necessary to convert to immutable (not always via a cast or std.conv.to; sometimes via assumeUnique) to pass a complex data structure to a thread, but then to convert _away_ from immutable inside the thread in order for that data (which is never actually mutated) to be practically usable.

I'm sure there are things that I could do better, but I did not find a superior solution that was also performance-friendly.
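
For reference, the assumeUnique route looks roughly like this (build is a made-up name; assumeUnique itself is real, from std.exception):

    import std.exception : assumeUnique;

    immutable(int)[] build()
    {
        auto tmp = new int[](10);      // mutable while we fill it in
        foreach (i, ref x; tmp)
            x = cast(int)(i * i);
        return assumeUnique(tmp);      // nulls tmp and returns immutable;
                                       // valid only if no other mutable
                                       // reference to the data survives
    }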
Oct 10 2013
parent "qznc" <qznc web.de> writes:
On Thursday, 10 October 2013 at 17:23:20 UTC, Joseph Rushton 
Wakeling wrote:
 On 09/10/13 06:25, Andrei Alexandrescu wrote:
 The way I see it we must devise a robust solution to that, NOT 
 consider the
 state of the art immutable (heh, a pun).
Must say I have had a miserable experience with immutability and any kind of complex data structure, particularly when concurrency is involved.
I feel your pain. See also this thread in D.learn: http://forum.dlang.org/post/sdefkajobwcfikkelxbr forum.dlang.org
Oct 10 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, October 10, 2013 19:23:14 Joseph Rushton Wakeling wrote:
 On 09/10/13 06:25, Andrei Alexandrescu wrote:
 The way I see it we must devise a robust solution to that, NOT consider
 the
 state of the art immutable (heh, a pun).
 Must say I have had a miserable experience with immutability and any kind of complex data structure, particularly when concurrency is involved.

 As a practical fact I've often found it necessary to convert to immutable (not always via a cast or std.conv.to; sometimes via assumeUnique) to pass a complex data structure to a thread, but then to convert _away_ from immutable inside the thread in order for that data (which is never actually mutated) to be practically usable.

 I'm sure there are things that I could do better, but I did not find a superior solution that was also performance-friendly.
std.concurrency's design basically requires that you cast objects to shared or immutable in order to pass them across threads (and using assumeUnique is still doing the cast, just internally). And then you have to cast them back to thread-local mutable on the other side to complete the pass and make the object usable again. There's really no way around that at this point, not without completely redesigning shared.

Arguably, it's better to use shared when doing that rather than immutable, but at least in the past, shared hasn't worked right with std.concurrency even though it's supposed to (though that's an implementation issue rather than a design one, and it might be fixed by now - I haven't tried recently). And whether you're using shared or immutable, you're still having to cast.

I'm honestly surprised that Andrei is rejecting the idea of casting to/from shared or immutable being normal given how it's required by our current concurrency model. And changing that would be a _big_ change.

- Jonathan M Davis
Oct 10 2013
prev sibling next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 10/10/13 19:31, Jonathan M Davis wrote:
 I'm honestly surprised that Andrei is rejecting the idea of casting to/from
 shared or immutable being normal given how it's required by our current
 concurrency model. And changing that would be a _big_ change.
I'm starting to incline towards the view that type qualifications of _any_ kind become problematic once you start working with any types other than built-in, and not just in the context of concurrency. See e.g.:

http://d.puremagic.com/issues/show_bug.cgi?id=11148
http://d.puremagic.com/issues/show_bug.cgi?id=11188

I'd really appreciate advice on how to handle issues like these, because it's becoming a serious obstacle to my work on std.rational.
Oct 10 2013
next sibling parent "Daniel Davidson" <nospam spam.com> writes:
On Thursday, 10 October 2013 at 17:36:11 UTC, Joseph Rushton 
Wakeling wrote:
 On 10/10/13 19:31, Jonathan M Davis wrote:
 I'm honestly surprised that Andrei is rejecting the idea of 
 casting to/from
 shared or immutable being normal given how it's required by 
 our current
 concurrency model. And changing that would be a _big_ change.
I'm starting to incline towards the view that type qualifications of _any_ kind become problematic once you start working with any types other than built-in, and not just in the context of concurrency. See e.g.: http://d.puremagic.com/issues/show_bug.cgi?id=11148 http://d.puremagic.com/issues/show_bug.cgi?id=11188 I'd really appreciate advice on how to handle issues like these, because it's becoming a serious obstacle to my work on std.rational.
As qznc pointed out - check out this thread:
http://forum.dlang.org/post/sdefkajobwcfikkelxbr forum.dlang.org

Your problems with BigInt occur because the language has a special optimization for assignment of structs with no mutable aliasing. Fundamental math types have no aliasing, so assignment from any to all is fine and efficient via a data copy. For types like BigInt with mutable aliasing, crossing from mutable to immutable and back is a no-go, for the reasons pointed out in the response to bug 11148. You can not, and should not be able to, do what you are asking - pass a mutable with aliasing into immutable - because then immutable would not be guaranteed.

Two options: copy the BigInt beforehand if you want to keep by-value semantics on your function signatures, or maybe pass by reference (they are big after all, so why copy)? Passing by ref won't really be the full solution to your problem in comment 6 of bug 11148. What you really want to do is take a const(BigInt) or a ref to it and make a mutable copy. So do that! But wait, how can you do that? You need to have a dup-type function. Ideally there would be a generic way to do this. I have one that works for most cases, including yours:
https://github.com/patefacio/d-help/blob/master/d-help/opmix/dup.d

This support could easily be put in the standard.

import std.bigint, std.stdio;
import opmix.mix;

void foo(const(BigInt) n) {
  // make mutable copy
  auto bi = n.gdup;
  bi *= 2;
  writeln(bi);
}

void main() {
  const cbi = BigInt("1234567890987654321");
  foo(cbi);
  writeln(cbi);
}

----------------------
2469135781975308642
1234567890987654321

Thanks,
Dan
Oct 10 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/10/13 10:36 AM, Joseph Rushton Wakeling wrote:
 On 10/10/13 19:31, Jonathan M Davis wrote:
 I'm honestly surprised that Andrei is rejecting the idea of casting
 to/from
 shared or immutable being normal given how it's required by our current
 concurrency model. And changing that would be a _big_ change.
I'm starting to incline towards the view that type qualifications of _any_ kind become problematic once you start working with any types other than built-in, and not just in the context of concurrency. See e.g.: http://d.puremagic.com/issues/show_bug.cgi?id=11148 http://d.puremagic.com/issues/show_bug.cgi?id=11188 I'd really appreciate advice on how to handle issues like these, because it's becoming a serious obstacle to my work on std.rational.
I'll look into this soon. Andrei
Oct 10 2013
parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 11/10/13 02:44, Andrei Alexandrescu wrote:
 I'll look into this soon.
That would be fantastic, thank you very much. Any chance you could ask Don Clugston to get in touch with me about these issues? std.bigint is his, and I know he was in touch with David Simcha about std.rational. I don't have his email address, as he (understandably) has a fake email address set up as his reply-to here.
Oct 10 2013
prev sibling next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Oct 10, 2013, at 10:23 AM, Joseph Rushton Wakeling <joseph.wakeling webdrake.net> wrote:

 On 09/10/13 06:25, Andrei Alexandrescu wrote:
 The way I see it we must devise a robust solution to that, NOT consider the
 state of the art immutable (heh, a pun).

 Must say I have had a miserable experience with immutability and any kind of complex data structure, particularly when concurrency is involved.
As long as the reference itself can be reassigned (tail-immutable, I suppose) I think immutable is occasionally quite useful for complex data structures.  It basically formalizes the RCU (read-copy-update) approach to wait-free concurrency.  I'd tend to use this most often for global data structures built up on app start, and updated rarely to never as the program runs.
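A minimal sketch of that tail-immutable/RCU idea, assuming a hypothetical Config struct (illustrative only): the data itself is never mutated; only the shared pointer to the current version is atomically reassigned.

    import core.atomic;

    struct Config { string host; ushort port; }

    // Tail-immutable global: the pointer may be swapped, the data may not.
    shared(immutable(Config)*) current;

    void publish(string host, ushort port)
    {
        auto next = new Config(host, port); // built thread-locally, mutable
        // Freeze the finished copy and swap it in; readers holding the
        // old version are unaffected (read-copy-update).
        atomicStore(current, cast(immutable(Config)*) next);
    }

    immutable(Config)* snapshot()
    {
        return atomicLoad(current); // a view that can never change underfoot
    }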
Oct 10 2013
parent "Daniel Davidson" <nospam spam.com> writes:
On Thursday, 10 October 2013 at 17:39:55 UTC, Sean Kelly wrote:
 On Oct 10, 2013, at 10:23 AM, Joseph Rushton Wakeling 
 <joseph.wakeling webdrake.net> wrote:

 On 09/10/13 06:25, Andrei Alexandrescu wrote:
 The way I see it we must devise a robust solution to that, 
 NOT consider the
 state of the art immutable (heh, a pun).
Must say I have had a miserable experience with immutability and any kind of complex data structure, particularly when concurrency is involved.
As long as the reference itself can be reassigned (tail-immutable, I suppose) I think immutable is occasionally quite useful for complex data structures. It basically formalizes the RCU (read-copy-update) approach to wait-free concurrency. I'd tend to use this most often for global data structures built up on app start, and updated rarely to never as the program runs.
Nice. Please show an example that includes complex data with associative arrays. Thanks Dan
Oct 10 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 10/10/13 19:39, Sean Kelly wrote:
 As long as the reference itself can be reassigned (tail-immutable, I suppose)
I think immutable is occasionally quite useful for complex data structures.  It
basically formalizes the RCU (read-copy-update) approach to wait-free
concurrency.  I'd tend to use this most often for global data structures built
up on app start, and updated rarely to never as the program runs.
This kind of stuff is outside my experience, so if you'd like to offer a more detailed explanation/example, I'd be very grateful :-)
Oct 10 2013
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Oct 10, 2013, at 10:36 AM, Joseph Rushton Wakeling <joseph.wakeling webdrake.net> wrote:

 On 10/10/13 19:31, Jonathan M Davis wrote:
 I'm honestly surprised that Andrei is rejecting the idea of casting to/from
 shared or immutable being normal given how it's required by our current
 concurrency model. And changing that would be a _big_ change.

 I'm starting to incline towards the view that type qualifications of _any_ kind become problematic once you start working with any types other than built-in, and not just in the context of concurrency. See e.g.:
 http://d.puremagic.com/issues/show_bug.cgi?id=11148
 http://d.puremagic.com/issues/show_bug.cgi?id=11188
I'm inclined to agree about shared.  But I see this largely as more encouragement to keep data thread-local in D.  If we can clean up move semantics via std.concurrency, I would be reasonably happy with data sharing in D.

As for const / immutable, I guess I don't see this as such an issue because I've been dealing with it in C++ for so long.  You either have to commit 100% to using const attributes or not use them at all.  Anything in between is fraught with problems.
Oct 10 2013
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Oct 10, 2013, at 10:43 AM, Joseph Rushton Wakeling <joseph.wakeling webdrake.net> wrote:

 On 10/10/13 19:39, Sean Kelly wrote:
 As long as the reference itself can be reassigned (tail-immutable, I suppose) I think immutable is occasionally quite useful for complex data structures.  It basically formalizes the RCU (read-copy-update) approach to wait-free concurrency.  I'd tend to use this most often for global data structures built up on app start, and updated rarely to never as the program runs.

 This kind of stuff is outside my experience, so if you'd like to offer a more detailed explanation/example, I'd be very grateful :-)
Configuration data, for example.  On app start you might load a config file, generate information about the user, and so on, before real processing begins.  This data needs to be visible everywhere and it rarely if ever changes as the program runs, so you fill the data structures and then make them immutable.  Assuming, of course, that the data structures have immutable versions of all the necessary functions (which is unfortunately a pretty big assumption).
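A sketch of that build-then-freeze pattern, with a hypothetical AppConfig (illustrative, not from the thread); the final cast is the step being described, and it is only sound because no mutable reference survives it:

    struct AppConfig
    {
        string user;
        string[string] settings;
    }

    immutable(AppConfig)* loadConfig()
    {
        auto cfg = new AppConfig; // mutable while being filled in
        cfg.user = "alice";
        cfg.settings["theme"] = "dark";
        // No other mutable reference exists, so freezing it is sound.
        return cast(immutable) cfg;
    }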
Oct 10 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 10/10/13 19:50, Sean Kelly wrote:
 Configuration data, for example.  On app start you might load a config file,
generate information about the user, and so on, before real processing begins. 
This data needs to be visible everywhere and it rarely if ever changes as the
program runs, so you fill the data structures and then make them immutable. 
Assuming, of course, that the data structures have immutable versions of all
the necessary functions (which is unfortunately a pretty big assumption).
Yup, you're right, it's a big assumption.  In my case I was interested in loading a graph (network) and running many simulations on it in parallel.  The graph itself was static, so could readily be made immutable.  However, I found that it was difficult to write code that would accept both immutable and mutable graphs as input, without impacting performance.  So, I opted for threads to receive an immutable graph and cast it to mutable, even though it was never actually altered.

My experience was no doubt partially due to issues with the overall design I chose, and maybe I could have found a way around it, but it just seemed easier to use this flawed approach than to re-work everything.
Oct 10 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 10/10/13 19:46, Sean Kelly wrote:
 As for const / immutable, I guess I don't see this as such an issue because
I've been dealing with it in C++ for so long.  You either have to commit 100%
to using const attributes or not use them at all.  Anything in between is
fraught with problems.
Well, the problem is essentially that you can have a function like:

    void foo(int i) { ... }

... and if you pass it an immutable or const int, this is not a problem, because you're passing by value.

But now try

    void foo(BigInt i) { ... }

... and it won't work when passed a const/immutable variable, even though again you're passing by value. That's not nice, not intuitive, and generally speaking makes working with complex data types annoying.

It's why, for example, std.math.abs currently works with BigInt but not with const or immutable BigInt -- which is very irritating indeed.
Oct 10 2013
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Oct 10, 2013, at 11:17 AM, Joseph Rushton Wakeling <joseph.wakeling webdrake.net> wrote:

 On 10/10/13 19:50, Sean Kelly wrote:
 Configuration data, for example.  On app start you might load a config file, generate information about the user, and so on, before real processing begins.  This data needs to be visible everywhere and it rarely if ever changes as the program runs, so you fill the data structures and then make them immutable.  Assuming, of course, that the data structures have immutable versions of all the necessary functions (which is unfortunately a pretty big assumption).

 Yup, you're right, it's a big assumption.  In my case I was interested in loading a graph (network) and running many simulations on it in parallel.  The graph itself was static, so could readily be made immutable.  However, I found that it was difficult to write code that would accept both immutable and mutable graphs as input, without impacting performance.  So, I opted for threads to receive an immutable graph and cast it to mutable, even though it was never actually altered.

 My experience was no doubt partially due to issues with the overall design I chose, and maybe I could have found a way around it, but it just seemed easier to use this flawed approach than to re-work everything.
That's kind of the issue I ran into with shared in Druntime.  It seemed like what I had to do was have a shared method that internally cast "this" to unshared and then called the real function, which I knew was safe but the type system hated.  But this seemed like a horrible approach and so I didn't ever qualify anything as shared.
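A sketch of the wrapper approach being described, with a hypothetical Counter class: the shared overload takes the lock and strips shared so the real, unshared method can be reused.

    class Counter
    {
        private int count;

        private void bumpImpl() { ++count; } // the real, unshared logic

        // The overload the type system wants: serialize access, then
        // cast "this" to unshared to call the implementation.
        void bump() shared
        {
            auto self = cast(Counter) this; // strip shared; the lock
            synchronized (self)             // below makes the access safe
            {
                self.bumpImpl();
            }
        }
    }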
Oct 10 2013
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Oct 10, 2013, at 11:21 AM, Joseph Rushton Wakeling <joseph.wakeling webdrake.net> wrote:

 On 10/10/13 19:46, Sean Kelly wrote:
 As for const / immutable, I guess I don't see this as such an issue because I've been dealing with it in C++ for so long.  You either have to commit 100% to using const attributes or not use them at all.  Anything in between is fraught with problems.

 Well, the problem is essentially that you can have a function like:

    void foo(int i) { ... }

 ... and if you pass it an immutable or const int, this is not a problem, because you're passing by value.

 But now try

    void foo(BigInt i) { ... }

 ... and it won't work when passed a const/immutable variable, even though again you're passing by value. That's not nice, not intuitive, and generally speaking makes working with complex data types annoying.

 It's why, for example, std.math.abs currently works with BigInt but not with const or immutable BigInt -- which is very irritating indeed.
Isn't BigInt a struct?  I'd expect it to work via copying just like concrete types.
Oct 10 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Oct 10, 2013 at 07:36:06PM +0200, Joseph Rushton Wakeling wrote:
 On 10/10/13 19:31, Jonathan M Davis wrote:
I'm honestly surprised that Andrei is rejecting the idea of casting
to/from shared or immutable being normal given how it's required by
our current concurrency model. And changing that would be a _big_
change.
I'm starting to incline towards the view that type qualifications of _any_ kind become problematic once you start working with any types other than built-in, and not just in the context of concurrency. See e.g.: http://d.puremagic.com/issues/show_bug.cgi?id=11148 http://d.puremagic.com/issues/show_bug.cgi?id=11188 I'd really appreciate advice on how to handle issues like these, because it's becoming a serious obstacle to my work on std.rational.
I left some comments on these bugs. Basically, BigInt should not be implicitly castable from const/immutable to unqual, because unlike the built-in types, it's *not* a value type:

	BigInt x = 123;
	BigInt y = x;  // creates an alias to x's data.

Allowing implicit conversion to unqual would break immutability:

	immutable(BigInt) x = 123;
	const(BigInt) sneaky = x; // sneaky aliases x
	BigInt y = sneaky; // now y aliases sneaky, which aliases x (oops)

Of course, the way BigInt is implemented, any operation on it causes new data to be created (essentially it behaves like a copy-on-write type), so it's not as though you can directly modify immutable this way, but it breaks the type system and opens up possible loopholes.

What you need to do is to use inout for functions that need to handle both built-in ints and BigInts, e.g.:

	inout(Num) abs(Num)(inout(Num) x) {
		return (x >= 0) ? x : -x;
	}

This *should* work (I think -- I didn't check :-P).

Arguably, a *lot* of generic code involving numerical operations is broken, because they assume built-in types' behaviour of being implicitly convertible to/from immutable (due to being value types).

I don't know about shared, though. Last I heard, shared was one big mess so I'm not even going to touch it.


T

-- 
If the comments and the code disagree, it's likely that *both* are wrong. -- Christopher
Oct 10 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 10/10/13 20:28, Sean Kelly wrote:
 Isn't BigInt a struct?  I'd expect it to work via copying just like concrete
types.
Yes, it's a struct, but somewhere inside its internals I think it contains arrays.  I'm not sure how that affects copying etc., but suffice to say that if you try the following:

    BigInt a = 2;
    BigInt b = a;
    b = 3;

    assert(a != b);
    assert(a !is b);

... then it passes.  So it behaves at least in this extent like a value type.  But suffice to say that it was an unpleasant surprise that I couldn't just take it and pass to a function accepting an unqualified BigInt argument.
Oct 10 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 10/10/13 20:28, H. S. Teoh wrote:
 I don't know about shared, though. Last I heard, shared was one big mess
 so I'm not even going to touch it.
Yes, that seems to be the consensus.
Oct 10 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 10/10/13 20:28, H. S. Teoh wrote:
 I left some comments on these bugs. Basically, BigInt should not be
 implicitly castable from const/immutable to unqual, because unlike the
 built-in types, it's *not* a value type:

 	BigInt x = 123;
 	BigInt y = x;  // creates an alias to x's data.
BigInt a = 2;
    BigInt b = a;
    b = 3;

    writeln(a);
    writeln(b);

... gives you:

    2
    3

So, even though there's an array hidden away inside std.BigInt, it still seems to copy via value.
 Of course, the way BigInt is implemented, any operation on it causes new
 data to be created (essentially it behaves like a copy-on-write type),
 so it's not as though you can directly modify immutable this way, but
 it breaks the type system and opens up possible loopholes.
I guess that explains my result above .... ?
 What you need to do is to use inout for functions that need to handle
 both built-in ints and BigInts, e.g.:

 	inout(Num) abs(Num)(inout(Num) x) {
 		return (x >= 0) ? x : -x;
 	}

 This *should* work (I think -- I didn't check :-P).
I did, and it results in issues with BigInt's opCmp. But that may say more about BigInt's opCmp than about your solution.
 Arguably, a *lot* of generic code involving numerical operations is
 broken, because they assume built-in types' behaviour of being
 implicitly convertible to/from immutable (due to being value types).
How would you suggest correcting that?
Oct 10 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Oct 10, 2013 at 08:39:52PM +0200, Joseph Rushton Wakeling wrote:
 On 10/10/13 20:28, Sean Kelly wrote:
Isn't BigInt a struct?  I'd expect it to work via copying just like
concrete types.
Yes, it's a struct, but somewhere inside its internals I think it contains arrays. I'm not sure how that affects copying etc., but suffice to say that if you try the following: BigInt a = 2; BigInt b = a; b = 3; assert(a != b); assert(a !is b); ... then it passes. So it behaves at least in this extent like a value type.
I took a glance over the BigInt code, and it appears to have some kind of copy-on-write semantics. For example, in your code above, when you wrote b=a, b actually *aliases* a, but when you assign 3 to b, a new data array is created and b is updated to point to the new data instead. So it's not really a true value type, but more like a COW reference type.
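A toy copy-on-write struct in the same spirit (illustrative only, not BigInt's actual code): mutation dups the shared buffer and repoints, so existing aliases keep seeing the old data.

    import std.exception : assumeUnique;

    struct CowBuf
    {
        private immutable(int)[] payload; // freely aliased, never mutated in place

        this(int[] vals) { payload = vals.idup; }

        int opIndex(size_t i) const { return payload[i]; }

        // "Mutation": copy the buffer, change the copy, repoint.
        void opIndexAssign(int v, size_t i)
        {
            auto tmp = payload.dup;      // fresh mutable copy
            tmp[i] = v;
            payload = assumeUnique(tmp); // safe: tmp has no other aliases
        }
    }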
 But suffice to say that it was an unpleasant surprise that I
 couldn't just take it and pass to a function accepting an
 unqualified BigInt argument.
That only works with true value types, but BigInt isn't really one of them. :)


T

-- 
Everybody talks about it, but nobody does anything about it! -- Mark Twain
Oct 10 2013
prev sibling next sibling parent reply "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On 2013-10-10, 20:28, H. S. Teoh wrote:

 On Thu, Oct 10, 2013 at 07:36:06PM +0200, Joseph Rushton Wakeling wrote:
 On 10/10/13 19:31, Jonathan M Davis wrote:
I'm honestly surprised that Andrei is rejecting the idea of casting
to/from shared or immutable being normal given how it's required by
our current concurrency model. And changing that would be a _big_
change.
I'm starting to incline towards the view that type qualifications of _any_ kind become problematic once you start working with any types other than built-in, and not just in the context of concurrency. See e.g.: http://d.puremagic.com/issues/show_bug.cgi?id=11148 http://d.puremagic.com/issues/show_bug.cgi?id=11188 I'd really appreciate advice on how to handle issues like these, because it's becoming a serious obstacle to my work on std.rational.
I left some comments on these bugs. Basically, BigInt should not be implicitly castable from const/immutable to unqual, because unlike the built-in types, it's *not* a value type:
[snip]
 What you need to do is to use inout for functions that need to handle
 both built-in ints and BigInts, e.g.:
[snip] Here's a COW reference type that I can easily pass to a function requiring a mutable version of the type: struct S { immutable(int)[] arr; } And usage: void foo(S s) {} void main() { const S s; foo(s); } This compiles and works beautifully. Of course, no actual COW is happening here, but COW is what the type system says has to happen. Another example COW type: string; Now, my point here is that BigInt could easily use an immutable buffer internally, as long as it's purely COW. It could, and it should. does not help - it's obscuring the problem. TLDR: Do not use inout(T). Fix BigInt. -- Simen
Oct 10 2013
parent "Daniel Davidson" <nospam spam.com> writes:
On Friday, 11 October 2013 at 00:30:35 UTC, Simen Kjaeraas wrote:
 Here's a COW reference type that I can easily pass to a function
 requiring a mutable version of the type:

   struct S {
     immutable(int)[] arr;
   }

 And usage:

   void foo(S s) {}

   void main() {
     const S s;
     foo(s);
   }


 This compiles and works beautifully. Of course, no actual COW is
 happening here, but COW is what the type system says has to 
 happen.
 Another example COW type:

   string;

 Now, my point here is that BigInt could easily use an immutable
 buffer internally, as long as it's purely COW. It could, and it 
 should.
 If it did, we would not be having this discussion, as bugs 
 11148 and 11188 would not exist. 'Use inout'
 does not help - it's obscuring the problem.

 TLDR: Do not use inout(T). Fix BigInt.
Good catch. immutable(T)[] is special. Do the same with a contained associative array and you'll be my hero.
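For comparison, the analogous struct with an associative array member (a sketch of the problem Daniel is pointing at, assuming current compiler behavior): there is no tail-immutable conversion for AAs, so the same trick stops working.

    struct S2
    {
        immutable(int)[string] map; // AA instead of a dynamic array
    }

    void bar(S2 s) {}

    void main()
    {
        const S2 s;
        bar(s); // fails to compile: const(S2) does not implicitly
                // convert to S2, unlike the immutable(int)[] case
    }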
Oct 10 2013
prev sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 11/10/13 02:30, Simen Kjaeraas wrote:
 TLDR: Do not use inout(T). Fix BigInt.
Yes, that's my feeling too. Even if you do add inout (or "in", which also works in terms of allowing qualified BigInts) to std.math.abs, you immediately run back into problems ... with BigInt.
Oct 10 2013
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 October 2013 at 04:24:23 UTC, Jonathan M Davis 
wrote:
 Except that for the most part, that's the only way that 
 immutable objects can
 be created - particularly if you're talking about arrays or 
 AAs. It's _very_
 common to do what Walter is describing. On top of that, we're 
 forced to cast
 to immutable or shared to pass anything via std.concurrency, 
 which causes
 pretty much the same problem. We're not even vaguely set up at 
 this point to
 have type qualifiers indicate how something was constructed or 
 what thread it
 comes from.
As said, we can update the runtime interface while having everything allocated in the same pool.
Oct 08 2013
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-10-09 05:31, Walter Bright wrote:

 Making this work is fraught with difficulty. It is normal behavior in D
 to create local data with new(), build a data structure, and then cast
 it to shared so it can be transferred to another thread. This will fail
 miserably if the data is allocated on a thread local heap.
I agree with Andrei here. Alternatively perhaps the runtime can move the data to a global pool if it's casted to shared. -- /Jacob Carlborg
Oct 09 2013
next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Oct 9, 2013, at 4:30 AM, Jacob Carlborg <doob me.com> wrote:

 On 2013-10-09 05:31, Walter Bright wrote:

 Making this work is fraught with difficulty. It is normal behavior in D
 to create local data with new(), build a data structure, and then cast
 it to shared so it can be transferred to another thread. This will fail
 miserably if the data is allocated on a thread local heap.

 I agree with Andrei here. Alternatively perhaps the runtime can move the data to a global pool if it's casted to shared.
Generally not, since even D's precise GC is partially conservative.  It's also way more expensive than any cast should be.  For better or worse, I think being able to cast data to shared means that we can't have thread-local pools.  Unless a new attribute were introduced like "local" that couldn't ever be cast to shared, and that sounds like a disaster.
Oct 09 2013
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-10-09 15:51, Sean Kelly wrote:

 Generally not, since even D's precise GC is partially conservative.  It's also
way more expensive than any cast should be. For better or worse, I think being
able to cast data to shared means that we can't have thread-local pools. Unless
a new attribute were introduced like "local" that couldn't ever be cast to
shared, and that sounds like a disaster.
Since casting breaks the type system to begin with and is an advanced feature, how about providing a separate function that moves the object? It will be up to the user to call the function.

-- 
/Jacob Carlborg
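A sketch of what such a hand-off function might look like; moveToShared is hypothetical, and a real version would need runtime support to migrate the allocation between pools rather than just recasting.

    // Hypothetical helper: the user explicitly transfers ownership, and
    // the runtime would move the block from the thread-local pool to the
    // global pool. This sketch only performs the cast.
    shared(T) moveToShared(T)(ref T obj) if (is(T == class))
    {
        auto result = cast(shared(T)) obj;
        obj = null; // the caller gives up its thread-local reference
        return result;
    }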
Oct 09 2013
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Oct 9, 2013, at 7:35 AM, Jacob Carlborg <doob me.com> wrote:

 On 2013-10-09 15:51, Sean Kelly wrote:

 Generally not, since even D's precise GC is partially conservative.  It's also way more expensive than any cast should be.  For better or worse, I think being able to cast data to shared means that we can't have thread-local pools.  Unless a new attribute were introduced like "local" that couldn't ever be cast to shared, and that sounds like a disaster.

 Since casting breaks the type system to begin with and is an advanced feature, how about providing a separate function that moves the object? It will be up to the user to call the function.
Okay so following that… it might be reasonable if the location of data keyed off the attribute set at construction.  So "new shared(X)" puts it in the shared pool.  Strings are a bit trickier though, since there's no way to specify the locality of the result of a string operation:

    shared string x = a ~ b ~ c;

And I'm inclined to think that solving the issue for strings actually gains us more than for classes in terms of performance.
Oct 09 2013
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 October 2013 at 15:48:58 UTC, Sean Kelly wrote:
 On Oct 9, 2013, at 7:35 AM, Jacob Carlborg <doob me.com> wrote:

 On 2013-10-09 15:51, Sean Kelly wrote:
 
 Generally not, since even D's precise GC is partially 
 conservative.  It's also way more expensive than any cast 
 should be. For better or worse, I think being able to cast 
 data to shared means that we can't have thread-local pools. 
 Unless a new attribute were introduced like "local" that 
 couldn't ever be cast to shared, and that sounds like a 
 disaster.
Since casting breaks the type system to begin with and is an advanced feature. How about providing a separate function that moves the object? It will be up to the user to call the function.
Okay so following that… it might be reasonable if the location of data keyed off the attribute set at construction. So "new shared(X)" puts it in the shared pool. Strings are a bit trickier though, since there's no way to specify the locality of the result of a string operation: shared string x = a ~ b ~ c; And I'm inclined to think that solving the issue for strings actually gains us more than for classes in terms of performance.
The heap allocated part here is immutable. The slice itself is a value type.
Oct 09 2013
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-10-09 17:48, Sean Kelly wrote:

 Okay so following that… it might be reasonable if the location of data keyed
off the attribute set at construction.  So "new shared(X)" puts it in the
shared pool.
I thought that was obvious. Is there a problem with that approach? -- /Jacob Carlborg
Oct 09 2013
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Oct 9, 2013, at 9:42 AM, Jacob Carlborg <doob me.com> wrote:

 On 2013-10-09 17:48, Sean Kelly wrote:

 Okay so following that… it might be reasonable if the location of data keyed off the attribute set at construction.  So "new shared(X)" puts it in the shared pool.

 I thought that was obvious. Is there a problem with that approach?
Only that this would have to be communicated to the user, since moving data later is problematic.  Today, I think it's common to construct an object as unshared and then cast it.
Oct 09 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-10-10 02:22, Sean Kelly wrote:

 Only that this would have to be communicated to the user, since moving data
later is problematic. Today, I think it's common to construct an object as
unshared and then cast it.
What is the reason to not create it as "shared" in the first place? -- /Jacob Carlborg
Oct 09 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/9/2013 11:34 PM, Jacob Carlborg wrote:
 On 2013-10-10 02:22, Sean Kelly wrote:

 Only that this would have to be communicated to the user, since moving data
 later is problematic. Today, I think it's common to construct an object as
 unshared and then cast it.
What is the reason to not create it as "shared" in the first place?
1. Shared data cannot be passed to regular functions.

2. Functions that create data structures would have to know in advance that they'll be creating a shared object. I'm not so sure this would not be an invasive change.

3. Immutable data is implicitly shared. But it is not created immutable - it is created as mutable data, then set to some state, then cast to immutable.
Oct 10 2013
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-10-10 09:18, Walter Bright wrote:

 1. Shared data cannot be passed to regular functions.
That I understand.
 2. Functions that create data structures would have to know in advance
 that they'll be creating a shared object. I'm not so sure this would not
 be an invasive change.
If the function doesn't know it creates shared data it will assume it's not and it won't use any synchronization. Then suddenly someone casts it to "shared" and you're in trouble.
 3. Immutable data is implicitly shared. But it is not created immutable
 - it is created as mutable data, then set to some state, then cast to
 immutable.
It should be possible to create immutable data in the first place. No cast should be required. -- /Jacob Carlborg
Oct 10 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/10/2013 12:51 AM, Jacob Carlborg wrote:
 On 2013-10-10 09:18, Walter Bright wrote:

 1. Shared data cannot be passed to regular functions.
That I understand.
 2. Functions that create data structures would have to know in advance
 that they'll be creating a shared object. I'm not so sure this would not
 be an invasive change.
If the function doesn't know it creates shared data it will assume it's not and it won't use any synchronization. Then suddenly someone casts it to "shared" and you're in trouble.
Same comment as for immutable data - create the data structure as thread local, because of (1), and then cast to shared and hand it to another thread.
 3. Immutable data is implicitly shared. But it is not created immutable
 - it is created as mutable data, then set to some state, then cast to
 immutable.
It should be possible to create immutable data in the first place. No cast should be required.
Oct 10 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/10/13 12:18 AM, Walter Bright wrote:
 On 10/9/2013 11:34 PM, Jacob Carlborg wrote:
 On 2013-10-10 02:22, Sean Kelly wrote:

 Only that this would have to be communicated to the user, since
 moving data
 later is problematic. Today, I think it's common to construct an
 object as
 unshared and then cast it.
What is the reason to not create it as "shared" in the first place?
1. Shared data cannot be passed to regular functions.
I don't understand this. If a function/method accepts "shared", then it can be passed shared data.
 2. Functions that create data structures would have to know in advance
 that they'll be creating a shared object. I'm not so sure this would not
 be an invasive change.
There is no other way around it. And this is not a change - it's fixing something.
 3. Immutable data is implicitly shared. But it is not created immutable
 - it is created as mutable data, then set to some state, then cast to
 immutable.
That all must happen in the runtime, NOT in user code. Andrei
Oct 10 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/10/2013 10:54 AM, Andrei Alexandrescu wrote:
 On 10/10/13 12:18 AM, Walter Bright wrote:
 On 10/9/2013 11:34 PM, Jacob Carlborg wrote:
 On 2013-10-10 02:22, Sean Kelly wrote:

 Only that this would have to be communicated to the user, since
 moving data
 later is problematic. Today, I think it's common to construct an
 object as
 unshared and then cast it.
What is the reason to not create it as "shared" in the first place?
1. Shared data cannot be passed to regular functions.
I don't understand this. If a function/method accepts "shared", then it can be passed shared data.
I meant regular functions as in they are not typed as taking shared arguments. Shared cannot be implicitly cast to unshared. I say regular because very, very few functions are typed as accepting shared arguments.
 2. Functions that create data structures would have to know in advance
 that they'll be creating a shared object. I'm not so sure this would not
 be an invasive change.
There is no other way around it. And this is not a change - it's fixing something.
I'm not convinced of that at all.
 3. Immutable data is implicitly shared. But it is not created immutable
 - it is created as mutable data, then set to some state, then cast to
 immutable.
That all must happen in the runtime, NOT in user code. Andrei
Oct 10 2013
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, October 10, 2013 11:28:02 Walter Bright wrote:
 On 10/10/2013 10:54 AM, Andrei Alexandrescu wrote:
 On 10/10/13 12:18 AM, Walter Bright wrote:
 1. Shared data cannot be passed to regular functions.
I don't understand this. If a function/method accepts "shared", then it can be passed shared data.
I meant regular functions as in they are not typed as taking shared arguments. Shared cannot be implicitly cast to unshared. I say regular because very, very few functions are typed as accepting shared arguments.
Yeah. The only times that something is going to accept shared is when it was specifically designed to work as shared (which most code isn't), or if it's templated and the template happens to work with shared. Regular functions just aren't going to work with shared without casting away shared, because that would usually mean either templating everything or duplicating functions all over the place. - Jonathan M Davis
Oct 10 2013
parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Thursday, 10 October 2013 at 23:33:27 UTC, Jonathan M Davis 
wrote:
 Yeah. The only times that something is going to accept shared 
 is when it was
 specifically designed to work as shared (which most code 
 isn't), or if it's
 templated and the template happens to work with shared. Regular 
 functions just
 aren't going to work with shared without casting away shared, 
 because that
 would usually mean either templating everything or duplicating 
 functions all over the place.
I think that's pretty reasonable though. Shared data needs to be treated differently, explicitly, or things go downhill fast.
Oct 10 2013
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, October 11, 2013 02:08:16 Sean Kelly wrote:
 On Thursday, 10 October 2013 at 23:33:27 UTC, Jonathan M Davis
 
 wrote:
 Yeah. The only times that something is going to accept shared
 is when it was
 specifically designed to work as shared (which most code
 isn't), or if it's
 templated and the template happens to work with shared. Regular
 functions just
 aren't going to work with shared without casting away shared,
 because that
 would usually mean either templating everything or duplicating
 functions all over the place.
I think that's pretty reasonable though. Shared data needs to be treated differently, explicitly, or things go downhill fast.
I'm not disagreeing with how shared works. I'm disagreeing with the idea that it's not supposed to be normal to cast shared away when operating on shared objects. I expect that the most common idiom for dealing with shared is to protect it with a lock, cast it to thread-local, do whatever you're going to do with it, make sure that there are no thread-local references to it once you're done operating on it, and then release the lock. e.g.

synchronized
{
    auto tc = cast(T)mySharedT;
    tc.memberFunc();
    doStuff(tc);
    //no thread-local references to tc other than tc should
    //exist at this point.
}

That works perfectly fine and makes it so that shared objects are clearly delineated from thread-local ones by the type system, but it does require casting, and it requires that you make sure that the object is not misused while it's not being treated as shared.

The only real alternative to that is to create types which are designed to be operated on as shared, but I would expect that to be the rare case rather than the norm, because that requires creating new types just for sharing across threads rather than using the same types that you use in your thread-local code, and I don't expect programmers to want to do that in the average case.

However, from what I've seen, at the moment, the most typical reaction is to simply use __gshared, which is the least safe of the various options. So, people need to be better educated and/or we need to figure out a different design for shared.

- Jonathan M Davis
Oct 10 2013
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-10-11 03:05, Jonathan M Davis wrote:

 I'm not disagreeing with how shared works. I'm disagreeing with the idea that
 it's not supposed to be normal to cast shared away when operating on shared
 objects. I expect that the most common idiom for dealing with shared is to
 protect it with a lock, cast it to thread-local, do whatever you're going to
 do with it, make sure that there are no thread-local references to it once
 you're done operating on it, and then release the lock. e.g.

 synchronized
 {
   auto tc = cast(T)mySharedT;
   tc.memberFunc();
   doStuff(tc);
   //no thread-local references to tc other than tc should
   //exist at this point.
 }
With Michel Fortin's proposal I think the above could work without a cast, if doStuff is a pure function.

http://michelf.ca/blog/2012/mutex-synchonization-in-d/

-- 
/Jacob Carlborg
Oct 11 2013
prev sibling parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Friday, 11 October 2013 at 01:05:19 UTC, Jonathan M Davis 
wrote:
 On Friday, October 11, 2013 02:08:16 Sean Kelly wrote:
 
 Shared data needs to be
 treated differently, explicitly, or things go downhill fast.
I'm not disagreeing with how shared works. I'm disagreeing with the idea that it's not supposed to be normal to cast shared away when operating on shared objects. I expect that the most common idiom for dealing with shared is to protect it with a lock, cast it to thread-local, do whatever you're going to do with it, make sure that there are no thread-local references to it once you're done operating on it, and then release the lock.
The thing with locks is that you need to use the same lock for all accesses to a set of mutated data or atomicity isn't guaranteed. And if you're locking externally you don't know what might change inside a class during a method call, so you have to use the same lock for all operations on that object, regardless of what you're doing. At that point you may as well just synchronize on the class itself and be done with it. So sure, it saves you from having to define shared or synchronized methods, but I don't think this should be how we want to write concurrent code in D.
Oct 11 2013
parent reply "Dicebot" <public dicebot.lv> writes:
On Friday, 11 October 2013 at 17:46:01 UTC, Sean Kelly wrote:
 The thing with locks is that you need to use the same lock for 
 all accesses to a set of mutated data or atomicity isn't 
 guaranteed.  And if you're locking externally you don't know 
 what might change inside a class during a method call, so you 
 have to use the same lock for all operations on that object, 
 regardless of what you're doing.  At that point you may as well 
 just synchronize on the class itself and be done with it.  So 
 sure, it saves you from having to define shared or synchronized 
 methods, but I don't think this should be how we want to write 
 concurrent code in D.
How can one possibly use "synchronized" for this in the absence of classes, if the desired behavior is to lock an entity, not a statement block?
Oct 11 2013
parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Friday, 11 October 2013 at 17:50:26 UTC, Dicebot wrote:
 How can one possibly use "synchronized" for this in the absence of 
 classes, if the desired behavior is to lock an entity, not a statement 
 block?
I'm not sure I follow.  But I was in part objecting to the use of synchronized without a related object:

synchronized {
    // do stuff
}

This statement should be illegal.  You must always specify a synchronization context:

synchronized(myMutex) {
    // do stuff
}

For the rest, it seemed like the suggestion was that you could just wrap a statement in any old synchronized block and all your problems would be solved, which absolutely isn't the case.
Oct 11 2013
next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Friday, 11 October 2013 at 18:05:00 UTC, Sean Kelly wrote:
 On Friday, 11 October 2013 at 17:50:26 UTC, Dicebot wrote:
 How can one possibly use "synchronized" for this in the absence 
 of classes, if the desired behavior is to lock an entity, not a 
 statement block?
 I'm not sure I follow.  But I was in part objecting to the use of synchronized without a related object:

 synchronized {
     // do stuff
 }

 This statement should be illegal.  You must always specify a synchronization context:

 synchronized(myMutex) {
     // do stuff
 }

 For the rest, it seemed like the suggestion was that you could just wrap a statement in any old synchronized block and all your problems would be solved, which absolutely isn't the case.
I was reading this : http://dlang.org/statement.html#SynchronizedStatement

It says that the Expression in a sync statement must evaluate to an Object or interface, and a mutex gets created specifically for it. But what if I want to use a struct in that block? Or an array?
Oct 11 2013
next sibling parent "Sean Kelly" <sean invisibleduck.org> writes:
On Friday, 11 October 2013 at 18:10:27 UTC, Dicebot wrote:
 I was reading this : 
 http://dlang.org/statement.html#SynchronizedStatement

 It says that Expression in sync statement must evaluate to 
 Object or interface and mutex get created specifically for it. 
 But what if I want to use struct in that block? Or array?
Synchronize on a dummy object or use core.sync.mutex: auto m = new Mutex; synchronized(m) { } It's effectively the same as in C++ except that synchronized saves you the trouble of using an RAII scoped_lock variable.
Oct 11 2013
prev sibling parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Friday, 11 October 2013 at 18:10:27 UTC, Dicebot wrote:
 I was reading this : 
 http://dlang.org/statement.html#SynchronizedStatement

 It says that the Expression in a sync statement must evaluate to 
 an Object or interface, and a mutex gets created specifically for it. 
 But what if I want to use a struct in that block? Or an array?
Synchronize on a dummy object or use core.sync.mutex:

    auto m = new Mutex;
    synchronized(m) {

    }

It's effectively the same as in C++ except that synchronized saves you the trouble of using an RAII scoped_lock variable.
Oct 11 2013
parent reply "Dicebot" <public dicebot.lv> writes:
On Friday, 11 October 2013 at 18:18:45 UTC, Sean Kelly wrote:
 On Friday, 11 October 2013 at 18:10:27 UTC, Dicebot wrote:
 I was reading this : 
 http://dlang.org/statement.html#SynchronizedStatement

 It says that Expression in sync statement must evaluate to 
 Object or interface and mutex get created specifically for it. 
 But what if I want to use struct in that block? Or array?
Synchronize on a dummy object or use core.sync.mutex: auto m = new Mutex; synchronized(m) { } It's effectively the same as in C++ except that synchronized saves you the trouble of using an RAII scoped_lock variable.
Yeah, but it can't possibly work in conjunction with the proposed "shared" stripping inside the block, can it?
Oct 11 2013
parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Friday, 11 October 2013 at 18:19:59 UTC, Dicebot wrote:
 On Friday, 11 October 2013 at 18:18:45 UTC, Sean Kelly wrote:
 Synchronize on a dummy object or use core.sync.mutex:

 auto m = new Mutex;
 synchronized(m) {

 }

 It's effectively the same as in C++ except that synchronized 
 saves you the trouble of using an RAII scoped_lock variable.
Yeah, but it can't possibly work in conjunction with proposed "shared" stripping inside the block, can it?
It should. Stripping "shared" just means that you'll be able to call any function available on the struct as opposed to only explicitly shared functions. And the mutex gives you atomic behavior (assuming you use the mutex properly anywhere else you access the struct).
Oct 11 2013
parent reply "Dicebot" <public dicebot.lv> writes:
On Friday, 11 October 2013 at 18:22:46 UTC, Sean Kelly wrote:
 It should.  Stripping "shared" just means that you'll be able 
 to call any function available on the struct as opposed to only 
 explicitly shared functions.  And the mutex gives you atomic 
 behavior (assuming you use the mutex properly anywhere else you 
 access the struct).
How would it know which entity is associated with that mutex? (== which to strip shared from)
Oct 11 2013
parent "Sean Kelly" <sean invisibleduck.org> writes:
On Friday, 11 October 2013 at 18:26:52 UTC, Dicebot wrote:
 On Friday, 11 October 2013 at 18:22:46 UTC, Sean Kelly wrote:
 It should.  Stripping "shared" just means that you'll be able 
 to call any function available on the struct as opposed to 
 only explicitly shared functions.  And the mutex gives you 
 atomic behavior (assuming you use the mutex properly anywhere 
 else you access the struct).
How would it know which entity is associated with that mutex? (== which to strip shared from)
It wouldn't. I'm guessing it would just cast away shared.
Oct 11 2013
prev sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, October 11, 2013 20:04:57 Sean Kelly wrote:
 On Friday, 11 October 2013 at 17:50:26 UTC, Dicebot wrote:
 How can one possibly used "synchronized" for this in absence of
 classes if desire behavior is to lock an entity, not statement
 block?
I'm not sure I follow. But I was in part objecting to the use of synchronized without a related object: synchronized { // do stuff } This statement should be illegal. You must always specify a synchronization context: synchronized(myMutex) { // do stuff }
I agree with that. I was just in too much of a hurry when I threw that code snippet together and left out the mutex. I was more concerned with what was in the synchronized block than how the lock was done. It could have been done with guard/autolock and RAII as far as I was concerned with regards to what I was trying to show.
 For the rest, it seemed like the suggestion was that you could
 just wrap a statement in any old synchronized block and all your
 problems would be solved, which absolutely isn't the case.
I certainly wasn't suggesting that all problems would be solved by a synchronized block. I was simply trying to show that in order to actually use a shared object, you have to cast away shared, and that means protecting the object with a lock of some kind. You then have the problem of making sure that no thread-local references to the object escape the lock, but at least shared then becomes useable. - Jonathan M Davis
Oct 11 2013
prev sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Oct 9, 2013, at 11:34 PM, Jacob Carlborg <doob me.com> wrote:

 On 2013-10-10 02:22, Sean Kelly wrote:

 Only that this would have to be communicated to the user, since moving data later is problematic.  Today, I think it's common to construct an object as unshared and then cast it.

 What is the reason to not create it as "shared" in the first place?
The same as immutable--you may not have all the shared functions available to establish the desired state.  But I'll grant that this is obviously way more common with immutable than shared.
Oct 10 2013
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 October 2013 at 14:17:44 UTC, Sean Kelly wrote:
 On Oct 9, 2013, at 4:30 AM, Jacob Carlborg <doob me.com> wrote:
 
 On 2013-10-09 05:31, Walter Bright wrote:
 
 Making this work is fraught with difficulty. It is normal 
 behavior in D
 to create local data with new(), build a data structure, and 
 then cast
 it to shared so it can be transferred to another thread. This 
 will fail
 miserably if the data is allocated on a thread local heap.
I agree with Andrei here. Alternatively perhaps the runtime can move the data to a global pool if it's casted to shared.
Generally not, since even D's precise GC is partially conservative. It's also way more expensive than any cast should be. For better or worse, I think being able to cast data to shared means that we can't have thread-local pools. Unless a new attribute were introduced like "local" that couldn't ever be cast to shared, and that sounds like a disaster.
That isn't accurate. Allocators like tcmalloc use thread-local info to allocate shared chunks of memory. What matters is that the block is tagged as shared as far as the GC is concerned. Casting a qualifier is a NOOP at machine level, so that won't be any slower.
Oct 09 2013
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Oct 9, 2013, at 8:23 AM, deadalnix <deadalnix gmail.com> wrote:

 On Wednesday, 9 October 2013 at 14:17:44 UTC, Sean Kelly wrote:
 On Oct 9, 2013, at 4:30 AM, Jacob Carlborg <doob me.com> wrote:
 On 2013-10-09 05:31, Walter Bright wrote:
 Making this work is fraught with difficulty. It is normal behavior in D
 to create local data with new(), build a data structure, and then cast
 it to shared so it can be transferred to another thread. This will fail
 miserably if the data is allocated on a thread local heap.

 I agree with Andrei here. Alternatively perhaps the runtime can move the data to a global pool if it's casted to shared.

 Generally not, since even D's precise GC is partially conservative.  It's also way more expensive than any cast should be.  For better or worse, I think being able to cast data to shared means that we can't have thread-local pools.  Unless a new attribute were introduced like "local" that couldn't ever be cast to shared, and that sounds like a disaster.

 That isn't accurate. Allocators like tcmalloc use thread-local info to allocate shared chunks of memory. What matters is that the block is tagged as shared as far as the GC is concerned.
If the GC can determine whether a block is shared at allocation time, it can allocate the block from a thread-local pool in the unshared case.  So if a collection is necessary, no global stop-the-world need occur.  Only the current thread's roots are scanned.  It's a huge performance gain in a concurrent program.  Particularly in a language like D where data is thread-local by default.
Oct 09 2013
parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 October 2013 at 15:53:13 UTC, Sean Kelly wrote:
 If the GC can determine whether a block is shared at allocation 
 time, it can allocate the block from a thread-local pool in the 
 unshared case.  So if a collection is necessary, no global stop 
 the world need occur.  Only the current thread's roots are 
 scanned.  It's a huge performance gain in a concurrent program.
  Particularly in a language like D where data is thread-local 
 by default.
Yes, we have this awesome mechanism and we aren't using it.
Oct 09 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/9/13 4:30 AM, Jacob Carlborg wrote:
 On 2013-10-09 05:31, Walter Bright wrote:

 Making this work is fraught with difficulty. It is normal behavior in D
 to create local data with new(), build a data structure, and then cast
 it to shared so it can be transferred to another thread. This will fail
 miserably if the data is allocated on a thread local heap.
I agree with Andrei here. Alternatively perhaps the runtime can move the data to a global pool if it's casted to shared.
I think a reasonable solution (in case we're unable to solve this reasonably within the type system) is to offer a library solution that does the shared allocation, mutation, and casting internally. Again: USER CODE SHOULD HAVE NO BUSINESS CASTING CASUALLY TO GET WORK DONE. Andrei
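One possible shape for such a library type (hypothetical, not an actual Phobos API): the allocation, locking, and casting all live inside the wrapper, so user code never writes a cast.

    import core.sync.mutex;

    // Hypothetical wrapper: owns a T as shared and confines the unshared
    // view to the duration of the lock.
    struct Locked(T) if (is(T == class))
    {
        private shared T payload;
        private Mutex mtx;

        this(T initial)
        {
            payload = cast(shared) initial;
            mtx = new Mutex;
        }

        // All access goes through here; the cast never escapes user code.
        auto access(alias fun)()
        {
            mtx.lock();
            scope (exit) mtx.unlock();
            return fun(cast(T) payload);
        }
    }

Usage would look like locked.access!(obj => obj.toString())(), with the caller never seeing the shared-to-unshared transition.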
Oct 09 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-10-09 19:10, Andrei Alexandrescu wrote:

 Again: USER CODE SHOULD HAVE NO BUSINESS CASTING CASUALLY TO GET WORK DONE.
I agree. Doesn't "new shared(X)", or something like that, already work? -- /Jacob Carlborg
Oct 09 2013
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, October 09, 2013 20:04:50 Jacob Carlborg wrote:
 On 2013-10-09 19:10, Andrei Alexandrescu wrote:
 Again: USER CODE SHOULD HAVE NO BUSINESS CASTING CASUALLY TO GET WORK
 DONE.
I agree. Doesn't "new shared(X)", or something like that, already work?
That depends. It works with objects I think (for both shared and immutable), but you definitely have to cast to immutable if you want an immutable array or AA.

Also, casting _away_ shared is going to be a very common operation due to how shared works. In order to use shared, you basically have to protect the variable with a mutex or synchronized block, cast away shared, and then use it as thread-local for whatever you're doing until you release the lock (in which case, you have to be sure that there are no more thread-local references to the shared object). As such, while it might be possible to construct stuff directly as shared, it's going to have to be cast to thread-local just to use it in any meaningful way. So, at this point, I don't think that it even vaguely flies to try and make it so that casting away shared is something that isn't typically done. It's going to be done about as often as shared is used for anything other than very basic stuff.

As things stand, I don't think that it's even vaguely tenable to claim that casting should be abnormal with regards to immutable and shared. There will definitely have to be language changes if we want casting to be abnormal, and I think that the two main areas which would have to change would be the initialization of immutable arrays and AAs (since that outright requires casting at this point), and shared would probably have to have a major redesign, because it's outright unusable without casting it away.

- Jonathan M Davis
Oct 09 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-10-10 05:55, Jonathan M Davis wrote:

 That depends. It works with objects I think (for both shared and immutable),
 but you definitely have to cast to immutable if you want an immutable array or
 AA.

 Also, casting _away_ shared is going to be a very common operation due to how
 shared works. In order to use shared, you basically have to protect the
 variable with a mutex or synchronized block, cast away shared, and then use it
 as thread-local for whatever you're doing until you release the lock (in which
 case, you have to be sure that there are no more thread-local references to
 the shared object). As such, while it might be possible to construct stuff
 directly as shared, it's going to have to be cast to thread-local just to use
 it in any meaningful way. So, at this point, I don't think that it even
 vaguely flies to try and make it so that casting away shared is something that
 isn't typically done. It's going to be done about as often as shared is used
 for anything other than very basic stuff.
What's the reason for casting away "shared"? Is it to pass it to a function that doesn't accept "shared"? Then it should be safe if you synchronize around the call? But that function could store away the now-unshared data, which is actually shared, and cause problems?

-- 
/Jacob Carlborg
Oct 09 2013
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, October 10, 2013 08:38:31 Jacob Carlborg wrote:
 On 2013-10-10 05:55, Jonathan M Davis wrote:
 That depends. It works with objects I think (for both shared and
 immutable), but you definitely have to cast to immutable if you want an
 immutable array or AA.
 
 Also, casting _away_ shared is going to be a very common operation due to
 how shared works. In order to use shared, you basically have to protect
 the variable with a mutex or synchronized block, cast away shared, and
 then use it as thread-local for whatever you're doing until you release
 the lock (in which case, you have to be sure that there are no more
 thread-local references to the shared object). As such, while it might be
 possible to construct stuff directly as shared, it's going to have to be
 cast to thread-local just to use it in any meaningful way. So, at this
 point, I don't think that it even vaguely flies to try and make it so
 that casting away shared is something that isn't typically done. It's
 going to be done about as often as shared is used for anything other than
 very basic stuff.
What's the reason for casting away "shared", is it to pass it to a function that doesn't accept "shared"? The it should be safe if you synchronize around the call? But that function could put away, the now , unshared data, which is actually shared and cause problem?
Pretty much nothing accepts shared. At best, templated functions accept shared. Certainly, shared doesn't work at all with classes and structs unless the type is specifically intended to be used as shared, because you have to mark all of its member functions shared to be able to call them. And if you want to use that class or struct as both shared and unshared, you have to duplicate all of its member functions.

That being the case, the only way in general to use a shared object is to protect it with a lock, cast it to thread-local (so that it can actually use its member functions or be passed to other functions to be used), and then use it. e.g.

synchronized
{
    auto tl = cast(T)mySharedT;
    auto result = tl.foo();
    auto result2 = bar(tl);
}

Obviously, you then have to make sure that there are no thread-local references to the shared object when the lock is released, but without casting away shared like that, you can't do much of anything with it. So, similar to when you cast away const, it's up to you to guarantee that the code doesn't violate the type system's guarantees - i.e. that a thread-local variable is not accessed by multiple threads. So, you use a lock of some kind to protect the shared variable while it's treated as a thread-local variable in order to ensure that that guarantee holds. Like with casting away const or with @trusted, there's obviously risk in doing this, but there's really no other way to use shared at this point - certainly not without it being incredibly invasive to your code and forcing code duplication.

- Jonathan M Davis
Oct 10 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-10-10 09:24, Jonathan M Davis wrote:

 Pretty much nothing accepts shared. At best, templated functions accept
 shared. Certainly, shared doesn't work at all with classes and structs unless
 the type is specifically intended to be used as shared, because you have to
 mark all of its member functions shared to be able to call them. And if you
 want to use that class or struct as both shared and unshared, you have to
 duplicate all of its member functions.

 That being the case, the only way in general to use a shared object is to
 protect it with a lock, cast it to thread-local (so that it can actually use
 its member functions or be passed to other functions to be used), and then use
 it. e.g.

 synchronized
 {
       auto tl = cast(T)mySharedT;
       auto result = tl.foo();
       auto result2 = bar(tl);
 }

 Obviously, you then have to make sure that there are no thread-local
 references to the shared object when the lock is released, but without casting
 away shared like that, you can't do much of anything with it. So, similar to
 when you cast away const, it's up to you to guarantee that the code doesn't
 violate the type system's guarantees - i.e. that a thread-local variable is
 not accessed by multiple threads. So, you use a lock of some kind to protect
 the shared variable while it's treated as a thread-local variable in order to
 ensure that that guarantee holds. Like with casting away const or with
 @trusted, there's obviously risk in doing this, but there's really no other
 way to use shared at this point - certainly not without it being incredibly
 invasive to your code and forcing code duplication.
Sounds like we need a way to tell that a parameter is thread-local but not allowed to escape a reference to it:

Object foo;

void bar (shared_tls Object o)
{
    foo = o; // Compile error, cannot escape a "shared" thread local
}

void main ()
{
    auto o = new shared(Object);

    synchronized
    {
        bar(o);
    }
}

Both "shared" and thread-local data could be passed to "shared_tls". If "shared" is passed, it is assumed to be synchronized during the call to "bar". This will still have the problem of annotating all code with this attribute. Or this needs to be the default, which would cause a lot of code breakage.

--
/Jacob Carlborg
Oct 10 2013
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, October 09, 2013 20:55:07 Jonathan M Davis wrote:
 On Wednesday, October 09, 2013 20:04:50 Jacob Carlborg wrote:
 On 2013-10-09 19:10, Andrei Alexandrescu wrote:
 Again: USER CODE SHOULD HAVE NO BUSINESS CASTING CASUALLY TO GET WORK
 DONE.
I agree. Doesn't "new shared(X)", or something like that, already work?
That depends. It works with objects I think (for both shared and immutable), but you definitely have to cast to immutable if you want an immutable array or AA.

Also, casting _away_ shared is going to be a very common operation due to how shared works. In order to use shared, you basically have to protect the variable with a mutex or synchronized block, cast away shared, and then use it as thread-local for whatever you're doing until you release the lock (in which case, you have to be sure that there are no more thread-local references to the shared object). As such, while it might be possible to construct stuff directly as shared, it's going to have to be cast to thread-local just to use it in any meaningful way. So, at this point, I don't think that it even vaguely flies to try and make it so that casting away shared is something that isn't typically done. It's going to be done about as often as shared is used for anything other than very basic stuff.

As things stand, I don't think that it's even vaguely tenable to claim that casting should be abnormal with regards to immutable and shared. There will definitely have to be language changes if we want casting to be abnormal, and I think that the two main areas which would have to change would be the initialization of immutable arrays and AAs (since that outright requires casting at this point), and shared would probably have to have a major redesign, because it's outright unusable without casting it away.
And given that std.concurrency requires casting to and from shared or immutable in order to pass objects across threads, it seems like most of D's concurrency model requires casting to and/or from shared or immutable. The major exception is structs or classes which are shared or synchronized rather than a normal object which is used as shared, and I suspect that that's done fairly rarely at this point. In fact, it seems like the most common solution is to ignore shared altogether and use __gshared, which is far worse than casting to and from shared IMHO.

So, it's my impression that being able to consider casting to or from shared as abnormal in code which uses shared is a bit of a pipe dream at this point. The current language design pretty much requires casting when doing much of anything with concurrency.

- Jonathan M Davis
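As for the immutable array/AA initialization mentioned above, a minimal sketch of the cast-based pattern (the names are invented for the example):

void main()
{
    int[string] tmp;
    tmp["one"] = 1;
    tmp["two"] = 2;

    // tmp has no other mutable aliases at this point, so freezing it is
    // safe by convention - but nothing in the language verifies that.
    immutable table = cast(immutable) tmp;
    assert(table["two"] == 2);
}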
Oct 09 2013
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-10-10 06:24, Jonathan M Davis wrote:

 And given that std.concurrency requires casting to and from shared or
 immutable in order to pass objects across threads, it seems like most of D's
 concurrency model requires casting to and/or from shared or immutable. The
 major exception is structs or classes which are shared or synchronized rather
 than a normal object which is used as shared, and I suspect that that's done
 fairly rarely at this point. In fact, it seems like the most common solution
 is to ignore shared altogether and use __gshared, which is far worse than
 casting to and from shared IMHO.
Isn't the whole point of std.concurrency that it should only accept "shared" for reference types? If you want to use std.concurrency, create a "shared" object in the first place?
 So, it's my impression that being able to consider casting to or from shared
 as abnormal in code which uses shared is a bit of a pipe dream at this point.
 The current language design pretty much requires casting when doing much of
 anything with concurrency.
There must be a better way to solve this.

--
/Jacob Carlborg
Oct 09 2013
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, October 10, 2013 08:41:19 Jacob Carlborg wrote:
 On 2013-10-10 06:24, Jonathan M Davis wrote:
 And given that std.concurrency requires casting to and from shared or
 immutable in order to pass objects across threads, it seems like most of
 D's concurrency model requires casting to and/or from shared or
 immutable. The major exception is structs or classes which are shared or
 synchronized rather than a normal object which is used as shared, and I
 suspect that that's done fairly rarely at this point. In fact, it seems
 like the most common solution is to ignore shared altogether and use
 __gshared, which is far worse than casting to and from shared IMHO.
Isn't the whole point of std.concurrency that it should only accept "shared" for reference types? If you want to use std.concurrency, create a "shared" object in the first place?
You might do that if you're creating the object simply to send it across, but it's frequently the case that the object was created well before it was sent across, and it frequently had to have operations done on it other than simply creating it (which wouldn't work if it were shared). So, it often wouldn't make sense for the object being passed to be shared except when being passed. And once it's been passed, it's rarely the case that you want it to be shared. You're usually passing ownership. You're essentially taking a thread-local variable from one thread and making it a thread-local variable on another thread. Unfortunately, the type system does not support the concept of thread ownership (beyond thread-local vs shared), so it's up to the programmer to make sure that no references to the object are kept on the original thread, but there's really no way around that unless you're always creating a new object when you pass it across, which would usually be an unnecessary copy. So, it becomes like @trusted in that sense.
 So, it's my impression that being able to consider casting to or from
 shared as abnormal in code which uses shared is a bit of a pipe dream at
 this point. The current language design pretty much requires casting when
 doing much of anything with concurrency.
There must be a better way to solve this.
I honestly don't think we can solve it a different way without completely redesigning shared. shared is specifically designed such that you have to either cast it away to do anything with it or write all of your code to explicitly work with shared, which is not something that generally makes sense to do unless you're creating a type whose only value is in being shared across threads. Far more frequently, you want to share a type which you would also use normally as a thread-local variable, and that means casting.

- Jonathan M Davis
Oct 10 2013
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-10-10 09:33, Jonathan M Davis wrote:

 You might do that if you're creating the object simply to send it across, but
 it's frequently the case that the object was created well before it was sent
 across, and it frequently had to have operations done it other than simply
 creating it (which wouldn't work if it were shared). So, it often wouldn't
 make sense for the object being passed to be shared except when being passed.
I guess if you're not creating it as "shared" to begin with, there's no way to tell that the given object now is shared and no thread-local references are allowed.
 And once it's been passed, it's rarely the case that you want it to be shared.
 You're usually passing ownership. You're essentially taking a thread-local
 variable from one thread and making it a thread-local variable on another
 thread. Unfortunately, the type system does not support the concept of thread
 ownership (beyond thread-local vs shared), so it's up to the programmer to
 make sure that no references to the object are kept on the original thread,
 but there's really no way around that unless you're always creating a new
 object when you pass it across, which would usually be an
 unnecessary copy. So, it becomes like @trusted in that sense.
It sounds like we need a way to transfer ownership of an object to a different thread.
 I honestly don't think we can solve it a different way without completely
 redesigning shared. shared is specifically designed such that you have to
 either cast it away to do anything with it or write all of your code to
 explicitly work with shared, which is not something that generally makes sense
 to do unless you're creating a type whose only value is in being shared across
 threads. Far more frequently, you want to share a type which you would also
 use normally as a thread-local variable, and that means casting.
I guess it wouldn't be possible to solve it without changing the type system. -- /Jacob Carlborg
Oct 10 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/10/13 12:33 AM, Jonathan M Davis wrote:
 I honestly don't think we can solve it a different way without completely
 redesigning shared. shared is specifically designed such that you have to
 either cast it away to do anything with it
no
 or write all of your code to
 explicitly work with shared, which is not something that generally makes sense
 to do unless you're creating a type whose only value is in being shared across
 threads.
yes
 Far more frequently, you want to share a type which you would also
 use normally as a thread-local variable, and that means casting.
no Andrei
Oct 10 2013
next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Oct 10, 2013, at 10:55 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 10/10/13 12:33 AM, Jonathan M Davis wrote:
 Far more frequently, you want to share a type which you would also
 use normally as a thread-local variable, and that means casting.
 no
Yeah, I'd want to see this claim backed up by some examples. The only data I share globally in my own apps is the occasional container. Configuration data, a user database, whatever. I'll also frequently move data between threads while dispatching tasks, but otherwise everything is thread-local. I imagine there are other reasonable methods for using shared data, but I don't know what they are.
Oct 10 2013
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, October 10, 2013 11:11:12 Sean Kelly wrote:
 On Oct 10, 2013, at 10:55 AM, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> wrote:
 On 10/10/13 12:33 AM, Jonathan M Davis wrote:
 Far more frequently, you want to share a type which you would also
 use normally as a thread-local variable, and that means casting.
no
Yeah, I'd want to see this claim backed up by some examples. The only data I share globally in my own apps is the occasional container. Configuration data, a user database, whatever. I'll also frequently move data between threads while dispatching tasks, but otherwise everything is thread-local. I imagine there are other reasonable methods for using shared data, but I don't know what they are.
Yeah, but it's that moving data between threads while dispatching tasks which requires casting. Pretty much anything which isn't a value type has to be cast to either immutable or shared in order to pass it across threads, and then it needs to be cast back to thread-local mutable on the other side to be usable. Unless you're only passing really basic stuff like int, or the types that you're passing are only used for being passed across threads (and thus are designed to work as shared), you end up having to cast. The fact that you can only pass shared or immutable objects across, combined with the fact that shared objects are generally unusable, makes it so that you're at minimum going to have to cast the object once it gets to the other thread, even if it was constructed as shared. And since shared is so useless, if you need to do anything more than simply construct the object before passing it across, you're going to have to have it as thread-local in the originating thread as well.

I just don't see how you could avoid casting when passing ownership of an object from one thread to another without having a way to pass an object across threads without having to make it shared or immutable to pass it.

- Jonathan M Davis
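A minimal sketch of that ownership-passing dance with std.concurrency; the Payload class is invented for the example, and the "keep no references" part is purely by convention:

import std.concurrency;

class Payload
{
    int[] data;
    this(int[] data) { this.data = data; }
}

void worker()
{
    // Arrives typed as shared; cast back to thread-local, after which
    // this thread is the sole owner by convention.
    auto p = cast(Payload) receiveOnly!(shared Payload)();
    p.data[] += 1;
}

void main()
{
    auto tid = spawn(&worker);
    auto p = new Payload([1, 2, 3]);
    // ...use p as an ordinary thread-local object here...
    send(tid, cast(shared) p);
    // From here on, keep no references to p on this thread.
}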
Oct 10 2013
next sibling parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Thursday, 10 October 2013 at 23:33:17 UTC, Jonathan M Davis 
wrote:
 I just don't see how you could avoid casting when passing 
 ownership of an
 object from one thread to another without having a way to pass 
 an object
 across threads without having to make it shared or immutable to 
 pass it.
Well, the restriction to only pass immutable and shared data is simply enforced statically by the API. So if there were an assumeUnique analog, the check could be modified to accept that as well, and then the class would arrive as unshared. This could be accomplished pretty easily. It would be yet another step towards not having thread-local pools though. I was initially pretty conservative in what was an acceptable type to send, because it's always easier to loosen restrictions than tighten them.
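For reference, the array version of this already exists in Phobos as std.exception.assumeUnique; a class analog for send would follow the same pattern. A small sketch (buildMessage is an invented name):

import std.exception : assumeUnique;

string buildMessage()
{
    char[] buf = "hello".dup;
    buf[0] = 'H';
    // We promise buf has no other mutable references; assumeUnique does
    // the cast to immutable (and nulls buf) on the strength of that promise.
    return assumeUnique(buf);
}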
Oct 10 2013
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, October 11, 2013 02:05:19 Sean Kelly wrote:
 It would be yet another step
 towards not having thread-local pools though.
At this point, I don't see how we can have thread-local pools unless casting to and from shared has hooks for managing that. Otherwise, it's far too likely that an object is going to be in the wrong pool, because it's being used as shared when it was constructed as thread-local or vice versa. And we may need some sort of hook with std.concurrency.send which understands that the object being sent is being transferred from one thread to another and would tell the GC to migrate the object from one pool to another (though to do that, it would probably have to not be typed as shared but rather as thread-local, which would jive better with what you're talking about doing with std.concurrency).

Certainly, with how shared currently works, it's hard to see how we could get away with having thread-local GC pools as great as that would be. So, if we want that, something about how shared works is going to have to change.

- Jonathan M Davis
Oct 10 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-10-11 02:51, Jonathan M Davis wrote:

 At this point, I don't see how we can have thread-local pools unless casting
 to and from shared has hooks for managing that. Otherwise, it's far too likely
 that an object is going to be in the wrong pool, because it's being used as
 shared when it was constructed as thread-local or vice versa. And we may need
 some sort of hook with std.concurrency.send which understands that the object
 being sent is being transferred from one thread to another and would tell the
 GC to migrate the object from one pool to another (though to do that, it would
 probably have to not be typed as shared but rather as thread-local, which
 would jive better with what you're talking about doing with std.concurrency).

 Certainly, with how shared currently works, it's hard to see how we could get
 away with having thread-local GC pools as great as that would be. So, if we
 want that, something about how shared works is going to have to change.
A simple solution to the "hook" would be to pass a dummy type indicating the object should be transferred: struct Transfer { } send(tid, foo, Transfer()); "Transfer" would be defined in std.concurrency. -- /Jacob Carlborg
Oct 11 2013
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/10/13 4:33 PM, Jonathan M Davis wrote:
 I just don't see how you could avoid casting when passing ownership of an
 object from one thread to another without having a way to pass an object
 across threads without having to make it shared or immutable to pass it.
By using restricted library types. Andrei
Oct 10 2013
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, October 10, 2013 10:55:49 Andrei Alexandrescu wrote:
 On 10/10/13 12:33 AM, Jonathan M Davis wrote:
 I honestly don't think we can solve it a different way without completely
 redesigning shared. shared is specifically designed such that you have to
 either cast it away to do anything with it
no
 or write all of your code to
 explicitly work with shared, which is not something that generally makes
 sense to do unless you're creating a type whose only value is in being
 shared across threads.
yes
Really? Do you honestly expect the average use of shared to involve creating structs or classes which are designed specifically to be used as shared? That definitely has its use, but I would expect it to be far more common that someone would want to share the exact same types that they're using in their thread-local code. In fact, if anything, the normal responses to discussions on shared go in the complete opposite direction of creating classes which are designed to work as shared. It seems like the normal thing to do is simply avoid shared altogether and use __gshared so that you don't have to deal with any of the problems that shared causes.

Granted, I obviously haven't seen everyone's code, but I don't believe that I have ever seen anyone create a type designed to be used as shared, and that's certainly not what people discuss doing when shared comes up. TDPL discusses that - and again, I do think that that has its place - but I've never seen it done, and I've never run into any place in my own code where I would have even considered it. Usually, you want to share an object of the same type that you're using in your thread-local code.

And even if a struct or class is set up so that its member functions work great as shared, very little code seems to be written with shared in mind (since thread-local is the default), so the only functions which will work with it are its member functions, functions written specifically to work with that type, and templated functions that happen to work with shared. As such, I fully expect casting away shared to be a very common idiom. Without that, the number of things you can do with a shared object is very limited.
 Far more frequently, you want to share a type which you would also
 use normally as a thread-local variable, and that means casting.
no
What else do you expect to be doing with std.concurrency? That's what it's _for_. Unless all of the stuff that you're passing across threads is value types or is designed to work as immutable or shared (which most types aren't), the objects which get passed across need to be cast to thread-local mutable on the target thread in order to be used there, and if you have to do much of anything with the object other than constructing it before passing it across, then you're going to have to have it as thread-local on the originating thread as well, because most functions are going to be unusable with shared.

- Jonathan M Davis
Oct 10 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/10/13 5:36 PM, Jonathan M Davis wrote:
 On Thursday, October 10, 2013 10:55:49 Andrei Alexandrescu wrote:
 On 10/10/13 12:33 AM, Jonathan M Davis wrote:
 I honestly don't think we can solve it a different way without completely
 redesigning shared. shared is specifically designed such that you have to
 either cast it way to do anything with it
no
 or write all of your code to
 explicitly work with shared, which is not something that generally makes
 sense to do unless you're creating a type whose only value is in being
 shared across threads.
yes
Really? Do you honestly expect the average use of shared to involve creating structs or classes which are designed specifically to be used as shared?
Yes. Data structures that can be shared are ALWAYS designed specifically for sharing, unless of course it's a trivial type like int. Sharing means careful interlocking and atomic operations and barriers and stuff. You can't EVER expect to obtain all of that magic by plastering "shared" on top of your type. Andrei
Oct 10 2013
next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, October 10, 2013 18:21:52 Andrei Alexandrescu wrote:
 On 10/10/13 5:36 PM, Jonathan M Davis wrote:
 On Thursday, October 10, 2013 10:55:49 Andrei Alexandrescu wrote:
 On 10/10/13 12:33 AM, Jonathan M Davis wrote:
 or write all of your code to
 explicitly work with shared, which is not something that generally makes
 sense to do unless you're creating a type whose only value is in being
 shared across threads.
yes
Really? Do you honestly expect the average use of shared to involve creating structs or classes which are designed specifically to be used as shared?
Yes. Data structures that can be shared are ALWAYS designed specifically for sharing, unless of course it's a trivial type like int. Sharing means careful interlocking and atomic operations and barriers and stuff. You can't EVER expect to obtain all of that magic by plastering "shared" on top of your type.
It works just fine with the idiom that I described where you protect the usage of the object with a lock, cast it to thread-local to do stuff on it, and then release the lock (making sure that no thread-local references remain). Aside from the necessity of the cast, this is exactly what is typically done in every C++ code base that I've ever seen - and that's with complicated types and not just simple stuff like int. e.g.

synchronized
{
    auto tc = cast(T)mySharedT;
    tc.memberFunc();
    doStuff(tc);
    // no thread-local references to tc other than tc should
    // exist at this point.
}

I agree that designing types specifically to function as shared objects has its place (e.g. concurrent containers), but in my experience, that is very much the rare case, not the norm, and what most D programmers seem to describe when talking about shared is simply using __gshared with normal types, not even using shared, let alone using it with types specifically designed to function as shared. So, the most common approach at this point in D seems to be to avoid shared entirely.

- Jonathan M Davis
Oct 10 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/10/13 7:04 PM, Jonathan M Davis wrote:
 On Thursday, October 10, 2013 18:21:52 Andrei Alexandrescu wrote:
 You can't EVER expect to obtain all of that magic by plastering "shared"
 on top of your type.
It works just fine with the idiom that I described where you protect the usage of the object with a lock, cast it to thread-local to do stuff on it, and then release the lock (making sure that no thread-local references remain).
TDPL describes how synchronized automatically peels the "shared" off of direct members of the object. Unfortunately that feature is not yet implemented.

Andrei
Oct 10 2013
next sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-11 02:08:02 +0000, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 10/10/13 7:04 PM, Jonathan M Davis wrote:
 On Thursday, October 10, 2013 18:21:52 Andrei Alexandrescu wrote:
 You can't EVER expect to obtain all of that magic by plastering "shared"
 on top of your type.
It works just fine with the idiom that I described where you protect the usage of the object with a lock, cast it to thread-local to do stuff on it, and then release the lock (making sure that no thread-local references remain).
TDPL describes how synchronized automatically peels off the "shared" off of direct members of the object. Unfortunately that feature is not yet implemented.
That "direct member" limitation makes the feature pretty much worthless. It's rare you want to protect a single integer behind a mutex, generally you protect data structures like arrays or trees, and those always have indirections. You could loosen it up a bit by allowing pure functions to use the member. But you must then make sure those functions won't escape a pointer to the protected structure through one of its argument, or the return value. That won't work with something like std.range.front on an array. Anyway, that whole concept of synchronized class is a deadlock honeypot. It should be scrapped altogether. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Oct 10 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/10/13 7:37 PM, Michel Fortin wrote:
 On 2013-10-11 02:08:02 +0000, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 On 10/10/13 7:04 PM, Jonathan M Davis wrote:
 On Thursday, October 10, 2013 18:21:52 Andrei Alexandrescu wrote:
 You can't EVER expect to obtain all of that magic by plastering
 "shared"
 on top of your type.
It works just fine with the idiom that I described where you protect the usage of the object with a lock, cast it to thread-local to do stuff on it, and then release the lock (making sure that no thread-local references remain).
TDPL describes how synchronized automatically peels off the "shared" off of direct members of the object. Unfortunately that feature is not yet implemented.
That "direct member" limitation makes the feature pretty much worthless. It's rare you want to protect a single integer behind a mutex, generally you protect data structures like arrays or trees, and those always have indirections.
It's not about a single integer, it's about multiple flat objects. True, many cases do involve indirections.
 You could loosen it up a bit by allowing pure functions to use the
 member. But you must then make sure those functions won't escape a
 pointer to the protected structure through one of its argument, or the
 return value. That won't work with something like std.range.front on an
 array.

 Anyway, that whole concept of synchronized class is a deadlock honeypot.
 It should be scrapped altogether.
People still use it. Andrei
Oct 10 2013
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, October 10, 2013 19:43:40 Andrei Alexandrescu wrote:
 On 10/10/13 7:37 PM, Michel Fortin wrote:
 Anyway, that whole concept of synchronized class is a deadlock honeypot.
 It should be scrapped altogether.
People still use it.
To some extent in that some folks used synchronized functions, but synchronized classes haven't been implemented at all. What we have right now is basically just a copy of Java's synchronized function feature. Synchronized classes aren't drastically different, but whatever nuances come with the difference are completely unrealized at this point.

I think that synchronized classes have their uses, but I'd honestly still be inclined to use them sparingly. It's frequently too much to lock a whole object at once rather than the single member or group of members that actually need the lock.

- Jonathan M Davis
Oct 10 2013
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, October 10, 2013 19:08:02 Andrei Alexandrescu wrote:
 On 10/10/13 7:04 PM, Jonathan M Davis wrote:
 On Thursday, October 10, 2013 18:21:52 Andrei Alexandrescu wrote:
 You can't EVER expect to obtain all of that magic by plastering "shared"
 on top of your type.
It works just fine with the idiom that I described where you protect the usage of the object with a lock, cast it to thread-local to do stuff on it, and then release the lock (making sure that no thread-local references remain).
TDPL describes how synchronized automatically peels off the "shared" off of direct members of the object. Unfortunately that feature is not yet implemented.
I think that it's generally overkill to create a whole class just to protect one shared object. Simply protecting it when you need to with a mutex or synchronized block should be enough, and I don't see the cast as being all that big a deal. What TDPL describes is essentially the same except that the compiler does it automatically, but it forces you to create a class just for that. And as Michel points out, removing shared on a single level is frequently not enough.

I think that synchronized classes have their place, but I also think that they're overkill for most situations. Why should I have to create a class just so that I can make a variable thread-local so that I can operate on it? I can do the same with a cast without the extra overhead and a lot more flexibly, since a cast doesn't have the single-level restriction that a synchronized class does.

- Jonathan M Davis
Oct 10 2013
prev sibling parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Friday, 11 October 2013 at 02:07:57 UTC, Andrei Alexandrescu 
wrote:
 TDPL describes how synchronized automatically peels off the 
 "shared" off of direct members of the object. Unfortunately 
 that feature is not yet implemented.
This would help a ton. I'm still not super happy about having to label an entire method as synchronized for this to work though. I'd prefer to label it shared and synchronize only the part(s) inside that need to hold the lock.
Oct 11 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Friday, 11 October 2013 at 17:49:11 UTC, Sean Kelly wrote:
 On Friday, 11 October 2013 at 02:07:57 UTC, Andrei Alexandrescu 
 wrote:
 TDPL describes how synchronized automatically peels off the 
 "shared" off of direct members of the object. Unfortunately 
 that feature is not yet implemented.
This would help a ton. I'm still not super happy about having to label an entire method as synchronized for this to work though. I'd prefer to label it shared and synchronize only the part(s) inside that need to hold the lock.
It should work as well with

synchronized(stuff)
{
    // Stuff gets its first level of sharing removed.
}
Oct 11 2013
parent "Dicebot" <public dicebot.lv> writes:
On Friday, 11 October 2013 at 17:54:12 UTC, deadalnix wrote:
 It should work as well with

 synchronized(stuff) {
 // Stuff get its first level sharing removed.
 }
I still stand by the point that for guaranteed safety it must be not simply removed but replaced with `scope` (assuming it is finally implemented).
Oct 11 2013
prev sibling parent Johannes Pfau <nospam example.com> writes:
Am Thu, 10 Oct 2013 22:04:16 -0400
schrieb "Jonathan M Davis" <jmdavisProg gmx.com>:

 most D programmers seem to describe when talking about shared is
 simply using __gshared with normal types, not even using shared, let
 alone using it with types specifically designed to function as
 shared. So, the most common approach at this point in D seems to be
 to avoid shared entirely.
One important reason for this is that the types in core.sync still aren't shared.

--------
Mutex myMutex; // WRONG, myMutex is in TLS
shared Mutex myMutex; // WRONG, can't call .lock, new
__gshared Mutex myMutex; // Can't be used in @safe code...
// shared Mutex + casting to unshared when accessing: can't be used in
// @safe code
--------

See also:
http://forum.dlang.org/thread/mailman.2017.1353214033.5162.digitalmars-d puremagic.com?page=2#post-mailman.2037.1353278884.5162.digitalmars-d:40puremagic.com

Sean Kelly: "I tried this once and it cascaded to requiring modifications of various definitions on core.sys.posix to add a "shared" qualifier, and since I wasn't ready to do that I rolled back the changes. I guess the alternative would be to have a shared equivalent for every operation that basically just casts away shared and then calls the non-shared function, but that's such a terrible design I've been resisting it."
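In practice most code falls back on the __gshared variant together with a synchronized statement, giving up @safe; a short sketch (the names are invented for the example):

import core.sync.mutex : Mutex;

__gshared Mutex gLock;
__gshared int[] gSharedData;

shared static this() { gLock = new Mutex; }

void append(int x)
{
    // Mutex is designed to act as an object monitor, so it can be used
    // directly with the synchronized statement.
    synchronized (gLock)
    {
        gSharedData ~= x;
    }
}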
Oct 11 2013
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11-Oct-2013 05:21, Andrei Alexandrescu пишет:
 On 10/10/13 5:36 PM, Jonathan M Davis wrote:
 On Thursday, October 10, 2013 10:55:49 Andrei Alexandrescu wrote:
 On 10/10/13 12:33 AM, Jonathan M Davis wrote:
 I honestly don't think we can solve it a different way without
 completely
 redesigning shared. shared is specifically designed such that you
 have to
 either cast it away to do anything with it
no
 or write all of your code to
 explicitly work with shared, which is not something that generally
 makes
 sense to do unless you're creating a type whose only value is in being
 shared across threads.
yes
Really? Do you honestly expect the average use of shared to involve creating structs or classes which are designed specifically to be used as shared?
Yes. Data structures that can be shared are ALWAYS designed specifically for sharing, unless of course it's a trivial type like int.
This. And exactly the same for immutable. It's interesting how folks totally expect complex types (like containers) to meaningfully work with all 3 qualifiers.
 Sharing
 means careful interlocking and atomic operations and barriers and stuff.
 You can't EVER expect to obtain all of that magic by plastering "shared"
 on top of your type.
Yup.
 Andrei
-- Dmitry Olshansky
Oct 11 2013
next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 11/10/13 16:32, Dmitry Olshansky wrote:
 This. And exactly the same for immutable. It's interesting how folks totally
 expect complex types (like containers) to meaningfully work with all 3
qualifiers.
It's not so much that we expect it, as that we might expect that standard library types would _have the appropriate design work put in_ so that they would "just work" with these qualifiers. (Admittedly shared is a bit of a special case right now that probably needs more work before support is rolled out.)

If you tell me that's an unreasonable expectation then fair enough, but it feels pretty bad if e.g. library-implemented number types (big integers or floats, rationals, complex numbers, ...) can't from a user perspective behave exactly like their built-in counterparts.
Oct 11 2013
next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11-Oct-2013 18:46, Joseph Rushton Wakeling пишет:
 On 11/10/13 16:32, Dmitry Olshansky wrote:
 This. And exactly the same for immutable. It's interesting how folks
 totally
 expect complex types (like containers) to meaningfully work with all 3
 qualifiers.
It's not so much that we expect it, as that we might expect that standard library types would _have the appropriate design work put in_ so that they would "just work" with these qualifiers. (Admittedly shared is a bit of a special case right now that probably needs more work before support is rolled out.)
It can't - it's like expecting 37 to modify itself and become 76. Simply put, built-ins are special.

Imagine a ref-counted type - how would you copy it (and increment a count) with bit-wise immutability? It simply doesn't make sense. (Yes, one can keep the count elsewhere, e.g. in a global hash-table.) More importantly, there is little incentive to make immutable stuff ref-counted, especially COW types. In this sense BigInt simply doesn't work with immutable by design; if need be one can make a FixedBigInt that doesn't include COW, doesn't support read-modify-write, and mixes well with BigInt.

Immutable works best as static data and/or as snapshot-style data structures. It's tables, strings and some unique stuff that gets frozen and published at a certain point (usually at start up). It makes a lot less sense at local scope aside from aesthetic beauty, as there are plenty of invariants to check, and bitwise immutability is a minority of them. I would even suggest adopting a convention for a pair of freeze/thaw methods for any UDT that give you a deep copy of the object that is made for mutation (thaw on an immutable object) or immutable (freeze on a mutable one).
 If you tell me that's an unreasonable expectation then fair enough, but
 it feels pretty bad if e.g. library-implemented number types (big
 integers or floats, rationals, complex numbers, ...) can't from a user
 perspective behave exactly like their built-in counterparts.
No magic paint would automatically expel reference count from the struct's body. With shared it's even more obvious. In general user defined type has to be designed with one of 3 major use cases in mind: local, immutable, shared. -- Dmitry Olshansky
Oct 11 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/11/13 7:46 AM, Joseph Rushton Wakeling wrote:
 On 11/10/13 16:32, Dmitry Olshansky wrote:
 This. And exactly the same for immutable. It's interesting how folks
 totally
 expect complex types (like containers) to meaningfully work with all 3
 qualifiers.
It's not so much that we expect it, as that we might expect that standard library types would _have the appropriate design work put in_ so that they would "just work" with these qualifiers. (Admittedly shared is a bit of a special case right now that probably needs more work before support is rolled out.) If you tell me that's an unreasonable expectation then fair enough, but it feels pretty bad if e.g. library-implemented number types (big integers or floats, rationals, complex numbers, ...) can't from a user perspective behave exactly like their built-in counterparts.
I think that's reasonable. Andrei
Oct 11 2013
parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 11/10/13 23:02, Andrei Alexandrescu wrote:
 On 10/11/13 7:46 AM, Joseph Rushton Wakeling wrote:
 It's not so much that we expect it, as that we might expect that
 standard library types would _have the appropriate design work put in_
 so that they would "just work" with these qualifiers.  (Admittedly
 shared is a bit of a special case right now that probably needs more
 work before support is rolled out.)

 If you tell me that's an unreasonable expectation then fair enough, but
 it feels pretty bad if e.g. library-implemented number types (big
 integers or floats, rationals, complex numbers, ...) can't from a user
 perspective behave exactly like their built-in counterparts.
I think that's reasonable.
Good :-) It's probably clear from discussion that I don't have a sufficient theoretical overview to immediately address what needs to be done here without help, but if anyone is willing to provide some guidance and instruction, I'm happy to try and do the legwork on std.bigint to bring it up to speed in this respect.
Oct 11 2013
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, October 11, 2013 18:32:15 Dmitry Olshansky wrote:
 11-Oct-2013 05:21, Andrei Alexandrescu пишет:
 Yes. Data structures that can be shared are ALWAYS designed specifically
 for sharing, unless of course it's a trivial type like int.
This. And exactly the same for immutable. It's interesting how folks totally expect complex types (like containers) to meaningfully work with all 3 qualifiers.
That's part of the point. Most stuff can't work with shared. That's why you're forced to cast away shared in many cases. Yes, designing a class specifically to function as shared makes sense some of the time (e.g. concurrent containers), but should I have to create a synchronized class just to wrap a normal type that I happen to want to use as shared in some of my code? That seems like overkill to me, and it forces you to put everything in member functions (if you want to avoid casting away shared anywhere), because it's only inside the member functions that the top level of shared is removed for you (and simply removing the top level of shared doesn't work for more complex objects anyway, thereby still forcing a cast). That makes using shared a royal pain. It's just far cleaner IMHO to protect the shared variable with a lock and cast away shared to operate on it.

- Jonathan M Davis
Oct 11 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
12-Oct-2013 00:14, Jonathan M Davis пишет:
 On Friday, October 11, 2013 18:32:15 Dmitry Olshansky wrote:
 11-Oct-2013 05:21, Andrei Alexandrescu пишет:
 Yes. Data structures that can be shared are ALWAYS designed specifically
 for sharing, unless of course it's a trivial type like int.
This. And exactly the same for immutable. It's interesting how folks totally expect complex types (like containers) to meaningfully work with all 3 qualifiers.
That's part of the point. Most stuff can't work with shared. That's why you're forced to cast away shared in many cases. Yes, designing a class specifically to function as shared makes sense some of the time (e.g. concurrent containers), but should I have to create a synchronized class just to wrap a normal type that I happen to want to use as shared in some of my code?
There is not much stuff that needs to be shared. And piles of casts do not inspire confidence at all. If anything, having a centralized point (e.g. a wrapper class, as you mention) for code that deals with concurrency and locking is almost always a plus.
 That
 seems like overkill to me, and it forces you to put everything in member
 functions (if you want to avoid casting away shared anywhere), because it's
 only inside the member functions that top level of shared is removed for you
 (and simply removing the top level shared doesn't work for more complex
 objects anyway, thereby still forcing a cast). That makes using shared a royal
 pain.
shared needs some library-side help, that's true.
 It's just far cleaner IMHO to protect the shared variable with a lock
 and cast away shared to operate on it.
Then I respectfully disagree. -- Dmitry Olshansky
Oct 11 2013
prev sibling next sibling parent Robert Schadek <realburner gmx.de> writes:
On 10/10/2013 09:33 AM, Jonathan M Davis wrote:
 I honestly don't think we can solve it a different way without completely 
 redesigning shared. shared is specifically designed such that you have to 
 either cast it away to do anything with it or write all of your code to 
 explicitly work with shared, which is not something that generally makes sense 
 to do unless you're creating a type whose only value is in being shared across 
 threads. Far more frequently, you want to share a type which you would also 
 use normally as a thread-local variable, and that means casting.

 - Jonathan M Davis
+1
Oct 10 2013
prev sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-10 06:41:19 +0000, Jacob Carlborg <doob me.com> said:

 On 2013-10-10 06:24, Jonathan M Davis wrote:
 
 So, it's my impression that being able to consider casting to or from shared
 as abnormal in code which uses shared is a bit of a pipe dream at this point.
 The current language design pretty much requires casting when doing much of
 anything with concurrency.
There must be a better way to solve this.
http://michelf.ca/blog/2012/mutex-synchonization-in-d/ -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Oct 10 2013
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-10-10 13:17, Michel Fortin wrote:

 http://michelf.ca/blog/2012/mutex-synchonization-in-d/
I think I like the idea, but won't you have the same problem as Jonathan described? You can't pass these variables to another function that doesn't expect to be passed a synchronized variable. You can pass them to pure functions, which means you can probably pass them to a couple more functions compared to using "shared".

--
/Jacob Carlborg
Oct 10 2013
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-10 13:02:14 +0000, Jacob Carlborg <doob me.com> said:

 On 2013-10-10 13:17, Michel Fortin wrote:
 
 http://michelf.ca/blog/2012/mutex-synchonization-in-d/
I think I like the idea, but won't you have the same problem as Jonathan described? You can't pass these variables to another function that doesn't expect it to be passed a synchronized variable. You can pass it to pure functions which mean you can probably pass it to a couple of more functions compared to using "shared".
Well, it's one piece of the puzzle. In itself it already is better than having to cast every time. Combined with a way to pass those variables to other functions safely, it should solve practically all the remaining problems that currently require a cast. But I don't have a nice solution for the latter problem (short of adding more attributes).

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Oct 10 2013
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-10-10 13:17, Michel Fortin wrote:

 http://michelf.ca/blog/2012/mutex-synchonization-in-d/
This looks similar to what I described here: http://forum.dlang.org/thread/bsqqfmhgzntryyaqrtky forum.dlang.org?page=19#post-l35rql:24og2:241:40digitalmars.com -- /Jacob Carlborg
Oct 10 2013
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-10 13:03:37 +0000, Jacob Carlborg <doob me.com> said:

 On 2013-10-10 13:17, Michel Fortin wrote:
 
 http://michelf.ca/blog/2012/mutex-synchonization-in-d/
This looks similar to what I described here: http://forum.dlang.org/thread/bsqqfmhgzntryyaqrtky forum.dlang.org?page=19#post-l35rql:24og2:241:40digitalmars.com
Somewhat similar. I don't think it's a good practice to make a mutex part of the public interface of an object (or any public interface for that matter), which is why they're kept private inside the class in my examples. Public mutexes can be locked from anywhere in your program, they lack encapsulation and this makes them prone to deadlocks.

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Oct 10 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-10-10 16:14, Michel Fortin wrote:

 similar. I don't think it's a good practice to make a mutex part of the
 public interface of an object (or any public interface for that matter),
 which is why they're kept private inside the class in my examples.
 Public mutexes can be locked from anywhere in your program, they lack
 encapsulation and this makes them prone to deadlocks.
Right, it's better to keep them private. -- /Jacob Carlborg
Oct 10 2013
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Oct 10, 2013, at 4:17 AM, Michel Fortin <michel.fortin michelf.ca> wrote:

 On 2013-10-10 06:41:19 +0000, Jacob Carlborg <doob me.com> said:

 On 2013-10-10 06:24, Jonathan M Davis wrote:
 So, it's my impression that being able to consider casting to or from shared
 as abnormal in code which uses shared is a bit of a pipe dream at this point.
 The current language design pretty much requires casting when doing much of
 anything with concurrency.
There must be a better way to solve this.
http://michelf.ca/blog/2012/mutex-synchonization-in-d/
Good article. But why didn't you mention core.sync? It has both a Mutex and a ReadWriteMutex (ie. shared_mutex).
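A quick sketch of those core.sync types in action; the guarded table and the function names are invented for the example:

import core.sync.rwmutex : ReadWriteMutex;

__gshared ReadWriteMutex gRW;
__gshared int[string] gConfig;

shared static this() { gRW = new ReadWriteMutex; }

int lookup(string key)
{
    // Many concurrent readers are allowed.
    synchronized (gRW.reader)
    {
        auto p = key in gConfig;
        return p ? *p : 0;
    }
}

void update(string key, int value)
{
    // Writers get exclusive access.
    synchronized (gRW.writer)
    {
        gConfig[key] = value;
    }
}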
Oct 10 2013
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-10 17:34:47 +0000, Sean Kelly <sean invisibleduck.org> said:

 On Oct 10, 2013, at 4:17 AM, Michel Fortin <michel.fortin michelf.ca>
 wrote:
 
 http://michelf.ca/blog/2012/mutex-synchonization-in-d/
Good article. But why didn't you mention core.sync? It has both a Mutex and a ReadWriteMutex (ie. shared_mutex).
Because that would have required a ton of explanations about why you need casts everywhere to remove shared, and I don't even know where to begin to explain shared semantics. Shared just doesn't make sense to me the way it works right now. The examples in C++ are much clearer than anything I could have done in D2. I don't want to have to explain why I have to bypass the type system every time I need to access a variable.

I'll add that I'm coding in C++ right now so it's much easier to come up with C++ examples.

That said, it might be a good idea to add a note at the end about core.sync in case someone wants to try that technique in D.

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Oct 10 2013
prev sibling parent "Dicebot" <public dicebot.lv> writes:
On Thursday, 10 October 2013 at 04:24:31 UTC, Jonathan M Davis 
wrote:
 Also, casting _away_ shared is going to be a very common 
 operation due to
 how shared works.
It is yet another use case for the `scope` storage class. Locking a `shared` variable via a mutex should return the same variable but cast to non-shared `scope` (somewhere inside the locking library function). Then it is safe to pass it to functions accepting scope parameters, as the reference can't possibly escape.
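A sketch of what such a library hook might look like; this withLock helper is hypothetical, and since `scope` is not yet enforced, the no-escape guarantee is still only by convention:

import core.sync.mutex : Mutex;

void withLock(T)(shared T obj, Mutex m, scope void delegate(T) dg)
    if (is(T == class))
{
    m.lock();
    scope (exit) m.unlock();
    // Hand out an unshared view that is only valid while the lock is held.
    dg(cast(T) obj);
}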
Oct 10 2013
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Oct 9, 2013, at 9:24 PM, Jonathan M Davis <jmdavisProg gmx.com> wrote:

 And given that std.concurrency requires casting to and from shared or
 immutable in order to pass objects across threads, it seems like most of D's
 concurrency model requires casting to and/or from shared or immutable.
std.concurrency won't be this way forever though. We could fake move semantics with something like assumeUnique!T, so send could be modified to accept a non-shared class that's marked as Unique. The other option would be deep copying or serialization.
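A sketch of what that could look like; this Unique wrapper and the relaxed send are hypothetical, not part of std.concurrency today:

struct Unique(T) if (is(T == class))
{
    private T payload;

    // Take ownership and null the source, faking a move.
    this(ref T src)
    {
        payload = src;
        src = null;
    }

    // The receiving thread extracts the payload exactly once.
    T release()
    {
        auto p = payload;
        payload = null;
        return p;
    }
}

// A modified send could then accept Unique!T even though T is unshared,
// because the sender provably kept no usable reference behind.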
Oct 10 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/10/2013 10:27 AM, Sean Kelly wrote:
 [...]
Sean - whatever means you're using to reply breaks the thread.
Oct 10 2013
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Oct 10, 2013, at 12:11 PM, Walter Bright <newshound2 digitalmars.com> wrote:

 On 10/10/2013 10:27 AM, Sean Kelly wrote:
 [...]
Sean - whatever means you're using to reply breaks the thread.
The mailing list Brad set up--I can't do NNTP from most locations. I guess I'll use the website.
Oct 10 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/10/2013 12:38 PM, Sean Kelly wrote:
 On Oct 10, 2013, at 12:11 PM, Walter Bright <newshound2 digitalmars.com> wrote:

 On 10/10/2013 10:27 AM, Sean Kelly wrote:
 [...]
Sean - whatever means you're using to reply breaks the thread.
The mailing list Brad set up--I can't do NNTP from most locations. I guess I'll use the website.
I'm curious why NNTP would be blocked. I've been able to access it from any wifi hotspots I've tried it from.
Oct 10 2013
parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Thursday, 10 October 2013 at 20:50:10 UTC, Walter Bright wrote:
 I'm curious why NNTP would be blocked. I've been able to access 
 it from any wifi hotspots I've tried it from.
My only guess is that usenet may be perceived as an illegal file sharing resource. But it's been a while since I've tried, so I'll give it another shot.
Oct 10 2013
parent "Sean Kelly" <sean invisibleduck.org> writes:
On Thursday, 10 October 2013 at 21:15:39 UTC, Sean Kelly wrote:
 On Thursday, 10 October 2013 at 20:50:10 UTC, Walter Bright 
 wrote:
 I'm curious why NNTP would be blocked. I've been able to 
 access it from any wifi hotspots I've tried it from.
My only guess is that usenet may be perceived as a illegal file sharing resource. But it's been a while since I've tried, so I'll give it another shot.
No luck. I can get to the SSL port (563) on usenet hosts that offer it, but not port 119 from anywhere I care to check news from.
Oct 10 2013
prev sibling next sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Tue, 08 Oct 2013 19:22:34 -0700
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 
 It's clear that the perception of GC will not change soon, however
 good or not the arguments may be as applied to various situations and 
 projects. It is also a reality that our GC is slow.
 
 So we need to attack this problem from multiple angles:
 
 * Make RefCounted work as immutable data. There will be a casting
 away of immutability in there, but since it's under the aegis of the
 standard library, it is admissible. All implementations must ensure
 RefCounted works.
 
 * Add reference counted slice and associative array types that are as 
 close as humanly possible to the built-in ones.
 
 * Advertise all of the above in a top module such as std.refcounted. 
 It's amazing how many D programmers have no idea RefCounted even
 exists.
 
 * Review all of Phobos for hidden allocations and make appropriate 
 decisions on how to give the user more control over allocation. I
 have some idea on how that can be done, at various disruption costs.
 
 * Get Robert Schadek's precise GC in. Walter and I have become 101% 
 convinced a precise GC is the one way to go about GC.
 
 I'm up to my neck in D work for Facebook so my time is well spent. We 
 must definitely up the ante in terms of development speed, and pronto.
 
Unless the eventual allocator stuff would make special refcounted slices/AAs redundant (which actually wouldn't be too terrible, really), I think all that sounds great. Add in a simple way to do full closures without GC allocation and it'd be a big win.

However, even with all that, I still feel it would be in D's best interest to have a trivial[1] way to guarantee no GC allocations *at least* on a whole-program level, if not also on a finer-grained basis.

*Even if* it's successfully argued that "no GC" is rarely, if ever, a valid requirement (which I find questionable anyway), it's *still* an important feature for D *just because* there's so much demand for it. When your goal is to successfully market something, there *is* value in fulfilling a common need, even if it's just a perceived need. (But again, I'm not claiming "no GC" to be an imaginary need. Just saying that it really doesn't matter if it's imagined or real. The mere demand for it makes it a legitimate concern.)

[1] "Trivial" == Doesn't require the user to modify or replace anything in druntime or phobos, and all violations are automatically caught at compile time.
Oct 08 2013
prev sibling next sibling parent reply "JR" <zorael gmail.com> writes:
On Wednesday, 9 October 2013 at 02:22:35 UTC, Andrei Alexandrescu 
wrote:
 * Get Robert Schadek's precise GC in. Walter and I have become 
 101% convinced a precise GC is the one way to go about GC.
An orthogonal question, but is Lucarella's CDGC (still) being ported? There's nothing mutually exclusive between a precise and a concurrent gc, no?
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/9/2013 1:59 AM, JR wrote:
 On Wednesday, 9 October 2013 at 02:22:35 UTC, Andrei Alexandrescu wrote:
 * Get Robert Schadek's precise GC in. Walter and I have become 101% convinced
 a precise GC is the one way to go about GC.
An orthogonal question, but is Lucarella's CDGC (still) being ported? There's nothing mutually exclusive between a precise and a concurrent gc, no?
I thought that got stuck on the problem that the linux system libraries had some sort of threading problem with them.
Oct 09 2013
parent reply "Leandro Lucarella" <leandro.lucarella sociomantic.com> writes:
On Wednesday, 9 October 2013 at 09:01:12 UTC, Walter Bright wrote:
 On 10/9/2013 1:59 AM, JR wrote:
 On Wednesday, 9 October 2013 at 02:22:35 UTC, Andrei 
 Alexandrescu wrote:
 * Get Robert Schadek's precise GC in. Walter and I have 
 become 101% convinced
 a precise GC is the one way to go about GC.
An orthogonal question, but is Lucarella's CDGC (still) being ported? There's nothing mutually exclusive between a precise and a concurrent gc, no?
I thought that got stuck on the problem that the linux system libraries had some sort of threading problem with them.
This is not really what's stopping the porting; it is a problem, but an independent one. My idea was to port the GC as it is in Tango, and then see how to overcome its limitations. The problem is it's very hard for me to dedicate time to this porting effort. The decision to make the port happen is there, I just have to figure out how and when. I'll keep you posted when I have news.
Oct 11 2013
parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Friday, 11 October 2013 at 09:56:10 UTC, Leandro Lucarella 
wrote:
 This is not really what's stopping the porting, is a problem, 
 but an independent one. My idea was to port the GC as it is in 
 Tango, and then see how to overcome its limitations.
I tried this a while back, but the GC in Druntime has changed a bunch since it diverged from Tango, and some or all of those changes need to be applied to your concurrent collector before it can be used. Like you, I ended up not having the time for this. I think you'd need to do a diff of the current GC code vs. the code as it was originally checked into SVN on DSource.
Oct 11 2013
parent "Leandro Lucarella" <leandro.lucarella sociomantic.com> writes:
On Friday, 11 October 2013 at 17:58:07 UTC, Sean Kelly wrote:
 On Friday, 11 October 2013 at 09:56:10 UTC, Leandro Lucarella 
 wrote:
 This is not really what's stopping the porting, is a problem, 
 but an independent one. My idea was to port the GC as it is in 
 Tango, and then see how to overcome its limitations.
I tried this a while back, but the GC in Druntime has changed a bunch since it diverged from Tango, and some or all of those changed need to be applied to your concurrent collector before it can be used. Like you, I ended up not having the time for this. I think you'd need to do a diff of the current GC code vs. the code as it was originally checked into SVN on DSource.
My plan was actually to redo all the patches I did to Tango (which were A LOT) on druntime master, adapting them as I go (I even started doing that but got busy after just 2 or 3 :S). I hope I can resume the work soon!
Oct 20 2013
prev sibling parent reply Robert Schadek <realburner gmx.de> writes:
On 10/09/2013 04:22 AM, Andrei Alexandrescu wrote:
 * Get Robert Schadek's precise GC in. Walter and I have become 101%
 convinced a precise GC is the one way to go about GC.
I would like to claim that work, but Rainer Schütze wrote that. I know, both German, same initials.
Oct 09 2013
next sibling parent "ponce" <contact gam3sfrommars.fr> writes:
On Wednesday, 9 October 2013 at 09:29:53 UTC, Robert Schadek 
wrote:
 On 10/09/2013 04:22 AM, Andrei Alexandrescu wrote:
 * Get Robert Schadek's precise GC in. Walter and I have become 
 101%
 convinced a precise GC is the one way to go about GC.
I would like to claim that work, but the Rainer Schütze wrote that. I know, both german, same initials.
From a Levenshtein distance point of view, it's almost your work.
Oct 09 2013
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/9/13 2:29 AM, Robert Schadek wrote:
 On 10/09/2013 04:22 AM, Andrei Alexandrescu wrote:
 * Get Robert Schadek's precise GC in. Walter and I have become 101%
 convinced a precise GC is the one way to go about GC.
I would like to claim that work, but the Rainer Schütze wrote that. I know, both german, same initials.
Apologies to both! Andrei
Oct 09 2013
prev sibling next sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Tuesday, 8 October 2013 at 16:22:25 UTC, Dicebot wrote:
 On Tuesday, 8 October 2013 at 15:43:46 UTC, ponce wrote:
 Is there a plan to have a standard counter-attack to that kind 
 of overblown problems?
 It could be just a solid blog post or a  nogc feature.
It is not overblown. It is simply " nogc" which is lacking but absolutely mandatory. The amount of hidden language allocations makes manually cleaning code of those via runtime asserts completely unreasonable for a real project.
Please no more attributes. What next, nomalloc? Making sure your code doesn't allocate isn't that difficult.
Oct 08 2013
parent reply "Dicebot" <public dicebot.lv> writes:
On Tuesday, 8 October 2013 at 19:38:22 UTC, Peter Alexander wrote:
 Making sure your code doesn't allocate isn't that difficult.
What would you use for that? It is not difficult, it is unnecessarily (and considerably) time-consuming.
Oct 08 2013
next sibling parent "ponce" <contact gmsfrommars.fr> writes:
 What would you use for that? It is not difficult, it is 
 unnecessary (and considerably) time-consuming.
It's likely allocations would show up in a profiler, since GC collections are triggered by them. But I haven't tested it.
Oct 08 2013
prev sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Tuesday, 8 October 2013 at 20:44:55 UTC, Dicebot wrote:
 On Tuesday, 8 October 2013 at 19:38:22 UTC, Peter Alexander 
 wrote:
 Making sure your code doesn't allocate isn't that difficult.
What would you use for that? It is not difficult, it is unnecessary (and considerably) time-consuming.
Just learn where allocations occur and avoid them during development. This leaves you only with accidental or otherwise unexpected allocations.

For the accidental allocations, these will come up during profiling (which is done anyway in a performance-sensitive program). The profiler gives you the call stack, so these are trivial to spot and remove. There are also several other ways to spot allocations (modify druntime to log on allocation, set a breakpoint in the GC using a debugger, etc.) although I don't do these.

You say it is time-consuming. In my experience it isn't. General profiling and performance tuning are more time-consuming.

You may argue that profiling won't always catch accidental allocations due to test coverage. This is true, but then  nogc is only a partial fix to this anyway. It will catch GC allocations, but what about accidental calls to malloc, mmap, or maybe an accidental IO call due to some logging you forgot to remove? GC allocations are just one class of performance problems, there are many more and I hope we don't have to add attributes for them all.
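A minimal sketch of that runtime-check idea, assuming a druntime that exposes GC.stats in core.memory (which only landed in releases later than this thread; the helper name here is made up):

import core.memory : GC;

// Fails loudly if 'run' touches the GC heap. Note that a collection running
// in between could mask an allocation; good enough for a debugging session.
void assertNoGCAlloc(scope void delegate() run, string what)
{
    immutable before = GC.stats().usedSize;
    run();
    immutable after = GC.stats().usedSize;
    assert(after == before, what ~ " allocated from the GC heap");
}

void main()
{
    int[4] buf;
    assertNoGCAlloc({ buf[] = 1; }, "stack fill");        // passes
    // assertNoGCAlloc({ auto a = new int[4]; }, "new");  // would fail
}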
Oct 08 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/8/2013 3:02 PM, Peter Alexander wrote:
 You may argue that profiling won't always catch accidental allocations due to
 test coverage. This is true, but then  nogc is only a partial fix to this
 anyway. It will catch GC allocations, but what about accidental calls to
malloc,
 mmap, or maybe an accidental IO call due to some logging you forgot to remove.
 GC allocations are just one class of performance problems, there are many more
 and I hope we don't have to add attributes for them all.
This, of course, is the other problem with nogc. Having a forest of attributes on otherwise ordinary functions is awfully ugly.
Oct 08 2013
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Oct 8, 2013, at 3:38 PM, Walter Bright <newshound2 digitalmars.com> wrote:

 On 10/8/2013 3:02 PM, Peter Alexander wrote:
 You may argue that profiling won't always catch accidental allocations due to
 test coverage. This is true, but then  nogc is only a partial fix to this
 anyway. It will catch GC allocations, but what about accidental calls to malloc,
 mmap, or maybe an accidental IO call due to some logging you forgot to remove?
 GC allocations are just one class of performance problems, there are many more
 and I hope we don't have to add attributes for them all.

 This, of course, is the other problem with  nogc. Having a forest of attributes on otherwise ordinary functions is awfully ugly.

And we already have a forest of attributes on otherwise ordinary functions.
Oct 08 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/8/13 4:22 PM, Sean Kelly wrote:
 On Oct 8, 2013, at 3:38 PM, Walter Bright <newshound2 digitalmars.com> wrote:

 On 10/8/2013 3:02 PM, Peter Alexander wrote:
 You may argue that profiling won't always catch accidental allocations due to
 test coverage. This is true, but then  nogc is only a partial fix to this
 anyway. It will catch GC allocations, but what about accidental calls to
malloc,
 mmap, or maybe an accidental IO call due to some logging you forgot to remove.
 GC allocations are just one class of performance problems, there are many more
 and I hope we don't have to add attributes for them all.
This, of course, is the other problem with nogc. Having a forest of attributes on otherwise ordinary functions is awfully ugly.
And we already have a forest of attributes on otherwise ordinary functions.
It's the cost of expressiveness. Exercising deduction wherever possible is the cure. Andrei
Oct 08 2013
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 10/09/2013 05:38 AM, Andrei Alexandrescu wrote:
 On 10/8/13 4:22 PM, Sean Kelly wrote:
 ...

 And we already have a forest of attributes on otherwise ordinary
 functions.
It's the cost of expressiveness. Exercising deduction wherever possible is the cure. Andrei
The cost of expressiveness certainly is not tedious syntax. What he outlines is a cost of lacking expressiveness. (Eg. it is impossible to abstract over lists of built-in attributes.)
Oct 09 2013
prev sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Tuesday, 8 October 2013 at 23:22:54 UTC, Sean Kelly wrote:
 On Oct 8, 2013, at 3:38 PM, Walter Bright
 This, of course, is the other problem with  nogc. Having a 
 forest of attributes on otherwise ordinary functions is 
 awfully ugly.
And we already have a forest of attributes on otherwise ordinary functions.
I don't understand why there is such reluctance to have many attributes. I'd gladly accept a language with literally hundreds of those if they are orthogonal and useful. That is the very point of using a strongly typed language - making the compiler verify as many assumptions as possible for you. The key problem here is not the amount of attributes but that they are opt-in, not opt-out.
Oct 09 2013
next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Oct 9, 2013, at 5:48 AM, "Dicebot" <public dicebot.lv> wrote:

 On Tuesday, 8 October 2013 at 23:22:54 UTC, Sean Kelly wrote:
 On Oct 8, 2013, at 3:38 PM, Walter Bright
 This, of course, is the other problem with  nogc. Having a forest of attributes on otherwise ordinary functions is awfully ugly.

 And we already have a forest of attributes on otherwise ordinary functions.

 I don't understand why there is such reluctance to have many attributes. I'd gladly accept a language with literally hundreds of those if they are orthogonal and useful. That is the very point of using a strongly typed language - making the compiler verify as many assumptions as possible for you. The key problem here is not the amount of attributes but that they are opt-in, not opt-out.

They aren't opt-out for really any reasonable project though, because code is reused and those people may want at least the standard attributes to be set. Personally, the array of attributes that can be applied to a D function is one of my biggest pet peeves with the language. It gains me nothing personally, and adds a lot of extra thought to the process of writing a function.
Oct 09 2013
parent reply "Dicebot" <public dicebot.lv> writes:
On Wednesday, 9 October 2013 at 13:57:03 UTC, Sean Kelly wrote:
 They aren't opt-out for really any reasonable project though, 
 because code is reused and those people may want at least the 
 standard attributes to be set. Personally, the array of 
 attributes that can be applied to a D function is one of my 
 biggest pet peeves with the language. It gains me nothing 
 personally, and adds a lot of extra thought to the process of 
 writing a function.
This is exactly what I was speaking about. It would have been much easier if stuff was `pure safe immutable nothrow` by default and one added `dirty system mutable throw` on a per-need basis after getting a compiler error. But it is too late to change that, and attribute inference may be the only reasonable option.
Oct 09 2013
next sibling parent "Tourist" <gravatar gravatar.com> writes:
On Wednesday, 9 October 2013 at 14:11:46 UTC, Dicebot wrote:
 On Wednesday, 9 October 2013 at 13:57:03 UTC, Sean Kelly wrote:
 They aren't opt-out for really any reasonable project though, 
 because code is reused and those people may want at least the 
 standard attributes to be set. Personally, the array of 
 attributes that can be applied to a D function is one of my 
 biggest pet peeves with the language. It gains me nothing 
 personally, and adds a lot of extra thought to the process of 
 writing a function.
This is exactly what I was speaking about. It would have been much more easy if stuff was `pure safe immutable nothrow` by default and one added `dirty system mutable throw` on per-need basis after getting compiler error. But that is too late to change and this attribute inference may be only reasonable option.
Maybe it's worth introducing "pure: safe: immutable: nothrow:" at the top of every module as the new recommended design pattern. Will it work? Then it may go through a deprecation phase, e.g. omitting it at the top of the module becomes a warning, then an error, then it becomes the default.
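For illustration, here is what that pattern looks like with the label syntax that exists today (module and function names made up; immutable: is left out since it doesn't apply to free functions):

module fastpath;

pure: @safe: nothrow:   // everything below defaults to these attributes

int square(int x) { return x * x; }   // checked as pure @safe nothrow
int twice(int x)  { return x + x; }   // likewise

// The catch, raised below: only @safe has an inverse (@system); there is
// no "impure" or "throws" to opt a later function back out.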
Oct 09 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/9/2013 7:11 AM, Dicebot wrote:
 This is exactly what I was speaking about. It would have been much more easy if
 stuff was `pure  safe immutable nothrow` by default and one added `dirty
 system
 mutable throw` on per-need basis after getting compiler error. But that is too
 late to change and this attribute inference may be only reasonable option.
I would generally agree with that.
Oct 09 2013
prev sibling parent reply dennis luehring <dl.soluz gmx.net> writes:
Am 09.10.2013 16:11, schrieb Dicebot:
 On Wednesday, 9 October 2013 at 13:57:03 UTC, Sean Kelly wrote:
 They aren't opt-out for really any reasonable project though,
 because code is reused and those people may want at least the
 standard attributes to be set. Personally, the array of
 attributes that can be applied to a D function is one of my
 biggest pet peeves with the language. It gains me nothing
 personally, and adds a lot of extra thought to the process of
 writing a function.
This is exactly what I was speaking about. It would have been much more easy if stuff was `pure safe immutable nothrow` by default and one added `dirty system mutable throw` on per-need basis after getting compiler error. But that is too late to change and this attribute inference may be only reasonable option.
I wouldn't say that - let's talk about D3 or even D4 - with breaking changes, or else language development will come to a full stop in a few months/years. If something is totally wrong it needs to be addressed and changed; developers will follow.
Oct 09 2013
parent reply "Craig Dillabaugh" <craig.dillabaugh gmail.com> writes:
On Wednesday, 9 October 2013 at 18:44:16 UTC, dennis luehring 
wrote:
 Am 09.10.2013 16:11, schrieb Dicebot:
 On Wednesday, 9 October 2013 at 13:57:03 UTC, Sean Kelly wrote:
 They aren't opt-out for really any reasonable project though,
 because code is reused and those people may want at least the
 standard attributes to be set. Personally, the array of
 attributes that can be applied to a D function is one of my
 biggest pet peeves with the language. It gains me nothing
 personally, and adds a lot of extra thought to the process of
 writing a function.
This is exactly what I was speaking about. It would have been much more easy if stuff was `pure safe immutable nothrow` by default and one added `dirty system mutable throw` on per-need basis after getting compiler error. But that is too late to change and this attribute inference may be only reasonable option.
i wouldn't say that - lets talk about D3 or even D4 - with breaking changes or else the language development will come to a full stop in a few months/years, if something is totaly wrong it needs to be addressed and changed, developers will follow
Would it be possible to specify at the file level (or for a section of the file) a set of default attributes, and then override those for individual functions if need be? I am not sure what the syntax would look like, but say it would be:

attributes  safe pure nothrow

Then every function has these attributes, unless otherwise specified. You could even possibly use it more than once to divide up your code into sections with common attributes:

attributes  safe pure nothrow
//Everything defined here is  safe pure nothrow

attributes  system
//Everything defined below this invocation is  system by default
//and so forth
Oct 09 2013
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 9 October 2013 at 20:07:38 UTC, Craig Dillabaugh 
wrote:
 //Everything defined here is  safe pure nothrow

 attributes  system
You can do that now with colons:

 system:
stuff....

The problem is there's no way to turn some of them off. For safe, there's system, but there's no "throws" for nothrow, no "impure", no "virtual", etc.

This btw is another trivially easy addition that's been talked about for a while that should just be done for the next release.
Oct 09 2013
parent "eles" <eles eles.com> writes:
On Wednesday, 9 October 2013 at 20:37:40 UTC, Adam D. Ruppe wrote:
 On Wednesday, 9 October 2013 at 20:07:38 UTC, Craig Dillabaugh 
 wrote:
 //Everything defined here is  safe pure nothrow
The problem is there's no way to turn some of them off. For safe, there's system, but there's no "throws" for nothrow, no "impure", no "virtual", etc.
! safe ?
 This btw is another trivially easy addition that's been talked 
 about for a while that should just be done for the next release.
too much design, too little experiment?
Oct 09 2013
prev sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, October 09, 2013 06:57:00 Sean Kelly wrote:
 Personally, the array of attributes that can be applied to a D
 function is one of my biggest pet peeves with the language. It gains me
 nothing personally, and adds a lot of extra thought to the process of
 writing a function.
We'd probably do a lot better if we had better defaults (e.g. default to pure and then have an attribute for impure instead). The attributes that we have were formed around adding to what's in C/C++ rather than around minimizing the number of attributes required on your average function. But unfortunately, it's too late for that now, given how much code it would break to change it. - Jonathan M Davis
Oct 09 2013
prev sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Tuesday, 8 October 2013 at 22:02:07 UTC, Peter Alexander wrote:
 Just learn where allocations occur and avoid them during 
 development. This leaves you only with accidental or otherwise 
 unexpected allocations.
D is not expected to be a single-man-project language, is it? As I have already said, this implies spending plenty of time on carefully reviewing something that the compiler can detect.
 For the accidental allocations, these will come up during 
 profiling (which is done anyway in a performance sensitive 
 program). The profiler gives you the call stack so these are 
 trivial to spot and remove. There are also several other ways 
 to spot allocations (modify druntime to log on allocation, set 
 a breakpoint in the GC using a debugger, etc.) although I don't 
 do these.

 You say it is time consuming. In my experience it isn't. 
 General profiling and performance tuning are more time 
 consuming.
1) Profiling is not always possible.
2) Profiling is considerably more time-consuming than getting a compiler error. Time spent on this increases with team size and the number of inexperienced developers on the team.
 You may argue that profiling won't always catch accidental 
 allocations due to test coverage. This is true, but then  nogc 
 is only a partial fix to this anyway. It will catch GC 
 allocations, but what about accidental calls to malloc, mmap, 
 or maybe an accidental IO call due to some logging you forgot 
 to remove. GC allocations are just one class of performance 
 problems, there are many more and I hope we don't have to add 
 attributes for them all.
That is why I am asking about  noheap, not about  nogc - the GC here is just a side thing that makes hidden allocations easier. Though the question of defining the full set of possible system-wide allocation sources is a good one, and I don't have a reasonable answer for it.
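To make "hidden allocations" concrete, a small made-up example where every commented line compiles to a GC call even though no "new" appears anywhere:

int[] hiddenAllocations(int[] a, int[] b)
{
    auto c = a ~ b;            // array concatenation allocates a new array
    c ~= 42;                   // appending may reallocate from the GC heap
    auto dg = () => c.length;  // closure over a local: context on the GC heap
    return c;
}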
Oct 09 2013
parent Piotr Szturmaj <bncrbme jadamspam.pl> writes:
On 09.10.2013 14:41, Dicebot wrote:
 On Tuesday, 8 October 2013 at 22:02:07 UTC, Peter Alexander wrote:
 Just learn where allocations occur and avoid them during development.
 This leaves you only with accidental or otherwise unexpected allocations.
D is not expected to be a single man project language, isn't it? As I have already said, this implies spending plenty of time on careful reviewing of something that compiler can detect.
+1 We are humans that make mistakes. This is normal. Compilers are here to help us talk to the computers.
Oct 09 2013
prev sibling next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 8 October 2013 at 16:22:25 UTC, Dicebot wrote:
 On Tuesday, 8 October 2013 at 15:43:46 UTC, ponce wrote:
 Is there a plan to have a standard counter-attack to that kind 
 of overblown problems?
 It could be just a solid blog post or a  nogc feature.
It is not overblown.
I'm certain that most people complaining about it absolutely do not have constraints that eliminate the possibility of using a GC.
Oct 08 2013
parent "Denis Koroskin" <2korden gmail.com> writes:
On Tuesday, 8 October 2013 at 20:44:56 UTC, deadalnix wrote:
 On Tuesday, 8 October 2013 at 16:22:25 UTC, Dicebot wrote:
 On Tuesday, 8 October 2013 at 15:43:46 UTC, ponce wrote:
 Is there a plan to have a standard counter-attack to that 
 kind of overblown problems?
 It could be just a solid blog post or a  nogc feature.
It is not overblown.
I'm certain that most people complaining about it absolutely do not have the constraint that eliminate the possibility to use a GC.
One interesting detail which is rarely explored is that you can "disable" the GC for a portion of a program without actually disabling the entire GC.

In my current project I have a couple of threads that must never stop, especially not for a garbage collection. These threads manage their memory manually and don't rely on the GC (we use reference counting to make sure we don't leak). The rest of the threads, however, make heavy use of the GC, and communicate with the "real-time" threads using message passing.

The trick is to spawn *native* threads (i.e. not core.thread.Thread threads) so that when stop-the-world occurs, thread_suspendAll() will not suspend those threads (because it won't even know about them!). A per-thread allocator will hopefully make things even simpler.
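A minimal POSIX-only sketch of that trick (function names made up): the thread below is created with pthread_create directly, so druntime never registers it and thread_suspendAll() leaves it running during collections. The flip side is that it must never touch GC-managed memory, since the GC won't scan its stack either.

import core.sys.posix.pthread;
import core.stdc.stdio : printf;

// Runs outside the GC's knowledge: no GC allocations, no GC pointers.
extern (C) void* realtimeLoop(void*)
{
    foreach (i; 0 .. 5)
        printf("tick %d\n", i);   // C I/O only
    return null;
}

void main()
{
    pthread_t tid;
    pthread_create(&tid, null, &realtimeLoop, null);  // invisible to druntime
    pthread_join(tid, null);
}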
Oct 08 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/8/2013 9:22 AM, Dicebot wrote:
 It is simply " nogc" which is lacking but absolutely
 mandatory.
Adding nogc is fairly simple. The trouble, though, is (like purity) it is transitive. Every function an nogc function calls will also have to be nogc. This will entail a great deal of work updating phobos/druntime to add those annotations.
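To make the transitivity concrete, a small sketch written with  nogc as it eventually shipped in D 2.066, after this thread (function names made up):

@nogc int twice(int x) { return 2 * x; }           // leaf: provably GC-free

@nogc int four(int x) { return twice(twice(x)); }  // ok: every callee is @nogc

int joined(int[] a, int[] b) { return cast(int) (a ~ b).length; } // GC concat

// @nogc int bad(int[] a, int[] b) { return joined(a, b); }
// error: a @nogc function cannot call the non-@nogc function 'joined'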
Oct 08 2013
next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 8 October 2013 at 22:37:28 UTC, Walter Bright wrote:
 Every function an  nogc function calls will also have to be 
  nogc.
Eh, not necessarily. If it expands to static assert(!__traits(hasAnnotationRecursive, uses_gc));, then the only ones that *need* to be marked are the lowest level ones. Then it figures out the rest only on demand.

Then, on the function you care about as a user, you say nogc and it tells you if you called anything and the static assert stacktrace tells you where it happened.

Of course, to be convenient to use, phobos would need to offer non-allocating functions, which is indeed a fair amount of work, but they wouldn't *necessarily* have to have the specific attribute.
Oct 08 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/8/2013 3:45 PM, Adam D. Ruppe wrote:
 On Tuesday, 8 October 2013 at 22:37:28 UTC, Walter Bright wrote:
 Every function an  nogc function calls will also have to be  nogc.
Eh, not necessarily. If it expands to static assert(!__traits(hasAnnotationRecursive, uses_gc));, then the only ones that *need* to be marked are the lowest level ones. Then it figures out the rest only on demand. Then, on the function you care about as a user, you say nogc and it tells you if you called anything and the static assert stacktrace tells you where it happened. Of course, to be convenient to use, phobos would need to offer non-allocating functions, which is indeed a fair amount of work, but they wouldn't *necessarily* have to have the specific attribute.
What you're suggesting is called "interprocedural analysis" and doesn't work in a system with separate compilation (meaning that function bodies are hidden from the compiler).
Oct 08 2013
parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 8 October 2013 at 22:53:35 UTC, Walter Bright wrote:
 What you're suggesting is called "interprocedural analysis" and 
 doesn't work in a system with separate compilation (meaning 
 that function bodies are hidden from the compiler).
Eh, that's not a dealbreaker, especially with phobos and druntime where the source is always available anyway. Though, my proposed __traits could perhaps be improved to just offer two things:

__traits(getFunctionsCalled, function) - returns a tuple of all functions called. These would ideally be in the form of symbols.

__traits(functionBodyAvailable, function) - true if the source was available to the compiler.

And that's it: the rest is done in the library using existing language features. Then we can decide in library code if a missing attribute is a problem based on if the body was available or not.

Note that this isn't specific to the gc: it would provide the necessary foundation for all kinds of library extensions in the same vein as safe, with the possibility of automatic inference from the prototype + presence of body.
Oct 08 2013
prev sibling parent reply "ponce" <contact gmsfrommars.fr> writes:
On Tuesday, 8 October 2013 at 22:45:51 UTC, Adam D. Ruppe wrote:
 Eh, not necessarily. If it expands to static 
 assert(!__traits(hasAnnotationRecursive, uses_gc));, then the 
 only ones that *need* to be marked are the lowest level ones. 
 Then it figures out the rest only on demand.

 Then, on the function you care about as a user, you say nogc 
 and it tells you if you called anything and the static assert 
 stacktrace tells you where it happened.

 Of course, to be convenient to use, phobos would need to offer 
 non-allocating functions, which is indeed a fair amount of 
 work, but they wouldn't *necessarily* have to have the specific 
 attribute.
But is it even necessary? There isn't a great deal of evidence that someone interested in optimization will be blocked on this particular problem, like Peter Alexander said. GC hassle is quite common but not that big a deal:

- Manu: "Consequently, I avoid the GC in D too, and never had any major problems, only inconvenience." http://www.reddit.com/r/programming/comments/1nxs2i/the_state_of_rust_08/ccnefe7
- Dav1d: said he never had a GC problem with BRala (minecraft client)
- Me: I had a small ~100ms GC pause in one of my games every 20 minutes; more often than not I don't notice it

So a definitive written rebuttal we can link to would perhaps be helpful.
Oct 08 2013
next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 8 October 2013 at 22:58:02 UTC, ponce wrote:
 But is it even necessary?
It is nice to have stdlib functions available that can be used anywhere. For std.algorithm, Andrei has said if you ever implement an algorithm by hand, it means the library has failed. But there are two places where that falls short (IMO at least): using it without allocating, and using it without bringing in a bazillion dependencies. The latter is a bigger problem to me personally than the former (this is why simpledisplay.d has its own implementation of to and split, for example; doing it myself instead of importing phobos cut the compile time and exe size both in half), but a lot more people complain about the gc...
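As an illustration of the kind of hand-rolled, dependency-free replacement being described - my own sketch, not the actual simpledisplay.d code - a split that walks slices and reports pieces through a callback, with no phobos import and no allocation of its own:

// Calls 'sink' once per separator-delimited slice of 's'.
void eachSplit(const(char)[] s, char sep,
               scope void delegate(const(char)[]) sink)
{
    size_t start = 0;
    foreach (i, c; s)
        if (c == sep)
        {
            sink(s[start .. i]);
            start = i + 1;
        }
    sink(s[start .. $]);
}

unittest
{
    const(char)[][] parts;
    eachSplit("a,b,c", ',', (piece) { parts ~= piece; });
    assert(parts == ["a", "b", "c"]);
}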
 There isn't a great deal of evidence that someone interested in 
 optimization will be blocked on this particular problem, like 
 Peter Alexander said.
Yeah, I haven't found it to be a big deal to hunt down allocations either, and reimplementing the functions in phobos that allocate generally isn't hard anyway. But still, if we can do it, we might as well.
Oct 08 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/8/13 4:29 PM, Adam D. Ruppe wrote:
 On Tuesday, 8 October 2013 at 22:58:02 UTC, ponce wrote:
 But is it even necessary?
It is nice to have stdlib functions available that can be used anywhere. For std.algorithm, Andrei has said if you ever implement an algorithm by hand, it means the library has failed. But there's two places where that falls short (IMO at least): using it without allocating, and using it without bringing in a bazillion dependencies.
Only Levenshtein distance produces garbage in std.algorithm. Andrei
Oct 08 2013
next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 9 October 2013 at 03:38:56 UTC, Andrei Alexandrescu 
wrote:
 Only Levenshtein distance produces garbage in std.algorithm.
Yeah, I was referring more to phobos as a whole than algorithm specifically there, just using the principle on principle.
Oct 08 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/8/2013 8:38 PM, Andrei Alexandrescu wrote:
 Only Levenshtein distance produces garbage in std.algorithm.
Perhaps the documentation should reflect that: http://dlang.org/phobos/std_algorithm.html#levenshteinDistance
Oct 08 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/8/13 8:45 PM, Walter Bright wrote:
 On 10/8/2013 8:38 PM, Andrei Alexandrescu wrote:
 Only Levenshtein distance produces garbage in std.algorithm.
Perhaps the documentation should reflect that: http://dlang.org/phobos/std_algorithm.html#levenshteinDistance
I'll need to fix the function to use RefCounted. Andrei
Oct 08 2013
prev sibling parent "Kiith-Sa" <kiithsacmp gmail.com> writes:
On Wednesday, 9 October 2013 at 03:46:20 UTC, Walter Bright wrote:
 On 10/8/2013 8:38 PM, Andrei Alexandrescu wrote:
 Only Levenshtein distance produces garbage in std.algorithm.
Perhaps the documentation should reflect that: http://dlang.org/phobos/std_algorithm.html#levenshteinDistance
I think it would be useful to specify what uses the GC at the top of every Phobos module's documentation.
Oct 08 2013
prev sibling parent Benjamin Thaut <code benjamin-thaut.de> writes:
Am 09.10.2013 01:29, schrieb Adam D. Ruppe:
 It is nice to have stdlib functions available that can be used anywhere.
 For std.algorithm, Andrei has said if you ever implement an algorithm by
 hand, it means the library has failed. But there's two places where that
 falls short (IMO at least): using it without allocating, and using it
 without bringing in a bazillion dependencies.
This. Exactly this was one of my biggest problems when writing GC-free code. I started with "hey, I'll just patch all places in phobos that do allocations". I planned to do so, so I could use std.algorithm and others. After trying for a while, the dependencies killed me. So I just stopped altogether and rewrote whatever algorithm I would need myself. The dependencies between the modules in phobos are catastrophic. If you need one module, you basically need all of them.

Kind Regards
Benjamin Thaut
Oct 09 2013
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 9 October 2013 08:58, ponce <contact gmsfrommars.fr> wrote:

 On Tuesday, 8 October 2013 at 22:45:51 UTC, Adam D. Ruppe wrote:

 Eh, not necessarily. If it expands to static
assert(!__traits(hasAnnotationRecursive,
 uses_gc));, then the only ones that *need* to be marked are the lowest
 level ones. Then it figures out the rest only on demand.

 Then, on the function you care about as a user, you say nogc and it tells
 you if you called anything and the static assert stacktrace tells you where
 it happened.

 Of course, to be convenient to use, phobos would need to offer
 non-allocating functions, which is indeed a fair amount of work, but they
 wouldn't *necessarily* have to have the specific attribute.
But is it even necessary? There isn't a great deal of evidence that someone interested in optimization will be blocked on this particular problem, like Peter Alexander said. GC hassle is quite common but not that big a deal:

- Manu: "Consequently, I avoid the GC in D too, and never had any major problems, only inconvenience." http://www.reddit.com/r/programming/comments/1nxs2i/the_state_of_rust_08/ccnefe7
- Dav1d: said he never had a GC problem with BRala (minecraft client)
- Me: I had a small ~100ms GC pause in one of my games every 20 minutes; more often than not I don't notice it

So a definitive written rebuttal we can link to would perhaps be helpful.
I might just add that while my experience has been that I haven't had any significant technical problems when actively avoiding the GC, the inconvenience is considerably more severe than I made out in that post (I don't want to foster public negativity). But it is actually really, really inconvenient. If that's my future with D, then I'll pass, just as any un-biased 3rd party would.

I've been simmering on this issue ever since I took an interest in D. At first I was apprehensive to accept the GC, then cautiously optimistic that the GC might be okay. But I have seen exactly no movement in this area as long as I've been following D, and I have since reverted to a position in absolute agreement with the C++ users. I will never accept the GC in its current form for all of my occupational requirements; it's implicitly non-deterministic, and offers very little control over performance characteristics.

I've said before that until I can time-slice the GC and it does not stop the world, it doesn't satisfy my requirements. I see absolutely no motion towards that goal. If I were one of those many C++ users evaluating D for long-term adoption (and I am!), I'm not going to invest the future of my career and industry in a complete question mark which, given years of watching already, is clearly going nowhere. As far as the GC is concerned, with respect to realtime embedded software, I'm out. I've completely lost faith. And it's going to take an awful lot more to restore my faith again than before.

What I want is an option to replace the GC with ARC, just like Apple did. Clearly they came to the same conclusion, probably for exactly the same reasons. Apple have a policy of silky smooth responsiveness throughout the OS and the entire user experience. They consider this a sign of quality and professionalism. As far as I can tell, they concluded that non-deterministic GC pauses were incompatible with their goal. I agree. I think their experience should be taken very seriously. They have a successful platform on weak embedded hardware, with about a million applications deployed.

I've had a lot of conversations with a lot of experts, plenty of conversations at dconf, and nobody could even offer me a vision for a GC that is acceptable. As far as I can tell, nobody I talked to really thinks a GC that doesn't stop the world, which can be carefully scheduled/time-sliced (i.e., an incremental, thread-local GC, or whatever), is even possible.

I'll take ARC instead. It's predictable, easy for all programmers who aren't experts on resource management to understand, and I have DIRECT control over its behaviour and timing.

But that's not enough; offering convenience while trying to avoid using the GC altogether is also very important. You should be able to write software that doesn't allocate memory. It's quite hard to do in D today. There's plenty of opportunity for improvement.

I'm still keenly awaiting a more-developed presentation of Andrei's allocators system.
Oct 08 2013
next sibling parent reply "PauloPinto" <pjmlp progtools.org> writes:
On Wednesday, 9 October 2013 at 05:15:53 UTC, Manu wrote:
 On 9 October 2013 08:58, ponce <contact gmsfrommars.fr> wrote:

 On Tuesday, 8 October 2013 at 22:45:51 UTC, Adam D. Ruppe 
 wrote:

 Eh, not necessarily. If it expands to static 
 assert(!__traits(hasAnnotationRecursive,
 uses_gc));, then the only ones that *need* to be marked are 
 the lowest
 level ones. Then it figures out the rest only on demand.

 Then, on the function you care about as a user, you say nogc 
 and it tells
 you if you called anything and the static assert stacktrace 
 tells you where
 it happened.

 Of course, to be convenient to use, phobos would need to offer
 non-allocating functions, which is indeed a fair amount of 
 work, but they
 wouldn't *necessarily* have to have the specific attribute.
But is it even necessary? There isn't a great deal of evidence that someone interested in optimization will be blocked on this particular problem, like Peter Alexander said. GC hassle is quite common but not that big a deal:

- Manu: "Consequently, I avoid the GC in D too, and never had any major problems, only inconvenience." http://www.reddit.com/r/programming/comments/1nxs2i/the_state_of_rust_08/ccnefe7
- Dav1d: said he never had a GC problem with BRala (minecraft client)
- Me: I had a small ~100ms GC pause in one of my games every 20 minutes; more often than not I don't notice it

So a definitive written rebuttal we can link to would perhaps be helpful.
I might just add, that while my experience has been that I haven't had any significant technical problems when actively avoiding the GC, the inconvenience is considerably more severe than I made out in that post (I don't want to foster public negativity). But it is actually really, really inconvenient. If that's my future with D, then I'll pass, just as any un-biased 3rd party would. I've been simmering on this issue ever since I took an interest in D. At first I was apprehensive to accept the GC, then cautiously optimistic that the GC might be okay. But I have seen exactly no movement in this area as long as I've been following D, and I have since reverted to a position in absolute agreement with the C++ users. I will never accept the GC in it's current form for all of my occupational requirements; it's implicitly non-deterministic, and offers very little control over performance characteristics. I've said before that until I can time-slice the GC, and it does not stop the world, then it doesn't satisfy my requirements. I see absolutely no motion towards that goal. If I were one of those many C++ users evaluating D for long-term adoption (and I am!), I'm not going to invest the future of my career and industry in a complete question mark which given years of watching already, is clearly going nowhere. As far as the GC is concerned, with respect to realtime embedded software, I'm out. I've completely lost faith. And it's going to take an awful lot more to restore my faith again than before. What I want is an option to replace the GC with ARC, just like Apple did. Clearly they came to the same conclusion, probably for exactly the same reasons. Apple have a policy of silky smooth responsiveness throughout the OS and the entire user experience. They consider this a sign of quality and professionalism. As far as I can tell, they concluded that non-deterministic GC pauses were incompatible with their goal. I agree. I think their experience should be taken very seriously. They have a successful platform on weak embedded hardware, with about a million applications deployed. I've had a lot of conversations with a lot of experts, plenty of conversations at dconf, and nobody could even offer me a vision for a GC that is acceptable. As far as I can tell, nobody I talked to really thinks a GC that doesn't stop the world, which can be carefully scheduled/time-sliced (ie, an incremental, thread-local GC, or whatever), is even possible. I'll take ARC instead. It's predictable, easy for all programmers who aren't experts on resource management to understand, and I have DIRECT control over it's behaviour and timing. But that's not enough, offering convenience while trying to avoid using the GC altogether is also very important. You should be able to write software that doesn't allocate memory. It's quite hard to do in D today. There's plenty of opportunity for improvement. I'm still keenly awaiting a more-developed presentation of Andrei's allocators system.
Apple dropped the GC and went with ARC instead, because they never managed to make it work properly. It was full of corner cases, and the application could crash if those cases were not fully taken care of.

Of course the PR message is "We dropped GC because ARC is better" and not "We dropped GC because we failed".

Now having said this, of course D needs a better GC, as the current one doesn't fulfill the needs of potential users of the language.

--
Paulo
Oct 08 2013
next sibling parent reply dennis luehring <dl.soluz gmx.net> writes:
Am 09.10.2013 07:23, schrieb PauloPinto:
 Apple dropped the GC and went ARC instead, because they never
 managed to make it work properly.

 It was full of corner cases, and the application could crash if
 those cases were not fully taken care of.

 Or course the PR message is "We dropped GC because ARC is better"
 and not "We dropped GC because we failed".

 Now having said this, of course D needs a better GC as the
 current one doesn't fulfill the needs of potential users of the
 language.
the question is - could ARC be an option for automatic memory management in D, so that the compiler generates ARC code when not using the GC, but still using GC-needing code? Or is that a hard-to-reach goal due to the problems of combining GC-using and ARC-using libraries?
Oct 08 2013
next sibling parent "PauloPinto" <pjmlp progtools.org> writes:
On Wednesday, 9 October 2013 at 06:05:52 UTC, dennis luehring 
wrote:
 Am 09.10.2013 07:23, schrieb PauloPinto:
 Apple dropped the GC and went ARC instead, because they never
 managed to make it work properly.

 It was full of corner cases, and the application could crash if
 those cases were not fully taken care of.

 Or course the PR message is "We dropped GC because ARC is 
 better"
 and not "We dropped GC because we failed".

 Now having said this, of course D needs a better GC as the
 current one doesn't fulfill the needs of potential users of the
 language.
the question is - could ARC be an option for automatic memory managment in D - so that the compiler generated ARC code when not using gc - but using gc-needed code? or is that a hard to reach goal due to gc-using+arc-using lib combine problems?
Personally I think ARC can only work properly if it is handled by the compiler, even if D is powerful enough to have it as library types. ARC is too costly to have it increment/decrement counters on every pointer access, in time and in cache misses.

Objective-C, Rust and ParaSail do it well, because ARC is built into the compiler, which can elide needless operations. Library solutions like C++'s and D's suffer without compiler support.

Personally, I think it could be done in two steps:

- Improve the GC, because that is what most developers will care about anyway.
- Make the D compilers aware of RefCounted and friends, to minimize memory accesses, for the developers that care about every ms they can extract from the hardware.

--
Paulo
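To make the library-side cost concrete, a small sketch using std.typecons.RefCounted (struct and function names made up): every copy below runs counter updates that a pure library type cannot elide, which is exactly where compiler awareness would pay off.

import std.typecons : RefCounted;

struct Payload { int value; }

void consume(RefCounted!Payload p) { }  // by-value: count bumped, then dropped

void main()
{
    auto a = RefCounted!Payload(42);  // count = 1
    auto b = a;                       // postblit: count = 2
    consume(b);                       // one more increment/decrement pair
}                                     // destructors decrement; freed at 0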
Oct 09 2013
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 9 October 2013 16:05, dennis luehring <dl.soluz gmx.net> wrote:

 Am 09.10.2013 07:23, schrieb PauloPinto:

  Apple dropped the GC and went ARC instead, because they never
 managed to make it work properly.

 It was full of corner cases, and the application could crash if
 those cases were not fully taken care of.

 Or course the PR message is "We dropped GC because ARC is better"
 and not "We dropped GC because we failed".

 Now having said this, of course D needs a better GC as the
 current one doesn't fulfill the needs of potential users of the
 language.
the question is - could ARC be an option for automatic memory managment in D - so that the compiler generated ARC code when not using gc - but using gc-needed code? or is that a hard to reach goal due to gc-using+arc-using lib combine problems?
It sounds pretty easy to reach to me. Compiler generating inc/dec ref calls can't possibly be difficult. An optimisation that simplifies redundant inc/dec sequences doesn't sound hard either... :/ Is there more to it? Cleaning up circular references I guess... what does Apple do? It's an uncommon edge case, so there's gotta be heaps of room for efficient solutions to that (afaik) one edge case. Are there others?
Oct 09 2013
next sibling parent "ponce" <contact gam3sfrommars.fr> writes:
On Wednesday, 9 October 2013 at 07:33:38 UTC, Manu wrote:
 Is there more to it? Cleaning up circular references I guess... 
 what does Apple do?
 It's an uncommon edge case, so there's gotta be heaps of room 
 for efficient solutions to that (afaik) one edge case. Are 
 there others?
There is a research paper people linked in a similar debate stating that systems which detect ARC circular references are actually not that far from an actual GC. I can't find it right now.
Oct 09 2013
prev sibling next sibling parent "PauloPinto" <pjmlp progtools.org> writes:
On Wednesday, 9 October 2013 at 07:33:38 UTC, Manu wrote:
 On 9 October 2013 16:05, dennis luehring <dl.soluz gmx.net> 
 wrote:

 Am 09.10.2013 07:23, schrieb PauloPinto:

  Apple dropped the GC and went ARC instead, because they never
 managed to make it work properly.

 It was full of corner cases, and the application could crash 
 if
 those cases were not fully taken care of.

 Or course the PR message is "We dropped GC because ARC is 
 better"
 and not "We dropped GC because we failed".

 Now having said this, of course D needs a better GC as the
 current one doesn't fulfill the needs of potential users of 
 the
 language.
the question is - could ARC be an option for automatic memory managment in D - so that the compiler generated ARC code when not using gc - but using gc-needed code? or is that a hard to reach goal due to gc-using+arc-using lib combine problems?
It sounds pretty easy to reach to me. Compiler generating inc/dec ref calls can't possibly be difficult. An optimisation that simplifies redundant inc/dec sequences doesn't sound hard either... :/ Is there more to it? Cleaning up circular references I guess... what does Apple do? It's an uncommon edge case, so there's gotta be heaps of room for efficient solutions to that (afaik) one edge case. Are there others?
Apple's compiler does flow analysis. First, all inc/dec operations are generated as usual. Then flow analysis is applied and all redundant inc/dec operations are removed before the native code generation takes place. There is a WWDC session where this was explained.

--
Paulo
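A hand-written illustration of those two phases (class and function names made up; the retain/release calls are spelled out here, whereas the compiler would insert and then elide them):

class Widget
{
    int refs = 1;
    void retain()  { ++refs; }
    void release() { if (--refs == 0) destroy(this); }
    void draw()    { }
}

// Phase 1 - naive codegen would insert a retain/release around the copy:
void caller(Widget w)
{
    Widget tmp = w;
    tmp.retain();     // inserted at the assignment
    tmp.draw();
    tmp.release();    // inserted at scope exit
}
// Phase 2 - flow analysis proves the pair is balanced on every path and
// that tmp never escapes, so both calls are deleted before native codegen.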
Oct 09 2013
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-10-09 09:33, Manu wrote:

 It sounds pretty easy to reach to me. Compiler generating inc/dec ref
 calls can't possibly be difficult. An optimisation that simplifies
 redundant inc/dec sequences doesn't sound hard either... :/
 Is there more to it? Cleaning up circular references I guess... what
 does Apple do?
 It's an uncommon edge case, so there's gotta be heaps of room for
 efficient solutions to that (afaik) one edge case. Are there others?
See my reply to one of your other posts: http://forum.dlang.org/thread/bsqqfmhgzntryyaqrtky forum.dlang.org?page=10#post-l33gah:24ero:241:40digitalmars.com

I don't recall the exact issues, but there were several brought up in the email conversation.

--
/Jacob Carlborg
Oct 09 2013
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-09 11:58:33 +0000, Jacob Carlborg <doob me.com> said:

 On 2013-10-09 09:33, Manu wrote:
 
 It sounds pretty easy to reach to me. Compiler generating inc/dec ref
 calls can't possibly be difficult. An optimisation that simplifies
 redundant inc/dec sequences doesn't sound hard either... :/
 Is there more to it? Cleaning up circular references I guess... what
 does Apple do?
 It's an uncommon edge case, so there's gotta be heaps of room for
 efficient solutions to that (afaik) one edge case. Are there others?
 See my reply to one of your other posts: http://forum.dlang.org/thread/bsqqfmhgzntryyaqrtky forum.dlang.org?page=10#post-l33gah:24ero:241:40digitalmars.com

 I don't recall the exact issues, but there were several brought up in the email conversation.
Here's a quick summary:

Walter's idea was to implement ARC for classes implementing IUnknown. So if you wanted an ARC'ed class, you implement IUnknown and you're done. Problems were on the edges of ARC-controlled pointers and GC-controlled ones (which cannot be allowed to be mixed), and also with wanting to use the GC to free cycles for these objects (which would make COM AddRef and Release unusable when called from non-D code). I think the only way to make that work sanely is to create another root object for ref-counted classes.

Another idea was to make *everything* in D ref-counted. ARC simply becomes another GC implementation. There can be no confusion between what's ref-counted and what isn't (everything is). It's much simpler really. But Walter isn't keen on the idea of having to call a function at every pointer assignment to keep the reference count up to date (for performance reasons), so that idea was rejected. This makes some sense, because unlike Objective-C ARC, where only Objective-C object pointers are ref-counted, in D you'd have to do that with all pointers, and some will point to external data that does not need to be ref-counted at all.

It was also pointed out that concurrent GCs require some work to be done on pointer assignment (and also when moving a pointer). So it seems to me that advances on the GC front are going to be limited without that.

And now, things seem to have stalled again. It's a little disappointing.
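A small sketch of the "edges" problem under that never-implemented IUnknown scheme (the class name is made up, and the ARC semantics live only in the comments):

interface IUnknown   // the marker interface in Walter's idea
{
    uint AddRef();
    uint Release();
}

class Widget : IUnknown   // would be ARC-managed under the proposal
{
    uint refs = 1;
    uint AddRef()  { return ++refs; }
    uint Release() { return --refs; }
}

void main()
{
    Widget w = new Widget();  // count maintained by the compiler, hypothetically
    Object o = w;             // the edge: o is an ordinary GC reference, so the
                              // count is no longer maintained through it - this
                              // implicit conversion is what has to be forbidden
}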
Oct 09 2013
next sibling parent reply Johannes Pfau <nospam example.com> writes:
Am Wed, 9 Oct 2013 09:38:12 -0400
schrieb Michel Fortin <michel.fortin michelf.ca>:

 
 I think the only way to make that work sanely is to create
 another root object for ref-counted classes.
 
The problem here is we need a way to know that type X is refcounted, right? Couldn't we just use an attribute on the class or interface declaration:

 refcounted interface X {}
 refcounted class Y {}
class X1 : X {}
class Y1 : Y {}

Now we know for sure that all these classes and subclasses are refcounted.
 Another idea was to make *everything* in D ref-counted. ARC simply 
 becomes another GC implementation. There can be no confusion between 
 what's ref-counted and what isn't (everything is). It's much simpler 
 really. But Walter isn't keen on the idea of having to call a
 function at every pointer assignment to keep the reference count up
 to date (for performance reasons), so that idea was rejected. This
 makes some sense, because unlike Objective-C ARC where only
 Objective-C object pointers are ref-counted, in D you'd have to do
 that with all pointers, and some will point to external data that
 does not need to be ref-counted at all.
The GC should be integrated with ARC types. A generic solution could be allowing different allocators with ARC types:

auto a = new!Malloc RefCountedClass();
auto b = new!GC RefCountedClass();
auto c = new!Pool RefCountedClass();

This can be done by storing a hidden pointer to the free function in every refcounted class; the GC's free function is then just a no-op.
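A sketch of that hidden free-function pointer (all names made up; this is not how druntime actually lays out objects):

alias FreeFn = void function(void*);

void noFree(void*) { }   // GC-owned objects: the collector reclaims them
void cFree(void* p) { import core.stdc.stdlib : free; free(p); }

class RefCountedBase
{
    private size_t refs = 1;
    private FreeFn freeFn;              // chosen by the allocator at new!...

    final void release()
    {
        if (--refs == 0)
            freeFn(cast(void*) this);   // no-op for GC, free() for malloc, ...
    }
}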
Oct 09 2013
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-09 14:06:51 +0000, Johannes Pfau <nospam example.com> said:

 Am Wed, 9 Oct 2013 09:38:12 -0400
 schrieb Michel Fortin <michel.fortin michelf.ca>:
 
 I think the only way to make that work sanely is to create
 another root object for ref-counted classes.
The problem here is we need a way to know that type X is refcounted, right? Couldn't we just use an attribute with the class or interface declaration: refcounted interface X{}; refcounted class Y {}; class X1 : X; class Y1 : Y; Now we know for sure that all these classes and SubClasses are refcounted.
The problem is that you can't cast Y to its base class Object (because you'd be casting an ARC pointer to a GC one, and the reference count would not be maintained). Even calling Object's functions should be prohibited (because the "this" pointer is a GC pointer inside of Object's functions, and it could be leaked).

So, de facto, by introducing all the necessary safety limitations, you've practically made Y a root class. The only things you can safely access from the base non-ref-counted class are its variables and its static functions. And the same applies to interfaces: you can't mix them.

On top of all that, if your goal is to go GC-free, you have to reinvent the whole standard library (so it uses ref-counted classes) and you still have to avoid the common pitfalls of implicit allocations (array concat, for instance). So it's not terribly useful for someone whose goal is to avoid the GC.
Oct 09 2013
parent reply Johannes Pfau <nospam example.com> writes:
Am Wed, 9 Oct 2013 10:36:36 -0400
schrieb Michel Fortin <michel.fortin michelf.ca>:

 
  refcounted interface X{};
  refcounted class Y {};
 class X1 : X;
 class Y1 : Y;
 
 Now we know for sure that all these classes and SubClasses are
 refcounted.
The problem is that you can't cast Y to its base class Object (because you'd be casting an ARC pointer to a GC one and the reference count would not be maintained). Even calling Object's functions should be prohibited (because the "this" pointer is a GC pointer inside of Object's functions, and it could be leaked). So, de facto, by introducing all the necessary safety limitations, you've practically made Y a root class. The only things you can safely access from the base non-ref-counted class are its variables and its static functions.
I see. That's a good point about why a separate object hierarchy is necessary.
 On top of that all that, if your goal is to go GC-free, you have to 
 reinvent the whole standard library (so it uses ref-counted classes) 
 and you still have to avoid the common pitfalls of implicit
 allocations (array concat for instance). So it's not terribly useful
 for someone whose goal is to avoid the GC.
 
I don't think the standard lib is that much of a problem; at least there's a way to avoid it (and some modules - std.algorithm, std.digest, std.uuid - are already 99% GC-free). But if someone really wants to strip the GC _completely_ there's a huge issue with memory management of Exceptions.
Oct 09 2013
next sibling parent reply Manu <turkeyman gmail.com> writes:
On 10 October 2013 01:40, Johannes Pfau <nospam example.com> wrote:

 Am Wed, 9 Oct 2013 10:36:36 -0400
 schrieb Michel Fortin <michel.fortin michelf.ca>:

  refcounted interface X{};
  refcounted class Y {};
 class X1 : X;
 class Y1 : Y;

 Now we know for sure that all these classes and SubClasses are
 refcounted.
The problem is that you can't cast Y to its base class Object (because you'd be casting an ARC pointer to a GC one and the reference count would not be maintained). Even calling Object's functions should be prohibited (because the "this" pointer is a GC pointer inside of Object's functions, and it could be leaked). So, de facto, by introducing all the necessary safety limitations, you've practically made Y a root class. The only things you can safely access from the base non-ref-counted class are its variables and its static functions.
I see. That's a good argument for why a separate object hierarchy is necessary.
 On top of all that, if your goal is to go GC-free, you have to
 reinvent the whole standard library (so it uses ref-counted classes)
 and you still have to avoid the common pitfalls of implicit
 allocations (array concat for instance). So it's not terribly useful
 for someone whose goal is to avoid the GC.
 I don't think the standard lib is that much of a problem; at least
 there's a way to avoid it (and some modules - std.algorithm,
 std.digest, std.uuid - are already 99% GC free). But if someone really
 wants to strip the GC _completely_ there's a huge issue with memory
 management of Exceptions.
Exceptions have a pretty well defined lifetime... can't they be manually cleaned up by the exception handler after the catching scope exits?
Oct 09 2013
next sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-09 16:51:03 +0000, Manu <turkeyman gmail.com> said:

 On 10 October 2013 01:40, Johannes Pfau <nospam example.com> wrote:
 
 But if someone really wants to strip the GC _completely_ there's a huge
 issue with memory management of Exceptions.
Exceptions have a pretty well defined lifetime... can't they be manually cleaned up by the exception handler after the catching scope exits?
Exceptions don't need a well-defined lifetime for things to work.

D exceptions are classes and are heap-allocated. So if everything becomes reference-counted, exceptions would be reference-counted too. The exception handler would be the one decrementing the reference count once it is done with the exception (all this under the hood, managed by the compiler).

Alternatively an exception handler could return the exception to the parent function (as a return value), store the exception elsewhere, or throw it again, in which case the decrement operation would be balanced by an increment, and both increment and decrement should be elided by the compiler as they're cancelling each other.

I fail to see an issue.

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Oct 09 2013
next sibling parent reply Johannes Pfau <nospam example.com> writes:
On Wed, 9 Oct 2013 13:13:51 -0400,
Michel Fortin <michel.fortin michelf.ca> wrote:

 On 2013-10-09 16:51:03 +0000, Manu <turkeyman gmail.com> said:
 
 On 10 October 2013 01:40, Johannes Pfau <nospam example.com> wrote:
 
 But if someone really wants to strip the GC _completely_ there's a
 huge issue with memory management of Exceptions.
Exceptions have a pretty well defined lifetime... can't they be manually cleaned up by the exception handler after the catching scope exits?
Exceptions don't need a well-defined lifetime for things to work.
What I meant was using exceptions without the GC requires huge changes (like switching all exceptions to use reference counting) or at least using the same type of allocation for all exceptions. If you have 'malloced' and GC owned exceptions it can be hard to know if an Exception needs to be freed:

try
{
    codeWithMallocException();
    codeWithGCException();
}
catch(Exception e)
{
    // Am I supposed to free this exception now?
}

Now think of exception chaining... I think this can easily lead to memory leaks or other latent issues.

So avoiding parts of phobos which allocate right now is relatively easy. Switching all code to use 'malloced' Exceptions is much more work.
 D exceptions are classes and are heap-allocated. So if everything 
 becomes reference-counted, exceptions would be reference-counted too. 
 [...]
 I fail to see an issue.
 
I meant it's complicated right now, without compiler / runtime / phobos changes. You can avoid parts of phobos as a user, but getting rid of GC-allocated exceptions is more difficult. Of course, if you can change all exceptions to use reference counting, that works well - but that's a rather big change.
Oct 09 2013
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-09 17:36:01 +0000, Johannes Pfau <nospam example.com> said:

 On Wed, 9 Oct 2013 13:13:51 -0400,
 Michel Fortin <michel.fortin michelf.ca> wrote:
 
 On 2013-10-09 16:51:03 +0000, Manu <turkeyman gmail.com> said:
 
 On 10 October 2013 01:40, Johannes Pfau <nospam example.com> wrote:
 
 But if someone really wants to strip the GC _completely_ there's a
 huge issue with memory management of Exceptions.
Exceptions have a pretty well defined lifetime... can't they be manually cleaned up by the exception handler after the catching scope exits?
Exceptions don't need a well-defined lifetime for things to work.
 What I meant was using exceptions without the GC requires huge changes
 (like switching all exceptions to use reference counting) or at least
 using the same type of allocation for all exceptions. If you have
 'malloced' and GC owned exceptions it can be hard to know if an
 Exception needs to be freed:

 try
 {
     codeWithMallocException();
     codeWithGCException();
 }
 catch(Exception e)
 {
     // Am I supposed to free this exception now?
 }
Sorry. I think I got mixed up in different branches of this huge discussion thread. This branch is about having a separate attribute to split ref-counted objects from GC objects, but having the two coexist. Other branches are about replacing the GC with a ref-counted GC, but that's not what we're talking about here.

You are perfectly right that you can't mix GC exceptions with ref-counted exceptions. And if ref-counted exceptions can't derive from the garbage-collected Exception class, then it becomes awfully complicated (probably impossible) to have exceptions without having the GC clean them up. I was wrong, that's an issue.

It's no different than the general issue that you need a separate root class, however. You simply need a separate root exception class too, one that you'd have to catch explicitly and that catch (Exception e) would not catch. That's not very practical.

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Oct 09 2013
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 10 October 2013 03:13, Michel Fortin <michel.fortin michelf.ca> wrote:

 On 2013-10-09 16:51:03 +0000, Manu <turkeyman gmail.com> said:

  On 10 October 2013 01:40, Johannes Pfau <nospam example.com> wrote:
  But if someone really wants to strip the GC _completely_ there's a huge
 issue with memory management of Exceptions.
Exceptions have a pretty well defined lifetime... can't they be manually cleaned up by the exception handler after the catching scope exits?
 Exceptions don't need a well-defined lifetime for things to work.

 D exceptions are classes and are heap-allocated. So if everything
 becomes reference-counted, exceptions would be reference-counted too.
 The exception handler would be the one decrementing the reference count
 once it is done with the exception (all this under the hood, managed by
 the compiler).

 Alternatively an exception handler could return the exception to the
 parent function (as a return value), store the exception elsewhere, or
 throw it again, in which case the decrement operation would be balanced
 by an increment, and both increment and decrement should be elided by
 the compiler as they're cancelling each other.

 I fail to see an issue.
I was talking about using exceptions without any sort of GC at all. I think it's the only critical language feature left that relies on a GC in some form. It would be nice if at least the fundamental language concepts were usable without a GC of any sort. Most other things can be reasonably worked around at this point.
Oct 09 2013
parent "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 10 October 2013 at 03:06:43 UTC, Manu wrote:
 I was talking about using exceptions without any sort of GC at 
 all.
 I think it's the only critical language feature left that 
 relies on a GC in
 some form.
 If would be nice if at least the fundamental language concepts 
 were usable
 without a GC of any sort.
 Most other things can be reasonably worked around at this point.
You can free your exception when done with it. It doesn't seem very complicated.
Oct 09 2013
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 October 2013 at 16:51:11 UTC, Manu wrote:
 Exceptions have a pretty well defined lifetime... can't they be 
 manually
 cleaned up by the exception handler after the catching scope 
 exits?
static Exception askingForTrouble;

try
{
    ...
}
catch(Exception e)
{
    askingForTrouble = e;
}
Oct 09 2013
prev sibling parent Benjamin Thaut <code benjamin-thaut.de> writes:
On 09.10.2013 17:40, Johannes Pfau wrote:
 But if someone really wants to strip the GC _completely_ there's a huge
 issue with memory management of Exceptions.
I don't agree there. Basically all my catch blocks look like this:

catch(Exception ex)
{
    // do something with ex
    Delete(ex);
}

I never had problems with this, or memory leaks.
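That snippet shows the catch side. For completeness, here is a sketch of the allocation side without the GC, using malloc plus std.conv.emplace (the helper names are made up, and error handling is omitted):

    import core.stdc.stdlib : malloc, free;
    import std.conv : emplace;

    Exception makeException(string msg)
    {
        enum size = __traits(classInstanceSize, Exception);
        void[] mem = malloc(size)[0 .. size];
        return emplace!Exception(mem, msg); // construct in malloc'd memory
    }

    void freeException(Exception e)
    {
        destroy(e);             // run the destructor, if any
        free(cast(void*) e);
    }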
Oct 09 2013
prev sibling next sibling parent reply "qznc" <qznc web.de> writes:
On Wednesday, 9 October 2013 at 13:38:12 UTC, Michel Fortin wrote:
 On 2013-10-09 11:58:33 +0000, Jacob Carlborg <doob me.com> said:

 On 2013-10-09 09:33, Manu wrote:
 
 It sounds pretty easy to reach to me. Compiler generating 
 inc/dec ref
 calls can't possibly be difficult. An optimisation that 
 simplifies
 redundant inc/dec sequences doesn't sound hard either... :/
 Is there more to it? Cleaning up circular references I 
 guess... what
 does Apple do?
 It's an uncommon edge case, so there's gotta be heaps of room 
 for
 efficient solutions to that (afaik) one edge case. Are there 
 others?
 See my reply to one of your other posts:
 http://forum.dlang.org/thread/bsqqfmhgzntryyaqrtky forum.dlang.org?page=10#post-l33gah:24ero:241:40digitalmars.com

 I don't recall the exact issues but there were several issues
 that were brought up in the email conversation.
Here's a quick summary:

Walter's idea was to implement ARC for classes implementing IUnknown. So if you wanted an ARC'ed class, you implement IUnknown and you're done. Problems were on the edges of ARC-controlled pointers and GC-controlled ones (which cannot be allowed to be mixed), and also with wanting to use the GC to free cycles for these objects (which would make COM AddRef and Release unusable when called from non-D code). I think the only way to make that work sanely is to create another root object for ref-counted classes.

Another idea was to make *everything* in D ref-counted. ARC simply becomes another GC implementation. There can be no confusion between what's ref-counted and what isn't (everything is). It's much simpler really. But Walter isn't keen on the idea of having to call a function at every pointer assignment to keep the reference count up to date (for performance reasons), so that idea was rejected. This makes some sense, because unlike Objective-C ARC where only Objective-C object pointers are ref-counted, in D you'd have to do that with all pointers, and some will point to external data that does not need to be ref-counted at all.

It was also pointed out that concurrent GCs require some work to be done on pointer assignment (and also when moving a pointer). So it seems to me that advances on the GC front are going to be limited without that.

And now, things seem to have stalled again. It's a little disappointing.
I found no summary and stuff seems to get lost, so I created a page on the wiki. http://wiki.dlang.org/Versus_the_garbage_collector
Oct 09 2013
parent John Joyus <john.joyus gmail.com> writes:
On 10/09/2013 12:40 PM, qznc wrote:
 I found no summary and stuff seems to get lost,
 so I created a page on the wiki.

 http://wiki.dlang.org/Versus_the_garbage_collector
+1
Oct 09 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/9/13 6:38 AM, Michel Fortin wrote:
 Walter's idea was to implement ARC for classes implementing IUnknown. So
 if you wanted a ARC'ed class, you implement IUnknown and you're done.
 Problems were on the edges of ARC-controlled pointers and GC-controlled
 ones (which cannot be allowed to be mixed), and also with wanting to use
 the GC to free cycles for these objects (which would make COM AddRef and
 Release unusable when called from non-D code). I think the only way to
 make that work sanely is to create another root object for ref-counted
 classes.
Agreed. This summary reveals the major difficulties involved.
 Another idea was to make *everything* in D ref-counted. ARC simply
 becomes another GC implementation. There can be no confusion between
 what's ref-counted and what isn't (everything is). It's much simpler
 really. But Walter isn't keen on the idea of having to call a function
 at every pointer assignment to keep the reference count up to date (for
 performance reasons), so that idea was rejected. This makes some sense,
 because unlike Objective-C ARC where only Objective-C object pointers
 are ref-counted, in D you'd have to do that with all pointers, and some
 will point to external data that does not need to be ref-counted at all.
I don't think that's on the reject list. Yah, call a function that may either be an inline lightweight refcount manipulation, or forward to a virtual method etc. It's up to the class.
 It was also pointed out that concurrent GCs require some work to be done
 on pointer assignment (and also when moving a pointer). So it seems to
 me that advances on the GC front are going to be limited without that.

 And now, things seems to have stalled again. It's a little disappointing.
I don't think we're stalled.

Andrei
Oct 09 2013
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-09 18:14:31 +0000, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Another idea was to make *everything* in D ref-counted. ARC simply
 becomes another GC implementation. There can be no confusion between
 what's ref-counted and what isn't (everything is). It's much simpler
 really. But Walter isn't keen on the idea of having to call a function
 at every pointer assignment to keep the reference count up to date (for
 performance reasons), so that idea was rejected. This makes some sense,
 because unlike Objective-C ARC where only Objective-C object pointers
 are ref-counted, in D you'd have to do that with all pointers, and some
 will point to external data that does not need to be ref-counted at all.
I don't think that's on the reject list. Yah, call a function that may either be an inline lightweight refcount manipulation, or forward to a virtual method etc. It's up to the class.
Are we talking about the same thing? You say "it's up to the class", but it should be obvious that *everything* being reference counted (as I wrote above) means every pointer, not only those to classes. Having only classes being reference counted is not very helpful if one wants to avoid the garbage collector.

And that discussion with Johannes Pfau a few minutes ago about exceptions shows that if you disable the GC, exceptions can't depend on the GC anymore to be freed, which is a problem too. (Should we create another ref-counted exception root type? Hopefully not.)

In my opinion, trying to segregate between reference-counted and garbage-collected types just makes things awfully complicated. And it doesn't help much someone who wants to avoid the GC. To be useful, reference counting should be a replacement for the garbage collector (while still keeping the current GC to free cycles).

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Oct 09 2013
prev sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-09 07:33:29 +0000, Manu <turkeyman gmail.com> said:

 Is there more to it? Cleaning up circular references I guess... what does
 Apple do?
Apple implemented auto-nulling weak pointers for ARC (as well as __unsafe_unretained ones if you are not afraid of dangling pointers).
 It's an uncommon edge case, so there's gotta be heaps of room for efficient
 solutions to that (afaik) one edge case. Are there others?
I don't know about you, but circular references are not rare at all in my code.

Another solution that could be used for circular references is to have the stop-the-world GC we have now collect them. Reference counting could be used to free GC memory in advance (in the absence of circular references), so the GC would have less garbage to collect when it runs.

Both solutions (weak auto-nulling and last-resort GC) are not mutually exclusive and can coexist.

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Oct 09 2013
next sibling parent reply Manu <turkeyman gmail.com> writes:
On 9 October 2013 23:06, Michel Fortin <michel.fortin michelf.ca> wrote:

 On 2013-10-09 07:33:29 +0000, Manu <turkeyman gmail.com> said:

  Is there more to it? Cleaning up circular references I guess... what does
 Apple do?
Apple implemented auto-nulling weak pointers for ARC (as well as __unsafe_unretained ones if you are not afraid of dangling pointers). It's an uncommon edge case, so there's gotta be heaps of room for
 efficient
 solutions to that (afaik) one edge case. Are there others?
 I don't know about you, but circular references are not rare at all in
 my code.

 Another solution that could be used for circular references is to have
 the stop-the-world GC we have now collect them. Reference counting
 could be used to free GC memory in advance (in the absence of circular
 references), so the GC would have less garbage to collect when it runs.

 Both solutions (weak auto-nulling and last-resort GC) are not mutually
 exclusive and can coexist.
I suspect there are plenty of creative possibilities that could be applied to reducing the pool of potential circular references to as small a pool as possible. It's simply not feasible to have the machine scanning 4-6 gigabytes of allocated memory looking for garbage.
Oct 09 2013
next sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
Am 09.10.2013 16:52, schrieb Manu:
 On 9 October 2013 23:06, Michel Fortin <michel.fortin michelf.ca
 <mailto:michel.fortin michelf.ca>> wrote:

     On 2013-10-09 07:33:29 +0000, Manu <turkeyman gmail.com
     <mailto:turkeyman gmail.com>> said:

         Is there more to it? Cleaning up circular references I guess...
         what does
         Apple do?


     Apple implemented auto-nulling weak pointers for ARC (as well as
     __unsafe_unretained ones if you are not afraid of dangling pointers).

         It's an uncommon edge case, so there's gotta be heaps of room
         for efficient
         solutions to that (afaik) one edge case. Are there others?


     I don't know about you, but circular references are not rare at all
     in my code.

     Another solution that could be used for circular references is to
     have the stop-the-world GC we have now collect them. Reference
     counting could be used to free GC memory in advance (in the absence
     of circular references), so the GC would have less garbage to
     collect when it runs.

     Both solutions (week-autonulling and last-resort GC) are not
     mutually exclusive and can coexist.


 I suspect there are plenty of creative possibilities that could be
 applied to reducing the pool of potential circular-references to as
 small a pool as possible.
 It's simply not feasible to have machine scanning 4-6 gigabytes of
 allocated memory looking for garbage.
The Azul VM does not have a problem with it, as it does pauseless concurrent GC, while being used on online trading systems.

--
Paulo
Oct 09 2013
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Oct 9, 2013, at 8:35 AM, Paulo Pinto <pjmlp progtools.org> wrote:
 The Azul VM does not have a problem with it, as it does pauseless
 concurrent GC, while being used on online trading systems.

Incremental GCs are awesome. Have you looked at IBM's Metronome for Java? Maybe this would work with SafeD, but I don't think it's an option with the full D language.
Oct 09 2013
parent Paulo Pinto <pjmlp progtools.org> writes:
On 09.10.2013 17:54, Sean Kelly wrote:
 On Oct 9, 2013, at 8:35 AM, Paulo Pinto <pjmlp progtools.org> wrote:
 The Azul VM does not have a problem with it, as it does pauseless concurrent
GC, while being used on online trading systems.
Incremental GCs are awesome. Have you looked at IBM's Metronome for Java? ...
Just the technical papers.
 ...  Maybe this would work with SafeD, but I don't think it's an 
option with the full D language.

Yeah, maybe if D was safe by default, with pointer tricks encapsulated 
inside system sections. Similar to unsafe blocks in other languages.

Not sure how much that would help, though.

--
Paulo
Oct 09 2013
prev sibling parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-09 14:52:35 +0000, Manu <turkeyman gmail.com> said:

 I suspect there are plenty of creative possibilities that could be applied
 to reducing the pool of potential circular-references to as small a pool as
 possible.
I'm more pessimistic than you on this. Cycles can only be detected at runtime unless we make dramatic changes to the memory model.

You can find cycles by scanning at intervals, as the GC does. Or you could detect them by updating a dependency graph each time you assign a value to a pointer. The dependency graph would have to tell you whether you're still connected to a root. When all root connections are severed you can free the object. I doubt the overhead would make this latter idea practical, however. Weak auto-nulling pointers are much cheaper.

I'd suggest using one of the above methods as a debugging aid to find cycles, and then manually insert weak pointers where needed to break those cycles. Either that, or leave the GC on if you don't mind the GC.
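For illustration, the usual shape of such a fix - the Weak type is hypothetical, along the lines of Apple's auto-nulling weak references:

    class Node
    {
        Node[] children;    // strong: a parent keeps its children alive
        Weak!Node parent;   // weak back-reference: breaks the cycle and
                            // is auto-nulled when the parent goes away
    }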
 It's simply not feasible to have machine scanning 4-6 gigabytes of
 allocated memory looking for garbage.
Agree.

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Oct 09 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/9/13 6:06 AM, Michel Fortin wrote:
 On 2013-10-09 07:33:29 +0000, Manu <turkeyman gmail.com> said:

 Is there more to it? Cleaning up circular references I guess... what does
 Apple do?
Apple implemented auto-nulling weak pointers for ARC (as well as __unsafe_unretained ones if you are not afraid of dangling pointers).
Yah, the rub being the programmer must use weak pointers judiciously. The difficulty is in "judiciously" :o).
 It's an uncommon edge case, so there's gotta be heaps of room for
 efficient
 solutions to that (afaik) one edge case. Are there others?
I don't know about you, but circular references are not rare at all in my code.
Nor Facebook's. I can't give away numbers, but the fraction of cycles per total memory allocated averaged over all of Facebook's PHP programs (PHP being reference counted) is staggering. Staggering! If PHP didn't follow a stateless request model (which allows us to use region allocation and reclaim all memory upon the request's end), we'd be literally unable to run a server for more than a few minutes.
 Another solution that could be used for circular references is to have
 the stop-the-world GC we have now collect them. Reference counting could
 be used to free GC memory in advance (in the absence of circular
 references), so the GC would have less garbage to collect when it runs.
Yah, that seems to be a good approach for us.

Andrei
Oct 09 2013
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-09 17:15:51 +0000, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 10/9/13 6:06 AM, Michel Fortin wrote:
 I don't know about you, but circular references are not rare at all in
 my code.
Nor Facebook's. I can't give away numbers, but the fraction of cycles per total memory allocated averaged over all of Facebook's PHP programs (PHP being reference counted) is staggering. Staggering! If PHP didn't follow a stateless request model (which allows us to use region allocation and reclaim all memory upon the request's end), we'd be literally unable to run a server for more than a few minutes.
Don't remind me. About 8 years ago I had to abandon a big PHP project simply because PHP couldn't deal with cycles. There was no way to scale the thing.

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Oct 09 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 October 2013 at 18:41:21 UTC, Michel Fortin wrote:
 On 2013-10-09 17:15:51 +0000, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> said:

 On 10/9/13 6:06 AM, Michel Fortin wrote:
 I don't know about you, but circular references are not rare 
 at all in
 my code.
Nor Facebook's. I can't give away numbers, but the fraction of cycles per total memory allocated averaged over all of Facebook's PHP programs (PHP being reference counted) is staggering. Staggering! If PHP didn't follow a stateless request model (which allows us to use region allocation and reclaim all memory upon the request's end), we'd be literally unable to run a server for more than a few minutes.
Don't remind me. About 8 years ago I had to abandon a big PHP project simply because PHP couldn't deal with cycles. There was no way to scale the thing.
Solved in 5.3
Oct 09 2013
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-09 19:45:39 +0000, "deadalnix" <deadalnix gmail.com> said:

 Don't remind me. About 8 years ago I had to abandon a big PHP project 
 simply because PHP couldn't deal with cycles. There was no way to scale 
 the thing.
Solved in 5.3
I know. Four years too late.

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Oct 09 2013
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 9 October 2013 15:23, PauloPinto <pjmlp progtools.org> wrote:

 On Wednesday, 9 October 2013 at 05:15:53 UTC, Manu wrote:

 On 9 October 2013 08:58, ponce <contact gmsfrommars.fr> wrote:

  On Tuesday, 8 October 2013 at 22:45:51 UTC, Adam D. Ruppe wrote:
 Eh, not necessarily. If it expands to
 static assert(!__traits(hasAnnotationRecursive, uses_gc));, then the
 only ones that *need* to be marked are the lowest
 level ones. Then it figures out the rest only on demand.

 Then, on the function you care about as a user, you say nogc and it
 tells
 you if you called anything and the static assert stacktrace tells you
 where
 it happened.

 Of course, to be convenient to use, phobos would need to offer
 non-allocating functions, which is indeed a fair amount of work, but
 they
 wouldn't *necessarily* have to have the specific attribute.
 But is it even necessary? There isn't a great deal of evidence that
 someone interested in optimization will be blocked on this particular
 problem, like Peter Alexander said.

 GC hassle is quite common but not that big a deal:

 - Manu: "Consequently, I avoid the GC in D too, and never had any major
 problems, only inconvenience."
 http://www.reddit.com/r/programming/comments/1nxs2i/the_state_of_rust_08/ccnefe7

 - Dav1d: said he never had a GC problem with BRala (minecraft client)
 - Me: I had a small ~100ms GC pause in one of my games every 20 minutes,
 more often than not I don't notice it

 So a definitive written rebuttal we can link to would perhaps be helpful.
I might just add that while my experience has been that I haven't had any significant technical problems when actively avoiding the GC, the inconvenience is considerably more severe than I made out in that post (I don't want to foster public negativity). But it is actually really, really inconvenient. If that's my future with D, then I'll pass, just as any un-biased 3rd party would.

I've been simmering on this issue ever since I took an interest in D. At first I was apprehensive to accept the GC, then cautiously optimistic that the GC might be okay. But I have seen exactly no movement in this area as long as I've been following D, and I have since reverted to a position in absolute agreement with the C++ users.

I will never accept the GC in its current form for all of my occupational requirements; it's implicitly non-deterministic, and offers very little control over performance characteristics. I've said before that until I can time-slice the GC, and it does not stop the world, then it doesn't satisfy my requirements. I see absolutely no motion towards that goal. If I were one of those many C++ users evaluating D for long-term adoption (and I am!), I'm not going to invest the future of my career and industry in a complete question mark which, given years of watching already, is clearly going nowhere. As far as the GC is concerned, with respect to realtime embedded software, I'm out. I've completely lost faith. And it's going to take an awful lot more to restore my faith again than before.

What I want is an option to replace the GC with ARC, just like Apple did. Clearly they came to the same conclusion, probably for exactly the same reasons. Apple have a policy of silky smooth responsiveness throughout the OS and the entire user experience. They consider this a sign of quality and professionalism. As far as I can tell, they concluded that non-deterministic GC pauses were incompatible with their goal. I agree. I think their experience should be taken very seriously. They have a successful platform on weak embedded hardware, with about a million applications deployed.

I've had a lot of conversations with a lot of experts, plenty of conversations at dconf, and nobody could even offer me a vision for a GC that is acceptable. As far as I can tell, nobody I talked to really thinks a GC that doesn't stop the world, which can be carefully scheduled/time-sliced (ie, an incremental, thread-local GC, or whatever), is even possible.

I'll take ARC instead. It's predictable, easy for all programmers who aren't experts on resource management to understand, and I have DIRECT control over its behaviour and timing.

But that's not enough; offering convenience while trying to avoid using the GC altogether is also very important. You should be able to write software that doesn't allocate memory. It's quite hard to do in D today. There's plenty of opportunity for improvement. I'm still keenly awaiting a more-developed presentation of Andrei's allocators system.
 Apple dropped the GC and went ARC instead, because they never managed
 to make it work properly. It was full of corner cases, and the
 application could crash if those cases were not fully taken care of.

 Of course the PR message is "We dropped GC because ARC is better" and
 not "We dropped GC because we failed".

 Now having said this, of course D needs a better GC, as the current one
 doesn't fulfill the needs of potential users of the language.
Well, I never read that article apparently... but that's possibly even more of a concern if true.

Does anyone here REALLY believe that a bunch of volunteer contributors can possibly do what Apple failed to do with their squillions of dollars and engineers?

I haven't heard anybody around here propose the path to an acceptable solution. It's perpetually in the too-hard basket, hence we still have the same GC as forever and it's going nowhere.
Oct 09 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/9/2013 12:29 AM, Manu wrote:
 Does anyone here REALLY believe that a bunch of volunteer contributors can
 possibly do what apple failed to do with their squillions of dollars and
engineers?
 I haven't heard anybody around here propose the path to an acceptable solution.
 It's perpetually in the too-hard basket, hence we still have the same GC as
 forever and it's going nowhere.
What do you propose?
Oct 09 2013
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-10-09 09:31, Walter Bright wrote:

 What do you propose?
He wants ARC.

--
/Jacob Carlborg
Oct 09 2013
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 9 October 2013 17:31, Walter Bright <newshound2 digitalmars.com> wrote:

 On 10/9/2013 12:29 AM, Manu wrote:

 Does anyone here REALLY believe that a bunch of volunteer contributors can
 possibly do what apple failed to do with their squillions of dollars and
 engineers?
 I haven't heard anybody around here propose the path to an acceptable
 solution.
 It's perpetually in the too-hard basket, hence we still have the same GC
 as
 forever and it's going nowhere.
What do you propose?
ARC. I've been here years now, and I see absolutely no evidence that the GC is ever going to improve. I can trust ARC; it's predictable, I can control it.

Also, proper support for avoiding the GC without severe inconvenience, which constantly keeps coming up. But I don't think there's any debate on that one. Everyone seems to agree.
Oct 09 2013
next sibling parent dennis luehring <dl.soluz gmx.net> writes:
On 09.10.2013 16:30, Manu wrote:
 On 9 October 2013 17:31, Walter Bright <newshound2 digitalmars.com> wrote:

 On 10/9/2013 12:29 AM, Manu wrote:

 Does anyone here REALLY believe that a bunch of volunteer contributors can
 possibly do what apple failed to do with their squillions of dollars and
 engineers?
 I haven't heard anybody around here propose the path to an acceptable
 solution.
 It's perpetually in the too-hard basket, hence we still have the same GC
 as
 forever and it's going nowhere.
What do you propose?
ARC. I've been here years now, and I see absolutely no evidence that the GC is ever going to improve. I can trust ARC, it's predictable, I can control it. Also, proper support for avoiding the GC without severe inconvenience as constantly keeps coming up. But I don't think there's any debate on that one. Everyone seems to agree.
Just a question: how should ref-count locking be done - only for shared? All non-shared pointers don't need to be thread-safe - or do they?
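If the answer is "only for shared", the split might look like this (a sketch; RefCount and the retain overloads are made-up names):

    import core.atomic : atomicOp;

    struct RefCount { size_t count; }

    void retain(RefCount* rc)           // thread-local: plain increment
    {
        ++rc.count;
    }

    void retain(shared(RefCount)* rc)   // shared: must be atomic
    {
        atomicOp!"+="(rc.count, 1);
    }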
Oct 09 2013
prev sibling next sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
On 09.10.2013 16:30, Manu wrote:
 On 9 October 2013 17:31, Walter Bright <newshound2 digitalmars.com
 <mailto:newshound2 digitalmars.com>> wrote:

     On 10/9/2013 12:29 AM, Manu wrote:

         Does anyone here REALLY believe that a bunch of volunteer
         contributors can
         possibly do what apple failed to do with their squillions of
         dollars and engineers?
         I haven't heard anybody around here propose the path to an
         acceptable solution.
         It's perpetually in the too-hard basket, hence we still have the
         same GC as
         forever and it's going nowhere.


     What do you propose?


 ARC. I've been here years now, and I see absolutely no evidence that the
 GC is ever going to improve. I can trust ARC, it's predictable, I can
 control it.
 Also, proper support for avoiding the GC without severe inconvenience as
 constantly keeps coming up. But I don't think there's any debate on that
 one. Everyone seems to agree.
As someone who is on the sidelines and doesn't really use D, my opinion should not count that much, if at all.

However, rewriting D's memory management to be ARC-based will have a performance impact if the various D compilers aren't made ARC-aware. Then there is the whole point of rewriting phobos and druntime to use ARC instead of GC. Will the return on investment pay off, instead of fixing the existing GC?

What will be the message sent to the outsiders wondering if D is stable enough to be adopted, when they see these constant rewrites?

--
Paulo
Oct 09 2013
next sibling parent reply dennis luehring <dl.soluz gmx.net> writes:
On 09.10.2013 17:46, Paulo Pinto wrote:
 On 09.10.2013 16:30, Manu wrote:
 On 9 October 2013 17:31, Walter Bright <newshound2 digitalmars.com
 <mailto:newshound2 digitalmars.com>> wrote:

     On 10/9/2013 12:29 AM, Manu wrote:

         Does anyone here REALLY believe that a bunch of volunteer
         contributors can
         possibly do what apple failed to do with their squillions of
         dollars and engineers?
         I haven't heard anybody around here propose the path to an
         acceptable solution.
         It's perpetually in the too-hard basket, hence we still have the
         same GC as
         forever and it's going nowhere.


     What do you propose?


 ARC. I've been here years now, and I see absolutely no evidence that the
 GC is ever going to improve. I can trust ARC, it's predictable, I can
 control it.
 Also, proper support for avoiding the GC without severe inconvenience as
 constantly keeps coming up. But I don't think there's any debate on that
 one. Everyone seems to agree.
 As someone who is on the sidelines and doesn't really use D, my opinion
 should not count that much, if at all.

 However, rewriting D's memory management to be ARC-based will have a
 performance impact if the various D compilers aren't made ARC-aware.
 Then there is the whole point of rewriting phobos and druntime to use
 ARC instead of GC. Will the return on investment pay off, instead of
 fixing the existing GC?

 What will be the message sent to the outsiders wondering if D is stable
 enough to be adopted, when they see these constant rewrites?

 --
 Paulo
Not manual ARC - compiler-generated ARC - so there is no need for a rewrite.
Oct 09 2013
parent Paulo Pinto <pjmlp progtools.org> writes:
On 09.10.2013 17:49, dennis luehring wrote:
 On 09.10.2013 17:46, Paulo Pinto wrote:
 On 09.10.2013 16:30, Manu wrote:
 On 9 October 2013 17:31, Walter Bright <newshound2 digitalmars.com
 <mailto:newshound2 digitalmars.com>> wrote:

     On 10/9/2013 12:29 AM, Manu wrote:

         Does anyone here REALLY believe that a bunch of volunteer
         contributors can
         possibly do what apple failed to do with their squillions of
         dollars and engineers?
         I haven't heard anybody around here propose the path to an
         acceptable solution.
         It's perpetually in the too-hard basket, hence we still have the
         same GC as
         forever and it's going nowhere.


     What do you propose?


 ARC. I've been here years now, and I see absolutely no evidence that the
 GC is ever going to improve. I can trust ARC, it's predictable, I can
 control it.
 Also, proper support for avoiding the GC without severe inconvenience as
 constantly keeps coming up. But I don't think there's any debate on that
 one. Everyone seems to agree.
 As someone who is on the sidelines and doesn't really use D, my opinion
 should not count that much, if at all.

 However, rewriting D's memory management to be ARC-based will have a
 performance impact if the various D compilers aren't made ARC-aware.
 Then there is the whole point of rewriting phobos and druntime to use
 ARC instead of GC. Will the return on investment pay off, instead of
 fixing the existing GC?

 What will be the message sent to the outsiders wondering if D is stable
 enough to be adopted, when they see these constant rewrites?

 --
 Paulo
 Not manual ARC - compiler-generated ARC - so there is no need for a rewrite.
There will be a need for a rewrite if the code happens to have cyclic references.

--
Paulo
Oct 09 2013
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 10 October 2013 01:46, Paulo Pinto <pjmlp progtools.org> wrote:

 Am 09.10.2013 16:30, schrieb Manu:

 On 9 October 2013 17:31, Walter Bright <newshound2 digitalmars.com
 <mailto:newshound2 **digitalmars.com <newshound2 digitalmars.com>>>
 wrote:

     On 10/9/2013 12:29 AM, Manu wrote:

         Does anyone here REALLY believe that a bunch of volunteer
         contributors can
         possibly do what apple failed to do with their squillions of
         dollars and engineers?
         I haven't heard anybody around here propose the path to an
         acceptable solution.
         It's perpetually in the too-hard basket, hence we still have the
         same GC as
         forever and it's going nowhere.


     What do you propose?


 ARC. I've been here years now, and I see absolutely no evidence that the
 GC is ever going to improve. I can trust ARC, it's predictable, I can
 control it.
 Also, proper support for avoiding the GC without severe inconvenience as
 constantly keeps coming up. But I don't think there's any debate on that
 one. Everyone seems to agree.
As someone that is in the sidelines and doesn't really use D, my opinion should not count that much, if at all. However, rewriting D's memory management to be ARC based will have performance impact if the various D compilers aren't made ARC aware.
Supporting ARC in the compiler _is_ the job. That includes a cyclic-reference solution.

 Then there is the whole point of rewriting phobos and druntime to use ARC
 instead of GC.

It would be transparent if properly supported by the compiler.

 Will the return on investment pay off, instead of fixing the existing GC?

If anyone can even _imagine_ a design for a 'fixed' GC, I'd love to hear it. I've talked with a lot of experts, they all mumble and groan, and just talk about how hard it is.

 What will be the message sent to the outsiders wondering if D is stable
 enough to be adopted, and see these constant rewrites?

People didn't run screaming from Obj-C when they switched to ARC. I think they generally appreciated it.
Oct 09 2013
next sibling parent Paulo Pinto <pjmlp progtools.org> writes:
On 09.10.2013 19:05, Manu wrote:
 On 10 October 2013 01:46, Paulo Pinto <pjmlp progtools.org
 <mailto:pjmlp progtools.org>> wrote:

     Am 09.10.2013 16:30, schrieb Manu:

         On 9 October 2013 17:31, Walter Bright
         <newshound2 digitalmars.com <mailto:newshound2 digitalmars.com>
         <mailto:newshound2 __digitalmars.com
         <mailto:newshound2 digitalmars.com>>> wrote:

              On 10/9/2013 12:29 AM, Manu wrote:

                  Does anyone here REALLY believe that a bunch of volunteer
                  contributors can
                  possibly do what apple failed to do with their
         squillions of
                  dollars and engineers?
                  I haven't heard anybody around here propose the path to an
                  acceptable solution.
                  It's perpetually in the too-hard basket, hence we still
         have the
                  same GC as
                  forever and it's going nowhere.


              What do you propose?


         ARC. I've been here years now, and I see absolutely no evidence
         that the
         GC is ever going to improve. I can trust ARC, it's predictable,
         I can
         control it.
         Also, proper support for avoiding the GC without severe
         inconvenience as
         constantly keeps coming up. But I don't think there's any debate
         on that
         one. Everyone seems to agree.


     As someone that is in the sidelines and doesn't really use D, my
     opinion should not count that much, if at all.

     However, rewriting D's memory management to be ARC based will have
     performance impact if the various D compilers aren't made ARC aware.


 Supporting ARC in the compiler _is_ the job. That includes a
 cyclic-reference solution.

     Then there is the whole point of rewriting phobos and druntime to
     use ARC instead of GC.


 It would be transparent if properly supported by the compiler.

     Will the return on investment pay off, instead of fixing the
     existing GC?


 If anyone can even _imagine_ a design for a 'fixed' GC, I'd love to hear
 it. I've talked with a lot of experts, they all mumble and groan, and
 just talk about how hard it is.

     What will be the message sent to the outsiders wondering if D is
     stable enough to be adopted, and see these constant rewrites?


 People didn't run screaming from Obj-C when they switched to ARC. I
 think they generally appreciated it.
Because Objective-C's GC design was broken, as I mentioned in my previous posts.

Anyway, you make good points, thanks for the reply.

--
Paulo
Oct 09 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/9/2013 10:05 AM, Manu wrote:
 Supporting ARC in the compiler _is_ the job. That includes a cyclic-reference
 solution.
Wholly replacing the GC with ARC has some fundamental issues:

1. Array slicing becomes clumsy instead of fast & cheap.

2. Functions can now accept new'd data, data from user allocations, static data, slices, etc., with aplomb. Wiring in ARC would likely make this unworkable.

3. I'm not convinced yet that ARC can guarantee memory safety. For example, where do weak pointers fit in with memory safety? (ARC uses user-annotated weak pointers to deal with cycles.)
Oct 09 2013
next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 9 October 2013 at 19:24:18 UTC, Walter Bright wrote:
 1. Array slicing becomes clumsy instead of fast & cheap

 2. Functions can now accept new'd data, data from user 
 allocations, static data, slices, etc., with aplomb. Wiring in 
 ARC would likely make this unworkable.
If we had a working scope storage class, these could be solved - you could slice to your heart's content as long as the reference never escapes your current scope.

Really, I want this regardless of the rest of the discussion just because then we can boot a lot of stuff to the library. I realize it is easier said than done, but I'm pretty sure it is technically supposed to work already...
Oct 09 2013
prev sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-09 19:24:18 +0000, Walter Bright <newshound2 digitalmars.com> said:

 On 10/9/2013 10:05 AM, Manu wrote:
 Supporting ARC in the compiler _is_ the job. That includes a cyclic-reference
 solution.
Wholly replacing the GC with ARC has some fundamental issues: 1. Array slicing becomes clumsy instead of fast & cheap.
I think you're exaggerating a bit. If you slice, you know it's the same memory block, so you know it's using the same reference count, so the compiler can elide paired retain/release calls just like it should be able to do for regular pointers, keeping it fast and cheap. Example:

    int[] a = [1, 2];   // 1. retained on allocation
                        // 2. scope(exit) release a
    int[] b = a[0..1];  // 3. retain on assignment
                        // 4. scope(exit) release b
    return b;           // 5. retain b for caller

Now, you know at line 3 that "b" is the same memory block as "a", so 2 and 3 cancel each other, and so do 4 and 5. Result:

    int[] a = [1, 2];   // 1. retained on allocation
    int[] b = a[0..1];
    return b;

The assumption here is that int[] does not span over two memory blocks. Perhaps this is a problem for memory management system code, but memory management system code should be able to opt out anyway, if only to be able to write the retain/release implementation. (In Objective-C you opt out using the __unsafe_unretained pointer attribute.)

That said, with function boundaries things are a little messier:

    int[] foo(int[] a)      // a is implicitly retained by the caller
                            // for the duration of the call
    {
        int[] b = a[0..1];  // 1. retain on assignment
                            // 2. scope(exit) release b
        return b;           // 3. retain b for caller
    }

Here, only 1 and 2 can be elided, resulting in one explicit call to retain:

    int[] foo(int[] a)      // a is implicitly retained by the caller
                            // for the duration of the call
    {
        int[] b = a[0..1];
        return b;           // 3. retain b for caller
    }

But by inlining this trivial function, similar flow analysis in the caller should be able to elide that call to retain too.

So there is some overhead, but probably not as much as you think (and probably a little more than I think, because control flow and functions that can throw put in the middle of this are going to make it harder to elide redundant retain/release pairs). Remember that you have less GC overhead and possibly increased memory locality too (because memory is freed and reused sooner). I won't try to guess which is faster; it's probably going to differ depending on the benchmark anyway.
 2. Functions can now accept new'd data, data from user allocations, 
 static data, slices, etc., with aplomb. Wiring in ARC would likely make 
 this unworkable.
That's no different from the GC having to ignore those pointers when it does a scan. Just check whether it was allocated within the reference-counted allocator's memory pool, and if so adjust the block's reference counter, else ignore. There's a small performance cost, but it's probably small compared to an atomic increment/decrement.

Objective-C too has some objects (NSString objects) in the static data segment. They also have "magic" hard-coded immutable value objects hiding the object's payload within the pointer itself on 64-bit processors. Calls to retain/release just get ignored for these.
 3. I'm not convinced yet that ARC can guarantee memory safety. For 
 example, where do weak pointers fit in with memory safety? (ARC uses 
 user-annoted weak pointers to deal with cycles.)
Failure to use weak pointers creates cycles, but cycles are not unsafe. The worst that'll happen is memory exhaustion (but we could/should still have an optional GC available to collect cycles). If weak pointers are nulled automatically (as they should be) then you'll never get a dangling pointer.

To use a weak pointer you first have to make a non-weak copy of it through a runtime call. You either get a null non-weak pointer or a non-null one if the object is still alive. Runtime stuff ensures that this works atomically with regard to the reference count falling to zero. No dangling pointer.

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
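In code, that protocol might look like this (Weak, weakRef and lock are hypothetical names for the runtime calls being described):

    Weak!Foo w = weakRef(foo);

    if (auto strong = w.lock())  // runtime call: atomically takes a +1
    {                            // reference iff the object is still alive
        strong.doSomething();
    }                            // strong reference released at scope exit
    // else: the object is gone and w has been auto-nulled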
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/9/2013 2:10 PM, Michel Fortin wrote:
 That's no different from the GC having to ignore those pointers when it does a
 scan. Just check it was allocated within the reference counted allocator memory
 pool, and if so adjust the block's reference counter, else ignore. There's a
 small performance cost, but it's probably small compared to an atomic
 increment/decrement.
When passing any dynamic array to a function, or most any assignment, the compiler must insert:

    (is pointer into gc) && (update ref count)

This is costly, because:

1. The gc pools may be fragmented, i.e. interleaved with malloc'd blocks, meaning an arbitrary number of checks for "is pointer into gc". I suspect on 64 bit machines one might be able to reserve in advance a large enough range of addresses to accommodate any realistic eventual gc size, making the check cost 3 instructions, but I don't know how portable such a scheme may be between operating systems.

2. The "update ref count" is likely a function call, which trashes the contents of many registers, leading to poor code performance even if that function is never called (because the compiler must assume it is called, and the registers trashed).

Considering that we are trying to appeal to the performance-oriented community, these are serious drawbacks. Recall that array slicing performance has been a BIG WIN for several D users.
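The reserved-address-range variant of the check might look like this, assuming one contiguous reservation (the names are made up):

    __gshared size_t gcBase, gcSize;  // set once when the pool is reserved

    bool isGcPointer(const(void)* p)
    {
        // one subtraction, one compare, one branch
        return cast(size_t) p - gcBase < gcSize;
    }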
Oct 09 2013
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-09 21:40:11 +0000, Walter Bright <newshound2 digitalmars.com> said:

 On 10/9/2013 2:10 PM, Michel Fortin wrote:
 That's no different from the GC having to ignore those pointers when it does a
 scan. Just check it was allocated within the reference counted allocator memory
 pool, and if so adjust the block's reference counter, else ignore. There's a
 small performance cost, but it's probably small compared to an atomic
 increment/decrement.
 When passing any dynamic array to a function, or most any assignment,
 the compiler must insert:

     (is pointer into gc) && (update ref count)

 This is costly, because:

 1. The gc pools may be fragmented, i.e. interleaved with malloc'd
 blocks, meaning an arbitrary number of checks for "is pointer into gc".
 I suspect on 64 bit machines one might be able to reserve in advance a
 large enough range of addresses to accommodate any realistic eventual
 gc size, making the check cost 3 instructions, but I don't know how
 portable such a scheme may be between operating systems.
I know it is. The GC already pays that cost when it scans. We're just moving that cost elsewhere.
 2. the "update ref count" is likely a function call, which trashes the 
 contents of many registers, leading to poor code performance even if 
 that function is never called (because the compiler must assume it is 
 called, and the registers trashed)
In my opinion, the "is pointer into gc" check would be part of the functions. It wouldn't change things much because this is the most likely case and your registers are going to be trashed anyway (and it makes the code smaller at the call site, better for caching). There's no question that assigning to a pointer will be slower. The interesting question how much of that lost performance do you get back later by not having the GC stop the world?
 Considering that we are trying to appeal to the performance oriented 
 community, these are serious drawbacks. Recall that array slicing 
 performance has been a BIG WIN for several D users.
Performance means different things for different people. Slicing performance is great, but GC pauses are very bad in some cases. You can't choose to have one without having the other. It all depends on what you want to do.

In an ideal world, we'd be able to choose between using a GC or using ARC when building our program. A compiler flag could do the trick. But that becomes messy when libraries (static and dynamic) get involved, as they all have to agree on the same codegen to work together. Adding something to mangling that would cause link errors in case of mismatch might be good enough to prevent accidents though.

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
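A sketch of that build-time selection (the -version=ARC switch and rcAlloc are hypothetical):

    void* rcAlloc(size_t n);  // hypothetical ARC allocator entry point

    version (ARC)
    {
        void* allocate(size_t n) { return rcAlloc(n); }
    }
    else
    {
        import core.memory : GC;
        void* allocate(size_t n) { return GC.malloc(n); }
    }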
Oct 09 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 October 2013 at 23:37:53 UTC, Michel Fortin wrote:
 In an ideal world, we'd be able to choose between using a GC or 
 using ARC when building our program. A compiler flag could do 
 the trick. But that becomes messy when libraries (static and 
 dynamic) get involved as they all have to agree on the same 
 codegen to work together. Adding something to mangling that 
 would cause link errors in case of mismatch might be good 
 enough to prevent accidents though.
ObjC guys used to think that. It turns out it is a really bad idea.
Oct 09 2013
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-10 01:21:25 +0000, "deadalnix" <deadalnix gmail.com> said:

 On Wednesday, 9 October 2013 at 23:37:53 UTC, Michel Fortin wrote:
 In an ideal world, we'd be able to choose between using a GC or using 
 ARC when building our program. A compiler flag could do the trick. But 
 that becomes messy when libraries (static and dynamic) get involved as 
 they all have to agree on the same codegen to work together. Adding 
 something to mangling that would cause link errors in case of mismatch 
 might be good enough to prevent accidents though.
ObjC guys used to think that. It turns out it is a really bad idea.
Things were much worse with Objective-C because at the time there was no ARC, reference counting was manual, and supporting both required a lot of manual work. Supporting the GC wasn't always easy either, as the GC only tracked pointers inside of Objective-C objects and on the stack, not in structs on the heap. The GC had an implementation problem for pointers inside static segments, and keeping code working both reference-counted and garbage-collected had many perils.

I think it can be done better in D. We'd basically just be changing the GC algorithm so it uses reference counting. The differences are:

1. unpredictable lifetimes -> predictable lifetimes
2. no bother about cyclic references -> need to break them with "weak"

The latter is probably the most problematic, but if someone has leaks because he uses a library missing "weak" annotations he can still run the GC to collect them while most memory is reclaimed through ARC, or he can fix the problematic library by adding "weak" at the right places.

--
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Oct 10 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/9/2013 7:30 AM, Manu wrote:
 ARC. I've been here years now, and I see absolutely no evidence that the GC is
 ever going to improve. I can trust ARC, it's predictable, I can control it.
 Also, proper support for avoiding the GC without severe inconvenience as
 constantly keeps coming up. But I don't think there's any debate on that one.
 Everyone seems to agree.
I think we can get pretty close to ARC by using RefCounted, but you've indicated you disagree. (Note that ObjC does not have a template system, and so using a library system was not possible for ObjC.)
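For reference, a minimal sketch of the std.typecons.RefCounted route (the Resource struct is just an example payload): deterministic destruction, with no GC sweep involved for the payload.

    import std.typecons : RefCounted;

    struct Resource
    {
        int handle;
        ~this() { /* release the handle here */ }
    }

    void main()
    {
        auto a = RefCounted!Resource(42); // payload malloc'd, count = 1
        auto b = a;                       // count = 2
    }   // both copies leave scope: the count drops to 0 and
        // Resource's destructor runs deterministically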
Oct 09 2013
parent reply Manu <turkeyman gmail.com> writes:
Again from the other thread. There are a few problems with mangling the type:
It breaks when you need to interact with libraries.
It's incompatible with struct alignment, and changes the struct size. These
are very carefully managed properties of structures.
It obscures/complicates generic code.
It doesn't deal with circular references, which people keep bringing up as
a very important problem.

What happens when a library receives a T* arg? Micro managing the ref-count
at library boundaries sounds like a lot more trouble than manual memory
management.
On 10 Oct 2013 05:20, "Walter Bright" <newshound2 digitalmars.com> wrote:

 On 10/9/2013 7:30 AM, Manu wrote:

 ARC. I've been here years now, and I see absolutely no evidence that the
 GC is
 ever going to improve. I can trust ARC, it's predictable, I can control
 it.
 Also, proper support for avoiding the GC without severe inconvenience as
 constantly keeps coming up. But I don't think there's any debate on that
 one.
 Everyone seems to agree.
I think we can get pretty close to ARC by using RefCounted, but you've indicated you disagree. (Note that ObjC does not have a template system, and so using a library system was not possible for ObjC.)
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/9/2013 9:45 PM, Manu wrote:
 The are a few problems with mangling the type;
I don't understand that.
 It breaks when you need to interact with libraries.
That's true if the library persists copies of the data. But I think it's doable if the library API is stateless, i.e. 'pure'.
 It's incompatible with struct alignment, and changes the struct size. These are
 very carefully managed properties of structures.
Nobody says there can be only one variant of RefCounted.
 It obscures/complicates generic code.
It seems to not be a problem in C++ with shared_ptr<T>.
 It doesn't deal with circular references, which people keep bringing up as a
 very important problem.
ARC doesn't deal with it automatically either; it requires the user to insert weak pointers at the right places. But if the RefCounted data is actually allocated on the GC heap, an eventual GC sweep will collect such cycles.
 What happens when a library receives a T* arg? Micro managing the ref-count at
 library boundaries sounds like a lot more trouble than manual memory
management.
Aside from purity mentioned above, another way to deal with that is to encapsulate uses of a RefCounted data structure so that raw pointers into it are unnecessary.
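A small sketch of that encapsulation (the API is purely illustrative): the handle exposes operations but never hands out a pointer into the counted payload.

import std.typecons : RefCounted;

struct Payload
{
    int[16] items;
    size_t len;
}

struct Stack
{
    private RefCounted!Payload p;

    void push(int x) { p.items[p.len++] = x; }
    int pop()        { return p.items[--p.len]; }
    // No pointer into the payload ever escapes, so no caller can
    // outlive the reference count.
}

void main()
{
    Stack s;
    s.push(1);
    assert(s.pop() == 1);
}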
Oct 10 2013
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, October 10, 2013 00:30:55 Walter Bright wrote:
 It doesn't deal with circular references, which people keep bringing up as
 a very important problem.
ARC doesn't deal with it automatically, either, it requires the user to insert weak pointers at the right places. But, if the RefCounted data is actually allocated on the GC heap, an eventual GC sweep will delete them.
That may be true, but if you're using RefCounted because you can't afford the GC, then using the GC heap with them is not an option, because that could trigger a sweep, which is precisely what you're trying to avoid. More normal code may be fine with it, but not the folks who can't afford the interruption of stop-the-world or any of the other costs that come with the GC.

So, if RefCounted (or a similar type) is going to be used without the GC, it's going to need some type of weak-ref, even if it's just a normal pointer - though as you've pointed out, that pretty much throws safety out the window, as no GC is involved. But since you've arguably already done that by using malloc instead of the GC anyway, I think that it's debatable how much that matters. However, the GC would allow more normal code to not worry about circular references with RefCounted.

- Jonathan M Davis
Oct 10 2013
prev sibling parent "PauloPinto" <pjmlp progtools.org> writes:
On Wednesday, 9 October 2013 at 07:29:30 UTC, Manu wrote:
 On 9 October 2013 15:23, PauloPinto <pjmlp progtools.org> wrote:

 On Wednesday, 9 October 2013 at 05:15:53 UTC, Manu wrote:

 On 9 October 2013 08:58, ponce <contact gmsfrommars.fr> wrote:

  On Tuesday, 8 October 2013 at 22:45:51 UTC, Adam D. Ruppe 
 wrote:
 Eh, not necessarily. If it expands to
 static assert(!__traits(hasAnnotationRecursive, uses_gc));, then the
 only ones that *need* to be marked are the lowest level ones. Then it
 figures out the rest only on demand.

 Then, on the function you care about as a user, you say 
 nogc and it
 tells
 you if you called anything and the static assert stacktrace 
 tells you
 where
 it happened.

 Of course, to be convenient to use, phobos would need to 
 offer
 non-allocating functions, which is indeed a fair amount of 
 work, but
 they
 wouldn't *necessarily* have to have the specific attribute.
But is it even necessary? There isn't a great deal of evidence that someone interested in optimization will be blocked on this particular problem, like Peter Alexander said.

GC hassle is quite common but not that big a deal:
- Manu: "Consequently, I avoid the GC in D too, and never had any major problems, only inconvenience."
http://www.reddit.com/r/programming/comments/1nxs2i/the_state_of_rust_08/ccnefe7

 - Dav1d: said he never had a GC problem with BRala 
 (minecraft client)
 - Me: I had a small ~100ms GC pause in one of my games every 
 20 minutes,
 more often than not I don't notice it

 So a definitive written rebutal we can link to would perhaps 
 be helpful.
I might just add that while my experience has been that I haven't had any significant technical problems when actively avoiding the GC, the inconvenience is considerably more severe than I made out in that post (I don't want to foster public negativity). But it is actually really, really inconvenient. If that's my future with D, then I'll pass, just as any unbiased 3rd party would.

I've been simmering on this issue ever since I took an interest in D. At first I was apprehensive to accept the GC, then cautiously optimistic that the GC might be okay. But I have seen exactly no movement in this area as long as I've been following D, and I have since reverted to a position in absolute agreement with the C++ users. I will never accept the GC in its current form for all of my occupational requirements; it's implicitly non-deterministic, and offers very little control over performance characteristics.

I've said before that until I can time-slice the GC, and it does not stop the world, it doesn't satisfy my requirements. I see absolutely no motion towards that goal. If I were one of those many C++ users evaluating D for long-term adoption (and I am!), I'm not going to invest the future of my career and industry in a complete question mark which, given years of watching already, is clearly going nowhere.

As far as the GC is concerned, with respect to realtime embedded software, I'm out. I've completely lost faith. And it's going to take an awful lot more than before to restore my faith again.

What I want is an option to replace the GC with ARC, just like Apple did. Clearly they came to the same conclusion, probably for exactly the same reasons. Apple have a policy of silky smooth responsiveness throughout the OS and the entire user experience. They consider this a sign of quality and professionalism. As far as I can tell, they concluded that non-deterministic GC pauses were incompatible with their goal. I agree. I think their experience should be taken very seriously. They have a successful platform on weak embedded hardware, with about a million applications deployed.

I've had a lot of conversations with a lot of experts, plenty of conversations at dconf, and nobody could even offer me a vision for a GC that is acceptable. As far as I can tell, nobody I talked to really thinks a GC that doesn't stop the world, and which can be carefully scheduled/time-sliced (ie, an incremental, thread-local GC, or whatever), is even possible.

I'll take ARC instead. It's predictable, easy to understand for all programmers who aren't experts on resource management, and I have DIRECT control over its behaviour and timing.

But that's not enough; offering convenience while trying to avoid using the GC altogether is also very important. You should be able to write software that doesn't allocate memory. It's quite hard to do in D today. There's plenty of opportunity for improvement. I'm still keenly awaiting a more-developed presentation of Andrei's allocators system.
Apple dropped the GC and went with ARC instead because they never managed to make the GC work properly. It was full of corner cases, and the application could crash if those cases were not fully taken care of.

Of course the PR message is "We dropped GC because ARC is better" and not "We dropped GC because we failed".

Now having said this, of course D needs a better GC, as the current one doesn't fulfill the needs of potential users of the language.
Well, I never read that article apparently... but that's possibly even more of a concern if true. Does anyone here REALLY believe that a bunch of volunteer contributors can possibly do what Apple failed to do with their squillions of dollars and engineers?

I haven't heard anybody around here propose a path to an acceptable solution. It's perpetually in the too-hard basket; hence we still have the same GC as ever, and it's going nowhere.
I already provided that information in another discussion thread a while ago:

http://forum.dlang.org/post/cntjtnvnrwgdoklvznnw forum.dlang.org

It is easy for developers outside the Objective-C world to believe the ARC PR, without knowing what happened on the battlefield. :)

--
Paulo
Oct 09 2013
prev sibling next sibling parent Benjamin Thaut <code benjamin-thaut.de> writes:
On 09.10.2013 07:15, Manu wrote:
 I've had a lot of conversations with a lot of experts, plenty of
 conversations at dconf, and nobody could even offer me a vision for a GC
 that is acceptable.
 As far as I can tell, nobody I talked to really thinks a GC that doesn't
 stop the world, which can be carefully scheduled/time-sliced (ie, an
 incremental, thread-local GC, or whatever), is even possible.
I have to fully agree here. I recently bought and read the book "The Garbage Collection Handbook". The base requirement it states for basically every garbage collector that is a little more advanced than an imprecise mark & sweep is that you know the _exact_ location of _all_ your pointers. And that's where D's problems come from. It's quite easy to know all pointers on the heap, but the real problem is the pointers on the stack. If we really want a state-of-the-art GC, we need fully working pointer discovery first.

Regarding the GC, D's biggest problem is that it was designed to require a GC but was not designed to actually support a GC. And that's why I don't believe there will ever be a GC in D good enough to fulfill realtime or soft-realtime requirements.

Kind Regards
Benjamin Thaut
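A one-union illustration of the precision problem: nothing at run time says which member is live, so a conservative scanner must treat the word as a possible pointer.

union Slot
{
    void*  ptr;   // may be a live reference...
    size_t bits;  // ...or just an integer that happens to look like one
}

void main()
{
    Slot s;
    s.bits = 0xDEAD_BEEF;  // a conservative collector must still pin
                           // whatever address this resembles
}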
Oct 09 2013
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-10-09 07:15, Manu wrote:

 What I want is an option to replace the GC with ARC,
We had an email conversation about ARC when I announced the updated Objective-C support for D. Perhaps it's time to make it public. I don't think we came to a conclusion. I had some trouble following it; it was a bit too advanced for me.

-- 
/Jacob Carlborg
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/9/2013 4:57 AM, Jacob Carlborg wrote:
 On 2013-10-09 07:15, Manu wrote:

 What I want is an option to replace the GC with ARC,
We had an email conversation about ARC when I announced the updated Objective-C support for D. Perhaps it's time make it public. I don't think we came to a conclusion. I had some trouble following it. It was a bit too advanced for me.
If it's got valuable information in it, please consider making it public here (after getting permission from its participants).
Oct 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/9/2013 12:26 PM, Walter Bright wrote:
 If it's got valuable information in it, please consider making it public here
 (after getting permission from its participants).
Eh, I see that I was on that thread. Am in the process of getting permission.
Oct 09 2013
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2013-10-09 19:40:45 +0000, Walter Bright <newshound2 digitalmars.com> said:

 On 10/9/2013 12:26 PM, Walter Bright wrote:
 If it's got valuable information in it, please consider making it public here
 (after getting permission from its participants).
Eh, I see that I was on that thread. Am in the process of getting permission.
It seems my emails can't reach you (I'd like to know why). You have my permission.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Oct 09 2013
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Tuesday, 8 October 2013 at 22:37:28 UTC, Walter Bright wrote:
 On 10/8/2013 9:22 AM, Dicebot wrote:
 It is simply " nogc" which is lacking but absolutely
 mandatory.
Adding nogc is fairly simple. The trouble, though, is (like purity) it is transitive. Every function an nogc function calls will also have to be nogc. This will entail a great deal of work updating phobos/druntime to add those annotations.
Well, I thought that in the current situation we are pretty much forced to go for attribute inference anyway - this problem is not nogc-specific. (I actually think that making the permissive attributes/qualifiers the defaults was a very early mistake.)
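To make the transitivity concrete, a sketch in the proposed syntax (which is in fact how the attribute later shipped in the compiler):

int[] allocates() { return new int[4]; }   // uses the GC

@nogc int fine(int x) { return x + 1; }    // callable from @nogc code

@nogc int noAlloc(int x)
{
    // auto a = allocates();  // error: allocates() is not @nogc
    return fine(x);           // OK: transitively @nogc
}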
Oct 09 2013
prev sibling parent reply "ixid" <nuaccount gmail.com> writes:
On Tuesday, 8 October 2013 at 22:37:28 UTC, Walter Bright wrote:
 On 10/8/2013 9:22 AM, Dicebot wrote:
 It is simply " nogc" which is lacking but absolutely
 mandatory.
Adding nogc is fairly simple. The trouble, though, is (like purity) it is transitive. Every function an nogc function calls will also have to be nogc. This will entail a great deal of work updating phobos/druntime to add those annotations.
A very naive question, but is there no way of analysing the subfunctions to check their purity or lack of GC use rather than having to annotate everything? D does need to be a little wary of becoming too heavily annotated.
Oct 11 2013
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, October 11, 2013 16:27:53 ixid wrote:
 On Tuesday, 8 October 2013 at 22:37:28 UTC, Walter Bright wrote:
 On 10/8/2013 9:22 AM, Dicebot wrote:
 It is simply " nogc" which is lacking but absolutely
 mandatory.
Adding nogc is fairly simple. The trouble, though, is (like purity) it is transitive. Every function an nogc function calls will also have to be nogc. This will entail a great deal of work updating phobos/druntime to add those annotations.
A very naive question but is there no way of analysing the subfunctions to check their purity or lack of GC use rather than having to annotate everything? D does need to be a little wary of becoming too heavily annotated.
Attribute inference can only work with templates, thanks to separate compilation. There's no guarantee that you have the source for the functions that you're using (unless a function is templated). So, there's no way to do the inference in the general case.

- Jonathan M Davis
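A quick illustration of the template-only inference (attribute syntax as it later shipped):

// Template: the body is always available, so attributes are inferred.
auto twice(T)(T x) { return x + x; }

// Plain function: callers may only see the declaration, so the
// attributes have to be spelled out by hand.
int twiceInt(int x) pure nothrow @nogc @safe { return x + x; }

@nogc void caller()
{
    cast(void) twice(21);     // OK: twice!int is inferred @nogc
    cast(void) twiceInt(21);  // OK: explicitly annotated
}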
Oct 11 2013
prev sibling next sibling parent reply "Tourist" <gravatar gravatar.com> writes:
On Tuesday, 8 October 2013 at 15:43:46 UTC, ponce wrote:
 At least on Internet forums, there seems to be an entire 
 category of people dismissing D immediately because it has a GC.

 http://www.reddit.com/r/programming/comments/1nxs2i/the_state_of_rust_08/ccne46t
 http://www.reddit.com/r/programming/comments/1nxs2i/the_state_of_rust_08/ccnddqd
 http://www.reddit.com/r/programming/comments/1nsxaa/when_performance_matters_comparing_c_and_go/cclqbqw

 The subject inevitably comes in every reddit thread like it was 
 some kind of show-stopper.

 Now I know first-hand how much work avoiding a GC can give 
 (http://blog.gamesfrommars.fr/2011/01/25/optimizing-crajsh-part-2-2/).

 Yet with D the situation is different and I feel that criticism 
 is way overblown:
 - first of all, few people will have problems with GC in D at 
 all
 - then minimizing allocations can usually solve most of the 
 problems
 - if it's still a problem, the GC can be completely disabled, 
 relevant language features avoided, and there will be no GC 
 pause
 - this work of avoiding allocations would happen anyway in a 
 C++ codebase
 - I happen to have a job with some hardcore optimized C++ 
 codebase and couldn't care less that a GC would run provided 
 there is a way to minimize GC usage (and there is)

 Whatever rational rebutal we have it's never heard.
 The long answer is that it's not a real problem. But it seems 
 people want a short answer. It's also an annoying fight to have 
 since so much of it is based on zero data.

 Is there a plan to have a standard counter-attack to that kind 
 of overblown problems?
 It could be just a solid blog post or a  nogc feature.
I thought about an alternative approach: instead of using yet another annotation, how about introducing a flag similar to -cov which would output the lines in which the GC is used? This information could be used by an IDE to highlight those lines. Then you could quickly navigate through your performance-critical loop and make sure it's clean of GC.
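The report such a flag would produce might point at lines like these (dmd's later -vgc switch does roughly this):

void main()
{
    int[] a;
    a ~= 1;             // hidden GC allocation: array append
    string x = "x";
    auto s = x ~ "y";   // GC allocation: array concatenation
    int[string] aa;
    aa["k"] = 1;        // GC allocation: AA insertion
}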
Oct 08 2013
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, October 09, 2013 01:04:39 Tourist wrote:
 I thought about an alternative approach:
 Instead of using a (yet another) annotation, how about
 introducing a flag similar to -cov, which would output lines in
 which the GC is used.
 This information can be used by an IDE to highlight those lines.
 Then you could quickly navigate through your performance-critical
 loop and make sure it's clean of GC.
That sounds like a much less invasive approach than a nogc attribute. We already arguably have too many attributes. We shouldn't be adding more unless we actually need to. And if we work towards making better use of output ranges in Phobos, it should become reasonably easy to determine which functions might allocate and which won't, so anyone writing code that wants to avoid the GC will know which functions they can call and which they can't. And your proposed flag would catch the cases that they miss. So, I'd say that this along with the --nogc flag seem like good ideas.

- Jonathan M Davis
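For instance, a sketch of the output-range style that sidesteps the allocation entirely -- std.format.formattedWrite into a fixed stack buffer:

import std.format : formattedWrite;

// An output range backed by a stack buffer: no GC involvement in the sink.
struct StackSink
{
    char[128] buf;
    size_t n;
    void put(char c) { buf[n++] = c; }
}

void main()
{
    StackSink sink;
    sink.formattedWrite("x = %s", 42);
    assert(sink.buf[0 .. sink.n] == "x = 42");
}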
Oct 08 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/8/13 4:45 PM, Jonathan M Davis wrote:
 On Wednesday, October 09, 2013 01:04:39 Tourist wrote:
 I thought about an alternative approach:
 Instead of using a (yet another) annotation, how about
 introducing a flag similar to -cov, which would output lines in
 which the GC is used.
 This information can be used by an IDE to highlight those lines.
 Then you could quickly navigate through your performance-critical
 loop and make sure it's clean of GC.
That sounds like a much less invasive approach no a nogc attribute.
Problem is with functions that have no source available.

Andrei
Oct 08 2013
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 October 2013 at 03:39:38 UTC, Andrei Alexandrescu 
wrote:
 On 10/8/13 4:45 PM, Jonathan M Davis wrote:
 On Wednesday, October 09, 2013 01:04:39 Tourist wrote:
 I thought about an alternative approach:
 Instead of using a (yet another) annotation, how about
 introducing a flag similar to -cov, which would output lines 
 in
 which the GC is used.
 This information can be used by an IDE to highlight those 
 lines.
 Then you could quickly navigate through your 
 performance-critical
 loop and make sure it's clean of GC.
That sounds like a much less invasive approach no a nogc attribute.
Problem is with functions that have no source available. Andrei
Mangle the nogc into the name?
Oct 09 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/9/13 12:01 AM, Mehrdad wrote:
 On Wednesday, 9 October 2013 at 03:39:38 UTC, Andrei Alexandrescu wrote:
 On 10/8/13 4:45 PM, Jonathan M Davis wrote:
 On Wednesday, October 09, 2013 01:04:39 Tourist wrote:
 I thought about an alternative approach:
 Instead of using a (yet another) annotation, how about
 introducing a flag similar to -cov, which would output lines in
 which the GC is used.
 This information can be used by an IDE to highlight those lines.
 Then you could quickly navigate through your performance-critical
 loop and make sure it's clean of GC.
That sounds like a much less invasive approach no a nogc attribute.
Problem is with functions that have no source available. Andrei
Mangle the nogc it into the name?
That would work. Then anything that doesn't have nogc counts as an allocation, and the corresponding line will be labeled as such. (I suspect that would cause a bunch of false positives in systems that don't add nogc systematically.) Andrei
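Attributes do land in the mangled name, which is what would make a mismatch mechanically detectable; for example (mangling as on current compilers, where Ni is the no-GC code):

@nogc nothrow int f(int x) { return x; }

pragma(msg, f.mangleof);
// prints something like _D3mod1fFNbNiiZi -- Nb marks nothrow and Ni the
// no-GC attribute, so objects built with mismatched attributes cannot link.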
Oct 09 2013
parent reply dennis luehring <dl.soluz gmx.net> writes:
On 09.10.2013 09:51, Andrei Alexandrescu wrote:
 On 10/9/13 12:01 AM, Mehrdad wrote:
 On Wednesday, 9 October 2013 at 03:39:38 UTC, Andrei Alexandrescu wrote:
 On 10/8/13 4:45 PM, Jonathan M Davis wrote:
 On Wednesday, October 09, 2013 01:04:39 Tourist wrote:
 I thought about an alternative approach:
 Instead of using a (yet another) annotation, how about
 introducing a flag similar to -cov, which would output lines in
 which the GC is used.
 This information can be used by an IDE to highlight those lines.
 Then you could quickly navigate through your performance-critical
 loop and make sure it's clean of GC.
That sounds like a much less invasive approach no a nogc attribute.
Problem is with functions that have no source available. Andrei
Mangle the nogc it into the name?
That would work. Then anything that doesn't have nogc counts as an allocation, and the corresponding line will be labeled as such. (I suspect that would cause a bunch of false positives in systems that don't add nogc systematically.) Andrei
But maybe combined with Adam Ruppe's idea in the thread http://forum.dlang.org/post/l322df$1n8o$1 digitalmars.com it will reduce the false-positive count faster.
Oct 09 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/9/13 12:58 AM, dennis luehring wrote:
 Am 09.10.2013 09:51, schrieb Andrei Alexandrescu:
 On 10/9/13 12:01 AM, Mehrdad wrote:
 On Wednesday, 9 October 2013 at 03:39:38 UTC, Andrei Alexandrescu wrote:
 On 10/8/13 4:45 PM, Jonathan M Davis wrote:
 On Wednesday, October 09, 2013 01:04:39 Tourist wrote:
 I thought about an alternative approach:
 Instead of using a (yet another) annotation, how about
 introducing a flag similar to -cov, which would output lines in
 which the GC is used.
 This information can be used by an IDE to highlight those lines.
 Then you could quickly navigate through your performance-critical
 loop and make sure it's clean of GC.
That sounds like a much less invasive approach no a nogc attribute.
Problem is with functions that have no source available. Andrei
Mangle the nogc it into the name?
That would work. Then anything that doesn't have nogc counts as an allocation, and the corresponding line will be labeled as such. (I suspect that would cause a bunch of false positives in systems that don't add nogc systematically.) Andrei
but maybe combined with adam ruppes idea in thread http://forum.dlang.org/post/l322df$1n8o$1 digitalmars.com will reduce the false postive amount faster
I'm hesitant about stuff that computes function summaries such as __traits(getFunctionsCalled, function) without allowing those summaries to make it into the function's signature or attributes. It makes separate compilation difficult.

Andrei
Oct 09 2013
next sibling parent reply dennis luehring <dl.soluz gmx.net> writes:
On 09.10.2013 10:05, Andrei Alexandrescu wrote:
 On 10/9/13 12:58 AM, dennis luehring wrote:
 On 09.10.2013 09:51, Andrei Alexandrescu wrote:
 On 10/9/13 12:01 AM, Mehrdad wrote:
 On Wednesday, 9 October 2013 at 03:39:38 UTC, Andrei Alexandrescu wrote:
 On 10/8/13 4:45 PM, Jonathan M Davis wrote:
 On Wednesday, October 09, 2013 01:04:39 Tourist wrote:
 I thought about an alternative approach:
 Instead of using a (yet another) annotation, how about
 introducing a flag similar to -cov, which would output lines in
 which the GC is used.
 This information can be used by an IDE to highlight those lines.
 Then you could quickly navigate through your performance-critical
 loop and make sure it's clean of GC.
That sounds like a much less invasive approach no a nogc attribute.
Problem is with functions that have no source available. Andrei
Mangle the nogc it into the name?
That would work. Then anything that doesn't have nogc counts as an allocation, and the corresponding line will be labeled as such. (I suspect that would cause a bunch of false positives in systems that don't add nogc systematically.) Andrei
but maybe combined with adam ruppes idea in thread http://forum.dlang.org/post/l322df$1n8o$1 digitalmars.com will reduce the false postive amount faster
I'm hesitant about stuff that computes function summaries such as __traits(getFunctionsCalled, function) without allowing those summaries to make it into the function's signature or attributes. It makes separate compilation difficult.
What speaks against doing it that way? After thinking about it a long, long time... I think D needs a proper way to express a function's "features" that is compatible with separate compilation, and there is nothing for that but the signature - as long as D doesn't get its own internal object format and still wants to be C-link compatible.
Oct 09 2013
parent dennis luehring <dl.soluz gmx.net> writes:
On 09.10.2013 10:17, dennis luehring wrote:
 On 09.10.2013 10:05, Andrei Alexandrescu wrote:
 On 10/9/13 12:58 AM, dennis luehring wrote:
 On 09.10.2013 09:51, Andrei Alexandrescu wrote:
 On 10/9/13 12:01 AM, Mehrdad wrote:
 On Wednesday, 9 October 2013 at 03:39:38 UTC, Andrei Alexandrescu wrote:
 On 10/8/13 4:45 PM, Jonathan M Davis wrote:
 On Wednesday, October 09, 2013 01:04:39 Tourist wrote:
 I thought about an alternative approach:
 Instead of using a (yet another) annotation, how about
 introducing a flag similar to -cov, which would output lines in
 which the GC is used.
 This information can be used by an IDE to highlight those lines.
 Then you could quickly navigate through your performance-critical
 loop and make sure it's clean of GC.
That sounds like a much less invasive approach no a nogc attribute.
Problem is with functions that have no source available. Andrei
Mangle the nogc it into the name?
That would work. Then anything that doesn't have nogc counts as an allocation, and the corresponding line will be labeled as such. (I suspect that would cause a bunch of false positives in systems that don't add nogc systematically.) Andrei
but maybe combined with adam ruppes idea in thread http://forum.dlang.org/post/l322df$1n8o$1 digitalmars.com will reduce the false postive amount faster
I'm hesitant about stuff that computes function summaries such as __traits(getFunctionsCalled, function) without allowing those summaries to make it into the function's signature or attributes. It makes separate compilation difficult.
what speaks against doing it so (after thinking about it a long long time...) i think D needs a proper way of express the function "features" compatible with seperated compilation, and there is nothing else as the signature - as long as D don't get its own internal object format, and still want to be C link compatible
The same goes for nothrow, pure, safe, ... and nogc (and possibly noarc, to even prevent automatic reference counting in a hierarchy, if that maybe comes in the future).
Oct 09 2013
prev sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 9 October 2013 at 08:05:30 UTC, Andrei Alexandrescu 
wrote:
 I'm hesitant about stuff that computes function summaries such 
 as __traits(getFunctionsCalled, function) without allowing 
 those summaries to make it into the function's signature or 
 attributes. It makes separate compilation difficult.
That's the verification of the attribute. You can still attach the attribute to a prototype without a body for separate compilation (this is the same as safe - the prototype could be lying and the compiler can't verify it, but you trust that the annotation is correct).

The advantage __traits(getFunctionsCalled) has over a built-in nogc is simply that we can define it all in the library, and add more, combine ones*, etc., without changing the language again.

* A library UDA could be defined to check everything || nogc or whatever, since it is ultimately implemented as a static assert which can do it all too.
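A sketch of the library side (loudly hypothetical: __traits(getFunctionsCalled, ...) is only the primitive proposed in that thread, and noAlloc is an illustrative UDA name):

import std.traits : hasUDA;

enum noAlloc;  // a plain library-defined UDA

void checkNoAlloc(alias fn)()
{
    // Hypothetical trait from the proposal -- it exists in no compiler.
    foreach (callee; __traits(getFunctionsCalled, fn))
        static assert(hasUDA!(callee, noAlloc),
                      __traits(identifier, callee) ~ " is not marked noAlloc");
}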
Oct 09 2013
prev sibling next sibling parent John Joyus <john.joyus gmail.com> writes:
On 10/08/2013 11:43 AM, ponce wrote:
 At least on Internet forums, there seems to be an entire category of
 people dismissing D immediately because it has a GC.
I have just read an interesting blog post about GC http://prog21.dadgum.com/15.html
Oct 08 2013
prev sibling next sibling parent reply Justin Whear <justin economicmodeling.com> writes:
On Tue, 08 Oct 2013 17:43:45 +0200, ponce wrote:

 Yet with D the situation is different and I feel that criticism is way
 overblown:
 - first of all, few people will have problems with GC in D at all
 - then minimizing allocations can usually solve most of the problems
 - if it's still a problem, the GC can be completely disabled, relevant
 language features avoided, and there will be no GC pause
 - this work of avoiding allocations would happen anyway in a C++ codebase
 - I happen to have a job with some hardcore optimized C++ codebase and
 couldn't care less that a GC would run provided there is a way to
 minimize GC usage (and there is)
I thought I'd weigh in with my experience with D code that is in production. Over the last couple of years, I've had exactly one bad experience with D's garbage collection, and it was really bad. It was also mostly my fault.

Our web-fronted API is powered by a pool of persistent worker processes written in D. This worker program had an infrequently-used function that built an associative array for every row of data that it processed. I knew this was a bad idea when I wrote it, but it was a tricky problem and using an AA was a quick and fairly intuitive way to solve it--what's more, it worked just fine for months in production. At some point, however, a user inadvertently found a pathological case that caused the function to thrash terribly--whenever we attached to the process with GDB, we would almost invariably find it performing a full garbage collection. The process was still running, and would eventually deliver a response, but only after being at 100% CPU for ten or twenty minutes (as opposed to the <30s time expected). The function has since been completely rewritten, not only to avoid using AAs, but with a much better algorithm from a time-complexity point of view.

As a "customer" of D I'm a bit torn: should I be impressed by the good performance we usually got out of such a crappy bit of code, or disappointed by how terrible the performance became in the pathological cases?

As a result of my experience with D over the past few years, I tend to write code in two modes:
- High-level language mode: D as a more awesome Python/Ruby/etc. Built-in AAs are a godsend. Doing `arr ~= element` is great!
- Improved C: avoid heap allocations (and thus GC). Looks like nice C code.

Related to the latter, it would be really nice to be able to prove that a section of code makes no heap allocations/GC collections. At the moment, I resort to all-cap comments and occasionally running with breakpoints set on the GC functions.

Justin
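In the second mode the discipline looks like plain C. For example, a sketch (an illustrative helper, not the actual production code) that processes rows with one explicit allocation and no hidden ones:

import core.stdc.stdlib : malloc, free;

void process(const(int)[] rows)
{
    // One explicit allocation up front instead of growth via `buf ~= x`.
    auto buf = (cast(int*) malloc(rows.length * int.sizeof))[0 .. rows.length];
    scope (exit) free(buf.ptr);

    size_t n;
    foreach (r; rows)
        buf[n++] = r * 2;  // no GC, no reallocation, no pause
}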
Oct 09 2013
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 9 October 2013 at 20:10:40 UTC, Justin Whear wrote:
 Related to the latter, it would be really nice to be able to 
 prove that a section of code makes no heap allocations/GC 
 collections.
As a quick temporary thing, how about gc_throw_on_next()? That would just set a thread-local flag that gc_malloc checks; if it is set, gc_malloc immediately resets it and throws an AllocAssertError.

My thought is this could be quickly and easily implemented pending a better solution, and in the meantime it can be used in unit tests to help check this stuff.
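A sketch of how that could look (names from the proposal; trappedAlloc stands in for the real gc_malloc, and a production version would need a preallocated error object, since `new` itself allocates):

import core.stdc.stdlib : malloc;

bool gcTrapArmed;  // module-level variables are thread-local in D

void gc_throw_on_next() { gcTrapArmed = true; }

void* trappedAlloc(size_t size)  // stand-in for the GC's entry point
{
    if (gcTrapArmed)
    {
        gcTrapArmed = false;
        throw new Error("unexpected GC allocation");  // see caveat above
    }
    return malloc(size);
}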
Oct 09 2013
parent Justin Whear <justin economicmodeling.com> writes:
On Wed, 09 Oct 2013 22:30:05 +0200, Adam D. Ruppe wrote:

 On Wednesday, 9 October 2013 at 20:10:40 UTC, Justin Whear wrote:
 Related to the latter, it would be really nice to be able to prove that
 a section of code makes no heap allocations/GC collections.
As a quick temporary thing, how about gc_throw_on_next(); ? That'd just set a thread local flag that gc_malloc checks and if it is set, immediately resets it and throws an AllocAssertError. My thought is this could be quickly and easily implemented pending a better solution and in the mean time can be used in unit tests to help check this stuff.
So user-code would look like this?

// Set up code, GC is fine here
...

// Entering critical loop (which may run for months at a time)
debug GC.throw_on_next(true);
while (true) { ... }

// Tear-down code, GC is fine here
// (though unnecessary as the process is about to exit)
debug GC.throw_on_next(false);
...

Something like this would make testing simpler and is probably much more feasible than deep static analysis.
Oct 10 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, October 10, 2013 10:27:24 Sean Kelly wrote:
 On Oct 9, 2013, at 9:24 PM, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 And given that std.concurrency requires casting to and from shared or
 immutable in order to pass objects across threads, it seems like most of
 D's concurrency model requires casting to and/or from shared or
 immutable.
std.concurrency won't be this way forever though. We could fake move semantics with something like assumeUnique!T, so send could be modified to accept a non-shared class that's marked as Unique.
I take it that you mean something other than std.exception.assumeUnique, which simply casts to immutable? All that std.exception.assumeUnique does for you over casting is document why the cast is happening.

If you're talking about creating a wrapper type which indicates that the object is unique, I'd still expect that the casting would have to be happening underneath the hood when the object was passed (though then, for better or worse, it would be encapsulated). And unless the object were always in that Unique wrapper, the programmer would still have to be promising that the object was actually unique and not being shared across threads rather than the type system doing it, in which case I don't see much gain over simply casting. And if it's always in the wrapper, then you're in a similar boat to shared or immutable in that it's not the correct type.

I expect that there are nuances in what you're suggesting that I don't grasp at the moment, but as far as I can tell, the type system fundamentally requires a cast when passing objects across threads. It's just a question of whether that cast is hidden or not, and depending on how you hide it, I think that there's a real risk of the situation being worse than if you require explicit casting, because then what you're doing and what you have to be careful about are less obvious, since what's going on is hidden.
 The other option would be deep copying or serialization.
That would be far too costly IMHO. In the vast majority of cases (in my experience at least, and from what I've seen others do), what you really want is to pass ownership of the object from one thread to the other, and while deep copying would allow you to avoid type-system issues, it's completely unnecessary otherwise. So, we'd be introducing overhead just to satisfy our very restrictive type system.

The only way that I can think of to fix that would be for all objects to have a concept of what thread owns them (so that the type system would be able to understand the concept of an object's ownership being passed from one thread to another), but that would be a _big_ change and likely way too complicated in general.

- Jonathan M Davis
Oct 10 2013
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Oct 10, 2013, at 10:50 AM, "Jonathan M Davis" <jmdavisProg gmx.com> wrote:

 On Thursday, October 10, 2013 10:27:24 Sean Kelly wrote:
 On Oct 9, 2013, at 9:24 PM, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 And given that std.concurrency requires casting to and from shared or
 immutable in order to pass objects across threads, it seems like most of
 D's concurrency model requires casting to and/or from shared or
 immutable.
std.concurrency won't be this way forever though. We could fake move semantics with something like assumeUnique!T, so send could be modified to accept a non-shared class that's marked as Unique.
 I take it that you mean something other than std.exception.assumeUnique which
 simply casts to immutable? All that std.exception.assumeUnique does for you
 over casting is document why the cast is happening.

 If you're talking about creating a wrapper type which indicates that the
 object is unique, I'd still expect that the casting would have to be happening
 underneath the hood when the object was passed (though then for better or
 worse, it would be encapsulated). And unless the object were always in that
 Unique wrapper, the programmer would still have to be promising that the
 object was actually unique and not being shared across threads rather than the
 type system doing it, in which case, I don't see much gain over simply
 casting. And if it's always in the wrapper, then you're in a similar boat to
 shared or immutable in that it's not the correct type.

 I expect that there are nuances in what you're suggesting that I don't grasp
 at the moment, but as far as I can tell, the type system fundamentally
 requires a cast when passing objects across threads. It's just a question of
 whether that cast is hidden or not, and depending on how you hide it, I think
 that there's a real risk of the situation being worse than if you require
 explicit casting, because then what you're doing and what you have to be
 careful about are less obvious, since what's going on is hidden.
Yes, we couldn't use assumeUnique as-is because then the object would land on the other side as immutable. It would have to wrap the object to tell send() that the object, while not shared or immutable, is safe to put in a message. Then send() would discard the wrapper while constructing the message.
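A sketch of what that wrapper could look like from the outside (names hypothetical; the cast is exactly what send() would be hiding):

import std.concurrency : send, Tid;

// The wrapper is the programmer's promise that nothing else references
// the payload.
struct Unique(T)
{
    T payload;
}

void sendUnique(T)(Tid tid, Unique!T msg)
{
    // Discard the wrapper and move the payload across; the cast makes
    // the uniqueness promise visible to the type system.
    send(tid, cast(shared) msg.payload);
}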
Oct 10 2013
prev sibling parent reply "qznc" <qznc web.de> writes:
On Tuesday, 8 October 2013 at 15:43:46 UTC, ponce wrote:
 At least on Internet forums, there seems to be an entire 
 category of people dismissing D immediately because it has a GC.

 Whatever rational rebutal we have it's never heard.
 The long answer is that it's not a real problem. But it seems 
 people want a short answer. It's also an annoying fight to have 
 since so much of it is based on zero data.
Just stumbled upon this paper "A Study of the Scalability of Stop-the-world Garbage Collectors on Multicores". I have not read it in detail, but the conclusion says: "Our evaluation suggests that today, there is no conceptual reason to believe that the pause time of a stop-the-world GC will increase with the increasing number of cores and memory size of multicore hardware." http://pagesperso-systeme.lip6.fr/Gael.Thomas/research/biblio/2013/gidra13asplos-naps.pdf
Oct 13 2013
parent Paulo Pinto <pjmlp progtools.org> writes:
On 13.10.2013 16:21, qznc wrote:
 On Tuesday, 8 October 2013 at 15:43:46 UTC, ponce wrote:
 At least on Internet forums, there seems to be an entire category of
 people dismissing D immediately because it has a GC.

 Whatever rational rebutal we have it's never heard.
 The long answer is that it's not a real problem. But it seems people
 want a short answer. It's also an annoying fight to have since so much
 of it is based on zero data.
Just stumbled upon this paper "A Study of the Scalability of Stop-the-world Garbage Collectors on Multicores". I have not read it in detail, but the conclusion says: "Our evaluation suggests that today, there is no conceptual reason to believe that the pause time of a stop-the-world GC will increase with the increasing number of cores and memory size of multicore hardware." http://pagesperso-systeme.lip6.fr/Gael.Thomas/research/biblio/2013/gidra13asplos-naps.pdf
Thanks for the paper; as a language geek I love reading them.

--
Paulo
Oct 13 2013