digitalmars.D - On Phobos GC hunt

reply "Dmitry Olshansky" <dmitry.olsh gmail.com> writes:
I made a proposal to quantitatively measure and tabulate all GC 
allocations in Phobos before coming up with solutions to "@nogc 
Phobos".

After an approving nod from Andrei I've come up with a piece of 
automation to extract this data and post it on wiki.

So here is the exhaustive list of everything calling into GC in 
Phobos (-vgc compiler flag):

http://wiki.dlang.org/Stuff_in_Phobos_That_Generates_Garbage

It includes source links, a wild guess at the function's name and 
the compiler's warning message for the potential GC call.

As far as data goes this is about as good as we can get; the next 
phase is labeling this stuff with potential solution(s). Again, 
doing it all by hand is tedious and hardly useful.

Instead we need to observe patterns and label it automatically 
until the non-trivial subset remains. So everybody, please take 
time and identify simple patterns and post back your ideas on 
solution(s).

So far I see the most frequent cases:
- `new SomeException` - switch to RC exceptions
- AA access - ??? (use user-defined AA type as parameter?)
- array concat - ???
- closure - ???



---
Dmitry Olshansky
Oct 07 2014
next sibling parent reply "grm" <gerhard.mueller gmsoft.at> writes:
1.) It may be helpful to reduce the noise in that every match 
after a new is ignored (and probably multiple 'operator ~' alarms 
within the same statement).

2.) There seems to be a problem with repeated alarms:
When viewing the page source, this link shows up numerous times. 
See
https://github.com/D-Programming-Language//phobos/blob/d4d98124ab6cbef7097025a7cfd1161d1963c87e/std/conv.d#L688

/Gerhard
Oct 07 2014
next sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Tuesday, 7 October 2014 at 16:23:19 UTC, grm wrote:
 2.) There seems to be a problem with repeated alarms:
 When viewing the page source, this link shows up numerous 
 times. See
 https://github.com/D-Programming-Language//phobos/blob/d4d98124ab6cbef7097025a7cfd1161d1963c87e/std/conv.d#L688
That's because of multiple template instantiations of the same function. These should probably be filtered for this use case.
Oct 07 2014
prev sibling parent reply "Dmitry Olshansky" <dmitry.olsh gmail.com> writes:
On Tuesday, 7 October 2014 at 16:23:19 UTC, grm wrote:
 1.) It may be helpful to reduce the noise in that every match 
 after a new is ignored (and probably multiple 'operator ~' 
 alarms within the same statement).
The tool is currently a quick line-based hack, hence it has no notion of a statement. It's indeed a good idea to merge all messages for one statement and de-duplicate on a per-statement basis.
 2.) There seems to be a problem with repeated alarms:
 When viewing the page source, this link shows up numerous 
 times. See
 https://github.com/D-Programming-Language//phobos/blob/d4d98124ab6cbef7097025a7cfd1161d1963c87e/std/conv.d#L688
There are lots of toImpl overloads; deduplication is done on a module:LOC basis, so they all show up. Going to fix in v2 to merge all of them in one row.
 /Gerhard
Oct 08 2014
parent "grm" <gerhard.mueller gmsoft.at> writes:
Was in a slight hurry and forgot to mention that I (quite sure: 
we all) very much appreciate the hands-on mentality your approach 
shows.

looking forward to v2

/Gerhard
Oct 08 2014
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/7/2014 8:57 AM, Dmitry Olshansky wrote:
 I made a proposal to quantitatively measure and tabulate all GC allocations in
 Phobos before coming up with solutions to "@nogc Phobos".

 After an approving nod from Andrei I've come up with a piece of automation to
 extract this data and post it on wiki.
Thanks, Dmitri, this is great work. I suggest at a minimum that all of those get notes added to their documentation that they gc allocate.
Oct 07 2014
parent reply "Brad Anderson" <eco gnuk.net> writes:
On Tuesday, 7 October 2014 at 19:24:34 UTC, Walter Bright wrote:
 Thanks, Dmitri, this is great work. I suggest at a minimum that 
 all of those get notes added to their documentation that they 
 gc allocate.
Seems like that's something that should just be automated where possible instead of trying to update the documentation of hundreds of functions.
Oct 07 2014
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/7/14, 12:31 PM, Brad Anderson wrote:
 On Tuesday, 7 October 2014 at 19:24:34 UTC, Walter Bright wrote:
 Thanks, Dmitri, this is great work. I suggest at a minimum that all of
 those get notes added to their documentation that they gc allocate.
Seems like that's something that should just be automated where possible instead of trying to update the documentation of hundreds of functions.
Could ddoc do that? -- Andrei
Oct 07 2014
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2014-10-07 17:57, Dmitry Olshansky wrote:
 I made a proposal to quantitatively measure and tabulate all GC
 allocations in Phobos before coming up with solutions to "@nogc Phobos".

 After an approving nod from Andrei I've come up with a piece of automation
 to extract this data and post it on wiki.

 So here is the exhaustive list of everything calling into GC in Phobos
 (-vgc compiler flag):

 http://wiki.dlang.org/Stuff_in_Phobos_That_Generates_Garbage

 Including source links, a wild guess at function's name and the
 compiler's warning message for potential GC call.

 As far as data goes this is about as good as we can get, the next phase
 is labeling this stuff with potential solution(s). Again doing all by
 hand is tedious and hardly useful.

 Instead we need to observe patterns and label it automatically until the
 non-trivial subset remains. So everybody, please take time and identify
 simple patterns and post back your ideas on solution(s).

 So far I see the most frequent cases:
 - `new SomeException` - switch to RC exceptions
 - AA access - ??? (use user-defined AA type as parameter?)
 - array concat - ???
 - closure - ???
I did some processing of the data and these are the results I got:

772 | 'new' causes GC allocation
515 | operator ~= may cause GC allocation
380 | operator ~ may cause GC allocation
113 | array literal may cause GC allocation
 90 | setting 'length' may cause GC allocation
 77 | indexing an associative array may cause GC allocation
 34 | using closure causes GC allocation
 16 | 'delete' requires GC
  5 | associative array literal may cause GC allocation

Total 9

I didn't look at any source code to see what "new" is actually allocating, for example.

-- 
/Jacob Carlborg
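
For readers unfamiliar with the message categories in the tally, here is a minimal sketch of D constructs that should trigger most of them when compiled with -vgc. The function names are made up for illustration and are not taken from Phobos, the message texts in the comments are copied from the tally, and which construct maps to which exact message may differ between compiler versions.

```d
// Each function below contains one construct of the kind -vgc reports.
// All names are hypothetical; none of this is Phobos code.

int[] viaNew(size_t n)            { return new int[n]; }           // 'new' causes GC allocation
int[] viaAppend(int[] a)          { a ~= 1; return a; }            // operator ~= may cause GC allocation
int[] viaConcat(int[] a, int[] b) { return a ~ b; }                // operator ~ may cause GC allocation
int[] viaLiteral(int x)           { return [x, x]; }               // array literal may cause GC allocation

int[] viaLength(int[] a)
{
    a.length = a.length + 1;      // setting 'length' may cause GC allocation
    return a;
}

void viaAA(ref int[string] aa, string key)
{
    aa[key] = 1;                  // indexing an associative array may cause GC allocation
}

int delegate() viaClosure(int x)
{
    return () => x;               // using closure causes GC allocation (x outlives the call)
}
```

Compiling a file like this with `dmd -vgc -c` is how such a list can be produced for a single module; the tool discussed in this thread aggregates that output over all of Phobos.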
Oct 07 2014
parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Tuesday, 7 October 2014 at 20:13:32 UTC, Jacob Carlborg wrote:
 I didn't look at any source code to see what "new" is actually 
 allocating, for example.
I did some random sampling, and it's 90% exceptions, with the occasional array allocation. I noticed that a lot of the ~ and ~= complaints are in code that only ever runs at compile time (generating strings for mixin). I wonder if there's any way we can silence these false positives.
Oct 07 2014
next sibling parent reply "Dmitry Olshansky" <dmitry.olsh gmail.com> writes:
On Tuesday, 7 October 2014 at 21:59:08 UTC, Peter Alexander wrote:
 On Tuesday, 7 October 2014 at 20:13:32 UTC, Jacob Carlborg 
 wrote:
 I didn't look at any source code to see what "new" is actually 
 allocating, for example.
 I did some random sampling, and it's 90% exceptions, with the 
 occasional array allocation.
That's interesting. I suspected around 50%. Well that's even better since if we do ref-counted exceptions we solve 90% of the problem ;)
 I noticed that a lot of the ~ and ~= complaints are in code 
 that only ever runs at compile time (generating strings for 
 mixin). I wonder if there's any way we can silence these false 
 positives.
I'm going to use a blacklist for these, as the compiler can't in general know if a function is going to be used exclusively at CTFE or not.

Okay, I think I should go a bit further with the second version of the tool.

Things on todo list:
 - make tool general enough to work for any GitHub based project (and hackable for other hostings)
 - use Brian's D parser to accurately find artifacts
 - detect "throw new SomeStuff" pattern and automatically populate potential fix line
 - list all source links in one column for the same function (this needs proper parser)
 - use blacklist of <module-name>:<artifact name> to filter out CTFE
 - use current data from wiki for "potential fix" column if present

Holy grail is:
 - plot DOT call-graph of GC-users, with leaves being the ones reported by -vgc. So I start with this list, then add the functions that use them, then the functions that use these functions and so on.
Oct 08 2014
parent reply "Dmitry Olshansky" <dmitry.olsh gmail.com> writes:
On Wednesday, 8 October 2014 at 07:52:37 UTC, Dmitry Olshansky 
wrote:
 On Tuesday, 7 October 2014 at 21:59:08 UTC, Peter Alexander

 Okay, I think I should go a bit further with the second version 
 of the tool.

 Things on todo list:
  - make tool general enough to work for any GitHub based 
 project (and hackable for other hostings)
  - use Brian's D parser to accurately find artifacts
  - detect "throw new SomeStuff" pattern and automatically 
 populate potential fix line
  - list all source links in one column for the same function 
 (this needs proper parser)
  - use blacklist of <module-name>:<artifact name> to filter out 
 CTFE
  - use current data from wiki for "potential fix" column if 
 present
The new version is out. It's a bit rough for a proper announcement yet and misses a couple of things from my todo list, but the improvement is so radical I decided to share it anyway.

With the new pattern-matcher/parser I hacked together on top of Brian's lexer it's now surgically precise in labeling artifacts. Also I retained as much as possible of the original comments (line numbers have changed), and grouped source links per artifact.

Updated Wiki:
http://wiki.dlang.org/Stuff_in_Phobos_That_Generates_Garbage

Tool:
https://github.com/DmitryOlshansky/gchunt

Also it's "universal" as in any github-hosted D project, for example here is an output for druntime:
http://wiki.dlang.org/Stuff_in_Druntime_That_Generates_Garbage

Still todo:
 - blacklisting of modules/artifacts
 - detect usage of (i)dup
 - label throw new xyz as `EX`
 - a few bugs to fix in artifact labeling
Oct 14 2014
next sibling parent reply "Chris" <wendlec tcd.ie> writes:
On Tuesday, 14 October 2014 at 13:29:33 UTC, Dmitry Olshansky 
wrote:
 The new version is out, it's a bit rough for a proper 
 announcement yet and misses a couple of things from my todo list 
 but the improvement is so radical I decided to share it anyway.
 [...]
Thanks a million! That's very very useful.
Oct 15 2014
parent "Dmitry Olshansky" <dmitry.olsh gmail.com> writes:
On Wednesday, 15 October 2014 at 11:25:58 UTC, Chris wrote:
 Thanks a million! That's very very useful.
I sure hoped so! :) Sadly I'm going to be incredibly busy this weekend, so the proper release date shifts to sometime afterwards.
Oct 16 2014
prev sibling parent "safety0ff" <safety0ff.dev gmail.com> writes:
On Tuesday, 14 October 2014 at 13:29:33 UTC, Dmitry Olshansky 
wrote:
 Also it's "universal" as in any github-hosted D project, for 
 example here is an output for druntime:

 http://wiki.dlang.org/Stuff_in_Druntime_That_Generates_Garbage

 Still todo:
  - a few bugs to fix in artifact labeling
One artefact labelling bug I noticed was GC.removeRange and GC.removeRoot were placed in the Artefact column where it should have been GC.rangeIter and GC.rootIter.
Oct 18 2014
prev sibling parent reply Johannes Pfau <nospam example.com> writes:
On Tue, 07 Oct 2014 21:59:05 +0000,
"Peter Alexander" <peter.alexander.au gmail.com> wrote:

 On Tuesday, 7 October 2014 at 20:13:32 UTC, Jacob Carlborg wrote:
 I didn't look at any source code to see what "new" is actually 
 allocating, for example.
 I did some random sampling, and it's 90% exceptions, with the
 occasional array allocation. I noticed that a lot of the ~ and ~=
 complaints are in code that only ever runs at compile time
 (generating strings for mixin). I wonder if there's any way we can
 silence these false positives.
Code in if(__ctfe) blocks could be (and should be) allowed:
https://github.com/D-Programming-Language/dmd/pull/3572

But if you have got a normal function (string generateMixin()) the compiler can't really know that it's only used at compile time. And if it's not a template the code using the GC will be compiled, even if it's never called. This might be enough to get undefined symbol errors if you don't have a GC, so the error messages are kinda valid.
Oct 08 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/8/14, 1:13 AM, Johannes Pfau wrote:
 Code in if(__ctfe) blocks could be (and should be) allowed:
 https://github.com/D-Programming-Language/dmd/pull/3572

 But if you have got a normal function (string generateMixin()) the
 compiler can't really know that it's only used at compile time. And
 if it's not a template the code using the GC will be compiled, even
 if it's never called. This might be enough to get undefined symbol
 errors if you don't have a GC, so the error messages are kinda
 valid.
That's a bummer. Can we get the compiler to remove the "if (__ctfe)" code after semantic checking?

Andrei
Oct 08 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/8/14, 1:01 PM, Andrei Alexandrescu wrote:
 That's a bummer. Can we get the compiler to remove the "if (__ctfe)"
 code after semantic checking?
Or would "static if (__ctfe)" work? -- Andrei
Oct 08 2014
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 10/8/14 4:10 PM, Andrei Alexandrescu wrote:
 On 10/8/14, 1:01 PM, Andrei Alexandrescu wrote:
 That's a bummer. Can we get the compiler to remove the "if (__ctfe)"
 code after semantic checking?
 Or would "static if (__ctfe)" work? -- Andrei
Please don't ask me to explain why, because I still don't know. But __ctfe is a normal runtime variable :) It has been explained to me before, why it has to be a runtime variable. I think Don knows the answer.

-Steve
Oct 08 2014
parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Wednesday, 8 October 2014 at 20:15:51 UTC, Steven 
Schveighoffer wrote:
 On 10/8/14 4:10 PM, Andrei Alexandrescu wrote:
 On 10/8/14, 1:01 PM, Andrei Alexandrescu wrote:
 That's a bummer. Can we get the compiler to remove the "if 
 (__ctfe)"
 code after semantic checking?
 Or would "static if (__ctfe)" work? -- Andrei
 Please don't ask me to explain why, because I still don't know. 
 But __ctfe is a normal runtime variable :) It has been explained 
 to me before, why it has to be a runtime variable. I think Don 
 knows the answer.
Well, the contents of the static if expression have to be evaluated at compile time, so static if (__ctfe) would always be true.

Also, if it were to somehow work as imagined then you'd have nonsensical things like this:

    static if (__ctfe) class Wat {}

    auto foo() {
        static if (__ctfe) return new Wat();
        return null;
    }

    static wat = foo();

wat now has a type at runtime that only exists at compile time.
Oct 08 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 Or would "static if (__ctfe)" work? -- Andrei
Currently it doesn't work, because __ctfe is a run-time variable. Walter originally tried and failed to make it a compile-time variable.

Bye,
bearophile
Oct 08 2014
prev sibling next sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 08 Oct 2014 13:10:11 -0700
Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com>
wrote:

 Or would "static if (__ctfe)" work? -- Andrei
ha! The Famous Bug! it works, but not as people expected. as "static if" evaluates when function is *compiling*, __ctfe is false there, and so the whole "true" branch will be removed as dead code.

i believe that compiler should warn about this, 'cause i tend to repeatedly hit this funny thing.
Oct 08 2014
parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Wed, 8 Oct 2014 23:20:13 +0300,
ketmar via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Wed, 08 Oct 2014 13:10:11 -0700
 Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com>
 wrote:

 Or would "static if (__ctfe)" work? -- Andrei
 ha! The Famous Bug! it works, but not as people expected. as "static
 if" evaluates when function is *compiling*, __ctfe is false there,
 and so the whole "true" branch will be removed as dead code.

 i believe that compiler should warn about this, 'cause i tend to
 repeatedly hit this funny thing.
Lol, definitely! I made that mistake myself and Robert

-- 
Marco
Oct 10 2014
parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 10 Oct 2014 09:14:28 +0200
Marco Leise via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Wed, 8 Oct 2014 23:20:13 +0300,
 ketmar via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 ha! The Famous Bug! it works, but not as people expected. as "static
 if" evaluates when function is *compiling*, __ctfe is false there,
 and so the whole "true" branch will be removed as dead code.

 i believe that compiler should warn about this, 'cause i tend to
 repeatedly hit this funny thing.

 Lol, definitely! I made that mistake myself and Robert
i made a quick patch that warns on "static if (__ctfe)":
https://issues.dlang.org/show_bug.cgi?id=13601
Oct 10 2014
prev sibling next sibling parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 8 Oct 2014 23:20:13 +0300
ketmar via Digitalmars-d <digitalmars-d puremagic.com> wrote:

p.s. or vice versa: "static if (__ctfe)" is always true, to non-ctfe
code will be removed. sorry, i can't really remember what is true, but
anyway, it works by removeing one of the branches altogether.
Oct 08 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 10/08/2014 10:25 PM, ketmar via Digitalmars-d wrote:
 On Wed, 8 Oct 2014 23:20:13 +0300
 ketmar via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 p.s. or vice versa: "static if (__ctfe)" is always true, to non-ctfe
 code will be removed. sorry, i can't really remember what is true, but
 anyway, it works by removeing one of the branches altogether.
This is probably a regression somewhere after 2.060, because with 2.060 I get

Error: variable __ctfe cannot be read at compile time
Error: expression __ctfe is not constant or does not evaluate to a bool

as I'd expect.
Oct 08 2014
next sibling parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 08 Oct 2014 23:40:01 +0200
Timon Gehr via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 This is probably a regression somewhere after 2.060, because with
 2.060 I get

 Error: variable __ctfe cannot be read at compile time
 Error: expression __ctfe is not constant or does not evaluate to a
 bool

 as I'd expect.
i remember now that i was copypasting toHash() from druntime some time ago and changed "if (__ctfe)" to "static if (__ctfe)" in the process. it compiled and worked fine, and i didn't even notice what i did until i tried to change the non-ctfe part of toHash() and found that my changes had no effect at all. and then i discovered that "static". this was 2.066 or 2.067-git.

and now i can clearly say that "static if (__ctfe)" leaves only the ctfe part. that was somewhat confusing, as i was pretty sure that "if (__ctfe)" *must* be used with "static".
Oct 08 2014
prev sibling parent Martin Nowak <code+news.digitalmars dawg.eu> writes:
On 10/08/2014 11:40 PM, Timon Gehr wrote:
 This is probably a regression somewhere after 2.060, because with 2.060
 I get

 Error: variable __ctfe cannot be read at compile time
 Error: expression __ctfe is not constant or does not evaluate to a bool

 as I'd expect.
Marked the bugzilla case as regression. https://issues.dlang.org/show_bug.cgi?id=13601
Oct 18 2014
prev sibling parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 8 Oct 2014 23:25:18 +0300
ketmar via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Wed, 8 Oct 2014 23:20:13 +0300
 ketmar via Digitalmars-d <digitalmars-d puremagic.com> wrote:
 p.s. or vice versa: "static if (__ctfe)" is always true, to non-ctfe
 code will be removed. sorry, i can't really remember what is true, but
 anyway, it works by removeing one of the branches altogether.
hm. i need some sleep. or new keyboard. or both.
Oct 08 2014
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
On Wed, 08 Oct 2014 13:01:43 -0700,
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 10/8/14, 1:13 AM, Johannes Pfau wrote:
 Code in if(__ctfe) blocks could be (and should be) allowed:
 https://github.com/D-Programming-Language/dmd/pull/3572

 But if you have got a normal function (string generateMixin()) the
 compiler can't really know that it's only used at compile time. And
 if it's not a template the code using the GC will be compiled, even
 if it's never called. This might be enough to get undefined symbol
 errors if you don't have a GC, so the error messages are kinda
 valid.
 That's a bummer. Can we get the compiler to remove the "if (__ctfe)"
 code after semantic checking? Andrei
I think you misunderstood, code in if(__ctfe) is already removed, it never gets into the binary. But the @nogc/-vgc checks still complain about GC allocations in if(__ctfe). This is easy to fix, but as __ctfe is a runtime variable you could also do (if(__ctfe || dice() == 1 )) and

What I meant is that the compiler can't know that this code is CTFE-only and -vgc must complain:

    string generateMixin(string a) { return "int " ~ a ~ ";"; }
    mixin(generateMixin("x"));

But there are workarounds:
http://dpaste.dzfl.pl/e689585c0a95

(Note that dead-code elimination should be able to remove all functions marked as private)
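
One possible workaround, sketched here without knowing whether it matches what the dpaste link shows: express the generator as an eponymous enum template instead of a function. The name `declareInt` is hypothetical. The assumption is that, because the string is built entirely during compilation and no function body is emitted, there is nothing for the run-time GC checks to complain about; this has not been verified against -vgc for every compiler version.

```d
// Hypothetical eponymous enum template: the concatenation is evaluated by
// the compiler when the template is instantiated, so no run-time function
// that appends strings ends up in the object file.
enum declareInt(string name) = "int " ~ name ~ ";";

// Expands to the declaration: int counter;
mixin(declareInt!"counter");

// The generated source text can be checked at compile time.
static assert(declareInt!"counter" == "int counter;");
```

The trade-off is that the argument must be a compile-time constant, which is usually the case for mixin generators anyway.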
Oct 09 2014
prev sibling parent reply Martin Nowak <code+news.digitalmars dawg.eu> writes:
On 10/08/2014 10:01 PM, Andrei Alexandrescu wrote:
 That's a bummer. Can we get the compiler to remove the "if (__ctfe)"
 code after semantic checking?

 Andrei
It seems that __ctfe is treated as constant in the backend.
At least there is no asm code generated for these examples (even without -O).

    void main()
    {
        import std.stdio;
        if (__ctfe)
            writeln("foo");
    }

    int main()
    {
        return __ctfe ? 1 : 0;
    }
Oct 18 2014
parent "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Martin Nowak"  wrote in message news:m1udcn$s35$1 digitalmars.com...

 On 10/08/2014 10:01 PM, Andrei Alexandrescu wrote:
 That's a bummer. Can we get the compiler to remove the "if (__ctfe)"
 code after semantic checking?

 Andrei
 It seems that __ctfe is treated as constant in the backend.
 At least there is no asm code generated for these examples (even without
 -O).

     void main() { import std.stdio; if (__ctfe) writeln("foo"); }

     int main() { return __ctfe ? 1 : 0; }
__ctfe is passed to the backend as 'false', and even without -O the basic optimizations will strip it as dead code. So it is removed after semantic checking, but that doesn't help as semantic checking includes @nogc checking.
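
To make the behaviour described in this sub-thread concrete, here is a small sketch (the function name `whereAmI` is made up): the same function takes the `if (__ctfe)` branch when it is interpreted at compile time and the other branch in the compiled binary.

```d
int whereAmI()
{
    if (__ctfe)
        return 1;   // taken only while the function is interpreted by CTFE
    return 0;       // the only path that remains in the compiled function
}

// An enum initializer must be known at compile time, which forces CTFE.
enum atCompileTime = whereAmI();
static assert(atCompileTime == 1);

unittest
{
    // An ordinary call at run time sees __ctfe == false.
    assert(whereAmI() == 0);
}
```

This is also why `if (__ctfe)` is a plain `if`: the condition is an ordinary boolean whose value depends on who is executing the function, not on how the function was compiled.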
Oct 20 2014
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/7/14, 8:57 AM, Dmitry Olshansky wrote:
 I made a proposal to quantitatively measure and tabulate all GC
 allocations in Phobos before coming up with solutions to "@nogc Phobos".

 After an approving nod from Andrei I've come up with a piece of automation
 to extract this data and post it on wiki.

 So here is the exhaustive list of everything calling into GC in Phobos
 (-vgc compiler flag):

 http://wiki.dlang.org/Stuff_in_Phobos_That_Generates_Garbage
Awesome! I've started adding explanations to the first few entries, let's use that crowdsourcing thing to fill this! -- Andrei
Oct 07 2014
prev sibling next sibling parent "Xinok" <xinok live.com> writes:
On Tuesday, 7 October 2014 at 15:57:59 UTC, Dmitry Olshansky 
wrote:
 So here is the exhaustive list of everything calling into GC in 
 Phobos (-vgc compiler flag):

 http://wiki.dlang.org/Stuff_in_Phobos_That_Generates_Garbage
A correction on TimSortImpl: it does actually generate garbage by calling uninitializedArray to allocate the buffer. The check for __ctfe is unnecessary (it may have been needed some time ago or was added naively). Timsort needs up to O(n/2) auxiliary space and so requires a buffer, but there's no reason for it to be GC-allocated. It could simply be malloc'd and then free'd before the function returns.
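
A minimal sketch of the malloc/free idea, assuming a simplified merge step rather than the actual TimSortImpl code: `mergeHalves` is a hypothetical helper that merges two already sorted runs using a scratch buffer that is released on every exit path.

```d
import core.stdc.stdlib : malloc, free;

/// Hypothetical helper, not the Phobos implementation: merges the sorted
/// runs a[0 .. mid] and a[mid .. $] in place, using a malloc'd scratch
/// buffer that holds a copy of the left run.
void mergeHalves(int[] a, size_t mid) @nogc nothrow
{
    if (mid == 0 || mid >= a.length)
        return;                          // nothing to merge

    auto p = cast(int*) malloc(mid * int.sizeof);
    if (p is null)
        assert(0, "out of memory");
    scope(exit) free(p);                 // runs on every exit path

    int[] tmp = p[0 .. mid];
    tmp[] = a[0 .. mid];                 // copy the left run aside

    size_t i = 0, j = mid, k = 0;
    while (i < mid && j < a.length)
    {
        // Take from the right run only if strictly smaller (keeps stability).
        if (a[j] < tmp[i])
            a[k++] = a[j++];
        else
            a[k++] = tmp[i++];
    }
    while (i < mid)
        a[k++] = tmp[i++];
    // Any remaining right-run elements are already in their final place.
}
```

The `@nogc` attribute lets the compiler confirm that no GC allocation is left in the function. A CTFE-compatible version would still need a separate `if (__ctfe)` path, since malloc cannot be called at compile time.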
Oct 07 2014
prev sibling next sibling parent reply Johannes Pfau <nospam example.com> writes:
On Tue, 07 Oct 2014 15:57:58 +0000,
"Dmitry Olshansky" <dmitry.olsh gmail.com> wrote:

 
 Instead we need to observe patterns and label it automatically 
 until the non-trivial subset remains. So everybody, please take 
 time and identify simple patterns and post back your ideas on 
 solution(s).
 
I just had a look at all closure allocations and identified these
patterns:

1) Fixable by manually stack-allocating closure
   A delegate is passed to some function which stores this delegate and
   therefore correctly doesn't mark the parameter as scope. However,
   the lifetime of the stored delegate is still limited to the current
   function (e.g. it's stored in a struct instance, but on the stack).

   Can be fixed by creating a static struct{T... members; void
   doSomething(){access members}} instance on stack and passing
   &stackvar.doSomething as delegate.

2) Using delegates to add state to ranges
   ----
   return iota(dim).
     filter!(i => ptr[i])().
     map!(i => BitsSet!size_t(ptr[i], i * bitsPerSizeT))().
     joiner();
   ----
   This code adds state to ranges without declaring a new type: the ptr
   variable is not accessible and needs to be moved into a closure.
   Declaring a custom range type is a solution, but not
   straightforward: If the ptr field is moved into the range a closure
   is not necessary. But if the range is copied, its address changes
   and the delegate passed to map is now invalid.

3) Functions taking delegates as generic parameters
   receiveTimeout,receive,formattedWrite accept different types,
   including delegates. The delegates can all be scope to avoid the
   allocation but is void foo(T)(scope T) a good idea? The alternative
   is probably making an overload for delegates with scope attribute.

   (The result is that all functions calling receiveTimeout,... with a
   delegate allocate a closure)

4) Solvable with manual memory management
   Some specific functions can't be easily fixed, but the delegates
   they create have a well defined lifetime (for example spawn creates
   a delegate which is only needed at the startup of a new thread, it's
   never used again). These could be malloc+freed.

5) Design issue
   These functions generally create a delegate using variables passed
   in as parameters. There's no way to avoid closures here. Although
   manual allocation is possible, the lifetime is undefined and can
   only be managed by the GC.

6) Other
   Two cases can be fixed by moving a buffer into a struct or moving a
   function out of a member function into its surrounding class.

Also notable: 17 out of 35 cases are in std.net.curl. This is because curl heavily uses delegates and wrapper delegates.
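
Pattern 3 can be illustrated with a small sketch. The functions `forEachDigit` and `digitSum` are made up for this example and have nothing to do with receiveTimeout or formattedWrite; the point is only that a delegate parameter marked `scope` lets the caller capture local variables without a heap closure.

```d
/// Hypothetical API: `scope` promises that the delegate is not stored
/// anywhere that outlives this call.
void forEachDigit(uint n, scope void delegate(uint) @nogc dg) @nogc
{
    do
    {
        dg(n % 10);
        n /= 10;
    } while (n != 0);
}

uint digitSum(uint n) @nogc
{
    uint sum = 0;
    // The literal captures `sum`, but since the parameter is `scope` the
    // compiler can keep `sum` on the stack instead of allocating a closure.
    forEachDigit(n, (uint d) { sum += d; });
    return sum;
}
```

Because `digitSum` is marked `@nogc`, it would be rejected by the compiler if the call still needed a GC-allocated closure. Whether a generic `void foo(T)(scope T)` is a good idea, as asked above, is a separate design question that this sketch does not answer.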
Oct 08 2014
parent reply "Dmitry Olshansky" <dmitry.olsh gmail.com> writes:
On Wednesday, 8 October 2014 at 11:25:25 UTC, Johannes Pfau wrote:
 Am Tue, 07 Oct 2014 15:57:58 +0000
 schrieb "Dmitry Olshansky" <dmitry.olsh gmail.com>:

 
 Instead we need to observe patterns and label it automatically 
 until the non-trivial subset remains. So everybody, please 
 take time and identify simple patterns and post back your 
 ideas on solution(s).
 
 I just had a look at all closure allocations and identified these patterns:
Awesome! This is exactly the kind of help I wanted.
 1) Fixable by manually stack-allocating closure
    A delegate is passed to some function which stores this delegate and
    therefore correctly doesn't mark the parameter as scope. However,
    the lifetime of the stored delegate is still limited to the current
    function (e.g. it's stored in a struct instance, but on the stack).

    Can be fixed by creating a static struct{T... members; void
    doSomething(){access members}} instance on stack and passing
    &stackvar.doSomething as delegate.
Hm... Probably we can create a template for this.
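
To make pattern (1) concrete, here is a minimal hand-written sketch of the idea, not a finished template. Runner, CounterContext and countThreeTimes are made-up names for illustration; nothing here is taken from Phobos. The captured variable becomes a field of a struct on the stack, and the delegate is formed from a member function of that struct, so the whole function can be marked @nogc.

```d
// Hypothetical consumer that stores the delegate, so the parameter cannot be `scope`.
struct Runner
{
    void delegate() @nogc nothrow callback;

    void run() @nogc nothrow { callback(); }
}

// Hand-written replacement for the compiler-generated closure:
// the captured variables become ordinary fields.
struct CounterContext
{
    int counter;

    void increment() @nogc nothrow { ++counter; }
}

int countThreeTimes() @nogc nothrow
{
    CounterContext ctx;               // lives in this stack frame
    auto r = Runner(&ctx.increment);  // context pointer is &ctx, no GC closure
    r.run();
    r.run();
    r.run();
    return ctx.counter;               // ctx outlives r here, which is what makes this valid
}
```

The caveat is the one from the original description: this is only valid while the stored delegate does not outlive the stack frame that owns ctx. A template could presumably generate the struct, but that part is left open here.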
 2) Using delegates to add state to ranges
    ----
    return iota(dim).
      filter!(i => ptr[i])().
      map!(i => BitsSet!size_t(ptr[i], i * bitsPerSizeT))().
      joiner();
    ----
    This code adds state to ranges without declaring a new type: the ptr
    variable is not accessible and needs to be moved into a closure.
    Declaring a custom range type is a solution, but not
    straightforward: if the ptr field is moved into the range, a closure
    is not necessary. But if the range is copied, its address changes
    and the delegate passed to map is now invalid.
Indeed, such code is fine in "user-space" but has no place in the library.
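
For illustration, one way a library could avoid the closure is a small dedicated range that keeps its state in fields and uses no delegate at all, so copies stay valid. The sketch below is a simplified stand-in, not the actual Phobos code: NonZeroIndices and sumOfNonZeroIndices are hypothetical names, and the range only reproduces the filter step (indices of non-zero words) rather than the full filter/map/joiner chain.

```d
import std.range.primitives : isInputRange;

// Hypothetical range yielding the indices of all non-zero words.
// The state lives in plain fields, so copying the range is harmless.
struct NonZeroIndices
{
    private const(size_t)[] words;
    private size_t index;

    this(const(size_t)[] words) @nogc nothrow pure @safe
    {
        this.words = words;
        skipZeros();
    }

    @property bool empty() const @nogc nothrow pure @safe
    {
        return index >= words.length;
    }

    @property size_t front() const @nogc nothrow pure @safe
    {
        return index;
    }

    void popFront() @nogc nothrow pure @safe
    {
        ++index;
        skipZeros();
    }

    // Advance to the next non-zero word, or to the end.
    private void skipZeros() @nogc nothrow pure @safe
    {
        while (index < words.length && words[index] == 0)
            ++index;
    }
}

static assert(isInputRange!NonZeroIndices);

// Example consumer: the whole pipeline is @nogc because nothing is captured.
size_t sumOfNonZeroIndices(const(size_t)[] words) @nogc nothrow pure @safe
{
    size_t sum;
    foreach (i; NonZeroIndices(words))
        sum += i;
    return sum;
}
```

The price is the extra boilerplate per range type, which is exactly why the lambda version is attractive in user code.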
 3) Functions taking delegates as generic parameters
    receiveTimeout, receive and formattedWrite accept different types,
    including delegates. The delegates can all be scope to avoid the
    allocation, but is void foo(T)(scope T) a good idea? The alternative
    is probably making an overload for delegates with scope attribute.

    (The result is that all functions calling receiveTimeout,... with a
    delegate allocate a closure)
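
A small sketch of the scope-delegate variant from (3), assuming a non-generic signature. each3 and sumViaScopeDelegate are made-up names and do not mirror the real receive or formattedWrite signatures. Because the parameter is scope, the compiler knows the delegate does not escape, so the caller's local can be captured without a heap closure and the caller may be @nogc.

```d
// Hypothetical callee: `scope` promises that dg is not stored anywhere.
void each3(scope void delegate(int) @nogc nothrow dg) @nogc nothrow
{
    foreach (i; 0 .. 3)
        dg(i);
}

int sumViaScopeDelegate() @nogc nothrow
{
    int total;
    // The literal captures `total`; with a scope parameter no closure is allocated.
    each3((int i) { total += i; });
    return total;
}
```

Whether the same effect should be obtained through a generic foo(T)(scope T) or through a dedicated delegate overload is the open question raised above; this sketch only covers the overload case.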

 4) Solvable with manual memory management
    Some specific functions can't be easily fixed, but the delegates
    they create have a well-defined lifetime (for example spawn creates
    a delegate which is only needed at the startup of a new thread, it's
    never used again). These could be malloc+freed.
I think this and (2) can be solved if we come up with solid support for RC-closures.
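
For reference, the manual malloc+free approach from (4) might look roughly like the sketch below when the delegate's lifetime is known. StartupContext and runOnce are hypothetical names chosen for illustration; the real spawn machinery is more involved, and this does not attempt reference counting.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical context standing in for the variables a startup delegate would capture.
struct StartupContext
{
    int arg;
    int result;

    void run() @nogc nothrow { result = arg * 2; }
}

int runOnce(int arg) @nogc nothrow
{
    // Allocate the "closure" outside the GC heap.
    auto ctx = cast(StartupContext*) malloc(StartupContext.sizeof);
    if (ctx is null)
        return -1;               // allocation failure, reported to the caller
    scope (exit) free(ctx);      // lifetime ends once the delegate has been used

    *ctx = StartupContext(arg, 0);
    void delegate() @nogc nothrow dg = &ctx.run;  // context pointer is the malloc'd block
    dg();
    return ctx.result;           // read before the scope guard frees ctx
}
```

This only works because the point where the delegate is no longer needed is known; an RC-closure would be needed where several owners share the context.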
 5) Design issue
    These functions generally create a delegate using variables passed
    in as parameters. There's no way to avoid closures here. Although
    manual allocation is possible, the lifetime is undefined and can
    only be managed by the GC.

 6) Other
    Two cases can be fixed by moving a buffer into a struct or moving a
    function out of a member function into its surrounding class.
Yeah, there are always outliers ;)
 Also notable: 17 out of 35 cases are in std.net.curl. This is because
 curl heavily uses delegates and wrapper delegates.
Interesting... it must be due to cURL's callback-based API. All in all, std.net.curl is a constant source of complaints, so it may need some work to fix other issues anyway.
Oct 08 2014
parent "Kagamin" <spam here.lot> writes:
On Wednesday, 8 October 2014 at 12:09:00 UTC, Dmitry Olshansky 
wrote:
 I think this and (2) can be solved if we come up with solid 
 support for RC-closures.
Delegates don't obey data sharing type checks though. A long-standing language issue.
Oct 08 2014
prev sibling parent Johannes Pfau <nospam example.com> writes:
Am Tue, 07 Oct 2014 15:57:58 +0000
schrieb "Dmitry Olshansky" <dmitry.olsh gmail.com>:

 I made a proposal to quantitatively measure and tabulate all GC 
 allocations in Phobos before coming up with solutions to "@nogc 
 Phobos".
 
 After an approving nod from Andrei I've come up with a piece of 
 automation to extract this data and post it on wiki.
 
 So here is the exhaustive list of everything calling into GC in 
 Phobos (-vgc compiler flag):
 
 http://wiki.dlang.org/Stuff_in_Phobos_That_Generates_Garbage
 
 Including source links, a wild guess at function's name and the 
 compiler's warning message for potential GC call.
 
 As far as data goes this is about as good as we can get, the next 
 phase is labeling this stuff with potential solution(s). Again 
 doing all by hand is tedious and hardly useful.
 
 Instead we need to observe patterns and label it automatically 
 until the non-trivial subset remains. So everybody, please take 
 time and identify simple patterns and post back your ideas on 
 solution(s).
 
 So far I see the most frequent cases:
 - `new SomeException` - switch to RC exceptions
 - AA access - ??? (use user-defined AA type as parameter?)
 - array concat - ???
 - closure - ???
 
 
 
 ---
 Dmitry Olshansky
Another observation: idup/dup are not reported by -vgc. (This is correct behavior: @nogc treats them as ordinary functions without the @nogc attribute and complains, whereas -vgc does not report calls to non-@nogc functions.) However, idup/dup might be common, so it might make sense to grep for them manually.
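
A tiny sketch of the @nogc side of this, using compile-time checks only. The names dupIsNogc and sliceIsNogc are made up for illustration; the point is just that a @nogc function literal calling dup is rejected, while one that merely slices a static array is accepted.

```d
// A @nogc function literal that calls dup does not compile,
// because dup is an ordinary function without the @nogc attribute.
enum dupIsNogc = __traits(compiles, () @nogc {
    int[] a = null;
    auto b = a.dup;
});
static assert(!dupIsNogc);

// Control case: slicing a static array allocates nothing and is accepted.
enum sliceIsNogc = __traits(compiles, () @nogc {
    int[4] a;
    auto b = a[];
});
static assert(sliceIsNogc);
```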
Oct 09 2014