
digitalmars.D.learn - Data-Flow (Escape) Analysis to Aid in Avoiding GC

reply "Per Nordlöw" <per.nordlow gmail.com> writes:
When reading/parsing data from disk I often try to write code such 
as

     foreach (const line; File(filePath).byLine)
     {
         auto s = line.splitter(" ");

         const x = s.front.to!uint; s.popFront;
         const y = s.front.to!double; s.popFront;
         ...
     }

In response to all the discussions regarding performance problems 
related to the GC, I wonder if there are plans to implement 
data-flow analysis in DMD that can detect that the calls to 
s.front in the example above don't need to use the GC. This is 
because their references aren't used outside of the foreach scope 
(Escape Analysis).
Feb 13 2015
next sibling parent reply "Kagamin" <spam here.lot> writes:
Whether s.front uses GC is determined by s.front implementation, 
caller can't affect it.
Feb 13 2015
next sibling parent "Per Nordlöw" <per.nordlow gmail.com> writes:
On Friday, 13 February 2015 at 09:13:48 UTC, Kagamin wrote:
 Whether s.front uses GC is determined by s.front 
 implementation, caller can't affect it.
I'm talking about internal changes to DMD, in this case.
Feb 13 2015
prev sibling parent reply "Per Nordlöw" <per.nordlow gmail.com> writes:
On Friday, 13 February 2015 at 09:13:48 UTC, Kagamin wrote:
 Whether s.front uses GC is determined by s.front 
 implementation, caller can't affect it.
Compiling https://github.com/nordlow/justd/blob/master/t_splitter.d with -vgc on dmd git master gives no warnings about GC allocations! Is this really true!?
Feb 13 2015
parent reply "Tobias Pankrath" <tobias pankrath.net> writes:
On Friday, 13 February 2015 at 11:34:50 UTC, Per Nordlöw wrote:
 On Friday, 13 February 2015 at 09:13:48 UTC, Kagamin wrote:
 Whether s.front uses GC is determined by s.front 
 implementation, caller can't affect it.
Compiling https://github.com/nordlow/justd/blob/master/t_splitter.d with -vgc on dmd git master gives no warnings about GC allocations! Is this really true!?
Why should splitter.front allocate?
Feb 13 2015
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Tobias Pankrath:

 Why should splitter.front allocate?
I think that front used to be able to throw Unicode exceptions, which require the GC. But I think they have since become asserts, which don't require the GC.
Bye,
bearophile
Feb 13 2015
prev sibling parent reply "Per Nordlöw" <per.nordlow gmail.com> writes:
On Friday, 13 February 2015 at 11:52:50 UTC, Tobias Pankrath 
wrote:
 On Friday, 13 February 2015 at 11:34:50 UTC, Per Nordlöw wrote:
 On Friday, 13 February 2015 at 09:13:48 UTC, Kagamin wrote:
 Whether s.front uses GC is determined by s.front 
 implementation, caller can't affect it.
Compiling https://github.com/nordlow/justd/blob/master/t_splitter.d with -vgc on dmd git master gives no warnings about GC allocations! Is this really true!?
Why should splitter.front allocate?
Ahh, I think I understand now. I thought that creating a slice meant a GC allocation, but it doesn't, right? It just increases a reference counter somewhere and creates a stack context for the slice, right?
But what about to!string in

     auto x = line.strip.splitter!isWhite.joiner("_").to!string;

?
Feb 13 2015
parent reply "Tobias Pankrath" <tobias pankrath.net> writes:
On Friday, 13 February 2015 at 12:40:57 UTC, Per Nordlöw wrote:
 On Friday, 13 February 2015 at 11:52:50 UTC, Tobias Pankrath 
 wrote:
 On Friday, 13 February 2015 at 11:34:50 UTC, Per Nordlöw wrote:
 On Friday, 13 February 2015 at 09:13:48 UTC, Kagamin wrote:
 Whether s.front uses GC is determined by s.front 
 implementation, caller can't affect it.
Compiling https://github.com/nordlow/justd/blob/master/t_splitter.d with -vgc on dmd git master gives no warnings about GC allocations! Is this really true!?
Why should splitter.front allocate?
Ahh, I think I understand now. I thought that creating a slice meant a GC allocation, but it doesn't, right? It just increases a reference counter somewhere and creates a stack context for the slice, right?
There are no reference counts involved, just simple arithmetic.

     string a = "abc";
     string b = a[1 .. $];

Conceptually, a slice is a pointer plus a length:

     struct Slice(T) { T* ptr; size_t length; }

     Slice!char a = { <ptr_to_constant>, 3 };
     Slice!char b = { a.ptr + 1, 2 };
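To make the arithmetic concrete, here is a minimal sketch that can be compiled and run. The helper name dropFirst is made up for this illustration. Marking it @nogc asks the compiler to reject any GC allocation inside it, and the asserts check that the slice points into the same memory as the original string.

```d
// Hypothetical helper: returns everything after the first character.
// It compiles under @nogc because slicing only adjusts pointer and length.
// (An empty argument would fail the bounds check at run time.)
@nogc nothrow pure
string dropFirst(string s)
{
    return s[1 .. $];
}

void main()
{
    string a = "abc";
    string b = dropFirst(a);

    assert(b == "bc");
    assert(b.length == a.length - 1);
    // b starts one byte into the memory of a; nothing was copied.
    assert(b.ptr == a.ptr + 1);
}
```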
 But what about to!string in

     auto x = line.strip.splitter!isWhite.joiner("_").to!string;

 ?
That needs to allocate. -vgc probably lists only the GC allocations inside the current scope and doesn't look inside called functions. For that, there is @nogc.
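A small sketch of what @nogc enforces, as far as I understand it: the attribute is checked at compile time, and a @nogc function may only call other @nogc functions, so allocations in callees are caught too. The function names firstWord and copied are invented for this example.

```d
import std.conv : to;

// Returning a slice of the input is accepted under @nogc.
@nogc nothrow
const(char)[] firstWord(const(char)[] line)
{
    foreach (i, c; line)
        if (c == ' ')
            return line[0 .. i];
    return line;
}

// to!string has to create a fresh immutable copy on the GC heap,
// so this function is deliberately not marked @nogc.
string copied(const(char)[] line)
{
    return line.to!string;
}

// A lambda that allocates with `new` does not compile when marked @nogc.
static assert(!__traits(compiles, () @nogc { auto a = new int[4]; }));

void main()
{
    assert(firstWord("12 3.5") == "12");
    assert(copied("abc") == "abc");
}
```

Adding @nogc to copied should turn the hidden allocation into a compile error, which is the check that -vgc alone may not give you for called functions.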
Feb 13 2015
parent reply "Per Nordlöw" <per.nordlow gmail.com> writes:
On Friday, 13 February 2015 at 12:50:14 UTC, Tobias Pankrath 
wrote:
 There are no reference counts involved, just simple arithmetic.

 string a = "abc";
 string b = a[1 .. $];
Then how does the GC know when to release when there are multiple references? Is this because string references immutable storage?
 Probably -vgc only lists GC allocation inside the current scope 
 and doesn't look inside called functions. For this, there is 
 @nogc.
Isn't vgc recursively inferred bottom-up for calls to template functions?
Feb 13 2015
next sibling parent "Tobias Pankrath" <tobias pankrath.net> writes:
On Friday, 13 February 2015 at 12:58:40 UTC, Per Nordlöw wrote:
 On Friday, 13 February 2015 at 12:50:14 UTC, Tobias Pankrath 
 wrote:
 There are no reference counts involved, just simple arithmetic.

 string a = "abc";
 string b = a[1 .. $];
Then how does the GC know when to release when there are multiple references? Is this because string references immutable storage?
Before freeing a block of memory, it scans memory for pointers to that block.
 Isn't vgc recursively inferred bottom-up for calls to template 
 functions?
I didn't know -vgc existed until your question, so I don't know exactly what it does. I thought it would highlight calls to GC.malloc in the current function, even those emitted by the compiler, e.g. for closures. I don't think it treats template functions differently from other functions (it only considers their signature).
Feb 13 2015
prev sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Per Nordlöw:

 Then how does the GC know when to release when there are 
 multiple references?
The mark phase determines what is reachable and what is not. If an object has one pointer to it, or one hundred pointers, it is not removed. If nothing points to it, it is removed.
I suggest you read up on how a mark&sweep GC works, or better, implement a bare-bones mark&sweep GC yourself in C for Lisp-like cons cells; you only need 100 lines of code or so to do it.
Bye,
bearophile
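For readers who want to see the idea before writing their own, here is a rough sketch of mark&sweep over cons cells. It is written in D rather than C, uses array indices in place of pointers, and all names (Cell, Heap, cons, collect) are made up for this illustration. It is not how the D runtime's collector is implemented.

```d
// A cons cell in a tiny fixed-size heap; -1 plays the role of nil.
struct Cell
{
    int car;        // payload
    int cdr = -1;   // index of the next cell
    bool inUse;     // currently allocated?
    bool marked;    // set during the mark phase
}

struct Heap
{
    Cell[8] cells;

    // Take the first free cell; returns -1 when the heap is full.
    int cons(int car, int cdr)
    {
        foreach (i, ref c; cells)
        {
            if (!c.inUse)
            {
                c = Cell(car, cdr, true, false);
                return cast(int) i;
            }
        }
        return -1;
    }

    // Mark phase: follow cdr links from one root, stopping at
    // cells that were already reached through another root.
    void mark(int root)
    {
        for (int i = root; i != -1 && !cells[i].marked; i = cells[i].cdr)
            cells[i].marked = true;
    }

    // Sweep phase: free every allocated cell that was not marked,
    // then clear the marks. Returns the number of freed cells.
    size_t sweep()
    {
        size_t freed;
        foreach (ref c; cells)
        {
            if (c.inUse && !c.marked)
            {
                c.inUse = false;
                ++freed;
            }
            c.marked = false;
        }
        return freed;
    }

    size_t collect(const int[] roots)
    {
        foreach (r; roots)
            mark(r);
        return sweep();
    }
}

void main()
{
    Heap h;
    int tail    = h.cons(3, -1);
    int common  = h.cons(2, tail);   // referenced by two lists
    int listA   = h.cons(1, common);
    int listB   = h.cons(0, common);
    int garbage = h.cons(99, -1);    // never used as a root

    // Only the unreferenced cell is freed.
    assert(h.collect([listA, listB]) == 1);
    // Dropping listA frees its head; the shared tail survives via listB.
    assert(h.collect([listB]) == 1);
    // With no roots left, the remaining three cells are freed.
    assert(h.collect(null) == 3);
}
```

Note that no counting of references happens anywhere: a cell survives if at least one root can reach it, whether through one path or many.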
Feb 13 2015
parent "Per Nordlöw" <per.nordlow gmail.com> writes:
On Friday, 13 February 2015 at 13:07:04 UTC, bearophile wrote:
 I suggest you read up on how a mark&sweep GC works, or better, 
 implement a bare-bones mark&sweep GC yourself in C for 
 Lisp-like cons cells; you only need 100 lines of code or so to 
 do it.
Got it. Thanks.
Feb 13 2015
prev sibling parent "Tobias Pankrath" <tobias pankrath.net> writes:
On Friday, 13 February 2015 at 08:21:53 UTC, Per Nordlöw wrote:
 When reading/parsing data from disk I often try to write code 
 such as

     foreach (const line; File(filePath).byLine)
     {
         auto s = line.splitter(" ");

         const x = s.front.to!uint; s.popFront;
         const y = s.front.to!double; s.popFront;
         ...
     }

 In response to all the discussions regarding performance 
 problems related to the GC, I wonder if there are plans to 
 implement data-flow analysis in DMD that can detect that the 
 calls to s.front in the example above don't need to use the 
 GC. This is because their references aren't used outside of the 
 foreach scope (Escape Analysis).
I haven't looked into the source, but the only point where this snippet should allocate is at byLine.
Feb 13 2015