digitalmars.D.learn - Any python-like generator patterns in D?

reply Samuel Lampa <samuel.lampa gmail.com> writes:
Hi,

I found these slides very interesting, on how python generator patterns can
be used to create re-usable code-parts that can be "piped" together ad
infinitum, to create e.g. parsing pipelines requiring minimal memory (things
a sysadmin working with huge files might need quite often):

http://www.dabeaz.com/generators/index.html
PDF: http://www.dabeaz.com/generators/Generators.pdf

It would be nice to do this kind of stuff in D though, to hopefully gain
some performance, so I wonder, is there similar stuff in D, and where to
find info about it?

Cheers,
Samuel
Sep 08 2011
parent reply Ali Çehreli <acehreli yahoo.com> writes:
D uses the range concept. Phobos ranges are based on the ideas presented in this article:

  http://www.informit.com/articles/printerfriendly.aspx?p=1407357

But some of the design and even the names of range types have changed since that article was written.

These and other Phobos modules make use of ranges:

  http://www.d-programming-language.org/phobos/std_range.html
  http://www.d-programming-language.org/phobos/std_algorithm.html
  http://www.d-programming-language.org/phobos/std_array.html

Ali

P.S. I am in the process of translating my Turkish D book to English. For completeness, here are the two chapters about ranges:

  http://ddili.org/ders/d/araliklar.html
  http://ddili.org/ders/d/araliklar_baska.html
Sep 08 2011
next sibling parent Samuel Lampa <samuel.lampa gmail.com> writes:
Many thanks!

I'll check these links.

// Samuel

Sep 08 2011
prev sibling next sibling parent reply Samuel Lampa <samuel.lampa gmail.com> writes:
I'm coming back to this again [1].

What I am still looking for is a concise guide on how to do stream 
(string) processing in D, in as simple and elegant a way as 
possible [2]. Any tips?

(The InformIT article seemed rather technical, and more focused on 
proposing new APIs than explaining how to actually do this in D, as 
efficiently as possible)

Best Regards
// Samuel

[1] I'm thinking of re-implementing in D some bioinformatics Python scripts 
I've created for the Rosalind course platform.
[2] I'm using generator functions in Python, so I wanted to get the same 
nice low memory usage by using stream processing.


-- 
Developer at SNIC-UPPMAX www.uppmax.uu.se
Developer at Dept of Pharm Biosciences www.farmbio.uu.se
Feb 21 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Samuel Lampa:

 What I am still looking for is a concise guide on how to do 
 stream (string) processing in D, in as simple and elegant a 
 way as possible [2]. Any tips?
Are you willing to explain what you mean, or to show Python links/code? Bye, bearophile
Feb 21 2013
next sibling parent reply Samuel Lampa <samuel.lampa gmail.com> writes:
On 02/21/2013 04:29 PM, bearophile wrote:
 Samuel Lampa:

 What I am still looking for is a concise guide on how to do 
 stream (string) processing in D, in as simple and elegant a 
 way as possible [2]. Any tips?
Are you willing to explain what you mean, or to show Python links/code?
Sure. Please find some sample python code, where I create a pipeline of generator objects that, when executed, loads only one item at a time (one line in this case). The last statement ("for line ... print ...") drives the execution of the whole chain of generators, pulling another line through the chain for every line that is iterated over in the for loop, keeping memory usage to a minimum.

# ---- PYTHON CODE STARTS HERE ----

# Define some functions that return generator objects
def generate_lines(filename):
    for line in open(filename):
        yield line.rstrip("\n")

def generate_uppercase_lines(input_gen_obj):
    for line in input_gen_obj:
        yield line.upper()

def generate_lines_for_output(input_gen_obj):
    for line in input_gen_obj:
        yield "Line: " + line

# Chain together the generator objects to produce a "pipeline":
gen_lines = generate_lines("infile.txt")
gen_uppercase_lines = generate_uppercase_lines(gen_lines)
gen_lines_for_output = generate_lines_for_output(gen_uppercase_lines)

# Do something with the last generator object in the pipeline,
# in order to drive the whole pipeline, one line at a time.
for line in gen_lines_for_output:
    print(line)

# ---- PYTHON CODE ENDS HERE ----

Also, python has an even more compact syntax for creating generator objects:

[new generator obj] = ([function to do something]([item]) for [item] in [other generator obj])

Writing the above code with this syntax would be something like this:

# ---- PYTHON CODE STARTS HERE ----

# Simultaneously create the generator objects, and the pipeline
gen_lines = (line.rstrip("\n") for line in open("infile.txt"))
gen_uppercase_lines = (line.upper() for line in gen_lines)
# Each stage consumes the previous one, so the last stage reads
# from gen_uppercase_lines (not gen_lines)
gen_final_lines = ("Line: " + line for line in gen_uppercase_lines)

# Drive the pipeline, one line at a time
for line in gen_final_lines:
    print(line)

# ---- PYTHON CODE ENDS HERE ----

Hope this clarifies.

Otherwise, the main go-to tutorial on python generators is the slides by David Beazley:

  http://www.dabeaz.com/generators/
  http://www.dabeaz.com/generators/Generators.pdf

Cheers
// Samuel
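
The claim that the final loop pulls one line at a time through the whole chain can be checked with a small trace. This is only a sketch under assumptions: `run_pipeline` and the `trace` list are hypothetical helpers that are not part of the post, and `io.StringIO` stands in for `open("infile.txt")` so the example needs no file on disk.

```python
import io

def generate_lines(lines_source, trace):
    # First stage: strip the newline, recording each read in the trace
    for line in lines_source:
        trace.append("read")
        yield line.rstrip("\n")

def generate_uppercase_lines(input_gen_obj, trace):
    # Second stage: uppercase each line as it arrives
    for line in input_gen_obj:
        trace.append("upper")
        yield line.upper()

def run_pipeline(text):
    """Run the two-stage pipeline over text and record the order of events."""
    trace = []
    source = io.StringIO(text)  # assumption: stands in for open("infile.txt")
    pipeline = generate_uppercase_lines(generate_lines(source, trace), trace)
    output = []
    for line in pipeline:
        trace.append("consume")
        output.append(line)
    return output, trace

if __name__ == "__main__":
    output, trace = run_pipeline("a\nb\n")
    print(output)  # ['A', 'B']
    print(trace)   # ['read', 'upper', 'consume', 'read', 'upper', 'consume']
```

The events interleave per line rather than grouping by stage, which is why only one line needs to be held in memory at any time.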
Feb 21 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Samuel Lampa:

 python has an even more compact syntax for creating generator 
 objects:
D doesn't have a generator syntax, but it's easy to do what you want using UFCS in D.
 gen_lines = (line.rstrip("\n") for line in open("infile.txt"))
 gen_uppercase_lines = (line.upper() for line in gen_lines)
 gen_final_lines = ("Line: " + line for line in gen_uppercase_lines)

 # Drive the pipeline, one line at a time
 for line in gen_final_lines:
     print(line)
==>

import std.stdio, std.algorithm, std.string;

void main() {
    auto gen = File("infile.txt")
               .byLine()
               .map!chomp
               .map!toUpper
               .map!(line => "Line: " ~ line);

    writefln("%-(%s\n%)", gen);
}

Bye,
bearophile
Feb 21 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
 import std.stdio, std.algorithm, std.string;

 void main() {
     auto gen = File("infile.txt")
                .byLine()
                .map!chomp
                .map!toUpper
                .map!(line => "Line: " ~ line);

     writefln("%-(%s\n%)", gen);
 }
Or faster:

import std.stdio, std.algorithm, std.string;

void main() {
    auto gen = File("infile.txt")
               .byLine()
               .map!(line => "Line: " ~ line.chomp.toUpper);

    writefln("%-(%s\n%)", gen);
}

Bye,
bearophile
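
For comparison with the Python side of the thread, fusing the three stages into one can be sketched with generator expressions as well. The function names `chained` and `fused` are hypothetical, and `io.StringIO` is an assumption standing in for the input file. The sketch only shows that both forms yield the same lines; it does not measure whether the fused form is faster in Python.

```python
import io

def chained(lines):
    # Three separate lazy stages, one per transformation
    stripped = (line.rstrip("\n") for line in lines)
    upper = (line.upper() for line in stripped)
    return ("Line: " + line for line in upper)

def fused(lines):
    # The same three transformations applied in a single lazy stage
    return ("Line: " + line.rstrip("\n").upper() for line in lines)

if __name__ == "__main__":
    text = "ab\ncd\n"
    print(list(chained(io.StringIO(text))))  # ['Line: AB', 'Line: CD']
    print(list(fused(io.StringIO(text))))    # ['Line: AB', 'Line: CD']
```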
Feb 21 2013
parent Samuel Lampa <samuel.lampa gmail.com> writes:
On 02/21/2013 05:55 PM, bearophile wrote:
 import std.stdio, std.algorithm, std.string;

 void main() {
     auto gen = File("infile.txt")
                .byLine()
                .map!(line => "Line: " ~ line.chomp.toUpper);

     writefln("%-(%s\n%)", gen);
 }
Cool, thanks! // Samuel
Feb 21 2013
prev sibling parent Samuel Lampa <samuel.lampa gmail.com> writes:
On 02/21/2013 05:14 PM, Samuel Lampa wrote:
 Sure. Please find some sample python code <snip>
Ugh, s/find/find below/
Feb 21 2013
prev sibling parent reply "jerro" <a a.com> writes:
D has a concept of iterators, which are similar to python's 
iterators, but more general (python iterators are roughly 
equivalent to input ranges). There are no generator functions in 
D, but you can use fibers to get similar functionality, as Nick 
Sabalausky showed in this thread:

http://forum.dlang.org/thread/jno6o5$qtb$1@digitalmars.com

I've put a simple example that uses a similar approach here:

http://dpaste.dzfl.pl/051751d1

But if you use this, performance will suffer. The last time I 
tried it, fiber-based ranges were limited to something like 20 
million iterations per second (which is similar to Python's 
generators) on my machine. Simple ranges, on the other hand, can 
do billions of iterations per second in some cases.
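
The rough equivalence between Python iterators and input ranges can be sketched in Python itself. An input range in D exposes three primitives: `empty`, `front` and `popFront`. The class `CountdownRange` and the adapter `iterate` below are hypothetical names made up for this illustration, and `pop_front` is just a Python spelling of D's `popFront`.

```python
class CountdownRange:
    """Hypothetical input range counting down from start to 1."""

    def __init__(self, start):
        self._current = start

    @property
    def empty(self):
        # True once there is nothing left to read
        return self._current <= 0

    @property
    def front(self):
        # The current element; reading it does not advance the range
        return self._current

    def pop_front(self):
        # Advance to the next element
        self._current -= 1

def iterate(input_range):
    """Adapt an object with empty/front/pop_front to a Python generator."""
    while not input_range.empty:
        yield input_range.front
        input_range.pop_front()

if __name__ == "__main__":
    print(list(iterate(CountdownRange(3))))  # [3, 2, 1]
```

A Python iterator folds these three operations into a single `next()` call, while a range keeps them separate, so `front` can be read repeatedly without consuming the element.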
Feb 21 2013
next sibling parent "jerro" <a a.com> writes:
On Thursday, 21 February 2013 at 16:19:51 UTC, jerro wrote:
 D has a concept of iterators
Should be: D has a concept of ranges
Feb 21 2013
prev sibling parent Samuel Lampa <samuel.lampa gmail.com> writes:
On 02/21/2013 05:19 PM, jerro wrote:
 D has a concept of [ranges], which are similar to python's iterators, 
 but more general (python iterators are roughly equivalent to input 
 ranges). There are no generator functions in D, but you can use fibers 
 to get similar functionality ...<snip>
Right, will dig into the ranges then! :) Thanks for the hints! // Samuel
Feb 21 2013
prev sibling parent reply "jerro" <a a.com> writes:
 P.S. I am in the process of translating my Turkish D book to 
 English. For
 completeness, here are the two chapters about ranges:

   http://ddili.org/ders/d/araliklar.html

   http://ddili.org/ders/d/araliklar_baska.html
Why not link to the English translation, too?

  http://ddili.org/ders/d.en/ranges.html
Feb 21 2013
next sibling parent Samuel Lampa <samuel.lampa gmail.com> writes:
On 02/21/2013 04:41 PM, jerro wrote:
 P.S. I am in the process of translating my Turkish D book to English. 
 For
 completeness, here are the two chapters about ranges:

   http://ddili.org/ders/d/araliklar.html

   http://ddili.org/ders/d/araliklar_baska.html
 Why not link to the English translation, too?

   http://ddili.org/ders/d.en/ranges.html
Ah, yeah, I thought the translation had not happened yet :) Looks very interesting, might be what I was looking for (gotta study it in more detail ...) // Samuel
Feb 21 2013
prev sibling parent Ali Çehreli <acehreli yahoo.com> writes:
Ha ha! Just saw this. :)

On 02/21/2013 07:41 AM, jerro wrote:
 P.S. I am in the process of translating my Turkish D book to 
English. For
 completeness, here are the two chapters about ranges:

 http://ddili.org/ders/d/araliklar.html

 http://ddili.org/ders/d/araliklar_baska.html
 Why not link to the English translation, too?

 http://ddili.org/ders/d.en/ranges.html
I might have made a typo there, but no: my quote above was from Thu, 08 Sep 2011 13:35:02 +0200, but the very first English draft of that chapter was submitted on Oct 11, 2011:

  http://code.google.com/p/ddili/source/list?path=/trunk/src/ders/d.en/ranges.d&start=267

:)

Ali
Mar 13 2013