www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - How do you implement a recursive walker of a tree with a lazy range?

reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Instead of the usual opApply approach, how would you implement an auto
return walker function *without* defining a special external struct
that holds all the logic? In other words, using existing Phobos
functionality only, composing functions like map and chain. Example
code which isn't correct:

-----
module test;

import std.algorithm;
import std.range;
import std.stdio;

struct Tree
{
    string value;
    Tree[] children;

    /** Walk the entire tree. */
    auto walk()
    {
        return map!(
            // note the [] workaround to make it a range
            a => chain([a], a.children)
        )([this]);  // note the [] workaround to make it a range
    }

    string toString()
    {
        return value;
    }
}

void main()
{
    auto tree =
        Tree("root",
        [
            Tree("root.1",
            [
                Tree("root.1.1"),
                Tree("root.1.2")
            ]),

            Tree("root.2",
            [
                Tree("root.2.1"),
                Tree("root.2.2")
            ]),
        ]
    );

    writeln(tree.walk);
}
-----

In case it's not displayed correctly I've pasted it here too:
http://dpaste.dzfl.pl/88409749

Can this be properly implemented using map?
Oct 29 2013
parent reply "Chris Cain" <clcain uncg.edu> writes:
Here's the code:
----
     InputRange!Tree walk()
     {
         return inputRangeObject(chain(
             [this],
             children.map!(a=>a.walk())().joiner()));
     }
----
Result:
---
[root, root.1, root.1.1, root.1.2, root.2, root.2.1, root.2.2]
---

It's a bit confusing to explain how I came up with that, but 
essentially you have to cleanly terminate the type inference 
(hence why I used an inputRangeObject) and make sure you map the 
walk function on all of the children. Mapping on the children 
creates a range of ranges (essentially, it turns your range of 
children into a range of results of walking, which are ranges), 
so you must flatten it into a single range using a join. After 
that, I prepended the [this] reference using a regular chain 
similar to what you were doing.

I'm not confident that this is the most efficient way, but it 
works.
Oct 29 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Chris Cain:

     InputRange!Tree walk()
     {
         return inputRangeObject(chain(
             [this],
             children.map!(a=>a.walk())().joiner()));
     }
I have used your nice idea to create another partial D solution for this task: http://rosettacode.org/wiki/Tree_traversal#D import std.stdio, std.algorithm, std.range, std.conv, std.string; struct Tree(T) { T value; Tree* left, right; } alias VisitRange(T) = InputRange!(const Tree!T); VisitRange!T preOrder(T)(in Tree!T* t) { enum self = mixin("&" ~ __FUNCTION__.split(".").back); if (t == null) return typeof(return).init.takeNone.inputRangeObject; return [*t] .chain([t.left, t.right] .filter!(t => t != null) .map!(a => self(a)) .joiner) .inputRangeObject; } VisitRange!T inOrder(T)(in Tree!T* t) { enum self = mixin("&" ~ __FUNCTION__.split(".").back); if (t == null) return typeof(return).init.takeNone.inputRangeObject; return [t.left] .filter!(t => t != null) .map!(a => self(a)) .joiner .chain([*t]) .chain([t.right] .filter!(t => t != null) .map!(a => self(a)) .joiner) .inputRangeObject; } VisitRange!T postOrder(T)(in Tree!T* t) { enum self = mixin("&" ~ __FUNCTION__.split(".").back); if (t == null) return typeof(return).init.takeNone.inputRangeObject; return [t.left, t.right] .filter!(t => t != null) .map!(a => self(a)) .joiner .chain([*t]) .inputRangeObject; } VisitRange!T levelOrder(T)(in Tree!T* t) { enum self = mixin("&" ~ __FUNCTION__.split(".").back); return typeof(return).init.takeNone.inputRangeObject; // UNFINISHED } void main() { alias N = Tree!int; auto tree = new N(1, new N(2, new N(4, new N(7)), new N(5)), new N(3, new N(6, new N(8), new N(9)))); tree.preOrder.map!(t => t.value).writeln; tree.inOrder.map!(t => t.value).writeln; tree.postOrder.map!(t => t.value).writeln; // Should print: [1, 2, 3, 4, 5, 6, 7, 8, 9] tree.levelOrder.map!(t => t.value).writeln; } As you see levelOrder is not finished. I don't know if there is a simple way to return that something for empty input (simpler than typeof(return).init.takeNone.inputRangeObject). I have used it to avoid bugs caused by the successive map.writeln. I don't know if there are nice ways to refactor this code to shorten it some. I'd like that function pointer self to be a D built-in feature. I have used it because when you write such kind of code and you copy&paste the implementations to create a new visit function it's very easy to forget to update the name of the function to call recursively. A related task that is not yet using this idea: http://rosettacode.org/wiki/Same_Fringe#Haskell Bye, bearophile
Oct 30 2013
next sibling parent "Chris Cain" <clcain uncg.edu> writes:
On Wednesday, 30 October 2013 at 12:55:41 UTC, bearophile wrote:
 I have used your nice idea to create another partial D solution 
 for this task:
 http://rosettacode.org/wiki/Tree_traversal#D

 ... snip ...
Very cool! It's pretty close to being done. I'd have to give some real thought to how a level-order traversal might be done lazily using D. But that's a good start.
Oct 30 2013
prev sibling next sibling parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Wed, Oct 30, 2013 at 1:55 PM, bearophile <bearophileHUGS lycos.com>wrote:



 alias VisitRange(T) = InputRange!(const Tree!T);
Does that syntax come with DMD 2.064? The (T) part, I mean.
Oct 30 2013
parent "Dicebot" <public dicebot.lv> writes:
On Wednesday, 30 October 2013 at 17:31:10 UTC, Philippe Sigaud 
wrote:
 On Wed, Oct 30, 2013 at 1:55 PM, bearophile 
 <bearophileHUGS lycos.com>wrote:



 alias VisitRange(T) = InputRange!(const Tree!T);
Does that syntax come with DMD 2.064? The (T) part, I mean.
Yes. http://dlang.org/changelog#eponymous_template
Oct 30 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 10/30/13, Philippe Sigaud <philippe.sigaud gmail.com> wrote:
 Does that syntax come with DMD 2.064? The (T) part, I mean.
Yes. Glad to see you here btw, do you have your own solution to the problem?
Oct 30 2013
prev sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Wed, Oct 30, 2013 at 7:09 PM, Andrej Mitrovic <andrej.mitrovich gmail.com
 wrote:
 On 10/30/13, Philippe Sigaud <philippe.sigaud gmail.com> wrote:
 Does that syntax come with DMD 2.064? The (T) part, I mean.
Yes.
Ah, that's cool! I've been waiting for that one for years.
 Glad to see you here btw, do you have your own solution to the problem?
As Benjamin, I used an internal stack. But that was years ago, let me see. https://github.com/PhilippeSigaud/dranges/blob/master/treerange.d#L185 or, for graphs: https://github.com/PhilippeSigaud/dranges/blob/master/graphrange.d#L25 Hmm, what I did there doesn't seem really lazy. I guess with today Phobos and DMD it's possible to have far less allocations. (that code dates from 2009-2011)
Oct 30 2013
prev sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 10/30/13, Chris Cain <clcain uncg.edu> wrote:
 I'm not confident that this is the most efficient way, but it
 works.
It allocates, I'm looking for a lazy range. I would be surprised that such a common task as iterating a tree is not possible without using classes and workarounds.
Oct 30 2013
next sibling parent Benjamin Thaut <code benjamin-thaut.de> writes:
Am 30.10.2013 15:58, schrieb Andrej Mitrovic:
 On 10/30/13, Chris Cain <clcain uncg.edu> wrote:
 I'm not confident that this is the most efficient way, but it
 works.
It allocates, I'm looking for a lazy range. I would be surprised that such a common task as iterating a tree is not possible without using classes and workarounds.
I had this problem too. I just wrote my own range which uses a stack internally to remember the nodes that still need to be visited. When the algorithm hits a non leaf node it pushes the "right" node onto the stack and continues with the "left" node. Whenever the algorithm hits a leaf node it takes a node from the stack and continues with that. That results in a recusion typical depth-first iteration of the nodes. -- Kind Regards Benjamin Thaut
Oct 30 2013
prev sibling parent reply "Chris Cain" <clcain uncg.edu> writes:
On Wednesday, 30 October 2013 at 14:58:21 UTC, Andrej Mitrovic 
wrote:
 It allocates, I'm looking for a lazy range. I would be 
 surprised that
 such a common task as iterating a tree is not possible without 
 using
 classes and workarounds.
It allocates, but it's still a lazy range. It only allocates when popFront is called. I looked around a bit more and found "only" in std.range that can do this without allocating: With the walk function like this: --- InputRange!Tree walk() { return inputRangeObject(chain( only(this), children.map!(a=>a.walk).joiner)); } --- you would avoid allocations.
Oct 30 2013
parent "Chris Cain" <clcain uncg.edu> writes:
On Wednesday, 30 October 2013 at 16:50:33 UTC, Chris Cain wrote:
 you would avoid allocations.
Actually, let me clarify. You'd avoid *those* allocations. inputRangeObject allocates a class. Unfortunately, I'm not certain it's possible to do this cleanly without such a thing using inputRangeObject because otherwise you'd have infinite recursion on the types. To limit it to a single allocation, perhaps you could have two functions, walk and innerWalk. The outer walk could allocate some memory on the heap (requiring knowledge of the maximum depth of the tree either by exploring the tree prior to walking it or having the tree hold parent pointers & depths) and the inner walk could take the memory as a parameter. Each recursion would just do mem[1..$] Technically, this should work as well. But it won't be as clean of a solution. It still wouldn't have "no" allocations, but it's the best I can do to limit them. Ultimately, if you want to do this lazily with no allocations and with maximum performance, I think you'll have to make a custom range struct holding the things you need. That said, it can be done with the standard library (and as I've shown, it can look pretty clean and simple) but it probably won't be maximum performance and have no allocations.
Oct 30 2013