www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.announce - Range-Based Graph Search in D (blog post)

reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
Trying to write a bit more about D on my blog now. To start, I've 
written about a proof-of-concept range-based API for graph search.

http://poita.org/2014/01/09/range-based-graph-search-in-d.html

I'd greatly appreciate any feedback on the design. As we don't 
yet have a graph library for D, it would be interesting to 
discuss APIs in general.

-Peter
Jan 09 2014
next sibling parent reply "qznc" <qznc web.de> writes:
On Thursday, 9 January 2014 at 22:04:16 UTC, Peter Alexander 
wrote:
 Trying to write a bit more about D on my blog now. To start, 
 I've written about a proof-of-concept range-based API for graph 
 search.

 http://poita.org/2014/01/09/range-based-graph-search-in-d.html

 I'd greatly appreciate any feedback on the design. As we don't 
 yet have a graph library for D, it would be interesting to 
 discuss APIs in general.
Beautiful! I was thinking about something like this myself. Your code and your compositional examples are nicer than I expected something like this to turn out. For the visitation API design: Your map approach (bool[Vertex] m_visited) is probably the most generic one. A variant, where the nodes store the flag internally is more efficient, though. Since you do not want to walk the whole graph to reset the flag, a counter is more appropriate for multiple walks. You (or the graph data structure) must remember the current flag count. So the basic idea is: A vertex contains a counter/flag initialized with zero. On visiting it, the counter is increased to a the global value. Visit can be checked by comparing it to a global value. Maybe a special vertex interface check could be used similar to isInputRange. template isFlaggedVertex(R) { enum bool isFlaggedVertex = is(typeof( (inout int = 0) { int global_value = 1; R v = void; // can define a vertex object if (r.is_visited(global_value)) {} // can test for visited r.mark_visited(global_value); // can mark as visited })); } Depending on this check, the search functions could use internal or external visit mechanisms.
Jan 09 2014
parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Thursday, 9 January 2014 at 22:53:02 UTC, qznc wrote:
 For the visitation API design: Your map approach (bool[Vertex] 
 m_visited) is probably the most generic one.

 A variant, where the nodes store the flag internally is more 
 efficient, though.
Unless the graph is infinite ;-) But yes, for most graphs that would likely be more efficient than a hash lookup. I'll keep note of that. Thanks!
Jan 11 2014
parent "Tove" <tove fransson.se> writes:
On Saturday, 11 January 2014 at 09:51:57 UTC, Peter Alexander 
wrote:
 On Thursday, 9 January 2014 at 22:53:02 UTC, qznc wrote:
 For the visitation API design: Your map approach (bool[Vertex] 
 m_visited) is probably the most generic one.

 A variant, where the nodes store the flag internally is more 
 efficient, though.
Unless the graph is infinite ;-) But yes, for most graphs that would likely be more efficient than a hash lookup. I'll keep note of that. Thanks!
I love the design, only some few minor tweaks needed(as already highlighted in this thread). Even without any changes, it fits my current needs and I'd like to use it already, am I right in assuming that it's Boost licensed?
Jan 21 2014
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Peter Alexander:

 http://poita.org/2014/01/09/range-based-graph-search-in-d.html
I'd like to see the usage of a not implicit graph. Regarding this line: static d(string a, string b) { return zip(a, b).count!"a[0] != a[1]"; } An alternative: enum dist = (in string s, in string s2) pure => s1.zip(s2).count!(ab => ab[0] != ab[1]); It's a use case for some tuple unpacking syntax: enum dist = (in string s1, in string s2) pure => s1.zip(s2).count!( {a, b} => a != b); Bye, bearophile
Jan 09 2014
parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Thursday, 9 January 2014 at 22:54:11 UTC, bearophile wrote:
 Peter Alexander:

 http://poita.org/2014/01/09/range-based-graph-search-in-d.html
I'd like to see the usage of a not implicit graph.
There is a crude example in the unit test in the file. You could actually create an non-implicit graph using implicitGraph: auto adjacencyLists[nVerts] = [...]; auto graph = implicitGraph!(int, i => adjList[i]);
 Regarding this line:

 static d(string a, string b) { return zip(a, b).count!"a[0] != 
 a[1]"; }

 An alternative:

 enum dist = (in string s, in string s2) pure =>
     s1.zip(s2).count!(ab => ab[0] != ab[1]);
True, I tried to make it as short as possible so it fit on one line in my blog :-)
Jan 09 2014
prev sibling next sibling parent "extrawurst" <stephan extrawurst.org> writes:
On Thursday, 9 January 2014 at 22:04:16 UTC, Peter Alexander 
wrote:
 Trying to write a bit more about D on my blog now. To start, 
 I've written about a proof-of-concept range-based API for graph 
 search.

 http://poita.org/2014/01/09/range-based-graph-search-in-d.html

 I'd greatly appreciate any feedback on the design. As we don't 
 yet have a graph library for D, it would be interesting to 
 discuss APIs in general.

 -Peter
Put it on reddit: http://www.reddit.com/r/programming/comments/1uvgvi/rangebased_graph_search_in_d/
Jan 10 2014
prev sibling next sibling parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Thursday, 9 January 2014 at 22:04:16 UTC, Peter Alexander 
wrote:
 Trying to write a bit more about D on my blog now. To start, 
 I've written about a proof-of-concept range-based API for graph 
 search.

 http://poita.org/2014/01/09/range-based-graph-search-in-d.html

 I'd greatly appreciate any feedback on the design. As we don't 
 yet have a graph library for D, it would be interesting to 
 discuss APIs in general.

 -Peter
Nice :) I presume you are aware of https://github.com/WebDrake/Dgraph
Jan 10 2014
parent reply "Joseph Rushton Wakeling" <joseph.wakeling webdrake.net> writes:
On Friday, 10 January 2014 at 11:07:13 UTC, John Colvin wrote:
 Nice :)

 I presume you are aware of https://github.com/WebDrake/Dgraph
Good inspiration for me to get back to work on that :-) Peter -- this is really exciting to see and I will be looking into your work with great interest. Let me know if there is any overlap in our projects that we can make use of. At first glance (I'm in the midst of house-moves right now so can't give your code the attention it deserves immediately:-) it is interesting to see how our algorithms and problems of interest have resulted in different design concerns. We should definitely try and set up some mutual coding challenges to try and test scalability, performance etc. :-)
Jan 10 2014
parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Friday, 10 January 2014 at 18:26:15 UTC, Joseph Rushton 
Wakeling wrote:
 On Friday, 10 January 2014 at 11:07:13 UTC, John Colvin wrote:
 Nice :)

 I presume you are aware of https://github.com/WebDrake/Dgraph
Good inspiration for me to get back to work on that :-) Peter -- this is really exciting to see and I will be looking into your work with great interest. Let me know if there is any overlap in our projects that we can make use of. At first glance (I'm in the midst of house-moves right now so can't give your code the attention it deserves immediately:-) it is interesting to see how our algorithms and problems of interest have resulted in different design concerns. We should definitely try and set up some mutual coding challenges to try and test scalability, performance etc. :-)
I'm currently only in the exploratory phase, trying to iron out a simple but flexible API. Hard to say if there is or will be any overlap at this time. However, assuming both our designs are as generic as they hope to be, they should all be interoperable (perhaps with a bit of adaptation/forwarding)!
Jan 11 2014
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/09/2014 11:04 PM, Peter Alexander wrote:
 Trying to write a bit more about D on my blog now. To start, I've
 written about a proof-of-concept range-based API for graph search.

 http://poita.org/2014/01/09/range-based-graph-search-in-d.html

 I'd greatly appreciate any feedback on the design.
I like the general direction. One issue with the API is that it does not deal with graphs with more than one edge between two given vertices very well. I think it'd need a separate edge abstraction. (Formally, a graph is just a pair of sets (V,E) with two functions from E to V giving you start and end vertex of an edge.) A graph search can have more properties than just the sequence of visited edges, so maybe it is not a range, but just provides a range. (And maybe also a range of events, usable instead of callbacks, maybe a DFS tree etc.) Allocation of memory for visited flags and the like should be controllable. Maybe the design should even allow using persistent data structures for storing visited flags, such that the search range can be used as a value type / forward range.
 As we don't yet have
 a graph library for D, it would be interesting to discuss APIs in general.

 -Peter
Probably it would be worthwhile to explore a design that is similar to Phobos ranges at some level. I.e. support kinds of graphs with different capabilities. (Eg. your implicitGraph cannot enumerate all vertices or edges that are present in the Graph. The least capable graph kind reasonably supported might not even be able to quickly report all adjacent edges to a vertex. Some graphs may allow in-place mutation etc.) Then provide some explicit representations with high capabilities and helper functions to convert certain representations to (more) explicit representations. (similar to constructing eg. an array from a range.) Constructions on graphs would then return implicit representations.
Jan 10 2014
parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Saturday, 11 January 2014 at 03:53:28 UTC, Timon Gehr wrote:
 One issue with the API is that it does not deal with graphs 
 with more than one edge between two given vertices very well. I 
 think it'd need a separate edge abstraction. (Formally, a graph 
 is just a pair of sets (V,E) with two functions from E to V 
 giving you start and end vertex of an edge.)
I was thinking of changing the adjacency list to return a range of edges instead. That may solve the problem. It also solves the problem of the edgeWeight API being inefficient for certain graph data structures.
 A graph search can have more properties than just the sequence 
 of visited edges, so maybe it is not a range, but just provides 
 a range. (And maybe also a range of events, usable instead of 
 callbacks, maybe a DFS tree etc.)
A range of events would be interesting. You could then get a range of different events by filtering: graph.depthFirstSearchEvents(start).filter!(e => e.type == Event.node).map!(e => e.node); etc.
 Allocation of memory for visited flags and the like should be 
 controllable. Maybe the design should even allow using 
 persistent data structures for storing visited flags, such that 
 the search range can be used as a value type / forward range.
Absolutely. I'm leaving things like memory allocation API until later and focussing on the general usability and flexibility.
 Probably it would be worthwhile to explore a design that is 
 similar to Phobos ranges at some level. I.e. support kinds of 
 graphs with different capabilities. (Eg. your implicitGraph 
 cannot enumerate all vertices or edges that are present in the 
 Graph. The least capable graph kind reasonably supported might 
 not even be able to quickly report all adjacent edges to a 
 vertex. Some graphs may allow in-place mutation etc.) Then
 provide some explicit representations with high capabilities 
 and helper functions to convert certain representations to 
 (more) explicit representations. (similar to constructing eg. 
 an array from a range.) Constructions on graphs would then 
 return implicit representations.
Yah, I only implemented implicit graph first due to it's simple representation. It's mainly useful for infinite graphs like game trees and works well for some other simple graph types, but certainly there will be a need for other types of graph (adjacency matrices, edge lists, adjacency lists, etc.) It's also a very minimal graph type, which is useful for testing a generic API. I want the algorithms to require only as much as necessary from the graph representation. If users have to construct large graph data type with many mandatory member functions then I would be unhappy. Thanks for all the feedback.
Jan 11 2014
prev sibling parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
 Trying to write a bit more about D on my blog now. To start, I've written
 about a proof-of-concept range-based API for graph search.

 http://poita.org/2014/01/09/range-based-graph-search-in-d.html

 I'd greatly appreciate any feedback on the design. As we don't yet have a
 graph library for D, it would be interesting to discuss APIs in general.
I really like it, very elegant! Certain lines brought a smile to my lips :-) (using anonymous functions as graph parameters, for example). Much more elegant than my own graph code :) You wanted suggestions, here are a few. They come from my use of graphs: not so much as search-based, but as a way to store and modify 'relationships' (call graphs, import graphs) or to use them as support for state machines. * I sometimes have to use graphs with more than just `weight` on edges (names as strings, chars, for example for a state machine, etc). Do you think you could extend your design to allow parameterization on edges? Of course, to use the search algorithms, the user should provide an way to get a priority from the edges. Or any user-defined type could be used, ideally, as long as it provides a way to get a `size_t`. * I'd like a way to map on a graph (keeping the edge structure, but changing the nodes contents). I know I can map on your search result, but 1) I want a graph after mapping, not a range and 2) I don't want to search, I want to affect all nodes. * I'd like a way to iterate seeing only edges, not nodes. * I'd like a way to concatenate/link different graphs (a bit like chain for ranges). I sometimes have graphs that are assembled from many different small graphs, bottom-up style. Typically, if you assemble automata (from regexes, for example), you do it bottom-up. * Of course, I'd like a way to build a graph from nothing, putting nodes and then edges in them. * I sometimes need to extract some global information from a graph: number of nodes, number of edges, etc. * Does you design allow for backward iteration? (I mean, inverting the edges). Some graph algorithms ask for this kind of magic. * Getting strongly connected components is a very interesting algorithm to have. * you should define some template constraints on your types (Graph...), but I understand they can be added later on. Re-reading my suggestions, I see they would push your code toward my own attempts :-) I guess I'm biased. I have some (years old!) code in there: https://github.com/PhilippeSigaud/dranges/blob/master/graphrange.d https://github.com/PhilippeSigaud/dranges/blob/master/graph.d https://github.com/PhilippeSigaud/dranges/blob/master/graphalgorithm.d As for my (age old) vision on graph iteration, here it is: https://github.com/PhilippeSigaud/dranges/blob/master/recursive.d But yet again, nothing as elegant as your own code.
Jan 11 2014
parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Saturday, 11 January 2014 at 09:11:45 UTC, Philippe Sigaud 
wrote:
 * I sometimes have to use graphs with more than just `weight` 
 on edges
 (names as strings, chars, for example for a state machine, 
 etc). Do
 you think you could extend your design to allow 
 parameterization on
 edges? Of course, to use the search algorithms, the user should
 provide an way to get a priority from the edges. Or any 
 user-defined
 type could be used, ideally, as long as it provides a way to 
 get a
 `size_t`.
Yes, I think this should be possible. I think I will allow arbitrary edge types so the user can attach whatever information they like, and then these can be retrieved through an edge event visitation of the graph.
 * I'd like a way to map on a graph (keeping the edge structure, 
 but
 changing the nodes contents). I know I can map on your search 
 result,
 but 1) I want a graph after mapping, not a range and 2) I don't 
 want
 to search, I want to affect all nodes.
Interesting. I think that should be possible if you provide some sort of adaptor on the graph that modifies the result of the "adjacent" function.
 * I'd like a way to iterate seeing only edges, not nodes.
Yes, I think I will change the search range to be a range of events, which will include edges taken and vertices visited among other things. You can then filter the events to only see edge events if you wish.
 * I'd like a way to concatenate/link different graphs (a bit 
 like
 chain for ranges). I sometimes have graphs that are assembled 
 from
 many different small graphs, bottom-up style. Typically, if you
 assemble automata (from regexes, for example), you do it 
 bottom-up.
Should be possible, but I think that would be up to the user to do the combination. They just need to provide the correct API.
 * Of course, I'd like a way to build a graph from nothing, 
 putting
 nodes and then edges in them.
Yep, I'll add some real graph data structures later :-) I understand implicit graphs aren't everything!
 * I sometimes need to extract some global information from a 
 graph:
 number of nodes, number of edges, etc.
Any graph that can provide that would be free to do so.
 * Does you design allow for backward iteration? (I mean, 
 inverting the
 edges). Some graph algorithms ask for this kind of magic.
True. Implicit graph would struggle to provide that, but other graphs would be able to provide "incidentEdges" as part of the API (easy for adjacency matrix for example).
 * Getting strongly connected components is a very interesting 
 algorithm to have.
Absolutely, I'll likely move onto connected components and graph flow next.
 * you should define some template constraints on your types
 (Graph...), but I understand they can be added later on.
Yeah, it's not very robust at the moment, just experimental :-) I find adding constraints early just gets in the way of experimentation.
 Re-reading my suggestions, I see they would push your code 
 toward my
 own attempts :-) I guess I'm biased.

 I have some (years old!) code in there:
 https://github.com/PhilippeSigaud/dranges/blob/master/graphrange.d
 https://github.com/PhilippeSigaud/dranges/blob/master/graph.d
 https://github.com/PhilippeSigaud/dranges/blob/master/graphalgorithm.d

 As for my (age old) vision on graph iteration, here it is:

 https://github.com/PhilippeSigaud/dranges/blob/master/recursive.d

 But yet again, nothing as elegant as your own code.
Thank you for the feedback. I will have a look at your code for inspiration!
Jan 11 2014