www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - D graph library -- update

reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
Hi all,

Following earlier discussion about a D graph library
<http://forum.dlang.org/thread/5187D514.5070109 webdrake.net>, this evening I
sat down and had a go at coming up with some basic code to support such a
venture.

You can find it here: https://github.com/WebDrake/Dgraph

This takes the basic data structure from the widely-used igraph library
<http://igraph.sourceforge.net> but builds around it using idiomatic D
structures and algorithms.

The code currently consists of the basic data structure, implemented as a final
class with methods to extract the number of vertices, the list of edges, and the
degree and neighbours of a vertex.

There is also a simple test file that can be used for benchmarking against
comparable igraph code.  With the current method of adding edges one by one,
this code already benchmarks as faster than its igraph equivalent, running in
2.4s on my machine when compiled with gdmd -O -release -inline and 1.4s when
compiled with ldmd2 and the same flags -- compared to 3s for the igraph code
written in C.

However, when igraph's option to add the edges all in one go in a vector is
enabled, igraph is significantly faster.  This surely reflects a mix of memory
management (how many allocs/appends?) and also the amount of sorting and other
updates that occur when edges are added.  So, I think that igraph can probably
still be beaten here.

The memory usage for the example D code is slightly higher than for its
comparable igraph C code, clocking in at about 2MB as opposed to 1.7.

I've chosen igraph as a point of comparison because it's known for being both
the fastest and most scalable graph library out there.  Many of igraph's design
choices seem focused on minimizing memory usage, sometimes at the expense of
all-out speed: for example, if an algorithm needs an adjacency list
representation of the graph, igraph will generate one at that moment, and
destroy it afterwards.

However, on the basis of the simple work done here, I'm fairly confident that D
can do better.  The code here is _much_ simpler than the equivalent functions in
igraph, and has not yet been optimized in any way either for speed or for memory
management.  Yet it seems to be performing on par with or better than igraph
within the limits of its current design constraints.

I'll be trying to implement a few additional little pieces of functionality in
the next days, perhaps some graph metrics which can give another point of
performance comparison.

Anyway, comments welcome, both positive and negative.

Thanks & best wishes,

    -- Joe
Jul 10 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Joseph Rushton Wakeling:

 The memory usage for the example D code is slightly higher than 
 for its
 comparable igraph C code, clocking in at about 2MB as opposed 
 to 1.7.

If you want a meaningful memory comparison then perhaps you need a 10 or 100 (or more) times larger graph. Bye, bearophile
Jul 10 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/11/13 3:25 AM, Joseph Rushton Wakeling wrote:
 By default, schwartzSort uses SwapStrategy.unstable, which means quicksort is
 used as the sorting mechanism.  If we replace this with SwapStrategy.stable or
 SwapStrategy.semistable, TimSort is used instead, and this dramatically cuts
the
 running time -- from almost 2 minutes to under 3 seconds (compared to 13
seconds
 for igraph with one-by-one edge addition), and under 2 if ldmd2 is used for
 compiling.

Unstable sort should be faster than stable sort. If I remember correctly (1) our TimSort is indeed slower than quicksort on random numbers, (2) there is a performance bug that makes our quicksort perform quadratically on data that's essentially sorted but has one unsorted element at the end. Is that the case? Andrei
Jul 11 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11-Jul-2013 20:02, Joseph Rushton Wakeling пишет:
 On 07/11/2013 05:17 PM, Andrei Alexandrescu wrote:
 Unstable sort should be faster than stable sort. If I remember correctly (1)
our
 TimSort is indeed slower than quicksort on random numbers, (2) there is a
 performance bug that makes our quicksort perform quadratically on data that's
 essentially sorted but has one unsorted element at the end. Is that the case?

That's _exactly_ the case here. :-\

In such a case if inserting exactly one element into a sorted range you may want to special case it to lowerbound + insert. E.g. something like this: auto idx = sortedRange.lowerbound(val).length; //insert should be called on array backing that sortedRange insert(sortedRange, idx, val);
 Thanks for the clarification!

-- Dmitry Olshansky
Jul 11 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
12-Jul-2013 02:23, Joseph Rushton Wakeling пишет:
 On 07/11/2013 10:43 PM, Dmitry Olshansky wrote:
 In such a case if inserting exactly one element into a sorted range you may
want
 to special case it to lowerbound + insert.
 E.g. something like this:

 auto idx = sortedRange.lowerbound(val).length;
 //insert should be called on array backing that sortedRange
 insert(sortedRange, idx, val);

... which would basically correspond to an insertion sort, right?

Yup, but done in one step. I'd put it the other way around: simply put this is inserting an item into a sorted range => binary search + insert. An insertion sort is built around this operation performed repeatedly.
 I'm quite tired on reading this so will have to think it over fresh tomorrow.
 What I'm not certain of is how to tweak it given that actually here we have two
 arrays:

      size_t[] values;  // doesn't change, but we've added a new value at the
end
      size_t[] sortedIndices; // indices 0 .. values.length, sorted according
                              // to values[i]

Append new value to values. Then use 'values.length-1' (new length - 1 i.e. the old length) as an item to insert into sortedIndices. The last step is to figure out what range to call lowerBound on - I'd say something like: assumeSorted(sortedIndices.map!(x => values[x])) then use that to find a suitable place to insert in sortedIndices.
 ... so, it's sortedIndices that we need to sort, but according to the
 corresponding entries in the array of values.

 Anyway, thanks for the thought -- lowerbound() is a function I've never used
 before so it wouldn't have occurred to me.

-- Dmitry Olshansky
Jul 11 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/11/2013 02:18 AM, bearophile wrote:
 If you want a meaningful memory comparison then perhaps you need a 10 or 100
(or
 more) times larger graph.

I know, and it's coming. :-) The main memory-related issues will probably not show up in a situation like this where all we're doing is storing the graph data, but in the case where algorithms are being performed on the data.
Jul 10 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/11/2013 02:24 AM, Joseph Rushton Wakeling wrote:
 I know, and it's coming. :-)  The main memory-related issues will probably not
 show up in a situation like this where all we're doing is storing the graph
 data, but in the case where algorithms are being performed on the data.

For comparison, I performed the same tests but with a 10_000 node graph. Here we see similar memory use, but igraph outperforms dgraph by a factor of nearly 10 even with the insert of nodes one at a time. Profiling shows that the time difference is accounted for by the sorting algorithm used in the indexEdges() method: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls s/call s/call name 93.12 110.01 110.01 20000 0.01 0.01 _D6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda4VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ12schwartzSortMFAmZS6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv222__T11SortedRangeTAmS1986dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda4VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ11__lambda165Z11SortedRange561__T13quickSortImplS5066dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda4VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ12schwartzSortMFAmZS6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv222__T11SortedRangeTAmS1986dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda4VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ11__lambda165 Z11SortedRange11__lambda168TS3std5range14__T3ZipTAmTAmZ3ZipZ13quickSortImplMFS3std5range14__T3ZipTAmTAmZ3ZipZv 3.88 114.59 4.58 20000 0.00 0.00 _D6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda2VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ12schwartzSortMFAmZS6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv222__T11SortedRangeTAmS1986dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda2VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ11__lambda165Z11SortedRange561__T13quickSortImplS5066dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda2VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ12schwartzSortMFAmZS6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv222__T11SortedRangeTAmS1986dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda2VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ11__lambda165 Z11SortedRange11__lambda168TS3std5range14__T3ZipTAmTAmZ3ZipZ13quickSortImplMFS3std5range14__T3ZipTAmTAmZ3ZipZv 1.81 116.73 2.14 1124043790 0.00 0.00 _D3std5range14__T3ZipTAmTAmZ3Zip7opIndexMFNaNbNfmZS3std8typecons14__T5TupleTmTmZ5Tuple 0.59 117.42 0.70 203164131 0.00 0.00 _D3std9algorithm43__T6swapAtTS3std5range14__T3ZipTAmTAmZ3ZipZ6swapAtFNaNbNfS3std5range14__T3ZipTAmTAmZ3ZipmmZv 0.42 117.92 0.50 20000 0.00 0.01 _D6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv By default, schwartzSort uses SwapStrategy.unstable, which means quicksort is used as the sorting mechanism. If we replace this with SwapStrategy.stable or SwapStrategy.semistable, TimSort is used instead, and this dramatically cuts the running time -- from almost 2 minutes to under 3 seconds (compared to 13 seconds for igraph with one-by-one edge addition), and under 2 if ldmd2 is used for compiling. This comes at a cost of increasing memory usage from about 3.7 MB (almost identical to igraph) to 5.6. Probably the best way to secure optimal performance without the memory hit is to use an insertion sort (or maybe a smoothsort?). I guess that Timsort would be best to use in the case of adding multiple edges in one go, unless no edges at all have been added before, in which case quicksort would probably be optimal; though quicksort would probably remain best if memory management is the priority. So, the new D code is still competitive with igraph, but needs some smoothing around the edges (quite literally!:-).
Jul 11 2013
prev sibling next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Thursday, 11 July 2013 at 10:25:40 UTC, Joseph Rushton 
Wakeling wrote:
 On 07/11/2013 02:24 AM, Joseph Rushton Wakeling wrote:
 I know, and it's coming. :-)  The main memory-related issues 
 will probably not
 show up in a situation like this where all we're doing is 
 storing the graph
 data, but in the case where algorithms are being performed on 
 the data.

For comparison, I performed the same tests but with a 10_000 node graph. Here we see similar memory use, but igraph outperforms dgraph by a factor of nearly 10 even with the insert of nodes one at a time. Profiling shows that the time difference is accounted for by the sorting algorithm used in the indexEdges() method: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls s/call s/call name 93.12 110.01 110.01 20000 0.01 0.01 _D6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda4VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ12schwartzSortMFAmZS6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv222__T11SortedRangeTAmS1986dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda4VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ11__lambda165Z11SortedRange561__T13quickSortImplS5066dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda4VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ12schwartzSortMFAmZS6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv222__T11SortedRangeTAmS1986dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda4VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ11__lambda165 Z11SortedRange11__lambda168TS3std5range14__T3ZipTAmTAmZ3ZipZ13quickSortImplMFS3std5range14__T3ZipTAmTAmZ3ZipZv 3.88 114.59 4.58 20000 0.00 0.00 _D6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda2VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ12schwartzSortMFAmZS6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv222__T11SortedRangeTAmS1986dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda2VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ11__lambda165Z11SortedRange561__T13quickSortImplS5066dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda2VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ12schwartzSortMFAmZS6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv222__T11SortedRangeTAmS1986dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv132__T12schwartzSortS606dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv9__lambda2VAyaa5_61203c2062VE3std9algorithm12SwapStrategy0TAmZ11__lambda165 Z11SortedRange11__lambda168TS3std5range14__T3ZipTAmTAmZ3ZipZ13quickSortImplMFS3std5range14__T3ZipTAmTAmZ3ZipZv 1.81 116.73 2.14 1124043790 0.00 0.00 _D3std5range14__T3ZipTAmTAmZ3Zip7opIndexMFNaNbNfmZS3std8typecons14__T5TupleTmTmZ5Tuple 0.59 117.42 0.70 203164131 0.00 0.00 _D3std9algorithm43__T6swapAtTS3std5range14__T3ZipTAmTAmZ3ZipZ6swapAtFNaNbNfS3std5range14__T3ZipTAmTAmZ3ZipmmZv 0.42 117.92 0.50 20000 0.00 0.01 _D6dgraph5graph13__T5GraphVb0Z5Graph10indexEdgesMFZv By default, schwartzSort uses SwapStrategy.unstable, which means quicksort is used as the sorting mechanism. If we replace this with SwapStrategy.stable or SwapStrategy.semistable, TimSort is used instead, and this dramatically cuts the running time -- from almost 2 minutes to under 3 seconds (compared to 13 seconds for igraph with one-by-one edge addition), and under 2 if ldmd2 is used for compiling. This comes at a cost of increasing memory usage from about 3.7 MB (almost identical to igraph) to 5.6. Probably the best way to secure optimal performance without the memory hit is to use an insertion sort (or maybe a smoothsort?). I guess that Timsort would be best to use in the case of adding multiple edges in one go, unless no edges at all have been added before, in which case quicksort would probably be optimal; though quicksort would probably remain best if memory management is the priority. So, the new D code is still competitive with igraph, but needs some smoothing around the edges (quite literally!:-).

This is very promising. Great work!
Jul 11 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/11/2013 05:17 PM, Andrei Alexandrescu wrote:
 Unstable sort should be faster than stable sort. If I remember correctly (1)
our
 TimSort is indeed slower than quicksort on random numbers, (2) there is a
 performance bug that makes our quicksort perform quadratically on data that's
 essentially sorted but has one unsorted element at the end. Is that the case?

That's _exactly_ the case here. :-\ Thanks for the clarification!
Jul 11 2013
prev sibling next sibling parent "Joseph Rushton Wakeling" <joseph.wakeling webdrake.net> writes:
On Thursday, 11 July 2013 at 15:17:03 UTC, Andrei Alexandrescu 
wrote:
 (2) there is a performance bug that makes our quicksort perform 
 quadratically on data that's essentially sorted but has one 
 unsorted element at the end.

Do you have a link? I couldn't find it on the bugzilla, though I do remember a discussion of this from a while back.
Jul 11 2013
prev sibling next sibling parent "w0rp" <devw0rp gmail.com> writes:
I don't like it. Here is why.

1. You can only use a size_t type as the vertex type. What about 
strings?
2. The distinction between directed and undirected graphs is made 
with an boolean type parameter? Why not an enum? Better yet, why 
not just use interfaces? (You are already using classes and 
garbage collected memory.)
3. I have to control the graph's capacity manually, and I can't 
use arbitrary numbers for edges.
4. Adding an edge doesn't add vertices.

My most common use of graphs is to start from nothing, build from 
adding edges arbitrarily, usually from IDs which may be integers, 
and may be strings, and then use graph algorithms to gain some 
understanding of the relation between objects mapped to these 
IDs. With this library, this will be difficult or impossible.
Jul 11 2013
prev sibling next sibling parent "Joseph Rushton Wakeling" <joseph.wakeling webdrake.net> writes:
On Thursday, 11 July 2013 at 18:14:01 UTC, w0rp wrote:
 I don't like it. Here is why.

 1. You can only use a size_t type as the vertex type. What 
 about strings?

This is a core graph type designed explicitly for speed and efficiency. If you want string ids for vertices, it's better to wrap this with maps from string to size_t and back again. Having strings or other arbitrary data types involved in the core data type will just be a nightmare space- and speed-wise.
 2. The distinction between directed and undirected graphs is 
 made with an boolean type parameter? Why not an enum?

It is an enum.
 why not just use interfaces? (You are already using classes and 
 garbage collected memory.)

It's in consideration. I haven't yet identified a clear need.
 3. I have to control the graph's capacity manually, and I can't 
 use arbitrary numbers for edges.
 4. Adding an edge doesn't add vertices.

Again, those things are probably best achieved by wrappers that mediate the adding of edges.
 My most common use of graphs is to start from nothing, build 
 from adding edges arbitrarily, usually from IDs which may be 
 integers, and may be strings, and then use graph algorithms to 
 gain some understanding of the relation between objects mapped 
 to these IDs. With this library, this will be difficult or 
 impossible.

I understand those requirements, I just think that the core data structure on which calculations are done is not the ideal place to implement them. (Sorry for terse replies, I'm writing on my phone...)
Jul 11 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jul 11, 2013 at 08:34:17PM +0200, Joseph Rushton Wakeling wrote:
 On Thursday, 11 July 2013 at 18:14:01 UTC, w0rp wrote:
I don't like it. Here is why.

1. You can only use a size_t type as the vertex type. What about
strings?

This is a core graph type designed explicitly for speed and efficiency. If you want string ids for vertices, it's better to wrap this with maps from string to size_t and back again. Having strings or other arbitrary data types involved in the core data type will just be a nightmare space- and speed-wise.

My most common use of graphs is to start from nothing, build from
adding edges arbitrarily, usually from IDs which may be integers,
and may be strings, and then use graph algorithms to gain some
understanding of the relation between objects mapped to these IDs.
With this library, this will be difficult or impossible.

I understand those requirements, I just think that the core data structure on which calculations are done is not the ideal place to implement them. (Sorry for terse replies, I'm writing on my phone...)

That makes sense. The core graph type is what's used internally by the graphing algorithms, right? So it should be optimized for maximum performance in those algorithms. However, I think we should also provide friendlier representations for user-facing types. One possibility is a wrapper type that accepts string IDs for nodes/edges, and internally maintains a mapping between them and numerical IDs, so that the user doesn't have to keep doing this manually. This way the algorithms can run at top speed, and the user can still get a nice API for string-based IDs. T -- Freedom of speech: the whole world has no right *not* to hear my spouting off!
Jul 11 2013
prev sibling next sibling parent "Joseph Rushton Wakeling" <joseph.wakeling webdrake.net> writes:
On Thursday, 11 July 2013 at 18:42:10 UTC, H. S. Teoh wrote:
 However, I think we should also provide friendlier 
 representations for
 user-facing types. One possibility is a wrapper type that 
 accepts string
 IDs for nodes/edges, and internally maintains a mapping between 
 them and
 numerical IDs, so that the user doesn't have to keep doing this
 manually. This way the algorithms can run at top speed, and the 
 user can
 still get a nice API for string-based IDs.

Yes, I agree. The goal right now is to get the internal stuff really well done, with a friendly face to follow.
Jul 11 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/11/2013 10:43 PM, Dmitry Olshansky wrote:
 In such a case if inserting exactly one element into a sorted range you may
want
 to special case it to lowerbound + insert.
 E.g. something like this:
 
 auto idx = sortedRange.lowerbound(val).length;
 //insert should be called on array backing that sortedRange
 insert(sortedRange, idx, val);

... which would basically correspond to an insertion sort, right? I'm quite tired on reading this so will have to think it over fresh tomorrow. What I'm not certain of is how to tweak it given that actually here we have two arrays: size_t[] values; // doesn't change, but we've added a new value at the end size_t[] sortedIndices; // indices 0 .. values.length, sorted according // to values[i] ... so, it's sortedIndices that we need to sort, but according to the corresponding entries in the array of values. Anyway, thanks for the thought -- lowerbound() is a function I've never used before so it wouldn't have occurred to me.
Jul 11 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/11/2013 08:14 PM, w0rp wrote:
 3. I have to control the graph's capacity manually, and I can't use arbitrary
 numbers for edges.

I'm not sure I understand why this is a problem with the class presented here. Can you clarify what your concern is?
Jul 11 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/11/2013 08:14 PM, w0rp wrote:
 1. You can only use a size_t type as the vertex type. What about strings?

I want to reply at greater length on this point, because I don't want you to feel I'm dismissing your concerns. I remember that back in our earlier discussions you posted a basic data type of something like, WeightType[VertexID][VertexID]; where VertexID might be an integer, a string, or potentially something else. Let's simplify and suppose we have bool[VertexID][VertexID], so that it's just true of there's a link, false otherwise. In principle you could simplify still further and just have a set of VertexID pairs, except that Phobos still doesn't have a decent Set implementation (... hint, hint ... :-) This will be quick (and fairly memory efficient, I'd have thought) to create, and the distinction between Set and MultiSet would allow an easy constraint on whether you have a graph or a multigraph. You also have a ready and efficient way to check if a given link (head, tail) is in the graph, although I think I can probably do a good job on that with the current implementation (more on that another time:-). What I'm not sure that you have so readily is an efficient way to extract e.g. degree(v), or neighbours(v), and I'm concerned that the extra data you'll have to carry to change this may actually lead to a memory blow-up. I'd be very happy to be proved wrong about this, so if you have some ideas, or better still code for comparison and benchmarking, I'll happily reconsider the design. But for now my contention is that generic vertex IDs are only relevant when it comes to input or output, and that's where they should be implemented, not in the core processing code.
 3. I have to control the graph's capacity manually, and I can't use arbitrary
 numbers for edges.

I don't see the capacity issue as being different from e.g. the fact that an array's capacity must be controlled manually unless you explicitly attempt to append to the end of it. As for arbitrary numbers for edges: I think in practice this should be OK, so long as you ensure that vertexCount is greater than the maximum vertex ID number, and there are not _too_ many gaps between the vertex IDs that are used. (e.g. if your vertex IDs are 1 to 50 rather than 0 to 49, you should have no problem.) Then, most results can be adapted simply by filtering out "nodes" with degree == 0. If on the other hand your node IDs are 10, 20, 30, ... you might not get such a good performance ... :-)
 4. Adding an edge doesn't add vertices.

Here I don't think I have anything to add apart from that I don't like the idea of a graph that might surreptitiously add an arbitrary number of vertices simply on the basis of an edge being added. I'd rather force the user to check that the link being added is valid given the vertices available, and provide a wrapper to offer those kinds of checks and the opportunity to expand the number of vertices if needed.
Jul 12 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/12/2013 12:37 AM, Dmitry Olshansky wrote:
 Append new value to values.
 
 Then use 'values.length-1' (new length - 1 i.e. the old length) as an item to
 insert into sortedIndices.
 
 The last step is to figure out what range to call lowerBound on - I'd say
 something like:
 
 assumeSorted(sortedIndices.map!(x => values[x]))
 
 then use that to find a suitable place to insert in sortedIndices.

Thanks very much for that, I'll try it out and report back on performance. :-) I think I may move the discussion over to D.learn as I have some other new profiling results to follow up on -- it'll be an interesting lesson in how to manage memory for performance.
Jul 12 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/12/2013 12:37 AM, Dmitry Olshansky wrote:
 Append new value to values.
 
 Then use 'values.length-1' (new length - 1 i.e. the old length) as an item to
 insert into sortedIndices.
 
 The last step is to figure out what range to call lowerBound on - I'd say
 something like:
 
 assumeSorted(sortedIndices.map!(x => values[x]))
 
 then use that to find a suitable place to insert in sortedIndices.

So, just to report back -- this is the code I came up with based on your suggestion: void indexEdges() { immutable size_t l = _indexHead.length; foreach(e; iota(l, _head.length)) { size_t i = _indexHead.map!(a => _head[a]).assumeSorted.lowerBound(_head[e]).length; insertInPlace(_indexHead, i, e); i = _indexTail.map!(a => _tail[a]).assumeSorted.lowerBound(_tail[e]).length; insertInPlace(_indexTail, i, e); } } This is _much_ faster than the previous code (although still not as fast as igraph when adding multiple edges in one go), and also more memory efficient. I'm a little bothered that the insertion implies potentially many re-allocs, and I wonder if it might be even better if the length of _indexHead and _indexTail can be increased in one go, and the "insertion" of the new edge index happen without risking a reallocation. Anyway, thanks very much for the useful idea! igraph is still ultimately faster in the case where multiple edges are added in a single go, but that's an issue we can confront later.
Jul 12 2013
prev sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/12/2013 03:12 PM, Joseph Rushton Wakeling wrote:
 I'm a little bothered that the insertion implies potentially many re-allocs,
and
 I wonder if it might be even better if the length of _indexHead and _indexTail
 can be increased in one go, and the "insertion" of the new edge index happen
 without risking a reallocation.

An attempt at an alternative: immutable size_t l = _indexHead.length; _indexHead.length = _head.length; _indexTail.length = _tail.length; foreach(e; iota(l, _head.length)) { size_t i, j; i = _indexHead[0 .. e].map!(a => _head[a]).assumeSorted.lowerBound(_head[e]).length; for(j = e; j > i; --j) _indexHead[j] = _indexHead[j - 1]; _indexHead[i] = e; i = _indexTail[0 .. e].map!(a => _tail[a]).assumeSorted.lowerBound(_tail[e]).length; for(j = e; j > i; --j) _indexTail[j] = _indexTail[j - 1]; _indexTail[i] = e; } It doesn't seem to make any difference in speed or memory consumption to the current code (OK, perhaps a tiny speed increase, but very tiny). It might have some relevance in the case where multiple edges are added in one go. I'll keep it in mind for that.
Jul 12 2013