www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Kuhn-Munkres Algorithm (a.k.a. The Hungarian Algorithm)

reply Sameer Pradhan <pradhan cemantix.org> writes:
Dear D-ers,

I wanted to port a currently Perl code used for scoring an NLP 
task to D.  A few folks have reimplemented it in Python but I 
would like to create a D implementation.

The catch is that each of the above two routines relies on a 
library or module that implements the Kuhn-Munkres assignment 
algorithm (https://en.wikipedia.org/wiki/Hungarian_algorithm).  
Both the Perl and Python implementations are based on the the 
following (likely more general version the algorithm—I believe 
one that handles rectangular matrices)

http://csclab.murraystate.edu/~bob.pilgrim/445/munkres.html

I did a quick search to find out whether there already exists a D 
library that implements this routine.   I did not find any 
library satisfying this requirement.  So, it is likely that I 
would have to write it myself. My assumption is that it won't be 
very complicated, but for the most efficient implementation I 
would probably need to use the one of the D libraries that do 
NumPy style, efficient computation over matrices.  OR, I could 
find a implementation in C and create a binding for it in D.  I 
am not sure what is the easiest and least effort path that would 
ensure an accurate and optimal (in terms of speed) solution.

I have a feeling that someone one on this forum might have more 
information that could shed some more light on the best path to 
take.


--
Sameer
Feb 19 2021
next sibling parent Max Haughton <maxhaton gmail.com> writes:
On Friday, 19 February 2021 at 20:53:33 UTC, Sameer Pradhan wrote:
 Dear D-ers,

 I wanted to port a currently Perl code used for scoring an NLP 
 task to D.  A few folks have reimplemented it in Python but I 
 would like to create a D implementation.

 [...]
I'm not familiar with this algorithm, but the Mir libraries have fast matrix routines so worth taking a look at those.
Feb 19 2021
prev sibling next sibling parent reply anonymous <foo bar.de> writes:
On Friday, 19 February 2021 at 20:53:33 UTC, Sameer Pradhan wrote:

 I have a feeling that someone one on this forum might have more 
 information that could shed some more light on the best path to 
 take.
I'm solving the linear assignment problem (LAP), but not in D. Therefore I can't share code... Some general notes: * For larger problems R. Jonker and A. Volgenant, "A Shortest Augmenting Path Algorithm for Dense and Sparse Linear Assignment Problems," Computing, vol. 38, pp. 325-340, 1987. is often faster than the hungarian method. Scipy switched their implementation lately. * you can use single commodity min-cost-flow algorithms for solving the LAP. In my experience lemon::NetworkSimplex [1] (written in C++ -> some glue code for a template free API needed) beats a reasonably good implementation of the hungarian method by a factor of 5-10. * min-cost flow can handle non-square matrices. * I have no implementation of the Jonker/Volgenant to compare to. [1] http://lemon.cs.elte.hu/pub/doc/1.3.1/a00276.html
Feb 19 2021
parent reply Sameer Pradhan <pradhan cemantix.org> writes:
On Saturday, 20 February 2021 at 07:50:40 UTC, anonymous wrote:
 * For larger problems
 R. Jonker and A. Volgenant, "A Shortest Augmenting Path 
 Algorithm for Dense and Sparse Linear Assignment Problems," 
 Computing, vol. 38, pp. 325-340, 1987.
   is often faster than the hungarian method. Scipy switched 
 their implementation lately.
 * you can use single commodity min-cost-flow algorithms for 
 solving the LAP. In my experience lemon::NetworkSimplex [1] 
 (written in C++ -> some glue code for a template free API 
 needed) beats a reasonably good implementation of the hungarian 
 method by a factor of 5-10.
    * min-cost flow can handle non-square matrices.
    * I have no implementation of the Jonker/Volgenant to 
 compare to.
 [1] http://lemon.cs.elte.hu/pub/doc/1.3.1/a00276.html
Thanks for these details! It is interesting that an algorithm from 1987 is valid and potentially 5-10x faster than the hungarian method. Sorry for my ignorance in this area, but do both algorithms provide optimal solutions, or would there be some compromise in quality of the solutions for the Jonker vs Hungarian method?
Feb 25 2021
parent anonymous <foo bar.com> writes:
On Thursday, 25 February 2021 at 20:40:37 UTC, Sameer Pradhan 
wrote:
 Thanks for these details! It is interesting that an algorithm 
 from 1987 is valid and potentially 5-10x faster than the 
 hungarian method. Sorry for my ignorance in this area, but do 
 both algorithms provide optimal solutions, or would there be 
 some compromise in quality of the solutions for the Jonker vs 
 Hungarian method?
well, there are at least 3 approaches to the linear assignment problem 1) the hungarian method (from the 1950s) 2) Jonker / Volgenant algorithm 3) reduction to min-cost flow. There are several algorithms for min-cost flow. One of them is the network simplex method. all are exact, i.e. return solutions with the same costs/objective value. Note that there might be more than one optimal solution. In theory one could also use a general purpose LP solver (any basic solution will be integer), but even the (expensive) commercial packages are slower than a reasonable implementation of the 3 approaches above. As mentioned before, I have no practical experience with 2) as the license terms of the original Pascal implementation in the paper are somewhat unclear (and therefore of all literal translations to other programming languages). The 5-10x speedup is between a reasonable (school-bookish) implementation of the hungarian method I can not share and lemon::NetworkSimplex (-> option 3) - but lemon is a template heavy C++ library, so interfacing it to D will require some C++ programming.
Feb 25 2021
prev sibling parent reply drug <drug2004 bk.ru> writes:
On 2/19/21 11:53 PM, Sameer Pradhan wrote:
 Dear D-ers,
 
 I wanted to port a currently Perl code used for scoring an NLP task to 
 D.  A few folks have reimplemented it in Python but I would like to 
 create a D implementation.
 
 The catch is that each of the above two routines relies on a library or 
 module that implements the Kuhn-Munkres assignment algorithm 
 (https://en.wikipedia.org/wiki/Hungarian_algorithm). Both the Perl and 
 Python implementations are based on the the following (likely more 
 general version the algorithm—I believe one that handles rectangular 
 matrices)
 
 http://csclab.murraystate.edu/~bob.pilgrim/445/munkres.html
 
 I did a quick search to find out whether there already exists a D 
 library that implements this routine.   I did not find any library 
 satisfying this requirement.  So, it is likely that I would have to 
 write it myself. My assumption is that it won't be very complicated, but 
 for the most efficient implementation I would probably need to use the 
 one of the D libraries that do NumPy style, efficient computation over 
 matrices.  OR, I could find a implementation in C and create a binding 
 for it in D.  I am not sure what is the easiest and least effort path 
 that would ensure an accurate and optimal (in terms of speed) solution.
 
 I have a feeling that someone one on this forum might have more 
 information that could shed some more light on the best path to take.
 
 
 -- 
 Sameer
 
As I understand all you need for efficient Hungarian algorithm implementation is a contiguous 2D array, not jagged one like built-in 2D array in D. All other operations (like conditional modification of a matrix element) are too trivial/specific to be highly optimized in 3rd party libraries (in contrast to finding the inverse of a matrix, for example). Also an operation like row modification should be well optimized by a compiler (ldc/gdc first of all). I would just take mir.ndslice to get a contiguous dynamic 2D array and implement the algorithm like in example you mentioned above. Or if your matrix has fixed size you can use gfm.math.Matrix for contiguous static arrays. gfm is less advanced than Mir because intended for 3D games first of all, but it can be much easier to use. Also gfm has only dependencies.
Feb 20 2021
parent reply Sameer Pradhan <pradhan cemantix.org> writes:
On Saturday, 20 February 2021 at 10:35:27 UTC, drug wrote:
 On 2/19/21 11:53 PM, Sameer Pradhan wrote:
 Dear D-ers,
 
 I wanted to port a currently Perl code used for scoring an NLP 
 task to D.  A few folks have reimplemented it in Python but I 
 would like to create a D implementation.
 
 The catch is that each of the above two routines relies on a 
 library or module that implements the Kuhn-Munkres assignment 
 algorithm (https://en.wikipedia.org/wiki/Hungarian_algorithm). 
 Both the Perl and Python implementations are based on the the 
 following (likely more general version the algorithm—I believe 
 one that handles rectangular matrices)
 
 http://csclab.murraystate.edu/~bob.pilgrim/445/munkres.html
 
 I did a quick search to find out whether there already exists 
 a D library that implements this routine.   I did not find any 
 library satisfying this requirement.  So, it is likely that I 
 would have to write it myself. My assumption is that it won't 
 be very complicated, but for the most efficient implementation 
 I would probably need to use the one of the D libraries that 
 do NumPy style, efficient computation over matrices.  OR, I 
 could find a implementation in C and create a binding for it 
 in D.  I am not sure what is the easiest and least effort path 
 that would ensure an accurate and optimal (in terms of speed) 
 solution.
 
Thanks for your responses. I had one other related question: I also found a C-implementation of the algorithm. I have never written a Dlang binding for a C library, but since D has binary compatibility with a C library/object file, would it be easier to write a wrapper for this C library in D? Or there can be some gotchas that I need to keep in mind if I take this path? -- Sameer
Feb 25 2021
next sibling parent bachmeier <no spam.net> writes:
On Thursday, 25 February 2021 at 20:35:36 UTC, Sameer Pradhan 
wrote:

 Thanks for your responses. I had one other related question: I 
 also found a C-implementation of the algorithm. I have never 
 written a Dlang binding for a C library, but since D has binary 
 compatibility with a C library/object file, would it be easier 
 to write a wrapper for this C library in D? Or there can be 
 some gotchas that I need to keep in mind if I take this path?
If you have a working implementation, save yourself the time and call the C code from D. You should not need to write bindings, you should use dstep to write a binding for you https://github.com/jacob-carlborg/dstep or dpp to include the header file directly in your D code https://github.com/atilaneves/dpp Once you get that working, you can create a wrapper with better functionality if it's worth your time to do so. I typically just call the C functions as they're written.
Feb 25 2021
prev sibling parent drug <drug2004 bk.ru> writes:
On 2/25/21 11:35 PM, Sameer Pradhan wrote:
 
 
 Thanks for your responses. I had one other related question: I also 
 found a C-implementation of the algorithm. I have never written a Dlang 
 binding for a C library, but since D has binary compatibility with a C 
 library/object file, would it be easier to write a wrapper for this C 
 library in D? Or there can be some gotchas that I need to keep in mind 
 if I take this path?
 
 -- 
 Sameer
 
Bindings would be the easier way. It is really trivial, imho, no gotchas at all. Also you can ask a question here.
Feb 26 2021