digitalmars.D - Kuhn-Munkres Algorithm (a.k.a. The Hungarian Algorithm)

Sameer Pradhan (26/26) Feb 19 2021 Dear D-ers,

Max Haughton (3/8) Feb 19 2021 I'm not familiar with this algorithm, but the Mir libraries have
anonymous (19/22) Feb 19 2021 I'm solving the linear assignment problem (LAP), but not in D.

Sameer Pradhan (7/22) Feb 25 2021 Thanks for these details! It is interesting that an algorithm

anonymous (24/30) Feb 25 2021 well, there are at least 3 approaches to the linear assignment

drug (13/45) Feb 20 2021 As I understand all you need for efficient Hungarian algorithm

Sameer Pradhan (9/37) Feb 25 2021 Thanks for your responses. I had one other related question: I

bachmeier (11/17) Feb 25 2021 If you have a working implementation, save yourself the time and
drug (3/15) Feb 26 2021 Bindings would be the easier way. It is really trivial, imho, no gotchas...

Sameer Pradhan <pradhan cemantix.org> writes:

Dear D-ers,

I wanted to port a currently Perl code used for scoring an NLP 
task to D.  A few folks have reimplemented it in Python but I 
would like to create a D implementation.

The catch is that each of the above two routines relies on a 
library or module that implements the Kuhn-Munkres assignment 
algorithm (https://en.wikipedia.org/wiki/Hungarian_algorithm).  
Both the Perl and Python implementations are based on the the 
following (likely more general version the algorithm—I believe 
one that handles rectangular matrices)

http://csclab.murraystate.edu/~bob.pilgrim/445/munkres.html

I did a quick search to find out whether there already exists a D 
library that implements this routine.   I did not find any 
library satisfying this requirement.  So, it is likely that I 
would have to write it myself. My assumption is that it won't be 
very complicated, but for the most efficient implementation I 
would probably need to use the one of the D libraries that do 
NumPy style, efficient computation over matrices.  OR, I could 
find a implementation in C and create a binding for it in D.  I 
am not sure what is the easiest and least effort path that would 
ensure an accurate and optimal (in terms of speed) solution.

I have a feeling that someone one on this forum might have more 
information that could shed some more light on the best path to 
take.


--
Sameer

Feb 19 2021

Max Haughton <maxhaton gmail.com> writes:

On Friday, 19 February 2021 at 20:53:33 UTC, Sameer Pradhan wrote:
 Dear D-ers,

 I wanted to port a currently Perl code used for scoring an NLP 
 task to D.  A few folks have reimplemented it in Python but I 
 would like to create a D implementation.

 [...]

I'm not familiar with this algorithm, but the Mir libraries have 
fast matrix routines so worth taking a look at those.

Feb 19 2021

anonymous <foo bar.de> writes:

On Friday, 19 February 2021 at 20:53:33 UTC, Sameer Pradhan wrote:

 I have a feeling that someone one on this forum might have more 
 information that could shed some more light on the best path to 
 take.

I'm solving the linear assignment problem (LAP), but not in D. 
Therefore I can't share code...

Some general notes:
* For larger problems
R. Jonker and A. Volgenant, "A Shortest Augmenting Path Algorithm 
for Dense and Sparse Linear Assignment Problems," Computing, vol. 
38, pp. 325-340, 1987.
   is often faster than the hungarian method. Scipy switched their 
implementation lately.
* you can use single commodity min-cost-flow algorithms for 
solving the LAP. In my experience lemon::NetworkSimplex [1] 
(written in C++ -> some glue code for a template free API needed) 
beats a reasonably good implementation of the hungarian method by 
a factor of 5-10.
    * min-cost flow can handle non-square matrices.
    * I have no implementation of the Jonker/Volgenant to compare 
to.



[1] http://lemon.cs.elte.hu/pub/doc/1.3.1/a00276.html

Feb 19 2021

Sameer Pradhan <pradhan cemantix.org> writes:

On Saturday, 20 February 2021 at 07:50:40 UTC, anonymous wrote:
 * For larger problems
 R. Jonker and A. Volgenant, "A Shortest Augmenting Path 
 Algorithm for Dense and Sparse Linear Assignment Problems," 
 Computing, vol. 38, pp. 325-340, 1987.
   is often faster than the hungarian method. Scipy switched 
 their implementation lately.
 * you can use single commodity min-cost-flow algorithms for 
 solving the LAP. In my experience lemon::NetworkSimplex [1] 
 (written in C++ -> some glue code for a template free API 
 needed) beats a reasonably good implementation of the hungarian 
 method by a factor of 5-10.
    * min-cost flow can handle non-square matrices.
    * I have no implementation of the Jonker/Volgenant to 
 compare to.
 [1] http://lemon.cs.elte.hu/pub/doc/1.3.1/a00276.html

Thanks for these details! It is interesting that an algorithm 
from 1987 is valid and potentially 5-10x faster than the 
hungarian method. Sorry for my ignorance in this area, but do 
both algorithms provide optimal solutions, or would there be some 
compromise in quality of the solutions for the Jonker vs 
Hungarian method?

Feb 25 2021

anonymous <foo bar.com> writes:

On Thursday, 25 February 2021 at 20:40:37 UTC, Sameer Pradhan 
wrote:
 Thanks for these details! It is interesting that an algorithm 
 from 1987 is valid and potentially 5-10x faster than the 
 hungarian method. Sorry for my ignorance in this area, but do 
 both algorithms provide optimal solutions, or would there be 
 some compromise in quality of the solutions for the Jonker vs 
 Hungarian method?

well, there are at least 3 approaches to the linear assignment 
problem
1) the hungarian method (from the 1950s)
2) Jonker / Volgenant algorithm
3) reduction to min-cost flow. There are several algorithms for 
min-cost flow. One of them is the network simplex method.

all are exact, i.e. return solutions with the same 
costs/objective value. Note that there might be more than one 
optimal solution.

In theory one could also use a general purpose LP solver (any 
basic solution will be integer), but even the (expensive) 
commercial packages are slower than a reasonable implementation 
of the 3 approaches above.

As mentioned before, I have no practical experience with 2) as 
the license terms of the original Pascal implementation in  the 
paper are somewhat unclear (and therefore of all literal 
translations to other programming languages).
The 5-10x speedup is between a reasonable (school-bookish) 
implementation of the hungarian method I can not share and 
lemon::NetworkSimplex (-> option 3) - but lemon is a template 
heavy C++ library, so interfacing it to D will require some C++ 
programming.

Feb 25 2021

drug <drug2004 bk.ru> writes:

On 2/19/21 11:53 PM, Sameer Pradhan wrote:
 Dear D-ers,
 
 I wanted to port a currently Perl code used for scoring an NLP task to 
 D.  A few folks have reimplemented it in Python but I would like to 
 create a D implementation.
 
 The catch is that each of the above two routines relies on a library or 
 module that implements the Kuhn-Munkres assignment algorithm 
 (https://en.wikipedia.org/wiki/Hungarian_algorithm). Both the Perl and 
 Python implementations are based on the the following (likely more 
 general version the algorithm—I believe one that handles rectangular 
 matrices)
 
 http://csclab.murraystate.edu/~bob.pilgrim/445/munkres.html
 
 I did a quick search to find out whether there already exists a D 
 library that implements this routine.   I did not find any library 
 satisfying this requirement.  So, it is likely that I would have to 
 write it myself. My assumption is that it won't be very complicated, but 
 for the most efficient implementation I would probably need to use the 
 one of the D libraries that do NumPy style, efficient computation over 
 matrices.  OR, I could find a implementation in C and create a binding 
 for it in D.  I am not sure what is the easiest and least effort path 
 that would ensure an accurate and optimal (in terms of speed) solution.
 
 I have a feeling that someone one on this forum might have more 
 information that could shed some more light on the best path to take.
 
 
 -- 
 Sameer
 

As I understand all you need for efficient Hungarian algorithm 
implementation is a contiguous 2D array, not jagged one like built-in 2D 
array in D. All other operations (like conditional modification of a 
matrix element) are too trivial/specific to be highly optimized in 3rd 
party libraries (in contrast to finding the inverse of a matrix, for 
example). Also an operation like row modification should be well 
optimized by a compiler (ldc/gdc first of all). I would just take 
mir.ndslice to get a contiguous dynamic 2D array and implement the 
algorithm like in example you mentioned above. Or if your matrix has 
fixed size you can use gfm.math.Matrix for contiguous static arrays. gfm 
is less advanced than Mir because intended for 3D games first of all, 
but it can be much easier to use. Also gfm has only dependencies.

Feb 20 2021

Sameer Pradhan <pradhan cemantix.org> writes:

On Saturday, 20 February 2021 at 10:35:27 UTC, drug wrote:
 On 2/19/21 11:53 PM, Sameer Pradhan wrote:
 Dear D-ers,
 
 I wanted to port a currently Perl code used for scoring an NLP 
 task to D.  A few folks have reimplemented it in Python but I 
 would like to create a D implementation.
 
 The catch is that each of the above two routines relies on a 
 library or module that implements the Kuhn-Munkres assignment 
 algorithm (https://en.wikipedia.org/wiki/Hungarian_algorithm). 
 Both the Perl and Python implementations are based on the the 
 following (likely more general version the algorithm—I believe 
 one that handles rectangular matrices)
 
 http://csclab.murraystate.edu/~bob.pilgrim/445/munkres.html
 
 I did a quick search to find out whether there already exists 
 a D library that implements this routine.   I did not find any 
 library satisfying this requirement.  So, it is likely that I 
 would have to write it myself. My assumption is that it won't 
 be very complicated, but for the most efficient implementation 
 I would probably need to use the one of the D libraries that 
 do NumPy style, efficient computation over matrices.  OR, I 
 could find a implementation in C and create a binding for it 
 in D.  I am not sure what is the easiest and least effort path 
 that would ensure an accurate and optimal (in terms of speed) 
 solution.
 



Thanks for your responses. I had one other related question: I 
also found a C-implementation of the algorithm. I have never 
written a Dlang binding for a C library, but since D has binary 
compatibility with a C library/object file, would it be easier to 
write a wrapper for this C library in D? Or there can be some 
gotchas that I need to keep in mind if I take this path?

--
Sameer

Feb 25 2021

bachmeier <no spam.net> writes:

On Thursday, 25 February 2021 at 20:35:36 UTC, Sameer Pradhan 
wrote:

 Thanks for your responses. I had one other related question: I 
 also found a C-implementation of the algorithm. I have never 
 written a Dlang binding for a C library, but since D has binary 
 compatibility with a C library/object file, would it be easier 
 to write a wrapper for this C library in D? Or there can be 
 some gotchas that I need to keep in mind if I take this path?

If you have a working implementation, save yourself the time and 
call the C code from D. You should not need to write bindings, 
you should use dstep to write a binding for you

https://github.com/jacob-carlborg/dstep

or dpp to include the header file directly in your D code

https://github.com/atilaneves/dpp

Once you get that working, you can create a wrapper with better 
functionality if it's worth your time to do so. I typically just 
call the C functions as they're written.

Feb 25 2021

drug <drug2004 bk.ru> writes:

On 2/25/21 11:35 PM, Sameer Pradhan wrote:
 
 
 Thanks for your responses. I had one other related question: I also 
 found a C-implementation of the algorithm. I have never written a Dlang 
 binding for a C library, but since D has binary compatibility with a C 
 library/object file, would it be easier to write a wrapper for this C 
 library in D? Or there can be some gotchas that I need to keep in mind 
 if I take this path?
 
 -- 
 Sameer
 

Bindings would be the easier way. It is really trivial, imho, no gotchas 
at all. Also you can ask a question here.

Feb 26 2021

D Programming

C/C++ Programming

Other

digitalmars.D - Kuhn-Munkres Algorithm (a.k.a. The Hungarian Algorithm)