www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Trying Phobos some more

One way to find what's missing among Phobos ranges and higher 
order functions is to try to use them. This is a small 
Rosettacode task, given a file of words, it asks to find the 
longest pair of anagrams that have not even one letter in the 
same position:

http://rosettacode.org/wiki/Anagrams/Deranged_anagrams

I don't fully understand the Clojure solution:


(let
   [words    (re-seq #"\w+" (slurp "unixdict.txt"))
    anagrams (filter second (vals (group-by sort words)))
    deranged (remove #(some true? (apply map = %)) anagrams)]
   (prn (last (sort-by #(count (first %)) deranged))))


I have written two D solutions, the first tries to be short and 
the second to be faster. This is the first solution (look in that 
page for the fast solution):


void main() {
     import std.stdio, std.file, std.algorithm, std.string,
            std.array;

     string[][dstring] anags;
     foreach (const w; "unixdict.txt".readText.split)
         anags[w.array.sort().release.idup] ~= w;

     anags
     .byValue
     .map!(anags => anags.cartesianProduct(anags)
                    .filter!q{a[].equal!{ a != b }})
     .join
     .minPos!q{ a[0].length > b[0].length }[0]
     .writeln;
}



A better short D version:

void main() {
     import std.stdio, std.file, std.algorithm, std.string,
            std.array;

     "unixdict.txt"
     .readText
     .split
     .hashGroupBy!(w => w.sorted.release)
     .byValue
     .map!(anags => anags
                    .pairwise
                    .filter!q{a[].equal!q{ a != b }})
     .join
     .max!q{ a[0].length }
     .writeln;
}


Where:

hashGroupBy returns an associative array of the equivalence 
classes, given a key function 
(http://d.puremagic.com/issues/show_bug.cgi?id=9842 ).

sorted is similar to an array+sort. But it allocates a new array, 
so the its output should be implicitly castable to const 
(http://d.puremagic.com/issues/show_bug.cgi?id=5076  
https://d.puremagic.com/issues/show_bug.cgi?id=12140 ).

pairwise is a range that yields the pairs, but only the lower 
triangle. So it's like two nested loops. In this program it's up 
to twice faster than using cartesianProduct 
(http://d.puremagic.com/issues/show_bug.cgi?id=6788 ).

max and min accept an optional mapping key function, as in Python 
(http://d.puremagic.com/issues/show_bug.cgi?id=4705 ).

Bye,
bearophile
Feb 12 2014