digitalmars.D - Phobos usability

bearophile (53/54) Dec 24 2010 The original D2 implementation (probably by feepinCreature), with little...
bearophile <bearophileHUGS lycos.com> writes:
Another post about Phobos usability. Again the relative code is a rosettacode
Task:

Assume we have a collection of numbers, and want to find the one with the
largest minimal prime factor (that is, the one that contains relatively large
factors). To speed up the search, the factorization should be done in parallel
using separate threads or processes, to take advantage of multi-core CPUs.<

The original D2 implementation (probably by feepinCreature), with little
changes:
http://rosettacode.org/wiki/Parallel_calculations#D

This D2 version is not bad, but there are some ways to improve it.

------------------------------------

1) Finding the max or min of an iterable is a really common need. Probably I
need to do it every 50-100 lines of code or less. This is how it is found in
that program:

reduce!q{max(a, b)}(0L, minFactors));

I strongly suggest the max() and min() to find the max of an iterable by
themselves. And to optionally accept a mapping function (as schwartzSort). See:
http://d.puremagic.com/issues/show_bug.cgi?id=4705

With that the line of code becomes more readable and shorter, and surely
bug-free:
max(minFactors);

Similar considerations suggest the creation of a sum() function:
http://d.puremagic.com/issues/show_bug.cgi?id=4725

A sum() is not just a reduce, because it is allowed to use an optimization. See
issue 4725.

------------------------------------

2) Clojure language shows how nice is a pmap(), parallel map.

If necessary two different pmap() may be created, one where the single mapping
operation is slow enough, and one where it is very cheap. For this Task I use
the pmap() for costly mapping function.

Using pmap() the D2 program may be shortened to (this loses the timings, but
they are not required):

import std.stdio, std.math, std.algorithm, std.parallel;

pure ulong lowestFactor(immutable ulong n) {
    if (n % 2 == 0)
        return 2;
    else {
        immutable ulong limit = cast(ulong)sqrt(n) + 1;
        for (ulong i = 3; i < limit; i += 2)
            if (n % i == 0)
                return i;
    }
    return n;
}

void main() {
    auto numbers = [2UL^^61-1, 2UL^^61-1, 2UL^^61-1, 112272537195293,
                    115284584522153, 115280098190773, 115797840077099,
                    112582718962171, 112272537095293, 1099726829285419];

    writefln("Largest min. factor is %s.",
             max(pmap(&lowestFactor, numbers)));
}


pmap is able to see that the given mapping function is pure. Here
lowestFactor() is strongly pure, so pmap() is free to ignore the execution
order of the single lowestFactor functions.

An alternative syntax for pmap is similar to the map:
pmap!(&lowest_factor)(numbers)

------------------------------------

3) For debugging I may want to print the array of lowest factors:
writeln(pmap(&lowest_factor, numbers));

Or even just:
writeln(map!(&lowest_factor)(numbers));

writeln() is able to print a lazy iterable, but it prints it just as an array.
This is bad, because it's important to give cues to the person that reads to
the printout regarding the types of the types printed.

A simple way to tell apart lazy sequences from arrays is to use a semicolon
instead a comma to tell apart items (lists are sometimes written using a
semicolon in other functional languages, so this is not a new thing):

[0; 1; 2; 3; 4]

See:
http://d.puremagic.com/issues/show_bug.cgi?id=3813

Empty dynamic arrays, empty static arrays, empty associative arrays and empty
lazy iterables need some kind of output when they are printed, I suggest a
simple "[]" at least. With no output it's too much easy to confuse "no output"
with "empty array" with "missing writeln" with "not run code path", etc. (In
bugzilla there is a bug report about this, not written by me, I don't remember
its number).

Bye,
bearophile
Dec 24 2010
D Programming

C/C++ Programming

Other

digitalmars.D - Phobos usability