digitalmars.D - How many HOFs in Phobos?

bearophile (48/48) Jan 31 2011 This is a high-level implementation of Levenshtein distance in Haskell:

Andrei Alexandrescu (61/109) Jan 31 2011 Thanks for this work.

bearophile (10/14) Jan 31 2011 I was quite aware that Haskell version is designed for being short, not ...

Walter Bright (2/5) Feb 01 2011 It's exponentially bad performance makes it short, not useful.

Andrei Alexandrescu (5/11) Feb 01 2011 I'm not sure whether it's exponential, polynomial greater than
bearophile (6/7) Feb 01 2011 A program with high complexity is not a problem if you run it only on fe...

Jonathan M Davis (7/19) Feb 01 2011 The issue is that if you want something in Phobos, it _does_ need to be ...

Daniel Gibson (4/25) Feb 01 2011 Well, he didn't want the slow Levenshtein implementation in Phobos (if I

Andrei Alexandrescu (13/38) Feb 01 2011 The fact that foldl and foldr are only one letter apart is a design

Daniel Gibson (17/63) Feb 01 2011 Thanks for the link :-)
bearophile (69/73) Feb 01 2011 I think you have misunderstood the discussion, or maybe I just don't und...

Andrei Alexandrescu (26/77) Feb 01 2011 I'd be glad to include such a function if there were good use cases for

bearophile (8/10) Feb 02 2011 The edit distance code in Phobos2 is about 129 lines long (without comme...

spir (24/74) Feb 02 2011 'iterate' would be great if it didn't have a different meaning in progra...

Andrei Alexandrescu (8/27) Feb 01 2011 I think this is a bit much, though probably a good principle to live

Jonathan M Davis (10/39) Feb 01 2011 Okay. Perhaps, I said it a bit too strongly, but I think that the gist o...

Walter Bright (2/4) Feb 01 2011 Yup, because if it isn't, it gets ridicule heaped upon it, and deservedl...

Andrei Alexandrescu (6/17) Feb 01 2011 I agree in spirit but only weakly. Cute and complexity corrupt examples

bearophile <bearophileHUGS lycos.com> writes:

This is a high-level implementation of Levenshtein distance in Haskell:


levenshtein s1 s2 = last $ foldl transform [0 .. length s1] s2
  where transform ns (n:ns') c = scanl calc (n+1) $ zip3 s1 ns ns'
          where calc z (c', x, y) = minimum [y+1, z+1, x + fromEnum (c' /= c)]

main = print (levenshtein "kitten" "sitting")


(I don't explain that Haskell code because below I give a kind of D translation
that is probably enough to understand.)
It prints 3, the code comes from the rosettacode.org site:
http://rosettacode.org/wiki/Levenshtein_distance#Haskell

Currently in std.algorithm there are higher order functions (HOFs) like "map",
"filter", "reduce" etc, reduce is the right fold.

That Haskell code uses a foldl (fold left), and scanl (like foldl, but keeps
and returns all intermediate values too). This is a brutal translation to D2:


import std.stdio, std.range, std.array, std.algorithm, std.typecons;

/// foldl f z [x1, x2, ..., xn] == (...((z `f` x1) `f` x2) `f`...) `f` xn
T1 foldl(F, T1, Range)(F f, T1 z, Range xs) {
    foreach (x; xs)
        z = f(z, x);
    return z;
}

/// scanl f z [x1, x2, ...] == [z, z `f` x1, (z `f` x1) `f` x2, ...]
T1[] scanl(F, T1, Range)(F f, T1 z, Range xs) {
    auto result = [z];
    foreach (x; xs) {
        z = f(z, x);
        result ~= z;
    }
    return result;
}

auto levenshtein(string s1, string s2) {
    int[] transform(int[] ns, char c) {
        auto n = ns[0];
        auto ns2 = ns[1..$];
        int compute(int z, Tuple!(dchar, int, int) t) {
            return min(t[2]+1, z+1, t[1] + (t[0] != c));
        }
        return scanl(&compute, n+1, zip(s1, ns, ns2));
    }
    return foldl(&transform, array(iota(cast(int)s1.length+1)), s2)[$-1];
}

void main() {
    string add(string x, string y) { return x ~ y; }
    writeln(foldl(&add, "", ["ab", "cd", "ef"])); // "abcdef"
    writeln(scanl(&add, "", ["ab", "cd", "ef"])); // ["", "ab", "abcd",
"abcdef"]

    writeln(levenshtein("kitten", "sitting")); // 3
}


Haskell has tons of built-in HOFs (monadic ones too, like foldM), but D is less
functional than Haskell, and reading highly functional-style code is not easy
for many programmers (and D syntax limits make such code noisy). What's the
right number of HOFs to put in std.algorithm+std.range? Are (better
implementations of) scanl and foldl fit for Phobos? (I am not suggesting to
rename "reduce" to "foldr". You may add an "alias reduce foldr;" if you want,
but that's all).

Hours ago I have added this suggestion to bugzilla:
http://d.puremagic.com/issues/show_bug.cgi?id=5510

Bye,
bearophile

Jan 31 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/31/11 3:37 PM, bearophile wrote:
 This is a high-level implementation of Levenshtein distance in Haskell:


 levenshtein s1 s2 = last $ foldl transform [0 .. length s1] s2
    where transform ns (n:ns') c = scanl calc (n+1) $ zip3 s1 ns ns'
            where calc z (c', x, y) = minimum [y+1, z+1, x + fromEnum (c' /= c)]

 main = print (levenshtein "kitten" "sitting")


 (I don't explain that Haskell code because below I give a kind of D
translation that is probably enough to understand.)
 It prints 3, the code comes from the rosettacode.org site:
 http://rosettacode.org/wiki/Levenshtein_distance#Haskell

 Currently in std.algorithm there are higher order functions (HOFs) like "map",
"filter", "reduce" etc, reduce is the right fold.

 That Haskell code uses a foldl (fold left), and scanl (like foldl, but keeps
and returns all intermediate values too). This is a brutal translation to D2:


 import std.stdio, std.range, std.array, std.algorithm, std.typecons;

 /// foldl f z [x1, x2, ..., xn] == (...((z `f` x1) `f` x2) `f`...) `f` xn
 T1 foldl(F, T1, Range)(F f, T1 z, Range xs) {
      foreach (x; xs)
          z = f(z, x);
      return z;
 }

 /// scanl f z [x1, x2, ...] == [z, z `f` x1, (z `f` x1) `f` x2, ...]
 T1[] scanl(F, T1, Range)(F f, T1 z, Range xs) {
      auto result = [z];
      foreach (x; xs) {
          z = f(z, x);
          result ~= z;
      }
      return result;
 }

 auto levenshtein(string s1, string s2) {
      int[] transform(int[] ns, char c) {
          auto n = ns[0];
          auto ns2 = ns[1..$];
          int compute(int z, Tuple!(dchar, int, int) t) {
              return min(t[2]+1, z+1, t[1] + (t[0] != c));
          }
          return scanl(&compute, n+1, zip(s1, ns, ns2));
      }
      return foldl(&transform, array(iota(cast(int)s1.length+1)), s2)[$-1];
 }

 void main() {
      string add(string x, string y) { return x ~ y; }
      writeln(foldl(&add, "", ["ab", "cd", "ef"])); // "abcdef"
      writeln(scanl(&add, "", ["ab", "cd", "ef"])); // ["", "ab", "abcd",
"abcdef"]

      writeln(levenshtein("kitten", "sitting")); // 3
 }


 Haskell has tons of built-in HOFs (monadic ones too, like foldM), but D is
less functional than Haskell, and reading highly functional-style code is not
easy for many programmers (and D syntax limits make such code noisy). What's
the right number of HOFs to put in std.algorithm+std.range? Are (better
implementations of) scanl and foldl fit for Phobos? (I am not suggesting to
rename "reduce" to "foldr". You may add an "alias reduce foldr;" if you want,
but that's all).

 Hours ago I have added this suggestion to bugzilla:
 http://d.puremagic.com/issues/show_bug.cgi?id=5510

 Bye,
 bearophile

Thanks for this work.

The Haskell implementation doesn't scale. My testbed:

levenshtein s1 s2 = last $ foldl transform [0 .. length s1] s2
   where transform ns (n:ns') c = scanl calc (n+1) $ zip3 s1 ns ns'
           where calc z (c', x, y) = minimum [y+1, z+1, x + fromEnum (c' 
/= c)]

n = 4
s = concat $ replicate n "kitten"
t = concat $ replicate n "sitting"

main = print (levenshtein s t)

I assigned n doubling values: 4 8 16 32 64 ... and measured the running 
times gave me:

4 -> 3ms
8 -> 4ms
16 -> 8ms
32 -> 32ms
64 -> 137ms
128 -> 607ms
256 -> 2.6s
512 -> 15.8s
1024 -> 78s
2048 -> doesn't finish with -O, crashes without -O

I used ghc -O on a top-of-the-line server (8 cores, 64 GB RAM).

Then I ran your implementation on a MacBook Pro with dmd -O -release 
-inline:

4 -> 4ms
8 -> 4ms
16 -> 6ms
32 -> 11ms
64 -> 36ms
128 -> 108ms
256 -> 402ms
512 -> 1.59s
1024 -> 6.36s
2048 -> 25.43s
4096 -> 100.2s

Finally, on the same laptop and with the same options I ran the 
std-provided levenshteinDistance, obtaining:

4 -> 5ms
8 -> 5ms
16 -> 5ms
32 -> 6ms
64 -> 11ms
128 -> 33ms
256 -> 112ms
512 -> 431ms
1024 -> 1.8s
2048 -> 7.9s
4096 -> 37.9s

Note that levenshteinDistance is more general (works with any forward 
ranges, automatically performs correct UTF decoding, accepts custom 
comparison).

The trend of the two D implementations is quadratic: time required is 4x 
upon doubling the size of the input. The trend of the Haskell 
implementation is super-quadratic.

Note that reduce is akin to foldl, not foldr. To answer your question, 
adding functional primitives to Phobos is good but should be done in 
moderation; a staunch functional approach is not always the most 
recommendable.


Andrei

Jan 31 2011

bearophile <bearophileHUGS lycos.com> writes:

Andrei:

 Thanks for this work.

My pleasure :-)


 The Haskell implementation doesn't scale.

I was quite aware that Haskell version is designed for being short, not fast.
But I was not aware of the precise trends of the various versions, and I am
surprised to see how much efficient is my ugly D2 version compared to that
fully lazy Haskell version. As usual doing experiments teaches something :-)


 My testbed:

This time it's you doing the bencharks :-)
If you want you may benchmark this version too, adapted to D2 from my dlibs1,
the timings I'm seeing here are interesting:
http://codepad.org/Kf2FMMN9
Uncommenting a different lines of code it allows you to benchmark three
versions, two from Phobos2.


 Note that reduce is akin to foldl, not foldr.

Sorry for my mistake.

Bye and thank you,
bearophile

Jan 31 2011

Walter Bright <newshound2 digitalmars.com> writes:

bearophile wrote:
 The Haskell implementation doesn't scale.

 
 I was quite aware that Haskell version is designed for being short, not fast.

It's exponentially bad performance makes it short, not useful.

Feb 01 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/1/11 12:34 PM, Walter Bright wrote:
 bearophile wrote:
 The Haskell implementation doesn't scale.

 I was quite aware that Haskell version is designed for being short,
 not fast.

 It's exponentially bad performance makes it short, not useful.

I'm not sure whether it's exponential, polynomial greater than 
quadratic, or simply quadratic (as it should) with large inefficiencies 
attached. Maybe a Haskell expert could clarify that.

Andrei

Feb 01 2011

bearophile <bearophileHUGS lycos.com> writes:

Walter:
 
 It's exponentially bad performance makes it short, not useful.

A program with high complexity is not a problem if you run it only on few very
short examples. There is a place to care for performance (like when you design
a function for Phobos) and there are places where you care for other things.

I suggest top stop focusing only on a fault of a program that was not designed
for performance (and if you want to start looking at the numerous good things
present in Haskell. Haskell language and its implementation contains tens of
good ideas).

Bye,
bearophile

Feb 01 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday 01 February 2011 12:27:32 bearophile wrote:
 Walter:
 It's exponentially bad performance makes it short, not useful.

 
 A program with high complexity is not a problem if you run it only on few
 very short examples. There is a place to care for performance (like when
 you design a function for Phobos) and there are places where you care for
 other things.
 
 I suggest top stop focusing only on a fault of a program that was not
 designed for performance (and if you want to start looking at the numerous
 good things present in Haskell. Haskell language and its implementation
 contains tens of good ideas).

The issue is that if you want something in Phobos, it _does_ need to be
designed 
with performance in mind. Anything which isn't efficient needs to have a very
good 
reason for its existence which balances out its lack of efficiency. If the
Haskell 
implementation isn't performant enough, then it doesn't cut it for Phobos, even 
if it's a good fit for Haskell.

- Jonathan M Davis

Feb 01 2011

Daniel Gibson <metalcaedes gmail.com> writes:

Am 01.02.2011 21:53, schrieb Jonathan M Davis:
 On Tuesday 01 February 2011 12:27:32 bearophile wrote:
 Walter:
 It's exponentially bad performance makes it short, not useful.

 A program with high complexity is not a problem if you run it only on few
 very short examples. There is a place to care for performance (like when
 you design a function for Phobos) and there are places where you care for
 other things.

 I suggest top stop focusing only on a fault of a program that was not
 designed for performance (and if you want to start looking at the numerous
 good things present in Haskell. Haskell language and its implementation
 contains tens of good ideas).

 
 The issue is that if you want something in Phobos, it _does_ need to be
designed 
 with performance in mind. Anything which isn't efficient needs to have a very
good 
 reason for its existence which balances out its lack of efficiency. If the
Haskell 
 implementation isn't performant enough, then it doesn't cut it for Phobos,
even 
 if it's a good fit for Haskell.
 
 - Jonathan M Davis

Well, he didn't want the slow Levenshtein implementation in Phobos (if I
understood correctly), but more higher order functions like fold*. These are not
inherently slow and are most probably useful to implement fast functions as
well ;)

Feb 01 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/1/11 2:58 PM, Daniel Gibson wrote:
 Am 01.02.2011 21:53, schrieb Jonathan M Davis:
 On Tuesday 01 February 2011 12:27:32 bearophile wrote:
 Walter:
 It's exponentially bad performance makes it short, not useful.

 A program with high complexity is not a problem if you run it only on few
 very short examples. There is a place to care for performance (like when
 you design a function for Phobos) and there are places where you care for
 other things.

 I suggest top stop focusing only on a fault of a program that was not
 designed for performance (and if you want to start looking at the numerous
 good things present in Haskell. Haskell language and its implementation
 contains tens of good ideas).

 The issue is that if you want something in Phobos, it _does_ need to be
designed
 with performance in mind. Anything which isn't efficient needs to have a very
good
 reason for its existence which balances out its lack of efficiency. If the
Haskell
 implementation isn't performant enough, then it doesn't cut it for Phobos, even
 if it's a good fit for Haskell.

 - Jonathan M Davis

 Well, he didn't want the slow Levenshtein implementation in Phobos (if I
 understood correctly), but more higher order functions like fold*. These are
not
 inherently slow and are most probably useful to implement fast functions as
well ;)

The fact that foldl and foldr are only one letter apart is a design 
mistake. They have very different behavior, yet many functional 
programmers consider them largely interchangeable and are genuinely 
surprised when they hear the relative tradeoffs.

std.algorithm.reduce implements foldl, as it should. Simulating foldr on 
bidirectional ranges is as easy as reduce(retro(range)). Defining 
Haskell-style foldr on forward ranges would be difficult because it 
needs lazy evaluation through and through, and is at danger because 
people may use it naively.

For more info see the best answer at 
http://stackoverflow.com/questions/3429634/haskell-foldl-vs-foldr-question.


Andrei

Feb 01 2011

Daniel Gibson <metalcaedes gmail.com> writes:

Am 01.02.2011 22:30, schrieb Andrei Alexandrescu:
 On 2/1/11 2:58 PM, Daniel Gibson wrote:
 Am 01.02.2011 21:53, schrieb Jonathan M Davis:
 On Tuesday 01 February 2011 12:27:32 bearophile wrote:
 Walter:
 It's exponentially bad performance makes it short, not useful.

 A program with high complexity is not a problem if you run it only on few
 very short examples. There is a place to care for performance (like when
 you design a function for Phobos) and there are places where you care for
 other things.

 I suggest top stop focusing only on a fault of a program that was not
 designed for performance (and if you want to start looking at the numerous
 good things present in Haskell. Haskell language and its implementation
 contains tens of good ideas).

 The issue is that if you want something in Phobos, it _does_ need to be
designed
 with performance in mind. Anything which isn't efficient needs to have a very
 good
 reason for its existence which balances out its lack of efficiency. If the
 Haskell
 implementation isn't performant enough, then it doesn't cut it for Phobos, even
 if it's a good fit for Haskell.

 - Jonathan M Davis

 Well, he didn't want the slow Levenshtein implementation in Phobos (if I
 understood correctly), but more higher order functions like fold*. These are
not
 inherently slow and are most probably useful to implement fast functions as
 well ;)

 
 The fact that foldl and foldr are only one letter apart is a design mistake.
 They have very different behavior, yet many functional programmers consider
them
 largely interchangeable and are genuinely surprised when they hear the relative
 tradeoffs.
 
 std.algorithm.reduce implements foldl, as it should. Simulating foldr on
 bidirectional ranges is as easy as reduce(retro(range)). Defining Haskell-style
 foldr on forward ranges would be difficult because it needs lazy evaluation
 through and through, and is at danger because people may use it naively.
 
 For more info see the best answer at
 http://stackoverflow.com/questions/3429634/haskell-foldl-vs-foldr-question.
 
 
 Andrei

Thanks for the link :-)
I haven't used Haskell (or any other functional language) in a few years so I
forgot these details (to be honest, I don't think I understood the implications
explained in the stackoverflow post back then - I was happy when my Haskell
programs did what the exercises demanded).
I think that reduce as a non-lazy foldl is what people mostly need/want when
they use folding.
But I'm not sure if a lazy foldr (with whatever name to prevent people from
using it accidentally) may be useful.. I guess it's only useful when a list (or,
in D, range) is returned, i.e. the input list/range isn't really reduced like
when just calculating a sum or minimum or whatever.
I'm trying to think of use cases for this, but none (that aren't covered by map)
come to (my) mind - but this may be just because my brain isn't used to
functional programming anymore.

Cheers,
- Daniel

Feb 01 2011

bearophile <bearophileHUGS lycos.com> writes:

Jonathan M Davis:

The issue is that if you want something in Phobos, it _does_ need to be
designed with performance in mind. Anything which isn't efficient needs to have
a very good reason for its existence which balances out its lack of efficiency.
If the Haskell implementation isn't performant enough, then it doesn't cut it
for Phobos, even if it's a good fit for Haskell.<

I think you have misunderstood the discussion, or maybe I just don't understand
you.

My discussion was about HOFs, not about Levenshtein distance (I've shown a fast
one, but it's probably not usable for Phobos because of license issues:
http://codepad.org/s0ezEojU ).

---------------------

Andrei:

The fact that foldl and foldr are only one letter apart is a design mistake.<

I agree, mostly :-)


But something like foldr that uses head recursion would be indeed rather
dangerous to include.<

std.algorithm.reduce implements foldl, as it should. Simulating foldr on
bidirectional ranges is as easy as reduce(retro(range)). Defining Haskell-style
foldr on forward ranges would be difficult because it needs lazy evaluation
through and through, and is at danger because people may use it naively.

Python2 has only the foldr, probably to keep language simpler & safer to use.
Python3 has moved reduce from the language to the std library.

So far I have needed both kinds of folds in D only to translate some small
programs from Haskell to D (and in this case I have even once confused the two
folds, as you have noted), so probably I will be able to survive without a
(better renamed) foldr in Phobos, if you don't want it for "safety" for naive
programmers.

------------------

But there are other HOFs that may be useful (they are in dlibs1 too):

- "nest" (or iterate), to apply one function many times:
nest(sin, 0.2, 4) === sin(sin(sin(sin(0.2))))

- Something similar, that keeps all the intermediate results. It's sometimes
named nestList, but in D it may be lazy.

------------------

There is another problem, shown by std.functional.compose. See the part about D
here:
http://rosettacode.org/wiki/First-class_functions#D

This task asks things like (more details on the rosettacode page):
- Create new functions from preexisting functions at run-time
- Store functions in collections
- Use functions as arguments to other functions
- Use functions as return values of other functions 


To do it "well enough" the D implementation doesn't want to use
std.functional.compose and defines a more dynamic one, able to use run time
delegates:


private import std.math ;
import std.stdio ;
 
T delegate(S) compose(T, U, S)(T delegate(U) f, U delegate(S) g) {
    return (S s) { return f(g(s)); };
}
 
void main() {
	// warper need as not all built-in real functions 
	// have same signature , eg pure/nothrow
	auto sin  = delegate (real x) { return std.math.sin(x) ; } ;
	auto asin = delegate (real x) { return std.math.asin(x) ; } ;
	auto cos  = delegate (real x) { return std.math.cos(x) ; } ;
	auto acos = delegate (real x) { return std.math.acos(x) ; } ;
	auto cube = delegate (real x) { return x*x*x ; } ;
	auto cbrt = delegate (real x) { return std.math.cbrt(x) ; } ;
 
	// built-in : sin/cos/asin/acos/cbrt user:cube
	auto fun = [sin,  cos,  cube] ;
	auto inv = [asin, acos, cbrt] ;
 
	foreach(i, f ; fun)
		writefln("%6.3f", compose(f, inv[i])(0.5)) ;
}


You are able to write a similar program with std.functional.compose too, but
using tuples instead of arrays, this is less flexible:

import std.stdio, std.typetuple, std.functional;
private import std.math;
 
void main() {
    // wrappers needed as not all built-in functions
    // have same signature, eg pure/nothrow
    auto sin  = (real x) { return std.math.sin(x); };
    auto asin = (real x) { return std.math.asin(x); };
    auto cos  = (real x) { return std.math.cos(x); };
    auto acos = (real x) { return std.math.acos(x); };
    auto cube = (real x) { return x ^^ 3; };
    auto cbrt = (real x) { return std.math.cbrt(x); };
 
    alias TypeTuple!(sin,  cos,  cube) dir;
    alias TypeTuple!(asin, acos, cbrt) inv;
 
    foreach (i, f; dir)
        writefln("%6.3f", compose!(f, inv[i])(0.5));
}


This questions the design of std.functional.compose.

Bye,
bearophile

Feb 01 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/1/11 7:17 PM, bearophile wrote:
 But there are other HOFs that may be useful (they are in dlibs1 too):

 - "nest" (or iterate), to apply one function many times:
 nest(sin, 0.2, 4) === sin(sin(sin(sin(0.2))))

I'd be glad to include such a function if there were good use cases for 
it. Fixed-point iteration is interesting in math, but in practice you 
don't just run it naively, e.g. you run it once and check for a 
termination condition (which may be quite involved when e.g. doing 
linear algebra; my dissertation uses a lot of fixed-point iteration).

To wit, the sin example sucks. At what time have you been at a point in 
life when you wanted to do not only a few iterations of sin against its 
own result, but actually you want to be able to express that very notion 
as a sole function? It at least you chose cos, I would have been more 
sympathetic because fixed point iteration it's a way to solving cos(x) = x.

SICP has if I remember correctly a couple of nice examples of iterations 
for computing pi and sqrt, but those are recurrences, not fixed point 
iterations. For that we already have std.range.recurrence which is more 
general. To do fixed point with sequence is trivial:

auto s = recurrence!"sin(a[n-1])"(1.0);

The main point is that there are very strong examples for recurrences.

 - Something similar, that keeps all the intermediate results. It's sometimes
named nestList, but in D it may be lazy.

I think better examples could be found for that one, but... still 
tenuous. What's wrong with array, take, and recurrence?

You _already_ have in D better abstractions that those you think you 
want to bring over from functional languages. You have edit distance 
that laughs Haskell's out the door, you have recurrence that makes 
iterate look like a useless oddity, you have range refinements that 
bring you the best of foldr without the cost...

 ------------------

 There is another problem, shown by std.functional.compose. See the part about
D here:
 http://rosettacode.org/wiki/First-class_functions#D

 This task asks things like (more details on the rosettacode page):
 - Create new functions from preexisting functions at run-time
 - Store functions in collections
 - Use functions as arguments to other functions
 - Use functions as return values of other functions


 To do it "well enough" the D implementation doesn't want to use
std.functional.compose and defines a more dynamic one, able to use run time
delegates:


 private import std.math ;
 import std.stdio ;

 T delegate(S) compose(T, U, S)(T delegate(U) f, U delegate(S) g) {
      return (S s) { return f(g(s)); };
 }

 void main() {
 	// warper need as not all built-in real functions
 	// have same signature , eg pure/nothrow
 	auto sin  = delegate (real x) { return std.math.sin(x) ; } ;
 	auto asin = delegate (real x) { return std.math.asin(x) ; } ;
 	auto cos  = delegate (real x) { return std.math.cos(x) ; } ;
 	auto acos = delegate (real x) { return std.math.acos(x) ; } ;
 	auto cube = delegate (real x) { return x*x*x ; } ;
 	auto cbrt = delegate (real x) { return std.math.cbrt(x) ; } ;

 	// built-in : sin/cos/asin/acos/cbrt user:cube
 	auto fun = [sin,  cos,  cube] ;
 	auto inv = [asin, acos, cbrt] ;

 	foreach(i, f ; fun)
 		writefln("%6.3f", compose(f, inv[i])(0.5)) ;
 }


 You are able to write a similar program with std.functional.compose too, but
using tuples instead of arrays, this is less flexible:

 import std.stdio, std.typetuple, std.functional;
 private import std.math;

 void main() {
      // wrappers needed as not all built-in functions
      // have same signature, eg pure/nothrow
      auto sin  = (real x) { return std.math.sin(x); };
      auto asin = (real x) { return std.math.asin(x); };
      auto cos  = (real x) { return std.math.cos(x); };
      auto acos = (real x) { return std.math.acos(x); };
      auto cube = (real x) { return x ^^ 3; };
      auto cbrt = (real x) { return std.math.cbrt(x); };

      alias TypeTuple!(sin,  cos,  cube) dir;
      alias TypeTuple!(asin, acos, cbrt) inv;

      foreach (i, f; dir)
          writefln("%6.3f", compose!(f, inv[i])(0.5));
 }


 This questions the design of std.functional.compose.

More like it spurs the language to allow better local instantiation.


Andrei

Feb 01 2011

bearophile <bearophileHUGS lycos.com> writes:

Andrei:

Thank you for your answers. Learning to use Phobos algorithms very well takes
months.

You have edit distance that laughs Haskell's out the door,<

The edit distance code in Phobos2 is about 129 lines long (without comments and
blank lines), while that Haskell version I've shown is 3 lines long and it was
never designed to be library code. Probably there are ways to write a longer
Haskell version of the edit distance that's no more than about two times slower
than any D version.
On the other hand the dlibs1 library version of the edit distance I've shown is
about 10 times faster than the Phobos2 edit distance for that benchmark (but
it's less flexible, it doesn't deal with UTF8/16) (and I've seen less troubles
with memory allocation too).

I have closed enhancement request 5510, keeping zombie enhancement requests
open is bad :-)


 More like it spurs the language to allow better local instantiation.

I think you refer to a recent nice enhancement request. But I am not sure
that's enough to regain the dynamism of run-time function pointers.

Bye,
bearophile

Feb 02 2011

spir <denis.spir gmail.com> writes:

On 02/02/2011 02:17 AM, bearophile wrote:
 But there are other HOFs that may be useful (they are in dlibs1 too):

 - "nest" (or iterate), to apply one function many times:
 nest(sin, 0.2, 4) === sin(sin(sin(sin(0.2))))

'iterate' would be great if it didn't have a different meaning in programming 
already.

 - Something similar, that keeps all the intermediate results. It's sometimes
named nestList, but in D it may be lazy.

Nice to compute (math) series, I guess. What other uses?
Anyway, both are replaced by one line of D, don't you think?

 There is another problem, shown by std.functional.compose. See the part about
D here:
 http://rosettacode.org/wiki/First-class_functions#D

 This task asks things like (more details on the rosettacode page):
 - Create new functions from preexisting functions at run-time
 - Store functions in collections
 - Use functions as arguments to other functions
 - Use functions as return values of other functions


 To do it "well enough" the D implementation doesn't want to use
std.functional.compose and defines a more dynamic one, able to use run time
delegates:


 private import std.math ;
 import std.stdio ;

 T delegate(S) compose(T, U, S)(T delegate(U) f, U delegate(S) g) {
      return (S s) { return f(g(s)); };
 }

Great! That's the signature I would expect. Except for S U T in that order ;-) 
(what the f**k?)

Output delegate(Input) compose(Output, Median, Input)
   (Output delegate(Median) f, Median delegate(Input) g) {
     return (Input input) { return f(g(input)); };
}

 void main() {
 	// warper need as not all built-in real functions
 	// have same signature , eg pure/nothrow
 	auto sin  = delegate (real x) { return std.math.sin(x) ; } ;
 	auto asin = delegate (real x) { return std.math.asin(x) ; } ;
 	auto cos  = delegate (real x) { return std.math.cos(x) ; } ;
 	auto acos = delegate (real x) { return std.math.acos(x) ; } ;
 	auto cube = delegate (real x) { return x*x*x ; } ;
 	auto cbrt = delegate (real x) { return std.math.cbrt(x) ; } ;

 	// built-in : sin/cos/asin/acos/cbrt user:cube
 	auto fun = [sin,  cos,  cube] ;
 	auto inv = [asin, acos, cbrt] ;

 	foreach(i, f ; fun)
 		writefln("%6.3f", compose(f, inv[i])(0.5)) ;
 }

That's the way it must be written, imo.

 You are able to write a similar program with std.functional.compose too, but
using tuples instead of arrays, this is less flexible:

 import std.stdio, std.typetuple, std.functional;
 private import std.math;

 void main() {
      // wrappers needed as not all built-in functions
      // have same signature, eg pure/nothrow
      auto sin  = (real x) { return std.math.sin(x); };
      auto asin = (real x) { return std.math.asin(x); };
      auto cos  = (real x) { return std.math.cos(x); };
      auto acos = (real x) { return std.math.acos(x); };
      auto cube = (real x) { return x ^^ 3; };
      auto cbrt = (real x) { return std.math.cbrt(x); };

      alias TypeTuple!(sin,  cos,  cube) dir;
      alias TypeTuple!(asin, acos, cbrt) inv;

      foreach (i, f; dir)
          writefln("%6.3f", compose!(f, inv[i])(0.5));
 }


 This questions the design of std.functional.compose.

I do agree. Maybe of various other funcs and HOF's in stdlib as well.
In particular, I think tuple and TypeTuple shoudl not be used by other features 
in Phobos, at least as long as they are not builtin features. Their 
manipulation complicates and obscures everything. Better to use instead arrays 
for collections, or structs as named tuples. Even if a struct must be defined 
for a single use: it's clear, simple, self-documenting thank to names; and 
well-known by everyone.

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Feb 02 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/1/11 2:53 PM, Jonathan M Davis wrote:
 On Tuesday 01 February 2011 12:27:32 bearophile wrote:
 Walter:
 It's exponentially bad performance makes it short, not useful.

 A program with high complexity is not a problem if you run it only on few
 very short examples. There is a place to care for performance (like when
 you design a function for Phobos) and there are places where you care for
 other things.

 I suggest top stop focusing only on a fault of a program that was not
 designed for performance (and if you want to start looking at the numerous
 good things present in Haskell. Haskell language and its implementation
 contains tens of good ideas).

 The issue is that if you want something in Phobos, it _does_ need to be
designed
 with performance in mind. Anything which isn't efficient needs to have a very
good
 reason for its existence which balances out its lack of efficiency. If the
Haskell
 implementation isn't performant enough, then it doesn't cut it for Phobos, even
 if it's a good fit for Haskell.

 - Jonathan M Davis

I think this is a bit much, though probably a good principle to live 
into. For example Phobos does include linear search routines that are 
"inefficient" - i.e. O(m * n). It also has many abstractions that are 
arguably not as efficient as they could be, either at high level, low 
level, or both. But something like foldr that uses head recursion would 
be indeed rather dangerous to include.

Andrei

Feb 01 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, February 01, 2011 13:37:44 Andrei Alexandrescu wrote:
 On 2/1/11 2:53 PM, Jonathan M Davis wrote:
 On Tuesday 01 February 2011 12:27:32 bearophile wrote:
 Walter:
 It's exponentially bad performance makes it short, not useful.

 
 A program with high complexity is not a problem if you run it only on
 few very short examples. There is a place to care for performance (like
 when you design a function for Phobos) and there are places where you
 care for other things.
 
 I suggest top stop focusing only on a fault of a program that was not
 designed for performance (and if you want to start looking at the
 numerous good things present in Haskell. Haskell language and its
 implementation contains tens of good ideas).

 
 The issue is that if you want something in Phobos, it _does_ need to be
 designed with performance in mind. Anything which isn't efficient needs
 to have a very good reason for its existence which balances out its lack
 of efficiency. If the Haskell implementation isn't performant enough,
 then it doesn't cut it for Phobos, even if it's a good fit for Haskell.
 
 - Jonathan M Davis

 
 I think this is a bit much, though probably a good principle to live
 into. For example Phobos does include linear search routines that are
 "inefficient" - i.e. O(m * n). It also has many abstractions that are
 arguably not as efficient as they could be, either at high level, low
 level, or both. But something like foldr that uses head recursion would
 be indeed rather dangerous to include.

Okay. Perhaps, I said it a bit too strongly, but I think that the gist of what
I 
said is sound. Inefficient algorithms need to be bring something to the table
that 
is worth their inefficiency. And generally, if we can reasonably make
algorithms 
more efficient, that's something that we want to do. That's not to say that we 
don't have or never will have less efficient algorithms in Phobos, but they're 
there because they're worth what they bring, and just because an algorithm
would 
be considered reasonable for Haskell does not necessarily mean that it would be 
considered reasonable for Phobos.

- Jonathan M Davis

Feb 01 2011

Walter Bright <newshound2 digitalmars.com> writes:

Jonathan M Davis wrote:
 The issue is that if you want something in Phobos, it _does_ need to be
designed 
 with performance in mind.

Yup, because if it isn't, it gets ridicule heaped upon it, and deservedly.

Feb 01 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/1/11 2:27 PM, bearophile wrote:
 Walter:

 It's exponentially bad performance makes it short, not useful.

 A program with high complexity is not a problem if you run it only on
 few very short examples. There is a place to care for performance
 (like when you design a function for Phobos) and there are places
 where you care for other things.

 I suggest top stop focusing only on a fault of a program that was not
 designed for performance (and if you want to start looking at the
 numerous good things present in Haskell. Haskell language and its
 implementation contains tens of good ideas).

 Bye, bearophile

I agree in spirit but only weakly. Cute and complexity corrupt examples 
such as the exponential Fibonacci, the linear space factorial, and the 
quicksort that is not quick and not quicksort have long misrepresented 
what's good about functional programming.

Andrei

Feb 01 2011

D Programming

C/C++ Programming

Other

digitalmars.D - How many HOFs in Phobos?