www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Functional programming, immutablility

reply bearophile <bearophileHUGS lycos.com> writes:
This is an introduction to the Clojure language:
http://java.ociweb.com/javasig/knowledgebase/2009-05/ClojureSlides.pdf

One of the things it contains is a "shopping list" of this used in Functional
Programming:

Pure Functions:
- produce results that only depend on inputs, not any global state
- do not have side effects such as changing global state, file I/O or database
updates

First Class Functions:
- can be held in variables
- can be passed to and returned from other functions

Higher Order Functions:
- functions that do one or both of these:
- accept other functions as arguments and execute them zero or more times
- return another function
[A possible syntax idea: if f1 e f2 are functions, f1*f2 may produce a third
function that is f1(f2(x)) ]

Closures:
- special functions that retain access to variables that were in their scope
when the closure was created

Partial Application:
- ability to create new functions from existing ones that take fewer arguments

Currying:
- transforming a function of n arguments into a chain of n one argument
functions

Continuations:
- ability to save execution state and return to it later 

Immutable Data:
- after data is created, it can't be changed
- new versions with modifications are created instead
- some FP languages make concessions, but immutable is the default

Lazy Evaluation
- ability to delay evaluation of functions until their result is needed
- useful for optimization and implementing in?nite data structures

Monads:
- manage sequences of operations to provide control fow, null handling,
exception handling, concurrency and more

Pattern Matching:
- ability to specify branching in a compact way based on matching a value
against a set of alternatives


D2 language already has many of those features, monads can be implemented (even
if the language doesn't offer syntax to manage them). Pattern Matching is
missing (but this isn't too much bad: adding pattern matching to D2 may
increase its complexity a bit too much). I think continuations are missing.
Lazy Evaluation is present but just a bit.
Overall I think the current features of D2 may allow functional-style
programming, but so far I have not seen a good functional programmer try to use
D2 in such way seriously.


Later I have found a blog post by Matias Giovannini, "Is Immutable Data Slow?",
that shows a very simple benchmark:
http://alaska-kamtchatka.blogspot.com/2009/05/is-immutable-data-slow.html

With a reddit discussion too:
http://www.reddit.com/r/programming/comments/8n9r1/is_immutable_data_slow/

So I have adapted it to D1 with Tango (the Phobos version is very close), I
have also written a D1 version with structs. Below you can see the resulting
timings and the code of the three programs.
(I have not created a version for D2 with "immutable" yet, but I have created
two extra versions that use malloc to allocate structs, their timings aren't
interesting. I have also written a version that passes functions as template
alias, to avoid the overhead caused by the time0/time1 function pointers, but
the timings don't change much).


With LDC (-O5, -release, -inline) the inner loop for the mutable record version
looks almost optimal for this CPU:

.LBB7_8:	# forbody.i28
	addsd	.LCPI7_1, %xmm0
	movsd	%xmm0, 16(%eax)
	incl	%ecx
	cmpl	$10000000, %ecx
	jne	.LBB7_8	# forbody.i28


Where LCPI7_1 = double value: 1.000000e+00
But I don't know why that movsd %xmm0,16(%eax) is there, it may be moved
outside the loop.
DMD doesn't inline well enough here, and the result ~3 times slower.

Currently D1 seems unfit to be used as Java, where people create and destroy
lot of small objects very frequently.
Functional-style code uses immutable data/objects, but currently the D1 GC may
make such kind of programming too not efficient enough.
If D2 wants to be fit for functional programming, then I think a GC that is
more efficient with immutable data structures is more important than pattern
matching and continuations. Possible solutions were to have two GCs, and/or (if
not already done) to keep inside it a stack-like memory pool of small objects.

Bye,
bearophile

-------------------------------

Timings, updates/s:

Java -server, DMD on WinXP, iters=100_000_000:
  Immutable record:  65_963_060 
  Mutable record:   714_285_714 

Java -server, on Pubuntu, iters=100_000_000:
  Immutable record:  48_520_135 
  Mutable record:   634_920_634  (13.1 X)


D with classes, DMD on WinXP, iters=10_000_000:
  Immutable record:   3_396_392 
  Mutable record:   221_999_829 

D with classes, LDC on Pubuntu, iters=10_000_000/500_000_000:
  Immutable record:   3_773_527 
  Mutable record:   631_569_640  (167 X)


D with structs, DMD on WinXP, iters=10_000_000:
  Immutable record:   7_245_499 
  Mutable record:   222_205_458 

D with structs, LDC on Pubuntu, iters=10_000_000/500_000_000:
  Immutable record:   4_237_223
  Mutable record:   631_569_640 

(The performance of Pubuntu is a bit lower than native Windows, but the results
are useful still.)


CPU Core 2, 2 GHz, 2 GB RAM, WinXP sp2

LDC (Sat May  9 01:22:14 2009)
-O5 -release -inline

DMD 1.042
-O -release -inline

javac 1.6.0_12
java version "1.6.0_07"
Java(TM) SE Runtime Environment (build 1.6.0_07-b06)
Java HotSpot(TM) Server VM (build 11.2-b01, mixed mode)

-------------------------------

// Java code

public final class MutTest {
    private static final class Person0 {
        public final String name;
        public final int age;
        public final double balance;

        public Person0(String name, int age, double balance) {
            this.name = name;
            this.age = age;
            this.balance = balance;
        }

        public Person0 deposit(double amount) {
            return new Person0(this.name, this.age, this.balance + amount);
        }
    }

    private static final class Person1 {
        public String name;
        public int age;
        public double balance;

        public Person1(String name, int age, double balance) {
            this.name = name;
            this.age = age;
            this.balance = balance;
        }

        public void deposit(double amount) {
            this.balance += amount;
        }
    }

    private interface Test {
        double test(int iters);
    }

    private static final Test TEST0 = new Test() {
        public double test(int iters) {
            Person0 person = new Person0("John Doe", 19, 100.0);
            for (int i = 0; i != iters; i++) {
                person = person.deposit(1.0);
            }
            return person.balance;
        }
    };

    private static final Test TEST1 = new Test() {
        public double test(int iters) {
            Person1 person = new Person1("John Doe", 19, 100.0);
            for (int i = 0; i != iters; i++) {
                person.deposit(1.0);
            }
            return person.balance;
        }
    };

    private static double test(int times, Test test, int iters) {
        long best = Long.MAX_VALUE;
        double balance;
        for (int i = 0; i != times; i++) {
            long now = System.currentTimeMillis();
            balance = test.test(iters);
            now = System.currentTimeMillis() - now;
            if (best > now)
                best = now;
        }
        return (double) iters / ((double) best / 1000.0);
    }

    public static void main(String[] args) {
        final int iters = 100000000;
        System.out.printf("Immutable record: %f updates/s\n", test(5, TEST0,
iters));
        System.out.printf("Mutable record:   %f updates/s\n", test(5, TEST1,
iters));
    }
}

-----------------------

// D1 code with classes for Tango

import tango.stdc.stdio: printf;
import tango.time.StopWatch: StopWatch;

StopWatch elapsed;

final class Person0 {
    final char[] name;
    final double balance;
    final int age;

    this(char[] name, int age, double balance) {
        this.name = name;
        this.age = age;
        this.balance = balance;
    }

    final Person0 deposit(double amount) {
        return new Person0(this.name, this.age, this.balance + amount);
    }
}

final class Person1 {
    char[] name;
    double balance;
    int age;

    this(char[] name, int age, double balance) {
        this.name = name;
        this.age = age;
        this.balance = balance;
    }

    final void deposit(double amount) {
        this.balance += amount;
    }
}

public double test0(int iters) {
    auto person = new Person0("John Doe", 19, 100.0);
    for (int i = 0; i != iters; i++)
        person = person.deposit(1.0);
    return person.balance;
}

public double test1(int iters) {
    auto person = new Person1("John Doe", 19, 100.0);
    for (int i = 0; i != iters; i++)
        person.deposit(1.0);
    return person.balance;
}

int test(TyFun)(int times, TyFun test, int iters) {
    double best = double.max;
    double balance;
    for (int i = 0; i != times; i++) {
        elapsed.start;
        balance = test(iters);
        double now = elapsed.stop;
        if (best > now)
            best = now;
    }
    return cast(int)(iters / best);
}

void main() {
    const int iters = 10_000_000;
    printf("Immutable record: %d updates/s\n", test(5, &test0, iters));
    printf("Mutable record:   %d updates/s\n", test(5, &test1, iters*60)); //
requires an higher iters!
}

-----------------------

// D1 code with structs for Tango

import tango.stdc.stdio: printf;
import tango.time.StopWatch: StopWatch;

StopWatch elapsed;

struct Person0 {
    final char[] name;
    final double balance;
    final int age;

    static Person0* opCall(char[] name, int age, double balance) {
        auto p = new Person0;
        p.name = name;
        p.age = age;
        p.balance = balance;
        return p;
    }

    Person0* deposit(double amount) {
        return Person0(this.name, this.age, this.balance + amount);
    }
}

struct Person1 {
    char[] name;
    double balance;
    int age;

    static Person1* opCall(char[] name, int age, double balance) {
        auto p = new Person1;
        p.name = name;
        p.age = age;
        p.balance = balance;
        return p;
    }

    void deposit(double amount) {
        this.balance += amount;
    }
}

public double test0(int iters) {
    auto person = Person0("John Doe", 19, 100.0);
    for (int i; i < iters; i++)
        person = person.deposit(1.0);
    return person.balance;
}

public double test1(int iters) {
    auto person = Person1("John Doe", 19, 100.0);
    for (int i; i < iters; i++)
        person.deposit(1.0);
    return person.balance;
}

int test(TyFun)(int times, TyFun test, int iters) {
    double best = double.max;
    double balance;
    for (int i = 0; i != times; i++) {
        elapsed.start;
        balance = test(iters);
        double now = elapsed.stop;
        if (best > now)
            best = now;
    }
    return cast(int)(iters / best);
}

void main() {
    const int iters = 10_000_000;
    printf("Immutable record: %d updates/s\n", test(5, &test0, iters));
    printf("Mutable record:   %d updates/s\n", test(5, &test1, iters*60)); //
requires an higher iters!
}

-----------------------
May 26 2009
parent Sean Kelly <sean invisibleduck.org> writes:
bearophile wrote:
 
 D2 language already has many of those features, monads can be implemented
(even if the language doesn't offer syntax to manage them). Pattern Matching is
missing (but this isn't too much bad: adding pattern matching to D2 may
increase its complexity a bit too much). I think continuations are missing.
Lazy Evaluation is present but just a bit.
 Overall I think the current features of D2 may allow functional-style
programming, but so far I have not seen a good functional programmer try to use
D2 in such way seriously.

I think fibers are D2's answer to continuations. They're a part of core.thread.
Jun 12 2009