digitalmars.D - Does D have "structural sharing" of immutable collections?

Chris Dew (8/8) May 23 2012 Just some beginner questions...

bearophile (11/16) May 23 2012 Currently Phobos doesn't have a Map (there are built-in

Chris Dew (3/21) May 23 2012 Thanks for your answer,
Roman D. Boiko (9/27) May 23 2012 I need some immutable collections for my D Compiler Tools

simendsjo (3/7) May 23 2012 http://dlang.org/phobos/std_container.html#SList

Dmitry Olshansky (5/12) May 23 2012 More or less fine... in git version.
simendsjo (2/10) May 23 2012 And http://dsource.org/projects/dcollections

Roman D. Boiko (2/17) May 23 2012 Those aren't immutable either :(

Steven Schveighoffer (17/34) May 23 2012 There is a very good reason however :)

Roman D. Boiko (8/13) May 23 2012 I don't need to invent here, and it is definitely feasible.

Steven Schveighoffer (8/18) May 23 2012 I looked up how it could be done, the one thing I was not sure of was th...

Roman D. Boiko (17/25) May 23 2012 My problem is the following:

Craig Dillabaugh (9/36) May 24 2012 I am not sure if I entirely understand your problem, but would a

Roman D. Boiko (6/11) May 24 2012 Yes, it would. Actually, what I described is how to use RBT as

Roman D. Boiko (2/12) May 23 2012 Those are not immutable, unfortunately

Stewart Gordon (9/11) May 23 2012

Roman D. Boiko (7/20) May 23 2012 Immutable collections in most cases have the same performance

Roman D. Boiko (5/11) May 23 2012 I've posted some links with information on this topic:

Andrei Alexandrescu (8/19) May 23 2012 Okasaki put together the ultimate work on immutable data structures. I

Roman D. Boiko (10/18) May 23 2012 I would appreciate critics and feedback once I start. The most

bearophile (11/14) May 23 2012 If you plan in creating a start of purely functional collections

Roman D. Boiko (2/16) May 23 2012 Thanks, I will!

Jonathan M Davis (7/35) May 23 2012 In my personal experience, that's not true at all. I've seen _huge_

Roman D. Boiko (6/16) May 23 2012 I expect problems with eliminating tail calls (especially for

Jonathan M Davis (18/36) May 23 2012 As part of my thesis work, I wrote a program which was counting possible...

Roman D. Boiko (2/33) May 23 2012 Is source code available?

Paulo Pinto (12/54) May 24 2012 While I am an Haskell newbie, I think most of those problems are

David Nadlinger (7/10) May 23 2012 An array does not facilitate sharing common subsets between

Andrei Alexandrescu (6/14) May 23 2012 Yah, FP doesn't like arrays and immutable containers essentially eschew

bearophile (24/28) May 23 2012 In the Chapel/X10 worlds I have read about various strategies to

"Chris Dew" <cmsdew gmail.com> writes:

Just some beginner questions...

Does D have "structural sharing" of its immutable collections?

i.e. Can I make a copy of an immutable Map or List collection 
with an extra (or mutated) member, and will the returned (new) 
collection share most of it's structure with the earlier 
collection.

Thanks,

Chris.

May 23 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Chris Dew:

 Does D have "structural sharing" of its immutable collections?

 i.e. Can I make a copy of an immutable Map or List collection 
 with an extra (or mutated) member, and will the returned (new) 
 collection share most of it's structure with the earlier 
 collection.

Currently Phobos doesn't have a Map (there are built-in 
associative arrays), and there isn't a true List (there is a 
linked list, that so far I have not used).
Currently I think nothing in D works as you ask, it doesn't even 
have "immutable collections" like a Finger Tree or something.
But writing such collections is well within D type system 
capabilities, so I think eventually they will be present in 
Phobos if there is enough need.

Bye,
bearophile

May 23 2012

"Chris Dew" <cmsdew gmail.com> writes:

Thanks for your answer,

Chris.

On Wednesday, 23 May 2012 at 14:35:31 UTC, bearophile wrote:
 Chris Dew:

 Does D have "structural sharing" of its immutable collections?

 i.e. Can I make a copy of an immutable Map or List collection 
 with an extra (or mutated) member, and will the returned (new) 
 collection share most of it's structure with the earlier 
 collection.

 Currently Phobos doesn't have a Map (there are built-in 
 associative arrays), and there isn't a true List (there is a 
 linked list, that so far I have not used).
 Currently I think nothing in D works as you ask, it doesn't 
 even have "immutable collections" like a Finger Tree or 
 something.
 But writing such collections is well within D type system 
 capabilities, so I think eventually they will be present in 
 Phobos if there is enough need.

 Bye,
 bearophile

May 23 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Wednesday, 23 May 2012 at 14:35:31 UTC, bearophile wrote:
 Chris Dew:

 Does D have "structural sharing" of its immutable collections?

 i.e. Can I make a copy of an immutable Map or List collection 
 with an extra (or mutated) member, and will the returned (new) 
 collection share most of it's structure with the earlier 
 collection.

 Currently Phobos doesn't have a Map (there are built-in 
 associative arrays), and there isn't a true List (there is a 
 linked list, that so far I have not used).
 Currently I think nothing in D works as you ask, it doesn't 
 even have "immutable collections" like a Finger Tree or 
 something.
 But writing such collections is well within D type system 
 capabilities, so I think eventually they will be present in 
 Phobos if there is enough need.

 Bye,
 bearophile

I need some immutable collections for my D Compiler Tools 
project, namely linked list, red-black tree, and possibly some 
others.

So I'm going to create them for my use, and later generalize. 
I've created a project on GitHub, but didn't implement anything 
useful yet.

Some more information is available at 
http://d-coding.com/2012/05/21/functional-data-structures.html

May 23 2012

simendsjo <simendsjo gmail.com> writes:

On Wed, 23 May 2012 17:05:21 +0200, Roman D. Boiko <rb d-coding.com> wrote:

 I need some immutable collections for my D Compiler Tools project,  
 namely linked list, red-black tree, and possibly some others.
  So I'm going to create them for my use, and later generalize. I've  
 created a project on GitHub, but didn't implement anything useful yet.

http://dlang.org/phobos/std_container.html#SList
http://dlang.org/phobos/std_container.html#RedBlackTree

May 23 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 23.05.2012 22:18, simendsjo wrote:
 On Wed, 23 May 2012 17:05:21 +0200, Roman D. Boiko <rb d-coding.com> wrote:

 I need some immutable collections for my D Compiler Tools project,
 namely linked list, red-black tree, and possibly some others.
 So I'm going to create them for my use, and later generalize. I've
 created a project on GitHub, but didn't implement anything useful yet.

 http://dlang.org/phobos/std_container.html#SList

Awful junk.

 http://dlang.org/phobos/std_container.html#RedBlackTree

More or less fine... in git version.

-- 
Dmitry Olshansky

May 23 2012

simendsjo <simendsjo gmail.com> writes:

On Wed, 23 May 2012 20:18:54 +0200, simendsjo <simendsjo gmail.com> wrote:

 On Wed, 23 May 2012 17:05:21 +0200, Roman D. Boiko <rb d-coding.com>  
 wrote:

 I need some immutable collections for my D Compiler Tools project,  
 namely linked list, red-black tree, and possibly some others.
  So I'm going to create them for my use, and later generalize. I've  
 created a project on GitHub, but didn't implement anything useful yet.

 http://dlang.org/phobos/std_container.html#SList
 http://dlang.org/phobos/std_container.html#RedBlackTree

And http://dsource.org/projects/dcollections

May 23 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Wednesday, 23 May 2012 at 18:24:28 UTC, simendsjo wrote:
 On Wed, 23 May 2012 20:18:54 +0200, simendsjo 
 <simendsjo gmail.com> wrote:

 On Wed, 23 May 2012 17:05:21 +0200, Roman D. Boiko 
 <rb d-coding.com> wrote:

 I need some immutable collections for my D Compiler Tools 
 project, namely linked list, red-black tree, and possibly 
 some others.
 So I'm going to create them for my use, and later generalize. 
 I've created a project on GitHub, but didn't implement 
 anything useful yet.

 http://dlang.org/phobos/std_container.html#SList
 http://dlang.org/phobos/std_container.html#RedBlackTree

 And http://dsource.org/projects/dcollections

Those aren't immutable either :(

May 23 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 23 May 2012 14:32:57 -0400, Roman D. Boiko <rb d-coding.com> wrote:

 On Wednesday, 23 May 2012 at 18:24:28 UTC, simendsjo wrote:
 On Wed, 23 May 2012 20:18:54 +0200, simendsjo <simendsjo gmail.com>  
 wrote:

 On Wed, 23 May 2012 17:05:21 +0200, Roman D. Boiko <rb d-coding.com>  
 wrote:

 I need some immutable collections for my D Compiler Tools project,  
 namely linked list, red-black tree, and possibly some others.
 So I'm going to create them for my use, and later generalize. I've  
 created a project on GitHub, but didn't implement anything useful yet.

 http://dlang.org/phobos/std_container.html#SList
 http://dlang.org/phobos/std_container.html#RedBlackTree

 And http://dsource.org/projects/dcollections

 Those aren't immutable either :(

There is a very good reason however :)

I need tail-immutable and tail-const structs to be able to avoid "code  
pasting hell".

Until then, I feel very unmotivated to do anything regarding const and  
immutability.

FWIW, std.container.RedBlackTree is derived from dcollections.RBTree.

Regarding sharing structure, the class type would have to be specifically  
written to use this.  It's not just slapping on immutable tags.

For at least RBTree, and my implementation of Hash, this would not be  
feasible.

Certainly ArrayList, LinkedList, and probably Deque, they could share  
structure

If you need a sorted tree structure that supports sharing immutable state,  
you likely have to find a different algorithm than redblack, since adding  
nodes can modify the tree structure via rotates.

-Steve

May 23 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Wednesday, 23 May 2012 at 19:51:02 UTC, Steven Schveighoffer 
wrote:
 If you need a sorted tree structure that supports sharing 
 immutable state, you likely have to find a different algorithm 
 than redblack, since adding nodes can modify the tree structure 
 via rotates.

 -Steve

I don't need to invent here, and it is definitely feasible. 
Okasaki provided efficient algorithm for inserting nodes, and, 
IIRC, rotations (not for deleting, though). But OCaml doesn't map 
to D easily (neither does Haskel, I think).

A side note, I've learned D and switched to Linux in February 
'12, so I struggle with newbie problems regularly...

May 23 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 23 May 2012 17:54:42 -0400, Roman D. Boiko <rb d-coding.com> wrote:

 On Wednesday, 23 May 2012 at 19:51:02 UTC, Steven Schveighoffer wrote:
 If you need a sorted tree structure that supports sharing immutable  
 state, you likely have to find a different algorithm than redblack,  
 since adding nodes can modify the tree structure via rotates.

 -Steve

 I don't need to invent here, and it is definitely feasible. Okasaki  
 provided efficient algorithm for inserting nodes, and, IIRC, rotations  
 (not for deleting, though). But OCaml doesn't map to D easily (neither  
 does Haskel, I think).

I looked up how it could be done, the one thing I was not sure of was the  
parent node.  If you can have multiple references to the same subtree, how  
do you keep track of the parents.  But if you only are concerned about the  
ancestry of a single node, you can dynamically determine the line in  
O(lgn) time, and use that for the running of the redblack algorithm.

I'm not sure how efficient it would be.

-Steve

May 23 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Wednesday, 23 May 2012 at 22:25:35 UTC, Steven Schveighoffer 
wrote:
 I looked up how it could be done, the one thing I was not sure 
 of was the parent node.  If you can have multiple references to 
 the same subtree, how do you keep track of the parents.  But if 
 you only are concerned about the ancestry of a single node, you 
 can dynamically determine the line in O(lgn) time, and use that 
 for the running of the redblack algorithm.

 I'm not sure how efficient it would be.

 -Steve

My problem is the following:

n elements in some given order, each has an associated value; I 
need to retrieve elements for which the sum of values associated 
with previous elements lies in a given ragne; this should work 
efficiently when elements are added or removed, and history 
should be preserved.

Storing elements as tree leafs and sums of children values in 
other nodes gives me this easily.

Lookup price is 0(log N) number comparisons, which should be very 
fast. Insert / delete / rotate are also logarithmic, but I don't 
know the constant factors.

No need to retrieve parents.

So far it seems like RBT is perfect for me (if cache friendly 
implementation will be possible. Hopefully it will, because nodes 
are small and I can preallocate memory for them).

May 23 2012

"Craig Dillabaugh" <cdillaba cg.scs.carleton.ca> writes:

On Wednesday, 23 May 2012 at 22:39:33 UTC, Roman D. Boiko wrote:
 On Wednesday, 23 May 2012 at 22:25:35 UTC, Steven Schveighoffer 
 wrote:
 I looked up how it could be done, the one thing I was not sure 
 of was the parent node.  If you can have multiple references 
 to the same subtree, how do you keep track of the parents.  
 But if you only are concerned about the ancestry of a single 
 node, you can dynamically determine the line in O(lgn) time, 
 and use that for the running of the redblack algorithm.

 I'm not sure how efficient it would be.

 -Steve

 My problem is the following:

 n elements in some given order, each has an associated value; I 
 need to retrieve elements for which the sum of values 
 associated with previous elements lies in a given ragne; this 
 should work efficiently when elements are added or removed, and 
 history should be preserved.

 Storing elements as tree leafs and sums of children values in 
 other nodes gives me this easily.

 Lookup price is 0(log N) number comparisons, which should be 
 very fast. Insert / delete / rotate are also logarithmic, but I 
 don't know the constant factors.

 No need to retrieve parents.

 So far it seems like RBT is perfect for me (if cache friendly 
 implementation will be possible. Hopefully it will, because 
 nodes are small and I can preallocate memory for them).

I am not sure if I entirely understand your problem, but would a
persistent search tree work? See:

http://www.sciencedirect.com/science/article/pii/0022000089900342

I have a small write up on them from a course I took years ago:

http://cg.scs.carleton.ca/~cdillaba/comp5008/sarnak.html

Regards,
Craig

  -Steve

May 24 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Thursday, 24 May 2012 at 13:15:30 UTC, Craig Dillabaugh wrote:
 I am not sure if I entirely understand your problem, but would a
 persistent search tree work? See:

 http://www.sciencedirect.com/science/article/pii/0022000089900342

 I have a small write up on them from a course I took years ago:

 http://cg.scs.carleton.ca/~cdillaba/comp5008/sarnak.html

Yes, it would. Actually, what I described is how to use RBT as
search tree. RBT is selected because it bounds number of levels
to the optimum, and thus gives log N worst case for all
operations.

I'll take a look at the links, thanks!

May 24 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Wednesday, 23 May 2012 at 18:18:55 UTC, simendsjo wrote:
 On Wed, 23 May 2012 17:05:21 +0200, Roman D. Boiko 
 <rb d-coding.com> wrote:

 I need some immutable collections for my D Compiler Tools 
 project, namely linked list, red-black tree, and possibly some 
 others.
 So I'm going to create them for my use, and later generalize. 
 I've created a project on GitHub, but didn't implement 
 anything useful yet.

 http://dlang.org/phobos/std_container.html#SList
 http://dlang.org/phobos/std_container.html#RedBlackTree

Those are not immutable, unfortunately

May 23 2012

Stewart Gordon <smjg_1998 yahoo.com> writes:

On 23/05/2012 16:05, Roman D. Boiko wrote:
<snip>
 I need some immutable collections for my D Compiler Tools project, namely
linked list,
 red-black tree, and possibly some others.

<snip>

What's the point of an immutable red-black tree?

The whole point of a red-black tree is to facilitate fast dynamic addition,
removal and 
lookup of elements.

When the tree is immutable, only lookup speed is of any real relevance, so you
might as 
well create a perfectly balanced binary tree.  Or even better, an array.

Stewart.

May 23 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Wednesday, 23 May 2012 at 18:39:02 UTC, Stewart Gordon wrote:
 On 23/05/2012 16:05, Roman D. Boiko wrote:
 <snip>
 I need some immutable collections for my D Compiler Tools 
 project, namely linked list,
 red-black tree, and possibly some others.

 <snip>

 What's the point of an immutable red-black tree?

 The whole point of a red-black tree is to facilitate fast 
 dynamic addition, removal and lookup of elements.

 When the tree is immutable, only lookup speed is of any real 
 relevance, so you might as well create a perfectly balanced 
 binary tree.  Or even better, an array.

 Stewart.

Immutable collections in most cases have the same performance 
characteristics as mutable ones. Space consumption may be higher, 
but not necessarily.

Instead of mutating a collection, a new one is returned each time 
you add or remove an element. It shares most nodes with a 
previous one.

May 23 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Wednesday, 23 May 2012 at 18:51:32 UTC, Roman D. Boiko wrote:
 Immutable collections in most cases have the same performance 
 characteristics as mutable ones. Space consumption may be 
 higher, but not necessarily.

 Instead of mutating a collection, a new one is returned each 
 time you add or remove an element. It shares most nodes with a 
 previous one.

I've posted some links with information on this topic: 
http://d-coding.com/2012/05/21/functional-data-structures.html

It is also easy to find other sources on this topic for those who 
are interested.

May 23 2012

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/23/12 1:53 PM, Roman D. Boiko wrote:
 On Wednesday, 23 May 2012 at 18:51:32 UTC, Roman D. Boiko wrote:
 Immutable collections in most cases have the same performance
 characteristics as mutable ones. Space consumption may be higher, but
 not necessarily.

 Instead of mutating a collection, a new one is returned each time you
 add or remove an element. It shares most nodes with a previous one.

 I've posted some links with information on this topic:
 http://d-coding.com/2012/05/21/functional-data-structures.html

 It is also easy to find other sources on this topic for those who are
 interested.

Okasaki put together the ultimate work on immutable data structures. I 
have plans to add immutable collections to D containers once I sort out 
the allocators matter. Unfortunately, I find myself gasping for time so 
the news that you are working on such are awesome. I suggest you to 
absorb the approach taken by ranges and (however incomplete it looks at 
the moment) std.container, and build on such.

Andrei

May 23 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Wednesday, 23 May 2012 at 21:13:48 UTC, Andrei Alexandrescu 
wrote:
 Okasaki put together the ultimate work on immutable data 
 structures. I have plans to add immutable collections to D 
 containers once I sort out the allocators matter. 
 Unfortunately, I find myself gasping for time so the news that 
 you are working on such are awesome. I suggest you to absorb 
 the approach taken by ranges and (however incomplete it looks 
 at the moment) std.container, and build on such.

 Andrei

I would appreciate critics and feedback once I start. The most 
difficult part is conceptual. Range primitives don't fit. I'm 
trying to invent something as good as ranges are for mutable 
data, but I don't feel that I'm qualified enough to get it right 
without help.

The end goal is to have something as easy to work with as 

course, efficient.

May 23 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Roman D. Boiko:

 The end goal is to have something as easy to work with as 

 course, efficient.

If you plan in creating a start of purely functional collections 
library then I suggest you to also take a look at Clojure and 
various Haskell collections. Clojure is very based on persistent 
purely functional data structures, used in modern ways. And in 
the Haskell tribe there are some very smart persons that have 
worked on those things for fifteen years. So spending one extra 
week reading and looking at those two worlds may help you a lot 
later, and give you many good/better ideas.

Bye,
bearophile

May 23 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Wednesday, 23 May 2012 at 23:51:45 UTC, bearophile wrote:
 Roman D. Boiko:

 The end goal is to have something as easy to work with as 

 course, efficient.

 If you plan in creating a start of purely functional 
 collections library then I suggest you to also take a look at 
 Clojure and various Haskell collections. Clojure is very based 
 on persistent purely functional data structures, used in modern 
 ways. And in the Haskell tribe there are some very smart 
 persons that have worked on those things for fifteen years. So 
 spending one extra week reading and looking at those two worlds 
 may help you a lot later, and give you many good/better ideas.

 Bye,
 bearophile

Thanks, I will!

May 23 2012

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Wednesday, May 23, 2012 20:51:31 Roman D. Boiko wrote:
 On Wednesday, 23 May 2012 at 18:39:02 UTC, Stewart Gordon wrote:
 On 23/05/2012 16:05, Roman D. Boiko wrote:
 <snip>
 
 I need some immutable collections for my D Compiler Tools
 project, namely linked list,
 red-black tree, and possibly some others.

 
 <snip>
 
 What's the point of an immutable red-black tree?
 
 The whole point of a red-black tree is to facilitate fast
 dynamic addition, removal and lookup of elements.
 
 When the tree is immutable, only lookup speed is of any real
 relevance, so you might as well create a perfectly balanced
 binary tree. Or even better, an array.
 
 Stewart.

 
 Immutable collections in most cases have the same performance
 characteristics as mutable ones. Space consumption may be higher,
 but not necessarily.
 
 Instead of mutating a collection, a new one is returned each time
 you add or remove an element. It shares most nodes with a
 previous one.

In my personal experience, that's not true at all. I've seen _huge_ 
performance problems in Haskell when using a map. There may be cases where an 
immutable container may not pose large performance problems, but I would be 
_very_ wary of using immutable containers, and _very_ careful with them when I 
did.

- Jonathan M Davis

May 23 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Wednesday, 23 May 2012 at 21:12:04 UTC, Jonathan M Davis wrote:
 In my personal experience, that's not true at all. I've seen 
 _huge_
 performance problems in Haskell when using a map. There may be 
 cases where an
 immutable container may not pose large performance problems, 
 but I would be
 _very_ wary of using immutable containers, and _very_ careful 
 with them when I
 did.

 - Jonathan M Davis

I expect problems with eliminating tail calls (especially for 
mutually recursive functions), but cannot find any other reason, 
theoretical or implementation, that would necessarily make it 
execute slowly in D. I think that cache friendliness is possible 
to achieve, at least in my use cases. What else could go wrong?

May 23 2012

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Wednesday, May 23, 2012 23:58:42 Roman D. Boiko wrote:
 On Wednesday, 23 May 2012 at 21:12:04 UTC, Jonathan M Davis wrote:
 In my personal experience, that's not true at all. I've seen
 _huge_
 performance problems in Haskell when using a map. There may be
 cases where an
 immutable container may not pose large performance problems,
 but I would be
 _very_ wary of using immutable containers, and _very_ careful
 with them when I
 did.
 
 - Jonathan M Davis

 
 I expect problems with eliminating tail calls (especially for
 mutually recursive functions), but cannot find any other reason,
 theoretical or implementation, that would necessarily make it
 execute slowly in D. I think that cache friendliness is possible
 to achieve, at least in my use cases. What else could go wrong?

As part of my thesis work, I wrote a program which was counting possible 
tokens in code (as part of an AI attempting to learn about a programming 
language), which required a map of tokens to the number of times that they'd 
been seen. Because I had previously been doing stuff in Haskell for my thesis, 
I continued to use Haskell for that portion, and it was a _huge_ mistake, 
because the performance was _terrible_ - as in it could only process a few 
files in a day kind of terrible. I ultimately had to switch over to another 
language to get it to work (it ended up being Java, but I could have used a 
variety of other languages).

Maybe if I had simply been adding elements to the map, it wouldn't have been 
anywhere near as bad, but since I had to mutate them (and both the map and its 
elments were technically immutable), that become _insanely_ expensive.

So, it's quite possible that you have a use case where using immutable 
containers works really well and isn't a problem, but in what I've dealt with 
personally, I've found it to be a huge performance problem and so am very 
leery of immutable containers.

- Jonathan M Davis

May 23 2012

"Roman D. Boiko" <rb d-coding.com> writes:

On Thursday, 24 May 2012 at 01:00:52 UTC, Jonathan M Davis wrote:
 As part of my thesis work, I wrote a program which was counting 
 possible
 tokens in code (as part of an AI attempting to learn about a 
 programming
 language), which required a map of tokens to the number of 
 times that they'd
 been seen. Because I had previously been doing stuff in Haskell 
 for my thesis,
 I continued to use Haskell for that portion, and it was a 
 _huge_ mistake,
 because the performance was _terrible_ - as in it could only 
 process a few
 files in a day kind of terrible. I ultimately had to switch 
 over to another
 language to get it to work (it ended up being Java, but I could 
 have used a
 variety of other languages).

 Maybe if I had simply been adding elements to the map, it 
 wouldn't have been
 anywhere near as bad, but since I had to mutate them (and both 
 the map and its
 elments were technically immutable), that become _insanely_ 
 expensive.

 So, it's quite possible that you have a use case where using 
 immutable
 containers works really well and isn't a problem, but in what 
 I've dealt with
 personally, I've found it to be a huge performance problem and 
 so am very
 leery of immutable containers.

 - Jonathan M Davis

Is source code available?

May 23 2012

"Paulo Pinto" <pjmlp progtools.org> writes:

On Wednesday, 23 May 2012 at 21:12:04 UTC, Jonathan M Davis wrote:
 On Wednesday, May 23, 2012 20:51:31 Roman D. Boiko wrote:
 On Wednesday, 23 May 2012 at 18:39:02 UTC, Stewart Gordon 
 wrote:
 On 23/05/2012 16:05, Roman D. Boiko wrote:
 <snip>
 
 I need some immutable collections for my D Compiler Tools
 project, namely linked list,
 red-black tree, and possibly some others.

 
 <snip>
 
 What's the point of an immutable red-black tree?
 
 The whole point of a red-black tree is to facilitate fast
 dynamic addition, removal and lookup of elements.
 
 When the tree is immutable, only lookup speed is of any real
 relevance, so you might as well create a perfectly balanced
 binary tree. Or even better, an array.
 
 Stewart.

 
 Immutable collections in most cases have the same performance
 characteristics as mutable ones. Space consumption may be 
 higher,
 but not necessarily.
 
 Instead of mutating a collection, a new one is returned each 
 time
 you add or remove an element. It shares most nodes with a
 previous one.

 In my personal experience, that's not true at all. I've seen 
 _huge_
 performance problems in Haskell when using a map. There may be 
 cases where an
 immutable container may not pose large performance problems, 
 but I would be
 _very_ wary of using immutable containers, and _very_ careful 
 with them when I
 did.

 - Jonathan M Davis


While I am an Haskell newbie, I think most of those problems are
related to how hard it is for many developers to learn how to 
properly
optimize Haskell code.

The straightforward way to do some things is not the most 
performant, and
some optimizations require fine tuning of strict/lazy semantics 
on the
data structures manipulations.

--
Paulo

May 24 2012

"David Nadlinger" <see klickverbot.at> writes:

On Wednesday, 23 May 2012 at 18:39:02 UTC, Stewart Gordon wrote:
 When the tree is immutable, only lookup speed is of any real 
 relevance, so you might as well create a perfectly balanced 
 binary tree.  Or even better, an array.

An array does not facilitate sharing common subsets between 
containers, which is usually what you are aiming for when 
designing immutable containers – the idea is to get away with 
immutability performance-wise because you don't have to copy much 
on the common mutation operations.

David

May 23 2012

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/23/12 1:54 PM, David Nadlinger wrote:
 On Wednesday, 23 May 2012 at 18:39:02 UTC, Stewart Gordon wrote:
 When the tree is immutable, only lookup speed is of any real
 relevance, so you might as well create a perfectly balanced binary
 tree. Or even better, an array.

 An array does not facilitate sharing common subsets between containers,
 which is usually what you are aiming for when designing immutable
 containers – the idea is to get away with immutability performance-wise
 because you don't have to copy much on the common mutation operations.

Yah, FP doesn't like arrays and immutable containers essentially eschew 
arrays altogether. (The most surprising thing to me is that the FP 
community generally fails to acknowledge that as a problem.)

Immutable data structures use trees pervasively for random-access data.


Andrei

May 23 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:

 Yah, FP doesn't like arrays and immutable containers 
 essentially eschew arrays altogether.

In the Chapel/X10 worlds I have read about various strategies to 
use mutable arrays too in parallel/concurrent situations. 
Probably some of those ideas will be quite useful in D too.


 (The most surprising thing to me is that the FP community
 generally fails to acknowledge that as a problem.)

Again and again I have seen how modern CPUs like linear scans on 
(small) arrays of values. Sometimes the performance is 
surprising, and beats almost everything else, including binary 
searches on the same small array.

This sometimes means that using a single core on such mutable 
arrays from C/C++/D leads to faster code (and far less 
electricity consumed) than using 4-8 cores on less optimal data 
structures (like structures with many pointers).

This is also why your idea of "small string optimization" was 
probably leading to significantly higher performance, and why I 
think the built-in D AAs will need a "small associative array" 
optimization (that even Python dicts have, for a number of items 
less than 9), BigInts will need a small value (no heap) 
optimization, and so on for other Phobos data structures 
(including lists!). Often such optimizations means just storing 
the few values in a small array instead of a fancier data 
structure (like a list), so adding such optimizations is usually 
not hard.

Bye,
bearophile

May 23 2012

D Programming

C/C++ Programming

Other

digitalmars.D - Does D have "structural sharing" of immutable collections?