digitalmars.D - Decision on container design

Andrei Alexandrescu (14/14) Jan 28 2011 Today after work I plan to start making one pass through std.container.

bearophile (7/7) Jan 28 2011 Andrei:
Michel Fortin (13/31) Jan 28 2011 Not my preferred choices (especially #1), but having containers in

Andrei Alexandrescu (10/36) Jan 28 2011 Well if you brought forth some strong argument I'm all ears. What I see

Michel Fortin (51/68) Jan 28 2011 We already argument this over and over in the past. First, I totally

bearophile (17/23) Jan 28 2011 Built-in AAs are currently broken and in need to be fixed:

spir (33/48) Jan 28 2011 Variation on the theme:

Tomek =?UTF-8?B?U293acWEc2tp?= (10/59) Jan 28 2011 Is there anything implementation specific in the outer struct that provi...

Michel Fortin (11/70) Jan 28 2011 You could provide an implementation-specific version of some functions

Tomek =?ISO-8859-2?Q?Sowi=F1ski?= (28/37) Jan 29 2011 by

Simen kjaeraas (27/57) Jan 29 2011 s

Andrei Alexandrescu (5/63) Feb 01 2011 I do something similar with RefCounted. There are problems - you need to...

Simen kjaeraas (40/43) Feb 01 2011 Static functions can safely be called. Hence, this template:

Denis Koroskin (28/91) Jan 28 2011 Unfortunately, this design has big issues:

Michel Fortin (26/56) Jan 28 2011 That's indeed a problem. I don't think it's a fatal flaw however, given

Tomek =?ISO-8859-2?Q?Sowi=F1ski?= (4/9) Jan 29 2011 Or take an output range.
Andrei Alexandrescu (15/67) Feb 01 2011 Yep, yep, I found myself wrestling with the same issues. All good

Michel Fortin (19/82) Feb 01 2011 But are you not just pushing the aggravation elsewhere? If I need a by

Andrei Alexandrescu (24/102) Feb 01 2011 If semantics are the primary concern, you could (and in fact Phobos

Jonathan M Davis (8/101) Feb 01 2011 Java implements containers as classes (not that it really has any other ...
bearophile (4/14) Feb 01 2011 I agree that most times a reference is better. This brings back the need...
Michel Fortin (18/22) Feb 01 2011 What exactly is "most of the time"? In C++, you pass containers by '&'

Simon Buerger (14/33) Feb 01 2011 Thats would not be "per-value" as most understand it.

Steven Schveighoffer (27/98) Feb 01 2011 I've been thinking of making Appender.Impl public or at least its own

Simon Buerger (8/117) Feb 01 2011 Just to note: the "correct" solution for the last problem is

Steven Schveighoffer (15/28) Feb 01 2011 This is not a "solution". I cannot enforce that someone uses foreach wi...

Steven Schveighoffer (16/40) Jan 31 2011 I see value in this idiom, and I have ideas on how to solve this with

bearophile (8/10) Jan 28 2011 This page:

Tomek =?ISO-8859-2?Q?Sowi=F1ski?= (9/23) Jan 29 2011 ve indicated large percentage improvements

SHOO (14/30) Jan 29 2011 I think that it is a good policy.
Simon Buerger (27/36) Jan 29 2011 Not perfectly what I would like, but a reasonable choice, and most

bearophile (4/7) Jan 29 2011 This is good for a language like Python, or even Java, but for a system ...

Jim (2/9) Jan 29 2011 There could be an interface named Set and various implementations of it,...

Andrei Alexandrescu (4/13) Jan 29 2011 Thus approach has a few advantages but a lot of serious issues. I

KennyTM~ (9/48) Jan 29 2011 A 'deque' just mean a double-ended queue, not necessarily doubly linked
Jonathan M Davis (8/11) Jan 29 2011 It was decided previously that containers in std.container would specifi...
spir (7/16) Jan 30 2011 +++ on all those points
Steven Schveighoffer (3/40) Jan 31 2011 http://www.dsource.org/projects/dcollections

Simon Buerger (6/8) Jan 31 2011 Well, seems not bad on a quick look. But source is updated 2 years

David Nadlinger (5/6) Jan 31 2011 http://www.dsource.org/projects/dcollections/browser/branches/d2 – fou...
Steven Schveighoffer (14/22) Jan 31 2011 latest meaningful change was 4 months ago:

Simon Buerger (6/29) Jan 31 2011 Okay, my fault. Didnt realize you were the author, and the project is

Steven Schveighoffer (7/12) Jan 31 2011 Sorry about that, I have had at least 3 people be confused at why trunk ...

dsimcha (13/27) Jan 29 2011 In general I actually like this idea, but can we still have sealed

Andrei Alexandrescu (3/6) Jan 29 2011 Where is it?

dsimcha (2/8) Jan 29 2011

Andrei Alexandrescu (5/15) Jan 29 2011 Well to create a proposal you'd need to generate the documentation and

dsimcha (6/24) Jan 29 2011 I've uploaded the documentation to

Andrei Alexandrescu (40/45) Feb 01 2011 Sorry for being slow in continuing this thread.

spir (18/27) Feb 01 2011 An alternative, or a complementary approach, may be to delegate part of ...
dsimcha (10/55) Feb 02 2011 I see your point. Some of the stuff I've posted lately (RandAA,
dsimcha (5/50) Feb 02 2011 BTW, I thought the case for a sealed hash table with an optimized memory...

Andrei Alexandrescu (4/8) Feb 02 2011 Perhaps a very good argument could be made with some benchmarks that

Jonathan M Davis (14/32) Jan 29 2011 Overall, I think definitely support this approach. I definitely think th...

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Today after work I plan to start making one pass through std.container. 
After having thought of things for a long time, my conclusions are as 
follows:

1. Containers will be classes.

2. Most of the methods in existing containers will be final. It's up to 
the container to make a method final or not.

3. Containers and their ranges decide whether they give away references 
to their objects. Sealing is a great idea but it makes everybody's life 
too complicated. I'll defer sealing to future improvements in the 
language and/or the reflection subsystem.

4. Containers will assume that objects are cheap to copy so they won't 
worry about moving primitives.

Any showstoppers, please share.


Andrei

Jan 28 2011

bearophile <bearophileHUGS lycos.com> writes:

Andrei:

As far as I understand the implications, I agree with your conclusions.
Other people will be free to create an alternative D2 library like this one:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html
but a standard library can't be too much hard to use.

Bye,
bearophile

Jan 28 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-01-28 13:31:58 -0500, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Today after work I plan to start making one pass through std.container. 
 After having thought of things for a long time, my conclusions are as 
 follows:
 
 1. Containers will be classes.
 
 2. Most of the methods in existing containers will be final. It's up to 
 the container to make a method final or not.
 
 3. Containers and their ranges decide whether they give away references 
 to their objects. Sealing is a great idea but it makes everybody's life 
 too complicated. I'll defer sealing to future improvements in the 
 language and/or the reflection subsystem.
 
 4. Containers will assume that objects are cheap to copy so they won't 
 worry about moving primitives.
 
 Any showstoppers, please share.


Phobos will certainly be an improvement over not having them. So go 
ahead!


possible even if they fallback to (cheap) copy semantic when move isn't 
available. That way, if you have a type which is moveable but not 
copyable you can still put it in a container. Does that makes sense?


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jan 28 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/28/11 3:05 PM, Michel Fortin wrote:
 On 2011-01-28 13:31:58 -0500, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 Today after work I plan to start making one pass through
 std.container. After having thought of things for a long time, my
 conclusions are as follows:

 1. Containers will be classes.

 2. Most of the methods in existing containers will be final. It's up
 to the container to make a method final or not.

 3. Containers and their ranges decide whether they give away
 references to their objects. Sealing is a great idea but it makes
 everybody's life too complicated. I'll defer sealing to future
 improvements in the language and/or the reflection subsystem.

 4. Containers will assume that objects are cheap to copy so they won't
 worry about moving primitives.

 Any showstoppers, please share.


 Phobos will certainly be an improvement over not having them. So go ahead!

Well if you brought forth some strong argument I'm all ears. What I see 
for now is that struct containers are just difficult to implement and 
need to have carefully explained semantics, whereas a lot of people know 
how classes behave from day one.


 possible even if they fallback to (cheap) copy semantic when move isn't
 available. That way, if you have a type which is moveable but not
 copyable you can still put it in a container. Does that makes sense?

That's what I did up until now. It is tantamount to defining a bunch of 
methods (aliases or not) that add to the interface that the user must 
absorb, but that are seldom useful. It just seems that the entire move 
paraphernalia doesn't lift its weight.


Andrei

Jan 28 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-01-28 17:09:08 -0500, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 1/28/11 3:05 PM, Michel Fortin wrote:

 Phobos will certainly be an improvement over not having them. So go ahead!

 
 Well if you brought forth some strong argument I'm all ears. What I see 
 for now is that struct containers are just difficult to implement and 
 need to have carefully explained semantics, whereas a lot of people 
 know how classes behave from day one.

We already argument this over and over in the past. First, I totally 
acknowledge that C++ style containers have a problem: they make it 
easier to copy the content than pass it by reference. On the other side 
of the spectrum, I think that class semantics makes it too easy to have 
null dereferences, it's easy to get lost when you have a container of 
containers.

I have some experience with containers having class-style semantics: in 
Objective-C, I ended up creating a set of macro-like functions which I 
use to initialize containers whenever I use them in case they are null. 
And I had to do more of these utility functions to handle a particular 
data structure of mine which is a dictionary of arrays of objects. In 
C++, I'd have declared this as a "map< string, vector< Object > >" and 
be done with it; no need for special care initializing each vector, so 
much easier than in Objective-C.

I agree that defining structs to have reference semantics as you have 
done is complicated. But I like the lazy initialization, and we have a 
precedent for that with AAs (ideally, AAs would be a compatible 
container too). Can't we just use the GC instead of reference counting? 
I'd make things much easier. Here is a implementation:

	struct Container
	{
		struct Impl { ... }

		private Impl* _impl;
		ref Impl impl()  property
		{
			if (!impl) impl = new Impl;
			return *impl;
		}
	
		alias impl this;
	}

I also believe reference semantics are not to be used everywhere, even 
though they're good most of the time. I'd like to have a way to bypass 
it and get a value-semantic container. With the above, it's easy as 
long as you keep Container.Impl public:

	void main() {
		Container      lazyHeapAllocatedContainer;
		Container.Impl stackAllocatedContainer;
	}

	void MyObject {
		Container.Impl listOfObjects;
	}



 possible even if they fallback to (cheap) copy semantic when move isn't
 available. That way, if you have a type which is moveable but not
 copyable you can still put it in a container. Does that makes sense?

 
 That's what I did up until now. It is tantamount to defining a bunch of 
 methods (aliases or not) that add to the interface that the user must 
 absorb, but that are seldom useful. It just seems that the entire move 
 paraphernalia doesn't lift its weight.

But could we limit this to say that only containers that can return 
elements by ref? Perhaps that won't help. You know the problem better 
than me, I don't really have anything more to say.


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jan 28 2011

bearophile <bearophileHUGS lycos.com> writes:

Michel Fortin:

 On the other side 
 of the spectrum, I think that class semantics makes it too easy to have 
 null dereferences,

That's why I have asked for a   suffix syntax+semantics for notnull reference
types in D :-)


 I agree that defining structs to have reference semantics as you have 
 done is complicated. But I like the lazy initialization, and we have a 
 precedent for that with AAs

Built-in AAs are currently broken and in need to be fixed:

import std.stdio: writeln;
void foo(int[int] aa, int n) {
    aa[n] = n;
}
void main() {
    int[int] a;
    foo(a, 0);
    writeln(a);
    a[1] = 1;
    foo(a, 2);
    writeln(a);
}

Bye,
bearophile

Jan 28 2011

spir <denis.spir gmail.com> writes:

On 01/29/2011 01:01 AM, bearophile wrote:
 Built-in AAs are currently broken and in need to be fixed:

 import std.stdio: writeln;
 void foo(int[int] aa, int n) {
      aa[n] = n;
 }
 void main() {
      int[int] a;
      foo(a, 0);
      writeln(a);
      a[1] = 1;
      foo(a, 2);
      writeln(a);
 }

 Bye,
 bearophile

Variation on the theme:

void foo(int[int] aa, int n) {
     aa[n] = n;
}
void foo(int[] a, int n) {
     a ~= n;
}
void bar(ref int[int] aa, int n) {
     aa[n] = n;
}
void bar(ref int[] a, int n) {
     a ~= n;
}

unittest {
     int[int] aa;
     foo(aa, 3);
     writeln(aa.length);

     int[] a;
     foo(a, 3);
     writeln(a.length);

     int[int] bb;
     bar(bb, 3);
     writeln(bb.length);

     int[] b;
     bar(b, 3);
     writeln(b.length);
}

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Jan 28 2011

Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:

Michel Fortin napisa=C5=82:

 We already argument this over and over in the past. First, I totally=20
 acknowledge that C++ style containers have a problem: they make it=20
 easier to copy the content than pass it by reference. On the other side=20
 of the spectrum, I think that class semantics makes it too easy to have=20
 null dereferences, it's easy to get lost when you have a container of=20
 containers.
=20
 I have some experience with containers having class-style semantics: in=20
 Objective-C, I ended up creating a set of macro-like functions which I=20
 use to initialize containers whenever I use them in case they are null.=20
 And I had to do more of these utility functions to handle a particular=20
 data structure of mine which is a dictionary of arrays of objects. In=20
 C++, I'd have declared this as a "map< string, vector< Object > >" and=20
 be done with it; no need for special care initializing each vector, so=20
 much easier than in Objective-C.
=20
 I agree that defining structs to have reference semantics as you have=20
 done is complicated. But I like the lazy initialization, and we have a=20
 precedent for that with AAs (ideally, AAs would be a compatible=20
 container too). Can't we just use the GC instead of reference counting?=20
 I'd make things much easier. Here is a implementation:
=20
 	struct Container
 	{
 		struct Impl { ... }
=20
 		private Impl* _impl;
 		ref Impl impl()  property
 		{
 			if (!impl) impl =3D new Impl;
 			return *impl;
 		}
 =09
 		alias impl this;
 	}
=20
 I also believe reference semantics are not to be used everywhere, even=20
 though they're good most of the time. I'd like to have a way to bypass=20
 it and get a value-semantic container. With the above, it's easy as=20
 long as you keep Container.Impl public:
=20
 	void main() {
 		Container      lazyHeapAllocatedContainer;
 		Container.Impl stackAllocatedContainer;
 	}
=20
 	void MyObject {
 		Container.Impl listOfObjects;
 	}

Is there anything implementation specific in the outer struct that provides=
 ref semantics to Impl? If not, Container could be generic, parametrized by=
 Impl type.

Overall, I think a value-like implementation in a referency wrapper is a cl=
ear-cut idiom, bringing order to otherwise messy struct-implemented ref-sem=
antics. Do you know of a existing collection library that exploits this ide=
a?

--=20
Tomek

Jan 28 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-01-28 19:00:02 -0500, Tomek Sowiński <just ask.me> said:

 Michel Fortin napisał:
 
 We already argument this over and over in the past. First, I totally
 acknowledge that C++ style containers have a problem: they make it
 easier to copy the content than pass it by reference. On the other side
 of the spectrum, I think that class semantics makes it too easy to have
 null dereferences, it's easy to get lost when you have a container of
 containers.
 
 I have some experience with containers having class-style semantics: in
 Objective-C, I ended up creating a set of macro-like functions which I
 use to initialize containers whenever I use them in case they are null.
 And I had to do more of these utility functions to handle a particular
 data structure of mine which is a dictionary of arrays of objects. In
 C++, I'd have declared this as a "map< string, vector< Object > >" and
 be done with it; no need for special care initializing each vector, so
 much easier than in Objective-C.
 
 I agree that defining structs to have reference semantics as you have
 done is complicated. But I like the lazy initialization, and we have a
 precedent for that with AAs (ideally, AAs would be a compatible
 container too). Can't we just use the GC instead of reference counting?
 I'd make things much easier. Here is a implementation:
 
 	struct Container
 	{
 		struct Impl { ... }
 
 		private Impl* _impl;
 		ref Impl impl()  property
 		{
 			if (!impl) impl = new Impl;
 			return *impl;
 		}
 	
 		alias impl this;
 	}
 
 I also believe reference semantics are not to be used everywhere, even
 though they're good most of the time. I'd like to have a way to bypass
 it and get a value-semantic container. With the above, it's easy as
 long as you keep Container.Impl public:
 
 	void main() {
 		Container      lazyHeapAllocatedContainer;
 		Container.Impl stackAllocatedContainer;
 	}
 
 	void MyObject {
 		Container.Impl listOfObjects;
 	}

 
 Is there anything implementation specific in the outer struct that provides
 ref semantics to Impl? If not, Container could be generic, parametrized by
 Impl type.

You could provide an implementation-specific version of some functions 
as an optimization. For instance there is no need to create the Impl 
when asking for the length, if the pointer is null, length is zero. 
Typically, const function can be implemented in the outward container 
with a shortcut checking for null.


 Overall, I think a value-like implementation in a referency wrapper is 
 a clear-cut idiom, bringing order to otherwise messy struct-implemented 
 ref-semantics. Do you know of a existing collection library that 
 exploits this idea?

No. Only associative arrays in D do that, that I know of.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jan 28 2011

Tomek =?ISO-8859-2?Q?Sowi=F1ski?= <just ask.me> writes:

Michel Fortin napisa=B3:

 Is there anything implementation specific in the outer struct that prov=


ides
 ref semantics to Impl? If not, Container could be generic, parametrized=


 by
 Impl type. =20

=20
 You could provide an implementation-specific version of some functions=20
 as an optimization. For instance there is no need to create the Impl=20
 when asking for the length, if the pointer is null, length is zero.=20
 Typically, const function can be implemented in the outward container=20
 with a shortcut checking for null.

I think the reference struct can still be orthogonal to the container.

	struct Ref(Impl)
	{
		private Impl* _impl;
		ref Impl impl()  property
		{
			if (!impl) impl =3D new Impl;
			return *impl;
		}

		static if (hasLength!Impl)
		{
			auto length()  property
			{
				return impl ? impl.length : 0;
			}
		}

		alias impl this;
	}

Reusability lightens the burden of the container's author (less fuss for us=
er implementations) and somewhat standardizes containers as they all must e=
xhibit a certain API with certain semantics to be able to fit into Ref.

The downside is that the syntax for the most common case (ref semantics) is=
 a little nosier than for value-like behavior.

--=20
Tomek

Jan 29 2011

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Tomek Sowi=C5=84ski <just ask.me> wrote:

 Michel Fortin napisa=C5=82:

 Is there anything implementation specific in the outer struct that =



 =

 provides
 ref semantics to Impl? If not, Container could be generic,  =



 parametrized by
 Impl type.

 You could provide an implementation-specific version of some function=


s
 as an optimization. For instance there is no need to create the Impl
 when asking for the length, if the pointer is null, length is zero.
 Typically, const function can be implemented in the outward container=


 with a shortcut checking for null.

 I think the reference struct can still be orthogonal to the container.=

 	struct Ref(Impl)
 	{
 		private Impl* _impl;
 		ref Impl impl()  property
 		{
 			if (!impl) impl =3D new Impl;
 			return *impl;
 		}

 		static if (hasLength!Impl)
 		{
 			auto length()  property
 			{
 				return impl ? impl.length : 0;
 			}
 		}

 		alias impl this;
 	}

Now, other functions may also exploit such a shortcut. How do you plan
to implement that?

Actually, by adopting another idiom, it can be done:

struct Ref(Impl) {
     private Impl* _impl;
      property ref Impl impl() {
         return *(_impl =3D (_impl ? _impl : new Impl));
     }
     alias impl this;

     auto opDispatch(string name, T...)(T args) {
         mixin("return impl."~name~"(_impl,args);");
     }
}

struct ExampleImpl {
     static int length(ExampleImpl* that) {
         return that ? *that.actualLength : 0;
     }
     int actualLength( ) {
         return 42;
     }
}

Of course, I did not say it was a good idiom. :p

-- =

Simen

Jan 29 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/29/11 5:01 AM, Simen kjaeraas wrote:
 Tomek Sowiński <just ask.me> wrote:

 Michel Fortin napisał:

 Is there anything implementation specific in the outer struct that

 provides
 ref semantics to Impl? If not, Container could be generic,

 parametrized by
 Impl type.

 You could provide an implementation-specific version of some functions
 as an optimization. For instance there is no need to create the Impl
 when asking for the length, if the pointer is null, length is zero.
 Typically, const function can be implemented in the outward container
 with a shortcut checking for null.

 I think the reference struct can still be orthogonal to the container.

 struct Ref(Impl)
 {
 private Impl* _impl;
 ref Impl impl()  property
 {
 if (!impl) impl = new Impl;
 return *impl;
 }

 static if (hasLength!Impl)
 {
 auto length()  property
 {
 return impl ? impl.length : 0;
 }
 }

 alias impl this;
 }

 Now, other functions may also exploit such a shortcut. How do you plan
 to implement that?

 Actually, by adopting another idiom, it can be done:

 struct Ref(Impl) {
 private Impl* _impl;
  property ref Impl impl() {
 return *(_impl = (_impl ? _impl : new Impl));
 }
 alias impl this;

 auto opDispatch(string name, T...)(T args) {
 mixin("return impl."~name~"(_impl,args);");
 }
 }

 struct ExampleImpl {
 static int length(ExampleImpl* that) {
 return that ? *that.actualLength : 0;
 }
 int actualLength( ) {
 return 42;
 }
 }

 Of course, I did not say it was a good idiom. :p

I do something similar with RefCounted. There are problems - you need to 
know in advance which functions you can implement on a null container 
(empty and length are obvious candidates, but there could be others).

Andrei

Feb 01 2011

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 I do something similar with RefCounted. There are problems - you need to  
 know in advance which functions you can implement on a null container  
 (empty and length are obvious candidates, but there could be others).

Static functions can safely be called. Hence, this template:


import std.traits;

template isStaticFunc(T, string fn) {
     enum isStaticFunc = is(typeof({
         mixin("alias T."~fn~" func;");
         ParameterTypeTuple!func args;
         mixin("T."~fn~"(args);");
     }));
}


Now, Ref can look like this:


struct Ref(Impl) {
     private Impl* _impl;
      property ref Impl impl() {
         return *(_impl = (_impl ? _impl : new Impl));
     }
     //alias impl this; // Apparently, alias this takes precedence over  
opDispatch, so
                        // using both doesn't work. Well, unless you put  
opDispatch in Impl.

     auto opDispatch(string name, T...)(T args) if ( isStaticFunc!(Impl,  
name)) {
         mixin("return impl."~name~"(_impl,args);");
     }
}

struct ExampleImpl {
     static int length(ExampleImpl* that) {
         return that ? that.actualLength : 0;
     }
     int actualLength( ) {
         return 42;
     }
}


import std.stdio;

void main( ) {
     Ref!ExampleImpl a;

     writeln( a.length );
}


-- 
Simen

Feb 01 2011

"Denis Koroskin" <2korden gmail.com> writes:

On Sat, 29 Jan 2011 02:32:28 +0300, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2011-01-28 17:09:08 -0500, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> said:

 On 1/28/11 3:05 PM, Michel Fortin wrote:

 Phobos will certainly be an improvement over not having them. So go  
 ahead!

  Well if you brought forth some strong argument I'm all ears. What I  
 see for now is that struct containers are just difficult to implement  
 and need to have carefully explained semantics, whereas a lot of people  
 know how classes behave from day one.

 We already argument this over and over in the past. First, I totally  
 acknowledge that C++ style containers have a problem: they make it  
 easier to copy the content than pass it by reference. On the other side  
 of the spectrum, I think that class semantics makes it too easy to have  
 null dereferences, it's easy to get lost when you have a container of  
 containers.

 I have some experience with containers having class-style semantics: in  
 Objective-C, I ended up creating a set of macro-like functions which I  
 use to initialize containers whenever I use them in case they are null.  
 And I had to do more of these utility functions to handle a particular  
 data structure of mine which is a dictionary of arrays of objects. In  
 C++, I'd have declared this as a "map< string, vector< Object > >" and  
 be done with it; no need for special care initializing each vector, so  
 much easier than in Objective-C.

 I agree that defining structs to have reference semantics as you have  
 done is complicated. But I like the lazy initialization, and we have a  
 precedent for that with AAs (ideally, AAs would be a compatible  
 container too). Can't we just use the GC instead of reference counting?  
 I'd make things much easier. Here is a implementation:

 	struct Container
 	{
 		struct Impl { ... }

 		private Impl* _impl;
 		ref Impl impl()  property
 		{
 			if (!impl) impl = new Impl;
 			return *impl;
 		}
 	
 		alias impl this;
 	}

Unfortunately, this design has big issues:


void fill(Appender appender)
{
     appender.put("hello");
     appender.put("world");
}

void test()
{
     Appender<string> appender;
     fill(appender); // Appender is supposed to have reference semantics
     assert(appender.length != 0); // fails!
}

Asserting above fails because at the time you pass appender object to the  
fill method it isn't initialized yet (lazy initialization). As such, a  
null is passed, creating an instance at first appending, but the result  
isn't seen to the caller.

An explicit initialization is needed to work around this design issue. The  
worst thing is that in many cases it would work fine (you might have  
already initialized it indirectly) but sometimes you get unexpected  
result. I got hit by this in past, and it wasn't easy to trace down.

As such, I strongly believe containers either need to have copy semantics,  
or be classes. However, copy semantics contradicts with the "cheap copy  
ctor" idiom because you need to copy all the elements from source  
container.

 I also believe reference semantics are not to be used everywhere, even  
 though they're good most of the time. I'd like to have a way to bypass  
 it and get a value-semantic container. With the above, it's easy as long  
 as you keep Container.Impl public:

 	void main() {
 		Container      lazyHeapAllocatedContainer;
 		Container.Impl stackAllocatedContainer;
 	}

 	void MyObject {
 		Container.Impl listOfObjects;
 	}



 possible even if they fallback to (cheap) copy semantic when move isn't
 available. That way, if you have a type which is moveable but not
 copyable you can still put it in a container. Does that makes sense?

  That's what I did up until now. It is tantamount to defining a bunch  
 of methods (aliases or not) that add to the interface that the user  
 must absorb, but that are seldom useful. It just seems that the entire  
 move paraphernalia doesn't lift its weight.

 But could we limit this to say that only containers that can return  
 elements by ref? Perhaps that won't help. You know the problem better  
 than me, I don't really have anything more to say.


-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Jan 28 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-01-28 20:10:06 -0500, "Denis Koroskin" <2korden gmail.com> said:

 Unfortunately, this design has big issues:
 
 
 void fill(Appender appender)
 {
      appender.put("hello");
      appender.put("world");
 }
 
 void test()
 {
      Appender<string> appender;
      fill(appender); // Appender is supposed to have reference semantics
      assert(appender.length != 0); // fails!
 }
 
 Asserting above fails because at the time you pass appender object to 
 the  fill method it isn't initialized yet (lazy initialization). As 
 such, a  null is passed, creating an instance at first appending, but 
 the result  isn't seen to the caller.

That's indeed a problem. I don't think it's a fatal flaw however, given 
that the idiom already exists in AAs.

That said, the nice thing about my proposal is that you can easily 
reuse the Impl to create a new container to build a new container 
wrapper with the semantics you like with no loss of efficiency.

As for the case of Appender... personally in the case above I'd be 
tempted to use Appender.Impl directly (value semantics) and make fill 
take a 'ref'. There's no point in having an extra heap allocation, 
especially if you're calling test() in a loop or if there's a good 
chance fill() has nothing to append to it.

That's the issue with containers. The optimal semantics always change 
depending on the use case.


 An explicit initialization is needed to work around this design issue. 
 The  worst thing is that in many cases it would work fine (you might 
 have  already initialized it indirectly) but sometimes you get 
 unexpected  result. I got hit by this in past, and it wasn't easy to 
 trace down.
 
 As such, I strongly believe containers either need to have copy 
 semantics,  or be classes. However, copy semantics contradicts with the 
 "cheap copy  ctor" idiom because you need to copy all the elements from 
 source  container.

Personally, I'm really concerned by the case where you have a container 
of containers. Class semantics make things really complicated as you 
always have to initialize everything in the container explicitly; value 
semantics makes things semantically easier but quite inefficient as 
moving elements inside of the outermost container implies copying the 
containers. Making containers auto-initialize themselves on first use 
solves the case where containers are references-types; making 
containers capable of using move semantics solves the problem for 
value-type containers.


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jan 28 2011

Tomek =?ISO-8859-2?Q?Sowi=F1ski?= <just ask.me> writes:

Michel Fortin napisa=B3:

 As for the case of Appender... personally in the case above I'd be=20
 tempted to use Appender.Impl directly (value semantics) and make fill=20
 take a 'ref'. There's no point in having an extra heap allocation,=20
 especially if you're calling test() in a loop or if there's a good=20
 chance fill() has nothing to append to it.

Or take an output range.

--=20
Tomek

Jan 29 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/28/11 8:12 PM, Michel Fortin wrote:
 On 2011-01-28 20:10:06 -0500, "Denis Koroskin" <2korden gmail.com> said:

 Unfortunately, this design has big issues:


 void fill(Appender appender)
 {
 appender.put("hello");
 appender.put("world");
 }

 void test()
 {
 Appender<string> appender;
 fill(appender); // Appender is supposed to have reference semantics
 assert(appender.length != 0); // fails!
 }

 Asserting above fails because at the time you pass appender object to
 the fill method it isn't initialized yet (lazy initialization). As
 such, a null is passed, creating an instance at first appending, but
 the result isn't seen to the caller.

 That's indeed a problem. I don't think it's a fatal flaw however, given
 that the idiom already exists in AAs.

 That said, the nice thing about my proposal is that you can easily reuse
 the Impl to create a new container to build a new container wrapper with
 the semantics you like with no loss of efficiency.

 As for the case of Appender... personally in the case above I'd be
 tempted to use Appender.Impl directly (value semantics) and make fill
 take a 'ref'. There's no point in having an extra heap allocation,
 especially if you're calling test() in a loop or if there's a good
 chance fill() has nothing to append to it.

 That's the issue with containers. The optimal semantics always change
 depending on the use case.

Yep, yep, I found myself wrestling with the same issues. All good 
points. On one hand containers are a target for optimization because 
many will use them. On the other hand you'd want to have reasonably 
simple and idiomatic code in the container implementation because you 
want people to understand them easily and also to write their own. I 
thought for a while of a layered approach in which you'd have both the 
value and the sealed reference version of a container... it's just too 
much aggravation.

 An explicit initialization is needed to work around this design issue.
 The worst thing is that in many cases it would work fine (you might
 have already initialized it indirectly) but sometimes you get
 unexpected result. I got hit by this in past, and it wasn't easy to
 trace down.

 As such, I strongly believe containers either need to have copy
 semantics, or be classes. However, copy semantics contradicts with the
 "cheap copy ctor" idiom because you need to copy all the elements from
 source container.

 Personally, I'm really concerned by the case where you have a container
 of containers. Class semantics make things really complicated as you
 always have to initialize everything in the container explicitly; value
 semantics makes things semantically easier but quite inefficient as
 moving elements inside of the outermost container implies copying the
 containers. Making containers auto-initialize themselves on first use
 solves the case where containers are references-types; making containers
 capable of using move semantics solves the problem for value-type
 containers.

Neither values nor references are perfect indeed. For example, someone 
mentioned, hey, in STL I write set< vector<double> > and it Just 
Works(tm). On the other hand, if you swap the two names it still seems 
to work but it's awfully inefficient (something that may trip even 
experienced developers).


Andrei

Feb 01 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-02-01 11:12:13 -0500, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 1/28/11 8:12 PM, Michel Fortin wrote:
 On 2011-01-28 20:10:06 -0500, "Denis Koroskin" <2korden gmail.com> said:
 
 Unfortunately, this design has big issues:
 
 
 void fill(Appender appender)
 {
 appender.put("hello");
 appender.put("world");
 }
 
 void test()
 {
 Appender<string> appender;
 fill(appender); // Appender is supposed to have reference semantics
 assert(appender.length != 0); // fails!
 }
 
 Asserting above fails because at the time you pass appender object to
 the fill method it isn't initialized yet (lazy initialization). As
 such, a null is passed, creating an instance at first appending, but
 the result isn't seen to the caller.

 
 That's indeed a problem. I don't think it's a fatal flaw however, given
 that the idiom already exists in AAs.
 
 That said, the nice thing about my proposal is that you can easily reuse
 the Impl to create a new container to build a new container wrapper with
 the semantics you like with no loss of efficiency.
 
 As for the case of Appender... personally in the case above I'd be
 tempted to use Appender.Impl directly (value semantics) and make fill
 take a 'ref'. There's no point in having an extra heap allocation,
 especially if you're calling test() in a loop or if there's a good
 chance fill() has nothing to append to it.
 
 That's the issue with containers. The optimal semantics always change
 depending on the use case.

 
 Yep, yep, I found myself wrestling with the same issues. All good 
 points. On one hand containers are a target for optimization because 
 many will use them. On the other hand you'd want to have reasonably 
 simple and idiomatic code in the container implementation because you 
 want people to understand them easily and also to write their own. I 
 thought for a while of a layered approach in which you'd have both the 
 value and the sealed reference version of a container... it's just too 
 much aggravation.

But are you not just pushing the aggravation elsewhere? If I need a by 
value container for some reason (performance or semantics) I'll have to 
write my own, and likely others will write their own too.

Using classes for containers is just marginally better than making them 
by-value structs: you can use 'new' with a by-value struct if you want 
it to behave as a class-like by-reference container:

	struct Container {
		...
	}

	auto c = new Container();

The only noticeable difference from a class container is that now c is 
now a Container*.


 Personally, I'm really concerned by the case where you have a container
 of containers. Class semantics make things really complicated as you
 always have to initialize everything in the container explicitly; value
 semantics makes things semantically easier but quite inefficient as
 moving elements inside of the outermost container implies copying the
 containers. Making containers auto-initialize themselves on first use
 solves the case where containers are references-types; making containers
 capable of using move semantics solves the problem for value-type
 containers.

 
 Neither values nor references are perfect indeed. For example, someone 
 mentioned, hey, in STL I write set< vector<double> > and it Just 
 Works(tm). On the other hand, if you swap the two names it still seems 
 to work but it's awfully inefficient (something that may trip even 
 experienced developers).

Isn't that solved by C++0x, using move semantics in swap?

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Feb 01 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/1/11 10:44 AM, Michel Fortin wrote:
 On 2011-02-01 11:12:13 -0500, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 On 1/28/11 8:12 PM, Michel Fortin wrote:
 On 2011-01-28 20:10:06 -0500, "Denis Koroskin" <2korden gmail.com> said:

 Unfortunately, this design has big issues:


 void fill(Appender appender)
 {
 appender.put("hello");
 appender.put("world");
 }

 void test()
 {
 Appender<string> appender;
 fill(appender); // Appender is supposed to have reference semantics
 assert(appender.length != 0); // fails!
 }

 Asserting above fails because at the time you pass appender object to
 the fill method it isn't initialized yet (lazy initialization). As
 such, a null is passed, creating an instance at first appending, but
 the result isn't seen to the caller.

 That's indeed a problem. I don't think it's a fatal flaw however, given
 that the idiom already exists in AAs.

 That said, the nice thing about my proposal is that you can easily reuse
 the Impl to create a new container to build a new container wrapper with
 the semantics you like with no loss of efficiency.

 As for the case of Appender... personally in the case above I'd be
 tempted to use Appender.Impl directly (value semantics) and make fill
 take a 'ref'. There's no point in having an extra heap allocation,
 especially if you're calling test() in a loop or if there's a good
 chance fill() has nothing to append to it.

 That's the issue with containers. The optimal semantics always change
 depending on the use case.

 Yep, yep, I found myself wrestling with the same issues. All good
 points. On one hand containers are a target for optimization because
 many will use them. On the other hand you'd want to have reasonably
 simple and idiomatic code in the container implementation because you
 want people to understand them easily and also to write their own. I
 thought for a while of a layered approach in which you'd have both the
 value and the sealed reference version of a container... it's just too
 much aggravation.

 But are you not just pushing the aggravation elsewhere? If I need a by
 value container for some reason (performance or semantics) I'll have to
 write my own, and likely others will write their own too.

If semantics are the primary concern, you could (and in fact Phobos 
could) provide a Value!C template that automatically calls dup in 
this(this) etc.

For performance I agree there is stuff that class containers leave on 
the table.

 Using classes for containers is just marginally better than making them
 by-value structs: you can use 'new' with a by-value struct if you want
 it to behave as a class-like by-reference container:

 struct Container {
 ...
 }

 auto c = new Container();

 The only noticeable difference from a class container is that now c is
 now a Container*.

Well one problem now is that if you have a Container* you don't know 
whether it's dynamically allocated or the address of some 
stack-allocated object. This is pretty big; a major issue that I believe 
C++ has is that you can seldom reason modularly about functions because 
C++ makes it impossible to represent reference semantics with 
local/remote/shared/no ownership without resorting to convention.

A better solution is to define something like

auto c = new Classify!Container;

which transforms a value into a class object.

With this, the question becomes a matter of choosing the right default: 
do we want values most of the time and occasional references, or vice 
versa? I think most of the time you need references, as witnessed by the 
many '&'s out there in code working on STL containers.

 Personally, I'm really concerned by the case where you have a container
 of containers. Class semantics make things really complicated as you
 always have to initialize everything in the container explicitly; value
 semantics makes things semantically easier but quite inefficient as
 moving elements inside of the outermost container implies copying the
 containers. Making containers auto-initialize themselves on first use
 solves the case where containers are references-types; making containers
 capable of using move semantics solves the problem for value-type
 containers.

 Neither values nor references are perfect indeed. For example, someone
 mentioned, hey, in STL I write set< vector<double> > and it Just
 Works(tm). On the other hand, if you swap the two names it still seems
 to work but it's awfully inefficient (something that may trip even
 experienced developers).

 Isn't that solved by C++0x, using move semantics in swap?

This particular incarnation yes, but that doesn't automatically fix user 
code that forgets the cost of copying. But that took a large language 
change. My point was that values by default is not automatically a good 
choice.


Andrei

Feb 01 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday 01 February 2011 09:07:55 Andrei Alexandrescu wrote:
 On 2/1/11 10:44 AM, Michel Fortin wrote:
 On 2011-02-01 11:12:13 -0500, Andrei Alexandrescu
 
 <SeeWebsiteForEmail erdani.org> said:
 On 1/28/11 8:12 PM, Michel Fortin wrote:
 On 2011-01-28 20:10:06 -0500, "Denis Koroskin" <2korden gmail.com> said:
 Unfortunately, this design has big issues:
 
 
 void fill(Appender appender)
 {
 appender.put("hello");
 appender.put("world");
 }
 
 void test()
 {
 Appender<string> appender;
 fill(appender); // Appender is supposed to have reference semantics
 assert(appender.length != 0); // fails!
 }
 
 Asserting above fails because at the time you pass appender object to
 the fill method it isn't initialized yet (lazy initialization). As
 such, a null is passed, creating an instance at first appending, but
 the result isn't seen to the caller.

 
 That's indeed a problem. I don't think it's a fatal flaw however, given
 that the idiom already exists in AAs.
 
 That said, the nice thing about my proposal is that you can easily
 reuse the Impl to create a new container to build a new container
 wrapper with the semantics you like with no loss of efficiency.
 
 As for the case of Appender... personally in the case above I'd be
 tempted to use Appender.Impl directly (value semantics) and make fill
 take a 'ref'. There's no point in having an extra heap allocation,
 especially if you're calling test() in a loop or if there's a good
 chance fill() has nothing to append to it.
 
 That's the issue with containers. The optimal semantics always change
 depending on the use case.

 
 Yep, yep, I found myself wrestling with the same issues. All good
 points. On one hand containers are a target for optimization because
 many will use them. On the other hand you'd want to have reasonably
 simple and idiomatic code in the container implementation because you
 want people to understand them easily and also to write their own. I
 thought for a while of a layered approach in which you'd have both the
 value and the sealed reference version of a container... it's just too
 much aggravation.

 
 But are you not just pushing the aggravation elsewhere? If I need a by
 value container for some reason (performance or semantics) I'll have to
 write my own, and likely others will write their own too.

 
 If semantics are the primary concern, you could (and in fact Phobos
 could) provide a Value!C template that automatically calls dup in
 this(this) etc.
 
 For performance I agree there is stuff that class containers leave on
 the table.
 
 Using classes for containers is just marginally better than making them
 by-value structs: you can use 'new' with a by-value struct if you want
 it to behave as a class-like by-reference container:
 
 struct Container {
 ...
 }
 
 auto c = new Container();
 
 The only noticeable difference from a class container is that now c is
 now a Container*.

 
 Well one problem now is that if you have a Container* you don't know
 whether it's dynamically allocated or the address of some
 stack-allocated object. This is pretty big; a major issue that I believe
 C++ has is that you can seldom reason modularly about functions because
 C++ makes it impossible to represent reference semantics with
 local/remote/shared/no ownership without resorting to convention.
 
 A better solution is to define something like
 
 auto c = new Classify!Container;
 
 which transforms a value into a class object.
 
 With this, the question becomes a matter of choosing the right default:
 do we want values most of the time and occasional references, or vice
 versa? I think most of the time you need references, as witnessed by the
 many '&'s out there in code working on STL containers.

Java implements containers as classes (not that it really has any other
choice), 
so all containers in Java have reference semantics, and I've _never_ found that 
to be a problem. I do think that there are rare cases where it makes sense for
a 
container to be a value type, but I really do think that it's a rare case for 
the average programmer. On the other hand, having containers be value types in 
C++ is _frequently_ a problem.

- Jonathan M Davis

Feb 01 2011

bearophile <bearophileHUGS lycos.com> writes:

Andrei:

 A better solution is to define something like
 
 auto c = new Classify!Container;
 
 which transforms a value into a class object.
 
 With this, the question becomes a matter of choosing the right default: 
 do we want values most of the time and occasional references, or vice 
 versa? I think most of the time you need references, as witnessed by the 
 many '&'s out there in code working on STL containers.

I agree that most times a reference is better. This brings back the need for a
very good (efficient, syntactically readable, and even if not safe, able to
spot most common errors, and RAII-safe) way to allocate class instances
in-place, on the stack or inside another struct/class.

Bye,
bearophile

Feb 01 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-02-01 12:07:55 -0500, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 With this, the question becomes a matter of choosing the right default: 
 do we want values most of the time and occasional references, or vice 
 versa? I think most of the time you need references, as witnessed by 
 the many '&'s out there in code working on STL containers.

What exactly is "most of the time"? In C++, you pass containers by '&' 
for function parameters, using '&' elsewhere is rare.

One thing I proposed some time ago to address this problem (and to 
which no one replied) was this:

	ref struct Container { ... } // new "ref struct" concept

	void func(Container c) {
		// c is implicitly a "ref Container"
	}

	Container a; // by value
	func(a); // implicitly passed by ref

Containers would be stored by value, but always passed by ref in 
functions parameters.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Feb 01 2011

Simon Buerger <krox gmx.net> writes:

On 01.02.2011 20:01, Michel Fortin wrote:
 On 2011-02-01 12:07:55 -0500, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 With this, the question becomes a matter of choosing the right
 default: do we want values most of the time and occasional
 references, or vice versa? I think most of the time you need
 references, as witnessed by the many '&'s out there in code working
 on STL containers.

 What exactly is "most of the time"? In C++, you pass containers by '&'
 for function parameters, using '&' elsewhere is rare.

 One thing I proposed some time ago to address this problem (and to
 which no one replied) was this:

 ref struct Container { ... } // new "ref struct" concept

 void func(Container c) {
 // c is implicitly a "ref Container"
 }

 Container a; // by value
 func(a); // implicitly passed by ref

 Containers would be stored by value, but always passed by ref in
 functions parameters.

Thats would not be "per-value" as most understand it.

Container globalC;
func(c);

void func(Container paramC)
{
	c.add(42);	// modifies globalC in reference-semantics
                         // leaves globalC as it was in value-semantics
}


Your Idea would be somehow truly value-based if you default not only 
to "ref" but to "const ref", because then the function would not be 
able to alter globalC. But making parameters default-const was not 
considered the right way for D.

- Krox

Feb 01 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 01 Feb 2011 11:44:36 -0500, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2011-02-01 11:12:13 -0500, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> said:

 On 1/28/11 8:12 PM, Michel Fortin wrote:
 On 2011-01-28 20:10:06 -0500, "Denis Koroskin" <2korden gmail.com>  
 said:

 Unfortunately, this design has big issues:
   void fill(Appender appender)
 {
 appender.put("hello");
 appender.put("world");
 }
  void test()
 {
 Appender<string> appender;
 fill(appender); // Appender is supposed to have reference semantics
 assert(appender.length != 0); // fails!
 }
  Asserting above fails because at the time you pass appender object to
 the fill method it isn't initialized yet (lazy initialization). As
 such, a null is passed, creating an instance at first appending, but
 the result isn't seen to the caller.

  That's indeed a problem. I don't think it's a fatal flaw however,  
 given
 that the idiom already exists in AAs.
  That said, the nice thing about my proposal is that you can easily  
 reuse
 the Impl to create a new container to build a new container wrapper  
 with
 the semantics you like with no loss of efficiency.
  As for the case of Appender... personally in the case above I'd be
 tempted to use Appender.Impl directly (value semantics) and make fill
 take a 'ref'. There's no point in having an extra heap allocation,
 especially if you're calling test() in a loop or if there's a good
 chance fill() has nothing to append to it.



I've been thinking of making Appender.Impl public or at least its own  
type.  A stack-based appender makes a lot of sense when you are using it  
temporarily to build an array.

But array-based containers really are in a separate class from node-based  
containers.  It's tempting to conflate the two because they are both  
'containers', but arrays allow many more optimizations/features that  
node-based containers simply can't do.

  Yep, yep, I found myself wrestling with the same issues. All good  
 points. On one hand containers are a target for optimization because  
 many will use them. On the other hand you'd want to have reasonably  
 simple and idiomatic code in the container implementation because you  
 want people to understand them easily and also to write their own. I  
 thought for a while of a layered approach in which you'd have both the  
 value and the sealed reference version of a container... it's just too  
 much aggravation.

 But are you not just pushing the aggravation elsewhere? If I need a by  
 value container for some reason (performance or semantics) I'll have to  
 write my own, and likely others will write their own too.

foo(container.dup) ; // value semantics

I'm sure some template guru can make a wrapper type for this for the rare  
occasions that you need it.  We can work on solving the  
"auto-initialization" issue (where a nested container 'just works'), I  
think there are ways to do it.

If that still doesn't help for your issues, then writing your own may be  
the only valid option.

 Using classes for containers is just marginally better than making them  
 by-value structs: you can use 'new' with a by-value struct if you want  
 it to behave as a class-like by-reference container:

 	struct Container {
 		...
 	}

 	auto c = new Container();

 The only noticeable difference from a class container is that now c is  
 now a Container*.

And doesn't get cleaned up by the GC properly.  Plus, each member call  
must check if the container is 'instantiated', since we can have no  
default ctors.

Yes, it's a trade-off, and I think by far class-based containers win for  
the common case.

 Personally, I'm really concerned by the case where you have a container
 of containers. Class semantics make things really complicated as you
 always have to initialize everything in the container explicitly; value
 semantics makes things semantically easier but quite inefficient as
 moving elements inside of the outermost container implies copying the
 containers. Making containers auto-initialize themselves on first use
 solves the case where containers are references-types; making  
 containers
 capable of using move semantics solves the problem for value-type
 containers.

  Neither values nor references are perfect indeed. For example, someone  
 mentioned, hey, in STL I write set< vector<double> > and it Just  
 Works(tm). On the other hand, if you swap the two names it still seems  
 to work but it's awfully inefficient (something that may trip even  
 experienced developers).

 Isn't that solved by C++0x, using move semantics in swap?

swap isn't the problem.

foreach(s; myVectorSet)
{
    // if s is by value, it must be copied for each iteration in the loop
}

-Steve

Feb 01 2011

Simon Buerger <krox gmx.net> writes:

On 01.02.2011 18:08, Steven Schveighoffer wrote:
 On Tue, 01 Feb 2011 11:44:36 -0500, Michel Fortin
 <michel.fortin michelf.com> wrote:

 On 2011-02-01 11:12:13 -0500, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 On 1/28/11 8:12 PM, Michel Fortin wrote:
 On 2011-01-28 20:10:06 -0500, "Denis Koroskin" <2korden gmail.com>
 said:

 Unfortunately, this design has big issues:
 void fill(Appender appender)
 {
 appender.put("hello");
 appender.put("world");
 }
 void test()
 {
 Appender<string> appender;
 fill(appender); // Appender is supposed to have reference semantics
 assert(appender.length != 0); // fails!
 }
 Asserting above fails because at the time you pass appender
 object to
 the fill method it isn't initialized yet (lazy initialization). As
 such, a null is passed, creating an instance at first appending, but
 the result isn't seen to the caller.

 That's indeed a problem. I don't think it's a fatal flaw however,
 given
 that the idiom already exists in AAs.
 That said, the nice thing about my proposal is that you can easily
 reuse
 the Impl to create a new container to build a new container
 wrapper with
 the semantics you like with no loss of efficiency.
 As for the case of Appender... personally in the case above I'd be
 tempted to use Appender.Impl directly (value semantics) and make fill
 take a 'ref'. There's no point in having an extra heap allocation,
 especially if you're calling test() in a loop or if there's a good
 chance fill() has nothing to append to it.



 I've been thinking of making Appender.Impl public or at least its own
 type. A stack-based appender makes a lot of sense when you are using
 it temporarily to build an array.

 But array-based containers really are in a separate class from
 node-based containers. It's tempting to conflate the two because they
 are both 'containers', but arrays allow many more
 optimizations/features that node-based containers simply can't do.

 Yep, yep, I found myself wrestling with the same issues. All good
 points. On one hand containers are a target for optimization
 because many will use them. On the other hand you'd want to have
 reasonably simple and idiomatic code in the container
 implementation because you want people to understand them easily
 and also to write their own. I thought for a while of a layered
 approach in which you'd have both the value and the sealed
 reference version of a container... it's just too much aggravation.

 But are you not just pushing the aggravation elsewhere? If I need a
 by value container for some reason (performance or semantics) I'll
 have to write my own, and likely others will write their own too.

 foo(container.dup) ; // value semantics

 I'm sure some template guru can make a wrapper type for this for the
 rare occasions that you need it. We can work on solving the
 "auto-initialization" issue (where a nested container 'just works'), I
 think there are ways to do it.

 If that still doesn't help for your issues, then writing your own may
 be the only valid option.

 Using classes for containers is just marginally better than making
 them by-value structs: you can use 'new' with a by-value struct if
 you want it to behave as a class-like by-reference container:

 struct Container {
 ...
 }

 auto c = new Container();

 The only noticeable difference from a class container is that now c
 is now a Container*.

 And doesn't get cleaned up by the GC properly. Plus, each member call
 must check if the container is 'instantiated', since we can have no
 default ctors.

 Yes, it's a trade-off, and I think by far class-based containers win
 for the common case.

 Personally, I'm really concerned by the case where you have a
 container
 of containers. Class semantics make things really complicated as you
 always have to initialize everything in the container explicitly;
 value
 semantics makes things semantically easier but quite inefficient as
 moving elements inside of the outermost container implies copying the
 containers. Making containers auto-initialize themselves on first use
 solves the case where containers are references-types; making
 containers
 capable of using move semantics solves the problem for value-type
 containers.

 Neither values nor references are perfect indeed. For example,
 someone mentioned, hey, in STL I write set< vector<double> > and it
 Just Works(tm). On the other hand, if you swap the two names it
 still seems to work but it's awfully inefficient (something that
 may trip even experienced developers).

 Isn't that solved by C++0x, using move semantics in swap?

 swap isn't the problem.

 foreach(s; myVectorSet)
 {
 // if s is by value, it must be copied for each iteration in the loop
 }

 -Steve

Just to note: the "correct" solution for the last problem is

foreach(const ref s; myVectorSet)

which is working in current D. In a more value-based language you may 
even want to default to "const ref" for foreach-loop-values, and even 
for function-parameters. I suggested that a while ago, but wasn't 
liked much for D, for good reasons.

- Krox

Feb 01 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 01 Feb 2011 14:26:26 -0500, Simon Buerger <krox gmx.net> wrote:

 On 01.02.2011 18:08, Steven Schveighoffer wrote:

 swap isn't the problem.

 foreach(s; myVectorSet)
 {
 // if s is by value, it must be copied for each iteration in the loop
 }

 Just to note: the "correct" solution for the last problem is

 foreach(const ref s; myVectorSet)

 which is working in current D. In a more value-based language you may  
 even want to default to "const ref" for foreach-loop-values, and even  
 for function-parameters. I suggested that a while ago, but wasn't liked  
 much for D, for good reasons.

This is not a "solution".  I cannot enforce that someone uses foreach with  
ref semantics.  Unless the type is a reference type, like a class.

Note that const ref isn't necessarily want, you could want to mutate the  
elements of the vector, meaning you really want ref.

The whole point is, it's too easy to get it wrong, and the wrong thing  
looks innocuous and does not look like it will perform horribly.

Compare this to someone who actually *wants* value semantics when  
iterating a reference type (which should be very rare):

foreach(s; myVectorSet)
{
    auto setIAmGoingToUse = s.dup; // clear intentions: "yes, I want to do  
an expensive copy"
}

-Steve

Feb 01 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 28 Jan 2011 18:32:28 -0500, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2011-01-28 17:09:08 -0500, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> said:

 On 1/28/11 3:05 PM, Michel Fortin wrote:

 Phobos will certainly be an improvement over not having them. So go  
 ahead!

  Well if you brought forth some strong argument I'm all ears. What I  
 see for now is that struct containers are just difficult to implement  
 and need to have carefully explained semantics, whereas a lot of people  
 know how classes behave from day one.

 We already argument this over and over in the past. First, I totally  
 acknowledge that C++ style containers have a problem: they make it  
 easier to copy the content than pass it by reference. On the other side  
 of the spectrum, I think that class semantics makes it too easy to have  
 null dereferences, it's easy to get lost when you have a container of  
 containers.

 I have some experience with containers having class-style semantics: in  
 Objective-C, I ended up creating a set of macro-like functions which I  
 use to initialize containers whenever I use them in case they are null.  
 And I had to do more of these utility functions to handle a particular  
 data structure of mine which is a dictionary of arrays of objects. In  
 C++, I'd have declared this as a "map< string, vector< Object > >" and  
 be done with it; no need for special care initializing each vector, so  
 much easier than in Objective-C.

I see value in this idiom, and I have ideas on how to solve this with  
class-based containers.  Essentially, I think we need a way to construct a  
"container of containers" where the owner container is the one who has  
class semantics, and the sub-containers only provide access through the  
owner.  I haven't fleshed out how this will work, but it solves another  
really important problem that dcollections currently has --  
over-allocation.

The issue is, if you have a nested container (say a map of strings to  
linked-lists), the map container pre-allocates a page of elements, to ease  
stress on the GC.  But so does each linked list!  However, since the lists  
all are part of the map, they could potentially share the same allocator.   
If you have few elements in each list, this could significantly reduce the  
over-allocation of memory.

-Steve

Jan 31 2011

bearophile <bearophileHUGS lycos.com> writes:

Andrei:

 1. Containers will be classes.

This page:
http://www.jroller.com/scolebourne/entry/the_next_big_jvm_language1

A quotation:

3) Everything is a monitor. In Java and the JVM, every object is a monitor,
meaning that you can synchronize on any object. This is incredibly wasteful at
the JVM level. Senior JVM guys have indicated large percentage improvements in
JVM space and performance if we removed the requirement that every object can
be synchronized on. (Instead, you would have specific classes like Java 5 Lock)<

I have read similar comments in various other places.

What about creating a  nomonitor annotation, for D2 classes to not create a
monitor for specific classes annotated with it? This may reduce some class
overhead.

Bye,
bearophile

Jan 28 2011

Tomek =?ISO-8859-2?Q?Sowi=F1ski?= <just ask.me> writes:

bearophile napisa=B3:

 This page:
 http://www.jroller.com/scolebourne/entry/the_next_big_jvm_language1
=20
 A quotation:
=20
3) Everything is a monitor. In Java and the JVM, every object is a monit=


or, meaning that you can synchronize on any
object. This is incredibly wasteful at the JVM level. Senior JVM guys ha=


ve indicated large percentage improvements
in JVM space and performance if we removed the requirement that every ob=


ject can be synchronized on. (Instead, you
would have specific classes like Java 5 Lock)<

=20
 I have read similar comments in various other places.
=20
 What about creating a  nomonitor annotation, for D2 classes to not create=

 a monitor for specific classes annotated
 with it? This may reduce some class overhead.

Better just remove it, it's not used often. Besides, there are different lo=
cks, one size doesn't fit all.

--=20
Tomek

Jan 29 2011

SHOO <zan77137 nifty.com> writes:

(2011/01/29 3:31), Andrei Alexandrescu wrote:
 Today after work I plan to start making one pass through std.container.
 After having thought of things for a long time, my conclusions are as
 follows:

 1. Containers will be classes.

 2. Most of the methods in existing containers will be final. It's up to
 the container to make a method final or not.

 3. Containers and their ranges decide whether they give away references
 to their objects. Sealing is a great idea but it makes everybody's life
 too complicated. I'll defer sealing to future improvements in the
 language and/or the reflection subsystem.

 4. Containers will assume that objects are cheap to copy so they won't
 worry about moving primitives.

 Any showstoppers, please share.


 Andrei

1, 2

I think that it is a good policy.
There is hardly the superiority that a containers are structs. :-)

In the first place I had doubt towards treating structs as classes 
(OOP). When I use it by such a way, the lack of the default constructor 
is fatal.

3, 4

If speed is accompanied, the sealing up is useful, but if it is late, 
there are many unnecessary cases. Because it is thought that a lot of 
repetition access to the element and life cycles occurs in the case of a 
container, handling it by the element unit should be high-performance. 
In addition, for usability, a thing of simple behavior is preferable 
rather than a complicated thing.
However, on the other hand, the design that wrong use is impossible 
should be considered enough.

Jan 29 2011

Simon Buerger <krox gmx.net> writes:

On 28.01.2011 19:31, Andrei Alexandrescu wrote:
 1. Containers will be classes.

 2. Most of the methods in existing containers will be final. It's up
 to the container to make a method final or not.

 3. Containers and their ranges decide whether they give away
 references to their objects. Sealing is a great idea but it makes
 everybody's life too complicated. I'll defer sealing to future
 improvements in the language and/or the reflection subsystem.

 4. Containers will assume that objects are cheap to copy so they won't
 worry about moving primitives.

Not perfectly what I would like, but a reasonable choice, and most 
important to actually have a mature container-lib. But there are other 
choices remaining: what containers will be there and what will they be 
called? My suggestion is

* Set, MulitSet, Map, MultiMap (hash-table based)
* OrderedSet, OrderedMultiSet, OrderedMap, OrderedMultiMap (tree-based)
* Sequence (like stl-Deque. the name is just more intuitive. Funny 
enough, the stl-deque implemenation has nothing to do with a "doubly 
linked list")
* Array (like stl-vector. I think "vector" is a kinda strange name, 
but that may be only my impression)
* List (linked list)

* Stack/Queue/PriorityQueue should be done on top of an other class, 
with a "impl"-template-param, like the stl-ones

Things to note:
* container should be named with respect to their use, not the 
implementation. "HashSet" is a bad name, because the user shouldnt 
care about the implemenation.

* unordered sets are used more often than ordered. So it should be 
"Set/OrderedSet", and not "UnorderedSet/Set" (also, the first one is 
two characters less typing *g*)

* opEqual should work between different types of Sets (or Maps, or 
sequences). Nothing wrong with comparing an ordered to an unordered 
one, or a list to an array.

just my 2 cents,
Krox

Jan 29 2011

bearophile <bearophileHUGS lycos.com> writes:

Simon Buerger:

 * container should be named with respect to their use, not the 
 implementation. "HashSet" is a bad name, because the user shouldnt 
 care about the implemenation.

This is good for a language like Python, or even Java, but for a system
language I want to know the algorithms I'm using under the hood. So I am not
sure.

Bye,
bearophile

Jan 29 2011

Jim <bitcirkel yahoo.com> writes:

bearophile Wrote:

 Simon Buerger:
 
 * container should be named with respect to their use, not the 
 implementation. "HashSet" is a bad name, because the user shouldnt 
 care about the implemenation.

 
 This is good for a language like Python, or even Java, but for a system
language I want to know the algorithms I'm using under the hood. So I am not
sure.

There could be an interface named Set and various implementations of it, e.g.
HashSet. I think this would be a good principle in general for all abstract
collection types.

Jan 29 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 01/29/2011 12:44 PM, Jim wrote:
 bearophile Wrote:

 Simon Buerger:

 * container should be named with respect to their use, not the
 implementation. "HashSet" is a bad name, because the user shouldnt
 care about the implemenation.

 This is good for a language like Python, or even Java, but for a system
language I want to know the algorithms I'm using under the hood. So I am not
sure.

 There could be an interface named Set and various implementations of it, e.g.
HashSet. I think this would be a good principle in general for all abstract
collection types.

Thus approach has a few advantages but a lot of serious issues. I 
decided to not define a hierarchy for std.container.

Andrei

Jan 29 2011

KennyTM~ <kennytm gmail.com> writes:

On Jan 30, 11 01:09, Simon Buerger wrote:
 On 28.01.2011 19:31, Andrei Alexandrescu wrote:
 1. Containers will be classes.

 2. Most of the methods in existing containers will be final. It's up
 to the container to make a method final or not.

 3. Containers and their ranges decide whether they give away
 references to their objects. Sealing is a great idea but it makes
 everybody's life too complicated. I'll defer sealing to future
 improvements in the language and/or the reflection subsystem.

 4. Containers will assume that objects are cheap to copy so they won't
 worry about moving primitives.

 Not perfectly what I would like, but a reasonable choice, and most
 important to actually have a mature container-lib. But there are other
 choices remaining: what containers will be there and what will they be
 called? My suggestion is

 * Set, MulitSet, Map, MultiMap (hash-table based)
 * OrderedSet, OrderedMultiSet, OrderedMap, OrderedMultiMap (tree-based)
 * Sequence (like stl-Deque. the name is just more intuitive. Funny
 enough, the stl-deque implemenation has nothing to do with a "doubly
 linked list")

A 'deque' just mean a double-ended queue, not necessarily doubly linked 
list. I don't like the name Sequence since it doesn't specify the 
container can be modified from both ends. 'Deque' is just fine.

 * Array (like stl-vector. I think "vector" is a kinda strange name, but
 that may be only my impression)
 * List (linked list)

 * Stack/Queue/PriorityQueue should be done on top of an other class,
 with a "impl"-template-param, like the stl-ones

 Things to note:
 * container should be named with respect to their use, not the
 implementation. "HashSet" is a bad name, because the user shouldnt care
 about the implemenation.

Except that there is already a std.container.RedBlackTree named in this 
fashion :).

 * unordered sets are used more often than ordered. So it should be
 "Set/OrderedSet", and not "UnorderedSet/Set" (also, the first one is two
 characters less typing *g*)

 * opEqual should work between different types of Sets (or Maps, or
 sequences). Nothing wrong with comparing an ordered to an unordered one,
 or a list to an array.

This breaks transitivity of '==' unnecessarily. I don't see the benefit 
of comparing two different kinds of containers. If you really need to 
compare them, use std.algorithm.equal.

 just my 2 cents,
 Krox

Jan 29 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday 29 January 2011 09:09:00 Simon Buerger wrote:
 * container should be named with respect to their use, not the
 implementation. "HashSet" is a bad name, because the user shouldnt
 care about the implemenation.

It was decided previously that containers in std.container would specifically
be 
named based on the type of data structure that they are rather than what
they're 
used for. Hence why the newest container to std.container is called 
RedBlackTree. If you want to call it TreeSet or something similar, then all you 
have to do is alias it. So, it's easy to get names based on usage if that's
what 
you want, but that's not the naming scheme that std.container is using.

- Jonathan M Davis

Jan 29 2011

spir <denis.spir gmail.com> writes:

On 01/29/2011 06:09 PM, Simon Buerger wrote:
 Things to note:
 * container should be named with respect to their use, not the implementation.
 "HashSet" is a bad name, because the user shouldnt care about the
implemenation.

 * unordered sets are used more often than ordered. So it should be
 "Set/OrderedSet", and not "UnorderedSet/Set" (also, the first one is two
 characters less typing *g*)

 * opEqual should work between different types of Sets (or Maps, or sequences).
 Nothing wrong with comparing an ordered to an unordered one, or a list to an
 array.

+++ on all those points

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Jan 30 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 29 Jan 2011 12:09:00 -0500, Simon Buerger <krox gmx.net> wrote:

 On 28.01.2011 19:31, Andrei Alexandrescu wrote:
 1. Containers will be classes.

 2. Most of the methods in existing containers will be final. It's up
 to the container to make a method final or not.

 3. Containers and their ranges decide whether they give away
 references to their objects. Sealing is a great idea but it makes
 everybody's life too complicated. I'll defer sealing to future
 improvements in the language and/or the reflection subsystem.

 4. Containers will assume that objects are cheap to copy so they won't
 worry about moving primitives.

 Not perfectly what I would like, but a reasonable choice, and most  
 important to actually have a mature container-lib. But there are other  
 choices remaining: what containers will be there and what will they be  
 called? My suggestion is

 * Set, MulitSet, Map, MultiMap (hash-table based)
 * OrderedSet, OrderedMultiSet, OrderedMap, OrderedMultiMap (tree-based)
 * Sequence (like stl-Deque. the name is just more intuitive. Funny  
 enough, the stl-deque implemenation has nothing to do with a "doubly  
 linked list")
 * Array (like stl-vector. I think "vector" is a kinda strange name, but  
 that may be only my impression)
 * List (linked list)

 * Stack/Queue/PriorityQueue should be done on top of an other class,  
 with a "impl"-template-param, like the stl-ones

 Things to note:
 * container should be named with respect to their use, not the  
 implementation. "HashSet" is a bad name, because the user shouldnt care  
 about the implemenation.

 * unordered sets are used more often than ordered. So it should be  
 "Set/OrderedSet", and not "UnorderedSet/Set" (also, the first one is two  
 characters less typing *g*)

 * opEqual should work between different types of Sets (or Maps, or  
 sequences). Nothing wrong with comparing an ordered to an unordered one,  
 or a list to an array.

http://www.dsource.org/projects/dcollections

-Steve

Jan 31 2011

Simon Buerger <krox gmx.net> writes:

On 31.01.2011 17:53, Steven Schveighoffer wrote:
 http://www.dsource.org/projects/dcollections

 -Steve

Well, seems not bad on a quick look. But source is updated 2 years 
ago, so I doubt it would compile with current dmd. Anyway, the topic 
here is the std-behaviour of the std-lib. But sure, always nice to 
have custom alternatives.

Krox

Jan 31 2011

David Nadlinger <see klickverbot.at> writes:

On 1/31/11 6:48 PM, Simon Buerger wrote:
 Well, seems not bad on a quick look. But source is updated 2 years ago,…

http://www.dsource.org/projects/dcollections/browser/branches/d2 – four 
months are not quite as long as two years. Oh, and Steve is the very 
author of dcollections. ;)

David

Jan 31 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 31 Jan 2011 12:48:06 -0500, Simon Buerger <krox gmx.net> wrote:

 On 31.01.2011 17:53, Steven Schveighoffer wrote:
 http://www.dsource.org/projects/dcollections

 -Steve

 Well, seems not bad on a quick look. But source is updated 2 years ago,  
 so I doubt it would compile with current dmd. Anyway, the topic here is  
 the std-behaviour of the std-lib. But sure, always nice to have custom  
 alternatives.

latest meaningful change was 4 months ago:  
http://www.dsource.org/projects/dcollections/changeset/102

It should compile on the latest DMD (if not, file a ticket).  BTW, it  
changes very little mostly because I haven't had many complaints about it  
(maybe not a good sign?) and I haven't had much opportunity to use it, as  
my day job does not allow using D unfortunately.

It was proposed as a possibility for the std lib, but Andrei and I  
couldn't come to an agreement on what the collections should look like.   
Ironically, his latest decision moves std.container closer to dcollections  
in design.

However, it should be relatively compatible with phobos' container lib (in  
fact RedBlackTree is a direct port of dcollections' version).

-Steve

Jan 31 2011

Simon Buerger <krox gmx.net> writes:

Okay, my fault. Didnt realize you were the author, and the project is 
still active. The "2 years" came from here: 
http://www.dsource.org/projects/dcollections/browser/trunk/dcollections. 
I thought, that "trunk" was the most recent version. Added a bookmark, 
and will definitely take a closer look later. Thx for mentioning.


On 31.01.2011 19:09, Steven Schveighoffer wrote:
 On Mon, 31 Jan 2011 12:48:06 -0500, Simon Buerger <krox gmx.net> wrote:

 On 31.01.2011 17:53, Steven Schveighoffer wrote:
 http://www.dsource.org/projects/dcollections

 -Steve

 Well, seems not bad on a quick look. But source is updated 2 years
 ago, so I doubt it would compile with current dmd. Anyway, the topic
 here is the std-behaviour of the std-lib. But sure, always nice to
 have custom alternatives.

 latest meaningful change was 4 months ago:
 http://www.dsource.org/projects/dcollections/changeset/102

 It should compile on the latest DMD (if not, file a ticket). BTW, it
 changes very little mostly because I haven't had many complaints about
 it (maybe not a good sign?) and I haven't had much opportunity to use
 it, as my day job does not allow using D unfortunately.

 It was proposed as a possibility for the std lib, but Andrei and I
 couldn't come to an agreement on what the collections should look
 like. Ironically, his latest decision moves std.container closer to
 dcollections in design.

 However, it should be relatively compatible with phobos' container lib
 (in fact RedBlackTree is a direct port of dcollections' version).

 -Steve

Jan 31 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 31 Jan 2011 13:36:57 -0500, Simon Buerger <krox gmx.net> wrote:

 Okay, my fault. Didnt realize you were the author, and the project is  
 still active. The "2 years" came from here:  
 http://www.dsource.org/projects/dcollections/browser/trunk/dcollections.  
 I thought, that "trunk" was the most recent version. Added a bookmark,  
 and will definitely take a closer look later. Thx for mentioning.

Sorry about that, I have had at least 3 people be confused at why trunk  
wasn't the D2 version.  I really need to swap those (make the 1.x version  
the branch).  Unfortunately, it will screw up the D1 documentation, since  
it's auto-generated.  But that should probably be moot since the D1  
version isn't changing...

-Steve

Jan 31 2011

dsimcha <dsimcha yahoo.com> writes:

In general I actually like this idea, but can we still have sealed 
ref-counted arrays and associative arrays somewhere in Phobos (not 
necessarily any more esoteric containers than these and not necessarily 
in std.container, maybe in a new module called std.refcounted or 
something)?  For huge arrays/AAs I think ref counting is an important 
optimization, especially with the GC in its current state but to a 
lesser extent even when we get a better GC.  GC tends to be slow 
compared to manual memory management in cases where most allocated 
blocks are large.

Also, I like reference counting for huge arrays/AAs enough that I 
submitted a ref counted hash table implementation to the Phobos list for 
review, which noone seemed to notice.  Please comment.

On 1/28/2011 1:31 PM, Andrei Alexandrescu wrote:
 Today after work I plan to start making one pass through std.container.
 After having thought of things for a long time, my conclusions are as
 follows:

 1. Containers will be classes.

 2. Most of the methods in existing containers will be final. It's up to
 the container to make a method final or not.

 3. Containers and their ranges decide whether they give away references
 to their objects. Sealing is a great idea but it makes everybody's life
 too complicated. I'll defer sealing to future improvements in the
 language and/or the reflection subsystem.

 4. Containers will assume that objects are cheap to copy so they won't
 worry about moving primitives.

 Any showstoppers, please share.


 Andrei

Jan 29 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 01/29/2011 12:54 PM, dsimcha wrote:
 Also, I like reference counting for huge arrays/AAs enough that I
 submitted a ref counted hash table implementation to the Phobos list for
 review, which noone seemed to notice. Please comment.

Where is it?

Andrei

Jan 29 2011

dsimcha <dsimcha yahoo.com> writes:

http://www.dsource.org/projects/aa/browser/trunk/randAA/randaasealed.d

On 1/29/2011 2:18 PM, Andrei Alexandrescu wrote:
 On 01/29/2011 12:54 PM, dsimcha wrote:
 Also, I like reference counting for huge arrays/AAs enough that I
 submitted a ref counted hash table implementation to the Phobos list for
 review, which noone seemed to notice. Please comment.

 Where is it?

 Andrei

Jan 29 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 01/29/2011 01:29 PM, dsimcha wrote:
 http://www.dsource.org/projects/aa/browser/trunk/randAA/randaasealed.d

 On 1/29/2011 2:18 PM, Andrei Alexandrescu wrote:
 On 01/29/2011 12:54 PM, dsimcha wrote:
 Also, I like reference counting for huge arrays/AAs enough that I
 submitted a ref counted hash table implementation to the Phobos list for
 review, which noone seemed to notice. Please comment.

 Where is it?

 Andrei


Well to create a proposal you'd need to generate the documentation and 
put it somewhere and write a note to this group motivating the poposal 
and asking for a review manager.

Andrei

Jan 29 2011

dsimcha <dsimcha yahoo.com> writes:

I've uploaded the documentation to 
http://cis.jhu.edu/~dsimcha/randaasealed.html and mentioned it again on 
the mailing list.  The documentation is pretty sparse because 
interface-wise it's just a standard hash table.  More generally, though, 
are we still interested in sealed/ref counted containers?

On 1/29/2011 2:52 PM, Andrei Alexandrescu wrote:
 On 01/29/2011 01:29 PM, dsimcha wrote:
 http://www.dsource.org/projects/aa/browser/trunk/randAA/randaasealed.d

 On 1/29/2011 2:18 PM, Andrei Alexandrescu wrote:
 On 01/29/2011 12:54 PM, dsimcha wrote:
 Also, I like reference counting for huge arrays/AAs enough that I
 submitted a ref counted hash table implementation to the Phobos list
 for
 review, which noone seemed to notice. Please comment.

 Where is it?

 Andrei


 Well to create a proposal you'd need to generate the documentation and
 put it somewhere and write a note to this group motivating the poposal
 and asking for a review manager.

 Andrei

Jan 29 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/29/11 3:36 PM, dsimcha wrote:
 I've uploaded the documentation to
 http://cis.jhu.edu/~dsimcha/randaasealed.html and mentioned it again on
 the mailing list. The documentation is pretty sparse because
 interface-wise it's just a standard hash table. More generally, though,
 are we still interested in sealed/ref counted containers?

Sorry for being slow in continuing this thread.

Regarding the general issue that someone makes an informal proposal 
(either here, as a DIP, or on the Phobos mailing list), followed by a 
thundering silence: I believe that a good technique is to formalize the 
proposal review process, which has been a homerun for Boost. The 
disadvantage of that is that almost without exception this is very 
taxing to library submitters. This means the submitter must put a lot of 
thought and a lot of work into motivating, polishing, and documenting an 
artifact without any guarantee that it would lead to inclusion in the 
target library. I've seen very, VERY elaborate Boost submissions fail - 
literally months of work gone to waste.

I'm not sure how to save people from doing work up front in hope of an 
uncertain outcome in the future.

I do know what does _not_ work: the take it or leave it approach: "Hey, 
I have this code for abstraction XYZ that I extracted from a project of 
mine and I think it may be of general interest. It's at 
http://site.com/path/to/code.{d,html}. It needs polishing here and 
there, it's largely undocumented, but I'm sure the ideas shine through. 
Eh?"

The doc at http://cis.jhu.edu/~dsimcha/randaasealed.html is somewhere in 
between. It is clear you have a good understanding of sealing and hash 
containers, but let me ask you this - if you wanted to sell this to 
someone, what would you do? Probably you'd show some relevant benchmarks 
putting the built-in hashes to shame. Maybe you'd have some good 
examples - yes, we know it's a hash, but it doesn't hurt to see some 
code pulp over there. Maybe you'd explain sealing and discuss its 
relative advantages and disadvantages (which have not yet been 
documented anywhere - a great opportunity). Maybe you'd even show some 
numbers showing how sealing does as well/better/worse than reference 
leaking.

This is a good amount of upfront work for little promise. Again, I don't 
know yet how to optimize for minimizing that. What I did see works on 
Boost is a request for interest in the form of a discussion (usually 
with NO source, only USAGE examples) asking if there are people who are 
interested in such a notion. In this particular case, you'd need numbers 
to make a strong case which means that code must be already written. For 
something like e.g. "how about a Finite State Automaton library?" 
perhaps upfront code wouldn't be necessary for gauging interest.


Andrei

Feb 01 2011

spir <denis.spir gmail.com> writes:

On 02/01/2011 05:00 PM, Andrei Alexandrescu wrote:
 Regarding the general issue that someone makes an informal proposal (either
 here, as a DIP, or on the Phobos mailing list), followed by a thundering
 silence: I believe that a good technique is to formalize the proposal review
 process, which has been a homerun for Boost. The disadvantage of that is that
 almost without exception this is very taxing to library submitters. This means
 the submitter must put a lot of thought and a lot of work into motivating,
 polishing, and documenting an artifact without any guarantee that it would lead
 to inclusion in the target library. I've seen very, VERY elaborate Boost
 submissions fail - literally months of work gone to waste.

An alternative, or a complementary approach, may be to delegate part of your 
responsability. In this case, I'm thinking at a pool of people which "mission" 
would be to obviously show interest (or lack of) for proposals made on the 
mailing list --whatever their advancement, formality, code quality... This 
would provide a valuable indicator while removing some load from your 
shoulders, I guess. Such people may be chosen by cooptation.
Note this approach is not exclusive of formal & heavy adoption process like 
Boost's; instead it can be a complementary or preliminary way of judging 
interest for proposals.

A similar principle may indeed be used for other purpose: specification & 
evolution of D-the-language, implementation & bug removal of the reference 
compiler,...

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Feb 01 2011

dsimcha <dsimcha yahoo.com> writes:

I see your point.  Some of the stuff I've posted lately (RandAA, 
CsvRange) is in a usable but admittedly rough state, and was written 
partially for myself and partially for Phobos.

If this came off as take-it-or-leave-it, this was unintended. These 
posts were supposed to be informal requests for comment.  My main goal 
in these posts was to gauge whether there's enough interest, whether the 
design was good enough that it's worth polishing up for Phobos, and 
whether I missed any key features.  If there was sufficient interest I 
had every intention of polishing/fleshing out these libraries.

On 2/1/2011 11:00 AM, Andrei Alexandrescu wrote:
 On 1/29/11 3:36 PM, dsimcha wrote:
 I've uploaded the documentation to
 http://cis.jhu.edu/~dsimcha/randaasealed.html and mentioned it again on
 the mailing list. The documentation is pretty sparse because
 interface-wise it's just a standard hash table. More generally, though,
 are we still interested in sealed/ref counted containers?

 Sorry for being slow in continuing this thread.

 Regarding the general issue that someone makes an informal proposal
 (either here, as a DIP, or on the Phobos mailing list), followed by a
 thundering silence: I believe that a good technique is to formalize the
 proposal review process, which has been a homerun for Boost. The
 disadvantage of that is that almost without exception this is very
 taxing to library submitters. This means the submitter must put a lot of
 thought and a lot of work into motivating, polishing, and documenting an
 artifact without any guarantee that it would lead to inclusion in the
 target library. I've seen very, VERY elaborate Boost submissions fail -
 literally months of work gone to waste.

 I'm not sure how to save people from doing work up front in hope of an
 uncertain outcome in the future.

 I do know what does _not_ work: the take it or leave it approach: "Hey,
 I have this code for abstraction XYZ that I extracted from a project of
 mine and I think it may be of general interest. It's at
 http://site.com/path/to/code.{d,html}. It needs polishing here and
 there, it's largely undocumented, but I'm sure the ideas shine through.
 Eh?"

 The doc at http://cis.jhu.edu/~dsimcha/randaasealed.html is somewhere in
 between. It is clear you have a good understanding of sealing and hash
 containers, but let me ask you this - if you wanted to sell this to
 someone, what would you do? Probably you'd show some relevant benchmarks
 putting the built-in hashes to shame. Maybe you'd have some good
 examples - yes, we know it's a hash, but it doesn't hurt to see some
 code pulp over there. Maybe you'd explain sealing and discuss its
 relative advantages and disadvantages (which have not yet been
 documented anywhere - a great opportunity). Maybe you'd even show some
 numbers showing how sealing does as well/better/worse than reference
 leaking.

 This is a good amount of upfront work for little promise. Again, I don't
 know yet how to optimize for minimizing that. What I did see works on
 Boost is a request for interest in the form of a discussion (usually
 with NO source, only USAGE examples) asking if there are people who are
 interested in such a notion. In this particular case, you'd need numbers
 to make a strong case which means that code must be already written. For
 something like e.g. "how about a Finite State Automaton library?"
 perhaps upfront code wouldn't be necessary for gauging interest.


 Andrei

Feb 02 2011

dsimcha <dsimcha yahoo.com> writes:

BTW, I thought the case for a sealed hash table with an optimized memory 
allocation scheme was self-evident to most heavy D users, given how the 
builtin hash table leaks memory/fragments the heap (I think it's a 
little of both) and destroys multithreaded performance.

On 2/1/2011 11:00 AM, Andrei Alexandrescu wrote:
 On 1/29/11 3:36 PM, dsimcha wrote:
 I've uploaded the documentation to
 http://cis.jhu.edu/~dsimcha/randaasealed.html and mentioned it again on
 the mailing list. The documentation is pretty sparse because
 interface-wise it's just a standard hash table. More generally, though,
 are we still interested in sealed/ref counted containers?

 Sorry for being slow in continuing this thread.

 Regarding the general issue that someone makes an informal proposal
 (either here, as a DIP, or on the Phobos mailing list), followed by a
 thundering silence: I believe that a good technique is to formalize the
 proposal review process, which has been a homerun for Boost. The
 disadvantage of that is that almost without exception this is very
 taxing to library submitters. This means the submitter must put a lot of
 thought and a lot of work into motivating, polishing, and documenting an
 artifact without any guarantee that it would lead to inclusion in the
 target library. I've seen very, VERY elaborate Boost submissions fail -
 literally months of work gone to waste.

 I'm not sure how to save people from doing work up front in hope of an
 uncertain outcome in the future.

 I do know what does _not_ work: the take it or leave it approach: "Hey,
 I have this code for abstraction XYZ that I extracted from a project of
 mine and I think it may be of general interest. It's at
 http://site.com/path/to/code.{d,html}. It needs polishing here and
 there, it's largely undocumented, but I'm sure the ideas shine through.
 Eh?"

 The doc at http://cis.jhu.edu/~dsimcha/randaasealed.html is somewhere in
 between. It is clear you have a good understanding of sealing and hash
 containers, but let me ask you this - if you wanted to sell this to
 someone, what would you do? Probably you'd show some relevant benchmarks
 putting the built-in hashes to shame. Maybe you'd have some good
 examples - yes, we know it's a hash, but it doesn't hurt to see some
 code pulp over there. Maybe you'd explain sealing and discuss its
 relative advantages and disadvantages (which have not yet been
 documented anywhere - a great opportunity). Maybe you'd even show some
 numbers showing how sealing does as well/better/worse than reference
 leaking.

 This is a good amount of upfront work for little promise. Again, I don't
 know yet how to optimize for minimizing that. What I did see works on
 Boost is a request for interest in the form of a discussion (usually
 with NO source, only USAGE examples) asking if there are people who are
 interested in such a notion. In this particular case, you'd need numbers
 to make a strong case which means that code must be already written. For
 something like e.g. "how about a Finite State Automaton library?"
 perhaps upfront code wouldn't be necessary for gauging interest.


 Andrei

Feb 02 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/2/11 8:56 AM, dsimcha wrote:
 BTW, I thought the case for a sealed hash table with an optimized memory
 allocation scheme was self-evident to most heavy D users, given how the
 builtin hash table leaks memory/fragments the heap (I think it's a
 little of both) and destroys multithreaded performance.

Perhaps a very good argument could be made with some benchmarks that 
show how your table is better than others in casual use and in heavy use.

Andrei

Feb 02 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Friday 28 January 2011 10:31:58 Andrei Alexandrescu wrote:
 Today after work I plan to start making one pass through std.container.
 After having thought of things for a long time, my conclusions are as
 follows:
 
 1. Containers will be classes.
 
 2. Most of the methods in existing containers will be final. It's up to
 the container to make a method final or not.
 
 3. Containers and their ranges decide whether they give away references
 to their objects. Sealing is a great idea but it makes everybody's life
 too complicated. I'll defer sealing to future improvements in the
 language and/or the reflection subsystem.
 
 4. Containers will assume that objects are cheap to copy so they won't
 worry about moving primitives.
 
 Any showstoppers, please share.

Overall, I think definitely support this approach. I definitely think that 
containers should be reference types, and I'm not sure that I care much whether 
they end up being structs or classes. My one concern would be the ability to 
inline the container's function calls, and marking them as final fixes that 
problem. I'm not sure that I know enough about the sealing issue to comment on 
it, but if what we're doing doesn't preclude having it later, then that sounds 
like it's a good compromise. And I definitely think that make it so that 
containers don't have to worry about moving primitives is the way to go.

So, I'm essentially 100% behind this, and I don't see any real issues with it. 
Containers is one of the places that Phobos is generally behind on and which 
really needs to be taken care if we really want people to be using D and
Phobos. 
So, it's great to see that progress is being made with regards to containers.

- Jonathan M Davis

Jan 29 2011

D Programming

C/C++ Programming

Other

digitalmars.D - Decision on container design