digitalmars.D - BinaryHeap is a range so it goes in std.range. Agree?

Andrei Alexandrescu (12/12) Jun 08 2010 I finalized BinaryHeap. It's pretty cool - it builds a forward range on

Simen kjaeraas (7/18) Jun 08 2010 No. Binaryheap may work like a range, and it's definitely usable as a

Andrei Alexandrescu (8/27) Jun 08 2010 Let me be more specific: BinaryHeap does not work like a range. It _is_

Simen kjaeraas (15/24) Jun 08 2010 How is it not a container? Because it uses a different container as a

Steven Schveighoffer (12/17) Jun 08 2010 This is not exactly correct in my opinion. A range that presents a
Andrei Alexandrescu (16/39) Jun 08 2010 It does not implement the container primitives and is not a reference ty...

Steven Schveighoffer (10/16) Jun 08 2010 What about making a BinaryHeapRange and a BinaryHeapContainer (well, wit...
Joel C. Salomon (11/22) Jun 08 2010 From your perspective as the library writer, perhaps. As a user, my
Simen kjaeraas (13/32) Jun 08 2010 And my point is that removeFront() is popFront() with a different name,

Andrei Alexandrescu (20/51) Jun 08 2010 Instead of assuming that I'm simultaneously retard(ed) and stubborn, it

Simen kjaeraas (11/26) Jun 09 2010 Now this is exactly what I wanted! Thank you. Sorry we had to fight to

Andrei Alexandrescu (7/17) Jun 09 2010 The problem is that operating on a range as a heap is not just one

Simen kjaeraas (5/10) Jun 09 2010 I might have mixed up names here. I meant heapify not as the operation,

Andrei Alexandrescu (4/12) Jun 09 2010 Yes, but does that operation yield an object of a new type, or is just

Michel Fortin (9/14) Jun 09 2010 And if you were accustomed to the STL, you'd just look for a binary

Andrei Alexandrescu (4/15) Jun 09 2010 I don't think this will scale. There are quite a few data structures out...

Don (7/28) Jun 09 2010 On the other hand, I don't think we want a 5Mb module, either. There's a...

Andrei Alexandrescu (7/35) Jun 09 2010 I'm not sure what's the best approach. Best I can think of is by
Michel Fortin (23/50) Jun 09 2010 Beside the size of the module, there is also the issue of namespace

Andrei Alexandrescu (8/59) Jun 09 2010 Say you import std.container and you end up with two dozen containers:

Michel Fortin (27/58) Jun 09 2010 None... until you import something else that has a different Array

Jonathan M Davis (16/32) Jun 09 2010 I'd expect that to lead to module explosion. It wouldn't be as bad as

Walter Bright (5/24) Jun 09 2010 One solution is to have std.container consist of the following:

Andrei Alexandrescu (8/31) Jun 09 2010 I wanted to do that but with the singular:

Walter Bright (2/11) Jun 09 2010 Because then it's overloading a name based on the contents of that name.
Simen kjaeraas (16/18) Jun 09 2010 //////////////
Leandro Lucarella (11/49) Jun 09 2010 That's why the convetion .all is used. If that worked I guess .all

FeepingCreature (2/5) Jun 13 2010 Law of Least Astonishment. Never forget.

Andrei Alexandrescu (7/13) Jun 13 2010 I made BinaryHeap a container, which I recently realized is the best

Michel Fortin (7/10) Jun 08 2010 import std.binheap;

KennyTM~ (2/8) Jun 08 2010 vote++;

Steven Schveighoffer (13/24) Jun 08 2010 If BinaryHeap can be added to, how is it a range? I would suspect that ...

Andrei Alexandrescu (8/36) Jun 08 2010 Well it still is. I mean isForwardRange!(BinaryHeap!(Array!int)) is true...

Steven Schveighoffer (20/59) Jun 08 2010 Functionally yes, something that supports all the range functions can be...

jlquinn (4/19) Jun 08 2010 I second this point. In STL parlance, BinaryHeap would be a container a...

Andrei Alexandrescu (11/39) Jun 08 2010 Unfortunately things are not that simple.

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

I finalized BinaryHeap. It's pretty cool - it builds a forward range on 
top of a random-access range - typically T[] - or a random-access 
container - typically Array!T.

The difference is simple - if you build on top of a range the heap can't 
grow beyond the size of that range. Building on top of a container makes 
the heap growable.

Making BinaryHeap a range is actually pretty cool - just walking the 
heap is tantamount to lazily sorting the container. (Of course 
BinaryHeap has primitives in addition to the four range primitives.)

Do you agree with putting BinaryHeap in std.range (as opposed to 
std.container)?


Andrei

Jun 08 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 I finalized BinaryHeap. It's pretty cool - it builds a forward range on  
 top of a random-access range - typically T[] - or a random-access  
 container - typically Array!T.

 The difference is simple - if you build on top of a range the heap can't  
 grow beyond the size of that range. Building on top of a container makes  
 the heap growable.

 Making BinaryHeap a range is actually pretty cool - just walking the  
 heap is tantamount to lazily sorting the container. (Of course  
 BinaryHeap has primitives in addition to the four range primitives.)

 Do you agree with putting BinaryHeap in std.range (as opposed to  
 std.container)?

No. Binaryheap may work like a range, and it's definitely usable as a
range, but the associativity of my brain says it's a container.
I fear this will end up like enum for manifest constants, which behaves
the way it should, but does not belong in that location.

-- 
Simen

Jun 08 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/08/2010 10:00 AM, Simen kjaeraas wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 I finalized BinaryHeap. It's pretty cool - it builds a forward range
 on top of a random-access range - typically T[] - or a random-access
 container - typically Array!T.

 The difference is simple - if you build on top of a range the heap
 can't grow beyond the size of that range. Building on top of a
 container makes the heap growable.

 Making BinaryHeap a range is actually pretty cool - just walking the
 heap is tantamount to lazily sorting the container. (Of course
 BinaryHeap has primitives in addition to the four range primitives.)

 Do you agree with putting BinaryHeap in std.range (as opposed to
 std.container)?

 No. Binaryheap may work like a range, and it's definitely usable as a
 range, but the associativity of my brain says it's a container.

Let me be more specific: BinaryHeap does not work like a range. It _is_ 
a range. It implements all of a forward range's primitives with the 
required semantics.

 I fear this will end up like enum for manifest constants, which behaves
 the way it should, but does not belong in that location.

But where should I put it then? I thought it would be even more 
confusing if I put something in std.container that wait a minute, is not 
a container.


Andrei

Jun 08 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 Let me be more specific: BinaryHeap does not work like a range. It _is_  
 a range. It implements all of a forward range's primitives with the  
 required semantics.

Is not anything that works like a range automatically a range?

 But where should I put it then? I thought it would be even more  
 confusing if I put something in std.container that wait a minute, is not  
 a container.

How is it not a container? Because it uses a different container as a
back-end?

 Generally, a range with benefits (i.e. extra functions) is still a range
 if it implements the required range primitives with the required
 semantics.

So many containers are ranges, then. And we will see this question for
any of those containers added to Phobos. Or we decide on some rules to
disambiguate between the two, and follow those.

I hold that a range is a view that does not change the underlying data,
and does not store all its data.
That makes an array a container, which I feel is correct. It may still
have range functionality, and thus be both, but it should be
categorized as a container first.

Are there other ranges that violate this definition?

-- 
Simen

Jun 08 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 08 Jun 2010 11:53:06 -0400, Simen kjaeraas  
<simen.kjaras gmail.com> wrote:

 I hold that a range is a view that does not change the underlying data,
 and does not store all its data.
 That makes an array a container, which I feel is correct. It may still
 have range functionality, and thus be both, but it should be
 categorized as a container first.

This is not exactly correct in my opinion.  A range that presents a  
read-only view of the underlying data would unnecessarily exclude many  
algorithms.  I would say it's a view of the data which can change the  
data, but not restructure the data (swapping two elements' values is not  
restructuring, re-linking two linked list nodes is restructuring).   
Essentially, it's a read-only view of the topology.

Array is special in that even the container itself cannot change its own  
topology since it's based on contiguous memory which cannot be  
restructured, so it can also be considered a range.

-Steve

Jun 08 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/08/2010 10:53 AM, Simen kjaeraas wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 Let me be more specific: BinaryHeap does not work like a range. It
 _is_ a range. It implements all of a forward range's primitives with
 the required semantics.

 Is not anything that works like a range automatically a range?

Yah, just wanted to be precise.

 But where should I put it then? I thought it would be even more
 confusing if I put something in std.container that wait a minute, is
 not a container.

 How is it not a container? Because it uses a different container as a
 back-end?

It does not implement the container primitives and is not a reference type.

 Generally, a range with benefits (i.e. extra functions) is still a range
 if it implements the required range primitives with the required
 semantics.

 So many containers are ranges, then.

There aren't because no container implements e.g. popFront().

 And we will see this question for
 any of those containers added to Phobos. Or we decide on some rules to
 disambiguate between the two, and follow those.

There are already rules that disambiguate range operations from similar 
container operations. For example a range defines popFront() whereas a 
container defines removeFront().

 I hold that a range is a view that does not change the underlying data,
 and does not store all its data.
 That makes an array a container, which I feel is correct. It may still
 have range functionality, and thus be both, but it should be
 categorized as a container first.

 Are there other ranges that violate this definition?

The explanation goes as Steve puts it. A container has control over its 
topology.

I agree that a BinaryHeap built on top of a container may ultimately 
affect the topology of the container, which makes it unlike e.g. Take or 
Chain. I could choose to disallow that and simply require that 
BinaryHeap always works on top of a range, not a container. But I think 
it's useful to have the growing functionality, and I don't think that 
makes BinaryHeap hopelessly confusing.


Andrei

Jun 08 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 08 Jun 2010 12:26:19 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 I agree that a BinaryHeap built on top of a container may ultimately  
 affect the topology of the container, which makes it unlike e.g. Take or  
 Chain. I could choose to disallow that and simply require that  
 BinaryHeap always works on top of a range, not a container. But I think  
 it's useful to have the growing functionality, and I don't think that  
 makes BinaryHeap hopelessly confusing.

What about making a BinaryHeapRange and a BinaryHeapContainer (well, with  
better names)?

Essentially, the container flavored heap would use a BinaryHeapRange for  
its range type, and would supply the necessary functions for it to be a  
container.

I admit not reading any code for BinaryHeap, so this might just be a  
foolish suggestion...

-Steve

Jun 08 2010

"Joel C. Salomon" <joelcsalomon gmail.com> writes:

On 06/08/2010 12:26 PM, Andrei Alexandrescu wrote:
 On 06/08/2010 10:53 AM, Simen kjaeraas wrote:
 I hold that a range is a view that does not change the underlying data,
 and does not store all its data.
 That makes an array a container, which I feel is correct. It may still
 have range functionality, and thus be both, but it should be
 categorized as a container first.

 Are there other ranges that violate this definition?

 
 The explanation goes as Steve puts it. A container has control over its
 topology.

From your perspective as the library writer, perhaps.  As a user, my
criterion is this:  If I might go behind the back of the view, either to
directly access the underlying structure or to impose a different view,
then the view is a “range”; if that is unlikely than the view is itself
a structure, a “container”.

One addendum:  The fact that I can choose the underlying container does
not count as “going behind the back of the view”.  If I implement a
BinaryHeap (say) on something that wraps an on-disk file, that’s just an
implementation detail; it’s still just a variant of BinaryHeap.

—Joel Salomon

Jun 08 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 06/08/2010 10:53 AM, Simen kjaeraas wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:


 But where should I put it then? I thought it would be even more
 confusing if I put something in std.container that wait a minute, is
 not a container.

 How is it not a container? Because it uses a different container as a
 back-end?

 It does not implement the container primitives and is not a reference  
 type.

So it is not a container because you chose not to make it a container.

 There are already rules that disambiguate range operations from similar  
 container operations. For example a range defines popFront() whereas a  
 container defines removeFront().

And my point is that removeFront() is popFront() with a different name,
so many containers could be considered ranges with mutated member
functions. :p But we're arguing semantics.

 I agree that a BinaryHeap built on top of a container may ultimately  
 affect the topology of the container, which makes it unlike e.g. Take or  
 Chain. I could choose to disallow that and simply require that  
 BinaryHeap always works on top of a range, not a container. But I think  
 it's useful to have the growing functionality, and I don't think that  
 makes BinaryHeap hopelessly confusing.

To me, this makes it a container.

Now, my favorite way of dealing with this: Where would I look for a
binary heap if I wanted one? I would think of it as a container, and thus
check std.container. If it was not there, I would use the search function
to find it. I can invent reasons, but it's mostly grounded in learned
names and categories.

-- 
Simen

Jun 08 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/08/2010 04:41 PM, Simen kjaeraas wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 06/08/2010 10:53 AM, Simen kjaeraas wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:


 But where should I put it then? I thought it would be even more
 confusing if I put something in std.container that wait a minute, is
 not a container.

 How is it not a container? Because it uses a different container as a
 back-end?

 It does not implement the container primitives and is not a reference
 type.

 So it is not a container because you chose not to make it a container.

Instead of assuming that I'm simultaneously retard(ed) and stubborn, it 
would be more productive to ask more about my reasons. I'll assume you 
did so.

I wrote BinaryHeap out of necessity. The necessity was to solve the 
selection problem (see topN, topNIndex in std.algorithm) and to 
implement n-way merge (see std.algorithm.nWayUnion, I think I'll rename 
it to nWayMerge).

In all these cases, there was no necessity for defining a heap as a 
container; instead, using it as a means to add structure over an 
existing range were sufficient.

If I defined the heap as a container, there would be need for one extra 
allocation and also extra indirections to ensure reference semantics. I 
wanted to avoid all that.

 There are already rules that disambiguate range operations from
 similar container operations. For example a range defines popFront()
 whereas a container defines removeFront().

 And my point is that removeFront() is popFront() with a different name,
 so many containers could be considered ranges with mutated member
 functions. :p But we're arguing semantics.

It's a good point. I agree it would make sense to define a binary heap 
container in addition to a binary heap range. I'm only weary that they 
don't have very clean means to reuse code, so we'll end up with some 
unpleasant implementation internals. I guess I'll just have to do that.

 I agree that a BinaryHeap built on top of a container may ultimately
 affect the topology of the container, which makes it unlike e.g. Take
 or Chain. I could choose to disallow that and simply require that
 BinaryHeap always works on top of a range, not a container. But I
 think it's useful to have the growing functionality, and I don't think
 that makes BinaryHeap hopelessly confusing.

 To me, this makes it a container.

 Now, my favorite way of dealing with this: Where would I look for a
 binary heap if I wanted one? I would think of it as a container, and thus
 check std.container. If it was not there, I would use the search function
 to find it. I can invent reasons, but it's mostly grounded in learned
 names and categories.

I think that's sensible.


Andrei

Jun 08 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 Instead of assuming that I'm simultaneously retard(ed) and stubborn

I'm sorry, that was not my intention.

 I wrote BinaryHeap out of necessity. The necessity was to solve the  
 selection problem (see topN, topNIndex in std.algorithm) and to  
 implement n-way merge (see std.algorithm.nWayUnion, I think I'll rename  
 it to nWayMerge).

 In all these cases, there was no necessity for defining a heap as a  
 container; instead, using it as a means to add structure over an  
 existing range were sufficient.

 If I defined the heap as a container, there would be need for one extra  
 allocation and also extra indirections to ensure reference semantics. I  
 wanted to avoid all that.

Now this is exactly what I wanted! Thank you. Sorry we had to fight to
get there. :p


 It's a good point. I agree it would make sense to define a binary heap  
 container in addition to a binary heap range. I'm only weary that they  
 don't have very clean means to reuse code, so we'll end up with some  
 unpleasant implementation internals. I guess I'll just have to do that.

I fear that this will also lead to confusion. "What? Two binheaps? What
do they need that for?" Basically, the binheap range is a heapifying
range, yes?

So to me, it seems to make sense to have std.range.heapify, and
std.container.binheap.

-- 
Simen

Jun 09 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/09/2010 06:20 AM, Simen kjaeraas wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 It's a good point. I agree it would make sense to define a binary heap
 container in addition to a binary heap range. I'm only weary that they
 don't have very clean means to reuse code, so we'll end up with some
 unpleasant implementation internals. I guess I'll just have to do that.

 I fear that this will also lead to confusion. "What? Two binheaps? What
 do they need that for?" Basically, the binheap range is a heapifying
 range, yes?

 So to me, it seems to make sense to have std.range.heapify, and
 std.container.binheap.

The problem is that operating on a range as a heap is not just one 
operation (i.e. heapify), it's a handful of them. STL took the approach 
of providing make_heap, push_heap, pop_heap, sort_heap, and is_heap. 
Their use is quite clunky. I think that's too unstructured an approach 
and would like to improve on it.

Andrei

Jun 09 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

  Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 The problem is that operating on a range as a heap is not just one  
 operation (i.e. heapify), it's a handful of them. STL took the approach  
 of providing make_heap, push_heap, pop_heap, sort_heap, and is_heap.  
 Their use is quite clunky. I think that's too unstructured an approach  
 and would like to improve on it.

I might have mixed up names here. I meant heapify not as the operation,
but as in 'turn this into a heap'.

-- 
Simen

Jun 09 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/09/2010 10:13 AM, Simen kjaeraas wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 The problem is that operating on a range as a heap is not just one
 operation (i.e. heapify), it's a handful of them. STL took the
 approach of providing make_heap, push_heap, pop_heap, sort_heap, and
 is_heap. Their use is quite clunky. I think that's too unstructured an
 approach and would like to improve on it.

 I might have mixed up names here. I meant heapify not as the operation,
 but as in 'turn this into a heap'.

Yes, but does that operation yield an object of a new type, or is just 
like STL's make_heap?

Andrei

Jun 09 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-06-08 17:41:22 -0400, "Simen kjaeraas" <simen.kjaras gmail.com> said:

 Now, my favorite way of dealing with this: Where would I look for a
 binary heap if I wanted one? I would think of it as a container, and thus
 check std.container. If it was not there, I would use the search function
 to find it. I can invent reasons, but it's mostly grounded in learned
 names and categories.

And if you were accustomed to the STL, you'd just look for a binary 
heap header to include instead of trying to philosophize about which 
category of things it fits best. That's why I suggested 
"std.binaryheap" earlier. (Could be "std.binheap" if you want it short.)

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jun 09 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/09/2010 07:57 AM, Michel Fortin wrote:
 On 2010-06-08 17:41:22 -0400, "Simen kjaeraas" <simen.kjaras gmail.com>
 said:

 Now, my favorite way of dealing with this: Where would I look for a
 binary heap if I wanted one? I would think of it as a container, and thus
 check std.container. If it was not there, I would use the search function
 to find it. I can invent reasons, but it's mostly grounded in learned
 names and categories.

 And if you were accustomed to the STL, you'd just look for a binary heap
 header to include instead of trying to philosophize about which category
 of things it fits best. That's why I suggested "std.binaryheap" earlier.
 (Could be "std.binheap" if you want it short.)

I don't think this will scale. There are quite a few data structures out 
there, I'm afraid we'll have too many modules too soon.

Andrei

Jun 09 2010

Don <nospam nospam.com> writes:

Andrei Alexandrescu wrote:
 On 06/09/2010 07:57 AM, Michel Fortin wrote:
 On 2010-06-08 17:41:22 -0400, "Simen kjaeraas" <simen.kjaras gmail.com>
 said:

 Now, my favorite way of dealing with this: Where would I look for a
 binary heap if I wanted one? I would think of it as a container, and 
 thus
 check std.container. If it was not there, I would use the search 
 function
 to find it. I can invent reasons, but it's mostly grounded in learned
 names and categories.

 And if you were accustomed to the STL, you'd just look for a binary heap
 header to include instead of trying to philosophize about which category
 of things it fits best. That's why I suggested "std.binaryheap" earlier.
 (Could be "std.binheap" if you want it short.)

 
 I don't think this will scale. There are quite a few data structures out 
 there, I'm afraid we'll have too many modules too soon.
 
 Andrei

On the other hand, I don't think we want a 5Mb module, either. There's a 
very large number of potential containers, once you include all the 
esoteric ones.

(std.algorithm has the same problem, of course).
It's a difficult tradeoff. I hope you're able to come up with a 
reasonable rationale.

Jun 09 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/09/2010 10:37 AM, Don wrote:
 Andrei Alexandrescu wrote:
 On 06/09/2010 07:57 AM, Michel Fortin wrote:
 On 2010-06-08 17:41:22 -0400, "Simen kjaeraas" <simen.kjaras gmail.com>
 said:

 Now, my favorite way of dealing with this: Where would I look for a
 binary heap if I wanted one? I would think of it as a container, and
 thus
 check std.container. If it was not there, I would use the search
 function
 to find it. I can invent reasons, but it's mostly grounded in learned
 names and categories.

 And if you were accustomed to the STL, you'd just look for a binary heap
 header to include instead of trying to philosophize about which category
 of things it fits best. That's why I suggested "std.binaryheap" earlier.
 (Could be "std.binheap" if you want it short.)

 I don't think this will scale. There are quite a few data structures
 out there, I'm afraid we'll have too many modules too soon.

 Andrei

 On the other hand, I don't think we want a 5Mb module, either. There's a
 very large number of potential containers, once you include all the
 esoteric ones.

 (std.algorithm has the same problem, of course).
 It's a difficult tradeoff. I hope you're able to come up with a
 reasonable rationale.

I'm not sure what's the best approach. Best I can think of is by 
category, e.g. all sequences go in std.sequence, all trees go in 
std.tree, all hashes go in std.hash and so on. Since heaps are trees 
they may be in std.tree. But then there are quite a few heap kinds out 
there (binary, Fibonacci, binomial, soft) so perhaps std.heap is deserved.

Andrei

Jun 09 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-06-09 11:37:38 -0400, Don <nospam nospam.com> said:

 Andrei Alexandrescu wrote:
 On 06/09/2010 07:57 AM, Michel Fortin wrote:
 On 2010-06-08 17:41:22 -0400, "Simen kjaeraas" <simen.kjaras gmail.com>
 said:
 
 Now, my favorite way of dealing with this: Where would I look for a
 binary heap if I wanted one? I would think of it as a container, and thus
 check std.container. If it was not there, I would use the search function
 to find it. I can invent reasons, but it's mostly grounded in learned
 names and categories.

 
 And if you were accustomed to the STL, you'd just look for a binary heap
 header to include instead of trying to philosophize about which category
 of things it fits best. That's why I suggested "std.binaryheap" earlier.
 (Could be "std.binheap" if you want it short.)

 
 I don't think this will scale. There are quite a few data structures 
 out there, I'm afraid we'll have too many modules too soon.
 
 Andrei

 
 On the other hand, I don't think we want a 5Mb module, either. There's 
 a very large number of potential containers, once you include all the 
 esoteric ones.

Beside the size of the module, there is also the issue of namespace 
pollution. If you import std.container and you end up with a dozen of 
different containers plus their related functions, it's quite a lot. 
(Of course you can use selective import, but that's a workaround, it's 
better not have the problem in the first place.)


 (std.algorithm has the same problem, of course).
 It's a difficult tradeoff. I hope you're able to come up with a 
 reasonable rationale.

Personally I'd draw the line like this: a module should contain 
functions and types which are either tied in their implementation or 
which their use are correlated (if you use this thing you'll likely 
need this other one).

For instance, a tree set and a tree map are almost the same, so they're 
good candidates to put together in a module (std.tree?). A linked list 
and an array are totally independent in implementation, and the need 
for one doesn't correlate with the need for another, so there's no 
reason to put them in the same module; they thus belong to separate 
modules.

This "correlation in usage" rule should reduce the number of modules 
you need to import while also minimizing the number of symbols, the 
code size, and the dependencies of each module. What do you think?

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jun 09 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/09/2010 02:34 PM, Michel Fortin wrote:
 On 2010-06-09 11:37:38 -0400, Don <nospam nospam.com> said:

 Andrei Alexandrescu wrote:
 On 06/09/2010 07:57 AM, Michel Fortin wrote:
 On 2010-06-08 17:41:22 -0400, "Simen kjaeraas" <simen.kjaras gmail.com>
 said:

 Now, my favorite way of dealing with this: Where would I look for a
 binary heap if I wanted one? I would think of it as a container,
 and thus
 check std.container. If it was not there, I would use the search
 function
 to find it. I can invent reasons, but it's mostly grounded in learned
 names and categories.

 And if you were accustomed to the STL, you'd just look for a binary
 heap
 header to include instead of trying to philosophize about which
 category
 of things it fits best. That's why I suggested "std.binaryheap"
 earlier.
 (Could be "std.binheap" if you want it short.)

 I don't think this will scale. There are quite a few data structures
 out there, I'm afraid we'll have too many modules too soon.

 Andrei

 On the other hand, I don't think we want a 5Mb module, either. There's
 a very large number of potential containers, once you include all the
 esoteric ones.

 Beside the size of the module, there is also the issue of namespace
 pollution. If you import std.container and you end up with a dozen of
 different containers plus their related functions, it's quite a lot. (Of
 course you can use selective import, but that's a workaround, it's
 better not have the problem in the first place.)

Say you import std.container and you end up with two dozen containers: 
Array, SList, DList, BinaryHeap (groovy baby!), RedBlackTree, Trie, ...

Where are the clashes?

 (std.algorithm has the same problem, of course).
 It's a difficult tradeoff. I hope you're able to come up with a
 reasonable rationale.

 Personally I'd draw the line like this: a module should contain
 functions and types which are either tied in their implementation or
 which their use are correlated (if you use this thing you'll likely need
 this other one).

 For instance, a tree set and a tree map are almost the same, so they're
 good candidates to put together in a module (std.tree?).

Hm. On the other hand, if one is using a tree set that doesn't increase 
by a lot the likelihood of using a tree map.

 A linked list
 and an array are totally independent in implementation, and the need for
 one doesn't correlate with the need for another, so there's no reason to
 put them in the same module; they thus belong to separate modules.

 This "correlation in usage" rule should reduce the number of modules you
 need to import while also minimizing the number of symbols, the code
 size, and the dependencies of each module. What do you think?

I think we should look for a better rule.


Andrei

Jun 09 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-06-09 15:44:48 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 06/09/2010 02:34 PM, Michel Fortin wrote:
 Beside the size of the module, there is also the issue of namespace
 pollution. If you import std.container and you end up with a dozen of
 different containers plus their related functions, it's quite a lot. (Of
 course you can use selective import, but that's a workaround, it's
 better not have the problem in the first place.)

 
 Say you import std.container and you end up with two dozen containers: 
 Array, SList, DList, BinaryHeap (groovy baby!), RedBlackTree, Trie, ...
 
 Where are the clashes?

None... until you import something else that has a different Array 
type. If you import std.container because you need only BinaryHeap and 
doing this creates a clash with your own Array type you're going to 
have to disambiguate manually everywhere, rename your Array type 
everywhere, or to use a selective import. Not having the clash is 
better. And that clash will be quite frustrating if you need only 
BinaryHeap which has nothing to do with std.containers.Array.


 Personally I'd draw the line like this: a module should contain
 functions and types which are either tied in their implementation or
 which their use are correlated (if you use this thing you'll likely need
 this other one).
 
 For instance, a tree set and a tree map are almost the same, so they're
 good candidates to put together in a module (std.tree?).

 
 Hm. On the other hand, if one is using a tree set that doesn't increase 
 by a lot the likelihood of using a tree map.

Indeed, but since both share the same tree implementation, it'd be 
wasteful to make another module just for a few lines of code wrapping 
the same tree structure. The idea is to avoid creating a separate 
module for every small tweak of a core structure. There are multiple 
criteria which can be considered, my rule is based on two.


 A linked list
 and an array are totally independent in implementation, and the need for
 one doesn't correlate with the need for another, so there's no reason to
 put them in the same module; they thus belong to separate modules.
 
 This "correlation in usage" rule should reduce the number of modules you
 need to import while also minimizing the number of symbols, the code
 size, and the dependencies of each module. What do you think?

 
 I think we should look for a better rule.

The rule I proposed was really "correlation in usage *or* shared 
implementation"; any of the two allows things to belong to the same 
module. The correlation avoids unnecessary imports, and the shared 
implementation part is to avoid scattering things in too many modules, 
and also to make it easier to keep common implementation details 
private.

But it's just a guideline, it's what I'd try to do, not necessarily 
what I'd do every time. You don't have to follow a guideline when it 
makes more sense to do something else.


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jun 09 2010

Jonathan M Davis <jmdavisProg gmail.com> writes:

Michel Fortin wrote:

 Personally I'd draw the line like this: a module should contain
 functions and types which are either tied in their implementation or
 which their use are correlated (if you use this thing you'll likely
 need this other one).
 
 For instance, a tree set and a tree map are almost the same, so they're
 good candidates to put together in a module (std.tree?). A linked list
 and an array are totally independent in implementation, and the need
 for one doesn't correlate with the need for another, so there's no
 reason to put them in the same module; they thus belong to separate
 modules.
 
 This "correlation in usage" rule should reduce the number of modules
 you need to import while also minimizing the number of symbols, the
 code size, and the dependencies of each module. What do you think?
 

I'd expect that to lead to module explosion. It wouldn't be as bad as 
importing each class individually as typically happens in Java, but it would 
mean a lot more modules. It can certainly quickly lead to something close a 
module per container. I don't, personally, see much downside to throwing all 
of the containers in one module except that that's a lot to sort through 
when looking for a container documentation-wise. They definitely shouldn't 
be clashing with each other in any way, and ideally, you wouldn't be writing 
any code which clashed with them either since they'd be standard.

Breaking up larger modules like std.algorithm and std.container isn't 
necessarily a bad idea, but it definitely should be done in a manner which 
avoids module explosion.

Personally, I'd much prefer something closer to importing the entire 
standard library with one import than having to import each individual class 
one by one.

- Jonathan M Davis

Jun 09 2010

Walter Bright <newshound1 digitalmars.com> writes:

Andrei Alexandrescu wrote:
 On 06/09/2010 07:57 AM, Michel Fortin wrote:
 On 2010-06-08 17:41:22 -0400, "Simen kjaeraas" <simen.kjaras gmail.com>
 said:

 Now, my favorite way of dealing with this: Where would I look for a
 binary heap if I wanted one? I would think of it as a container, and 
 thus
 check std.container. If it was not there, I would use the search 
 function
 to find it. I can invent reasons, but it's mostly grounded in learned
 names and categories.

 And if you were accustomed to the STL, you'd just look for a binary heap
 header to include instead of trying to philosophize about which category
 of things it fits best. That's why I suggested "std.binaryheap" earlier.
 (Could be "std.binheap" if you want it short.)

 
 I don't think this will scale. There are quite a few data structures out 
 there, I'm afraid we'll have too many modules too soon.

One solution is to have std.container consist of the following:

public import std.containers.binaryheap;
public import std.containers.redblacktree;
...

Jun 09 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/09/2010 02:48 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 On 06/09/2010 07:57 AM, Michel Fortin wrote:
 On 2010-06-08 17:41:22 -0400, "Simen kjaeraas" <simen.kjaras gmail.com>
 said:

 Now, my favorite way of dealing with this: Where would I look for a
 binary heap if I wanted one? I would think of it as a container, and
 thus
 check std.container. If it was not there, I would use the search
 function
 to find it. I can invent reasons, but it's mostly grounded in learned
 names and categories.

 And if you were accustomed to the STL, you'd just look for a binary heap
 header to include instead of trying to philosophize about which category
 of things it fits best. That's why I suggested "std.binaryheap" earlier.
 (Could be "std.binheap" if you want it short.)

 I don't think this will scale. There are quite a few data structures
 out there, I'm afraid we'll have too many modules too soon.

 One solution is to have std.container consist of the following:

 public import std.containers.binaryheap;
 public import std.containers.redblacktree;

I wanted to do that but with the singular:

public import std.container.binaryheap;

So if someone imports std.container then they get them all, if someone 
imports std.container.something then they only get something.

Having a module and a package with the same name does not currently 
work, and I don't think there's a good rationale for that limitation.


Andrei

Jun 09 2010

Walter Bright <newshound1 digitalmars.com> writes:

Andrei Alexandrescu wrote:
 I wanted to do that but with the singular:
 
 public import std.container.binaryheap;
 
 So if someone imports std.container then they get them all, if someone 
 imports std.container.something then they only get something.
 
 Having a module and a package with the same name does not currently 
 work, and I don't think there's a good rationale for that limitation.

Because then it's overloading a name based on the contents of that name.

Jun 09 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 Having a module and a package with the same name does not currently  
 work, and I don't think there's a good rationale for that limitation.

//////////////
module a;
public import a.b;

struct b {
     static int c;
}

//////////////
module a.b;

int c;

//////////////
module foo;

import a;

a.b.c = 3; // Where's you lookup rules now?

-- 
Simen

Jun 09 2010

Leandro Lucarella <llucax gmail.com> writes:

Andrei Alexandrescu, el  9 de junio a las 14:59 me escribiste:
 On 06/09/2010 02:48 PM, Walter Bright wrote:
Andrei Alexandrescu wrote:
On 06/09/2010 07:57 AM, Michel Fortin wrote:
On 2010-06-08 17:41:22 -0400, "Simen kjaeraas" <simen.kjaras gmail.com>
said:

Now, my favorite way of dealing with this: Where would I look for a
binary heap if I wanted one? I would think of it as a container, and
thus
check std.container. If it was not there, I would use the search
function
to find it. I can invent reasons, but it's mostly grounded in learned
names and categories.

And if you were accustomed to the STL, you'd just look for a binary heap
header to include instead of trying to philosophize about which category
of things it fits best. That's why I suggested "std.binaryheap" earlier.
(Could be "std.binheap" if you want it short.)

I don't think this will scale. There are quite a few data structures
out there, I'm afraid we'll have too many modules too soon.

One solution is to have std.container consist of the following:

public import std.containers.binaryheap;
public import std.containers.redblacktree;

 
 I wanted to do that but with the singular:
 
 public import std.container.binaryheap;
 
 So if someone imports std.container then they get them all, if
 someone imports std.container.something then they only get
 something.
 
 Having a module and a package with the same name does not currently
 work, and I don't think there's a good rationale for that
 limitation.

That's why the convetion .all is used. If that worked I guess .all
wasn't necessary in the first place.

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
You can try the best you can
If you try the best you can
The best you can is good enough

Jun 09 2010

FeepingCreature <default_357-line yahoo.de> writes:

On 08.06.2010 23:41, Simen kjaeraas wrote:
 Now, my favorite way of dealing with this: Where would I look for a
 binary heap if I wanted one? I would think of it as a container, and thus
 check std.container. 

Law of Least Astonishment. Never forget.

Jun 13 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

FeepingCreature wrote:
 On 08.06.2010 23:41, Simen kjaeraas wrote:
 Now, my favorite way of dealing with this: Where would I look for a
 binary heap if I wanted one? I would think of it as a container, and thus
 check std.container. 

 
 Law of Least Astonishment. Never forget.

I made BinaryHeap a container, which I recently realized is the best 
decision. Unfortunately, the HTML documentation is not there:

http://www.digitalmars.com/d/2.0/phobos/std_container.html

Walter, could you please fix this?


Thanks,

Andrei

Jun 13 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-06-08 11:09:44 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 But where should I put it then? I thought it would be even more 
 confusing if I put something in std.container that wait a minute, is 
 not a container.

	import std.binheap;

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jun 08 2010

KennyTM~ <kennytm gmail.com> writes:

On Jun 9, 10 02:02, Michel Fortin wrote:
 On 2010-06-08 11:09:44 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 But where should I put it then? I thought it would be even more
 confusing if I put something in std.container that wait a minute, is
 not a container.

 import std.binheap;

vote++;

Jun 08 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 08 Jun 2010 10:47:33 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 I finalized BinaryHeap. It's pretty cool - it builds a forward range on  
 top of a random-access range - typically T[] - or a random-access  
 container - typically Array!T.

 The difference is simple - if you build on top of a range the heap can't  
 grow beyond the size of that range. Building on top of a container makes  
 the heap growable.

 Making BinaryHeap a range is actually pretty cool - just walking the  
 heap is tantamount to lazily sorting the container. (Of course  
 BinaryHeap has primitives in addition to the four range primitives.)

 Do you agree with putting BinaryHeap in std.range (as opposed to  
 std.container)?

If BinaryHeap can be added to, how is it a range?  I would suspect that  
BinaryHeap can be in std.range, but a BinaryHeap that takes a container as  
an input is definitely not a range IMO.  You might say the same about an  
array, but an array is special in that if I append to one array, it does  
not affect the other.

I don't know where it belongs.  To me, a range is a type that in itself  
cannot affect the topology of the data structure.  It's like a window into  
the data structure that provides a common interface.  A container is the  
owner of the data structure, and can provide ranges over its data.  This  
may be an inadequate view, but its worked for me so far.

-Steve

Jun 08 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/08/2010 10:04 AM, Steven Schveighoffer wrote:
 On Tue, 08 Jun 2010 10:47:33 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 I finalized BinaryHeap. It's pretty cool - it builds a forward range
 on top of a random-access range - typically T[] - or a random-access
 container - typically Array!T.

 The difference is simple - if you build on top of a range the heap
 can't grow beyond the size of that range. Building on top of a
 container makes the heap growable.

 Making BinaryHeap a range is actually pretty cool - just walking the
 heap is tantamount to lazily sorting the container. (Of course
 BinaryHeap has primitives in addition to the four range primitives.)

 Do you agree with putting BinaryHeap in std.range (as opposed to
 std.container)?

 If BinaryHeap can be added to, how is it a range?

It is a forward range and also an output range by supporting put().

 I would suspect that
 BinaryHeap can be in std.range, but a BinaryHeap that takes a container
 as an input is definitely not a range IMO.

Well it still is. I mean isForwardRange!(BinaryHeap!(Array!int)) is true.

Generally, a range with benefits (i.e. extra functions) is still a range 
if it implements the required range primitives with the required semantics.

 You might say the same about
 an array, but an array is special in that if I append to one array, it
 does not affect the other.

 I don't know where it belongs. To me, a range is a type that in itself
 cannot affect the topology of the data structure. It's like a window
 into the data structure that provides a common interface. A container is
 the owner of the data structure, and can provide ranges over its data.
 This may be an inadequate view, but its worked for me so far.

Actually if a heap is built on top of a container, it can affect its 
topology because it has the sensible primitive of adding to the heap.


Andrei

Jun 08 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 08 Jun 2010 11:12:01 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 06/08/2010 10:04 AM, Steven Schveighoffer wrote:
 On Tue, 08 Jun 2010 10:47:33 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 I finalized BinaryHeap. It's pretty cool - it builds a forward range
 on top of a random-access range - typically T[] - or a random-access
 container - typically Array!T.

 The difference is simple - if you build on top of a range the heap
 can't grow beyond the size of that range. Building on top of a
 container makes the heap growable.

 Making BinaryHeap a range is actually pretty cool - just walking the
 heap is tantamount to lazily sorting the container. (Of course
 BinaryHeap has primitives in addition to the four range primitives.)

 Do you agree with putting BinaryHeap in std.range (as opposed to
 std.container)?

 If BinaryHeap can be added to, how is it a range?

 It is a forward range and also an output range by supporting put().

 I would suspect that
 BinaryHeap can be in std.range, but a BinaryHeap that takes a container
 as an input is definitely not a range IMO.

 Well it still is. I mean isForwardRange!(BinaryHeap!(Array!int)) is true.

 Generally, a range with benefits (i.e. extra functions) is still a range  
 if it implements the required range primitives with the required  
 semantics.

Functionally yes, something that supports all the range functions can be  
considered a range, but semantically, to me anyways, I always consider a  
range to be a something that is a view into a container.  Something that  
destroys the container as it iterates, or can affect the topology of the  
original container seems like it is a different animal.  You could have  
called std.container's removeFront popFront, and then containers could be  
"ranges" too.  Very slow and bizarre ranges :)

This is part of the reason why I have trouble accepting I/O ranges -- they  
just behave so much differently than other ranges it seems like they  
should have their own interface.

Have you ever seen that movie Wall-E?  There's a scene where he finds a  
spork, and in trying to place it, he gets to his bin of forks and spoons,  
with a cup for each type.  After going back and forth between spoons and  
forks, he just places it on the shelf between the two cups.  Maybe  
BinaryHeap is a crange :)

 You might say the same about
 an array, but an array is special in that if I append to one array, it
 does not affect the other.

 I don't know where it belongs. To me, a range is a type that in itself
 cannot affect the topology of the data structure. It's like a window
 into the data structure that provides a common interface. A container is
 the owner of the data structure, and can provide ranges over its data.
 This may be an inadequate view, but its worked for me so far.

 Actually if a heap is built on top of a container, it can affect its  
 topology because it has the sensible primitive of adding to the heap.

Yes, that is my point of why at least growable heaps should not be  
considered ranges.

-Steve

Jun 08 2010

jlquinn <jlquinn optonline.net> writes:

Steven Schveighoffer Wrote:
 You might say the same about
 an array, but an array is special in that if I append to one array, it
 does not affect the other.

 I don't know where it belongs. To me, a range is a type that in itself
 cannot affect the topology of the data structure. It's like a window
 into the data structure that provides a common interface. A container is
 the owner of the data structure, and can provide ranges over its data.
 This may be an inadequate view, but its worked for me so far.

 Actually if a heap is built on top of a container, it can affect its  
 topology because it has the sensible primitive of adding to the heap.

 
 Yes, that is my point of why at least growable heaps should not be  
 considered ranges.

I second this point.  In STL parlance, BinaryHeap would be a container adapter.
 Also, to me a range feels morally equivalent to an iterator and binary heap
feels like a data structure that you can iterate over.  And the fact that you
can insert objects into it and later remove them in a manipulated order makes
it feel even more like an active component and thus a container.

So I'd naturally go looking for it in std.container and be a little confused to
find it in std.range.  It would become a quirk that one adjusts to.

Jerry

Jun 08 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/08/2010 12:29 PM, jlquinn wrote:
 Steven Schveighoffer Wrote:
 You might say the same about an array, but an array is special
 in that if I append to one array, it does not affect the
 other.

 I don't know where it belongs. To me, a range is a type that in
 itself cannot affect the topology of the data structure. It's
 like a window into the data structure that provides a common
 interface. A container is the owner of the data structure, and
 can provide ranges over its data. This may be an inadequate
 view, but its worked for me so far.

 Actually if a heap is built on top of a container, it can affect
 its topology because it has the sensible primitive of adding to
 the heap.

 Yes, that is my point of why at least growable heaps should not be
 considered ranges.

 I second this point.  In STL parlance, BinaryHeap would be a
 container adapter.  Also, to me a range feels morally equivalent to
 an iterator and binary heap feels like a data structure that you can
 iterate over.  And the fact that you can insert objects into it and
 later remove them in a manipulated order makes it feel even more like
 an active component and thus a container.

 So I'd naturally go looking for it in std.container and be a little
 confused to find it in std.range.  It would become a quirk that one
 adjusts to.

 Jerry

Unfortunately things are not that simple.

A BinaryHeap built on top of a range can _still_ grow, just not larger 
than the size of the range. If you look through the code, you'll see 
that a constructor takes a range and an initial size.

Oddly enough, this is a very frequent of binary heaps: you have a range 
and you want to take the top N. So I wouldn't want to discount that.

So, what we have here are binary heaps having ranges as support (can 
only grow up to the size of the range) and binary heaps having 
containers as support (can grow with the container).


Andrei

Jun 08 2010

D Programming

C/C++ Programming

Other

digitalmars.D - BinaryHeap is a range so it goes in std.range. Agree?