## digitalmars.D - Heap: container or range?

• Andrei Alexandrescu (15/15) Jan 29 2009 A "computer science heap" is a structure that offers fast access to the
• Daniel Keep (6/27) Jan 29 2009 I suppose it comes down to whether or not the typical case is to either
• Bill Baxter (12/26) Jan 29 2009 In response to your previous question of how to distinguish from a
• Andrei Alexandrescu (10/41) Jan 29 2009 Ah, never mind all that. I realized that I can't have a heap range. This...
• Bill Baxter (13/62) Jan 29 2009 Messed 'h' up or messed 'a' up?
• Andrei Alexandrescu (8/31) Jan 29 2009 'a' is messed up (mutated) by definition. The problem is that after h
• Bill Baxter (5/38) Jan 29 2009 Ok. great. I thought that the slicing notation to make a sequence
• Christopher Wright (3/22) Jan 29 2009 I'm not seeing that. take should iterate through _h_, which will pop
• Jason House (3/48) Jan 29 2009 I would have expected index take(h,5)[0] to be 16...
• Andrei Alexandrescu (7/11) Jan 29 2009 Both ways have advantages and disadvantages. My early experiments
• Sean Kelly (11/21) Jan 29 2009 Hm... so assuming this is a min heap, I have:
• Sean Kelly (6/30) Jan 29 2009 Oh, I would also expect:
• Andrei Alexandrescu (4/39) Jan 29 2009 Yah, that's what I meant. h still thinks its length is 10 and that those...
• Brad Roberts (5/46) Jan 29 2009 Since take is a mutator, that implies that h needs to be a reference
• Sean Kelly (7/7) Jan 29 2009 I don't know if it matters, but I added the heap routines to Tango
• Andrei Alexandrescu (7/13) Jan 30 2009 A great use for heaps is topNCopy: given an input range (!), give me
• Sergey Gromov (11/49) Jan 30 2009 Actually, if ranges are by value, I'd expect
• Andrei Alexandrescu (13/62) Jan 30 2009 Me too. It does work indeed BUT NOT FOR INPUT RANGES. So in short you
• Denis Koroskin (3/13) Jan 30 2009 Same here. I feel that take (and some other adaptors) should accept rang...
• Andrei Alexandrescu (23/35) Jan 30 2009 While I'm not 100% clear on what take should do, this particular
• Brad Roberts (6/32) Jan 30 2009 I hate to come back to naming, but I think that's a major part of what's...
• Andrei Alexandrescu (3/38) Jan 30 2009 Names are important. "take" is taken from Haskell.
• Derek Parnell (6/7) Jan 30 2009 No it hasn't. Haskell still has it so its been "copied" from Haskell. :-...
• BCS (2/6) Jan 29 2009 what would be the problem of making it (optionaly) both?
• Yigal Chripun (3/18) Jan 29 2009 regarding your previous question:
• Bill Baxter (6/8) Jan 29 2009 I the distinction is that a Heap is one way to implement a Priority
• Don (16/37) Jan 30 2009 I think this depends on whether heap operations are required (and
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
```A "computer science heap" is a structure that offers fast access to the
largest element and fast extraction of it (which in turn provides access
to the next largest element etc.).

I'm just done working on the heap in std.algorithm. Now, it turns out
that heap supports both a meaningful definition as a full-fledged
container, and a beautiful definition as a range.

If Heap is a range, you initiate it with another range, which Heap
organizes in the heap manner. Then, successive calls to next() nicely
extract elements starting from the largest. If the underlying range
supports put(), then Heap also supports put() to insert into the heap.

Heap as a container would offer similar primitives but in addition would
"own" its data (would call destructors upon destruction, and would
support value copying).

What do you think? Should I make Heap a container or a range?

Andrei
```
Jan 29 2009
Daniel Keep <daniel.keep.lists gmail.com> writes:
```Andrei Alexandrescu wrote:
A "computer science heap" is a structure that offers fast access to the
largest element and fast extraction of it (which in turn provides access
to the next largest element etc.).

I'm just done working on the heap in std.algorithm. Now, it turns out
that heap supports both a meaningful definition as a full-fledged
container, and a beautiful definition as a range.

If Heap is a range, you initiate it with another range, which Heap
organizes in the heap manner. Then, successive calls to next() nicely
extract elements starting from the largest. If the underlying range
supports put(), then Heap also supports put() to insert into the heap.

Heap as a container would offer similar primitives but in addition would
"own" its data (would call destructors upon destruction, and would
support value copying).

What do you think? Should I make Heap a container or a range?

Andrei

I suppose it comes down to whether or not the typical case is to either
have a heap or heap-view of another container.  I'd suspect the first,
to be honest.

You could always just have Heap and HeapView.  :)

-- Daniel
```
Jan 29 2009
Bill Baxter <wbaxter gmail.com> writes:
```On Fri, Jan 30, 2009 at 11:49 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
A "computer science heap" is a structure that offers fast access to the
largest element and fast extraction of it (which in turn provides access to
the next largest element etc.).

In response to your previous question of how to distinguish from a
memory heap, you can use the specific name for the kind of heap it is,
like BinaryHeap, Binomial Heap, Fibbonocci Heap, etc.

I'm just done working on the heap in std.algorithm. Now, it turns out that
heap supports both a meaningful definition as a full-fledged container, and
a beautiful definition as a range.

If Heap is a range, you initiate it with another range, which Heap organizes
in the heap manner. Then, successive calls to next() nicely extract elements
starting from the largest. If the underlying range supports put(), then Heap
also supports put() to insert into the heap.

Heap as a container would offer similar primitives but in addition would
"own" its data (would call destructors upon destruction, and would support
value copying).

What do you think? Should I make Heap a container or a range?

Both sound useful depending on circumstances.  One provides the
all-in-one convenient solution, the other is more of a "non-invasive
heap".

If you emphasize the non-invasive version, then creating the
all-in-one version out of that should be pretty easy.  But you can't
take an all-in-one and turn it non-invasive so easily.

--bb
```
Jan 29 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
```Bill Baxter wrote:
On Fri, Jan 30, 2009 at 11:49 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
A "computer science heap" is a structure that offers fast access to the
largest element and fast extraction of it (which in turn provides access to
the next largest element etc.).

In response to your previous question of how to distinguish from a
memory heap, you can use the specific name for the kind of heap it is,
like BinaryHeap, Binomial Heap, Fibbonocci Heap, etc.

BinaryHeap it is. Thanks!

I'm just done working on the heap in std.algorithm. Now, it turns out that
heap supports both a meaningful definition as a full-fledged container, and
a beautiful definition as a range.

If Heap is a range, you initiate it with another range, which Heap organizes
in the heap manner. Then, successive calls to next() nicely extract elements
starting from the largest. If the underlying range supports put(), then Heap
also supports put() to insert into the heap.

Heap as a container would offer similar primitives but in addition would
"own" its data (would call destructors upon destruction, and would support
value copying).

What do you think? Should I make Heap a container or a range?

Both sound useful depending on circumstances.  One provides the
all-in-one convenient solution, the other is more of a "non-invasive
heap".

If you emphasize the non-invasive version, then creating the
all-in-one version out of that should be pretty easy.  But you can't
take an all-in-one and turn it non-invasive so easily.

Ah, never mind all that. I realized that I can't have a heap range. This
is because heaps must mutate the underlying range in-place and any copy
will mess the heap up. Here's the example that clarify it for me:

int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
auto h = Heap!(int[])(a);
assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]);

At this point the mutations done by take messed up h already!!

Andrei
```
Jan 29 2009
Bill Baxter <wbaxter gmail.com> writes:
```On Fri, Jan 30, 2009 at 12:14 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
Bill Baxter wrote:
On Fri, Jan 30, 2009 at 11:49 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
A "computer science heap" is a structure that offers fast access to the
largest element and fast extraction of it (which in turn provides access
to
the next largest element etc.).

In response to your previous question of how to distinguish from a
memory heap, you can use the specific name for the kind of heap it is,
like BinaryHeap, Binomial Heap, Fibbonocci Heap, etc.

BinaryHeap it is. Thanks!

I'm just done working on the heap in std.algorithm. Now, it turns out
that
heap supports both a meaningful definition as a full-fledged container,
and
a beautiful definition as a range.

If Heap is a range, you initiate it with another range, which Heap
organizes
in the heap manner. Then, successive calls to next() nicely extract
elements
starting from the largest. If the underlying range supports put(), then
Heap
also supports put() to insert into the heap.

Heap as a container would offer similar primitives but in addition would
"own" its data (would call destructors upon destruction, and would
support
value copying).

What do you think? Should I make Heap a container or a range?

Both sound useful depending on circumstances.  One provides the
all-in-one convenient solution, the other is more of a "non-invasive
heap".

If you emphasize the non-invasive version, then creating the
all-in-one version out of that should be pretty easy.  But you can't
take an all-in-one and turn it non-invasive so easily.

Ah, never mind all that. I realized that I can't have a heap range. This is
because heaps must mutate the underlying range in-place and any copy will
mess the heap up. Here's the example that clarify it for me:

int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
auto h = Heap!(int[])(a);
assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]);

At this point the mutations done by take messed up h already!!

Messed 'h' up or messed 'a' up?

Anyway, in that case it would be nice if you can easily run the heap
indirectly on an index or pointer buffer.

Will something like this work?
HeavyElement[] arrayOfHeavies;
auto h = Heap!(int[], (int i, int j){return
arrayOfHeavies[i]<arrayOfHeavies[j];} )( 0..arrayOfHeavies.length )

Now   arrayOfHeavies[h.pop()] should give you the smallest element of
the array of heavies, without ever having to copy/swap anything but
ints.

--bb
```
Jan 29 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
```Bill Baxter wrote:
On Fri, Jan 30, 2009 at 12:14 PM, Andrei Alexandrescu
Ah, never mind all that. I realized that I can't have a heap range. This is
because heaps must mutate the underlying range in-place and any copy will
mess the heap up. Here's the example that clarify it for me:

int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
auto h = Heap!(int[])(a);
assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]);

At this point the mutations done by take messed up h already!!

Messed 'h' up or messed 'a' up?

'a' is messed up (mutated) by definition. The problem is that after h
was copied to do take's deed, it got messed up (i.e., doesn't respect
the heap property anymore).

Anyway, in that case it would be nice if you can easily run the heap
indirectly on an index or pointer buffer.

Will something like this work?
HeavyElement[] arrayOfHeavies;
auto h = Heap!(int[], (int i, int j){return
arrayOfHeavies[i]<arrayOfHeavies[j];} )( 0..arrayOfHeavies.length )

Now   arrayOfHeavies[h.pop()] should give you the smallest element of
the array of heavies, without ever having to copy/swap anything but
ints.

Yah, that should work. You just need to initially fill the heap with the
numbers 0 to arrayOfHeavies.length. I presume that's what you meant with
the illegal slicing notation :o).

Andrei
```
Jan 29 2009
Bill Baxter <wbaxter gmail.com> writes:
```On Fri, Jan 30, 2009 at 1:17 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
Bill Baxter wrote:
On Fri, Jan 30, 2009 at 12:14 PM, Andrei Alexandrescu
Ah, never mind all that. I realized that I can't have a heap range. This
is
because heaps must mutate the underlying range in-place and any copy will
mess the heap up. Here's the example that clarify it for me:

int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
auto h = Heap!(int[])(a);
assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]);

At this point the mutations done by take messed up h already!!

Messed 'h' up or messed 'a' up?

'a' is messed up (mutated) by definition. The problem is that after h was
copied to do take's deed, it got messed up (i.e., doesn't respect the heap
property anymore).

Anyway, in that case it would be nice if you can easily run the heap
indirectly on an index or pointer buffer.

Will something like this work?
HeavyElement[] arrayOfHeavies;
auto h = Heap!(int[], (int i, int j){return
arrayOfHeavies[i]<arrayOfHeavies[j];} )( 0..arrayOfHeavies.length )

Now   arrayOfHeavies[h.pop()] should give you the smallest element of
the array of heavies, without ever having to copy/swap anything but
ints.

Yah, that should work. You just need to initially fill the heap with the
numbers 0 to arrayOfHeavies.length. I presume that's what you meant with the
illegal slicing notation :o).

Ok. great.  I thought that the slicing notation to make a sequence
worked everywhere now in D2.  I guess it's only inside a foreach.

--bb
```
Jan 29 2009
Christopher Wright <dhasenan gmail.com> writes:
```Andrei Alexandrescu wrote:
Bill Baxter wrote:
On Fri, Jan 30, 2009 at 12:14 PM, Andrei Alexandrescu
Ah, never mind all that. I realized that I can't have a heap range.
This is
because heaps must mutate the underlying range in-place and any copy
will
mess the heap up. Here's the example that clarify it for me:

int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
auto h = Heap!(int[])(a);
assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]);

At this point the mutations done by take messed up h already!!

Messed 'h' up or messed 'a' up?

'a' is messed up (mutated) by definition. The problem is that after h
was copied to do take's deed, it got messed up (i.e., doesn't respect
the heap property anymore).

I'm not seeing that. take should iterate through _h_, which will pop
elements from _a_ (or a copy thereof) while maintaining the heap invariant.
```
Jan 29 2009
Jason House <jason.james.house gmail.com> writes:
```Andrei Alexandrescu Wrote:

Bill Baxter wrote:
On Fri, Jan 30, 2009 at 11:49 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
A "computer science heap" is a structure that offers fast access to the
largest element and fast extraction of it (which in turn provides access to
the next largest element etc.).

In response to your previous question of how to distinguish from a
memory heap, you can use the specific name for the kind of heap it is,
like BinaryHeap, Binomial Heap, Fibbonocci Heap, etc.

BinaryHeap it is. Thanks!

I'm just done working on the heap in std.algorithm. Now, it turns out that
heap supports both a meaningful definition as a full-fledged container, and
a beautiful definition as a range.

If Heap is a range, you initiate it with another range, which Heap organizes
in the heap manner. Then, successive calls to next() nicely extract elements
starting from the largest. If the underlying range supports put(), then Heap
also supports put() to insert into the heap.

Heap as a container would offer similar primitives but in addition would
"own" its data (would call destructors upon destruction, and would support
value copying).

What do you think? Should I make Heap a container or a range?

Both sound useful depending on circumstances.  One provides the
all-in-one convenient solution, the other is more of a "non-invasive
heap".

If you emphasize the non-invasive version, then creating the
all-in-one version out of that should be pretty easy.  But you can't
take an all-in-one and turn it non-invasive so easily.

Ah, never mind all that. I realized that I can't have a heap range. This
is because heaps must mutate the underlying range in-place and any copy
will mess the heap up. Here's the example that clarify it for me:

int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
auto h = Heap!(int[])(a);
assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]);

At this point the mutations done by take messed up h already!!

I would have expected index take(h,5)[0] to be 16...

I still have yet to come to terms with passing ranges by value. I would expect
take to modify my range. I naturally expect heaps to be destructive as elements
are taken out. I also expect ranges to shrink. I don't see any issue...
```
Jan 29 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
``` I still have yet to come to terms with passing ranges by value. I
would expect take to modify my range. I naturally expect heaps to be
destructive as elements are taken out. I also expect ranges to
shrink. I don't see any issue...

involved passing by reference and it was an absolute mess: I'd forget a
"ref" here and there with weird results, or I'd forget to save a copy of
my range and I'd find is shrunk like an dehydrated fig in no time. One
advantage of pass by value is that code is shorter and nicer - no need
for a lot of temporaries.

Andrei
```
Jan 29 2009
Sean Kelly <sean invisibleduck.org> writes:
```Andrei Alexandrescu wrote:

Ah, never mind all that. I realized that I can't have a heap range. This
is because heaps must mutate the underlying range in-place and any copy
will mess the heap up. Here's the example that clarify it for me:

int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
auto h = Heap!(int[])(a);
assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]);

At this point the mutations done by take messed up h already!!

Hm... so assuming this is a min heap, I have:

a = [4,1,3,2,16,9,10,14,8,7];
h = [1,2,3,4,7,9,10,14,8,16];
take(5, h);
h = [8,10,9,16,14];

Shouldn't take call next repeatedly on the heap, which will in turn
perform a popHeap operation?  It's difficult to be sure what the result
will be after take(5,h) occurs, but it should still be a valid heap,
correct?

Sean
```
Jan 29 2009
Sean Kelly <sean invisibleduck.org> writes:
```Sean Kelly wrote:
Andrei Alexandrescu wrote:
Ah, never mind all that. I realized that I can't have a heap range.
This is because heaps must mutate the underlying range in-place and
any copy will mess the heap up. Here's the example that clarify it for
me:

int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
auto h = Heap!(int[])(a);
assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]);

At this point the mutations done by take messed up h already!!

Hm... so assuming this is a min heap, I have:

a = [4,1,3,2,16,9,10,14,8,7];
h = [1,2,3,4,7,9,10,14,8,16];
take(5, h);
h = [8,10,9,16,14];

Shouldn't take call next repeatedly on the heap, which will in turn
perform a popHeap operation?  It's difficult to be sure what the result
will be after take(5,h) occurs, but it should still be a valid heap,
correct?

Oh, I would also expect:

a = [8,10,9,16,14, garbage];

Since it isn't aware of the length adjustment.  Perhaps this is what you
meant?

Sean
```
Jan 29 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
```Sean Kelly wrote:
Sean Kelly wrote:
Andrei Alexandrescu wrote:
Ah, never mind all that. I realized that I can't have a heap range.
This is because heaps must mutate the underlying range in-place and
any copy will mess the heap up. Here's the example that clarify it
for me:

int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
auto h = Heap!(int[])(a);
assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]);

At this point the mutations done by take messed up h already!!

Hm... so assuming this is a min heap, I have:

a = [4,1,3,2,16,9,10,14,8,7];
h = [1,2,3,4,7,9,10,14,8,16];
take(5, h);
h = [8,10,9,16,14];

Shouldn't take call next repeatedly on the heap, which will in turn
perform a popHeap operation?  It's difficult to be sure what the
result will be after take(5,h) occurs, but it should still be a valid
heap, correct?

Oh, I would also expect:

a = [8,10,9,16,14, garbage];

Since it isn't aware of the length adjustment.  Perhaps this is what you
meant?

Sean

Yah, that's what I meant. h still thinks its length is 10 and that those
10 elements respect the heap property.

Andrei
```
Jan 29 2009
```Andrei Alexandrescu wrote:
Sean Kelly wrote:
Sean Kelly wrote:
Andrei Alexandrescu wrote:
Ah, never mind all that. I realized that I can't have a heap range.
This is because heaps must mutate the underlying range in-place and
any copy will mess the heap up. Here's the example that clarify it
for me:

int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
auto h = Heap!(int[])(a);
assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]);

At this point the mutations done by take messed up h already!!

Hm... so assuming this is a min heap, I have:

a = [4,1,3,2,16,9,10,14,8,7];
h = [1,2,3,4,7,9,10,14,8,16];
take(5, h);
h = [8,10,9,16,14];

Shouldn't take call next repeatedly on the heap, which will in turn
perform a popHeap operation?  It's difficult to be sure what the
result will be after take(5,h) occurs, but it should still be a valid
heap, correct?

Oh, I would also expect:

a = [8,10,9,16,14, garbage];

Since it isn't aware of the length adjustment.  Perhaps this is what
you meant?

Sean

Yah, that's what I meant. h still thinks its length is 10 and that those
10 elements respect the heap property.

Andrei

Since take is a mutator, that implies that h needs to be a reference
type or a by ref value, right?  So, why not make it so?

Later,
```
Jan 29 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
```Brad Roberts wrote:
Since take is a mutator, that implies that h needs to be a reference
type or a by ref value, right?  So, why not make it so?

That's a great question, and one that I don't have a definitive answer
to. There are several reasons for which by-ref manipulation is dicey:

1. Composition is harder.

Consider:

foreach (e; take(10, map!("a*a")(range))) ...

In this case, take is passed an rvalue. Rvalues can't and shouldn't bind
to references (I think C++'s rule of binding rvalues to const references
was a subtle disaster that necessitated endless contortions in defining
rvalue references).

So then we need to explicitate:

auto t = map!("a*a")(range);
foreach (e; take(10, t)) ...

2. Difficulties in coding.

My first attempt at ranges was in some formatting code. A by-ref
doctrine requires that "ref" is added religiously to the trafficked
values. That's not much of a problem except that, if you forget, the
code still compiles and runs as told, except that the way it's told is
not the way you meant.

3. More difficulties in coding

Coding with more non-local dependencies is harder than coding that
creates new, relatively independent entities. The examples given so far
were as simple as:

int[] a = [ 1, 2, 3 ];
take(1, a);
// WHY IS A NOT [2, 3] NOW???

In reality, code is almost never that simple and the reality is that it
becomes exceedingly harder to assess the state of a range after several
other lazy ranges mutate it at their leisure.

That's what I have so far.

On the other hand, other functions are a tougher call. Consider "drop".
drop(n, range) throws away at most n elements from the range. There are
a few possible ways to define drop:

a) By reference
int[] a = [ 1, 2, 3 ];
drop(1, a);
assert(a == [ 2, 3]);

b) By value
int[] a = [ 1, 2, 3 ];
drop(1, a);
assert(a == [ 1, 2, 3]);
a = drop(1, a);
assert(a == [ 2, 3]);

Unlike take, drop is eager, hence its by-ref implementation doesn't
raise as many problems. I'm not sure what's the best choice for drop.

Andrei
```
Jan 30 2009
Denis Koroskin <2korden gmail.com> writes:
```Andrei Alexandrescu Wrote:

[snip]

int[] a = [ 1, 2, 3 ];
take(1, a);

Why is a not [2, 3] now?
```
Jan 30 2009
Sean Kelly <sean invisibleduck.org> writes:
```I don't know if it matters, but I added the heap routines to Tango
because I wanted heapsort, so they kind of came for free.  Perhaps a key
distinction between ranges and algorithms is that algorithms may mutate
the underlying data, but ranges may not?  I've been trying to think of
another example of a mutating range and I haven't come up with one
yet... unless you consider input ranges mutating, I suppose.

Sean
```
Jan 29 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
```Sean Kelly wrote:
I don't know if it matters, but I added the heap routines to Tango
because I wanted heapsort, so they kind of came for free.  Perhaps a key
distinction between ranges and algorithms is that algorithms may mutate
the underlying data, but ranges may not?  I've been trying to think of
another example of a mutating range and I haven't come up with one
yet... unless you consider input ranges mutating, I suppose.

A great use for heaps is topNCopy: given an input range (!), give me
back the top N items in it, fast. The way that works is by maintaining a
heap in the return value.

I'm sure interesting cases will arise with ranges that mutate their
host, but heap doesn't look like a good candidate.

Andrei
```
Jan 30 2009
Sergey Gromov <snake.scaly gmail.com> writes:
```Thu, 29 Jan 2009 21:26:39 -0800, Andrei Alexandrescu wrote:

Sean Kelly wrote:
Sean Kelly wrote:
Andrei Alexandrescu wrote:
Ah, never mind all that. I realized that I can't have a heap range.
This is because heaps must mutate the underlying range in-place and
any copy will mess the heap up. Here's the example that clarify it
for me:

int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
auto h = Heap!(int[])(a);
assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]);

At this point the mutations done by take messed up h already!!

Hm... so assuming this is a min heap, I have:

a = [4,1,3,2,16,9,10,14,8,7];
h = [1,2,3,4,7,9,10,14,8,16];
take(5, h);
h = [8,10,9,16,14];

Shouldn't take call next repeatedly on the heap, which will in turn
perform a popHeap operation?  It's difficult to be sure what the
result will be after take(5,h) occurs, but it should still be a valid
heap, correct?

Oh, I would also expect:

a = [8,10,9,16,14, garbage];

Since it isn't aware of the length adjustment.  Perhaps this is what you
meant?

Sean

Yah, that's what I meant. h still thinks its length is 10 and that those
10 elements respect the heap property.

Actually, if ranges are by value, I'd expect

assert(equal(take(5, h) == take(5, h)));

OTOH, I'd expect something like the following to also work:

auto f = new File("blah.txt");
auto top = take(10, f.lines);

I.e. I'd expect some ranges to be Translators, and others to be
Mutators.  Translators are read-only views into underlying anything
which may involve some expensive bookkeeping.  Mutators are basically
references into a particular container and make sure that the container
and other Mutators stay consistent.
```
Jan 30 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
```Sergey Gromov wrote:
Thu, 29 Jan 2009 21:26:39 -0800, Andrei Alexandrescu wrote:

Sean Kelly wrote:
Sean Kelly wrote:
Andrei Alexandrescu wrote:
Ah, never mind all that. I realized that I can't have a heap range.
This is because heaps must mutate the underlying range in-place and
any copy will mess the heap up. Here's the example that clarify it
for me:

int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
auto h = Heap!(int[])(a);
assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]);

At this point the mutations done by take messed up h already!!

Hm... so assuming this is a min heap, I have:

a = [4,1,3,2,16,9,10,14,8,7];
h = [1,2,3,4,7,9,10,14,8,16];
take(5, h);
h = [8,10,9,16,14];

Shouldn't take call next repeatedly on the heap, which will in turn
perform a popHeap operation?  It's difficult to be sure what the
result will be after take(5,h) occurs, but it should still be a valid
heap, correct?

Oh, I would also expect:

a = [8,10,9,16,14, garbage];

Since it isn't aware of the length adjustment.  Perhaps this is what you
meant?

Sean

Yah, that's what I meant. h still thinks its length is 10 and that those
10 elements respect the heap property.

Actually, if ranges are by value, I'd expect

assert(equal(take(5, h) == take(5, h)));

Me too. It does work indeed BUT NOT FOR INPUT RANGES. So in short you
are showing yet another reason why heaps can't be good ranges.

OTOH, I'd expect something like the following to also work:

auto f = new File("blah.txt");
auto top = take(10, f.lines);

That also works (just that in the new std.stdio File is a struct, so
it's not dynamically allocated). However, f.lines is understandably an
input range so this will NOT work:

auto f = new File("blah.txt");
assert(equal(take(5, f.lines) == take(5, f.lines)));

The two f.lines instances operate on the same underlying file handle, so

I.e. I'd expect some ranges to be Translators, and others to be
Mutators.  Translators are read-only views into underlying anything
which may involve some expensive bookkeeping.  Mutators are basically
references into a particular container and make sure that the container
and other Mutators stay consistent.

I think the distinction is input vs. forward and mutable vs. immutable
ranges (the latter distinction is not checked in yet).

Andrei
```
Jan 30 2009
Denis Koroskin <2korden gmail.com> writes:
```Sergey Gromov Wrote:
[snip]
OTOH, I'd expect something like the following to also work:

auto f = new File("blah.txt");
auto top = take(10, f.lines);

Same here. I feel that take (and some other adaptors) should accept range by
reference. One of the advantages of ranges over opApply is that you can start
foreach, break somewhere (throw an exception, recover) and then continue. It
also means that the range is mutated inside foreach. If foreach mutates the
range, so can other algorithms that operate on ranges do.

I.e. I'd expect some ranges to be Translators, and others to be
Mutators.  Translators are read-only views into underlying anything
which may involve some expensive bookkeeping.  Mutators are basically
references into a particular container and make sure that the container
and other Mutators stay consistent.

```
Jan 30 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
```Denis Koroskin wrote:
Sergey Gromov Wrote: [snip]
OTOH, I'd expect something like the following to also work:

auto f = new File("blah.txt"); auto top = take(10, f.lines);

Same here. I feel that take (and some other adaptors) should accept
range by reference. One of the advantages of ranges over opApply is
that you can start foreach, break somewhere (throw an exception,
recover) and then continue. It also means that the range is mutated
inside foreach. If foreach mutates the range, so can other algorithms
that operate on ranges do.

While I'm not 100% clear on what take should do, this particular
argument is not that compelling. I really mean not compelling at all.

Consider:

int[] a = [ 1, 2, 3, 4, 5];
foreach (e; a)
{
if (e == 3) break;
}
assert(a == [4, 5]);

Do you seriously expect that to be the case? Of course not. However, you
would inconsistently expect this to happen:

int[] a = [ 1, 2, 3, 4, 5];
foreach (e; take(4, a))
{
if (e == 3) break;
}
assert(a == [4, 5]);

Ranges are modeled to correspond to slices. Slices have few operations
that manipulate them by reference (notably the controversial ~=) and
range are made to be unsurprising when compared to slices. They are
really extensions of slices.

Andrei
```
Jan 30 2009
```On Fri, 30 Jan 2009, Andrei Alexandrescu wrote:

Consider:

int[] a = [ 1, 2, 3, 4, 5];
foreach (e; a)
{
if (e == 3) break;
}
assert(a == [4, 5]);

Do you seriously expect that to be the case? Of course not. However, you would
inconsistently expect this to happen:

int[] a = [ 1, 2, 3, 4, 5];
foreach (e; take(4, a))
{
if (e == 3) break;
}
assert(a == [4, 5]);

Ranges are modeled to correspond to slices. Slices have few operations that
manipulate them by reference (notably the controversial ~=) and range are made
to be unsurprising when compared to slices. They are really extensions of
slices.

Andrei

I hate to come back to naming, but I think that's a major part of what's
causing grief here.  Take implies mutation of the input range.  How about
'subrange' or 'subset' or 'leftmost' anything that doesn't imply removal.

Later,
```
Jan 30 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
```Brad Roberts wrote:
On Fri, 30 Jan 2009, Andrei Alexandrescu wrote:

Consider:

int[] a = [ 1, 2, 3, 4, 5];
foreach (e; a)
{
if (e == 3) break;
}
assert(a == [4, 5]);

Do you seriously expect that to be the case? Of course not. However, you would
inconsistently expect this to happen:

int[] a = [ 1, 2, 3, 4, 5];
foreach (e; take(4, a))
{
if (e == 3) break;
}
assert(a == [4, 5]);

Ranges are modeled to correspond to slices. Slices have few operations that
manipulate them by reference (notably the controversial ~=) and range are made
to be unsurprising when compared to slices. They are really extensions of
slices.

Andrei

I hate to come back to naming, but I think that's a major part of what's
causing grief here.  Take implies mutation of the input range.  How about
'subrange' or 'subset' or 'leftmost' anything that doesn't imply removal.

Later,

Names are important. "take" is taken from Haskell.

Andrei
```
Jan 30 2009
Derek Parnell <derek psych.ward> writes:
```On Fri, 30 Jan 2009 14:46:28 -0800, Andrei Alexandrescu wrote:

Names are important. "take" is taken from Haskell.

No it hasn't. Haskell still has it so its been "copied" from Haskell. :-)

--
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
```
Jan 30 2009
BCS <none anon.com> writes:
```Hello Andrei,

What do you think? Should I make Heap a container or a range?

Andrei

what would be the problem of making it (optionaly) both?
```
Jan 29 2009
Yigal Chripun <yigal100 gmail.com> writes:
```Andrei Alexandrescu wrote:
A "computer science heap" is a structure that offers fast access to the
largest element and fast extraction of it (which in turn provides access
to the next largest element etc.).

I'm just done working on the heap in std.algorithm. Now, it turns out
that heap supports both a meaningful definition as a full-fledged
container, and a beautiful definition as a range.

If Heap is a range, you initiate it with another range, which Heap
organizes in the heap manner. Then, successive calls to next() nicely
extract elements starting from the largest. If the underlying range
supports put(), then Heap also supports put() to insert into the heap.

Heap as a container would offer similar primitives but in addition would
"own" its data (would call destructors upon destruction, and would
support value copying).

What do you think? Should I make Heap a container or a range?

Andrei

isn't a Heap also called a Priority Queue?
```
Jan 29 2009
Bill Baxter <wbaxter gmail.com> writes:
```On Fri, Jan 30, 2009 at 4:01 PM, Yigal Chripun <yigal100 gmail.com> wrote:
isn't a Heap also called a Priority Queue?

I the distinction is that a Heap is one way to implement a Priority
Queue.  You could implement a priority queue by scanning a linked list
for the smallest item and it would still be a (really inefficient)
Priority Queue.

--bb
```
Jan 29 2009
Yigal Chripun <yigal100 gmail.com> writes:
```Bill Baxter wrote:
On Fri, Jan 30, 2009 at 4:01 PM, Yigal Chripun<yigal100 gmail.com>  wrote:
isn't a Heap also called a Priority Queue?

I the distinction is that a Heap is one way to implement a Priority
Queue.  You could implement a priority queue by scanning a linked list
for the smallest item and it would still be a (really inefficient)
Priority Queue.

--bb

Ah. right.. :)
```
Jan 30 2009
Don <nospam nospam.com> writes:
```Andrei Alexandrescu wrote:
A "computer science heap" is a structure that offers fast access to the
largest element and fast extraction of it (which in turn provides access
to the next largest element etc.).

I'm just done working on the heap in std.algorithm. Now, it turns out
that heap supports both a meaningful definition as a full-fledged
container, and a beautiful definition as a range.

If Heap is a range, you initiate it with another range, which Heap
organizes in the heap manner. Then, successive calls to next() nicely
extract elements starting from the largest. If the underlying range
supports put(), then Heap also supports put() to insert into the heap.

Heap as a container would offer similar primitives but in addition would
"own" its data (would call destructors upon destruction, and would
support value copying).

What do you think? Should I make Heap a container or a range?

Andrei

I think this depends on whether heap operations are required (and
usable) in places other than in a heap container.

For some applications (certain graph algorithms), you want two ways to
access the data. You use an IndexedPriorityQueue structure, which
contains both a heap of (key, value) pairs ordered by value, allowing
you to pop the item with minimum value in O(log n) time, and also an
array allowing you to access any item by key in O(1) time. Whenever you
move items in the heap, you update the array at the same time.
It's impossible to implement such a data stucture using STL heap
primitives, and likewise, my guess is that it would be impossible in
both a range and container Heaps (basically, you need to be able to hook
in a user-defined function to be called whenever items are swapped in
the heap).
But if it is possible in one but not the other implementation, it should
be favoured.
```
Jan 30 2009