www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Making sense of ranges

reply Stewart Gordon <smjg_1998 yahoo.com> writes:
The documentation for std.range states

http://dlang.org/phobos/std_range.html
"This module defines the notion of range (by the membership tests isInputRange, 
isForwardRange, isBidirectionalRange, isRandomAccessRange), range capability
tests (such 
as hasLength or hasSlicing), and a few useful range incarnations."

But that intro doesn't describe what a range actually is.

Here's what I've made out so far:

- Ranges are basically a generalised API for accessing a container or stream
that is 
linear in logical structure and can have any element type.

- What is a range and what isn't is determined by compile-time duck typing -
testing 
whether it implements certain methods.

- A range over a finite data structure is essentially equivalent to a start/end
iterator 
pair as used by various C++ STL functions.

- Where a function in std.range is described as iterating through ranges in a
particular 
way, what this function does is to return a range that delivers the resulting
sequence.

Have I got all this right?


Things I'm confused by:

- One of the expansions of put is
     r.front = e; r.popFront();

What has putting something at the front of a range and then popping it to do
with 
outputting to the range?

- Why is a range that can save checkpoints called a "forward" range?


Stewart.
Mar 24 2012
next sibling parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 03/24/2012 11:19 AM, Stewart Gordon wrote:
 The documentation for std.range states

 http://dlang.org/phobos/std_range.html
 "This module defines the notion of range (by the membership tests
 isInputRange, isForwardRange, isBidirectionalRange,
 isRandomAccessRange), range capability tests (such as hasLength or
 hasSlicing), and a few useful range incarnations."

 But that intro doesn't describe what a range actually is.

Apparently the library relies on the definition elsewhere. [1]
 Here's what I've made out so far:

 - Ranges are basically a generalised API for accessing a container or
 stream that is linear in logical structure and can have any element type.

Yes. Some ranges can also provide random access.
 - What is a range and what isn't is determined by compile-time duck
 typing - testing whether it implements certain methods.

Yes.
 - A range over a finite data structure is essentially equivalent to a
 start/end iterator pair as used by various C++ STL functions.

Yes.
 - Where a function in std.range is described as iterating through ranges
 in a particular way, what this function does is to return a range that
 delivers the resulting sequence.

Yes. Almost always lazily.
 Have I got all this right?

Yes.
 Things I'm confused by:

 - One of the expansions of put is
 r.front = e; r.popFront();

 What has putting something at the front of a range and then popping it
 to do with outputting to the range?

Iterating an output range is also by popFront(). So what it says is, put this element to the output range and advance the range. There is a gotcha about this when the output range is a slice: Whatever is just put into the range is popped right away! :) [2]
 - Why is a range that can save checkpoints called a "forward" range?

I agree. Here is my guess: The name of the ForwardRange comes from the fact that it is not double-ended. It can go only in one direction.
 Stewart.

Ali [1] http://www.informit.com/articles/printerfriendly.aspx?p=1407357 [2] http://ddili.org/ders/d.en/ranges.html
Mar 24 2012
next sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
On 24/03/2012 18:57, Ali Çehreli wrote:
<snip>
 Iterating an output range is also by popFront(). So what it says is, put this
element to
 the output range and advance the range. There is a gotcha about this when the
output range
 is a slice: Whatever is just put into the range is popped right away! :) [2]

I'm beginning to get it now: the purpose of an output range is to put new data into the underlying container. So once you've put something in, the remaining range is what's left to be populated. I had been thinking of outputting in terms of appending to the range, hence the confusion.
 - Why is a range that can save checkpoints called a "forward" range?

I agree. Here is my guess: The name of the ForwardRange comes from the fact that it is not double-ended. It can go only in one direction.

Bidirectional ranges are forward ranges as well. But I've just had a look at STL iterators, and it seems that the range category hierarchy has been lifted from that. There, a "forward" iterator combines the functionality of an input iterator and an output iterator, hence it's your basic iterator that can move forwards and do what it likes with the data it walks over. (It would seem that it's only an "output" iterator in that * returns an lvalue, which may or may not be actually assignable depending on the constancy of the element type.) In D ranges, OTOH, the only thing that distinguishes a forward range from a general input range is the presence of a save method to make a copy of it. Doesn't seem to have anything to do with either the "forward" name or the C++ origin thereof.... Something else I'm finding puzzling is moveFront, moveAt and moveBack. Is D trying to be C++11 or something? Move semantics don't seem to me to be useful in a language with GC. Stewart.
Mar 24 2012
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 03/25/2012 12:07 AM, Stewart Gordon wrote:
 On 24/03/2012 18:57, Ali Çehreli wrote:
 <snip>
 Iterating an output range is also by popFront(). So what it says is,
 put this element to
 the output range and advance the range. There is a gotcha about this
 when the output range
 is a slice: Whatever is just put into the range is popped right away!
 :) [2]

I'm beginning to get it now: the purpose of an output range is to put new data into the underlying container. So once you've put something in, the remaining range is what's left to be populated. I had been thinking of outputting in terms of appending to the range, hence the confusion.
 - Why is a range that can save checkpoints called a "forward" range?

I agree. Here is my guess: The name of the ForwardRange comes from the fact that it is not double-ended. It can go only in one direction.

Bidirectional ranges are forward ranges as well. But I've just had a look at STL iterators, and it seems that the range category hierarchy has been lifted from that. There, a "forward" iterator combines the functionality of an input iterator and an output iterator, hence it's your basic iterator that can move forwards and do what it likes with the data it walks over. (It would seem that it's only an "output" iterator in that * returns an lvalue, which may or may not be actually assignable depending on the constancy of the element type.) In D ranges, OTOH, the only thing that distinguishes a forward range from a general input range is the presence of a save method to make a copy of it. Doesn't seem to have anything to do with either the "forward" name or the C++ origin thereof.... Something else I'm finding puzzling is moveFront, moveAt and moveBack. Is D trying to be C++11 or something? Move semantics don't seem to me to be useful in a language with GC. Stewart.

The fact that the language has GC does not imply everything should go on the GC heap.
Mar 24 2012
prev sibling next sibling parent =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 03/24/2012 04:07 PM, Stewart Gordon wrote:
 On 24/03/2012 18:57, Ali Çehreli wrote:
 <snip>
 Iterating an output range is also by popFront(). So what it says is,
 put this element to
 the output range and advance the range. There is a gotcha about this
 when the output range
 is a slice: Whatever is just put into the range is popped right away!
 :) [2]

I'm beginning to get it now: the purpose of an output range is to put new data into the underlying container. So once you've put something in, the remaining range is what's left to be populated. I had been thinking of outputting in terms of appending to the range, hence the confusion.
 - Why is a range that can save checkpoints called a "forward" range?

I agree. Here is my guess: The name of the ForwardRange comes from the fact that it is not double-ended. It can go only in one direction.

Bidirectional ranges are forward ranges as well. But I've just had a look at STL iterators, and it seems that the range category hierarchy has been lifted from that. There, a "forward" iterator combines the functionality of an input iterator and an output iterator, hence it's your basic iterator that can move forwards and do what it likes with the data it walks over. (It would seem that it's only an "output" iterator in that * returns an lvalue, which may or may not be actually assignable depending on the constancy of the element type.) In D ranges, OTOH, the only thing that distinguishes a forward range from a general input range is the presence of a save method to make a copy of it.

I looked for rationale on Andrei's article. There is this bit about STL forward iterators: <quote> Input and forward iterators are syntactically identical but subtly different semantically—copying a forward iterator saves iteration state, but copying an input iterator just creates a new view of the same iteration state. </quote> I guess that's why 'save' is explicit on ForwardRange. Ali
 Doesn't seem to have anything to do with either the
 "forward" name or the C++ origin thereof....

 Something else I'm finding puzzling is moveFront, moveAt and moveBack.
 Is D trying to be C++11 or something? Move semantics don't seem to me to
 be useful in a language with GC.

 Stewart.

Mar 24 2012
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, March 24, 2012 18:14:46 Ali =C3=87ehreli wrote:
 I looked for rationale on Andrei's article. There is this bit about S=

 forward iterators:
=20
 <quote>
 Input and forward iterators are syntactically identical but subtly
 different semantically=E2=80=94copying a forward iterator saves itera=

 but copying an input iterator just creates a new view of the same
 iteration state.
 </quote>
=20
 I guess that's why 'save' is explicit on ForwardRange.

Yes. If a range is a reference type rather than a value type or a dynam= ic=20 array, then you _must_ have save in order to get a copy of its current=20= state. However, since most ranges are either dynamic arrays or value ty= pes,=20 functions are often written without using save like they should and wou= ld=20 end up consuming a forward range if it's a reference type (e.g. a class= ).=20 But save was created primarily to make it possible to have classes whic= h are=20 ranges rather than relying on assignment making a copy (and not all str= ucts=20 are value types, so save can be required for them as well). - Jonathan M Davis
Mar 24 2012
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 24 Mar 2012 19:07:01 -0400, Stewart Gordon <smjg_1998 yahoo.com>=
  =

wrote:

 On 24/03/2012 18:57, Ali =C3=87ehreli wrote:
 <snip>
 Iterating an output range is also by popFront(). So what it says is, =


 put this element to
 the output range and advance the range. There is a gotcha about this =


 when the output range
 is a slice: Whatever is just put into the range is popped right away!=


 :) [2]

I'm beginning to get it now: the purpose of an output range is to put =

 new data into the underlying container.  So once you've put something =

 in, the remaining range is what's left to be populated.  I had been  =

 thinking of outputting in terms of appending to the range, hence the  =

 confusion.

The output range is almost an entirely orthogonal concept to an input = range. It basically defines a way to output elements. How an output range directs its elements is up to the output range. It = = may append, it may overwrite, it may prepend, it can do anything it want= s. The only commonality is that a *writable* input range can also be an = output range (writable meaning r.front =3D x works). -Steve
Mar 26 2012
prev sibling next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 24.03.2012 22:19, Stewart Gordon wrote:
 The documentation for std.range states

 http://dlang.org/phobos/std_range.html
 "This module defines the notion of range (by the membership tests
 isInputRange, isForwardRange, isBidirectionalRange,
 isRandomAccessRange), range capability tests (such as hasLength or
 hasSlicing), and a few useful range incarnations."

 But that intro doesn't describe what a range actually is.

 Here's what I've made out so far:

 - Ranges are basically a generalised API for accessing a container or
 stream that is linear in logical structure and can have any element type.

 - What is a range and what isn't is determined by compile-time duck
 typing - testing whether it implements certain methods.

 - A range over a finite data structure is essentially equivalent to a
 start/end iterator pair as used by various C++ STL functions.

 - Where a function in std.range is described as iterating through ranges
 in a particular way, what this function does is to return a range that
 delivers the resulting sequence.

 Have I got all this right?

More or less fine.
 Things I'm confused by:

 - One of the expansions of put is
 r.front = e; r.popFront();

Strange looking this one, but it allows, for instance, to output things to a container that defines range over it's elements.
 What has putting something at the front of a range and then popping it
 to do with outputting to the range?

 - Why is a range that can save checkpoints called a "forward" range?

It's not truly checkpoints, but rather the whole state can copied. I believe, in some use cases checkpoints can be much smaller then the whole state, and we haven't got a checkpoint range yet. Could be a nice addition if proposal is sound. -- Dmitry Olshansky
Mar 24 2012
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Mar 24, 2012 at 06:19:32PM +0000, Stewart Gordon wrote:
 The documentation for std.range states
 
 http://dlang.org/phobos/std_range.html
 "This module defines the notion of range (by the membership tests
 isInputRange, isForwardRange, isBidirectionalRange,
 isRandomAccessRange), range capability tests (such as hasLength or
 hasSlicing), and a few useful range incarnations."
 
 But that intro doesn't describe what a range actually is.

In a previous post long lost inside a rather large thread some time ago, I proposed a rewrite of the docs of std.range to make it clearer, at least on a basic level, what a range is, why we should care, and what the module offers. This thread has further convinced me that std.range's docs *need* this rewrite. So here's my first attempt at it: https://github.com/quickfur/phobos/tree/stdrange_docs It's not complete yet; I still have a few more range-creation templates to cover, and then list the auxiliary functions, etc., before it's ready for merging into Phobos. But I thought I should put it up for review here first, in case people have some comments/suggestions or further things that should be put into the docs. Better to put it all in a single pull request than waste the Phobos maintainers' time with multiple updates to the docs. So, comments, suggestions, flames, etc., are hereby solicited. :-) T -- If lightning were to ever strike an orchestra, it'd always hit the conductor first.
Mar 25 2012
parent Mike Parker <aldacron gmail.com> writes:
On 3/27/2012 7:26 PM, Marco Leise wrote:
 Am Tue, 27 Mar 2012 06:00:58 +0200
 schrieb "Jesse Phillips"<jessekphillips+D gmail.com>:

 On Monday, 26 March 2012 at 00:50:32 UTC, H. S. Teoh wrote:

 This thread has further convinced me that std.range's docs
 *need* this
 rewrite. So here's my first attempt at it:

 	https://github.com/quickfur/phobos/tree/stdrange_docs

I find that opening to be much better. Look forward to the improvement.

I agree. To the newcomer it is now easy to see the rationale behind ranges and why they should take the time to understand the concept. What I missed when I started with D, was some explanation that the returned ranges from many algorithms are compile time generated structs. I often found myself wondering why I don't get an array returned when I put one into the algorithm and had to do array(...) all over the place, when actually I could often have passed on the ranges directly to a following foreach or similar.

Yes. When I saw that the implementation of ranges required more than a few minutes of digging around to comprehend, I just avoided them completely for quite a while. When I finally did decide to roll up my sleeves and get dirty, the lack of documentation in Phobos or on the web site was particularly frustrating. I was going to complain about it here in the newsgroups (not sure if I did or not), but then Ali announced the translation of his book chapter on ranges and I was enlightened.
 "Ranges whose elements are sorted affords ..."<- insert a comma before affords
perhaps? It would help non-native speakers.

Actually, a comma there would be incorrect. But because 'Ranges' is plural, 'affords' should lose the 's' (Ranges whose elements are sorted afford...). If the wording is confusing, perhaps it could be rewritten as "Sorted ranges afford..."
Mar 27 2012
prev sibling next sibling parent "Jesse Phillips" <jessekphillips+D gmail.com> writes:
On Monday, 26 March 2012 at 00:50:32 UTC, H. S. Teoh wrote:

 This thread has further convinced me that std.range's docs 
 *need* this
 rewrite. So here's my first attempt at it:

 	https://github.com/quickfur/phobos/tree/stdrange_docs

I find that opening to be much better. Look forward to the improvement.
Mar 26 2012
prev sibling next sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Tue, 27 Mar 2012 06:00:58 +0200
schrieb "Jesse Phillips" <jessekphillips+D gmail.com>:

 On Monday, 26 March 2012 at 00:50:32 UTC, H. S. Teoh wrote:
 
 This thread has further convinced me that std.range's docs 
 *need* this
 rewrite. So here's my first attempt at it:

 	https://github.com/quickfur/phobos/tree/stdrange_docs

I find that opening to be much better. Look forward to the improvement.

I agree. To the newcomer it is now easy to see the rationale behind ranges and why they should take the time to understand the concept. What I missed when I started with D, was some explanation that the returned ranges from many algorithms are compile time generated structs. I often found myself wondering why I don't get an array returned when I put one into the algorithm and had to do array(...) all over the place, when actually I could often have passed on the ranges directly to a following foreach or similar. "Ranges whose elements are sorted affords ..." <- insert a comma before affords perhaps? It would help non-native speakers. -- Marco
Mar 27 2012
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Mar 27, 2012 at 09:55:43PM +0900, Mike Parker wrote:
 On 3/27/2012 7:26 PM, Marco Leise wrote:

"Ranges whose elements are sorted affords ..."<- insert a comma
before affords perhaps? It would help non-native speakers.

Actually, a comma there would be incorrect. But because 'Ranges' is plural, 'affords' should lose the 's' (Ranges whose elements are sorted afford...). If the wording is confusing, perhaps it could be rewritten as "Sorted ranges afford..."

Fixed, thanks for catching that grammatical glitch. T -- Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -- Brian W. Kernighan
Mar 27 2012