www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Range Type

reply "Janice Caron" <caron800 googlemail.com> writes:
I know this has cropped up before (in discussions about multiple
dimension arrays), but adding a range type would also really help with
the whole business of returning slices. (See the many other threads
currently buzzing with this topic).

A range is nothing more than a two-element struct

    struct Range(T,U=T)
    {
        T begin;
        U end;
    }

However, if you throw in some extra language support, it gets really,
really useful. Basically, you want the ".." infix operator always to
create a range. Thus

    auto x = 3 .. 4;

creates a Range!(int) with values { 3, 4 }. In general (a .. b) should
evaluate to a Range!(typeof(a),typeof(b)) with values { a, b }.
Finally, you also want [] and opSlice() to accept Range! parameters,
so that

    s = s[a..b];

can always be rewritten as

    auto t = a..b;
    s = s[t];

In general, opSlice(Range r) should be eqivalent to opSlice(r.begin, r.end).

In my opinion language support for ranges (allowing .. to return a
range, and allowing [] to accept a range) has advantages above and
beyond those already discussed, and may also allow many other exciting
possibilites we haven't even thought of yet.
Mar 24 2008
next sibling parent "Craig Black" <cblack ara.com> writes:
"Janice Caron" <caron800 googlemail.com> wrote in message 
news:mailman.192.1206357352.2351.digitalmars-d puremagic.com...
I know this has cropped up before (in discussions about multiple
 dimension arrays), but adding a range type would also really help with
 the whole business of returning slices. (See the many other threads
 currently buzzing with this topic).

 A range is nothing more than a two-element struct

    struct Range(T,U=T)
    {
        T begin;
        U end;
    }

 However, if you throw in some extra language support, it gets really,
 really useful. Basically, you want the ".." infix operator always to
 create a range. Thus

    auto x = 3 .. 4;

 creates a Range!(int) with values { 3, 4 }. In general (a .. b) should
 evaluate to a Range!(typeof(a),typeof(b)) with values { a, b }.
 Finally, you also want [] and opSlice() to accept Range! parameters,
 so that

    s = s[a..b];

 can always be rewritten as

    auto t = a..b;
    s = s[t];

 In general, opSlice(Range r) should be eqivalent to opSlice(r.begin, 
 r.end).

 In my opinion language support for ranges (allowing .. to return a
 range, and allowing [] to accept a range) has advantages above and
 beyond those already discussed, and may also allow many other exciting
 possibilites we haven't even thought of yet.
I've actually thought of this before. IMO, a built-in range type is good fit for D It also seems like it would be pretty simple to implement. -Craig
Mar 24 2008
prev sibling next sibling parent reply "Craig Black" <cblack ara.com> writes:
"Janice Caron" <caron800 googlemail.com> wrote in message 
news:mailman.192.1206357352.2351.digitalmars-d puremagic.com...
I know this has cropped up before (in discussions about multiple
 dimension arrays), but adding a range type would also really help with
 the whole business of returning slices. (See the many other threads
 currently buzzing with this topic).

 A range is nothing more than a two-element struct

    struct Range(T,U=T)
    {
        T begin;
        U end;
    }
Why would the type T and U ever be different? What's the point of the second type being different than the first? -Craig
Mar 24 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 24/03/2008, Craig Black <cblack ara.com> wrote:
 Why would the type T and U ever be different?  What's the point of the
  second type being different than the first?
Um ... I have no idea. I guess it's just to be consistent with the fact that we can already declare: opSlice(T lower, U upper) with T and U being different. Maybe someone will find a use for it one day. That said, it does at least allow you to use ".." in declarations, as in: int..int x; x = 3..4; and you have to admit, T..T is likely to be less typing (and look prettier) than Range!(T). :-)
Mar 24 2008
next sibling parent reply sambeau <sambeau-nospam mac.com> writes:
Janice Caron Wrote:

     int..int x;
     x = 3..4;
 
 and you have to admit, T..T is likely to be less typing 
 (and look prettier) than Range!(T). :-)
Even if this was just sugar I would vote for it, but being that it could really help with const-ness and thus speed D towards exciting functional paradigms I give it my 100% unconditional support. And now I need to think whether ranges are the only interesting exceptions to const that we need to consider. For instance.. CONS? Could we safely make a new const array by concating two constant arrays together? And if so is there any need for this? I can see that it would just another ay of doing COW, but could the compiler apply interesting optimisations (and paralellisations) from this knowledge? Hmm..
Mar 24 2008
parent "Janice Caron" <caron800 googlemail.com> writes:
On 24/03/2008, sambeau <sambeau-nospam mac.com> wrote:
  And now I need to think whether ranges are the only interesting exceptions to
const that we need to consider.
It's not an exception to const.
  For instance.. CONS? Could we safely make a new const array by concating two
constant arrays together? And if so is there any need for this? I can see that
it would just another ay of doing COW, but could the compiler apply interesting
optimisations (and paralellisations) from this knowledge?
You may perhaps have misunderstood the proposal. The type T..U is sugar for struct { T begin; U end; }, that's all. The value x..y is just an instance of typeof(x)..typeof(y). There's no array magic going on whatsover. You'd use ranges to /slice/ arrays, that's all.
Mar 24 2008
prev sibling next sibling parent reply "Craig Black" <cblack ara.com> writes:
"Janice Caron" <caron800 googlemail.com> wrote in message 
news:mailman.197.1206370014.2351.digitalmars-d puremagic.com...
 On 24/03/2008, Craig Black <cblack ara.com> wrote:
 Why would the type T and U ever be different?  What's the point of the
  second type being different than the first?
Um ... I have no idea. I guess it's just to be consistent with the fact that we can already declare: opSlice(T lower, U upper) with T and U being different. Maybe someone will find a use for it one day. That said, it does at least allow you to use ".." in declarations, as in: int..int x; x = 3..4; and you have to admit, T..T is likely to be less typing (and look prettier) than Range!(T). :-)
Why not just leave out the second type and save some typing? int.. x; x = 3 .. 4;
Mar 24 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 24/03/2008, Craig Black <cblack ara.com> wrote:
 Why not just leave out the second type and save some typing?

  int.. x;
That would make "a..b" semantically ambiguous - is "a.." a type and "b" a value, or is "a..b" an expression? We need to keep D's grammar context free.
Mar 24 2008
parent reply "Craig Black" <cblack ara.com> writes:
"Janice Caron" <caron800 googlemail.com> wrote in message 
news:mailman.201.1206373241.2351.digitalmars-d puremagic.com...
 On 24/03/2008, Craig Black <cblack ara.com> wrote:
 Why not just leave out the second type and save some typing?

  int.. x;
That would make "a..b" semantically ambiguous - is "a.." a type and "b" a value, or is "a..b" an expression? We need to keep D's grammar context free.
I could be wrong, but I don't think this would be a big deal. The .. would be interpreted differently when preceded by a type. The .. would work like []. int.. x; // A range of integers int[] x; // An array of integers
Mar 24 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 24/03/2008, Craig Black <cblack ara.com> wrote:
 I could be wrong, but I don't think this would be a big deal.  The .. would
  be interpreted differently when preceded by a type.  The .. would work like
  [].
I don't think it would, because ".." is an infix operator, whereas "[]" is a postfix operator. That said, it's no big deal. If we're arguing about /syntax/, I guess that mean we're not really disputing the /idea/. So I'm happy to let Walter worry about the syntax.
Mar 24 2008
parent reply Xinok <xnknet gmail.com> writes:
Janice Caron wrote:
 On 24/03/2008, Craig Black <cblack ara.com> wrote:
 I could be wrong, but I don't think this would be a big deal.  The .. would
  be interpreted differently when preceded by a type.  The .. would work like
  [].
I don't think it would, because ".." is an infix operator, whereas "[]" is a postfix operator.
* is also an infix operator, and we seem to get along fine with that. Regardless, I don't think it will work. How about if you have an array of ranges? [10..20, 20..30, 30..40] How do you write that type? This doesn't work: int..int[] Because, to me anyways, this translates to: range!(int, int[]) Using .. should work for literals. All that is really needed is changing the precedence of the operator. Currently, writing this: x = 0 .. 10 translates to: (x = 0) .. 10 I think the precedence fits best somewhere between + - and =. This way, we can write both: x = 0 .. 10 arr[0 .. $ - 1]
Mar 24 2008
next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 24/03/2008, Xinok <xnknet gmail.com> wrote:
  This doesn't work:
  int..int[]
  Because, to me anyways, this translates to:
  range!(int, int[])
Ah, you're right. Well then, I guess we'll probably have to call it Range!, with both type parameters beng optional. If both type parameters are omitted, it defaults to size_t (or whatever is considered correct for array indexing these days); if only the second is omitted, the second defaults to the same as the first. Given that most of the time you're declare it with auto, or infer it from a range expression, that would certainly be acceptable.
Mar 24 2008
prev sibling parent reply "Craig Black" <cblack ara.com> writes:
"Xinok" <xnknet gmail.com> wrote in message 
news:fs8knm$61p$1 digitalmars.com...
 Janice Caron wrote:
 On 24/03/2008, Craig Black <cblack ara.com> wrote:
 I could be wrong, but I don't think this would be a big deal.  The .. 
 would
  be interpreted differently when preceded by a type.  The .. would work 
 like
  [].
I don't think it would, because ".." is an infix operator, whereas "[]" is a postfix operator.
* is also an infix operator, and we seem to get along fine with that. Regardless, I don't think it will work. How about if you have an array of ranges? [10..20, 20..30, 30..40] How do you write that type?
int..[] To clarify my opinion, I don't see a point in specifying the second type. -Craig
Mar 24 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 24/03/2008, Craig Black <cblack ara.com> wrote:
 int..[]
I think I'd be happy with Range!(int)
  To clarify my opinion, I don't see a point in specifying the second type.
...unless of course the second type is different from the first type! :) Rare though that case may be, I think we'd need it in order to retain the existing functionality of opSlice(T,U). Also, Range!(T,U)[] is unambiguous, wheras T..U[] ...?
Mar 24 2008
parent reply "Craig Black" <cblack ara.com> writes:
"Janice Caron" <caron800 googlemail.com> wrote in message 
news:mailman.209.1206385654.2351.digitalmars-d puremagic.com...
 On 24/03/2008, Craig Black <cblack ara.com> wrote:
 int..[]
I think I'd be happy with Range!(int)
  To clarify my opinion, I don't see a point in specifying the second 
 type.
...unless of course the second type is different from the first type! :) Rare though that case may be, I think we'd need it in order to retain the existing functionality of opSlice(T,U). Also, Range!(T,U)[] is unambiguous, wheras T..U[] ...?
I still think we don't need that second type. We are talking about D 2.0, so backwards.compatibility is less of an issue. Unless there's a compelling reason for it, it should be opSlice(T, T). I think this makes more sense, and simplifies the syntax. -Craig
Mar 24 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 24/03/2008, Craig Black <cblack ara.com> wrote:
 I still think we don't need that second type.  We are talking about D 2.0,
  so backwards.compatibility is less of an issue.  Unless there's a compelling
  reason for it, it should be opSlice(T, T).  I think this makes more sense,
  and simplifies the syntax.
I agree. But just to be sure, let's ask the loyal readers of this newsgroup... Has anyone here ever used opSlice(T,U), where type T != type U? And if so, for what purpose, and could you live without it?
Mar 24 2008
next sibling parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Janice Caron wrote:
 On 24/03/2008, Craig Black <cblack ara.com> wrote:
 I still think we don't need that second type.  We are talking about D 2.0,
  so backwards.compatibility is less of an issue.  Unless there's a compelling
  reason for it, it should be opSlice(T, T).  I think this makes more sense,
  and simplifies the syntax.
I agree. But just to be sure, let's ask the loyal readers of this newsgroup... Has anyone here ever used opSlice(T,U), where type T != type U? And if so, for what purpose, and could you live without it?
It could probably be used with something like this: <http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=67720> Assuming that code works (I haven't tested it myself) that should allow using $ in user-defined opSlice(). Pretty hack-y though; it seems to depend on a DMDFE implementation detail.
Mar 24 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 24/03/2008, Frits van Bommel <fvbommel remwovexcapss.nl> wrote:
  > But just to be sure, let's ask the loyal readers of this newsgroup...
  > Has anyone here ever used opSlice(T,U), where type T != type U? And if
  > so, for what purpose, and could you live without it?

 It could probably be used with something like this:
  <example of end-relative indexing>
You raise a good point. If we're going to allow auto x = 1..3; then I guess we should also allow auto x = 1..$; Exactly how that should implemented, I'm not sure - but if we get to a solution, it will make ranges mighty powerful beasts!
Mar 24 2008
parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Janice Caron wrote:
 On 24/03/2008, Frits van Bommel <fvbommel remwovexcapss.nl> wrote:
  > But just to be sure, let's ask the loyal readers of this newsgroup...
  > Has anyone here ever used opSlice(T,U), where type T != type U? And if
  > so, for what purpose, and could you live without it?

 It could probably be used with something like this:
  <example of end-relative indexing>
You raise a good point. If we're going to allow auto x = 1..3; then I guess we should also allow auto x = 1..$; Exactly how that should implemented, I'm not sure - but if we get to a solution, it will make ranges mighty powerful beasts!
Another very useful thing to have is a notation for the 'everything' slice. In Python it's just a colon by itself. It's useful for multi-dimensional arrays: x[:3,:] means slice out rows 0-3 of every column. Used to good effect in NumPy. --bb
Mar 24 2008
prev sibling parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Janice Caron wrote:
 On 24/03/2008, Craig Black <cblack ara.com> wrote:
 I still think we don't need that second type.  We are talking about D 2.0,
  so backwards.compatibility is less of an issue.  Unless there's a compelling
  reason for it, it should be opSlice(T, T).  I think this makes more sense,
  and simplifies the syntax.
I agree. But just to be sure, let's ask the loyal readers of this newsgroup... Has anyone here ever used opSlice(T,U), where type T != type U? And if so, for what purpose, and could you live without it?
Yes I have. The purpose is to implement custom end-relative ranges. The end-relative indexes are represented using a struct like so: struct EndRelative { int offset; } That plus the __opDollar lets you make your user type accept x[3..$-2]. (And even without the opDollar harck you can make it work with something like x[3..end-2].) --bb
Mar 24 2008
prev sibling parent reply renoX <renosky free.fr> writes:
Janice Caron a écrit :
 On 24/03/2008, Craig Black <cblack ara.com> wrote:
 Why would the type T and U ever be different?  What's the point of the
  second type being different than the first?
Um ... I have no idea. I guess it's just to be consistent with the fact that we can already declare: opSlice(T lower, U upper) with T and U being different. Maybe someone will find a use for it one day. That said, it does at least allow you to use ".." in declarations, as in: int..int x; x = 3..4; and you have to admit, T..T is likely to be less typing (and look prettier) than Range!(T). :-)
Well, most range will probably be integer range, so if we could define Range! to be Range!(int) by default then it would be even less typing than int..int. Shouldn't we plan an optional 'step' parameter now? That said, the 'step' is always messy to add, so having just Range with lower and upper would be a good start. renoX PS: I have the impression that each language grows to reimplement Ada at some point ;-) (which for me is a good thing!).
Mar 24 2008
next sibling parent "Craig Black" <cblack ara.com> writes:
 Shouldn't we plan an optional 'step' parameter now?
What would the benefit of the step be?
 That said, the 'step' is always messy to add, so having just Range with 
 lower and upper would be a good start.
I agree that it makes it more messy, and doesn't seem necessary. -Craig
Mar 24 2008
prev sibling next sibling parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
renoX wrote:
 PS:
 I have the impression that each language grows to reimplement Ada at 
 some point ;-)
 (which for me is a good thing!).
Wasn't that Common Lisp? (<http://en.wikipedia.org/wiki/Greenspun's_Tenth_Rule>) :)
Mar 24 2008
parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Frits van Bommel wrote:
 renoX wrote:
 PS:
 I have the impression that each language grows to reimplement Ada at 
 some point ;-)
 (which for me is a good thing!).
Wasn't that Common Lisp? (<http://en.wikipedia.org/wiki/Greenspun's_Tenth_Rule>) :)
Now we've got renoX's 10th rule! (Quick renoX -- if you want to be famous, better come up with another 9 rules fast!) --bb
Mar 24 2008
parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Bill Baxter wrote:
 Frits van Bommel wrote:
 renoX wrote:
 PS:
 I have the impression that each language grows to reimplement Ada at 
 some point ;-)
 (which for me is a good thing!).
Wasn't that Common Lisp? (<http://en.wikipedia.org/wiki/Greenspun's_Tenth_Rule>) :)
Now we've got renoX's 10th rule! (Quick renoX -- if you want to be famous, better come up with another 9 rules fast!)
No need. (Greenspun didn't have any other rules either :P)
Mar 25 2008
prev sibling parent Don Clugston <dac nospam.com.au> writes:
renoX wrote:
 Janice Caron a écrit :
 On 24/03/2008, Craig Black <cblack ara.com> wrote:
 Why would the type T and U ever be different?  What's the point of the
  second type being different than the first?
Um ... I have no idea. I guess it's just to be consistent with the fact that we can already declare: opSlice(T lower, U upper) with T and U being different. Maybe someone will find a use for it one day. That said, it does at least allow you to use ".." in declarations, as in: int..int x; x = 3..4; and you have to admit, T..T is likely to be less typing (and look prettier) than Range!(T). :-)
Well, most range will probably be integer range, so if we could define Range! to be Range!(int) by default then it would be even less typing than int..int. Shouldn't we plan an optional 'step' parameter now?
Definitely not. Consider floating point ranges. What 'step' do you use? 1..1e10 You can step linearly (1,2,3...), logarithmically(1, 1e2, 1e3...) or in IEEE space. (1, nextUp(1), nextUp(nextUp(1)), ....) There are just too many reasonable options. Once steps are involved, you need a generator. But ranges are simple.
Mar 26 2008
prev sibling next sibling parent sambeau <no-spam-for-sambeau mac.com> writes:
Janice Caron Wrote:

 On 24/03/2008, sambeau <sambeau-nospam mac.com> wrote:
 You may perhaps have misunderstood the proposal. The type T..U is
 sugar for struct { T begin; U end; }, that's all. The value x..y is
 just an instance of typeof(x)..typeof(y). There's no array magic going
 on whatsover. You'd use ranges to /slice/ arrays, that's all.
Yes, I was clearly miles away and grabbing the short end of the stick. I had just been reading your other thread about constness and thought you had come up with something more ambitous (that matched something going on in my head while reading it) - namely how can you express a subset of a const collection while informing the compiler that as it is a subset it must share it's super-sets constness.
  And now I need to think whether ranges are the only interesting exceptions to
const that we need to consider.
It's not an exception to const.
My apologies. "Exception" was probably a hasty term. I was considering, following on from this (above) that if you could declare one const collection to be a subset of another (and thus retain its constant nature) could you make a superset of a number of const collections and retain the constness, too. That was all. And I'm still mulling it over :-)
Mar 24 2008
prev sibling next sibling parent bobcat <a b.com> writes:
Nice! I was also wishing for ranges in D, but didn't know how it would behave
with the rest of the languages. 
Doing foreach(i, 1..20) would also be a natural fit for this, where foreach
also accepts a slice :). ( I know there is already something similar in D 2.0. )

Janice Caron Wrote:

 I know this has cropped up before (in discussions about multiple
 dimension arrays), but adding a range type would also really help with
 the whole business of returning slices. (See the many other threads
 currently buzzing with this topic).
 
 A range is nothing more than a two-element struct
 
     struct Range(T,U=T)
     {
         T begin;
         U end;
     }
 
 However, if you throw in some extra language support, it gets really,
 really useful. Basically, you want the ".." infix operator always to
 create a range. Thus
 
     auto x = 3 .. 4;
 
 creates a Range!(int) with values { 3, 4 }. In general (a .. b) should
 evaluate to a Range!(typeof(a),typeof(b)) with values { a, b }.
 Finally, you also want [] and opSlice() to accept Range! parameters,
 so that
 
     s = s[a..b];
 
 can always be rewritten as
 
     auto t = a..b;
     s = s[t];
 
 In general, opSlice(Range r) should be eqivalent to opSlice(r.begin, r.end).
 
 In my opinion language support for ranges (allowing .. to return a
 range, and allowing [] to accept a range) has advantages above and
 beyond those already discussed, and may also allow many other exciting
 possibilites we haven't even thought of yet.
Mar 24 2008
prev sibling next sibling parent "Koroskin Denis" <2korden+D gmail.com> writes:
On Mon, 24 Mar 2008 14:12:15 +0300, Janice Caron <caron800 googlemail.co=
m>  =

wrote:

 I know this has cropped up before (in discussions about multiple
 dimension arrays), but adding a range type would also really help with=
 the whole business of returning slices. (See the many other threads
 currently buzzing with this topic).

 A range is nothing more than a two-element struct

     struct Range(T,U=3DT)
     {
         T begin;
         U end;
     }

 However, if you throw in some extra language support, it gets really,
 really useful. Basically, you want the ".." infix operator always to
 create a range. Thus

     auto x =3D 3 .. 4;

 creates a Range!(int) with values { 3, 4 }. In general (a .. b) should=
 evaluate to a Range!(typeof(a),typeof(b)) with values { a, b }.
 Finally, you also want [] and opSlice() to accept Range! parameters,
 so that

     s =3D s[a..b];

 can always be rewritten as

     auto t =3D a..b;
     s =3D s[t];

 In general, opSlice(Range r) should be eqivalent to opSlice(r.begin,  =
 r.end).

 In my opinion language support for ranges (allowing .. to return a
 range, and allowing [] to accept a range) has advantages above and
 beyond those already discussed, and may also allow many other exciting=
 possibilites we haven't even thought of yet.
Yeah, built-in Ranges are great! Where do I vote for them? :)
Mar 26 2008
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
A range type can be used in many situations:
- You can iterate lazily on a interval, like the foreach in D 2.x, foreach(i;
0..10) {} or foreach(i; 'a'..'z'+1) {} (currently I use xrange() and
cinterval() for those things, cinterval is closed on the right).
- You can create a range of values inside an array (currently I use a range()
function for this), like arr[] = 0..arr.length.
- You can test if a value is in an interval: (5 in 0..1000) (currently 'in'
method of xrange).
- You may be able to define variables that assume only a limited range of
integrals/ASCII chars (this is for safety, like in Pascal. The compiler must
add run-time controls to be sure you can't go outside the bounds).

Bye,
bearophile
Mar 28 2008