digitalmars.D.learn - Ranges, constantly frustrating

reply "Regan Heath" <regan netmail.co.nz> writes:
Things like this should "just work"..

File input ...

auto range = input.byLine();
while(!range.empty)
{
   range.popFront();
   foreach (i, line; range.take(4))  //Error: cannot infer argument types
   {
     ..etc..
   }
   range.popFront();
}

Tried adding 'int' and 'char[]' or 'auto' .. no dice.

Can someone explain why this fails, and if this is a permanent or  
temporary limitation of D/MD.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
Feb 11 2014
next sibling parent "Jakob Ovrum" <jakobovrum gmail.com> writes:
On Tuesday, 11 February 2014 at 10:10:27 UTC, Regan Heath wrote:
 Things like this should "just work"..

 File input ...

 auto range = input.byLine();
 while(!range.empty)
 {
   range.popFront();
   foreach (i, line; range.take(4))  //Error: cannot infer 
 argument types
   {
     ..etc..
   }
   range.popFront();
 }

 Tried adding 'int' and 'char[]' or 'auto' .. no dice.

 Can someone explain why this fails, and if this is a permanent 
 or temporary limitation of D/MD.

 R
See this pull request[1] and the linked enhancement report.

Also note that calling `r.popFront()` without checking `r.empty` is a program error (so it's recommended to at least put in an assert).

[1] https://github.com/D-Programming-Language/phobos/pull/1866
Feb 11 2014
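A minimal sketch of both points made in the reply above, assuming a Phobos version in which the `enumerate` function from the linked pull request is available in `std.range` (the thread predates its merge, so older compilers may lack it). The helper name `firstNumbered` and the in-memory `string[]` standing in for `byLine` are illustrative only.

```d
import std.format : format;
import std.range;
import std.stdio : writeln;

// Hypothetical helper: skip one leading item, then number the next `n` items from 0.
string[] firstNumbered(string[] lines, size_t n)
{
    string[] result;

    // popFront on an empty range is a program error, so check before calling it.
    assert(!lines.empty, "expected at least one line");
    lines.popFront();

    // enumerate yields (index, value) tuples, which foreach expands into i and line.
    foreach (i, line; lines.take(n).enumerate)
        result ~= format("%s: %s", i, line);

    return result;
}

void main()
{
    // Numbers the two items that follow "header".
    writeln(firstNumbered(["header", "a", "b", "c"], 2));
}
```

The counter here belongs to the range produced by `take`, not to the underlying source, which is the behaviour the original post asks for.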
prev sibling next sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Tue, 11 Feb 2014 10:10:27 -0000, Regan Heath <regan netmail.co.nz>  
wrote:

 Things like this should "just work"..

 File input ...

 auto range = input.byLine();
 while(!range.empty)
 {
    range.popFront();
    foreach (i, line; range.take(4))  //Error: cannot infer argument types
    {
      ..etc..
    }
    range.popFront();
 }

 Tried adding 'int' and 'char[]' or 'auto' .. no dice.

 Can someone explain why this fails, and if this is a permanent or  
 temporary limitation of D/MD.
Further, the naive solution of adding .array gets you in all sorts of trouble :p (The whole byLine buffer re-use issue).

This should be simple and easy, dare I say it trivial.. or am I just being dense here.

R
Feb 11 2014
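For readers unfamiliar with the buffer re-use issue mentioned above: `byLine` may overwrite its internal buffer on each `popFront`, so slices stored from earlier iterations can end up aliasing later data. A hedged sketch of one common way around it, copying each line with `idup` before storing it. The helper name `readLinesCopied` and the scratch file name are assumptions made for this example.

```d
import std.algorithm : map;
import std.array : array;
import std.file : remove, tempDir, write;
import std.path : buildPath;
import std.stdio : File, writeln;

// Hypothetical helper: read every line, copying each one out of byLine's buffer.
string[] readLinesCopied(string path)
{
    // idup makes an independent immutable copy of the current line,
    // so the stored strings stay valid after the buffer is reused.
    return File(path).byLine.map!(line => line.idup).array;
}

void main()
{
    // Scratch file created only for this demonstration.
    auto path = buildPath(tempDir(), "byline_copy_demo.txt");
    write(path, "one\ntwo\nthree\n");
    scope (exit) remove(path);

    writeln(readLinesCopied(path));
}
```

This still materialises every line in memory, so for a very large file the copy is better applied only to the lines that actually need to outlive the loop iteration.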
parent reply "Tobias Pankrath" <tobias pankrath.net> writes:
 Further, the naive solution of adding .array gets you in all 
 sorts of trouble :p  (The whole byLine buffer re-use issue).

 This should be simple and easy, dare I say it trivial.. or am I 
 just being dense here.

 R
The second naive solution would be to use readText and splitLines.
Feb 11 2014
next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Tue, 11 Feb 2014 10:52:39 -0000, Tobias Pankrath <tobias pankrath.net>  
wrote:

 Further, the naive solution of adding .array gets you in all sorts of  
 trouble :p  (The whole byLine buffer re-use issue).

 This should be simple and easy, dare I say it trivial.. or am I just  
 being dense here.

 R
The second naive solution would be to use readText and splitLines.
The file is huge in my case :)

R
Feb 11 2014
prev sibling parent "Steve Teale" <steve.teale britseyeview.com> writes:
On Tuesday, 11 February 2014 at 10:52:40 UTC, Tobias Pankrath 
wrote:

 The second naive solution would be to use readText and 
 splitLines.
That's the sort of thing I always do because then I understand what's going on, and when there's a bug I can find it easily! But then I'm not writing libraries.

Steve
Feb 11 2014
prev sibling next sibling parent reply "Tobias Pankrath" <tobias pankrath.net> writes:
On Tuesday, 11 February 2014 at 10:10:27 UTC, Regan Heath wrote:
 Things like this should "just work"..

 File input ...

 auto range = input.byLine();
 while(!range.empty)
 {
   range.popFront();
   foreach (i, line; range.take(4))  //Error: cannot infer 
 argument types
   {
     ..etc..
   }
   range.popFront();
 }

 Tried adding 'int' and 'char[]' or 'auto' .. no dice.

 Can someone explain why this fails, and if this is a permanent 
 or temporary limitation of D/MD.

 R
Is foreach(i, val; aggregate) even defined if the aggregate is not an array or associative array? It is not in the docs: http://dlang.org/statement#ForeachStatement
Feb 11 2014
parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Tue, 11 Feb 2014 10:58:17 -0000, Tobias Pankrath <tobias pankrath.net>  
wrote:

 On Tuesday, 11 February 2014 at 10:10:27 UTC, Regan Heath wrote:
 Things like this should "just work"..

 File input ...

 auto range = input.byLine();
 while(!range.empty)
 {
   range.popFront();
   foreach (i, line; range.take(4))  //Error: cannot infer argument types
   {
     ..etc..
   }
   range.popFront();
 }

 Tried adding 'int' and 'char[]' or 'auto' .. no dice.

 Can someone explain why this fails, and if this is a permanent or  
 temporary limitation of D/MD.

 R
Is foreach(i, val; aggregate) even defined if the aggregate is not an array or associative array? It is not in the docs: http://dlang.org/statement#ForeachStatement
import std.stdio;

struct S1 {
    private int[] elements = [9,8,7];
    int opApply (int delegate (ref uint, ref int) block) {
        foreach (uint i, int n ; this.elements)
            block(i, n);
        return 0;
    }
}

void main()
{
    S1 range;
    foreach(uint i, int x; range)
    {
      writefln("%d is %d", i, x);
    }
}

R
Feb 11 2014
next sibling parent reply "Tobias Pankrath" <tobias pankrath.net> writes:
On Tuesday, 11 February 2014 at 13:00:19 UTC, Regan Heath wrote:
 import std.stdio;

 struct S1 {
    private int[] elements = [9,8,7];
    int opApply (int delegate (ref uint, ref int) block) {
        foreach (uint i, int n ; this.elements)
            block(i, n);
        return 0;
    }
 }

 void main()
 {
 	S1 range;	
 	foreach(uint i, int x; range)
 	{
 	  writefln("%d is %d", i, x);
 	}
 }

 R
byLine does not use opApply https://github.com/D-Programming-Language/phobos/blob/master/std/stdio.d#L1389
Feb 11 2014
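To make the distinction concrete, here is a small sketch of the range mechanism that `byLine` relies on instead of opApply. A range exposes only `empty`, `front` and `popFront`, and foreach over it behaves roughly like the explicit loop in `drain`. The `Countdown` struct and the `drain` helper are hypothetical, written just for this illustration.

```d
import std.stdio : writeln;

// Hypothetical minimal input range counting down from `n` to 1.
struct Countdown
{
    int n;
    @property bool empty() const { return n <= 0; }
    @property int front() const { return n; }
    void popFront() { --n; }
}

// Collect the elements with an explicit loop, roughly what foreach does for a range.
int[] drain(Countdown r)
{
    int[] result;
    for (; !r.empty; r.popFront())
        result ~= r.front;
    return result;
}

// With only empty/front/popFront there is nothing to supply a second loop variable.
static assert(!__traits(compiles, { foreach (i, x; Countdown(3)) {} }));
// The single-variable form is accepted.
static assert(__traits(compiles, { foreach (x; Countdown(3)) {} }));

void main()
{
    writeln(drain(Countdown(3)));
}
```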
parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Tue, 11 Feb 2014 13:11:54 -0000, Tobias Pankrath <tobias pankrath.net>  
wrote:

 On Tuesday, 11 February 2014 at 13:00:19 UTC, Regan Heath wrote:
 import std.stdio;

 struct S1 {
    private int[] elements = [9,8,7];
    int opApply (int delegate (ref uint, ref int) block) {
        foreach (uint i, int n ; this.elements)
            block(i, n);
        return 0;
    }
 }

 void main()
 {
 	S1 range;	
 	foreach(uint i, int x; range)
 	{
 	  writefln("%d is %d", i, x);
 	}
 }

 R
byLine does not use opApply https://github.com/D-Programming-Language/phobos/blob/master/std/stdio.d#L1389
Ahh.. so this is a limitation of the range interface. Any plans to "fix" this?

R
Feb 12 2014
parent reply "Jakob Ovrum" <jakobovrum gmail.com> writes:
On Wednesday, 12 February 2014 at 10:44:57 UTC, Regan Heath wrote:
 Ahh.. so this is a limitation of the range interface.  Any 
 plans to "fix" this?

 R
Did my original reply not arrive? It is the first reply in the thread... Reproduced:
 See this pull request[1] and the linked enhancement report.

 Also note that calling `r.popFront()` without checking 
 `r.empty` is a program error (so it's recommended to at least 
 put in an assert).

 [1] https://github.com/D-Programming-Language/phobos/pull/1866
Feb 12 2014
parent "Regan Heath" <regan netmail.co.nz> writes:
On Wed, 12 Feb 2014 11:08:57 -0000, Jakob Ovrum <jakobovrum gmail.com>  
wrote:

 On Wednesday, 12 February 2014 at 10:44:57 UTC, Regan Heath wrote:
 Ahh.. so this is a limitation of the range interface.  Any plans to  
 "fix" this?

 R
Did my original reply not arrive? It is the first reply in the thread...
It did, thanks. It would be better if this was part of the language and "just worked" as expected, but this is just about as good.

R
Feb 13 2014
prev sibling parent reply "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Tuesday, 11 February 2014 at 13:00:19 UTC, Regan Heath wrote:
 import std.stdio;

 struct S1 {
    private int[] elements = [9,8,7];
    int opApply (int delegate (ref uint, ref int) block) {
        foreach (uint i, int n ; this.elements)
            block(i, n);
        return 0;
    }
 }

 void main()
 {
 	S1 range;
S1 is not a range. But this is a correct response to "Is foreach(i, val; aggregate) even defined if the aggregate is not an array or associative array?"
Feb 11 2014
parent "Regan Heath" <regan netmail.co.nz> writes:
On Tue, 11 Feb 2014 19:08:18 -0000, Jesse Phillips  
<Jesse.K.Phillips+D gmail.com> wrote:

 On Tuesday, 11 February 2014 at 13:00:19 UTC, Regan Heath wrote:
 import std.stdio;

 struct S1 {
    private int[] elements = [9,8,7];
    int opApply (int delegate (ref uint, ref int) block) {
        foreach (uint i, int n ; this.elements)
            block(i, n);
        return 0;
    }
 }

 void main()
 {
 	S1 range;
S1 is not a range. But this is a correct response to "Is foreach(i, val; aggregate) even defined if the aggregate is not an array or associative array?"
True, but then I had missed the fact that there are two distinct mechanisms (opApply/range) in play here.

R
Feb 12 2014
prev sibling next sibling parent reply "Rene Zwanenburg" <renezwanenburg gmail.com> writes:
On Tuesday, 11 February 2014 at 10:10:27 UTC, Regan Heath wrote:
 Things like this should "just work"..

 File input ...

 auto range = input.byLine();
 while(!range.empty)
 {
   range.popFront();
   foreach (i, line; range.take(4))  //Error: cannot infer 
 argument types
   {
     ..etc..
   }
   range.popFront();
 }

 Tried adding 'int' and 'char[]' or 'auto' .. no dice.

 Can someone explain why this fails, and if this is a permanent 
 or temporary limitation of D/MD.

 R
foreach (i, line; iota(size_t.max).zip(range.take(4)))
{

}
Feb 11 2014
parent reply "Ali Çehreli" <acehreli yahoo.com> writes:
On 02/11/2014 06:25 AM, Rene Zwanenburg wrote:
 On Tuesday, 11 February 2014 at 10:10:27 UTC, Regan Heath wrote:
   foreach (i, line; range.take(4))  //Error: cannot infer argument types
   {
     ..etc..
   }
 foreach (i, line; iota(size_t.max).zip(range.take(4)))
 {

 }
There is also the following, relying on tuples' automatic expansion in foreach:

    foreach (i, element; zip(sequence!"n", range.take(4))) {
        // ...
    }

Ali
Feb 11 2014
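A self-contained sketch of the `zip` workaround from the replies above, using the `iota(size_t.max)` form. The helper name `countedTake` and the sample data are assumptions for illustration. `zip` stops at the shorter of its inputs by default, and foreach expands each resulting `Tuple` into the two loop variables.

```d
import std.range : iota, take, zip;
import std.stdio : writeln;
import std.typecons : Tuple, tuple;

// Hypothetical helper: pair the first `n` items with a counter starting at 0.
Tuple!(size_t, string)[] countedTake(string[] lines, size_t n)
{
    Tuple!(size_t, string)[] result;

    // The counter range is effectively unbounded; iteration ends when
    // lines.take(n) is exhausted.
    foreach (i, line; zip(iota(size_t.max), lines.take(n)))
        result ~= tuple(i, line);

    return result;
}

void main()
{
    writeln(countedTake(["w", "x", "y", "z", "extra"], 4));
}
```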
parent "Regan Heath" <regan netmail.co.nz> writes:
On Tue, 11 Feb 2014 17:11:46 -0000, Ali Çehreli <acehreli yahoo.com> wrote:

 On 02/11/2014 06:25 AM, Rene Zwanenburg wrote:
 On Tuesday, 11 February 2014 at 10:10:27 UTC, Regan Heath wrote:
   foreach (i, line; range.take(4))  //Error: cannot infer argument types
   {
     ..etc..
   }
 foreach (i, line; iota(size_t.max).zip(range.take(4)))
 {

 }
 There is also the following, relying on tuples' automatic expansion in foreach:

      foreach (i, element; zip(sequence!"n", range.take(4))) {
          // ...
      }
Thanks for the workarounds. :) Both seem needlessly opaque, but I realise you're not suggesting these are better than the original, just that they actually work today.

R
Feb 12 2014
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 11 Feb 2014 05:10:27 -0500, Regan Heath <regan netmail.co.nz>  
wrote:

 Things like this should "just work"..

 File input ...

 auto range = input.byLine();
 while(!range.empty)
 {
    range.popFront();
    foreach (i, line; range.take(4))  //Error: cannot infer argument types
    {
      ..etc..
    }
    range.popFront();
 }

 Tried adding 'int' and 'char[]' or 'auto' .. no dice.

 Can someone explain why this fails, and if this is a permanent or  
 temporary limitation of D/MD.
This is only available using opApply style iteration. Using range iteration does not give you this ability.

It's not a permanent limitation per se, but there is no plan at the moment to add multiple parameters to range iteration.

One thing that IS a limitation though: we cannot overload on return values. So the obvious idea of overloading front to return tuples of various types, would not be feasible. opApply can do that because the delegate is a parameter.

-Steve
Feb 11 2014
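The point about the delegate being a parameter can be sketched as follows: a type may declare several opApply overloads that differ in the delegate's parameter list, and foreach picks the one matching the number of loop variables. The `Numbers` struct is a hypothetical example, not taken from Phobos. Unlike the shorter opApply posted earlier in the thread, this version also returns the delegate's result so that `break` inside the loop body works.

```d
import std.stdio : writeln;

// Hypothetical container supporting both foreach forms through opApply overloads.
struct Numbers
{
    int[] items;

    // Used by: foreach (x; numbers)
    int opApply(scope int delegate(ref int) dg)
    {
        foreach (ref x; items)
            if (auto result = dg(x))
                return result; // non-zero means the body executed break/return
        return 0;
    }

    // Used by: foreach (i, x; numbers)
    int opApply(scope int delegate(size_t, ref int) dg)
    {
        foreach (i, ref x; items)
            if (auto result = dg(i, x))
                return result;
        return 0;
    }
}

void main()
{
    auto nums = Numbers([9, 8, 7]);

    foreach (x; nums)
        writeln(x);

    foreach (i, x; nums)
    {
        if (i == 2)
            break; // handled via the propagated delegate result
        writeln(i, " is ", x);
    }
}
```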
parent "Regan Heath" <regan netmail.co.nz> writes:
On Tue, 11 Feb 2014 19:16:31 -0000, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Tue, 11 Feb 2014 05:10:27 -0500, Regan Heath <regan netmail.co.nz>  
 wrote:

 Things like this should "just work"..

 File input ...

 auto range = input.byLine();
 while(!range.empty)
 {
    range.popFront();
    foreach (i, line; range.take(4))  //Error: cannot infer argument  
 types
    {
      ..etc..
    }
    range.popFront();
 }

 Tried adding 'int' and 'char[]' or 'auto' .. no dice.

 Can someone explain why this fails, and if this is a permanent or  
 temporary limitation of D/MD.
This is only available using opApply style iteration. Using range iteration does not give you this ability. It's not a permanent limitation per se, but there is no plan at the moment to add multiple parameters to range iteration. One thing that IS a limitation though: we cannot overload on return values. So the obvious idea of overloading front to return tuples of various types, would not be feasible. opApply can do that because the delegate is a parameter.
Thanks for the concise/complete response. I had managed to piece this together from other replies but it's clearer now.

R
Feb 12 2014
prev sibling parent reply "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Tuesday, 11 February 2014 at 10:10:27 UTC, Regan Heath wrote:
 Things like this should "just work"..

 File input ...

 auto range = input.byLine();
 while(!range.empty)
 {
   range.popFront();
   foreach (i, line; range.take(4))  //Error: cannot infer 
 argument types
   {
     ..etc..
   }
   range.popFront();
 }

 Tried adding 'int' and 'char[]' or 'auto' .. no dice.

 Can someone explain why this fails, and if this is a permanent 
 or temporary limitation of D/MD.

 R
In case the other replies weren't clear enough. A range does not have an index.

What do you expect 'i' to be? Is it the line number? Is it the index within the line where 'take' begins? Where 'take' stops?

There is a feature of foreach and tuple() which results in the tuple getting expanded automatically.

byLine has its own issues with reuse of the buffer, it isn't inherent to ranges. I haven't really used it (needed it from std.process), when I wanted to read a large file I went with wrapping std.mmfile:

https://github.com/JesseKPhillips/libosm/blob/master/source/util/filerange.d
Feb 11 2014
next sibling parent "thedeemon" <dlang thedeemon.com> writes:
On Tuesday, 11 February 2014 at 19:48:41 UTC, Jesse Phillips 
wrote:

 In case the other replies weren't clear enough. A range does 
 not have an index.

 What do you expect 'i' to be?
In case of foreach(i, x; range) I would expect it to be the iteration number of this particular foreach. I miss it sometimes and have to create another variable and increment it. I didn't know about automatic tuple expansion though, that looks better.
Feb 11 2014
prev sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Tue, 11 Feb 2014 19:48:40 -0000, Jesse Phillips  
<Jesse.K.Phillips+D gmail.com> wrote:

 On Tuesday, 11 February 2014 at 10:10:27 UTC, Regan Heath wrote:
 Things like this should "just work"..

 File input ...

 auto range = input.byLine();
 while(!range.empty)
 {
   range.popFront();
   foreach (i, line; range.take(4))  //Error: cannot infer argument types
   {
     ..etc..
   }
   range.popFront();
 }

 Tried adding 'int' and 'char[]' or 'auto' .. no dice.

 Can someone explain why this fails, and if this is a permanent or  
 temporary limitation of D/MD.

 R
In case the other replies weren't clear enough. A range does not have an index.
It isn't *required* to (input/forward), but it could (random access). I think we even have a template to test if it's indexable as we can optimise some algorithms based on this.
 What do you expect 'i' to be? Is it the line number? Is it the index  
 within the line where 'take' begins? Where 'take' stops?
If I say take(5) I expect 0,1,2,3,4. The index into the take range itself.

The reason I wanted it was I was parsing blocks of data over 6 lines - I wanted to ignore the first and last and process the middle 4. In fact I wanted to skip the 2nd of those 4 as well, but there was no single function (I could find) which would do all that, so I coded the while loop above.
 There is a feature of foreach and tuple() which results in the tuple  
 getting expanded automatically.
And also the opApply overload taking a delegate with both parameters.
 byLine has its own issues with reuse of the buffer, it isn't inherent to  
 ranges. I haven't really used it (needed it from std.process), when I  
 wanted to read a large file I went with wrapping std.mmfile:

 https://github.com/JesseKPhillips/libosm/blob/master/source/util/filerange.d
Cool, thanks.

R
Feb 12 2014
parent reply "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Wednesday, 12 February 2014 at 10:52:13 UTC, Regan Heath wrote:
 On Tue, 11 Feb 2014 19:48:40 -0000, Jesse Phillips 
 <Jesse.K.Phillips+D gmail.com> wrote:

 On Tuesday, 11 February 2014 at 10:10:27 UTC, Regan Heath 
 wrote:
 Things like this should "just work"..

 File input ...

 auto range = input.byLine();
 while(!range.empty)
 {
  range.popFront();
  foreach (i, line; range.take(4))  //Error: cannot infer 
 argument types
  {
 It isn't *required* to (input/forward), but it could (random 
 access).  I think we even have a template to test if it's 
 indexable as we can optimise some algorithms based on this.

 What do you expect 'i' to be? Is it the line number? Is it the 
 index within the line where 'take' begins? Where 'take' stops?
If I say take(5) I expect 0,1,2,3,4. The index into the take range itself.
I don't see how these two replies can coexist. 'range.take(5)' is a different range from 'range'.

'range' may not traverse in index order (personally haven't seen such a range). But more importantly you're not dealing with random access ranges. The index you're receiving from take(5) can't be used on the range.

Don't get me wrong, counting the elements as you iterate over them is useful, but it isn't the index into the range you're likely after.

Maybe the number is needed to correspond to a line number.
 There is a feature of foreach and tuple() which results in the 
 tuple getting expanded automatically.
And also the opApply overload taking a delegate with both parameters.
I'm trying to stick with ranges and not iteration in general.
Feb 12 2014
parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Wed, 12 Feb 2014 21:01:58 -0000, Jesse Phillips  
<Jesse.K.Phillips+D gmail.com> wrote:

 On Wednesday, 12 February 2014 at 10:52:13 UTC, Regan Heath wrote:
 On Tue, 11 Feb 2014 19:48:40 -0000, Jesse Phillips  
 <Jesse.K.Phillips+D gmail.com> wrote:

 On Tuesday, 11 February 2014 at 10:10:27 UTC, Regan Heath wrote:
 Things like this should "just work"..

 File input ...

 auto range = input.byLine();
 while(!range.empty)
 {
  range.popFront();
  foreach (i, line; range.take(4))  //Error: cannot infer argument  
 types
  {
 It isn't *required* to (input/forward), but it could (random access).   
 I think we even have a template to test if it's indexable as we can  
 optimise some algorithms based on this.
You chopped off your own comment prompting this response, in which I am responding to a minor side-point, which I think has confused the actual issue. All I was saying above was that a range might well have an index, and we can test for that, but it's not relevant to the foreach issue below.
 What do you expect 'i' to be? Is it the line number? Is it the index  
 within the line where 'take' begins? Where 'take' stops?
If I say take(5) I expect 0,1,2,3,4. The index into the take range itself.
I don't see how these two replies can coexist. 'range.take(5)' is a different range from 'range.'
Yes, exactly, meaning that it can trivially "count" the items it returns, starting from 0, and give those to me as 'i'. *That's all I want*
 'range may not traverse in index order (personally haven't seen such a  
 range). But more importantly you're not dealing with random access  
 ranges. The index you're receiving from take(5) can't be used on the  
 range.
A forward range can do what I am describing above, it's trivial.
 Don't get me wrong, counting the elements as you iterate over them is  
 useful, but it isn't the index into the range you're likely after.
Nope, not what I am after. If I was, I'd iterate over the original range instead or keep a line count manually.
 Maybe the number is needed to correspond to a line number.
Nope. The file contains records of 5 lines plus a blank line. I want 0, 1, 2, 3, 4, 5 so I can skip lines 0, 2, and 5 *of each record*.

R
Feb 13 2014
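One possible way to express the record-skipping task described above, sketched under the assumption that `std.range.enumerate` is available. Instead of restarting a counter for every record, it counts across the whole input and uses the remainder modulo the record length as the position within the record. The helper name `keepFromRecords` and the sample data are hypothetical.

```d
import std.range : enumerate;
import std.stdio : writeln;

// Hypothetical helper: records are `recordLen` lines long; drop the lines whose
// 0-based position inside their record is listed in `skip`.
string[] keepFromRecords(string[] lines, size_t recordLen, const size_t[] skip)
{
    import std.algorithm : canFind;

    string[] kept;
    foreach (i, line; lines.enumerate)
    {
        immutable pos = i % recordLen; // position within the current record
        if (!skip.canFind(pos))
            kept ~= line;
    }
    return kept;
}

void main()
{
    auto lines = ["head1", "a1", "x1", "b1", "c1", "",
                  "head2", "a2", "x2", "b2", "c2", ""];

    // Skip lines 0, 2 and 5 of each 6-line record.
    writeln(keepFromRecords(lines, 6, [0, 2, 5]));
}
```

With `byLine` as the source, each kept line would additionally need a copy (for example with `idup`) before being stored, because of the buffer re-use discussed earlier in the thread.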
parent reply "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Thursday, 13 February 2014 at 14:30:41 UTC, Regan Heath wrote:
 Don't get me wrong, counting the elements as you iterate over 
 them is useful, but it isn't the index into the range you're 
 likely after.
Nope, not what I am after. If I was, I'd iterate over the original range instead or keep a line count manually.
Maybe a better way to phrase this is, while counting may be what your implementation needs, it is not immediately obvious what 'i' should be. Someone who desires an index into the original array will expect 'i' to be that; even though it can be explained that .take() is not the same range as the original. Thus it is better to be explicit with the .enumerate function.
Feb 13 2014
parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 14 Feb 2014 02:48:51 -0000, Jesse Phillips  
<Jesse.K.Phillips+D gmail.com> wrote:

 On Thursday, 13 February 2014 at 14:30:41 UTC, Regan Heath wrote:
 Don't get me wrong, counting the elements as you iterate over them is  
 useful, but it isn't the index into the range you're likely after.
Nope, not what I am after. If I was, I'd iterate over the original range instead or keep a line count manually.
Maybe a better way to phrase this is, while counting may be what your implementation needs, it is not immediately obvious what 'i' should be. Someone who desires an index into the original array will expect 'i' to be that; even though it can be explained that .take() is not the same range as the original. Thus it is better to be explicit with the .enumerate function.
FWIW I disagree. I think it's immediately and intuitively obvious what 'i' should be when you're foreaching over X items taken from another range, even if you do not know take returns another range. Compare it to calling a function on a range and foreaching on the result, you would intuitively and immediately expect 'i' to relate to the result, not the input.

R
Feb 14 2014
next sibling parent reply "Jakob Ovrum" <jakobovrum gmail.com> writes:
On Friday, 14 February 2014 at 12:10:51 UTC, Regan Heath wrote:
 FWIW I disagree.  I think it's immediately and intuitively 
 obvious what 'i' should be when you're foreaching over X items 
 taken from another range, even if you do not know take returns 
 another range.  Compare it to calling a function on a range and 
 foreaching on the result, you would intuitively and immediately 
 expect 'i' to relate to the result, not the input.

 R
How should it behave on ranges without length, such as infinite ranges?

Also, `enumerate` has the advantage of the `start` parameter, whose usefulness is demonstrated in `enumerate`'s example as well as in an additional example in the bug report.

I'm not yet sure whether I think it should be implemented at the language or library level, but I think the library approach has some advantages.
Feb 14 2014
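A brief sketch of the `start` parameter mentioned above, assuming the `std.range.enumerate` signature that takes an optional starting value as its second argument. The helper name `withLineNumbers` is made up for this example. It produces 1-based line numbers without any extra arithmetic in the loop body.

```d
import std.format : format;
import std.range : enumerate;
import std.stdio : writeln;

// Hypothetical helper: prefix each line with a number, counting from `first`.
string[] withLineNumbers(string[] lines, size_t first = 1)
{
    string[] result;
    foreach (n, line; lines.enumerate(first))
        result ~= format("%s: %s", n, line);
    return result;
}

void main()
{
    writeln(withLineNumbers(["alpha", "beta"]));
}
```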
parent "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 14 Feb 2014 12:29:49 -0000, Jakob Ovrum <jakobovrum gmail.com>  
wrote:

 On Friday, 14 February 2014 at 12:10:51 UTC, Regan Heath wrote:
 FWIW I disagree.  I think it's immediately and intuitively obvious what  
 'i' should be when you're foreaching over X items taken from another  
 range, even if you do not know take returns another range.  Compare it  
 to calling a function on a range and foreaching on the result, you  
 would intuitively and immediately expect 'i' to relate to the result,  
 not the input.

 R
How should it behave on ranges without length, such as infinite ranges?
In exactly the same way. It just counts up until you break out of the foreach, or the 'i' value wraps around. In fact the behaviour I want is so trivial I think it could be provided by foreach itself, for iterations of anything. In which case whether 'i' was conceptually an "index" or simply a "count" would depend on whether the range passed to foreach (after all skip, take, etc) was itself indexable.
 Also, `enumerate` has the advantage of the `start` parameter, which  
 usefulness is demonstrated in `enumerate`'s example as well as in an  
 additional example in the bug report.
Sure, if you need more functionality reach for enumerate. We can have both; sensible default behaviour AND enumerate for more complicated cases. In my case, enumerate w/ start wouldn't have helped (my file was blocks of 6 lines, where I wanted to skip lines 1, 3, and 6 *of each block*)
 I'm not yet sure whether I think it should be implemented at the  
 language or library level, but I think the library approach has some  
 advantages.
Certainly, for the more complex usage. But I reckon we want both enumerate and a simple language solution which would do what I've been trying to describe.

R
Feb 14 2014
prev sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Regan Heath:

 FWIW I disagree.  I think it's immediately and intuitively 
 obvious what 'i' should be when you're foreaching over X items 
 taken from another range, even if you do not know take returns 
 another range.  Compare it to calling a function on a range and 
 foreaching on the result, you would intuitively and immediately 
 expect 'i' to relate to the result, not the input.
Using enumerate has several advantages. It gives a bit longer code, but it keeps as much complexity as possible out of the language. So the language gets simpler to implement and its compiler is smaller and simpler to debug.

Also, using enumerate is more explicit, if you have an associative array you can iterate it in many ways:

foreach (v; AA) {}
foreach (k, v; AA) {}
foreach (k; AA.byKey) {}
foreach (i, k; AA.byKey.enumerate) {}
foreach (i, v; AA.byValue.enumerate) {}
foreach (k, v; AA.byPairs) {}
foreach (i, k, v; AA.byPairs.enumerate) {}

If you want all those schemes built in a language (and to use them without adding .enumerate) you risk making a mess. In this case "explicit is better than implicit".

Python does the same with its enumerate function and keeps the for loop simple:

for k in my_dict: pass
for i, v in enumerate(my_dict.itervalues()): pass
etc.

In D we have a mess because tuples are not built-in. Instead of having a built-in functionality similar to what enumerate does, it's WAY better to have built-in tuples.

Finding what's important and what is not important to have as built-ins in a language is an essential and subtle design problem.

Bye,
bearophile
Feb 14 2014
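A runnable sketch of a few of the associative-array iteration forms listed above, restricted to the ones that rely only on the built-in `byKey`/`byValue` properties plus `std.range.enumerate` (assumed available). The `counters` helper and the sample data are hypothetical. Iteration order over an associative array is unspecified, so the example only relies on the counter values, which do not depend on key order.

```d
import std.range : enumerate;
import std.stdio : writeln;

// Hypothetical helper: collect the counter values produced while walking the keys.
size_t[] counters(int[string] aa)
{
    size_t[] seen;
    foreach (i, k; aa.byKey.enumerate)
        seen ~= i;
    return seen;
}

void main()
{
    int[string] ages = ["ann": 30, "bob": 25];

    // Built-in forms: value only, or key and value.
    foreach (v; ages)
        writeln(v);
    foreach (k, v; ages)
        writeln(k, " => ", v);

    // Explicit counter over the values.
    foreach (i, v; ages.byValue.enumerate)
        writeln(i, ": ", v);

    writeln(counters(ages));
}
```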
parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 14 Feb 2014 13:14:51 -0000, bearophile <bearophileHUGS lycos.com>  
wrote:

 Regan Heath:

 FWIW I disagree.  I think it's immediately and intuitively obvious what  
 'i' should be when you're foreaching over X items taken from another  
 range, even if you do not know take returns another range.  Compare it  
 to calling a function on a range and foreaching on the result, you  
 would intuitively and immediately expect 'i' to relate to the result,  
 not the input.
Using enumerate has several advantages.
In my case I didn't need any of these.

Simple things should be simple and intuitive to write. Yes, we want enumerate *as well* especially for the more complex cases but we also want the basics to be simple, intuitive and easy.

That's all I'm saying here. This seems to me to be very low hanging fruit.

R
Feb 14 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Regan Heath:

 In my case I didn't need any of these.
I don't understand.

Bye,
bearophile
Feb 14 2014
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Isn't this discussion about adding an index to a range? If it is, 
then I have shown why adding it in the language is a bad idea.

Bye,
bearophile
Feb 14 2014
parent reply "Marc Schütz" <schuetzm gmx.net> writes:
On Friday, 14 February 2014 at 17:42:53 UTC, bearophile wrote:
 Isn't this discussion about adding an index to a range? If it 
 is, then I have shown why adding it in the language is a bad 
 idea.
As far as I understand it, it's about adding an index to _foreach_, as is already supported for arrays:

foreach(v; [1,2,3,4])
    writeln(v);
foreach(i, v; [1,2,3,4])
    writeln(i, " => ", v);

But for ranges, the second form is not possible:

foreach(v; iota(4))           // ok
    writeln(v);
foreach(i, v; iota(4))        // Error: cannot infer argument types
    writeln(i, " => ", v);
Feb 14 2014
parent "bearophile" <bearophileHUGS lycos.com> writes:
Marc Schütz:

 As far as I understand it, it's about adding an index to 
 _foreach_, as is already supported for arrays:

 foreach(v; [1,2,3,4])
     writeln(v);
 foreach(i, v; [1,2,3,4])
     writeln(i, " => ", v);

 But for ranges, the second form is not possible:

 foreach(v; iota(4))           // ok
     writeln(v);
 foreach(i, v; iota(4))        // Error: cannot infer argument 
 types
     writeln(i, " => ", v);
I see. In my post I have explained why this is a bad idea (it's not explicit so it gives confusion, and it complicates the language/compiler).

A better design is to remove the auto-indexing feature for arrays too, and use .enumerate in all cases, as in Python.

Bye,
bearophile
Feb 14 2014
prev sibling parent "Regan Heath" <regan netmail.co.nz> writes:
This turned into a bit of a full spec so I would understand if you TL;DR  
but it would be nice to get some feedback if you have the time..

On Fri, 14 Feb 2014 17:34:46 -0000, bearophile <bearophileHUGS lycos.com>  
wrote:
 Regan Heath:

 In my case I didn't need any of these.
I don't understand.
What I meant here is that I don't need the "advantages" provided by enumerate like the starting index.

One thing I am unclear about from your response is what you mean by implicit in this context? Do you mean the process of inferring things (like the types in foreach)?

(taken from subsequent reply)
 Isn't this discussion about adding an index to a range?
No, it's not. The counter I want would only be an index if the range was indexable, otherwise it's a count of foreach iterations (starting from 0). This counter is (if you like) an "index into the result set" which is not necessarily also an index into the source range (which may not be indexable).

What we currently have with foreach is an index and only for indexable things. I want to instead generalise this to be a counter which is an index when the thing being enumerated is indexable, otherwise it is a count or "index into the result set".

Let's call this change scheme #0. It solves my issue, and interestingly also would have meant we didn't need to add byKey or byValue to AA's, instead we could have simply made keys/values indexable ranges and not broken any existing code.

Further details of scheme #0 below.

(taken from subsequent reply)
 If you want all those schemes built in a language (and to use them  
 without adding .enumerate) you risk making
 a mess. In this case "explicit is better than implicit".
Have a read of what I have below and let me know if you think it's a "mess". Scheme #2 has more rules, and might be called a "mess" perhaps. But, scheme #1 is fairly clean and simple and I think better overall. The one downside is that without some additional syntax it cannot put tuple components nicely in context with descriptive variable names, so there is that.

To be fair to all 3 schemes below, they mostly "just work" for simple cases and/or cases where different types are used for key/values in AA's and tuples. The more complicated rules only kick in to deal with the cases where there is ambiguity (AA's with the same type for key and value and tuples with multiple components of the same type).

Anyway, on to the details..

***********

Scheme 0)

So, what I want is for foreach to simply increment a counter after each call to the body of the foreach, giving me a counter from 0 to N (or infinity/wrap). It would do this when prompted to do so by a variable being supplied in the foreach statement in the usual way (for arrays/opApply).

This counter would not be defined/understood to be an "index" into the object being enumerated necessarily (as it currently is), instead if the object is indexable then it would indeed be an index, otherwise it's a count (index into the result set).

I had not been considering associative arrays until now, given current support (without built in tuples) they do not seem to be a special case to me. Foreach over byKey() should look/function identically to foreach over keys, likewise for byValue(). The only difference is that in the byKey()/byValue() case the counter is not necessarily an index into anything, though it would be if the underlying byKey() range was indexable.

The syntax for this is the same as we have for arrays/classes with opApply today. In other words, "it just works" and my example would compile and run as one might expect.

This seems to me to be intuitive, useful and easy to implement.
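As a rough illustration of the behaviour scheme #0 asks for, here is how the counter has to be written by hand today over a non-indexable range. This is only a sketch of the intended semantics, not the proposed syntax; the names aa and count are made up for the example:

```d
import std.stdio : writeln;

void main()
{
    double[string] aa = ["one": 1.0, "two": 2.0];

    // Manual counter: a count of iterations ("index into the result set"),
    // not an index into the AA itself.
    size_t count = 0;
    foreach (k; aa.byKey)
    {
        writeln(count, " => ", k);
        ++count;   // incremented after each run of the loop body
    }
}
```

Under scheme #0 the declaration and increment of count would be generated by the compiler whenever an extra loop variable is supplied.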
Further, I believe it leaves the door open to having built in tuples (or using library extensions like enumerate()), with similarly clean syntax and no "mess".

***********

So, what if we had built in tuples? Well, seems to me we could do foreach over AAs/tuples in one of 2 ways or even a combination of both:

Scheme 1) for AA's/tuples the value given to the foreach body is a voldemort (unnamed) type with a public property member for each component of the AA/tuple. In the case of AA's this would then be "key" and "value", for tuples it might be a, b, .., z, aa, bb, .. and so on.

foreach(x; AA) {}        // Use x.key and x.value
foreach(i, x; AA) {}     // Use i, x.key and x.value
foreach(int i, x; AA) {} // Use i, x.key and x.value

Extra/better: For non-AA tuples we could allow the members to be named using some sort of syntax, i.e.

foreach(i, (x.bob, x.fred); AA) {} // Use i, x.bob and x.fred
or
foreach(i, x { int bob; string fred }; AA) {} // Use i, x.bob and x.fred
or
foreach(i, new x { int bob; string fred }; AA) {} // Use i, x.bob and x.fred

Let's look at your examples re-written for scheme #1
 foreach (v; AA) {}
foreach (x; AA) { .. use x.value .. } // better? worse?
 foreach (k, v; AA) {}
foreach (x; AA) { .. use x.key, x.value .. } // better? worse?
 foreach (k; AA.byKeys) {}
same // no voldemort reqd
 foreach (i, k; AA.byKeys.enumerate) {}
foreach (i, k; AA.byKeys) {} // better. note, no voldemort reqd
 foreach (i, v; AA.byValues.enumerate) {}
foreach (i, v; AA.byValues) {} // better. note, no voldemort reqd
 foreach (k, v; AA.byPairs) {}
foreach (x; AA) { .. use x.key, x.value .. } // better
 foreach (i, k, v; AA.byPairs.enumerate) {}
foreach (i, x; AA) { .. use i and x.key, x.value .. } // better

This is my preferred approach TBH, you might call it foreach on "packed" tuples.

Scheme 2) the tuple is unpacked into separate variables given in the foreach.

When no types are given, components are assigned to variables such that the rightmost is the last AA/tuple component and subsequent untyped variables get previous components up and until the N+1 th which gets index/count.

foreach (v; AA) {}       // v is "value" (last tuple component)
foreach (k, v; AA) {}    // k is "key"   (2nd to last tuple component), ...
foreach (i, k, v; AA) {} // i is "index/count" because AA only has 2 tuple components.

So, if you have N tuple components and you supply N+1 variables you get the index/count. Supplying any more would be an error.

However, if a type is given and the type can be unambiguously matched to a single tuple component then do so.

double[string] AA;
foreach (string k; AA) {} // k is "key"

.. in which case, any additional unmatched untyped or 'int' variable is assigned the index/count. e.g.

foreach (i, double v; AA) {} // i is index, v is "value"
foreach (i, string k; AA) {} // i is index, k is "key"

If more than one typed variable is given, match each unambiguously.

foreach (string k, double v; AA) {} // k is "key", v is "value"

.. and likewise any unmatched untyped or 'int' variable is assigned index/count. e.g.

foreach (i, string k, double v; AA) {}     // i is index, k is "key", v is "value"
foreach (int i, string k, double v; AA) {} // i is index, k is "key", v is "value"

Any ambiguous situation would result in an error requiring the use of one of .keys/values (in the case of an AA), or to specify types (where possible), or to specify them all in the tuple order, e.g.

Using a worst case of..

int[int] AA;

// Error: cannot infer binding of k; could be 'key' or 'value'
foreach (int k; AA) {}

// Solve using .keys/byKey()/values/byValue()
foreach (k; AA.byKeys) {}    // k is "key"
foreach (i, k; AA.byKeys) {} // i is index/count, k is "key"

// Solve using tuple order
foreach (k, v; AA) {}    // k is "key", v is "value"
foreach (i, k, v; AA) {} // i is index/count, k is "key", v is "value"

So, to your examples re-written for scheme #2
 foreach (v; AA) {}
same
 foreach (k, v; AA) {}
same
 foreach (k; AA.byKeys) {}
same
 foreach (i, k; AA.byKeys.enumerate) {}
foreach (i, k; AA.byKeys) {} // better
 foreach (i, v; AA.byValues.enumerate) {}
foreach (i, v; AA.byValues) {} // better
 foreach (k, v; AA.byPairs) {}
foreach (k, v; AA) {} // better
 foreach (i, k, v; AA.byPairs.enumerate) {}
foreach (i, k, v; AA) {} // better

This scheme is more complicated than #1 so it's not my preferred solution. But, it does name the components better than #1.

Scheme 3) Combination.

If we were to combine these ideas we would need to prefer one scheme by default, if we select scheme #1, for example, then any time a foreach is specified we default to assuming #1 where possible, and #2 otherwise.

In which case there are clear cases where scheme #2 is required/used:
 - when more than 2 variables are given, or
 - a specific type is given for the final variable.

Note that #1 only has 3 possible forms (for AA or tuples):

double[string] AA;
foreach (v; AA) {}        // #1 v is voldemort(key, value)/tuple
foreach (i, v; AA) {}     // #1 i is index/count, v is voldemort(key, value)/tuple
foreach (int i, v; AA) {} // #1 i is index/count, v is voldemort(key, value)/tuple

#2 would take effect in these forms..

foreach (i, double v; AA) {} // #2 (type given) i is index/count, v is value
foreach (i, string k; AA) {} // #2 (type given) i is index/count, k is key
foreach (i, k, v; AA) {}     // #2 (3 variables) i is index/count, k is key, v is value

Your examples re-written for scheme #3

With..
(*) any                // voldemort scheme #1 works with any AA/tuple types even worst case "all one type"
(A) int[int] AA;       // worst case
(B) double[string] AA;
 foreach (v; AA) {}
(*) foreach (x; AA) { .. use x.value .. } // better?
(A) foreach (i, k, v; AA) { } // worse?
(B) foreach (double v; AA) { } // worse?
 foreach (k, v; AA) {}
(*) foreach (x; AA) { .. use x.key, x.value .. } // better?
(A) foreach (k, double v; AA) { } // force scheme #2, worse?
(B) foreach (double v; AA) { } // force scheme #2, worse?
 foreach (k; AA.byKeys) {}
same // note, no voldemort reqd
 foreach (i, k; AA.byKeys.enumerate) {}
(*) foreach (i, k; AA.byKeys) {} // better, note; no voldemort reqd
 foreach (i, v; AA.byValues.enumerate) {}
(*) foreach (i, v; AA.byValues) {} // better, note; no voldemort reqd
 foreach (k, v; AA.byPairs) {}
(*) foreach (x; AA) { .. use x.key, x.value .. } // better
 foreach (i, k, v; AA.byPairs.enumerate) {}
(*) foreach (i, x; AA) { .. use i and x.key, x.value .. } // better
(A) foreach (i, k, v; AA) { } // better
(B) foreach (i, k, v; AA) { } // better

This is a trade off between #1 and #2 but on balance I feel it is worse than #1 so is not my preferred solution.

***********

Thoughts?

Regan

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
Feb 17 2014