digitalmars.D - access foreach index counter for Iterate n times

reply mw <mingwu gmail.com> writes:
Not sure if this has been discussed before:

https://tour.dlang.org/tour/en/basics/foreach

```
foreach (i, e; [4, 5, 6]) {
     writeln(i, ":", e);
}
// 0:4 1:5 2:6
```

So what if I want the loop index counter when looping n times 
with the `..` syntax? I tried:

```
foreach (i, e; 5 .. 10) {
     writeln(i, ":", e);
}
```

When I tried it just now, I got a syntax error:

onlineapp.d(4): Error: found `..` when expecting `)`
onlineapp.d(4): Error: found `)` when expecting `;` following 
statement

Yes, there are a number of manual workarounds, but I think 
this 2nd foreach loop should be as valid as the 1st one on 
arrays, so that the language behavior is more consistent.

Thoughts?
May 20 2022
parent reply forkit <forkit gmail.com> writes:
On Saturday, 21 May 2022 at 01:01:40 UTC, mw wrote:
 Not sure if this has been discussed before:

 https://tour.dlang.org/tour/en/basics/foreach

 ```
 foreach (i, e; [4, 5, 6]) {
     writeln(i, ":", e);
 }
 // 0:4 1:5 2:6
 ```

 so what if I want the loop index counter when I want to loop n 
 times with the .. syntax, I tried:

 ```
 foreach (i, e; 5 .. 10) {
     writeln(i, ":", e);
 }
 ```

 right now when I tried it, got syntax error:

 onlineapp.d(4): Error: found `..` when expecting `)`
 onlineapp.d(4): Error: found `)` when expecting `;` following 
 statement

 Yes, there are a number of manual work-around, but I think the 
 this 2nd foreach loop should be as valid as the 1st one on 
 arrays, so the language behavior is more consistent.

 Thoughts?
// ------

module test;
@safe:

import std;

void main()
{
    foreach(i, e; iota(5,11).array)
        writeln(i, ":", e);
}

/+
0:5
1:6
2:7
3:8
4:9
5:10
+/

// ----------------------
May 20 2022
parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 5/20/22 18:26, forkit wrote:

 {
      foreach(i, e; iota(5,11).array)
          writeln(i, ":", e);
 }
That .array may be unnecessarily expensive in some cases because it allocates memory. The following is an alternative:

     foreach(i, e; iota(5,11).enumerate)
         writeln(i, ":", e);

Ali
May 20 2022
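A minimal sketch of the difference described above, assuming a recent Phobos: `enumerate` wraps the lazy `iota` range and pairs each element with a counter, whereas `.array` first copies every element into a newly allocated array. The variable names are only illustrative.

```d
import std.array : array;
import std.range : enumerate, iota;
import std.stdio : writeln;

void main()
{
    // iota(5, 11) lazily yields 5, 6, 7, 8, 9, 10 (the upper bound is exclusive).
    auto numbers = iota(5, 11);

    // enumerate pairs each element with a running counter; no array is built.
    foreach (i, e; numbers.enumerate)
        writeln(i, ":", e);

    // .array yields the same elements but copies them into a GC-allocated int[].
    int[] copied = numbers.array;
    assert(copied == [5, 6, 7, 8, 9, 10]);

    // The counter can also start at a value other than zero.
    auto first = numbers.enumerate(1).front;
    assert(first.index == 1 && first.value == 5);
}
```

The elements of an enumerated range are tuples with `index` and `value` fields, which `foreach` unpacks into the two loop variables.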
parent reply mw <mingwu gmail.com> writes:
On Saturday, 21 May 2022 at 02:08:05 UTC, Ali Çehreli wrote:
 On 5/20/22 18:26, forkit wrote:

 {
      foreach(i, e; iota(5,11).array)
          writeln(i, ":", e);
 }
 That .array may be unnecessarily expensive in some cases because 
 it allocates memory. The following is an alternative:

      foreach(i, e; iota(5,11).enumerate)
          writeln(i, ":", e);
Sure, all nice workarounds.

But isn't iota as a function call also expensive?

I wrote this naturally after seeing the array foreach example:

```
foreach (i, e; 5 .. 10)
```

I don't even know the function iota, and why it's needed for a simple loop like this?

```
foreach (e; 5 .. 10)
```

just to get i?

My point: the language should be more consistent.
May 20 2022
next sibling parent max haughton <maxhaton gmail.com> writes:
On Saturday, 21 May 2022 at 02:23:28 UTC, mw wrote:
 But isn't iota as a function call also expensive?
No. https://d.godbolt.org/z/TsKbe96dv
 I wrote this naturally after seeing the array foreach example:

 ```
 foreach (i, e; 5 .. 10)
 ```

 I don't even know the function iota, and why it's needed for a 
 simple loop like this?


 ```
 foreach (e; 5 .. 10)
 ```

 just to get i?

 My point: the language should be more consistent.
I don't disagree.
May 20 2022
prev sibling next sibling parent reply Mike Parker <aldacron gmail.com> writes:
On Saturday, 21 May 2022 at 02:23:28 UTC, mw wrote:

 My point: the language should be more consistent.
You get indexes in a foreach with arrays and associative arrays because they have indexes. You don't get them with input ranges because they have no indexes (hence, the need for enumerate). So is it really inconsistent that a numeric range of 0 .. N has no index?
May 20 2022
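The distinction drawn above can be sketched as follows. This is only an illustration under the assumption of a current compiler and Phobos; the associative array contents are made up.

```d
import std.range : iota;

void main()
{
    // Arrays: the first loop variable is the index into the array.
    int[] arr = [4, 5, 6];
    foreach (i, e; arr)
        assert(arr[i] == e);

    // Associative arrays: the first loop variable is the key.
    int[string] ages = ["ann": 30, "bob": 25]; // hypothetical data
    foreach (name, age; ages)
        assert(ages[name] == age);

    // Ranges whose front is a single value: one loop variable only.
    int sum;
    foreach (e; iota(3))
        sum += e;
    assert(sum == 3);

    // The two-variable form is rejected for such a range.
    static assert(!__traits(compiles, { foreach (i, e; iota(3)) {} }));
}
```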
next sibling parent mw <mingwu gmail.com> writes:
On Saturday, 21 May 2022 at 02:35:37 UTC, Mike Parker wrote:
 On Saturday, 21 May 2022 at 02:23:28 UTC, mw wrote:

 My point: the language should be more consistent.
 You get indexes in a foreach with arrays and associative arrays because they have indexes. You don't get them with input ranges because they have no indexes (hence, the need for enumerate). So is it really inconsistent that a numeric range of 0 .. N has no index?
More abstractly, it's not an array/range index, it's a loop counter (index).
May 20 2022
prev sibling parent Nick Treleaven <nick geany.org> writes:
On Saturday, 21 May 2022 at 02:35:37 UTC, Mike Parker wrote:
 You get indexes in a foreach with arrays and associative arrays 
 because they have indexes. You don't get them with input ranges 
 because they have no indexes (hence, the need for enumerate).
`iota` with integer arguments produces a random access range, which is indexable. It would be nice if `foreach` supported an `index, element` pair with RA ranges, when `front` just returns an element, not a tuple.
May 21 2022
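As a small sketch of the point about random access, the checks below use the range traits from `std.range.primitives`. The index loop at the end is one hand-written way to get an index/element pair today; it is not a built-in `foreach` feature.

```d
import std.range : iota;
import std.range.primitives : hasLength, isRandomAccessRange;

void main()
{
    auto r = iota(5, 10);

    // With integer arguments, the result of iota models a random-access range.
    static assert(isRandomAccessRange!(typeof(r)));
    static assert(hasLength!(typeof(r)));

    assert(r.length == 5);
    assert(r[0] == 5);
    assert(r[4] == 9);

    // A plain index loop produces the (index, element) pair by hand.
    foreach (i; 0 .. r.length)
        assert(r[i] == 5 + i);
}
```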
prev sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 5/20/22 19:23, mw wrote:

 But isn't iota as a function call also expensive?
I am sure someone will show some disassembly to compare. :)
 I wrote this naturally after seeing the array foreach example:

 ```
 foreach (i, e; 5 .. 10)
 ```
The biggest issue there is the number range, which is a weird thing in the language.
 My point: the language should be more consistent.
The automatic loop counter is available only for arrays. If we see number ranges as things other than arrays (they are not arrays), then there is no consistency issue. One could argue to remove number ranges from the language but I don't think it's that big of a deal.

Ali
May 20 2022
parent reply mw <mingwu gmail.com> writes:
On Saturday, 21 May 2022 at 02:40:06 UTC, Ali Çehreli wrote:
 On 5/20/22 19:23, mw wrote:

 But isn't iota as a function call also expensive?
I am sure someone will show some disassembly to compare. :)
from Max:
 https://d.godbolt.org/z/TsKbe96dv
Nice assembly.
 The biggest issue there is the number range, which is a weird 
 thing in the language.

 My point: the language should be more consistent.
 The automatic loop counter is available only for arrays. If we see number ranges as things other than arrays (they are not arrays), then there is no consistency issue.
Yes! Let's call it a "loop counter", regardless of whether the user is looping over an array or a (number) range: so, "loop counter" as a language concept.
 One could argue to remove number ranges from the language but I 
 don't think it's that big of a deal.
Why not keep it, and have the compiler automatically translate

     foreach (i, e; 5 .. 10)

to:

     foreach(i, e; iota(5,10).enumerate)

especially after seeing the assembly.
May 20 2022
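One detail matters for any such translation: the upper bound of a `..` number range is exclusive, and `iota` follows the same half-open convention, so `5 .. 10` corresponds to `iota(5, 10)`. A minimal sketch to check this (the proposed compiler rewrite itself is hypothetical; only the library calls are real):

```d
import std.algorithm : equal;
import std.range : enumerate, iota;

void main()
{
    // Collect what the built-in number range visits.
    int[] visited;
    foreach (e; 5 .. 10)
        visited ~= e;

    // The upper bound is exclusive: 10 is not visited.
    assert(visited == [5, 6, 7, 8, 9]);

    // iota is half-open too, so iota(5, 10) matches 5 .. 10.
    assert(iota(5, 10).equal(visited));

    // With enumerate, the counter runs from 0 to 4 alongside the values.
    size_t expected = 0;
    foreach (i, e; iota(5, 10).enumerate)
    {
        assert(i == expected && e == 5 + cast(int) i);
        ++expected;
    }
    assert(expected == 5);
}
```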
parent reply Adam D Ruppe <destructionator gmail.com> writes:
On Saturday, 21 May 2022 at 03:01:39 UTC, mw wrote:
 Yes! let's call it "loop counter"
It isn't a loop counter. It is an index back into the original source. Consider:

     foreach(INDEX, dchar character; "“”")
         writeln(INDEX); // 0 then 3
May 21 2022
parent reply mw <mingwu gmail.com> writes:
On Saturday, 21 May 2022 at 11:59:42 UTC, Adam D Ruppe wrote:
 On Saturday, 21 May 2022 at 03:01:39 UTC, mw wrote:
 Yes! let's call it "loop counter"
 It isn't a loop counter. It is an index back into the original 
 source. Consider:

      foreach(INDEX, dchar character; "“”")
          writeln(INDEX); // 0 then 3
Wow, this surprised me again!

1) First, why not 0 then 4? Since dchar is 32 bits.

2) Second, compare:

     import std;

     void main()
     {
         dstring ds = "“”";
         writeln(ds.length);
         foreach(INDEX, dchar character; ds)
             writeln(INDEX, character); // 0 then
     }

Output:

     2
     0“
     1”

Explanations?

If it's "loop counter", isn't the behavior more consistent?
May 21 2022
next sibling parent reply user1234 <user1234 12.de> writes:
On Saturday, 21 May 2022 at 15:06:31 UTC, mw wrote:
 On Saturday, 21 May 2022 at 11:59:42 UTC, Adam D Ruppe wrote:
 On Saturday, 21 May 2022 at 03:01:39 UTC, mw wrote:
 Yes! let's call it "loop counter"
 It isn't a loop counter. It is an index back into the original 
 source. Consider:

      foreach(INDEX, dchar character; "“”")
          writeln(INDEX); // 0 then 3
 Wow, this surprised me again!

 1) First, why not 0 then 4? Since dchar is 32 bits.

 2) Second, compare:

      import std;

      void main()
      {
          dstring ds = "“”";
          writeln(ds.length);
          foreach(INDEX, dchar character; ds)
              writeln(INDEX, character); // 0 then
      }

 Output:

      2
      0“
      1”

 Explanations?
Adam D Ruppe's example implies auto (hidden) decoding, which is a special case, so it reads 3 bytes to decode the 1st glyph. The word "counter" is actually correct if the foreach'd aggregate is truly capable of random access. That is the case for your second version, which iterates over a dstring.
 If it's "loop counter", isn't the behavior more consistent?
May 21 2022
next sibling parent Adam D Ruppe <destructionator gmail.com> writes:
On Saturday, 21 May 2022 at 15:21:09 UTC, user1234 wrote:
 Adam D Ruppe's example implies auto (hidden) decoding
small nitpick, this is not autodecoding since you have to request it specifically by specifying dchar. autodecoding is referring to a Phobos thing where it gives dchar even though you didn't specifically ask for it.
May 21 2022
prev sibling parent mw <mingwu gmail.com> writes:
On Saturday, 21 May 2022 at 15:21:09 UTC, user1234 wrote:
 On Saturday, 21 May 2022 at 15:06:31 UTC, mw wrote:
 On Saturday, 21 May 2022 at 11:59:42 UTC, Adam D Ruppe wrote:
 On Saturday, 21 May 2022 at 03:01:39 UTC, mw wrote:
 Yes! let's call it "loop counter"
 It isn't a loop counter. It is an index back into the original 
 source. Consider:

      foreach(INDEX, dchar character; "“”")
          writeln(INDEX); // 0 then 3
 Wow, this surprised me again!

 1) First, why not 0 then 4? Since dchar is 32 bits.

 2) Second, compare:

      import std;

      void main()
      {
          dstring ds = "“”";
          writeln(ds.length);
          foreach(INDEX, dchar character; ds)
              writeln(INDEX, character); // 0 then
      }

 Output:

      2
      0“
      1”

 Explanations?
 Adam D Ruppe's example implies auto (hidden) decoding, which is a special case, so it reads 3 bytes to decode the 1st glyph.
Why 3 bytes? Not 4 bytes? As dchar is specified as 32 bits here? https://dlang.org/spec/type.html
May 21 2022
prev sibling parent reply Adam D Ruppe <destructionator gmail.com> writes:
On Saturday, 21 May 2022 at 15:06:31 UTC, mw wrote:
 It isn't a loop counter. It is an index back into the original 
 source. Consider:
1) First, why not 0 then 4? Since dchar is 32 bits.
It is an index back into the *original source* given to foreach. I gave it a char[], not a dchar[]. So it is counting chars in that original char[].

If you do:

     auto thing = whatever_you_loop_over;
     foreach(index, item; thing) {
         then stuff_before_item == thing[0 .. index];
     }
 Explanations?
The index there is the index into a dstring, which is stored differently.
 If it's "loop counter", isn't the behavior more consistent?
Also see `foreach_reverse` where it counts backwards because it is an index into the original array, not a counter.
May 21 2022
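A sketch of the behaviour described above, assuming the source file is saved as UTF-8. It checks that the index is an offset into the original array by decoding at that offset with `std.utf.decode`, and that `foreach_reverse` reports the same offsets in reverse order. The array names are only illustrative.

```d
import std.utf : decode;

void main()
{
    string s = "“”"; // two code points, three UTF-8 code units each
    assert(s.length == 6);

    // Iterating a char array as dchar: the index is the offset of each
    // code point in the original array.
    size_t[] forward;
    foreach (i, dchar c; s)
    {
        forward ~= i;
        // Decoding the original array at offset i gives back the same character.
        size_t pos = i;
        assert(decode(s, pos) == c);
    }
    assert(forward == [0, 3]);

    // foreach_reverse reports the same offsets, last code point first.
    size_t[] backward;
    foreach_reverse (i, dchar c; s)
        backward ~= i;
    assert(backward == [3, 0]);

    // In a dstring every code point is one element, so index and count agree.
    dstring d = "“”"d;
    size_t[] wide;
    foreach (i, dchar c; d)
        wide ~= i;
    assert(wide == [0, 1]);
}
```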
parent reply mw <mingwu gmail.com> writes:
On Saturday, 21 May 2022 at 15:23:08 UTC, Adam D Ruppe wrote:
 On Saturday, 21 May 2022 at 15:06:31 UTC, mw wrote:
 It isn't a loop counter. It is an index back into the 
 original source. Consider:
1) First, why not 0 then 4? Since dchar is 32 bits.
It is an index back into the *original source* given to foreach. I gave it a char[], not a dchar[]. So it is counting chars in that original char[].
Same question: Why 3 bytes? Not 4 bytes? As dchar is specified as 32 bits here? https://dlang.org/spec/type.html
May 21 2022
next sibling parent reply rikki cattermole <rikki cattermole.co.nz> writes:
Unicode.

Multi-byte code points.

UTF-8 and UTF-16 are variable-length encodings: one or more code 
units combine to produce a single Unicode code point, which fits 
in 32 bits.

writeln("“".length, " ", "”".length); // 3 3
May 21 2022
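The same character can be compared across the three string types to see the variable-length encoding at work. This is a minimal sketch, assuming the source file is saved as UTF-8; `walkLength` is used here only to count decoded code points.

```d
import std.range : walkLength;

void main()
{
    string  utf8  = "“";   // char code units (UTF-8)
    wstring utf16 = "“"w;  // wchar code units (UTF-16)
    dstring utf32 = "“"d;  // dchar code units (UTF-32)

    // .length counts code units, not characters.
    assert(utf8.length == 3);   // U+201C takes three bytes in UTF-8
    assert(utf16.length == 1);  // one 16-bit unit in UTF-16
    assert(utf32.length == 1);  // always one 32-bit unit in UTF-32

    // A dchar is 32 bits wide, but that says nothing about how many
    // bytes the character occupies inside a char[].
    static assert(dchar.sizeof == 4);

    // Counting decoded code points gives 1 in every case.
    assert(utf8.walkLength == 1);
}
```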
parent mw <mingwu gmail.com> writes:
On Saturday, 21 May 2022 at 15:50:43 UTC, rikki cattermole wrote:
 Unicode.

 Multi-byte code points.

 UTF-8 and UTF-16 are variable length to produce a single 
 Unicode codepoint that is 32bit.

 writeln("“".length, " ", "”".length); // 3 3
Thanks, variable-length encoding is the answer for this question.
May 21 2022
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, May 21, 2022 at 03:43:59PM +0000, mw via Digitalmars-d wrote:
 On Saturday, 21 May 2022 at 15:23:08 UTC, Adam D Ruppe wrote:
 On Saturday, 21 May 2022 at 15:06:31 UTC, mw wrote:
 It isn't a loop counter. It is an index back into the original
 source. Consider:
1) First, why not 0 then 4? Since dchar is 32 bits.
It is an index back into the *original source* given to foreach. I gave it a char[], not a dchar[]. So it is counting chars in that original char[].
Same question: Why 3 bytes? Not 4 bytes?
[...]

Because UTF-8 is a variable-length encoding.


T

-- 
Life is too short to run proprietary software. -- Bdale Garbee
May 21 2022
prev sibling parent reply Adam D Ruppe <destructionator gmail.com> writes:
On Saturday, 21 May 2022 at 15:43:59 UTC, mw wrote:
 Same question:
 Why 3 bytes? Not 4 bytes?
Same answer: it is pointing at the character in the ORIGINAL ARRAY. The original array is NOT made out of dchars but it reads and converts on the fly while maintaining the correct position. See that's the beauty of it: how it gets there can be pretty complicated (especially when going backwards with foreach_reverse), but it always gives the right answer.
May 21 2022
parent reply mw <mingwu gmail.com> writes:
On Saturday, 21 May 2022 at 16:11:50 UTC, Adam D Ruppe wrote:
 On Saturday, 21 May 2022 at 15:43:59 UTC, mw wrote:
 Same question:
 Why 3 bytes? Not 4 bytes?
Same answer: it is pointing at the character in the ORIGINAL ARRAY. The original array is NOT made out of dchars but it reads and converts on the fly while maintaining the correct position.
UTF variable-length encoding is the answer I'm looking for.
 See that's the beauty of it: how it gets there can be pretty 
 complicated (especially when going backwards with 
 foreach_reverse), but it always gives the right answer.
OK, it's a loop array index, not a "loop counter"; so most of the time these two things are the same, but in this special case of UTF variable-length encoding they are different.

Two comments:

1) Can we also have a true "loop counter"? Coming from a numeric computation application background, a "loop counter" certainly is very useful.

2) Can we also make it work for ranges? I.e. the question in my original 1st post.
May 21 2022
parent Adam D Ruppe <destructionator gmail.com> writes:
On Saturday, 21 May 2022 at 17:07:41 UTC, mw wrote:
 1) can we also have a true "loop counter"?
That's what the `enumerate` thing does.

Or you can just

     int counter;
     for(whatever) {
         scope(exit) counter++;
     }
May 21 2022
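Both suggestions can be sketched on the string from earlier in the thread. This assumes Phobos still treats a `string` as a range of `dchar`, which is what makes `enumerate` count characters rather than bytes here; the early `continue` is only there to show that `scope(exit)` still runs.

```d
import std.range : enumerate;

void main()
{
    string s = "“”";

    // enumerate counts iterations: 0 then 1, even though the array
    // index of the second character is 3.
    size_t[] counts;
    foreach (n, c; s.enumerate)
        counts ~= n;
    assert(counts == [0, 1]);

    // Hand-rolled counter; scope(exit) bumps it even when an iteration
    // leaves early through continue.
    int counter;
    foreach (dchar c; s)
    {
        scope (exit) counter++;
        if (c == '“')
            continue;
    }
    assert(counter == 2);
}
```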