digitalmars.D - access foreach index counter for Iterate n times


mw <mingwu gmail.com> writes:

Not sure if this has been discussed before:

https://tour.dlang.org/tour/en/basics/foreach

```
foreach (i, e; [4, 5, 6]) {
     writeln(i, ":", e);
}
// 0:4 1:5 2:6
```

So what if I want the loop index counter when I loop n 
times with the `..` syntax? I tried:

```
foreach (i, e; 5 .. 10) {
     writeln(i, ":", e);
}
```

When I tried it just now, I got a syntax error:

onlineapp.d(4): Error: found `..` when expecting `)`
onlineapp.d(4): Error: found `)` when expecting `;` following 
statement

Yes, there are a number of manual workarounds, but I think 
this 2nd foreach loop should be as valid as the 1st one on 
arrays, so that the language behavior is more consistent.

Thoughts?
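
One of the manual workarounds alluded to above, as a minimal sketch: derive the counter from the loop value and the lower bound. `countedRange` is a hypothetical helper name used only for illustration.

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper: builds "counter:value" lines for the range lo .. hi.
string[] countedRange(int lo, int hi)
{
    string[] lines;
    foreach (e; lo .. hi)          // hi is exclusive, as in any D number range
    {
        immutable i = e - lo;      // zero-based counter derived from the value
        lines ~= format("%s:%s", i, e);
    }
    return lines;
}

void main()
{
    foreach (line; countedRange(5, 10))
        writeln(line);
}
```

This only works because the step is 1 and the lower bound is known inside the loop body.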

May 20 2022

forkit <forkit gmail.com> writes:

On Saturday, 21 May 2022 at 01:01:40 UTC, mw wrote:
 Not sure if this has been discussed before:

 https://tour.dlang.org/tour/en/basics/foreach

 ```
 foreach (i, e; [4, 5, 6]) {
     writeln(i, ":", e);
 }
 // 0:4 1:5 2:6
 ```

 so what if I want the loop index counter when I want to loop n 
 times with the .. syntax, I tried:

 ```
 foreach (i, e; 5 .. 10) {
     writeln(i, ":", e);
 }
 ```

 right now when I tried it, got syntax error:

 onlineapp.d(4): Error: found `..` when expecting `)`
 onlineapp.d(4): Error: found `)` when expecting `;` following 
 statement

 Yes, there are a number of manual workarounds, but I think 
 this 2nd foreach loop should be as valid as the 1st one on 
 arrays, so that the language behavior is more consistent.

 Thoughts?

// ------

module test;
@safe:

import std;

void main()
{
     foreach(i, e; iota(5,11).array)
         writeln(i, ":", e);
}

/+
0:5
1:6
2:7
3:8
4:9
5:10
+/

// ----------------------

May 20 2022

Ali Çehreli <acehreli yahoo.com> writes:

On 5/20/22 18:26, forkit wrote:

 {
      foreach(i, e; iota(5,11).array)
          writeln(i, ":", e);
 }

That .array may be unnecessarily expensive in some cases because it 
allocates memory. The following is an alternative:

     foreach(i, e; iota(5,11).enumerate)
         writeln(i, ":", e);
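
A minimal sketch of the `enumerate` variant as a complete program. `enumerated` is a hypothetical helper name, not part of Phobos; note that the upper bound of `iota` is exclusive, so `iota(5, 11)` yields 5 through 10.

```d
import std.format : format;
import std.range : enumerate, iota;
import std.stdio : writeln;

// Hypothetical helper: builds "counter:value" lines for iota(lo, hi).
string[] enumerated(int lo, int hi)
{
    string[] lines;
    // enumerate wraps each element in an (index, value) tuple, which
    // foreach unpacks into i and e. The range itself is never copied
    // into an array.
    foreach (i, e; iota(lo, hi).enumerate)
        lines ~= format("%s:%s", i, e);
    return lines;
}

void main()
{
    foreach (line; enumerated(5, 11))
        writeln(line);
}
```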

Ali

May 20 2022

mw <mingwu gmail.com> writes:

On Saturday, 21 May 2022 at 02:08:05 UTC, Ali Çehreli wrote:
 On 5/20/22 18:26, forkit wrote:

 {
      foreach(i, e; iota(5,11).array)
          writeln(i, ":", e);
 }

 That .array may be unnecessarily expensive in some cases 
 because it allocates memory. The following is an alternative:

     foreach(i, e; iota(5,11).enumerate)
         writeln(i, ":", e);


Sure, all nice workarounds.

But isn't iota as a function call also expensive?

I wrote this naturally after seeing the array foreach example:

```
foreach (i, e; 5 .. 10)
```

I didn't even know the function iota. Why is it needed for a 
simple loop like this:


```
foreach (e; 5 .. 10)
```

just to get i?

My point: the language should be more consistent.

May 20 2022

max haughton <maxhaton gmail.com> writes:

On Saturday, 21 May 2022 at 02:23:28 UTC, mw wrote:
 But isn't iota as a function call also expensive?

No.

https://d.godbolt.org/z/TsKbe96dv
 I wrote this naturally after seeing the array foreach example:

 ```
 foreach (i, e; 5 .. 10)
 ```

 I don't even know the function iota, and why it's needed for a 
 simple loop like this?


 ```
 foreach (e; 5 .. 10)
 ```

 just to get i?

 My point: the language should be more consistent.

I don't disagree.

May 20 2022

Mike Parker <aldacron gmail.com> writes:

On Saturday, 21 May 2022 at 02:23:28 UTC, mw wrote:

 My point: the language should be more consistent.

You get indexes in a foreach with arrays and associative arrays 
because they have indexes. You don't get them with input ranges 
because they have no indexes (hence, the need for enumerate). So 
is it really inconsistent that a numeric range of 0 .. N has no 
index?

May 20 2022

mw <mingwu gmail.com> writes:

On Saturday, 21 May 2022 at 02:35:37 UTC, Mike Parker wrote:
 On Saturday, 21 May 2022 at 02:23:28 UTC, mw wrote:

 My point: the language should be more consistent.

 You get indexes in a foreach with arrays and associative arrays 
 because they have indexes. You don't get them with input ranges 
 because they have no indexes (hence, the need for enumerate). 
 So is it really inconsistent that a numeric range of 0 .. N 
 has no index?

More abstractly, it's not an array/range index; it's a loop counter 
(index).

May 20 2022

Nick Treleaven <nick geany.org> writes:

On Saturday, 21 May 2022 at 02:35:37 UTC, Mike Parker wrote:
 You get indexes in a foreach with arrays and associative arrays 
 because they have indexes. You don't get them with input ranges 
 because they have no indexes (hence, the need for enumerate).

`iota` with integer arguments produces a random access range, 
which is indexable. It would be nice if `foreach` supported an 
`index, element` pair with RA ranges, when `front` just returns 
an element, not a tuple.
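
A short sketch of that point, assuming a current Phobos: integer `iota` passes `isRandomAccessRange` and can be indexed, yet `foreach` rejects the two-variable form because `front` is a plain `int`.

```d
import std.range : iota;
import std.range.primitives : isRandomAccessRange;

void main()
{
    auto r = iota(5, 10);

    // iota over integers is a random-access range: it has length and opIndex.
    static assert(isRandomAccessRange!(typeof(r)));
    assert(r.length == 5);
    assert(r[2] == 7);

    // The (index, element) form is not accepted for it, since front
    // returns an int rather than a tuple.
    static assert(!__traits(compiles, { foreach (i, e; iota(5, 10)) {} }));

    // Indexing by hand recovers the pair explicitly.
    foreach (i; 0 .. r.length)
        assert(r[i] == 5 + cast(int) i);
}
```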

May 21 2022

Ali Çehreli <acehreli yahoo.com> writes:

On 5/20/22 19:23, mw wrote:

 But isn't iota as a function call also expensive?

I am sure someone will show some disassembly to compare. :)

 I wrote this naturally after seeing the array foreach example:

 ```
 foreach (i, e; 5 .. 10)
 ```

The biggest issue there is the number range, which is a weird thing in 
the language.

 My point: the language should be more consistent.

The automatic loop counter is available only for arrays. If we see 
number ranges as things other than arrays (they are not arrays), then 
there is no consistency issue.

One could argue to remove number ranges from the language but I don't 
think it's that big of a deal.

Ali

May 20 2022

mw <mingwu gmail.com> writes:

On Saturday, 21 May 2022 at 02:40:06 UTC, Ali Çehreli wrote:
 On 5/20/22 19:23, mw wrote:

 But isn't iota as a function call also expensive?

 I am sure someone will show some disassembly to compare. :)

from Max:

 https://d.godbolt.org/z/TsKbe96dv

Nice assembly.



 The biggest issue there is the number range, which is a weird 
 thing in the language.

 My point: the language should be more consistent.

 The automatic loop counter is available only for arrays. If we 
 see number ranges as things other than arrays (they are not 
 arrays), then there is no consistency issue.

Yes! Let's call it a "loop counter", regardless of whether the user 
is looping over an array or a (number) range: "loop counter" as a 
language concept.


 One could argue to remove number ranges from the language but I 
 don't think it's that big of a deal.


Why not keep it, and have the compiler automatically translate

foreach (i, e; 5 .. 10)

to:

foreach(i, e; iota(5, 10).enumerate)


especially after seeing the assembly?

May 20 2022

Adam D Ruppe <destructionator gmail.com> writes:

On Saturday, 21 May 2022 at 03:01:39 UTC, mw wrote:
 Yes! let's call it "loop counter"

It isn't a loop counter. It is an index back into the original 
source. Consider:

foreach(INDEX, dchar character; "“”")
    writeln(INDEX); // 0 then 3

May 21 2022

mw <mingwu gmail.com> writes:

On Saturday, 21 May 2022 at 11:59:42 UTC, Adam D Ruppe wrote:
 On Saturday, 21 May 2022 at 03:01:39 UTC, mw wrote:
 Yes! let's call it "loop counter"

 It isn't a loop counter. It is an index back into the original 
 source. Consider:

 foreach(INDEX, dchar character; "“”")
    writeln(INDEX); // 0 then 3

Wow, this surprised me again!

1) First, why not 0 then 4? Since dchar is 32 bits.

2) Second, compare:

import std;
void main()
{
     dstring ds = "“”";
     writeln(ds.length);
     foreach(INDEX, dchar character; ds)
       writeln(INDEX, character); // 0 then 1
}

Output:
2
0“
1”

Explanations?


If it's "loop counter", isn't the behavior more consistent?

May 21 2022

user1234 <user1234 12.de> writes:

On Saturday, 21 May 2022 at 15:06:31 UTC, mw wrote:
 On Saturday, 21 May 2022 at 11:59:42 UTC, Adam D Ruppe wrote:
 On Saturday, 21 May 2022 at 03:01:39 UTC, mw wrote:
 Yes! let's call it "loop counter"

 It isn't a loop counter. It is an index back into the original 
 source. Consider:

 foreach(INDEX, dchar character; "“”")
    writeln(INDEX); // 0 then 3

 Wow, this surprised me again!

 1) First, why not 0 then 4? Since dchar is 32 bits.

 2) Second, compare:

 import std;
 void main()
 {
     dstring ds = "“”";
     writeln(ds.length);
     foreach(INDEX, dchar character; ds)
       writeln(INDEX, character); // 0 then
 }

 Output:
 2
 0“
 1”

 Explanations?

Adam D Ruppe's example implies auto (hidden) decoding, which is a 
special case, so it reads 3 bytes to decode the 1st glyph. 
The word "counter" is actually correct if the foreach'd aggregate 
is truly capable of random access. That is the case for your 
second version, which iterates over a dstring.

 If it's "loop counter", isn't the behavior more consistent?

May 21 2022

Adam D Ruppe <destructionator gmail.com> writes:

On Saturday, 21 May 2022 at 15:21:09 UTC, user1234 wrote:
 Adam D Ruppe's example implies auto (hidden) decoding

Small nitpick: this is not autodecoding, since you have to request 
it specifically by specifying dchar.

Autodecoding refers to a Phobos thing where it gives dchar 
even though you didn't specifically ask for it.
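
A minimal sketch contrasting the two, assuming a current Phobos in which narrow strings are still autodecoded by the range primitives. `decodedIndices` is a hypothetical helper name.

```d
import std.range.primitives : ElementType, front, walkLength;

// Hypothetical helper: indices reported when dchar is requested explicitly.
size_t[] decodedIndices(string s)
{
    size_t[] indices;
    foreach (i, dchar c; s)   // decoding happens because dchar was asked for
        indices ~= i;
    return indices;
}

void main()
{
    string s = "\u201C\u201D"; // the two curly quotes, stored as UTF-8

    // The array itself holds 6 UTF-8 code units.
    assert(s.length == 6);
    assert(decodedIndices(s) == [0, 3]);

    // Without a requested type, foreach yields the raw code units.
    size_t units;
    foreach (c; s)
        ++units;
    assert(units == 6);

    // Autodecoding: Phobos range primitives present a string as a range
    // of dchar even though nobody asked for that.
    static assert(is(ElementType!string == dchar));
    assert(s.front == '\u201C');
    assert(s.walkLength == 2);
}
```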

May 21 2022

mw <mingwu gmail.com> writes:

On Saturday, 21 May 2022 at 15:21:09 UTC, user1234 wrote:
 On Saturday, 21 May 2022 at 15:06:31 UTC, mw wrote:
 On Saturday, 21 May 2022 at 11:59:42 UTC, Adam D Ruppe wrote:
 On Saturday, 21 May 2022 at 03:01:39 UTC, mw wrote:
 Yes! let's call it "loop counter"

 It isn't a loop counter. It is an index back into the 
 original source. Consider:

 foreach(INDEX, dchar character; "“”")
    writeln(INDEX); // 0 then 3

 Wow, this surprised me again!

 1) First, why not 0 then 4? Since dchar is 32 bits.

 2) Second, compare:

 import std;
 void main()
 {
     dstring ds = "“”";
     writeln(ds.length);
     foreach(INDEX, dchar character; ds)
       writeln(INDEX, character); // 0 then
 }

 Output:
 2
 0“
 1”

 Explanations?

 Adam D Ruppe's example implies auto (hidden) decoding, which is 
 a special case, so it reads 3 bytes to decode the 1st glyph.

Why 3 bytes? Not 4 bytes?

As dchar is specified as 32 bits here?

https://dlang.org/spec/type.html

May 21 2022

Adam D Ruppe <destructionator gmail.com> writes:

On Saturday, 21 May 2022 at 15:06:31 UTC, mw wrote:
 It isn't a loop counter. It is an index back into the original 
 source. Consider:

 1) First, why not 0 then 4? Since dchar is 32 bits.

It is an index back into the *original source* given to foreach.

I gave it a char[], not a dchar[]. So it is counting chars in 
that original char[].


If you do:

auto thing = whatever_you_loop_over;
foreach(index, item; thing) {
     // at this point:
     //   stuff_before_item == thing[0 .. index]
}
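
A runnable sketch of that invariant on a UTF-8 string; `prefixes` is a hypothetical helper name used only for this illustration.

```d
// Hypothetical helper: the slice s[0 .. index] seen at each iteration.
string[] prefixes(string s)
{
    string[] seen;
    foreach (index, dchar item; s)
        seen ~= s[0 .. index]; // everything before the current character
    return seen;
}

void main()
{
    // 'a', a left curly quote (3 UTF-8 code units), then 'b'.
    string s = "a\u201Cb";

    // The indices are 0, 1 and 4, so each slice ends exactly where the
    // current character begins and never cuts a character in half.
    assert(prefixes(s) == ["", "a", "a\u201C"]);
}
```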


 Explanations?

The index there is the index into a dstring, which stores each 
character as a single 32-bit code unit, so index and count coincide.

 If it's "loop counter", isn't the behavior more consistent?

Also see `foreach_reverse` where it counts backwards because it 
is an index into the original array, not a counter.
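
A small sketch of the `foreach_reverse` case, with hypothetical helper names; the indices still point at where each character starts in the original array, only visited from the back.

```d
// Hypothetical helpers: collect the indices reported in each direction.
size_t[] forwardIndices(string s)
{
    size_t[] indices;
    foreach (i, dchar c; s)
        indices ~= i;
    return indices;
}

size_t[] reverseIndices(string s)
{
    size_t[] indices;
    foreach_reverse (i, dchar c; s)
        indices ~= i;
    return indices;
}

void main()
{
    string s = "\u201C\u201D"; // two curly quotes, 3 UTF-8 code units each

    assert(forwardIndices(s) == [0, 3]);
    assert(reverseIndices(s) == [3, 0]); // not 0 then 1, as a counter would give
}
```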

May 21 2022

mw <mingwu gmail.com> writes:

On Saturday, 21 May 2022 at 15:23:08 UTC, Adam D Ruppe wrote:
 On Saturday, 21 May 2022 at 15:06:31 UTC, mw wrote:
 It isn't a loop counter. It is an index back into the 
 original source. Consider:

 1) First, why not 0 then 4? Since dchar is 32 bits.

 It is an index back into the *original source* given to foreach.

 I gave it a char[], not a dchar[]. So it is counting chars in 
 that original char[].

Same question:


Why 3 bytes? Not 4 bytes?

As dchar is specified as 32 bits here?

https://dlang.org/spec/type.html

May 21 2022

rikki cattermole <rikki cattermole.co.nz> writes:

Unicode.

Multi-byte code points.

UTF-8 and UTF-16 are variable-length encodings: one or more code units 
combine to encode a single Unicode code point, which fits in 32 bits.

writeln("“".length, " ", "”".length); // 3 3
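
The same check across D's three string types, as a small sketch; `utf8Units` is a hypothetical wrapper around `std.utf.codeLength`.

```d
import std.utf : codeLength;

// Hypothetical helper: number of UTF-8 code units needed for one code point.
size_t utf8Units(dchar c)
{
    return codeLength!char(c);
}

void main()
{
    // U+201C (left curly quote) in each string type.
    assert("\u201C".length == 3);   // UTF-8: three 8-bit code units
    assert("\u201C"w.length == 1);  // UTF-16: one 16-bit code unit
    assert("\u201C"d.length == 1);  // UTF-32: one 32-bit code unit

    // U+1F600 lies outside the Basic Multilingual Plane.
    assert("\U0001F600".length == 4);   // UTF-8: four code units
    assert("\U0001F600"w.length == 2);  // UTF-16: a surrogate pair
    assert("\U0001F600"d.length == 1);  // UTF-32: still one code unit

    assert(utf8Units('a') == 1);
    assert(utf8Units('\u201C') == 3);
}
```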

May 21 2022

mw <mingwu gmail.com> writes:

On Saturday, 21 May 2022 at 15:50:43 UTC, rikki cattermole wrote:
 Unicode.

 Multi-byte code points.

 UTF-8 and UTF-16 are variable length to produce a single 
 Unicode codepoint that is 32bit.

 writeln("“".length, " ", "”".length); // 3 3

Thanks, variable-length encoding is the answer for this question.

May 21 2022

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sat, May 21, 2022 at 03:43:59PM +0000, mw via Digitalmars-d wrote:
 On Saturday, 21 May 2022 at 15:23:08 UTC, Adam D Ruppe wrote:
 On Saturday, 21 May 2022 at 15:06:31 UTC, mw wrote:
 It isn't a loop counter. It is an index back into the original
 source. Consider:

 1) First, why not 0 then 4? Since dchar is 32 bits.

 It is an index back into the *original source* given to foreach.

 I gave it a char[], not a dchar[]. So it is counting chars in that
 original char[].

 Same question:

 Why 3 bytes? Not 4 bytes?

[...]

Because UTF-8 is a variable-length encoding.


T

-- 
Life is too short to run proprietary software. -- Bdale Garbee

May 21 2022

Adam D Ruppe <destructionator gmail.com> writes:

On Saturday, 21 May 2022 at 15:43:59 UTC, mw wrote:
 Same question:
 Why 3 bytes? Not 4 bytes?

Same answer: it is pointing at the character in the ORIGINAL 
ARRAY. The original array is NOT made out of dchars but it reads 
and converts on the fly while maintaining the correct position.

See that's the beauty of it: how it gets there can be pretty 
complicated (especially when going backwards with 
foreach_reverse), but it always gives the right answer.
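
A rough sketch of "reads and converts on the fly while maintaining the correct position", written by hand with `std.utf.decode`. This is an illustration under the assumption that the forward loop behaves equivalently, not the compiler's actual lowering; `startIndices` is a hypothetical name.

```d
import std.utf : decode;

// Hypothetical helper mirroring the indices that
// foreach (i, dchar c; s) reports for a UTF-8 string.
size_t[] startIndices(string s)
{
    size_t[] starts;
    size_t pos = 0;
    while (pos < s.length)
    {
        starts ~= pos;            // position in the original char array
        dchar c = decode(s, pos); // decodes one code point and advances pos
    }
    return starts;
}

void main()
{
    assert(startIndices("\u201C\u201D") == [0, 3]);
    assert(startIndices("a\u201Cb") == [0, 1, 4]);
}
```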

May 21 2022

mw <mingwu gmail.com> writes:

On Saturday, 21 May 2022 at 16:11:50 UTC, Adam D Ruppe wrote:
 On Saturday, 21 May 2022 at 15:43:59 UTC, mw wrote:
 Same question:
 Why 3 bytes? Not 4 bytes?

 Same answer: it is pointing at the character in the ORIGINAL 
 ARRAY. The original array is NOT made out of dchars but it 
 reads and converts on the fly while maintaining the correct 
 position.

UTF variable-length encoding is the answer I'm looking for.

 See that's the beauty of it: how it gets there can be pretty 
 complicated (especially when going backwards with 
 foreach_reverse), but it always gives the right answer.

OK, it's a loop array index, not a "loop counter". Most of the 
time these two things are the same, but in this special case of 
UTF variable-length encoding they are different.

Two comments:

1) Can we also have a true "loop counter"? Coming from a numeric 
computation application background, a "loop counter" certainly is 
very useful.

2) Can we also make it work for ranges, i.e. the question in my 
original first post?

May 21 2022

Adam D Ruppe <destructionator gmail.com> writes:

On Saturday, 21 May 2022 at 17:07:41 UTC, mw wrote:
 1) can we also have a true "loop counter"?

That's what the `enumerate` thing does.

Or you can just

int counter;
for(whatever) {
     scope(exit) counter++;
}
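
Both options as a runnable sketch, with hypothetical helper names. `scope(exit)` is what keeps the manual counter correct when the body leaves early via `continue`.

```d
import std.range : enumerate;

// Counter from enumerate: counts characters, not code units.
size_t[] enumerateCounters(string s)
{
    size_t[] counters;
    foreach (n, c; s.enumerate)
        counters ~= n;
    return counters;
}

// Manual counter: records the iteration number of every odd element.
int[] oddPositions(int[] data)
{
    int[] seen;
    int counter;
    foreach (x; data)
    {
        scope (exit) counter++; // runs on every exit from the body, including continue
        if (x % 2 == 0)
            continue;
        seen ~= counter;
    }
    return seen;
}

void main()
{
    // Two curly quotes: array indices would be 0 and 3, the counter is 0 and 1.
    assert(enumerateCounters("\u201C\u201D") == [0, 1]);
    assert(oddPositions([10, 11, 12, 13]) == [1, 3]);
}
```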

May 21 2022
