digitalmars.D - Bring back foreach int indexes

reply Steven Schveighoffer <schveiguy gmail.com> writes:
I don’t know how many times I get caught with size_t indexes but 
I want them to be int or uint. It’s especially painful in my 
class that I’m teaching where I don’t want to yet explain why int 
doesn’t work there and have to introduce casting or use to!int. 
All for the possibility that I have an array larger than 2 
billion elements.

I am forgetting why we removed this in the first place.

Can we have the compiler insert an assert at the loop start that 
the bounds are in range when you use a smaller int type? Clearly 
the common case is that the array is small enough for int indexes.

-Steve
Nov 29 2023
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Wednesday, 29 November 2023 at 14:56:50 UTC, Steven 
Schveighoffer wrote:
 I don’t know how many times I get caught with size_t indexes 
 but I want them to be int or uint. It’s especially painful in 
 my class that I’m teaching where I don’t want to yet explain 
 why int doesn’t work there and have to introduce casting or use 
 to!int. All for the possibility that I have an array larger 
 than 2 billion elements.

 I am forgetting why we removed this in the first place.

 Can we have the compiler insert an assert at the loop start 
 that the bounds are in range when you use a smaller int type? 
 Clearly the common case is that the array is small enough for 
 int indexes.
For those who are unaware, this used to work:

```d
auto arr = [1, 2, 3];
foreach(int idx, v; arr) {
    ...
}
```

But was removed at some point. I think it should be brought back (we are bringing stuff back now, right? Like hex strings?)

-Steve
Nov 29 2023
next sibling parent reply bachmeier <no spam.net> writes:
On Wednesday, 29 November 2023 at 15:48:25 UTC, Steven 
Schveighoffer wrote:
 On Wednesday, 29 November 2023 at 14:56:50 UTC, Steven 
 Schveighoffer wrote:
 I don’t know how many times I get caught with size_t indexes 
 but I want them to be int or uint. It’s especially painful in 
 my class that I’m teaching where I don’t want to yet explain 
 why int doesn’t work there and have to introduce casting or 
 use to!int. All for the possibility that I have an array 
 larger than 2 billion elements.

 I am forgetting why we removed this in the first place.

 Can we have the compiler insert an assert at the loop start 
 that the bounds are in range when you use a smaller int type? 
 Clearly the common case is that the array is small enough for 
 int indexes.
 For those who are unaware, this used to work:

 ```d
 auto arr = [1, 2, 3];
 foreach(int idx, v; arr) {
     ...
 }
 ```

 But was removed at some point. I think it should be brought back (we are bringing stuff back now, right? Like hex strings?)

 -Steve

Along the same lines, this won't even compile:

```
foreach(int idx; 0..arr.length) {
}
```

Apparently someone decided the explicit `int` is an implicit conversion.
Nov 29 2023
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Wednesday, 29 November 2023 at 16:06:32 UTC, bachmeier wrote:
 Along the same lines, this won't even compile:

 ```
 foreach(int idx; 0..arr.length) {
 }
 ```

 Apparently someone decided the explicit `int` is an implicit 
 conversion.
I don't think that ever compiled, even in D1 (64-bit). What happens here is the two ends are converted to a common type (in this case size_t), and you can't assign size_t to int.

But it's easy to fix, just change the top condition via a cast or conversion. The same is not available to a foreach over an array with an index.

I would be fine fixing this to work in the same way as I suggested (with a pre-loop assert that it will work). It makes sense that it should act the same.

-Steve
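
A minimal sketch of the "cast or conversion" fix for the range form, assuming `std.conv.to` is acceptable in the code at hand; the function name `sumIndexes` is made up for illustration. Unlike a bare `cast(int)`, `to!int` throws a `ConvOverflowException` when the length does not fit, which is close in spirit to the pre-loop check suggested above:

```d
import std.conv : to;

/// Hypothetical helper: adds up the indexes of `arr` using an int index.
int sumIndexes(const int[] arr)
{
    int total = 0;
    // Both ends of the range are int now, so `int idx` is accepted.
    // to!int throws ConvOverflowException if arr.length > int.max.
    foreach (int idx; 0 .. arr.length.to!int)
        total += idx;
    return total;
}
```

The conversion happens once, before the loop starts, so the check is not repeated per iteration.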
Nov 29 2023
parent reply bachmeier <no spam.net> writes:
On Wednesday, 29 November 2023 at 16:25:55 UTC, Steven 
Schveighoffer wrote:
 On Wednesday, 29 November 2023 at 16:06:32 UTC, bachmeier wrote:
 ```
 foreach(int idx; 0..arr.length) {
 }
 ```

 Apparently someone decided the explicit `int` is an implicit 
 conversion.
 I don't think that ever compiled, even in D1 (64-bit). What happens here is the two ends are converted to a common type (in this case size_t), and you can't assign size_t to int.

 But it's easy to fix, just change the top condition via a cast or conversion. The same is not available to a foreach over an array with an index.

Yeah, but it's a matter of ugliness. This looks awful

```
foreach(idx; 0..(cast(int) arr.length)) {
}
```

In the case you're talking about, you could do

```
foreach(_idx, v; arr) {
  int idx = cast(int) _idx;
}
```

I don't mind ugly and verbose code if there's sufficient benefit. There's no benefit in this case.
Nov 29 2023
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Wednesday, 29 November 2023 at 21:47:09 UTC, bachmeier wrote:
 Yeah, but it's a matter of ugliness. This looks awful

 ```
 foreach(idx; 0..(cast(int) arr.length)) {
 }
 ```

 In the case you're talking about, you could do

 ```
 foreach(_idx, v; arr) {
   int idx = cast(int) _idx;
 }
 ```

 I don't mind ugly and verbose code if there's sufficient 
 benefit. There's no benefit in this case.
Understand that I don't disagree with you. But my point is that `foreach(int i; 0 .. arr.length)` *never compiled*, whereas `foreach(int i, v; arr)` did compile (without deprecation) at some point.

This puts it in the category of "features that were removed", vs "features that should be added", and therefore gives a little less resistance to (re)allowing it.

-Steve
Nov 29 2023
parent Paolo Invernizzi <paolo.invernizzi gmail.com> writes:
On Wednesday, 29 November 2023 at 22:30:30 UTC, Steven 
Schveighoffer wrote:
 On Wednesday, 29 November 2023 at 21:47:09 UTC, bachmeier wrote:
 Yeah, but it's a matter of ugliness. This looks awful

 ```
 foreach(idx; 0..(cast(int) arr.length)) {
 }
 ```

 In the case you're talking about, you could do

 ```
 foreach(_idx, v; arr) {
   int idx = cast(int) _idx;
 }
 ```

 I don't mind ugly and verbose code if there's sufficient 
 benefit. There's no benefit in this case.
 Understand that I don't disagree with you. But my point is that `foreach(int i; 0 .. arr.length)` *never compiled*, whereas `foreach(int i, v; arr)` did compile (without deprecation) at some point.

 This puts it in the category of "features that were removed", vs "features that should be added", and therefore gives a little less resistance to (re)allowing it.

 -Steve

I'm totally +1 on this: it's a VERY common case, and it's an *explicit* request made to the compiler. Adding an assert instead of the deprecation is fine; usually it's what we do right after the foreach statement to safely transform the ulong to an int.
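
For illustration, here is roughly what that hand-written pattern looks like today. This is only a sketch: the name `lastIndexAsInt` is hypothetical, and the assert is placed before the loop to mirror the compiler-inserted check proposed earlier in the thread.

```d
/// Hypothetical helper: returns the last index visited as an int,
/// or -1 for an empty array.
int lastIndexAsInt(T)(const T[] arr)
{
    // Hand-written stand-in for the proposed compiler-inserted check.
    assert(arr.length <= int.max, "array too long for an int index");

    int last = -1;
    foreach (i, v; arr)          // i is size_t here
    {
        int idx = cast(int) i;   // safe because of the assert above
        last = idx;
    }
    return last;
}
```

Note that a plain `assert` is compiled out in release builds, so this only guards debug and test runs.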
Nov 30 2023
prev sibling parent reply IGotD- <nise nise.com> writes:
On Wednesday, 29 November 2023 at 21:47:09 UTC, bachmeier wrote:
 In the case you're talking about, you could do

 ```
 foreach(_idx, v; arr) {
   int idx = cast(int) _idx;
 }
 ```

 I don't mind ugly and verbose code if there's sufficient 
 benefit. There's no benefit in this case.
I don't understand why this is not an OK solution instead of adding yet another lowering for the special case of using int instead of size_t. Right now in D indexes are size_t and should be the default. This feels more like a D3 discussion about whether indexes should be int or size_t.
Nov 30 2023
parent reply Hipreme <msnmancini hotmail.com> writes:
On Thursday, 30 November 2023 at 10:49:47 UTC, IGotD- wrote:
 On Wednesday, 29 November 2023 at 21:47:09 UTC, bachmeier wrote:
 In the case you're talking about, you could do

 ```
 foreach(_idx, v; arr) {
   int idx = cast(int) _idx;
 }
 ```

 I don't mind ugly and verbose code if there's sufficient 
 benefit. There's no benefit in this case.
 I don't understand why this is not an OK solution instead of adding yet another lowering for the special case of using int instead of size_t. Right now in D indexes are size_t and should be the default. This feels more like a D3 discussion about whether indexes should be int or size_t.

That is because there is no reason to have explicit conversions everywhere, they are overly verbose and ugly. And this is one of the reasons why I don't use `foreach` in my code. Good old `for` loop can let you decide the type. Modern programming languages should reduce friction, and not increase it.
Nov 30 2023
next sibling parent reply IGotD- <nise nise.com> writes:
On Thursday, 30 November 2023 at 11:01:15 UTC, Hipreme wrote:
 That is because there is no reason to have explicit conversions 
 everywhere, they are overly verbose and ugly. And this is one 
 of the reasons because I don't use `foreach` in my code. Good 
 old `for` loop can let you decide the type. Modern programming 
 languages should reduce friction, and not increase it.
One of the cornerstones of D is that there are hardly any implicit conversions. I'm not saying I'm personally against it, but it is one of the main D "features".
Nov 30 2023
next sibling parent reply bachmeier <no spam.net> writes:
On Thursday, 30 November 2023 at 11:54:01 UTC, IGotD- wrote:
 On Thursday, 30 November 2023 at 11:01:15 UTC, Hipreme wrote:
 That is because there is no reason to have explicit 
 conversions everywhere, they are overly verbose and ugly. And 
 this is one of the reasons because I don't use `foreach` in my 
 code. Good old `for` loop can let you decide the type. Modern 
 programming languages should reduce friction, and not increase 
 it.
 One of the cornerstones of D is that there are hardly any implicit conversions. I'm not saying I'm personally against it, but it is one of the main D "features".
Do you have an example where it can cause problems in this case?
Nov 30 2023
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Thursday, November 30, 2023 7:31:42 AM MST bachmeier via Digitalmars-d 
wrote:
 On Thursday, 30 November 2023 at 11:54:01 UTC, IGotD- wrote:
 On Thursday, 30 November 2023 at 11:01:15 UTC, Hipreme wrote:
 That is because there is no reason to have explicit
 conversions everywhere, they are overly verbose and ugly. And
 this is one of the reasons because I don't use `foreach` in my
 code. Good old `for` loop can let you decide the type. Modern
 programming languages should reduce friction, and not increase
 it.
 One of the cornerstones of D is that there are hardly any implicit conversions. I'm not saying I'm personally against it, but it is one of the main D "features".

 Do you have an example where it can cause problems in this case?

The main place that using int with the index for foreach would cause problems is if you're programming on a 32-bit system and then later start compiling that code on a 64-bit system. Because size_t is uint on 32-bit systems, using int with foreach works just fine aside from the issue of signed vs unsigned (which D doesn't consider to be a narrowing conversion, for better or worse). So, someone could use int with foreach on a 32-bit system and have no problems, but when they move to a 64-bit system, it could become a big problem, because there, size_t is ulong. So, code that worked fine on a 32-bit system could then break on a 64-bit system (assuming that it then starts operating on arrays that are larger than a 32-bit system could handle).

If the implicit narrowing conversion is forbidden with foreach (like it is pretty much everywhere else in the language), and you used int or uint for the index, then when you go to compile that code on a 64-bit system, you'll get a compiler error, and you can then fix the code as appropriate, whereas if we allow it to be int or uint and treat it like an explicit cast, then you have a silent bug. And because there wasn't actually an explicit cast involved, you can't even grep for the cast keyword. There's also no way to tell at a glance whether the programmer purposefully used int knowing that the conversion would take place or whether they just used int because that worked for what they were doing on a 32-bit system but then didn't work later on 64-bit systems (and of course, the fact that the code worked with int on a 64-bit system originally doesn't mean that it will later, since the code that generates the array could later be changed and allow much larger arrays than was originally the case).

Of course, if you're using int instead of uint, then you actually still risk bugs on 32-bit systems if the array is large enough (depending on what exactly is done with the index), but the core idea of disallowing the implicit conversion with foreach is to force the programmer to deal with the issue rather than silently having code that could have an index that's larger than the type being used can actually hold.

Now, the question then is how likely that particular bug is vs someone purposefully using int on a 64-bit system, because they wanted int, and they didn't think that there was any way that they would be operating on an array that's larger than int.max or uint.max in length. Outside of operating on large files and "big data," there probably aren't a lot of programs that are going to have arrays large enough that int won't be enough to index them, so there will be a lot of code which could use int with foreach and have no problems whatsoever. And clearly, there are folks who have used int with foreach in the past and had no problems with it, so they're annoyed at being forced to use size_t instead.

Personally, I tend to lean towards the pedantic approach here and think that anything and everything involving indexing should be using size_t so that it will work across architectures and not have bugs due to casting to smaller types, but obviously, smart people can disagree on the issue. It's not like allowing int in foreach is wrong across the board. It's just that it can cause a particular class of bugs, and arguably, it's an implicit conversion rather than an explicit conversion, which goes against how D's type conversion rules normally work. But it's also the case that using int in foreach was allowed and treated as an explicit cast for years.

So, I think that the change to requiring size_t was a good one, but I can also see why some folks would get annoyed by it, and it's a much bigger deal than it would have been otherwise, because the behavior was changed rather than it always having required a type that size_t can implicitly convert to.

- Jonathan M Davis
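
To make the 32-bit vs 64-bit point concrete, here is a small sketch using `is(From : To)` to ask the compiler whether an implicit conversion exists. The function name `sizeTFitsIntImplicitly` is made up for illustration.

```d
// uint -> int is accepted implicitly; D does not treat a pure
// signed/unsigned change as a narrowing conversion.
static assert(is(uint : int));

// ulong -> int is rejected, because it narrows from 64 to 32 bits.
static assert(!is(ulong : int));

/// Hypothetical helper: true when size_t converts to int without a
/// cast on the current target, i.e. when size_t is an alias for uint.
bool sizeTFitsIntImplicitly()
{
    return is(size_t : int);
}
```

On a target where `size_t` is 4 bytes this returns `true`, and on a target where it is 8 bytes it returns `false`, which is why the same source can compile on one and be rejected on the other.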
Nov 30 2023
parent Paolo Invernizzi <paolo.invernizzi gmail.com> writes:
On Thursday, 30 November 2023 at 15:25:52 UTC, Jonathan M Davis 
wrote:

 And because there wasn't actually an explicit cast involved, 
 you can't even grep for the cast keyword.
grep 'foreach(int'
Nov 30 2023
prev sibling parent reply Nick Treleaven <nick geany.org> writes:
On Thursday, 30 November 2023 at 11:54:01 UTC, IGotD- wrote:
 One of the corner stones in D is that there are hardly any 
 implicit conversions.
I don't think that is true. In modern languages these would not be allowed:

    typeof(null) -> T*

Often I want a pointer that always points to a T, e.g. for function arguments.

    uint -> int
    int -> uint

Even C compilers can catch these with warnings on.

    int -> dchar
    byte -> char

Then there's the problem of 0 and 1 matching a bool overload instead of int.

    enum Enum {member = 1}
    Enum e = Enum.member << 2; // wait, 4 isn't in Enum

I hope most of these can be disallowed in D3.
Dec 01 2023
parent Nick Treleaven <nick geany.org> writes:
On Friday, 1 December 2023 at 12:38:05 UTC, Nick Treleaven wrote:
 I hope most of these can be disallowed in D3.
I meant future editions of D.
Dec 01 2023
prev sibling parent Nick Treleaven <nick geany.org> writes:
On Thursday, 30 November 2023 at 11:01:15 UTC, Hipreme wrote:
 That is because there is no reason to have explicit conversions 
 everywhere, they are overly verbose and ugly. And this is one 
 of the reasons because I don't use `foreach` in my code. Good 
 old `for` loop can let you decide the type. Modern programming 
 languages should reduce friction, and not increase it.
Modern languages should detect bugs:

```d
for (ubyte i = 0; i != a.length; i++) {}
```

The above compiles without error, but never terminates if `a` has more than 255 elements.

```d
foreach (ubyte i; 0 .. a.length) {} // compile error
```
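
A bounded sketch of the wrap-around in the `for` version, so it can be run without hanging. The function name `stepsBeforeExit` and the `cap` parameter are made up for illustration; `cap` exists only to force an exit from what would otherwise be an endless loop.

```d
/// Hypothetical helper: counts how many times the loop body runs for an
/// array of the given length, giving up after `cap` iterations.
size_t stepsBeforeExit(size_t length, size_t cap)
{
    size_t steps = 0;
    // i wraps from 255 back to 0, so it can never equal a length of
    // 256 or more.
    for (ubyte i = 0; i != length; i++)
    {
        if (++steps == cap)
            break;   // would spin forever without this guard
    }
    return steps;
}
```

With a length of 255 the loop exits on its own after 255 iterations; with a length of 256 it only stops because of the cap.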
Dec 01 2023
prev sibling next sibling parent Daniel N <no public.email> writes:
On Wednesday, 29 November 2023 at 15:48:25 UTC, Steven 
Schveighoffer wrote:
 For those who are unaware, this used to work:

 ```d
 auto arr = [1, 2, 3];
 foreach(int idx, v; arr) {
     ...
 }
 ```

 But was removed at some point. I think it should be brought 
 back (we are bringing stuff back now, right? Like hex strings?)

 -Steve
What's the problem? Add an optional parameter to enumerate which defaults to size_t?

enumerate!int(xxx)
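
For reference, as far as I know `std.range.enumerate` already takes the index type as its first template parameter, defaulting to `size_t`, so something along these lines should work today. The function name `weightedSum` is made up for illustration.

```d
import std.range : enumerate;

/// Hypothetical helper: sums index * value over the array.
int weightedSum(int[] arr)
{
    int total = 0;
    // enumerate!int yields (index, value) pairs with an int index.
    foreach (idx, v; arr.enumerate!int)
    {
        static assert(is(typeof(idx) == int));
        total += idx * v;
    }
    return total;
}
```

This goes through a range wrapper rather than the built-in array `foreach`, so it is a workaround and not the language change being asked for.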
Nov 30 2023
prev sibling next sibling parent reply Johan <j j.nl> writes:
On Wednesday, 29 November 2023 at 15:48:25 UTC, Steven 
Schveighoffer wrote:
 On Wednesday, 29 November 2023 at 14:56:50 UTC, Steven 
 Schveighoffer wrote:
 I don’t know how many times I get caught with size_t indexes 
 but I want them to be int or uint. It’s especially painful in 
 my class that I’m teaching where I don’t want to yet explain 
 why int doesn’t work there and have to introduce casting or 
 use to!int. All for the possibility that I have an array 
 larger than 2 billion elements.

 I am forgetting why we removed this in the first place.

 Can we have the compiler insert an assert at the loop start 
 that the bounds are in range when you use a smaller int type? 
 Clearly the common case is that the array is small enough for 
 int indexes.
 For those who are unaware, this used to work:

 ```d
 auto arr = [1, 2, 3];
 foreach(int idx, v; arr) {
     ...
 }
 ```

 But was removed at some point. I think it should be brought back (we are bringing stuff back now, right? Like hex strings?)

- What about the other integer types? (uint, short, byte, char?)
- What if `arr.length >= int.max`? Does the loop become infinite, or does the loop counter stay `size_t` (not accessible by user) and it is cast to `int` (`idx`) upon every iteration?

-Johan
Nov 30 2023
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Thursday, 30 November 2023 at 18:26:55 UTC, Johan wrote:
 - What about the other integer types?  (uint, short, byte, 
 char?)
Yeah, you can use those too. I think the right answer is to ensure the length of the array being iterated can't exceed the value range of the type using an assert.

And oh god, I know we have to do `char` because "it's an integer too!". `bool` is also, you could do `foreach(bool idx, v; arr)`
 - What if `arr.length >= int.max` ? Does the loop become 
 infinite, or does the loop counter stay `size_t` (not 
 accessible by user) and it is cast to `int` (`idx`) upon every 
 iteration ?
I actually tested this (with `ubyte`, not `int`), and what happens is interesting. You only get `length % T.max` (or something like that) elements.

For instance I did:

```d
foreach(ubyte idx, v; iota(270).array) {
    writeln(idx);
}
```

and it printed 0 to 13, and was done.

So clearly it uses `ubyte` as the index for iteration, and also somehow converts the length to `ubyte` instead of the other way around.

But it doesn't use it exactly, because modifying `idx` doesn't change the loop. So I don't see why it uses `ubyte` for the actual index instead of `size_t`.

Honestly, maybe the easiest fix here is just to fix the actual lowering to be more sensical (I would have expected 270 iterations with repeated indexes 0 to 13 after going through 255).

-Steve
Nov 30 2023
next sibling parent Siarhei Siamashka <siarhei.siamashka gmail.com> writes:
On Thursday, 30 November 2023 at 21:28:51 UTC, Steven 
Schveighoffer wrote:
 Honestly, maybe the easiest fix here is just to fix the actual 
 lowering to be more sensical (I would have expected 270 
 iterations with repeated indexes 0 to 13 after going through 
 255).
Or maybe a runtime error, similar to how out-of-bounds array accesses are handled would be more reasonable here?
Nov 30 2023
prev sibling parent reply Nick Treleaven <nick geany.org> writes:
On Thursday, 30 November 2023 at 21:28:51 UTC, Steven 
Schveighoffer wrote:
 So clearly it uses `ubyte` as the index for iteration, and also 
 somehow converts the length to `ubyte` instead of the other way 
 around.

 But it doesn't use it exactly, because modifying `idx` doesn't 
 change the loop. So I don't see why it uses `ubyte` for the 
 actual index instead of `size_t`.
This code:

    foreach(ubyte idx, v; new int[270]) {
        writeln(idx);
    }

Lowers to:

    {
        scope int[] __r72 = (new int[](270LU))[];
        ubyte __key71 = cast(ubyte)0u;
        for (; cast(int)__key71 < cast(int)cast(ubyte)__r72.length; cast(int)__key71 += 1)
        {
            int v = __r72[cast(ulong)__key71];
            ubyte idx = __key71;
            writeln(idx);
        }
    }

From `dmd -vcg-ast`. So the array length is cast to ubyte, giving 14.
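
The arithmetic behind that 14 can be checked in isolation. A tiny sketch, with the hypothetical name `truncatedIterations`, of what `cast(ubyte)` does to the length:

```d
/// Hypothetical helper: the iteration count the lowering above ends up
/// with, i.e. the length truncated to its low 8 bits.
size_t truncatedIterations(size_t length)
{
    // cast(ubyte) keeps only the low 8 bits: length modulo 256.
    return cast(ubyte) length;
}
```

So the count is the length modulo 256 (`ubyte.max + 1`) rather than modulo `ubyte.max`, and an array of exactly 256 elements would not be iterated at all.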
 Honestly, maybe the easiest fix here is just to fix the actual 
 lowering to be more sensical (I would have expected 270 
 iterations with repeated indexes 0 to 13 after going through 
 255).
Well I get a deprecation message:

Deprecation: foreach: loop index implicitly converted from `size_t` to `ubyte`

So I think it will become an error (at least in a future edition anyway).
Dec 01 2023
parent Denis Feklushkin <feklushkin.denis gmail.com> writes:
Now that arrays become templates defined in druntime, maybe it 
will be possible to define array index size during array 
declaration and this problem will somehow go away?
Dec 10 2023
prev sibling parent reply Quirin Schroll <qs.il.paperinik gmail.com> writes:
On Wednesday, 29 November 2023 at 15:48:25 UTC, Steven 
Schveighoffer wrote:
 On Wednesday, 29 November 2023 at 14:56:50 UTC, Steven 
 Schveighoffer wrote:
 I don’t know how many times I get caught with size_t indexes 
 but I want them to be int or uint. It’s especially painful in 
 my class that I’m teaching where I don’t want to yet explain 
 why int doesn’t work there and have to introduce casting or 
 use to!int. All for the possibility that I have an array 
 larger than 2 billion elements.

 I am forgetting why we removed this in the first place.

 Can we have the compiler insert an assert at the loop start 
 that the bounds are in range when you use a smaller int type? 
 Clearly the common case is that the array is small enough for 
 int indexes.
 For those who are unaware, this used to work:

 ```d
 auto arr = [1, 2, 3];
 foreach(int idx, v; arr) {
     ...
 }
 ```

 But was removed at some point. I think it should be brought back (we are bringing stuff back now, right? Like hex strings?)

 -Steve

Couldn’t you write a function `withIntIndex` or `withIndexType!int` such that you can check the array is indeed short enough?

```d
auto withIndexType(T : ulong, U)(U[] array)
{
    static struct WithIndexType
    {
        U[] array;
        int opApplyImpl(DG)(scope DG callback)
        {
            for (T index = 0; index < cast(T)array.length; ++index)
            {
                if (auto result = callback(index, array[index])) return result;
            }
            return 0;
        }
        alias opApply = opApplyImpl!(int delegate(T, ref U));
    }
    assert(array.length < T.max, "withIndexType: array length is too big for index type");
    return WithIndexType(array);
}
```

The `alias opApply` makes it so that this works (notice `@safe` on `main`):

```d
void main() @safe
{
    double[] xs = new double[](120);
    foreach (i, ref d; xs.withIndexType!byte)
    {
        static assert(is(typeof(i) == byte));
        static assert(is(typeof(d) == double));
        // your part :)
    }
}
```

In all honesty, I don’t know why the `alias` trick even works, but using it, the compiler can infer the `foreach` types _and_ instantiate the `opApplyImpl` template with the concrete type of the loop delegate.
Dec 11 2023
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On Monday, 11 December 2023 at 22:22:27 UTC, Quirin Schroll wrote:
 On Wednesday, 29 November 2023 at 15:48:25 UTC, Steven 
 Schveighoffer wrote:
 On Wednesday, 29 November 2023 at 14:56:50 UTC, Steven 
 Schveighoffer wrote:
 I don’t know how many times I get caught with size_t indexes 
 but I want them to be int or uint. It’s especially painful in 
 my class that I’m teaching where I don’t want to yet explain 
 why int doesn’t work there and have to introduce casting or 
 use to!int. All for the possibility that I have an array 
 larger than 2 billion elements.

 I am forgetting why we removed this in the first place.

 Can we have the compiler insert an assert at the loop start 
 that the bounds are in range when you use a smaller int type? 
 Clearly the common case is that the array is small enough for 
 int indexes.
 For those who are unaware, this used to work:

 ```d
 auto arr = [1, 2, 3];
 foreach(int idx, v; arr) {
     ...
 }
 ```

 But was removed at some point. I think it should be brought back (we are bringing stuff back now, right? Like hex strings?)

 Couldn’t you write a function `withIntIndex` or `withIndexType!int` such that you can check the array is indeed short enough?

Yes, but... it is still in the compiler, just deprecated (as I realized later in this thread). We can just undeprecate it (with some extra checks added).

Using a range/opApply wrapper also is going to bloat the code a bunch for not much benefit.

This really is a case of a problem being solved that didn't exist.

-Steve
Mar 27
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 29 November 2023 at 14:56:50 UTC, Steven 
Schveighoffer wrote:
 I don’t know how many times I get caught with size_t indexes 
 but I want them to be int or uint. It’s especially painful in 
 my class that I’m teaching where I don’t want to yet explain 
 why int doesn’t work there and have to introduce casting or use 
 to!int. All for the possibility that I have an array larger 
 than 2 billion elements.

 I am forgetting why we removed this in the first place.

 Can we have the compiler insert an assert at the loop start 
 that the bounds are in range when you use a smaller int type? 
 Clearly the common case is that the array is small enough for 
 int indexes.

 -Steve
It cannot work in the general case, slice's size is a size_t.

But if the compiler can prove it fits in 32 bits, then there should be no problem. VRP could do that.
Nov 29 2023
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Wednesday, 29 November 2023 at 16:56:54 UTC, deadalnix wrote:

 It cannot work in the general case, slice's size is a size_t .

 But if the compiler can prove it fits in 32bits, then there 
 should be no problem. VRP could do that.
VRP does not work here. The array comes from anywhere and can have any length. It needs to be validated (if you insist) at runtime. But it's very very *VERY* uncommon to have an array with 2 billion elements.

Note, this worked for years, pretty much without incident. I don't even remember why we removed it. It seems like one of those problems nobody had or was looking to solve. Yet, we solved it.

Actually, I just tested it out, and it gives a deprecation warning, but builds! This means it's actually quite easy to fix this, just remove the deprecation, and add the assert.

A great point from CyberShadow on the original PR that added the deprecation:

https://github.com/dlang/dmd/pull/8941#issuecomment-496306412

And note this was to "fix" an issue with foreach_reverse on int indexes, it was just snuck in there...

And here is another bug report that had a more detailed conversation (including from me):

https://issues.dlang.org/show_bug.cgi?id=16149

-Steve
Nov 29 2023
parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/29/2023 11:27 AM, Steven Schveighoffer wrote:
 A great point from CyberShadow on the original PR that added the deprecation:
 
 https://github.com/dlang/dmd/pull/8941#issuecomment-496306412
In that case, the array length is known and VRP should apply.

https://issues.dlang.org/show_bug.cgi?id=24450
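
For readers unfamiliar with VRP (value range propagation), here is a sketch of the kind of narrowing it already permits for ordinary expressions. It does not show the `foreach` case from the linked issue, and the name `lowByte` is made up for illustration.

```d
/// Hypothetical helper: extracts the low 8 bits of x without a cast.
ubyte lowByte(int x)
{
    // The compiler can prove that x & 0xFF lies in 0 .. 255, so the
    // implicit int -> ubyte conversion is accepted.
    ubyte b = x & 0xFF;
    return b;
}

// By contrast, this line would be rejected, because the value range of
// a bare int does not fit in ubyte:
//     ubyte c = x;
```

The idea in the post above is that a length known at compile time could be treated the same way.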
Mar 24
prev sibling parent reply Liam McGillivray <yoshi.pit.link.mario gmail.com> writes:
On Wednesday, 29 November 2023 at 14:56:50 UTC, Steven 
Schveighoffer wrote:
 I don’t know how many times I get caught with size_t indexes 
 but I want them to be int or uint. It’s especially painful in 
 my class that I’m teaching where I don’t want to yet explain 
 why int doesn’t work there and have to introduce casting or use 
 to!int. All for the possibility that I have an array larger 
 than 2 billion elements.
Yes! This! Right now I have 22 of these deprecation warnings every time I compile my program. I was going to start a new thread recommending this feature be undeprecated. I'm happy to find this old thread with Steve suggesting this very thing, and also glad to see most people here are on my side.

Having to add another variable to do an explicit cast would be ugly. I don't want this to actually be removed at some point and destroy my code which should be perfectly acceptable. It should just be removed from deprecation.

When I was thinking of starting this thread myself, I had the feeling that there would be some kind of objection from programmers more experienced than me. But it looks like Jonathan M Davis was the only one here to give a serious argument why it shouldn't be allowed.

On Thursday, 30 November 2023 at 15:25:52 UTC, Jonathan M Davis wrote:
 Because size_t is uint on 32-bit systems, using int with 
 foreach works just fine aside from the issue of signed vs 
 unsigned (which D doesn't consider to be a narrowing 
 conversion, for better or worse). So, someone could use int 
 with foreach on a 32-bit system and have no problems, but when 
 they move to a 64-bit system, it could become a big problem, 
 because there, size_t is ulong. So, code that worked fine on a 
 32-bit system could then break on a 64-bit system (assuming 
 that it then starts operating on arrays that are larger than a 
 32-bit system could handle).
An interesting, not bad point, but I don't think it's enough to justify removing this language feature. It's just too unlikely of a scenario to be worth removing a feature which improves things far more often than not.

Firstly, how often would it be that a program wouldn't explicitly require more array values than `uint` can fit, but is still capable of filling the array beyond that in places when the maximum array size is enough?

For someone to do all the development and testing of their program on a 32-bit system must be a rare scenario. Even 10 years ago, if someone was running a 32-bit desktop operating system, it meant that either they had one of the older computers still in use, or they stupidly chose the 32-bit version even though their computer was 64-bit capable. The kinds of people who would use a programming language like D aren't the most likely people to make such mistakes. Those that write programs that other people use are even less likely. Now with Windows no longer coming in 32-bit versions, these days are largely behind.

There are probably some people around using D for embedded applications, which may involve 32-bit microcontrollers. In 2024 and beyond, this is the only scenario where someone may realistically use D and do all the testing on a 32-bit system. They would then need to move the same program to a 64-bit system after testing for the problem to emerge. I just don't think this is likely enough to be worth removing the feature.

In the unlikely chance this problem ever does happen, it's just one more of many places where bugs can happen. It might not ever happen. If it were to ever happen, it probably would have already back when 32-bit systems were more common. If there are no known cases of this, then I think it's safe to remove it from deprecation.

Maybe disallow it from functions marked `@safe`, but generally, I think this feature should be allowed without deprecation warnings.
Mar 24
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
When size_t is 64 bits, the reason:

     foreach (int i; 0 .. array.length)

gives an error is the same reason that:

     size_t s;
     int i = s;

gives an error. A 64 bit integer cannot be converted to a 32 bit integer without
risk of losing data.

If the loop is rewritten as:

     foreach (i; 0 .. array.length)

then the problem goes away.

size_t is 64 bits for a machine with 64 bit pointers. This also means that 
registers are 64 bits in size, so no memory and no performance is saved by 
making the index 32 bits. It's the "natural" size for an index.

BTW, there have been endless bugs in C programs when converting them between 
16-32-64 bits, because C doesn't give an error when converting from a larger 
integer type to a smaller one. It just truncates. These bugs tend to be hidden 
and hard to track down. Such overflows are also exploited for malware purposes.

D is doing the right thing and the code is portable between 32<=>64 bits pretty 
much by default. When was the last time you heard of a 32<=>64 bit porting bug 
with D? I don't recall one. Many of D's design decisions came from living 
through the C/C++ conversion bugs by transitioning 16-32-64.

Just use:

     foreach (i; 0 .. array.length)

and let the compiler take care of it for you.
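
A minimal sketch of the rule described above, for illustration only. The helper name `narrow` is hypothetical; the checks only show that implicit 64-to-32-bit narrowing is rejected, that an explicit cast is accepted and truncates, and that an index declared without a type takes the type of `array.length`.

```d
// Explicit narrowing: the cast makes the possible data loss visible.
// `narrow` is a hypothetical helper, not a library function.
uint narrow(ulong value)
{
    return cast(uint) value; // keeps only the low 32 bits
}

void main()
{
    // Implicit narrowing from a 64-bit to a 32-bit integer is rejected...
    static assert(!__traits(compiles, (ulong s) { int i = s; }));
    // ...while the same conversion spelled with a cast is accepted.
    static assert( __traits(compiles, (ulong s) { int i = cast(int) s; }));

    auto array = [10, 20, 30];
    foreach (i; 0 .. array.length)
    {
        // With no declared type, the index takes the type of array.length.
        static assert(is(typeof(i) == size_t));
    }

    assert(narrow(42) == 42);            // small values survive the cast
    assert(narrow(uint.max + 1UL) == 0); // larger values silently wrap
}
```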
Mar 24
next sibling parent Liam McGillivray <yoshi.pit.link.mario gmail.com> writes:
On Sunday, 24 March 2024 at 16:33:06 UTC, Walter Bright wrote:
 When size_t is 64 bits, the reason:

     foreach (int i; 0 .. array.length)
I suppose I wasn't clear enough. When I say I want it to work without errors, I specifically meant in the following format:
```
foreach (uint i, entry; array)
```
I suppose for the other `foreach` format it would be nice too. But for the latter, it would be as easy as removing a deprecation.

My position is that the latter format should be allowed and removed from deprecation for the foreseeable future outside of code marked `@safe`, but perhaps don't allow it in `@safe` code.

I don't know if it's ever considered acceptable practice for runtime warnings to be automatically inserted into a program, but perhaps for debug builds a runtime warning can be inserted whenever the array is longer than, or perhaps more than 3/4 the size allowed by the type of `i` (as in the code above).

But if it's still considered unacceptable to have the now-deprecated format shown above be brought back as a language feature, I suggest the following as a compromise:
```
foreach (cast uint i, entry; array)
```
and also
```
foreach (cast uint i; 0 .. array.length)
```
Mar 24
prev sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On Sunday, 24 March 2024 at 16:33:06 UTC, Walter Bright wrote:

 Just use:

     foreach (i; 0 .. array.length)

 and let the compiler take care of it for you.
The use case I have is you need to pass `i` to a function that takes an int. This is very common in C libraries (e.g. raylib). Now, in this case, the solution is quite easy:

```d
foreach(i; 0 .. cast(int)array.length) // assumed
foreach(i; 0 .. array.length.to!int) // checked
```

But the case is *not* as easy with a foreach over an array with an index:

```d
foreach(int i, v; array)
```

In this case, without that mechanism, you have to cast i *every time it's used*, or have a goofy reassignment to another variable in each loop iteration.

This is one of those quality of life issues that would be nice to get back.

-Steve
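
For reference, the per-iteration reassignment workaround mentioned above might look like the following sketch. `drawCell` is a hypothetical stand-in for a C function that takes an `int` (it is not a raylib API), and `drawAll` is likewise made up for the example. The conversion uses `std.conv.to`, which throws on overflow rather than truncating.

```d
import std.conv : to;

// Hypothetical stand-in for a C API that takes an int index.
int drawCell(int index, int value)
{
    return index * 10 + value;
}

// Hypothetical caller: iterates with a size_t index, converts once per pass.
int drawAll(int[] array)
{
    int total = 0;
    foreach (idx, v; array)
    {
        // Checked conversion: throws ConvOverflowException if idx > int.max.
        const i = idx.to!int;
        total += drawCell(i, v);
    }
    return total;
}

void main()
{
    // (0*10 + 1) + (1*10 + 2) + (2*10 + 3)
    assert(drawAll([1, 2, 3]) == 36);
}
```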
Mar 27
prev sibling parent reply Nick Treleaven <nick geany.org> writes:
On Sunday, 24 March 2024 at 08:23:03 UTC, Liam McGillivray wrote:
 On Thursday, 30 November 2023 at 15:25:52 UTC, Jonathan M Davis 
 wrote:
 Because size_t is uint on 32-bit systems, using int with 
 foreach works just fine aside from the issue of signed vs 
 unsigned (which D doesn't consider to be a narrowing 
 conversion, for better or worse). So, someone could use int 
 with foreach on a 32-bit system and have no problems, but when 
 they move to a 64-bit system, it could become a big problem, 
 because there, size_t is ulong. So, code that worked fine on a 
 32-bit system could then break on a 64-bit system (assuming 
 that it then starts operating on arrays that are larger than a 
 32-bit system could handle).
 An interesting, not bad point, but I don't think it's enough to justify removing this language feature. It's just too unlikely of a scenario to be worth removing a feature which improves things far more often than not.
It's good to make any integer truncation visible rather than implicit - that's the main reason. And given that it will be safe to use a smaller integer type than size_t when the array length is statically known (after https://github.com/dlang/dmd/pull/16334), some future people might expect a specified index type to be verified as able to hold every index in the array.
 Firstly, how often would it be that a program wouldn't 
 explicitly require more array values than `uint` can fit, but 
 is still capable of filling the array beyond that in places 
 when the maximum array size is enough?
The 64-bit version of the program may be expected to handle more data than the 32-bit version. That could even be the reason why it was ported to 64-bit. ...
 Maybe disallow it from functions marked `@safe`,
@safe is for memory-safety, it shouldn't be conflated with other types of safety.
 But if it's still considered unacceptable to have the 
 now-deprecated format shown above be brought back as a language 
 feature, I suggest the following as a compromise:
 ```
     foreach (cast uint i, entry; array)
 ```
Possibly a wrapper could be made which asserts that the length fits:

```d
foreach (i, entry; array.withIndex!uint)
```
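
One way such a wrapper could be written is sketched below. `withIndex` does not exist in Phobos as far as this sketch assumes; the name comes from the suggestion above and the implementation (an `opApply` over a slice) is only one possible design. The length check is a plain `assert`, so it would be compiled out in release builds.

```d
// Hypothetical helper: iterates a slice while handing out indexes of type I.
auto withIndex(I, T)(T[] array)
{
    static struct Result
    {
        T[] array;

        // foreach calls this with the loop body wrapped in a delegate.
        int opApply(scope int delegate(I, ref T) dg)
        {
            foreach (i, ref e; array)
            {
                // The cast cannot lose data: the length was checked up front.
                if (auto r = dg(cast(I) i, e))
                    return r;
            }
            return 0;
        }
    }

    // Conservative check: every index is below the length, so it fits in I.
    assert(array.length <= I.max, "array too long for the requested index type");
    return Result(array);
}

void main()
{
    auto array = ["a", "b", "c"];
    size_t visited = 0;
    foreach (i, entry; array.withIndex!uint)
    {
        static assert(is(typeof(i) == uint)); // index type is the requested one
        assert(entry == array[i]);
        ++visited;
    }
    assert(visited == 3);
}
```

Checking once before the loop keeps the cost out of the loop body, which is roughly what the assert-at-loop-start idea from the start of the thread asks the compiler to do.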
Mar 25
parent Quirin Schroll <qs.il.paperinik gmail.com> writes:
On Monday, 25 March 2024 at 22:27:10 UTC, Nick Treleaven wrote:
 On Sunday, 24 March 2024 at 08:23:03 UTC, Liam McGillivray 
 wrote:
 On Thursday, 30 November 2023 at 15:25:52 UTC, Jonathan M 
 Davis wrote:
 Because size_t is uint on 32-bit systems, using int with 
 foreach works just fine aside from the issue of signed vs 
 unsigned (which D doesn't consider to be a narrowing 
 conversion, for better or worse). So, someone could use int 
 with foreach on a 32-bit system and have no problems, but 
 when they move to a 64-bit system, it could become a big 
 problem, because there, size_t is ulong. So, code that worked 
 fine on a 32-bit system could then break on a 64-bit system 
 (assuming that it then starts operating on arrays that are 
 larger than a 32-bit system could handle).
 An interesting, not bad point, but I don't think it's enough to justify removing this language feature. It's just too unlikely of a scenario to be worth removing a feature which improves things far more often than not.
 It's good to make any integer truncation visible rather than implicit - that's the main reason. And given that it will be safe to use a smaller integer type than size_t when the array length is statically known (after https://github.com/dlang/dmd/pull/16334), some future people might expect a specified index type to be verified as able to hold every index in the array.
 Firstly, how often would it be that a program wouldn't 
 explicitly require more array values than `uint` can fit, but 
 is still capable of filling the array beyond that in places 
 when the maximum array size is enough?
 The 64-bit version of the program may be expected to handle more data than the 32-bit version. That could even be the reason why it was ported to 64-bit. ...
 Maybe disallow it from functions marked `@safe`,
 @safe is for memory-safety, it shouldn't be conflated with other types of safety.
I’d rather say `@safe` means no undefined behavior, or more precisely, the compiler gives errors for operations that might be UB. If all code you write is `@safe`, you don’t have UB in your program (given a perfect compiler). Disallowing implicit integer truncation is not a UB issue (it’s not UB to implicitly truncate, i.e. it’s not like signed integer overflow in C), but it’s disallowed for other good reasons.

---

After reading this thread, I get the impression that `size_t` should be its own type, with the guarantee that `size_t` is equivalent to one of the other built-in unsigned integer types. The advantage would be that casts between other integer types and `size_t` would have to be explicit even if they can’t fail on the given platform. I’d require explicit casts for all of them, just to be simple and consistent.

Conceptually, a size is not an n-bit number for a fixed n, unlike values of type `uint` or `ulong`. It’s a similar idea to having `char`, `wchar`, and `dchar` separate from `ubyte`, `ushort`, and `uint` even if they’re the same under the hood. Heck, unlike `size_t`, they relate to the exact same integer types under the hood on *every* platform. So in some sense, the argument for having `size_t` be different is even stronger than the argument for having character types be different.

My bet is that Walter strongly disagrees with this, since he stands firmly on Booleans being integers as well. It’s not unreasonable; how much sense it makes just depends on where you come from.
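
A small sketch of the current situation being contrasted here, assuming a common 32- or 64-bit target: the character types are distinct from the same-sized integer types, while `size_t` is merely an alias for one of them. The constant name `sizeIsAlias` is made up for the example.

```d
// True when size_t is just another name for a built-in unsigned type.
enum sizeIsAlias = is(size_t == uint) || is(size_t == ulong);

void main()
{
    // Character types have the same size as an integer type but are distinct.
    static assert(char.sizeof == ubyte.sizeof && !is(char == ubyte));
    static assert(wchar.sizeof == ushort.sizeof && !is(wchar == ushort));
    static assert(dchar.sizeof == uint.sizeof && !is(dchar == uint));

    // size_t, by contrast, is the very same type as uint or ulong.
    static if (size_t.sizeof == 8)
        static assert(is(size_t == ulong));
    else static if (size_t.sizeof == 4)
        static assert(is(size_t == uint));
}
```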
Mar 27