digitalmars.D.learn - length's type.

reply zjh <fqbqrr 163.com> writes:
Can you change the type of 'length' from 'ulong' to 'int', so I 
haven't to convert it every time!
Jan 17
next sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Wednesday, January 17, 2024 7:55:37 PM MST zjh via Digitalmars-d-learn 
wrote:
 Can you change the type of 'length' from 'ulong' to 'int', so I
 haven't to convert it every time!
If you mean for arrays, length and array indices are specifically size_t so that their size will match the pointer size for the architecture. On 64-bit systems, that means that it's ulong (whereas on 32-bit systems, it would be uint). If it were int, then you couldn't access all of the elements of larger arrays (and arrays will get that large in some cases - e.g. when dealing with larger files). C/C++ does the same thing. If you want your code to be portable and to be able to handle larger arrays, then it should be using size_t for array indices and length and not int, in which case, you're not typically going to need to convert from ulong to int, because you'd just be using size_t, which would then be ulong on 64-bit systems. Obviously, when you do need to convert to int, then that can be annoying, but for a lot of code, using auto and size_t makes it so that you don't need to use int, and it would be a big problem in general if the language made length int. - Jonathan M Davis
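As a minimal sketch of the point above (the array contents and the helper name `countPositive` are made up for illustration), both the length and the index can stay `size_t`, either spelled out or inferred with `auto`, so no conversion to `int` is needed:

```d
import std.stdio;

// Hypothetical helper: counts the positive elements using only size_t.
size_t countPositive(const int[] arr)
{
    size_t count = 0;
    // The index type matches arr.length on both 32-bit and 64-bit targets.
    for (size_t i = 0; i < arr.length; ++i)
    {
        if (arr[i] > 0)
            ++count;
    }
    return count;
}

void main()
{
    int[] arr = [3, -1, 4, -1, 5];

    // length is size_t, which is as wide as a pointer on the target.
    static assert(is(typeof(arr.length) == size_t));
    static assert(size_t.sizeof == (void*).sizeof);

    auto len = arr.length; // inferred as size_t
    writeln(len, " elements, ", countPositive(arr), " positive");
}
```

The same source compiles unchanged for 32-bit and 64-bit targets because no integer width is hard-coded.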
Jan 17
parent reply zjh <fqbqrr 163.com> writes:
On Thursday, 18 January 2024 at 04:30:33 UTC, Jonathan M Davis 
wrote:
but for a lot of code, using auto and size_t makes it
 so that you don't need to use int, and it would be a big 
 problem in general if the language made length int.
It's hard to imagine that an `'int'` needs to be replaced with `'auto'`.
Jan 17
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Wednesday, January 17, 2024 11:33:48 PM MST zjh via Digitalmars-d-learn 
wrote:
 On Thursday, 18 January 2024 at 04:30:33 UTC, Jonathan M Davis
 wrote:
 but for a lot of code, using auto and size_t makes it

 so that you don't need to use int, and it would be a big
 problem in general if the language made length int.
It's hard to imagine that an `'int'` needs to be replaced with `'auto'`.
It's very common in D code to do stuff like auto a = foo(); or auto len = arr.length; That way, you automatically get the correct type. Obviously, there are cases where you need to force a particular type, and that can require casting, but inferring types often simplifies code. It's _very_ common in idiomatic D code to use auto when you don't need to force a specific type. And when dealing with arrays, it's very typical to use either auto or size_t, because then you get the correct integer type regardless of the platform, and then you only need to worry about casting to int in cases where you actually need int for whatever reason. But regardless of whether you want to use auto, there are very good reasons for why length is size_t, and C/C++ made exactly the same choice for the same reasons. You can certainly disagree with that choice, but it's not the kind of thing that stands much chance of ever being changed. - Jonathan M Davis
Jan 17
parent zjh <fqbqrr 163.com> writes:
On Thursday, 18 January 2024 at 07:44:00 UTC, Jonathan M Davis 
wrote:

```d
auto a = foo();
// or
auto len = arr.length;
```
Thank you for your reply. I'll just use `auto`.
Jan 18
prev sibling next sibling parent reply Siarhei Siamashka <siarhei.siamashka gmail.com> writes:
On Thursday, 18 January 2024 at 02:55:37 UTC, zjh wrote:
 Can you change the type of 'length' from 'ulong' to 'int', so I 
 haven't to convert it every time!
The explicit conversion `.length.to!int` has the extra benefit of doing a runtime check to ensure that the length value actually fits in an `int` variable, just in case somebody tries to use your code to process a 3GB array of bytes. It's more correct to rewrite the code to expect `size_t`. But one needs to be very careful about implementing it correctly, because silent casts between signed and unsigned data types may ruin your day. It's a major source of bugs, similar to the one discussed in https://forum.dlang.org/thread/vyvbrtmyelududcvukfb@forum.dlang.org
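A small sketch of that runtime check, assuming nothing beyond `std.conv` and `std.exception` (the helper name `lengthAsInt` is hypothetical, and the oversized length is simulated rather than taken from a real 3GB array):

```d
import std.conv : to, ConvOverflowException;
import std.exception : collectException;
import std.stdio;

// Hypothetical helper: converts a length to int, throwing if it does not fit.
int lengthAsInt(size_t len)
{
    return len.to!int;
}

void main()
{
    auto small = [1, 2, 3];
    int n = lengthAsInt(small.length);
    writeln(n);

    // Simulate a length just past int.max without allocating such an array.
    size_t tooBig = cast(size_t) int.max + 1;
    auto e = collectException!ConvOverflowException(lengthAsInt(tooBig));
    writeln(e !is null ? "conversion rejected" : "conversion accepted");
}
```

A plain `cast(int)` on the same value would silently produce a negative number instead of reporting the problem.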
Jan 27
parent reply zjh <fqbqrr 163.com> writes:
On Sunday, 28 January 2024 at 06:34:13 UTC, Siarhei Siamashka 
wrote:
 The explicit conversion `.length.to!int` has an extra benefit
I rarely use numbers over one million. But do I have to consider numbers over `4 billion` every day?
Jan 28
parent reply Olivier Pisano <olivier.pisano laposte.net> writes:
On Sunday, 28 January 2024 at 08:55:54 UTC, zjh wrote:
 On Sunday, 28 January 2024 at 06:34:13 UTC, Siarhei Siamashka 
 wrote:
 The explicit conversion `.length.to!int` has an extra benefit
I rarely use numbers over one million. But do I have to consider numbers over `4 billion` every day?
If .length were to be an int, D could not handle array of more than 2G bytes. The whole language would be useless on 64 bit systems.
Jan 28
next sibling parent monkyyy <crazymonkyyy gmail.com> writes:
On Sunday, 28 January 2024 at 16:16:34 UTC, Olivier Pisano wrote:
 On Sunday, 28 January 2024 at 08:55:54 UTC, zjh wrote:
 On Sunday, 28 January 2024 at 06:34:13 UTC, Siarhei Siamashka 
 wrote:
 The explicit conversion `.length.to!int` has an extra benefit
I rarely use numbers over one million. But do I have to consider numbers over `4 billion` every day?
If .length were to be an int, D could not handle array of more than 2G bytes. The whole language would be useless on 64 bit systems.
_Of bytes_. And if you're messing with the type and still think that's an important concern, you could make it a long, for 63 bits and no silly 0-1 behavior. A signed index of a datatype that is 2 words will still compete with the mythical RAM of a maxed-out 64-bit machine; if you have 64^2 ubytes, maybe you should rotate your perspective and store 256 counts of each.
Jan 28
prev sibling next sibling parent reply mw <mw g.c> writes:
On Sunday, 28 January 2024 at 16:16:34 UTC, Olivier Pisano wrote:
 If .length were to be an int, D could not handle array of more 
 than 2G bytes. The whole language would be useless on 64 bit 
 systems.
It would be better for array.length to be a *signed* `long` (a signed size_t) instead of unsigned.

Can you guess the output of this example, which averages the elements of an array?

==================================
import std.algorithm;
import std.stdio;

void main() {
    long[] a = [-5000, 0];
    long c = sum(a) / a.length;
    writeln(c);
}
==================================

See the result here: https://forum.dlang.org/post/cagloplexjfzubncxuza@forum.dlang.org
Jan 28
parent reply Gary Chike <chikega gmail.com> writes:
On Sunday, 28 January 2024 at 17:25:49 UTC, mw wrote:

 See the result here:

 https://forum.dlang.org/post/cagloplexjfzubncxuza@forum.dlang.org
I knew this outlandish output had to do with mixing of signed and unsigned types with resulting overflow. But I like the way Anthropic Claude2 explains it:

---

The **outlandish output** you're observing in the D code occurs due to an **integer overflow**. Let's break down what's happening:

1. **Sum Calculation**:
   - The `sum(elem)` function correctly calculates the sum of all elements in the `elem` array, which is **15** (1 + 2 + 3 + 4 + 5).
   - So far, so good!
2. **Average Calculation**:
   - Next, you calculate the average of the `a` array using the expression `sum(a) / a.length`.
   - The `sum(a)` part correctly computes the sum of the elements in `a`, which is **-5000** (since `-5000 + 0 = -5000`).
   - The issue arises with `a.length`.
3. **Array Length and Unsigned Types**:
   - In D, the `.length` property of an array returns an unsigned integer (`ulong`), which is an unsigned 64-bit integer.
   - The length of the `a` array is **2** (since it has two elements: -5000 and 0).
4. **Integer Division**:
   - When you perform `sum(a) / a.length`, the division is done using integer division rules.
   - The result of `-5000 / 2` is **-2500** (integer division truncates the decimal part).
5. **Overflow**:
   - The result of `-2500` is stored in the `c` variable, which is of type `ulong`.
   - Since `ulong` is an unsigned type, the value wraps around due to overflow.
   - The actual value stored in `c` is **9223372036854773308**, which is the result of wrapping around from -2500 to a large positive value.
6. **Explanation**:
   - The overflow occurs because `-2500` (as a signed integer) is equivalent to a large positive value when interpreted as an unsigned 64-bit integer.
   - The result is not what you intended for the average calculation.

To fix this, you can explicitly cast the sum to a signed integer before dividing to ensure correct behavior:
```d
auto c = cast(double)(sum(a)) / a.length;
```
This will correctly compute the average and prevent overflow.

The output should now match your expectations! 🚀

The corrected code:
```d
module array_length_forum;

import std.algorithm;
import std.stdio;

void main() {
    auto elem = [1, 2, 3, 4, 5];
    writeln(sum(elem)); // 15 <- the sum of all the elements in the range

    long[] a = [-5000, 0];
    //auto c = sum(a)/a.length; // length() returns 'ulong', inferred as 'ulong'
    auto c = cast(double)(sum(a)) / a.length;
    writeln(typeid(c)); // double
    writeln(c); // -2500 correct output
}
```
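For reference, a minimal sketch of where the value 9223372036854773308 comes from, assuming a 64-bit target where `size_t` is `ulong` (the helper names `mixedAverage` and `signedAverage` are made up for illustration). Under D's usual arithmetic conversions the signed dividend is converted to unsigned before the division, so it is the -5000 that wraps, not the quotient:

```d
import std.algorithm : sum;
import std.stdio;

// Hypothetical helper: reproduces the mixed signed/unsigned division.
ulong mixedAverage(const long[] a)
{
    // sum(a) is long and a.length is size_t; the long operand is
    // converted to unsigned before the division takes place.
    return sum(a) / a.length;
}

// Hypothetical helper: converts the length to long first, so the
// division stays signed.
long signedAverage(const long[] a)
{
    return sum(a) / cast(long) a.length;
}

void main()
{
    // Sketch assumes a 64-bit target, where size_t is ulong.
    static assert(size_t.sizeof == 8);

    long[] a = [-5000, 0];
    writeln(cast(ulong) -5000L); // the dividend as the division sees it
    writeln(mixedAverage(a));    // 9223372036854773308
    writeln(signedAverage(a));   // -2500
}
```

Casting the length to `long` keeps the result an integer, whereas the `cast(double)` fix changes the result type to floating point; which one is appropriate depends on what the average is used for.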
Feb 07
parent reply Gary Chike <chikega gmail.com> writes:
On Wednesday, 7 February 2024 at 19:20:12 UTC, Gary Chike wrote:

I just had to transcribe this to C just for grins :D
```c
#include <stdio.h>

int sumArray(int arr[], size_t size) {
     int total = 0;
     for (size_t i = 0; i < size; ++i) {
         total += arr[i];
     }
     return total;
}

int main(void) {
     long a[] = {-5000, 0};
     size_t aLength = sizeof(a) / sizeof(a[0]);

     double c = (double)sumArray((int*)a, aLength) / aLength;
     printf("Average: %.2lf\n", c); // -2500 <- correct output

     return 0;
}
```
Feb 07
parent reply Gary Chike <chikega gmail.com> writes:
On Wednesday, 7 February 2024 at 19:32:56 UTC, Gary Chike wrote:
 On Wednesday, 7 February 2024 at 19:20:12 UTC, Gary Chike wrote:
The output wasn't quite right. So I tweaked it a bit:
```c
#include <stdio.h>

long sumArray(long arr[], size_t size) {
    long total = 0;
    for (size_t i = 0; i < size; ++i) {
        total += arr[i];
    }
    return total;
}

int main(void) {
    long a[] = {-5000, 0};
    size_t aLength = sizeof(a) / sizeof(a[0]);

    double c = (double)sumArray(a, aLength) / aLength;
    printf("Average: %.2lf\n", c); // -2500.00

    return 0;
}
```
Feb 07
parent reply Gary Chike <chikega gmail.com> writes:
On Wednesday, 7 February 2024 at 20:08:24 UTC, Gary Chike wrote:
 On Wednesday, 7 February 2024 at 19:32:56 UTC, Gary Chike wrote:
     double c = (double)sumArray(a, aLength) / aLength;
If I don't cast explicitly: `double c = sumArray(a, aLength) / aLength;` then I will get a similar result as the D code: `Average: 9223372036854773760.00`
Feb 07
parent reply Kevin Bailey <keraba yahoo.com> writes:
On Wednesday, 7 February 2024 at 20:13:40 UTC, Gary Chike wrote:
 On Wednesday, 7 February 2024 at 20:08:24 UTC, Gary Chike wrote:
 On Wednesday, 7 February 2024 at 19:32:56 UTC, Gary Chike 
 wrote:
     double c = (double)sumArray(a, aLength) / aLength;
If I don't cast explicitly: `double c = sumArray(a, aLength) / aLength;` then I will get a similar result as the D code: `Average: 9223372036854773760.00`
I don't think it's productive to compare the behavior to C. C is now 50 years old. One would hope that D has learned a few things in that time.

How many times does the following loop print? I ran into this twice doing the AoC exercises. It would be nice if it Just Worked.
```
import std.stdio;

int main()
{
   char[] something = ['a', 'b', 'c'];

   for (auto i = -1; i < something.length; ++i)
         writeln("less than");

   return 0;
}
```
Feb 07
next sibling parent mw <m g.c> writes:
On Thursday, 8 February 2024 at 05:56:57 UTC, Kevin Bailey wrote:
 I don't think it's productive to compare the behavior to C. C 
 is now 50 years old. One would hope that D has learned a few 
 things in that time.

 How many times does the following loop print? I ran into this 
 twice doing the AoC exercises. It would be nice if it Just 
 Worked.
 ```
 import std.stdio;

 int main()
 {
   char[] something = ['a', 'b', 'c'];

   for (auto i = -1; i < something.length; ++i)
         writeln("less than");

   return 0;
 }
 ```
This is horrible, even if you use `int i`, it still won't work as you have thought (ok, I thought):
```
import std.stdio;

int main()
{
   char[] something = ['a', 'b', 'c'];

   for (int i = -1; i < something.length; ++i)
         writeln("less than");

   writeln("done");
   return 0;
}
```
it will just output
```
done
```
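A short sketch of why the loop body never runs (the helper name `naiveIterations` is hypothetical): for the comparison the `int` index is converted to the unsigned type of `length`, so `-1` turns into the largest possible `size_t` value and `i < length` is false on the very first test.

```d
import std.stdio;

// Hypothetical helper: counts iterations of the loop from the post.
size_t naiveIterations(const char[] s)
{
    size_t n = 0;
    // i is converted to size_t for the comparison, so -1 becomes size_t.max.
    for (int i = -1; i < s.length; ++i)
        ++n;
    return n;
}

void main()
{
    char[] something = ['a', 'b', 'c'];
    writeln(cast(size_t) -1 == size_t.max); // true
    writeln(naiveIterations(something));    // 0
}
```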
Feb 07
prev sibling next sibling parent reply thinkunix <thinkunix zoho.com> writes:
Kevin Bailey via Digitalmars-d-learn wrote:
 How many times does the following loop print? I ran into this twice 
 doing the AoC exercises. It would be nice if it Just Worked.
 ```
 import std.stdio;
 
 int main()
 {
    char[] something = ['a', 'b', 'c'];
 
    for (auto i = -1; i < something.length; ++i)
          writeln("less than");
 
    return 0;
 }
 ```
 
Pretty nasty.

This seems to work but just looks bad to me. I would never write code like this. It would also break if the array 'something' had more than int.max elements.
```
import std.stdio;

int main()
{
    char[] something = ['a', 'b', 'c'];

    // len = 3, type ulong
    writeln("len: ", something.length);
    writeln("typeid(something.length): ", typeid(something.length));

    // To make the loop execute, must cast something.length
    // which is a ulong, to an int, which prevents i from
    // being promoted from int to ulong and overflowing.
    // The loop executes 4 times, when i is -1, 0, 1, and 2.
    for (auto i = -1; i < cast(int)something.length; ++i)
    {
        writeln("i: ", i);
    }

    return 0;
}
```
output:
len: 3
typeid(something.length): ulong
i: -1
i: 0
i: 1
i: 2
Feb 08
parent reply Kevin Bailey <keraba yahoo.com> writes:
On Thursday, 8 February 2024 at 08:23:12 UTC, thinkunix wrote:
 I would never write code like this.
By all means, please share with us how you would have written that just as elegantly but "correct".
 It would also break if the array 'something' had more than 
 int.max elements.
Then don't cast it to an int. First of all, why didn't you cast it to a long? Second, why doesn't the language do this correctly so I don't have to cast it at all? If I explicitly use checkedint, it does, but I don't want to write Checked!(int, ProperCompare) all over the place. (Yes, I know I can alias it.) I'm asking, why is the default C compatibility (of all things) rather than "safety that I can override if I need to make it faster" ?
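For readers who have not used it, a minimal sketch of the `std.checkedint` approach mentioned above, using the `checked` factory with the `ProperCompare` hook (the helper name `checkedIterations` and its `start` parameter are made up for illustration):

```d
import std.checkedint : checked, ProperCompare;
import std.stdio;

// Hypothetical helper: counts iterations when the index uses ProperCompare.
size_t checkedIterations(const char[] s, int start)
{
    size_t n = 0;
    // ProperCompare compares the signed index and the unsigned length by
    // value instead of converting the index to unsigned first.
    for (auto i = checked!ProperCompare(start); i < s.length; ++i)
        ++n;
    return n;
}

void main()
{
    char[] something = ['a', 'b', 'c'];
    writeln(checkedIterations(something, -1)); // 4
}
```

The cost is the one described in the post: the wrapper type has to be written, or aliased, wherever the index is declared.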
Feb 08
next sibling parent Arafel <er.krali gmail.com> writes:
On 8/2/24 16:00, Kevin Bailey wrote:
 I'm asking, why is the default C compatibility (of all things) rather 
 than "safety that I can override if I need to make it faster" ?
I'm sure there are more experienced people here that will be able to answer better, but as far as I remember, the policy has been like this since times immemorial:
 D code that happens to be valid C code as well should either behave exactly
like C, or not compile at all.
I can't find a quote in the official documentation right now, though. IIRC, there are a couple of specific situations where this is not the case, but I think this is the answer to your question ("why?"). You can well disagree with this policy, and there are probably many good reasons for that, but that's probably a deeper discussion.
Feb 08
prev sibling next sibling parent reply kdevel <kdevel vogtner.de> writes:
On Thursday, 8 February 2024 at 15:00:54 UTC, Kevin Bailey wrote:
 By all means, please share with us how you would have written 
 that just as elegantly but "correct".
Elegant and correct is this version:
```d
import std.stdio;

int main()
{
        char[] something = ['a', 'b', 'c'];

        writeln("len: ", something.length);
        writeln("typeid(something.length): ",
                typeid(something.length));

        writeln ("i: -1");
        foreach (i, _; something)
                writeln("i: ", i);
        return 0;
}
```
But it is still a bit too "clever" in the `foreach` statement due to the unused variable `_`.
Feb 08
parent reply Kevin Bailey <keraba yahoo.com> writes:
Arafel,

You're certainly correct. Priorities change. It used to be 
thought that backwards compatibility was the way to attract 
developers. But today the keyword is "safety".

Apparently 2022 was the year of the C++ successor. Some features 
of D were mentioned in the discussion but D as a candidate was 
not. Maybe it's time to re-visit the priorities.

Anyways, my post was simply to highlight the issue.

On Thursday, 8 February 2024 at 15:26:16 UTC, kdevel wrote:
 Elegant and correct is this version:

 ```d
 import std.stdio;

 int main()
 {
         char[] something = ['a', 'b', 'c'];

         writeln("len: ", something.length);
         writeln("typeid(something.length): ",
                 typeid(something.length));

         writeln ("i: -1");
         foreach (i, _; something)
                 writeln("i: ", i);
         return 0;
 }
 ```

 But it is still a bit too "clever" in the `foreach` statement 
 due to the unused variable `_`.
It has a bigger problem. What happens when I change the code in the loop but forget to change the code outside the loop? This is why people complain about Python's lack of a do/while loop. So no, not elegant. Additionally, it doesn't address the issue. It still requires me to both realize the issue with comparing an int to length's mystery type, as well as to fix it for the compiler. (And it's beside the fact that the start value could just as easily be an int (parameter for example) that could take a negative value. This was the case for one of my cases.)
Feb 08
next sibling parent kdevel <kdevel vogtner.de> writes:
On Thursday, 8 February 2024 at 16:54:36 UTC, Kevin Bailey wrote:
 [...]
 On Thursday, 8 February 2024 at 15:26:16 UTC, kdevel wrote:
 Elegant and correct is this version:

 ```d
 import std.stdio;

 int main()
 {
         char[] something = ['a', 'b', 'c'];

         writeln("len: ", something.length);
         writeln("typeid(something.length): ",
                 typeid(something.length));

         writeln ("i: -1");
         foreach (i, _; something)
                 writeln("i: ", i);
         return 0;
 }
 ```

 But it is still a bit too "clever" in the `foreach` statement 
 due to the unused variable `_`.
It has a bigger problem. What happens when I change the code in the loop but forget to change the code outside the loop?
How exactly is that pseudo index `-1` related to the real indices of the array `something`? This is absolutely not clear to me. I simply don't see what problem you are trying to solve.
Feb 08
prev sibling parent reply Gary Chike <chikega gmail.com> writes:
On Thursday, 8 February 2024 at 16:54:36 UTC, Kevin Bailey wrote:

 Additionally, it doesn't address the issue. It still requires 
 me to both realize the issue with comparing an int to length's 
 mystery type, as well as to fix it for the compiler.

 (And it's beside the fact that the start value could just as 
 easily be an int (parameter for example) that could take a 
 negative value. This was the case for one of my cases.)
It appears that some strongly typed languages like Rust necessitate explicit casting similar to D in this particular case. And some extremely strongly typed languages like Ada require it to another level.
```rust
fn main() {
    let something = vec!['a', 'b', 'c'];
    println!("len: {}", something.len()); // 3

    for i in -1..=something.len() as i32 {
        println!("i: {}", i);
    }
}
```
Like D, Rust's `len()` function returns an unsigned integer, `usize`. It has to be cast to `i32` to deal with the `-1` index or it will not compile.
```
    for i in -1..=something.len() {
  |          ^^ the trait `Neg` is not implemented for `usize`
```
But there are some modern languages in which the length function returns a signed integer by default, such as Odin: [`len :: proc(v: Array_Type) -> int {…}`](https://pkg.odin-lang.org/base/builtin/#len). So there is no requirement for casting in Odin in this particular case. An experienced Odin programmer replied to me on the Odin Discord, stating: "It is int in the default context, but if the type system sees it is being used in the context of something that wants a uint it will return that."
```c
package main

import "core:fmt"

main :: proc() {
    something := []byte{'a', 'b', 'c'}

    fmt.println("length of 'len(something): ", len(something))
    fmt.printf("type of 'len(something)': %T\n", len(something)) // int

    for i := -1; i < len(something); i += 1 {
        fmt.println("i: ", i)
    }
}
```
Thank you everyone for this interesting discussion on languages and language design.
Feb 08
parent reply Gary Chike <chikega gmail.com> writes:
On Friday, 9 February 2024 at 02:13:04 UTC, Gary Chike wrote:

Reviewing the default return type in a couple more newer 
languages for the length function or equivalent. (`len()`, 
`length()`, `size()`):

The Nim language appears to return an `int` type:
```python
let something =  ['a', 'b', 'c']



for i in -1..something.len.int:
     echo("i: ", i)
```
The Crystal language appears to return an `Int32` type:
```ruby
something = ['a', 'b', 'c']



(-1..something.size).each do |i|

end
```
Interesting to see that the signed integer type has been adopted 
as the return type for the length function by these newer 
languages.
Feb 08
parent reply Gary Chike <chikega gmail.com> writes:
On Friday, 9 February 2024 at 02:55:48 UTC, Gary Chike wrote:
 On Friday, 9 February 2024 at 02:13:04 UTC, Gary Chike wrote:
I spoke too soon, it appears Zig's default return type for its `len` is `usize`, similar to Rust.
```zig
const std = @import("std");

pub fn main() !void {
    const something = [_]u8{ 'a', 'b', 'c' };

    std.debug.print("len: {d}\n", .{something.len}); // len: 3

    const len_type = @typeName(@TypeOf(something.len));
    std.debug.print("len type: {s}\n", .{len_type}); // len type: usize

    var i: i32 = -1;
    while (i <= @intCast(i32, something.len)) : (i += 1) {
        std.debug.print("i: {d}\n", .{i});
    }
}
```
Output:
```
len: 3
len type: usize
i: -1
i: 0
i: 1
i: 2
i: 3
```
Feb 08
parent reply Danilo <codedan aol.com> writes:
Rust, Nim, Zig, Odin…?

Here is the Forum for D(lang). ;)
Feb 09
parent reply Sergey <kornburn yandex.ru> writes:
On Friday, 9 February 2024 at 08:04:56 UTC, Danilo wrote:
 Rust, Nim, Zig, Odin…?

 Here is the Forum for D(lang). ;)
But it is fine to see what others have.. Teach on their experience is useful This is how research is going
Feb 09
next sibling parent reply Gary Chike <chikega gmail.com> writes:
On Friday, 9 February 2024 at 12:15:29 UTC, Sergey wrote:
 On Friday, 9 February 2024 at 08:04:56 UTC, Danilo wrote:
 Rust, Nim, Zig, Odin…?

 Here is the Forum for D(lang). ;)
But it is fine to see what others have.. Teach on their experience is useful This is how research is going
Thank you Sergey! I definitely appreciate the wider perspective I've gained by peering into multiple languages. For example, Ada, being a Wirthian language, has an N-index based system, so indices can be negative which necessitates a return type of a signed type for the Length attribute. Pascal, Lua, and PL/I are other languages that are also N-index based vs just being either 0-index(most languages) or 1-index based (eg. Julia, Matlab, Fortran). :)
Feb 09
parent Gary Chike <chikega gmail.com> writes:
On Friday, 9 February 2024 at 16:49:37 UTC, Gary Chike wrote:

The underlying architecture of the language will often times 
dictate how certain constructs or design decisions are made. For 
example in Ada, every array has a `Length` attribute and it 
returns an `integer` type.

And since Ada is a very strongly typed language, there is no 
choice but to explicitly cast both the numerator and denominator. 
Ada will not allow you to cast only the numerator and allow the 
compiler to implicitly cast the denominator as is the case in 
most C-based languages. This will not compile:
`Avg := Float(Sum) / Len;`

```
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Float_Text_IO; use Ada.Float_Text_IO;

procedure Main is
    type Int_Array is array (Positive range <>) of Integer;

    A   : Int_Array := (-5000, 0);
    Len : Integer := A'Length;
    Sum : Integer := A(1) + A(2);
    Avg : Float;
begin
    Avg := Float(Sum) / Float(Len);
    Put(Avg, 1, 2, 0);
    New_Line;
end Main;
```
Output: -2500.00
Feb 09
prev sibling parent Danilo <codedan aol.com> writes:
On Friday, 9 February 2024 at 12:15:29 UTC, Sergey wrote:
 Rust, Nim, Zig, Odin…?

 Here is the Forum for D(lang). ;)
But it is fine to see what others have.. Teach on their experience is useful This is how research is going
Sorry, I probably got confused by the use of different languages in every posting.
Feb 09
prev sibling parent reply thinkunix <thinkunix zoho.com> writes:
Kevin Bailey via Digitalmars-d-learn wrote:
 On Thursday, 8 February 2024 at 08:23:12 UTC, thinkunix wrote:
 I would never write code like this.
By all means, please share with us how you would have written that just as elegantly but "correct".
First off, I am just a beginner with D. I joined this list to try to learn more about the language, not to butt heads with experts. I'm sorry if you took my response that way. My post was merely to show how, with my rudimentary knowledge, I could get the loop to execute 4 times, which appeared (to me) to be the intent of your code. Thank you for the exercise. I learned more about the D type system.

I said I would not write code like that because:
* why start at -1 if array indexes start at 0?
* why use auto which made the type different than what .length is?

You provided no context or comment indicating what you were trying to achieve by starting with -1. Clearly I didn't understand your intent.
 It would also break if the array 'something' had more than int.max 
 elements.
Then don't cast it to an int. First of all, why didn't you cast it to a long?
I only "cast(int)something.length" so the type would match the type that "auto i = -1" would get, which was int, and this was to prevent comparing incompatible types, which caused the conversion, and the loop not to execute at all. As a beginner, I would expect that if you mismatch types, you can expect bad things to happen, and this is probably true in any language. If your issue is that the compiler didn't catch this, shouldn't you raise the issue on a compiler internals list? Maybe I've misunderstood the purpose of d-learn "Questions about learning and using D". scot
Feb 09
next sibling parent reply bachmeier <no spam.net> writes:
On Friday, 9 February 2024 at 11:00:09 UTC, thinkunix wrote:

 If your issue is that the compiler didn't catch this, shouldn't 
 you
 raise the issue on a compiler internals list?  Maybe I've 
 misunderstood
 the purpose of d-learn "Questions about learning and using D".
It's been discussed many, many times. The behavior is not going to change - there won't even be a compiler warning. (You'll have to check with the leadership for their reasons.) I think something like this, which is such an obviously bad design and hits so many new users, should be discussed in the learn forum so new users are aware of what's going on.
Feb 09
parent reply Nick Treleaven <nick geany.org> writes:
On Friday, 9 February 2024 at 15:19:32 UTC, bachmeier wrote:
 It's been discussed many, many times. The behavior is not going 
 to change - there won't even be a compiler warning. (You'll 
 have to check with the leadership for their reasons.)
Was (part of) the reason because it would disrupt existing code? If that was the blocker then editions are the solution.
Feb 12
next sibling parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Mon, Feb 12, 2024 at 05:26:25PM +0000, Nick Treleaven via
Digitalmars-d-learn wrote:
 On Friday, 9 February 2024 at 15:19:32 UTC, bachmeier wrote:
 It's been discussed many, many times. The behavior is not going to
 change - there won't even be a compiler warning. (You'll have to
 check with the leadership for their reasons.)
Was (part of) the reason because it would disrupt existing code? If that was the blocker then editions are the solution.
Honestly, I think this issue is blown completely out of proportion. The length of stuff in any language needs to be some type. D decided on an unsigned type. You just learn that and adapt your code accordingly, end of story. Issues like these can always be argued both ways, and the amount of energy spent in these debates far outweigh the trivial workarounds in code, of which there are many (use std.conv.to for bounds checks, just outright cast it if you know what you're doing (or just foolhardy), use CheckedInt, etc.). And the cost of any change to the type now also far, far outweighs any meager benefits it may have brought. It's just not worth it, IMNSHO. T -- Verbing weirds language. -- Calvin (& Hobbes)
Feb 12
next sibling parent reply bachmeier <no spam.net> writes:
On Monday, 12 February 2024 at 18:22:46 UTC, H. S. Teoh wrote:

 Honestly, I think this issue is blown completely out of 
 proportion.
Only for people that don't have to deal with the problems it causes.
 D decided on an unsigned type. You just learn that and adapt 
 your code accordingly, end of story.  Issues like these can 
 always be argued both ways, and the amount of energy spent in 
 these debates far outweigh the trivial workarounds in code, of 
 which there are many (use std.conv.to for bounds checks, just 
 outright cast it if you know what you're doing (or just 
 foolhardy), use CheckedInt, etc.).
A terrible language is one that makes you expend your energy thinking about workarounds rather than solving your problems. The default should be code that works. The workarounds should be for cases where you want to do something extremely unusual like subtracting from an unsigned type and having it wrap around.
Feb 12
parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Mon, Feb 12, 2024 at 07:34:36PM +0000, bachmeier via Digitalmars-d-learn
wrote:
 On Monday, 12 February 2024 at 18:22:46 UTC, H. S. Teoh wrote:
 
 Honestly, I think this issue is blown completely out of proportion.
Only for people that don't have to deal with the problems it causes.
I've run into size_t vs int issues many times. About half the time it exposed fallacious assumptions on my part about value types. The other half of the time a simple cast or std.conv.to invocation solved the problem. My guess is that most common use of .length in your typical D code is in (1) passing it to code that expect a length for various reasons, and (2) in loop conditions to avoid overrunning a buffer or overshooting some range. (1) is a non-problem, 90% of (2) is solved by using constructs like foreach() and/or ranges instead of overly-clever arithmetic involving length, which is almost always wrong or unnecessary. If you need to do subtraction with lengths, that's a big red flag that you're approaching your problem from the wrong POV. About the only time you need to do arithmetic with lengths is in low-level code like allocators or array copying, for which you really should be using higher-level constructs instead.
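To make the point about length arithmetic concrete, here is a small sketch (the helper name `countAscendingPairs` is hypothetical). The tempting loop bound `a.length - 1` wraps around to `size_t.max` for an empty array, while a `foreach` range that starts at 1 needs no subtraction on the length at all:

```d
import std.stdio;

// Hypothetical helper: visits adjacent pairs without subtracting from length.
size_t countAscendingPairs(const int[] a)
{
    size_t n = 0;
    // Starting at 1 avoids `a.length - 1`, which wraps to size_t.max
    // when the array is empty. For an empty or one-element array the
    // range 1 .. a.length is empty and the body never runs.
    foreach (i; 1 .. a.length)
    {
        if (a[i - 1] < a[i])
            ++n;
    }
    return n;
}

void main()
{
    int[] empty;
    writeln(empty.length - 1 == size_t.max);     // true: the wrap-around
    writeln(countAscendingPairs(empty));         // 0
    writeln(countAscendingPairs([1, 2, 2, 3]));  // 2
}
```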
 D decided on an unsigned type. You just learn that and adapt your
 code accordingly, end of story.  Issues like these can always be
 argued both ways, and the amount of energy spent in these debates
 far outweigh the trivial workarounds in code, of which there are
 many (use std.conv.to for bounds checks, just outright cast it if
 you know what you're doing (or just foolhardy), use CheckedInt,
 etc.).
A terrible language is one that makes you expend your energy thinking about workarounds rather than solving your problems. The default should be code that works. The workarounds should be for cases where you want to do something extremely unusual like subtracting from an unsigned type and having it wrap around.
Yes, if I had my way, implicit conversions to/from unsigned types should be a compile error. As should comparisons between signed/unsigned values.

But regardless, IMNSHO any programmer worth his wages ought to learn what an unsigned type is and how it works. A person should not be writing code if he can't even be bothered to learn how the machine that he's programming actually works. To quote Knuth:

 People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise the programs they write will be pretty weird. -- D. Knuth

One of the reasons Walter settled on size_t being unsigned is that this reflects how the hardware actually works. Computer arithmetic is NOT highschool arithmetic; you do not have infinite width nor infinite precision, and you're working with binary, not decimal. This has consequences, and having the language pretend the distinction doesn't exist does not solve any problems.

If an architectural astronaut works at such a high level of abstraction that he doesn't even understand basic things about the hardware, like how uint or ulong work and how to use them correctly, maybe he should be promoted to a managerial role instead of writing code.

T

-- You are only young once, but you can stay immature indefinitely. -- azephrahel
Feb 12
parent reply Ivan Kazmenko <gassa mail.ru> writes:
On Monday, 12 February 2024 at 19:56:09 UTC, H. S. Teoh wrote:
 But regardless, IMNSHO any programmer worth his wages ought to 
 learn what an unsigned type is and how it works. A person 
 should not be writing code if he can't even be bothered to 
 learn how the machine that he's programming actually works.
I'd like to note that even C++, from C++20 onwards, has `std::ssize`, which returns a signed size.

I do use lengths in arithmetic sometimes, and that leads to silent bugs currently. On the other hand, since going from 16 bits to 32 and then 64, in my user-side programs, I have had exactly zero bugs caused by some length being 2^{31} or greater while still being less than 2^{32}. So, in D, I usually `to!int` or `to!long` them anyway. Or cast in performance-critical places.

Another perspective. Imagine a different perfect world, where programmers just had 64-bit integers and 64-bit address space, everywhere, from the start. A clean slate, engineers and programmers designing their first hardware and languages, but with such sizes already feasible. Kinda weird, but bear with me a bit. Now, imagine someone proposing to make sizes unsigned. Wouldn't that be a strange thing to do? The benefit of having a universal arithmetic type for everything, from the ground up -- instead of two competing types producing bugs at glue points -- seems to far outweigh any potential gains. Unsigned integers could have their small place, too, for bit masks and microoptimizations and whatnot, but why sizes? The few applications that really benefit from sizes of [2^{63}..2^{64}) would be the most odd ones, deserving some workarounds.

Right now though, we just have to deal with the legacy, in software, hardware, and mind -- and with the fact that quite a few environments are not 64-bit.

Ivan Kazmenko.
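The `to!int` habit mentioned above can be sketched as follows. This is only an illustration: `signedLength` is a hypothetical helper name, while `std.conv.to` and `ConvOverflowException` are the standard Phobos facilities.

```d
import std.conv : to, ConvOverflowException;
import std.stdio;

// Hypothetical helper: array length as a signed int, range-checked by std.conv.to.
int signedLength(T)(const T[] a)
{
    return a.length.to!int; // throws ConvOverflowException if length > int.max
}

void main()
{
    int[] data = [10, 20, 30];

    // Unsigned arithmetic: 3 - 5 wraps around to a huge positive value.
    writeln(data.length - 5 > 0);

    // Signed arithmetic after the checked conversion gives the expected -2.
    writeln(signedLength(data) - 5);

    // A value that does not fit is rejected instead of being silently truncated.
    try
    {
        ulong big = ulong.max;
        writeln(big.to!int);
    }
    catch (ConvOverflowException e)
    {
        writeln("conversion refused");
    }
}
```

The checked conversion costs a comparison at run time, which is why a plain `cast` may still be preferred in hot loops, as the post says.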
Feb 13
next sibling parent reply Kevin Bailey <keraba yahoo.com> writes:
On Tuesday, 13 February 2024 at 23:57:12 UTC, Ivan Kazmenko wrote:
 I'd like to note that even C++, from C++20 onwards, has `std::ssize`, 
 which returns a signed size.

 I do use lengths in arithmetic sometimes, and that leads to 
 silent bugs currently.  On the other hand, since going from 16 
 bits to 32 and then 64, in my user-side programs, I had a flat 
 zero bugs because some length was 2^{31} or greater -- but at 
 the same time not 2^{32} or greater.  So, in D, I usually 
 `to!int` or `to!long` them anyway.  Or cast in 
 performance-critical places.

 Another perspective.  Imagine a different perfect world, where 
 programmers just had 64-bit integers and 64-bit address space, 
 everywhere, from the start.  A clean slate, engineers and 
 programmers designing their first hardware and languages, but 
 with such sizes already feasible.  Kinda weird, but bear with 
 me a bit.  Now, imagine someone proposing to make sizes 
 unsigned.  Wouldn't that be a strange thing to do?  The benefit 
 of having a universal arithmetic type for everything, from the 
 ground up -- instead of two competing types producing bugs at 
 glue points -- seems to far outweigh any potential gains.  
 Unsigned integers could have their small place, too, for bit 
 masks and microoptimizations and whatnot, but why sizes?  The 
 few applications that really benefit from sizes of 
 [2^{63}..2^{64}) would be the most odd ones, deserving some 
 workarounds.

 Right now though, we just have to deal with the legacy, in 
 software, hardware, and mind -- and with the fact that quite 
 some environments are not 64-bit.

 Ivan Kazmenko.
Personally, I don't have a problem with .length being unsigned. How do you have a negative length? My problem is that the language doesn't correctly compare signed and unsigned values. Earlier in the thread, people asked that other languages not be brought up, but I just learned that Carbon correctly compares signed and unsigned ints. cheers
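For readers wondering what a "correct" mixed comparison would look like, here is a minimal sketch in D. The function `lessThan` is a hypothetical helper written for this illustration; it is not how Carbon or any D library implements the comparison.

```d
import std.stdio;

// Hypothetical helper: compares a signed and an unsigned value by their
// mathematical values rather than by their bit patterns.
bool lessThan(long a, ulong b)
{
    // A negative number is smaller than every unsigned value;
    // otherwise it is safe to compare in the unsigned domain.
    return a < 0 || cast(ulong) a < b;
}

void main()
{
    int i = -1;
    size_t len = 3;

    writeln(i < len);          // built-in: i is converted to unsigned first
    writeln(lessThan(i, len)); // sign-aware comparison
}
```

The built-in comparison prints `false` because `-1` becomes the largest unsigned value before comparing, while the sign-aware version prints `true`.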
Feb 13
parent Siarhei Siamashka <siarhei.siamashka gmail.com> writes:
On Wednesday, 14 February 2024 at 00:56:21 UTC, Kevin Bailey 
wrote:
 Personally, I don't have a problem with .length being unsigned. 
 How do you have a negative length? My problem is that the 
 language doesn't correctly compare signed and unsigned.
The length itself is technically the index of a non-existing element right after the array. And -1 is technically the index of a non-existing element right before the array. Hence mechanically reversing the direction in which array elements are processed during refactoring can be dangerous if one is not careful.
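The reversal hazard described above can be sketched like this. The function name `indicesBackward` is made up for the example; the loop form shown is one common way to count down with an unsigned index, and `foreach_reverse` is another.

```d
import std.stdio;

// Hypothetical helper: returns the indices of `a` in the order a backward
// loop visits them.
size_t[] indicesBackward(const int[] a)
{
    size_t[] visited;

    // The mechanical reversal
    //     for (size_t i = a.length - 1; i >= 0; --i)
    // is wrong: `i >= 0` is always true for an unsigned i, and decrementing
    // 0 wraps around to size_t.max, far outside the array.

    // Test-then-decrement keeps i inside [0, a.length) in the loop body.
    for (size_t i = a.length; i-- > 0;)
        visited ~= i;

    return visited;
}

void main()
{
    writeln(indicesBackward([7, 8, 9])); // indices 2, 1, 0

    int[] empty;
    writeln(indicesBackward(empty));     // no iterations at all
}
```

When the index itself is not needed, `foreach_reverse (x; a)` avoids the index arithmetic entirely.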
Feb 17
prev sibling parent Kagamin <spam here.lot> writes:
On Tuesday, 13 February 2024 at 23:57:12 UTC, Ivan Kazmenko wrote:
 I do use lengths in arithmetic sometimes, and that leads to 
 silent bugs currently.  On the other hand, since going from 16 
 bits to 32 and then 64, in my user-side programs, I had a flat 
 zero bugs because some length was 2^{31} or greater -- but at 
 the same time not 2^{32} or greater.  So, in D, I usually 
 `to!int` or `to!long` them anyway.  Or cast in 
 performance-critical places.
I had a similar bug in C++: the find function returns the npos sentinel value when nothing is found. The result was assigned to a uint and then didn't match npos on comparison, but it would have if both were signed.
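The same effect can be reproduced in D as a rough sketch. The constant `npos` below is defined locally to mimic the C++ sentinel and is an assumption of this example; `std.string.indexOf` is the real Phobos function, which reports "not found" as a signed -1.

```d
import std.stdio;
import std.string : indexOf;

void main()
{
    // Unsigned sentinel, mimicking C++ npos (defined here for illustration).
    enum ulong npos = ulong.max;
    uint narrowed = cast(uint) npos; // keeps only the low 32 bits
    writeln(narrowed == npos);       // widening zero-extends, so no match

    // Signed sentinel: narrowing and widening both preserve -1.
    long notFound = -1;
    int narrowedSigned = cast(int) notFound;
    writeln(narrowedSigned == notFound);

    // Phobos uses a signed result for searches.
    ptrdiff_t pos = "hello".indexOf('z');
    writeln(pos == -1);
}
```

The first comparison prints `false` and the other two print `true`. Note that D requires the explicit `cast(uint)` here, whereas C++ performs the narrowing implicitly.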
Feb 16
prev sibling parent reply Nick Treleaven <nick geany.org> writes:
On Monday, 12 February 2024 at 18:22:46 UTC, H. S. Teoh wrote:
 On Mon, Feb 12, 2024 at 05:26:25PM +0000, Nick Treleaven via 
 Digitalmars-d-learn wrote:
 On Friday, 9 February 2024 at 15:19:32 UTC, bachmeier wrote:
 It's been discussed many, many times. The behavior is not 
 going to change - there won't even be a compiler warning. 
 (You'll have to check with the leadership for their reasons.)
Was (part of) the reason because it would disrupt existing code? If that was the blocker then editions are the solution.
Honestly, I think this issue is blown completely out of proportion. The length of stuff in any language needs to be some type. D decided on an unsigned type. You just learn that and adapt your code accordingly, end of story. Issues like these can always be argued both ways, and the amount of energy spent in these debates far outweighs the trivial workarounds in code, of which there are many (use std.conv.to for bounds checks, just outright cast it if you know what you're doing (or just foolhardy), use CheckedInt, etc.). And the cost of any change to the type now also far, far outweighs any meager benefits it may have brought. It's just not worth it, IMNSHO.
I don't want the type of .length to change; that would indeed be too disruptive. What I want is proper diagnostics, like those of any well-regarded C compiler, when I mix or implicitly convert unsigned and signed types. Due to D's generic abilities, it's easier to make wrong assumptions about whether some integer is signed or unsigned. But even without that, C compilers accepted that this is a task for the compiler to diagnose rather than humans, because it is too bug-prone for humans.
Feb 13
parent "H. S. Teoh" <hsteoh qfbox.info> writes:
On Tue, Feb 13, 2024 at 06:36:22PM +0000, Nick Treleaven via
Digitalmars-d-learn wrote:
 On Monday, 12 February 2024 at 18:22:46 UTC, H. S. Teoh wrote:
[...]
 Honestly, I think this issue is blown completely out of proportion.
 The length of stuff in any language needs to be some type. D decided
 on an unsigned type. You just learn that and adapt your code
 accordingly, end of story.  Issues like these can always be argued
 both ways, and the amount of energy spent in these debates far
 outweighs the trivial workarounds in code, of which there are many
 (use std.conv.to for bounds checks, just outright cast it if you
 know what you're doing (or just foolhardy), use CheckedInt, etc.).
 And the cost of any change to the type now also far, far outweighs
 any meager benefits it may have brought.  It's just not worth it,
 IMNSHO.
I don't want the type of .length to change; that would indeed be too disruptive. What I want is proper diagnostics, like those of any well-regarded C compiler, when I mix or implicitly convert unsigned and signed types.
I agree, mixing signed/unsigned types in the same expression ought to require a cast, and error out otherwise. Allowing them to be freely mixed, or worse, implicitly convert to each other, is just too error-prone.
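Until the compiler offers such a diagnostic, something similar can be approximated in library code. The following is only a sketch under that assumption: `sameSignLess` is a hypothetical helper, built on the real `std.traits.isIntegral` and `std.traits.isSigned` templates, that refuses to compile when the two operands differ in signedness.

```d
import std.stdio;
import std.traits : isIntegral, isSigned;

// Hypothetical helper: `a < b`, but only for operands of the same signedness.
bool sameSignLess(A, B)(A a, B b)
if (isIntegral!A && isIntegral!B)
{
    static assert(isSigned!A == isSigned!B,
        "mixed signed/unsigned comparison; add an explicit cast");
    return a < b;
}

void main()
{
    int[] arr = [1, 2, 3];
    int i = -1;

    // Mixing int with size_t is rejected at compile time.
    static assert(!__traits(compiles, sameSignLess(i, arr.length)));

    // With an explicit cast the intent is visible and the call compiles.
    writeln(sameSignLess(i, cast(long) arr.length));
}
```

The obvious drawback is that every comparison has to go through the helper, which is exactly the kind of manual discipline a compiler diagnostic would make unnecessary.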
 Due to D's generic abilities, it's easier to make wrong assumptions
 about whether some integer is signed or unsigned. But even without
 that, C compilers accepted that this is a task for the compiler to
 diagnose rather than humans, because it is too bug-prone for humans.
Indeed. T -- Живёшь только однажды.
Feb 13
prev sibling parent bachmeier <no spam.net> writes:
On Monday, 12 February 2024 at 17:26:25 UTC, Nick Treleaven wrote:
 On Friday, 9 February 2024 at 15:19:32 UTC, bachmeier wrote:
 It's been discussed many, many times. The behavior is not 
 going to change - there won't even be a compiler warning. 
 (You'll have to check with the leadership for their reasons.)
Was (part of) the reason because it would disrupt existing code? If that was the blocker then editions are the solution.
I don't want to write a speculative answer on Walter's reasoning, but I know that (a) this has come up many times, and (b) I've never seen him express an opinion that anything in the language related to unsigned types is problematic. I can't imagine that he has any intention of changing it, given the number of times it's been raised, but I can't claim any special knowledge of his views.
Feb 12
prev sibling parent Kevin Bailey <keraba yahoo.com> writes:
On Friday, 9 February 2024 at 11:00:09 UTC, thinkunix wrote:
 First off, I am just a beginner with D.  I joined this list 
 to try to
 learn more about the language, not to butt heads with experts.  
 I'm sorry
 if you took my response that way.
Hi thinkunix, I did interpret your post as critical. Sorry if it wasn't intended to be and my reply had a little too much heat. I still think my reply was at least accurate, so replies below.
 My post was merely to show how, with my rudimentary knowledge, 
 I could
 get the loop to execute 4 times, which appeared (to me) to be 
 the intent of your code.  Thank you for the exercise.  I 
 learned more about the D type system.

 I said I would not write code like that because:
 * why start at -1 if array indexes start at 0?
The program that I was writing was most elegant doing that. Obviously I wasn't doing something as simple as the example. The post is simply to highlight the issue. Unfortunately I can't find the examples now. The code has been altered so grepping isn't finding it and, since AoC has 25 days, I'm not sure which ones it was. It /might/ have been this, where 'where_to_start' is signed and can be negative. It's a weird index of indexes thing, and quite unconventional.

    // Try it in the remaining groups.
    for (auto i = where_to_start; i < num_ss.length; ++i)
 * why use auto which made the type different than what .length 
 is?
Google "almost always auto" for why you should prefer it (don't miss Herb Sutter's post), but as someone else pointed out, it's no better with 'int' *or* 'ulong'. The "best" solution is to cast the returned length to long. This makes it work and, unless you're counting the atoms in the universe, should be sufficient on a reasonable machine. This is why I brought up the example. zjh was lamenting having to cast, as am I, let alone having to think this hard about it.
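A minimal sketch of the cast described above, applied to the loop from the start of the thread. The function names `countWithoutCast` and `countWithCast` are made up for this illustration.

```d
import std.stdio;

// Hypothetical helper: the original loop, comparing int against size_t.
int countWithoutCast(const char[] s)
{
    int n = 0;
    for (auto i = -1; i < s.length; ++i) // i is converted to unsigned here
        ++n;
    return n;
}

// Hypothetical helper: the same loop with the length cast to long.
int countWithCast(const char[] s)
{
    int n = 0;
    for (auto i = -1; i < cast(long) s.length; ++i) // signed comparison
        ++n;
    return n;
}

void main()
{
    char[] something = ['a', 'b', 'c'];
    writeln(countWithoutCast(something)); // the body never runs
    writeln(countWithCast(something));    // runs for i = -1, 0, 1, 2
}
```

The first call prints `0` and the second prints `4`. The cast is only safe for lengths up to `long.max`, which is the "atoms in the universe" caveat from the post.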
 You provided no context, or comment indicating what you were 
 trying
 to achieve by starting with -1.  Clearly I didn't understand 
 your
 intent.
I wasn't asking a question. I know how to code this in D and I made it work. My post was to highlight the completely unnecessary need to cast. What happens when there's a more reasonable example? What happens when you need to compare 2 library function results when one is signed and the other not? Would you even know that you had to cast one? Or would you just get strange results and not know why? The issue exists completely outside of my example.

Since you sound new, I'll mention that, yes, what I'm proposing can be a hair slower. But so what? 99 out of 100 programs won't notice and, if one does, your profiler will tell you where, you add the cast (or you add it ahead of time, since that's what we have to do now), done.

I understand that it is almost certainly too late for D, but the world seems ready for an alternative to C++ and lots of languages are coming. This is just part of that discussion, right?
Feb 09
prev sibling parent Kagamin <spam here.lot> writes:
On Thursday, 8 February 2024 at 05:56:57 UTC, Kevin Bailey wrote:
 How many times does the following loop print? I ran into this 
 twice doing the AoC exercises. It would be nice if it Just 
 Worked.
 ```
 import std.stdio;

 int main()
 {
   char[] something = ['a', 'b', 'c'];

   for (auto i = -1; i < something.length; ++i)
         writeln("less than");

   return 0;
 }
 ```
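Try this:
```
import std.stdio;

// Returns the array length as an int, asserting that it fits.
int ilength(T)(in T[] a)
{
    assert(a.length <= int.max);
    return cast(int) a.length;
}

int main()
{
    char[] something = ['a', 'b', 'c'];

    // Both operands are int now, so the comparison is signed.
    for (auto i = -1; i < something.ilength; ++i)
        writeln("less than");

    return 0;
}
```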
Feb 16
prev sibling parent Kagamin <spam here.lot> writes:
I have an idea to estimate how long strlen takes on an exabyte 
string.
Jan 29
prev sibling parent monkyyy <crazymonkyyy gmail.com> writes:
On Thursday, 18 January 2024 at 02:55:37 UTC, zjh wrote:
 Can you change the type of 'length' from 'ulong' to 'int', so I 
 haven't to convert it every time!
The devs are obviously very, very wrong here; I underflow indexes all the time. But this is a pretty dead fight. I'd aim for a smart index type that's a "checkedint" with underflow protection and can alias to int, because maybe that could someday happen.
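One possible shape for such an index type is sketched below. Everything about `Index` is an assumption made for illustration (its name, its fields, and the choice to throw on underflow); only `core.checkedint.subu` is an existing druntime function. This version aliases to `size_t` rather than `int` to keep the sketch short.

```d
import core.checkedint : subu;
import std.stdio;

// Hypothetical index type: behaves like size_t but refuses to underflow.
struct Index
{
    size_t value;
    alias value this; // converts implicitly wherever a size_t is expected

    // Checked subtraction: throws instead of wrapping below zero.
    Index opBinary(string op : "-")(size_t rhs) const
    {
        bool overflow;
        immutable r = subu(value, rhs, overflow);
        if (overflow)
            throw new Exception("index underflow");
        return Index(r);
    }
}

void main()
{
    auto i = Index(1);
    i = i - 1;      // fine: 1 - 1 == 0
    size_t raw = i; // implicit conversion through alias this
    writeln(raw);

    try
    {
        i = i - 1;  // 0 - 1 would wrap, so this throws
    }
    catch (Exception e)
    {
        writeln(e.msg);
    }
}
```

Only subtraction is guarded here; other operators fall back to plain `size_t` arithmetic through `alias this`, so a fuller design would need to cover those as well.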
Jan 28