
digitalmars.D - empty arrays and cast(bool): WAT

reply Timon Gehr <timon.gehr gmx.ch> writes:
Why does the following code behave funny?

import std.stdio;

void main(){
     string x = " "[1..1];
     writeln(cast(bool)x); // true
     char[] y = [' '][1..1];
     writeln(cast(bool)y); // false
}

Is there any reason for empty arrays to evaluate to true? This is very 
bug prone.
Feb 18 2012
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Timon Gehr:

 Is there any reason for empty arrays to evaluate to true? This is very 
 bug prone.
Related? http://d.puremagic.com/issues/show_bug.cgi?id=4733 Bye, bearophile
Feb 18 2012
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 02/19/2012 03:24 AM, bearophile wrote:
 Timon Gehr:

 Is there any reason for empty arrays to evaluate to true? This is very
 bug prone.
Related? http://d.puremagic.com/issues/show_bug.cgi?id=4733 Bye, bearophile
Roughly. I want it to be allowed, but to test the length instead of ptr.
Feb 18 2012
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, February 19, 2012 04:13:02 Timon Gehr wrote:
 On 02/19/2012 03:24 AM, bearophile wrote:
 Timon Gehr:
 Is there any reason for empty arrays to evaluate to true? This is very
 bug prone.
Related? http://d.puremagic.com/issues/show_bug.cgi?id=4733 Bye, bearophile
Roughly. I want it to be allowed, but to test the length instead of ptr.
That would potentially break code. It also makes arrays less consistent with other types that can be null, but the weirdness with [] == null already does that on some level. My general take on it is that you should just be aware that you need to use arr.empty when you mean empty and arr !is null when you mean null. So, if(arr) is the kind of code that should just be avoided with arrays. - Jonathan M Davis
Feb 18 2012
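As an illustration of the advice above, here is a minimal sketch of checking explicitly for "empty" versus "null" instead of relying on if(arr). The helper names isEmptyArray and isNullArray are hypothetical and exist only for this example; empty is the range primitive from Phobos.

```d
import std.range : empty;
import std.stdio : writeln;

// Hypothetical helpers, named for this illustration only.
bool isEmptyArray(const(int)[] arr) { return arr.empty; }   // same as arr.length == 0
bool isNullArray(const(int)[] arr)  { return arr is null; } // true only for a null array

void main()
{
    int[] nothing = null;             // null array: no pointer, no elements
    int[] data = [1, 2, 3];
    int[] emptySlice = data[1 .. 1];  // empty, but still points into data

    writeln(isEmptyArray(nothing), " ", isNullArray(nothing));       // true true
    writeln(isEmptyArray(emptySlice), " ", isNullArray(emptySlice)); // true false
    writeln(isEmptyArray(data), " ", isNullArray(data));             // false false
}
```

Spelling out which of the two checks is meant avoids depending on what cast(bool) does for arrays.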
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 02/19/2012 04:17 AM, Jonathan M Davis wrote:
 On Sunday, February 19, 2012 04:13:02 Timon Gehr wrote:
 On 02/19/2012 03:24 AM, bearophile wrote:
 Timon Gehr:
 Is there any reason for empty arrays to evaluate to true? This is very
 bug prone.
Related? http://d.puremagic.com/issues/show_bug.cgi?id=4733 Bye, bearophile
Roughly. I want it to be allowed, but to test the length instead of ptr.
That would potentially break code.
It would also potentially fix code.
 It also makes arrays less consistent with
 other types that can be null,
If so, then in a good way.
 but the weirdness with [] == null already does
 that on some level.
If you think that is weird, consider this: static assert([1,2,3] == [1,2,3]);
 My general take on it is that you should just be aware that you need to use
 arr.empty when you mean empty and arr !is null when you mean null. So,

 if(arr)

 is the kind of code that should just be avoided with arrays.
Why? (step back a moment and stop thinking about what you know about DMD's implementation semantics and then answer)
Feb 18 2012
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, February 19, 2012 04:33:14 Timon Gehr wrote:
 Why? (step back a moment and stop thinking about what you know about
 DMD's implementation semantics and then answer)
Because of the whole nonsense of null and empty being treated the same in some circumstances and not in others. I'm immediately suspicious of any array code which doesn't explicitly check for empty or null but relies on stuff like cast(bool)arr. The odds are too high that the programmer meant one and got the other. - Jonathan M Davis
Feb 18 2012
prev sibling parent reply "Bernard Helyer" <b.helyer gmail.com> writes:
On Sunday, 19 February 2012 at 03:33:14 UTC, Timon Gehr wrote:
 That would potentially break code.
It would also potentially fix code.
Well that's the stupidest thing I've read today. Can you point to people using it in the way that you expect? Besides which, changing the language's behaviour is just about the worst way to fix bugs in code.
Feb 18 2012
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 02/19/2012 04:56 AM, Bernard Helyer wrote:
 On Sunday, 19 February 2012 at 03:33:14 UTC, Timon Gehr wrote:
 That would potentially break code.
It would also potentially fix code.
Well that's the stupidest thing I've read today. Can you point to people using it in the way that you expect? Besides which, changing the language's behaviour is just about the worst way to fix bugs in code.
I was not being entirely serious here... Do you have any opinion about the topic?
Feb 18 2012
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 02/19/2012 05:01 AM, Timon Gehr wrote:
 On 02/19/2012 04:56 AM, Bernard Helyer wrote:
 On Sunday, 19 February 2012 at 03:33:14 UTC, Timon Gehr wrote:
 That would potentially break code.
It would also potentially fix code.
Well that's the stupidest thing I've read today. Can you point to people using it in the way that you expect? Besides which, changing the language's behaviour is just about the worst way to fix bugs in code.
I was not being entirely serious here...
Technically the statement is still true, of course.
Feb 18 2012
prev sibling parent reply "Bernard Helyer" <b.helyer gmail.com> writes:
On Sunday, 19 February 2012 at 04:01:45 UTC, Timon Gehr wrote:
 On 02/19/2012 04:56 AM, Bernard Helyer wrote:
 On Sunday, 19 February 2012 at 03:33:14 UTC, Timon Gehr wrote:
 That would potentially break code.
It would also potentially fix code.
Well that's the stupidest thing I've read today. Can you point to people using it in the way that you expect? Besides which, changing the language's behaviour is just about the worst way to fix bugs in code.
I was not being entirely serious here... Do you have any opinion about the topic?
I think it's too late to change the behaviour, and it's not too terrible, even if un-ideal.
Feb 18 2012
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 02/19/2012 06:15 AM, Bernard Helyer wrote:
 On Sunday, 19 February 2012 at 04:01:45 UTC, Timon Gehr wrote:
 On 02/19/2012 04:56 AM, Bernard Helyer wrote:
 On Sunday, 19 February 2012 at 03:33:14 UTC, Timon Gehr wrote:
 That would potentially break code.
It would also potentially fix code.
Well that's the stupidest thing I've read today. Can you point to people using it in the way that you expect? Besides which, changing the language's behaviour is just about the worst way to fix bugs in code.
I was not being entirely serious here... Do you have any opinion about the topic?
I think it's too late to change the behaviour,
It is a good time to change the behavior: We are still in a stage where almost every release of the reference compiler breaks some code.
 and it's not too terrible, even if un-ideal.
It is completely useless. It should rather be disabled as bearophile suggests and then be re-enabled with the sane semantics after enough time has passed.
Feb 18 2012
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, February 19, 2012 04:56:13 Bernard Helyer wrote:
 On Sunday, 19 February 2012 at 03:33:14 UTC, Timon Gehr wrote:
 That would potentially break code.
It would also potentially fix code.
Well that's the stupidest thing I've read today. Can you point to people using it in the way that you expect? Besides which, changing the language's behaviour is just about the worst way to fix bugs in code.
I've seen pull requests for code where people have done if(arr) thinking that it would check for empty. The bizarrities with null == [] cause subtle problems all the time. Now, whether more people have done if(arr) thinking that it checked null or more people have done it thinking that it checked empty, I don't know, but there's no question that some people do it thinking that it checks for empty. - Jonathan M Davis
Feb 18 2012
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, February 19, 2012 03:01:55 Timon Gehr wrote:
 Why does the following code behave funny?
 
 void main(){
      string x = " "[1..1];
      writeln(cast(bool)x); // true
      char[] y = [' '][1..1];
      writeln(cast(bool)y); // false
 }
 
 Is there any reason for empty arrays to evaluate to true? This is very
 bug prone.
Because they're not null. _null_ is what evaluates to false, not an empty array. But the fact that == treats null and [] as the same thing _does_ understandably cloud things a bit. But if(arr) is lowered to if(cast(bool)arr) which checks for null, not empty. So, this is fully expected. If you want to check whether an array is empty, then check whether it's empty, not whether it's true. - Jonathan M Davis
Feb 18 2012
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 02/19/2012 03:23 AM, Jonathan M Davis wrote:
 On Sunday, February 19, 2012 03:01:55 Timon Gehr wrote:
 Why does the following code behave funny?

 void main(){
       string x = " "[1..1];
       writeln(cast(bool)x); // true
       char[] y = [' '][1..1];
       writeln(cast(bool)y); // false
 }

 Is there any reason for empty arrays to evaluate to true? This is very
 bug prone.
Because they're not null. _null_ is what evaluates to false, not an empty array.
That is my point. It should not be that way. That is utter nonsense. It is a major wart. (Every literal with a null .ptr evaluates to false, btw, not just 'null')
 But the fact that == treats null and [] as the same thing _does_
 understandably cloud things a bit.
Nothing cloudy about that, that is sane behaviour and I want it to be consistently carried out everywhere. ('is' can still do a binary compare, that is what it is designed to do.)
 But

 if(arr)

 is lowered to

 if(cast(bool)arr)

 which checks for null, not empty. So, this is fully expected.
No, it is not what is expected, it is what is _observed_. It is almost certainly just a bug in the implementation and everyone thought it is part of the language. Consider this, and then don't tell me it is not a leftover from the ancient times when dynamic arrays implicitly converted to their .ptr fields:

struct DynArray{
     size_t length;
     int* ptr;
     alias ptr this;
}

void main(){
     DynArray a;
     int[] b;
     writeln(cast(bool)a); // false
     writeln(cast(bool)b); // false
     a.ptr = new int;
     writeln(cast(bool)a); // true
     *(cast(int**)&b+1) = new int; // "b.ptr = new int"
     writeln(cast(bool)b); // true
}
 If you want to check whether an array is empty, then check whether it's empty,
not whether
 it's true.
I can't help myself noticing that you always seem to defend the position that makes the code more verbose.
Feb 18 2012
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, February 19, 2012 04:24:46 Timon Gehr wrote:
 But the fact that == treats null and [] as the same thing _does_
 understandably cloud things a bit.
Nothing cloudy about that, that is sane behaviour and I want it to be consistently carried out everywhere. ('is' can still do a binary compare, that is what it is designed to do.)
It's lunacy IMHO. null and empty should _never_ have been conflated. They are two separate concepts. _That_ is a major wart in the language IMHO. We would have been better off to not have null arrays at all than this halfway nonsense. But that ship sailed ages ago.
 I can't help myself noticing that you always seem to defend the position
 that makes the code more verbose.
I hadn't noticed. If that's true, it's probably because it's more explicit. And with arrays, I don't trust them with regards to this sort of stuff precisely because null and empty have been conflated. So, I check for what I mean, whether that's null or empty. And I consider code like arr == null to be a sign that someone doesn't know what they're doing. Someone who understands how arrays work in D should either be doing arr.empty or arr.length == 0 if they're checking for empty and arr is null if they're checking for null. - Jonathan M Davis
Feb 18 2012
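The difference between == and is that this post relies on can be sketched as follows. The function names equalByValue and sameSlice are hypothetical, introduced only for this illustration: == compares lengths and elements, so every empty array compares equal to null, while is also compares the pointer.

```d
// Hypothetical helpers, named for this illustration only.
bool equalByValue(const(int)[] a, const(int)[] b) { return a == b; } // length and elements
bool sameSlice(const(int)[] a, const(int)[] b)    { return a is b; } // pointer and length

void main()
{
    int[] data = [1, 2, 3];
    int[] emptySlice = data[3 .. 3]; // empty, non-null pointer
    int[] nothing = null;

    assert(equalByValue(emptySlice, nothing)); // both have length 0
    assert(!sameSlice(emptySlice, nothing));   // but the pointers differ
    assert(emptySlice == null);                // true for any empty array
    assert(emptySlice !is null);               // only a null array "is null"
    assert(nothing is null);
}
```

This is why arr == null cannot tell a null array apart from an empty one, while arr is null can.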
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 02/19/2012 04:38 AM, Jonathan M Davis wrote:
 On Sunday, February 19, 2012 04:24:46 Timon Gehr wrote:
 But the fact that == treats null and [] as the same thing _does_
 understandably cloud things a bit.
Nothing cloudy about that, that is sane behaviour and I want it to be consistently carried out everywhere. ('is' can still do a binary compare, that is what it is designed to do.)
It's lunacy IMHO. null and empty should _never_ have been conflated. They are two separate concepts. _That_ is a major wart in the language IMHO. We would have been better off to not have null arrays at all than this halfway nonsense.
Agreed. I don't mind having it as a special value, but it should not have as far reaching consequences as it does right now.
 But that ship sailed ages ago.

 I can't help myself noticing that you always seem to defend the position
 that makes the code more verbose.
I hadn't noticed. If that's true, it's probably because it's more explicit.
Ok.
 And with arrays, I don't trust them with regards to this sort of stuff
 precisely because null and empty have been conflated. So, I check for what I
 mean, whether that's null or empty. And I consider code like arr == null to be
 a sign that someone doesn't know what they're doing.
Indeed. I have never needed arr == null. Comparing class references to null using '==' is illegal. Probably the same could be done for arrays. Would also increase consistency.
 Someone who understands how arrays work in D should either be doing arr.empty
or arr.length == 0 if
 they're checking for empty and arr is null if they're checking for null.

 - Jonathan M Davis
That is true. It is also what worries me. Having cast(bool) evaluate to false for empty arrays is intuitive and has massive precedent in other programming languages.
Feb 18 2012
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, February 19, 2012 04:52:48 Timon Gehr wrote:
 That is true. It is also what worries me. Having cast(bool) evaluate to
 false for empty arrays is intuitive and has massive precedent in other
 programming languages.
If we think that the problems solved by it are greater than those caused by it, then the change should be made (and some of your examples make the current behavior look pretty bizarre). But it's all wrapped up in the whole nonsense of how null is handled with arrays, which is broken by design IMHO. - Jonathan M Davis
Feb 18 2012
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Timon Gehr

 Indeed. I have never needed arr == null. Comparing class references to 
 null using '==' is illegal. Probably the same could be done for arrays. 
 Would also increase consistency.
Is this in Bugzilla already? Bye, bearophile
Feb 19 2012
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 02/19/2012 01:54 PM, bearophile wrote:
 Timon Gehr

 Indeed. I have never needed arr == null. Comparing class references to
 null using '==' is illegal. Probably the same could be done for arrays.
 Would also increase consistency.
Is this in Bugzilla already? Bye, bearophile
I don't think so.
Feb 19 2012
prev sibling next sibling parent reply "Daniel Murphy" <yebblies nospamgmail.com> writes:
I agree, and I doubt it was intentional.

if (arr) should mean if (arr.length && arr.ptr)
But since you can only get (arr.length != 0 && arr.ptr == null) when doing 
unsafe things with arrays, I think it's entirely reasonable to use
if (arr) -> if (arr.length)

This is what I expected.

Much like with class references, distinguishing between null and empty is 
not '=='s job, that is what 'is' is for.

"Timon Gehr" <timon.gehr gmx.ch> wrote in message 
news:jhpl6j$2m6c$1 digitalmars.com...
 Why does the following code behave funny?

 void main(){
     string x = " "[1..1];
     writeln(cast(bool)x); // true
     char[] y = [' '][1..1];
     writeln(cast(bool)y); // false
 }

 Is there any reason for empty arrays to evaluate to true? This is very bug 
 prone. 
Feb 18 2012
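The length-based reading proposed here, if (arr) -> if (arr.length), can be approximated today with an explicit helper. This is only a sketch of the proposal, not how the compiler treats if (arr); the name nonEmpty is hypothetical.

```d
// Hypothetical helper expressing the proposed semantics:
// an array is "true" exactly when it has at least one element.
bool nonEmpty(T)(const(T)[] arr) { return arr.length != 0; }

void main()
{
    string x = " "[1 .. 1];  // empty slice of a string literal
    char[] buf = [' '];
    char[] y = buf[1 .. 1];  // empty slice of a mutable array

    // Both empty slices give the same answer, regardless of their .ptr.
    assert(!nonEmpty(x));
    assert(!nonEmpty(y));
    assert(nonEmpty(buf));
}
```

With a test like this, the two slices from the original post behave identically, which is the consistency being asked for.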
parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Sun, 19 Feb 2012 06:06:19 -0000, Daniel Murphy  
<yebblies nospamgmail.com> wrote:

 I agree, and I doubt it was intentional.

 if (arr) should mean if (arr.length && arr.ptr)
 But since you can only get (arr.length != 0 && arr.ptr == null) when  
 doing
 unsafe things with arrays, I think it's entirely reasonable to use
 if (arr) -> if (arr.length)

 This is what I expected.
Just to provide a counterpoint, I would have expected if (arr) to check for null. Coming from a C background I expect the thing inside () to be compared to 0, and 'the thing' in my mind is the array itself, not any property of the array, length included.

Not saying checking for null is a better or worse idea, just saying that would be my initial impression of what it would do.

I too am unsettled by the conflation of null and empty - and argued against it on several occasions, but hey, too late now I suspect.

Regan
Feb 22 2012
parent reply bearophile <bearophileHUGS lycos.com> writes:
Regan Heath:

 I too am unsettled by the conflation of null and empty  
 - and argued against it on several occasions, but hey, too late now I  
 suspect.
It's not too late. Bye, bearophile
Feb 22 2012
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Feb 22, 2012 at 01:13:24PM -0500, bearophile wrote:
 Regan Heath:
 
 I too am unsettled by the conflation of null and empty  
 - and argued against it on several occasions, but hey, too late now I  
 suspect.
It's not too late.
[...]

I agree. Conflating null and empty has been (one of) the source(s) of stupidities like the === operator in Javascript and PHP. I mean, c'mon. Will the next generation of languages start introducing the ==== operator now?

We already have .empty, why should the language go out of its way to let you test an array with "if(array){...}"? Code like that is hard to read and fails to convey intent clearly. "if (array.empty)" and "if (array is null)" are much more self-documenting and less prone to misinterpretation.

T

-- 
"How are you doing?" "Doing what?"
Feb 22 2012
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 18 Feb 2012 21:01:55 -0500, Timon Gehr <timon.gehr gmx.ch> wrote:

 Why does the following code behave funny?

 void main(){
      string x = " "[1..1];
      writeln(cast(bool)x); // true
      char[] y = [' '][1..1];
      writeln(cast(bool)y); // false
 }

 Is there any reason for empty arrays to evaluate to true? This is very  
 bug prone.
Just to weigh in:

1. The most intuitive and useful thing is to check for arr.length.
2. Given that there are valid cases for checking null vs. empty, I think there should be a way to find usages of if(arr) for fixing legacy code.

To disallow if(arr) simply because of legacy code is a step in the *wrong* direction. The experience with arrays is going to be a major contributing factor to the enjoyment of using D. I'd rather see the compiler show you where cases of if(arr) appear, and allow you to judge whether those should be arr.ptr or not.

I agree with Timon, let's make it more intuitive, and break code, and let those people be able to fix that code somehow. If this means deprecating it for a time, so be it.

-Steve
Mar 05 2012