
digitalmars.D - Forbid dynamic arrays in boolean evaluation contexts

reply "bearophile" <bearophileHUGS lycos.com> writes:
A recent discussion in D.learn reminds me of an enhancement 
request of mine that has been sleeping in Bugzilla for years:

http://d.puremagic.com/issues/show_bug.cgi?id=4733


The problem is that in D dynamic arrays can be non-null even when 
they are empty:


import std.stdio;
int[] foo() {
     auto a = [1];
     return a[0..0];
}
void main() {
     auto data = foo();
     if (data)
         writeln("here");
}


This is dangerous, so in D the safe and idiomatic way to test for 
empty arrays is to use std.array.empty().
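
A minimal sketch of what the explicit tests see for such a slice, assuming a current DMD and Phobos. The helper name emptySlice is made up for illustration and mirrors foo() above; the asserts deliberately avoid if (data) itself and spell out each check:

```d
import std.array : empty;

// Hypothetical helper mirroring foo() above: an empty slice of a live array.
int[] emptySlice() {
    auto a = [1];
    return a[0 .. 0];
}

void main() {
    auto data = emptySlice();
    assert(data.length == 0);   // no elements
    assert(data.empty);         // same test through the range interface
    assert(data.ptr !is null);  // but it still points into the allocation
    assert(data !is null);      // so it is not identical to null

    int[] unset;                // default-initialized: null pointer, zero length
    assert(unset.empty);
    assert(unset is null);
}
```

Both arrays are empty, yet only one of them is null, which is why a pointer-based truth test and an emptiness test can disagree.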

So my proposal of Issue 4733 is to forbid (with the usual 
warning/deprecation intermediate steps) the use of dynamic arrays 
in a boolean context:


void main() {
     auto a = [1];
     if (a) {} // error, forbidden.
}


So to test empty/null you have to use empty() or "is null":

import std.array: empty;
void main() {
     auto a = [1];
     if (a.empty) {} // OK
     if (a is null) {} // OK
}


Bye,
bearophile
Mar 24 2013
next sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Sunday, 24 March 2013 at 22:10:06 UTC, bearophile wrote:
 So my proposal of Issue 4733 is to forbid (with the usual 
 warning/deprecation intermediate steps) the use of dynamic 
 arrays in a boolean context:
Seems like a sensible idea. I've never really understood why null is acceptable as a slice anyway. It's not a class, so it shouldn't be allowed.
Mar 25 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Peter Alexander:

 On Sunday, 24 March 2013 at 22:10:06 UTC, bearophile wrote:
 So my proposal of Issue 4733 is to forbid (with the usual 
 warning/deprecation intermediate steps) the use of dynamic 
 arrays in a boolean context:
Seems like a sensible idea. I've never really understood why null is acceptable as a slice anyway. It's not a class, so it shouldn't be allowed.
Maybe you are confusing issue 4733 with this other one:

http://d.puremagic.com/issues/show_bug.cgi?id=3889

Unlike Issue 4733, issue 3889 doesn't cause bugs, so it's less important.

Here I am not asking to disallow null as a slice. I am asking to disallow the implicit cast "dynamic array => bool", because it's a source of bugs and there are simple and more explicit ways to do the same thing if you want to.

Bye,
bearophile
Mar 25 2013
parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Monday, 25 March 2013 at 14:09:33 UTC, bearophile wrote:
 Seems like a sensible idea. I've never really understood why 
 null is acceptable as a slice anyway. It's not a class, so it 
 shouldn't be allowed.
Maybe you are confusing issue 4733 with this other one: http://d.puremagic.com/issues/show_bug.cgi?id=3889
Not confusing, just making a tangential remark. I understand your proposal :-)
Mar 25 2013
prev sibling next sibling parent reply "Andrea Fontana" <nospam example.com> writes:
On Sunday, 24 March 2013 at 22:10:06 UTC, bearophile wrote:
 import std.array: empty;
 void main() {
     auto a = [1];
     if (a.empty) {} // OK
     if (a is null) {} // OK
 }


 Bye,
 bearophile
why not a.length == 0 ?
Mar 25 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Andrea Fontana:

 why not a.length == 0 ?
a.length == 0 is OK of course; it's the same thing as writing a.empty.

Bye,
bearophile
Mar 25 2013
parent reply "Andrea Fontana" <nospam example.com> writes:
On Monday, 25 March 2013 at 14:20:27 UTC, bearophile wrote:
 Andrea Fontana:

 why not a.length == 0 ?
a.length == 0 is OK of course, it's the same thing as writing a.empty. Bye, bearophile
Is empty() there just to match range "interface"? length needs no import :)
Mar 25 2013
parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrea Fontana:

 Is empty() there just to match range "interface"?
Right, it's useful for arrays to fulfill the range protocol.
 length needs no import :)
On the other hand, in a longish module you sometimes need std.array for other purposes, so you have imported it already. "data.empty" is shorter, and it contains only two (psychological) chunks, while "data.length == 0" contains four. So empty makes the code simpler. And empty doesn't break the syntax uniformity of UFCS chains :-)

Bye,
bearophile
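
A small sketch of the UFCS point, assuming std.algorithm and std.array from Phobos. The data and predicates are invented for illustration:

```d
import std.algorithm : filter, map;
import std.array : empty;

void main() {
    int[] data = [1, 2, 3];

    // Plain array: empty comes from std.array.
    assert(!data.empty);

    // The same spelling keeps working at the end of a UFCS chain,
    // where the result is a lazy range rather than an array.
    assert(data.filter!(x => x > 5).empty);
    assert(!data.map!(x => x * 2).empty);
}
```

A filter result has no length property, so .empty is the only spelling that works for both the array and the chained range.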
Mar 25 2013
prev sibling next sibling parent reply "Daniel Murphy" <yebblies nospamgmail.com> writes:
"bearophile" <bearophileHUGS lycos.com> wrote in message 
news:bwgnbflygowctlisistg forum.dlang.org...
 So my proposal of Issue 4733 is to forbid (with the usual 
 warning/deprecation intermediate steps) the use of dynamic arrays in a 
 boolean context:
I am in favour of deprecating/removing this, then bringing it back so that if (arr) is the same as if (arr.length)
Mar 25 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Daniel Murphy:

 I am in favour of deprecating/removing this, then bringing it 
 back so that
 if (arr)
 is the same as
 if (arr.length)
That was my original idea :-) And it seems a nice idea. But turning something into a deprecation and later an error is useful because it doesn't introduce bugs, while if you later assign it some other semantics, old D code will behave differently (even though it was probably buggy code in the first place). So it's a matter of how much you want to break old D code.

Bye,
bearophile
Mar 25 2013
parent "Daniel Murphy" <yebblies nospamgmail.com> writes:
"bearophile" <bearophileHUGS lycos.com> wrote in message 
news:fqjmqdhpenxnfadxwgbx forum.dlang.org...
 Daniel Murphy:

 I am in favour of deprecating/removing this, then bringing it back so 
 that
 if (arr)
 is the same as
 if (arr.length)
That was my original idea :-) And it seems a nice idea. But turning something into a deprecation and later error is useful because it doesn't introduce bugs. While if you later assign it some other semantics, old D code will behave differently (despite it's probably potentially buggy code in the first place). So it's a matter of how much you want to break old D code. Bye, bearophile
The deprecation process is long and slow, so when it finally finishes the error stage we can evaluate how much legacy code we risk breaking. Even if we never bring in the new meaning, removing it will improve the language.
Mar 26 2013
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 25 Mar 2013 10:18:11 -0400, Daniel Murphy  
<yebblies nospamgmail.com> wrote:

 "bearophile" <bearophileHUGS lycos.com> wrote in message
 news:bwgnbflygowctlisistg forum.dlang.org...
 So my proposal of Issue 4733 is to forbid (with the usual
 warning/deprecation intermediate steps) the use of dynamic arrays in a
 boolean context:
I am in favour of deprecating/removing this, then bringing it back so that if (arr) is the same as if (arr.length)
I would favor just changing the behavior.

The existing behavior is most often a bug when it is used, because quite often arrays are null (that is the default value), and quite often people THINK they are checking whether the array is empty rather than whether the pointer is null. Because most empty arrays are null, and null arrays have zero length, their tests do not expose this bug. It's actually very difficult to generate a non-null empty array for testing. [] does not work; it returns a null array!

Only in rare cases do people actually want to check for null, and in most people's code it is an error when a non-null but empty array is checked with if(arr). If people are interested in the pointer, they usually check if(arr.ptr) or if(arr !is null). I would posit that for every knowledgeable person who uses this "shortcut" to check for null, 100 people use it expecting it to be measuring the array length.

If we go through a deprecation cycle, people who use if(arr) for length have to change their code to if(arr.length), then back to if(arr) once the deprecation is over. The switch to if(arr.length) will be simple, since the compiler will show all the places if(arr) is used, but the switch back will be more difficult since if(arr.length) will not be deprecated.

What about a switch that identifies all places where if(arr) is done? Then people who use if(arr) for testing if(arr.ptr) can find and fix their cases. Or even a separate tool to do that.

-Steve
Mar 25 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 I would favor just changing the behavior.
If you just change the behavior, then I suggest taking associative arrays into account too:

void main() {
     bool[int] aa;
     assert(!aa);
     aa = [1: true];
     assert(aa);
     aa.remove(1);
     assert(aa);
     assert(!aa.length);
}

Bye,
bearophile
Mar 25 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 25 Mar 2013 11:04:58 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Steven Schveighoffer:

 I would favor just changing the behavior.
If you just change the behavior, then I suggest to take in account associative arrays too: void main() { bool[int] aa; assert(!aa); aa = [1: true]; assert(aa); aa.remove(1); assert(aa); assert(!aa.length); }
This can be done without compiler changes. Just define opCast!bool for AssocArray (or whatever the template is, I can't remember). I do agree this should be done too. -Steve
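
To illustrate the mechanism only: the sketch below defines opCast!bool on a made-up struct, IntBag, so that its truth value follows its length. It says nothing about how the runtime's associative array type is actually implemented; it just shows what such a hook looks like on a user-defined type.

```d
// Hypothetical wrapper type, used only to illustrate opCast!bool.
struct IntBag {
    int[] items;

    // Consulted when an IntBag is used as a condition or cast to bool.
    bool opCast(T : bool)() const {
        return items.length != 0;
    }
}

void main() {
    IntBag bag;
    assert(!cast(bool) bag);        // no items: false

    bag.items = [1, 2, 3];
    if (bag) {} else assert(0);     // has items: the condition is true

    bag.items = bag.items[0 .. 0];
    assert(bag.items.ptr !is null); // non-null but empty...
    assert(!cast(bool) bag);        // ...and still false
}
```

With this hook the truth value depends only on the length, so a non-null empty payload and a null one behave the same.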
Mar 25 2013
parent "bearophile" <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 Just define opCast!bool for AssocArray (or whatever the 
 template is, I can't remember).
I opened an ER about this topic too :-) http://d.puremagic.com/issues/show_bug.cgi?id=3926 Bye, bearophile
Mar 25 2013
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Monday, 25 March 2013 at 14:46:27 UTC, Steven Schveighoffer 
wrote:
 I would favor just changing the behavior.
That would silently break my code.
Mar 25 2013
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Vladimir Panteleev:

 That would silently break my code.
Right, that's why I have asked to just forbid something. The proposal is a small breaking change either way, but turning something into an error allows you to find it quickly in most cases. (But D also has the compiles __trait, which does not give an error; it just silently returns false instead of true.)

On the other hand, the behaviour change is not meant to happen all at once. It has to be a warning, then a deprecation, and then a change, over months and three or more DMD versions. So there is time to fix D code.

And even if the old D code doesn't get changed, I think it's uncommon for code to rely on a dynamic array being actually (null,null) instead of being of zero length. So the total amount of breakage even for a semantic change is probably small. Considering the presence of the compiles __trait, the idea of the small behaviour change is not so bad.

So far this proposal has been well received: almost everyone in the thread seems to agree to change the current behaviour. But some people prefer to change it into an error, while others prefer a handier semantics change. In both cases the change path will start the same way: warning, deprecation, and then an error or a behaviour change.

This proposal is one of about fifteen tiny D breaking changes that I have been suggesting for years. If Andrei Alexandrescu or Walter are reading this thread, I'd like a comment on this very small proposal, mostly meant to avoid certain bugs.

Bye,
bearophile
Mar 25 2013
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 25 Mar 2013 17:28:51 -0400, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Monday, 25 March 2013 at 14:46:27 UTC, Steven Schveighoffer wrote:
 I would favor just changing the behavior.
That would silently break my code.
It would seem incomplete not to have if(arr) work, and the way it works now is very error prone. You would have to change your code either way. Deprecating then reintroducing seems gratuitous to me. Most people do not use if(arr) to check for array pointer when if(arr.ptr) is more descriptive.

How much do you use this "feature"? Can the uses simply be replaced with if(arr.ptr)? Or are they more of the type Jacob has with if(auto a = getArray())?

I'm trying to get a feel for how much breakage this causes, as I typically do not use that construct.

-Steve
Mar 25 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 26 March 2013 at 01:57:10 UTC, Steven Schveighoffer 
wrote:
 On Mon, 25 Mar 2013 17:28:51 -0400, Vladimir Panteleev 
 <vladimir thecybershadow.net> wrote:

 On Monday, 25 March 2013 at 14:46:27 UTC, Steven Schveighoffer 
 wrote:
 I would favor just changing the behavior.
That would silently break my code.
It would seem incomplete not to have if(arr) work, and the way it works now is very error prone. You would have to change your code either way. Deprecating than reintroducing seems gratuitous to me. Most people do not use if(arr) to check for array pointer when if(arr.ptr) is more descriptive. How much do you use this "feature"? Can they simply be replaced with if(arr.ptr)? or are they more of the type Jacob has with if(auto a = getArray())? I'm trying to get a feel for how much breakage this causes, as I typically do not use that construct.
No, it's nothing like that. In fact, it's very simple: Users will download my open-source program, compile it successfully, then try using it - at which point, it will crash, produce incorrect output, or do something equally bad. Would you not agree that this is unacceptable?

Furthermore, I would have no way to automatically find all places where I would need to change my code. I would need to look at every if statement of my program and see if its behavior depends on whether the string/array is empty or null. My program is quite large, so, again, this is unacceptable.

I use this feature in the same way that anyone uses a nullable type. It's the same distinction between a pointer to struct that is null, or that is pointing to an instance containing the struct's .init. It's the same distinction between a value type T and the benefits of Nullable!T. "null" is usually used to indicate the absence of a value, as opposed to an empty value.

Although I can see how this can trip up new users of D, personally I prefer things to be just the way they are now. That said, I wouldn't be against forcing one to write "s is null", if the consensus was that this would improve D.
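
A rough sketch of that absent-versus-empty usage, written with the explicit "is null" spelling rather than an implicit boolean test. The function lookup and the table contents are hypothetical, and the example assumes the "" literal carries a non-null pointer:

```d
// Hypothetical lookup: returns null when the key is absent,
// and the stored string (possibly empty) when it is present.
string lookup(string[string] table, string key) {
    if (auto p = key in table)
        return *p;
    return null;
}

void main() {
    string[string] form = ["comment": ""];  // field present but empty

    auto present = lookup(form, "comment");
    auto absent  = lookup(form, "missing");

    assert(present.length == 0 && absent.length == 0); // both are empty
    assert(present !is null);  // assumption: "" has a non-null pointer
    assert(absent is null);    // explicit identity test, no implicit bool
}
```

Code written this way states its intent and would be unaffected by either of the proposed changes to if(arr).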
Mar 25 2013
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 25 Mar 2013 22:21:47 -0400, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tuesday, 26 March 2013 at 01:57:10 UTC, Steven Schveighoffer wrote:
 On Mon, 25 Mar 2013 17:28:51 -0400, Vladimir Panteleev  
 <vladimir thecybershadow.net> wrote:

 On Monday, 25 March 2013 at 14:46:27 UTC, Steven Schveighoffer wrote:
 I would favor just changing the behavior.
That would silently break my code.
It would seem incomplete not to have if(arr) work, and the way it works now is very error prone. You would have to change your code either way. Deprecating than reintroducing seems gratuitous to me. Most people do not use if(arr) to check for array pointer when if(arr.ptr) is more descriptive. How much do you use this "feature"? Can they simply be replaced with if(arr.ptr)? or are they more of the type Jacob has with if(auto a = getArray())? I'm trying to get a feel for how much breakage this causes, as I typically do not use that construct.
No, it's nothing like that. In fact, it's very simple: Users will download my open-source program, compile it successfully, then try using it - at which point, it will crash, produce incorrect output, or do something equally bad. Would you not agree that this is unacceptable?
Well, it's unacceptable as long as the code is not updated, or you caveat it by saying it only was tested on compiler version X or earlier. If we go through a deprecation cycle, and then reintroduce if(arr) to mean if(arr.length), then we are in the same situation as long as you haven't updated your code.
 Furthermore, I would have no way to automatically find all places where  
 I would need to change my code. I would need to look at every if  
 statement of my program and see if its behavior depends on whether the  
 string/array is empty or null. My program is quite large, so, again,  
 this is unacceptable.
That I agree is unacceptable. We would have to take the unusual step of having a tool/compiler flag to identify the places this happens. Asking people to review their code by hand is not what I had in mind.
 I use this feature in the same way that anyone uses a nullable type.  
 It's the same distinction between a pointer to struct that is null, or  
 that is pointing to an instance containing the struct's .init. It's the  
 same distinction between a value type T and the benefits of Nullable!T.  
 "null" is usually used to indicate the absence of a value, as opposed to  
 an empty value.
The breaking distinction is between null and a non-null empty string. In the cases where the code is setting a non-empty array, this is not an issue. This is why I asked about usage. It may turn out that you are checking for null, but really because when the value is set, it's never empty, it's not an issue.
 Although I can see how this can trip up new users of D, personally I  
 prefer things to be just the way they are now. That said, I wouldn't be  
 against forcing one to write "s is null", if the consensus was that this  
 would improve D.
I understand the idea to use null as a "special" array value. But the truth is, nulls act just like empty arrays in almost all aspects.

The critical failure I see is this:

int *p = null;
if(p == null) is equivalent to if(!p) and if(p != null) is equivalent to if(p).

But for arrays:

int[] a = null;
if(a == null) is NOT equivalent to if(!a), and if(a != null) is NOT equivalent to if(a). This is the part that screws up so many people. Because == does not do the same thing, we have a quirk that is silent and deadly.

It's made even worse that for MOST cases, if(!a) is equivalent to if(a == null). So it's very easy to miss this critical distinction, and it's not at all intuitive.

-Steve
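
The mismatch can be sketched with explicit comparisons, assuming a non-null empty slice obtained by slicing a live array. The variable names are illustrative only:

```d
void main() {
    // Pointers: equality with null and truthiness agree.
    int* p = null;
    assert(p == null);
    assert(!p);

    // Arrays: build a non-null empty slice.
    int[] backing = [1, 2, 3];
    int[] a = backing[0 .. 0];

    assert(a == null);       // == compares contents: both have no elements
    assert(a !is null);      // is compares pointer and length
    assert(a.ptr !is null);  // the pointer is what differs
}
```

So for this slice the equality test says "null" while the identity test says "not null", which is the gap that a pointer-based truth test falls into.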
Mar 25 2013
parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 26 March 2013 at 03:58:35 UTC, Steven Schveighoffer 
wrote:
 Well, it's unacceptable as long as the code is not updated, or 
 you caveat it by saying it only was tested on compiler version 
 X or earlier.
That's horrible. What ever happened to fail-fast?
 If we go through a deprecation cycle, and then reintroduce 
 if(arr) to mean if(arr.length), then we are in the same 
 situation as long as you haven't updated your code.
The deprecation process takes years. This is a distinction much bigger than two consecutive compiler versions. Also, D still sticks to the "if it looks like C, then it must work like C or fail to compile" principle. It's there for a reason, and I think the same reason applies to older versions of D.
 Furthermore, I would have no way to automatically find all 
 places where I would need to change my code. I would need to 
 look at every if statement of my program and see if its 
 behavior depends on whether the string/array is empty or null. 
 My program is quite large, so, again, this is unacceptable.
That I agree is unacceptable. We would have to take the unusual step of having a tool/compiler flag to identify the places this happens. Asking people to review their code by hand is not what I had in mind.
Even if you introduce a new flag, people may only catch the problem too late. A routine software upgrade (as by an OS's package manager) should not cause incorrect output that late in the cycle!
 The breaking distinction is between null and a non-null empty 
 string.  In the cases where the code is setting a non-empty 
 array, this is not an issue.  This is why I asked about usage.  
 It may turn out that you are checking for null, but really 
 because when the value is set, it's never empty, it's not an 
 issue.
Yes; to clarify, the program in question handles input that is often an empty string or array. In these cases, the length of the data is not important - it's just user data, regardless of its length. Its presence or absence is important.
 I understand the idea to use null as a "special" array value.  
 But the truth is, nulls act just like empty arrays in almost 
 all aspects.

 The critical failure I see is this:

 int *p = null;
 if(p == null) is equivalent to if(!p) and if(p != null) is 
 equivalent to if(p).

 But for arrays:

 int[] a = null;
 if(a == null) is NOT equivalent to if(!a), and if(a != null) is 
 NOT equivalent to if(a).  This is the part that screws up so 
 many people.  Because == does not do the same thing, we have a 
 quirk that is silent and deadly.
Yes, but your argument is based on using the equality (==) operator. This operator can do special things in certain cases. When you have a C string and a D string, the == operator will do very different things, so pointing out that its behavior is different when the pointer / .ptr is null, when its behavior is also different when the pointer / .ptr is non-null, is not much of an argument. The behavior is consistent with the identity (is) operator.
 It's made even worse that for MOST cases, if(!a) is equivalent 
 to if(a == null).  So it's very easy to miss this critical 
 distinction, and it's not at all intuitive.
Is it? If a was an object, that would invoke opEquals. I believe "a is null" is a more accurate equivalent for "!a".
Mar 26 2013
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 03/26/2013 03:21 AM, Vladimir Panteleev wrote:
 ...

 I use this feature in the same way that anyone uses a nullable type.
 It's the same distinction between a pointer to struct that is null, or
 that is pointing to an instance containing the struct's .init. It's the
 same distinction between a value type T and the benefits of Nullable!T.
 "null" is usually used to indicate the absence of a value, as opposed to
 an empty value.
It is not the same distinction. It is not like that for dynamic arrays!

void main(){ assert(null is []); }
 Although I can see how this can trip up new users of D, personally I
 prefer things to be just the way they are now. That said, I wouldn't be
 against forcing one to write "s is null", if the consensus was that this
 would improve D.
"s.ptr is null", actually
Mar 26 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 26 March 2013 at 08:15:05 UTC, Timon Gehr wrote:
 On 03/26/2013 03:21 AM, Vladimir Panteleev wrote:
 ...

 I use this feature in the same way that anyone uses a nullable 
 type.
 It's the same distinction between a pointer to struct that is 
 null, or
 that is pointing to an instance containing the struct's .init. 
 It's the
 same distinction between a value type T and the benefits of 
 Nullable!T.
 "null" is usually used to indicate the absence of a value, as 
 opposed to
 an empty value.
It is not the same distinction. It is not like that for dynamic arrays! void main(){ assert(null is []); }
[] (the literal) has .ptr as null. That may or may not be a bug. To create a non-null empty array, you have to use something like (new uint[1])[0..0]. The "" literal does not have this problem.
 Although I can see how this can trip up new users of D, 
 personally I
 prefer things to be just the way they are now. That said, I 
 wouldn't be
 against forcing one to write "s is null", if the consensus was 
 that this
 would improve D.
"s.ptr is null", actually
The distinction being the case with a non-empty slice starting at address 0?
Mar 26 2013
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 03/26/2013 12:24 PM, Vladimir Panteleev wrote:
 On Tuesday, 26 March 2013 at 08:15:05 UTC, Timon Gehr wrote:
 On 03/26/2013 03:21 AM, Vladimir Panteleev wrote:
 ...

 I use this feature in the same way that anyone uses a nullable type.
 It's the same distinction between a pointer to struct that is null, or
 that is pointing to an instance containing the struct's .init. It's the
 same distinction between a value type T and the benefits of Nullable!T.
 "null" is usually used to indicate the absence of a value, as opposed to
 an empty value.
It is not the same distinction. It is not like that for dynamic arrays! void main(){ assert(null is []); }
[] (the literal) has .ptr as null. That may or may not be a bug.
It is simply left unspecified. Hence, relying on a distinction between null and empty arrays is extremely brittle.
 To
 create a non-null empty array, you have to use something like (new
 uint[1])[0..0].
Sure, therefore [] is what will be used. Also, it does not always work.

auto x = { return (new uint[1])[0..0]; }();
void main(){ assert(x is null); }
 The "" literal does not have this problem.
Indeed, because it is guaranteed to be zero-terminated.
 Although I can see how this can trip up new users of D, personally I
 prefer things to be just the way they are now. That said, I wouldn't be
 against forcing one to write "s is null", if the consensus was that this
 would improve D.
"s.ptr is null", actually
The distinction being the case with a non-empty slice starting at address 0?
Yes. I think the conversion to bool is a leftover from when arrays implicitly decayed to pointers.
Mar 26 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 26 March 2013 at 13:12:42 UTC, Timon Gehr wrote:
 It is simply left unspecified. Hence, relying on a distinction 
 between null and empty arrays is extremely brittle.
I think calling it "extremely brittle" based on that [] is UD is an exaggeration. As I've stated, I maintain a nontrivial program which works with non-null empty strings and arrays, and rarely encountered difficulties.
 To
 create a non-null empty array, you have to use something like 
 (new
 uint[1])[0..0].
Sure, therefore [] is what will be used. Also, it does not always work. auto x = { return (new uint[1])[0..0]; }(); void main(){ assert(x is null); }
That looks like a CTFE bug. The same thing doesn't happen with strings and "".
 The distinction being the case with a non-empty slice starting 
 at
 address 0?
Yes. I think the conversion to bool is a leftover from when arrays implicitly decayed to pointers.
When would an array with null .ptr and non-zero length make sense? Do we want to care about such cases? We could simply state that slicing NULL is undefined behavior.
Mar 26 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Vladimir Panteleev:

 We could simply state that slicing NULL is undefined behavior.
Undefined behaviors are bad. Bye, bearophile
Mar 26 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/26/13 10:07 AM, bearophile wrote:
 Vladimir Panteleev:

 We could simply state that slicing NULL is undefined behavior.
Undefined behaviors are bad. Bye, bearophile
Without -noboundscheck, which is opt-in, there is no undefined behavior. Andrei
Mar 26 2013
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, March 26, 2013 12:24:56 Vladimir Panteleev wrote:
 [] (the literal) has .ptr as null. That may or may not be a bug.
As I understand it, it's very much on purpose. It avoids a needless memory allocation. - Jonathan M Davis
Mar 26 2013
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 26 March 2013 at 17:32:12 UTC, Jonathan M Davis wrote:
 On Tuesday, March 26, 2013 12:24:56 Vladimir Panteleev wrote:
 [] (the literal) has .ptr as null. That may or may not be a 
 bug.
As I understand it, it's very much on purpose. It avoids a needless memory allocation.
In the same way that "" allocates memory? [] can be an empty slice of *anything*, but it might as well be a 0-length constant in the data segment, similar to "".
Mar 26 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Mar 2013 13:34:02 -0400, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tuesday, 26 March 2013 at 17:32:12 UTC, Jonathan M Davis wrote:
 On Tuesday, March 26, 2013 12:24:56 Vladimir Panteleev wrote:
 [] (the literal) has .ptr as null. That may or may not be a bug.
As I understand it, it's very much on purpose. It avoids a needless memory allocation.
In the same way that "" allocates memory? [] can be an empty slice of *anything*, but it might as well be a 0-length constant in the data segment, similar to "".
"" needs to point to the data segment because it has to point to a readable 0 character for C compatibility. The same is not true for []. The code for [] is modifiable without changing the compiler, you certainly can suggest something different. I think the function is _d_newarray, there may be more than one version (one that is for 0-initialized data, one that is not). Changing it will probably break code that depends on it being null (intentionally or not). But it definitely is NOT a bug. Any suggested change would be an enhancement request. -Steve
Mar 26 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 26 March 2013 at 17:57:47 UTC, Steven Schveighoffer 
wrote:
 But it definitely is NOT a bug.  Any suggested change would be 
 an enhancement request.
Why do you think it is not a bug? It is inconsistent with "", and what's the point of [] if you can just use "null"?
Mar 26 2013
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Mar 2013 17:31:26 -0400, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tuesday, 26 March 2013 at 17:57:47 UTC, Steven Schveighoffer wrote:
 But it definitely is NOT a bug.  Any suggested change would be an  
 enhancement request.
Why do you think it is not a bug? It is inconsistent with "", and what's the point of [] if you can just use "null"?
Meaning it functions as designed. Whether you agree with the design or not, it's still not a bug. And once again, "" is different because of C compatibility. If that were not a requirement, "" would map to null. There is no such requirement for general slices. -Steve
Mar 26 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 26 March 2013 at 22:04:32 UTC, Steven Schveighoffer 
wrote:
 On Tue, 26 Mar 2013 17:31:26 -0400, Vladimir Panteleev 
 <vladimir thecybershadow.net> wrote:

 On Tuesday, 26 March 2013 at 17:57:47 UTC, Steven 
 Schveighoffer wrote:
 But it definitely is NOT a bug.  Any suggested change would 
 be an enhancement request.
Why do you think it is not a bug? It is inconsistent with "", and what's the point of [] if you can just use "null"?
Meaning it functions as designed. Whether you agree with the design or not, it's still not a bug.
What I meant is: why do you think this was an intentional, thought-out design decision? Where is the justification for it? I'd say it's more likely it was written with no thought given to the distinction between null and empty arrays.
 And once again, "" is different because of C compatibility.  If 
 that were not a requirement, "" would map to null.
Why do you think so? Sorry, but to me it just seems like you're stating personal conjecture as absolute facts. This discussion is ultimately inconsequential, because we probably can't change it now as it would break code, but your certainty regarding the origins of these decisions lead me to believe that you know something I don't.
Mar 26 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Mar 2013 19:28:44 -0400, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tuesday, 26 March 2013 at 22:04:32 UTC, Steven Schveighoffer wrote:
 On Tue, 26 Mar 2013 17:31:26 -0400, Vladimir Panteleev  
 <vladimir thecybershadow.net> wrote:

 On Tuesday, 26 March 2013 at 17:57:47 UTC, Steven Schveighoffer wrote:
 But it definitely is NOT a bug.  Any suggested change would be an  
 enhancement request.
Why do you think it is not a bug? It is inconsistent with "", and what's the point of [] if you can just use "null"?
Meaning it functions as designed. Whether you agree with the design or not, it's still not a bug.
What I meant is: why do you think this was an intentional, thought-out design decision? Where is the justification for it? I'd say it's more likely it was written with no thought given to the distinction between null and empty arrays.
The design is that the pointer of a 0-length array when allocating one is inconsequential. Rather than allocate space on the heap or designate space on the data segment, null is available and free, and it happens to be the default you get when you declare an array.

The code is specifically designed to return null, see these lines in the new array function:

if (length == 0 || size == 0)
    result = null;

link to github: https://github.com/D-Programming-Language/druntime/blob/master/src/rt/lifetime.d#L774

size being the size of the element.

I was not around when this decision was made, but it seems clear from the code that case was handled specifically to return null. Seeing as null arrays and empty non-null arrays are functionally equivalent in every respect except for this weird boolean evaluation anomaly (which I think should be fixed), I don't see the problem.
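To make the cases above concrete, here is a minimal sketch of which empty slices end up with a null pointer. It assumes the dmd/druntime behaviour discussed in this thread; for [] and new int[0] the spec does not promise it, and the helper name hasNullPtr is made up for the example.

```d
// Hypothetical helper for this example: true when the slice's pointer is null.
bool hasNullPtr(const int[] a)
{
    return a.ptr is null;
}

void main()
{
    int[] declared;                 // default-initialized slice (T[].init)
    int[] literal = [];             // empty array literal
    int[] allocated = new int[0];   // zero-length allocation

    assert(hasNullPtr(declared));
    // Assumption: implementation behaviour, not a spec guarantee.
    assert(hasNullPtr(literal));
    assert(hasNullPtr(allocated));

    // Slicing an existing array keeps its pointer, so this slice is
    // empty but not null.
    int[] backing = [1];
    int[] sliced = backing[0 .. 0];
    assert(sliced.length == 0);
    assert(!hasNullPtr(sliced));
}
```

The last case is the one that is hard to hit by accident, which is why tests written with if(arr) rarely expose the difference.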
 And once again, "" is different because of C compatibility.  If that  
 were not a requirement, "" would map to null.
Why do you think so? Sorry, but to me it just seems like you're stating personal conjecture as absolute facts.
It makes sense, why allocate any space, be it static data space or heap space, when null works perfectly fine? The same decisions that lead to [] returning null would apply to strings if C didn't have to read the pointed-at data.

"" isn't even consistent with itself! For example:

teststring1.d:

module teststring1;
import teststring2;

void main()
{
    dotest("");
    dotest2();
}

teststring2.d:

module teststring2;
import std.stdio;

void dotest(string s)
{
    writeln(s == "");
    writeln(s is "");
}

void dotest2()
{
    dotest("");
}

Outputs:

true
false
true
true

I would contend that relying on pointers being a certain value is a bad idea for arrays unless you really need that, in which case you should have to be explicit. In fact, relying on whatever [] returns being a certain pointer would be bad as well. You should only rely on null pointing at null.

Any plans to make the spec *specifically* require [] being non-null would be an enhancement request.
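The C-compatibility point can be sketched as follows: a string literal, including "", points at a readable zero byte, while a null string has nothing to point at. This is only an illustration; hasTerminator is a hypothetical helper and is only meaningful for literals, since ordinary slices carry no terminator.

```d
import core.stdc.string : strlen;

// Hypothetical helper: only valid for string literals, which are stored
// with a trailing '\0' that is not counted in .length.
bool hasTerminator(string literal)
{
    return literal.ptr !is null && literal.ptr[literal.length] == '\0';
}

void main()
{
    string empty = "";
    string none = null;

    assert(empty.length == 0);
    assert(hasTerminator(empty));     // points at a readable 0
    assert(strlen(empty.ptr) == 0);   // so C code can read it safely

    assert(none.ptr is null);         // nothing to hand to C here
    assert(empty == none);            // same value: both have length 0
    assert(empty !is none);           // different identity: the pointers differ
}
```

Note that the sketch deliberately compares "" against null and not against another "" literal, since the identity of two separate "" literals is exactly what the two-module example shows to be unreliable.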
 This discussion is ultimately inconsequential, because we probably can't  
 change it now as it would break code, but your certainty regarding the  
 origins of these decisions leads me to believe that you know something I  
 don't.
I wasn't there, I didn't make the decision, but it's implied in the existing code and the spec. -Steve
Mar 26 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Wednesday, 27 March 2013 at 03:03:13 UTC, Steven Schveighoffer 
wrote:
 The design is that the pointer of a 0-length array when 
 allocating one is inconsequential.  Rather than allocate space 
 on the heap or designate space on the data segment, null is 
 available and free, and it happens to be the default you get 
 when you declare an array.
Allocating space on the heap?? You don't need to "designate" space anywhere - on the heap, stack, or data segment, because an empty array takes NO space. As I said before, ANY value for .ptr other than null would be good, even if it's cast(T*)1. What .ptr is pointing at is irrelevant, since the slice has 0 length, and indexing it with any index would be an out-of-bounds array access.
 The code is specifically designed
No. The code is specifically WRITTEN. For the third time: You don't know if this was a conscious design decision made with the distinction between null and empty arrays in mind, so please stop wording things like it was.
 I was not around when this decision was made, but it seems
 I wasn't there, I didn't make the decision, but it's implied in
This is what I wanted to clear up. Your previous posts were worded in such a way that you seemed absolutely certain of the reasonings for why the code ended up like that. Many arguments in either direction could be presented by either side, but it doesn't change the fact that we don't know whether the current state of things was a result of careful planning or not. I could also present lengthy arguments in favor of why making [] have a non-null .ptr would have made more sense, and why I think that code's author disregarded the distinctions of non-null empty arrays, but as I said in my previous post, even if we knew the answers, they wouldn't change anything, as we can't change []'s behavior now.
Mar 27 2013
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 27 Mar 2013 08:18:34 -0400, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Wednesday, 27 March 2013 at 03:03:13 UTC, Steven Schveighoffer wrote:
 The code is specifically designed
No. The code is specifically WRITTEN. For the third time: You don't know if this was a conscious design decision made with the distinction between null and empty arrays in mind, so please stop wording things like it was.
Well, somebody wrote the code that way. Whether they considered possibly pointing at a non-null value or not, I don't know. But I assume they picked null for convenience and because null is a perfectly valid spot to point at for an empty array. To say it is a bug is incorrect -- the spec does not call for null or non-null, the design is perfectly valid given the constraints.
 I was not around when this decision was made, but it seems
 I wasn't there, I didn't make the decision, but it's implied in
This is what I wanted to clear up. Your previous posts were worded in such a way that you seemed absolutely certain of the reasonings for why the code ended up like that.
I can only go on what I see in the code. It seems certain that whoever wrote it is deliberately returning null. It's also REQUIRED by the spec for "" to return a pointer to a readable 0, so null is not possible. I can only assume, with a very high degree of certainty, that if that requirement was gone, "" would also return null given the behavior of the array literal code, and the assumption the same developer wrote both. I am not 100% certain, as I didn't write the code, I guess you caught me? It's more like 99% certain. This of course is my opinion. -Steve
Mar 27 2013
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 27 March 2013 at 12:18:35 UTC, Vladimir Panteleev 
wrote:
 No. The code is specifically WRITTEN. For the third time: You 
 don't know if this was a conscious design decision made with 
 the distinction between null and empty arrays in mind, so 
 please stop wording things like it was.
Well, that is not certain. I don't have the answer. BUT! In D, you'll find many places where identity and value are conflated. This is one instance of the problem, and it is pretty bad in general. An implicit cast to bool usually implies using the value (otherwise, an int would always cast to true), and it is rather surprising that it checks identity for slices.
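A small sketch of that contrast, assuming a compiler that still accepts a slice in a boolean context (the helper inBoolContext is hypothetical): an int is tested by value, while a slice is tested by whether it is null, regardless of its length.

```d
// Hypothetical helper: what `if (arr)` evaluates to for a slice.
bool inBoolContext(int[] arr)
{
    if (arr)
        return true;
    return false;
}

void main()
{
    // An int converts by value: zero is false.
    int zero = 0;
    assert(!zero);

    // A slice converts by identity: an empty slice that still points
    // into an existing array counts as true.
    int[] backing = [1];
    int[] emptySlice = backing[0 .. 0];
    int[] nullSlice;

    assert(emptySlice.length == 0);
    assert(inBoolContext(emptySlice));
    assert(!inBoolContext(nullSlice));

    // By value the two slices are indistinguishable.
    assert(emptySlice == nullSlice);
}
```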
Mar 27 2013
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 03/26/2013 10:31 PM, Vladimir Panteleev wrote:
 On Tuesday, 26 March 2013 at 17:57:47 UTC, Steven Schveighoffer wrote:
 But it definitely is NOT a bug.  Any suggested change would be an
 enhancement request.
Why do you think it is not a bug? It is inconsistent with "", and what's the point of [] if you can just use "null"?
IOW, what is the point of "null" if you can just use [].
Mar 26 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Timon Gehr:

 IOW, what is the point of "null" if you can just use [].
I usually prefer to use [], because null is a literal for pointers and class references, while [] is a literal specific for arrays (and strings), so its meaning is more clear (in D I'd even like a [:] literal that represents an empty associative array).

On the other hand if you compile a program that uses null instead of [] you see some differences. In the current dmd compiler returning null is more efficient. I have seen code where this difference in performance matters:


int[] foo() {
    return [];
}
int[] bar() {
    return null;
}
void main() {}


_D4temp3fooFZAi:
L0:     push EAX
        mov EAX,offset FLAT:_D11TypeInfo_Ai6__initZ
        push 0
        push EAX
        call near ptr __d_arrayliteralTX
        mov EDX,EAX
        add ESP,8
        pop ECX
        xor EAX,EAX
        ret

_D4temp3barFZAi:
        xor EAX,EAX
        xor EDX,EDX
        ret


Bye,
bearophile
Mar 26 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 26 March 2013 at 23:57:32 UTC, bearophile wrote:
 Timon Gehr:

 IOW, what is the point of "null" if you can just use [].
 I usually prefer to use [], because null is a literal for pointers and class references, while [] is a literal specific for arrays (and strings), so its meaning is more clear (in D I'd even like a [:] literal that represents an empty associative array).

 On the other hand if you compile a program that uses null instead of [] you see some differences. In the current dmd compiler returning null is more efficient. I have seen code where this difference in performance matters:

 int[] foo() {
     return [];
 }
 int[] bar() {
     return null;
 }
 void main() {}

 _D4temp3fooFZAi:
 L0:     push EAX
         mov EAX,offset FLAT:_D11TypeInfo_Ai6__initZ
         push 0
         push EAX
         call near ptr __d_arrayliteralTX
         mov EDX,EAX
         add ESP,8
         pop ECX
         xor EAX,EAX
         ret

 _D4temp3barFZAi:
         xor EAX,EAX
         xor EDX,EDX
         ret

 Bye,
 bearophile
That is a compiler bug isn't it ?
Mar 26 2013
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
deadalnix:

 On the other hand if you compile a program that uses null 
 instead of [] you see some differences. In the current dmd 
 compiler returning null is more efficient. I have seen code 
 where this difference in performance matters:
... That is a compiler bug isn't it ?
I don't know. Maybe it's just a missed optimization :-) Bye, bearophile
Mar 26 2013
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Mar 2013 23:21:35 -0400, deadalnix <deadalnix gmail.com> wrote:

 On Tuesday, 26 March 2013 at 23:57:32 UTC, bearophile wrote:
 Timon Gehr:

 IOW, what is the point of "null" if you can just use [].
 I usually prefer to use [], because null is a literal for pointers and class references, while [] is a literal specific for arrays (and strings), so its meaning is more clear (in D I'd even like a [:] literal that represents an empty associative array).

 On the other hand if you compile a program that uses null instead of [] you see some differences. In the current dmd compiler returning null is more efficient. I have seen code where this difference in performance matters:

 int[] foo() {
     return [];
 }
 int[] bar() {
     return null;
 }
 void main() {}

 _D4temp3fooFZAi:
 L0:     push EAX
         mov EAX,offset FLAT:_D11TypeInfo_Ai6__initZ
         push 0
         push EAX
         call near ptr __d_arrayliteralTX
         mov EDX,EAX
         add ESP,8
         pop ECX
         xor EAX,EAX
         ret

 _D4temp3barFZAi:
         xor EAX,EAX
         xor EDX,EDX
         ret

 Bye,
 bearophile
That is a compiler bug isn't it ?
No. [] calls the hook for _d_arrayliteral, whose source is not known at compile time. Runtime functions cannot be inlined, which will be set in stone once the runtime is a dynamic library. -Steve
Mar 26 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 27 March 2013 at 03:26:59 UTC, Steven Schveighoffer 
wrote:
 On Tue, 26 Mar 2013 23:21:35 -0400, deadalnix 
 <deadalnix gmail.com> wrote:

 On Tuesday, 26 March 2013 at 23:57:32 UTC, bearophile wrote:
 Timon Gehr:

 IOW, what is the point of "null" if you can just use [].
 I usually prefer to use [], because null is a literal for pointers and class references, while [] is a literal specific for arrays (and strings), so its meaning is more clear (in D I'd even like a [:] literal that represents an empty associative array).

 On the other hand if you compile a program that uses null instead of [] you see some differences. In the current dmd compiler returning null is more efficient. I have seen code where this difference in performance matters:

 int[] foo() {
     return [];
 }
 int[] bar() {
     return null;
 }
 void main() {}

 _D4temp3fooFZAi:
 L0:     push EAX
         mov EAX,offset FLAT:_D11TypeInfo_Ai6__initZ
         push 0
         push EAX
         call near ptr __d_arrayliteralTX
         mov EDX,EAX
         add ESP,8
         pop ECX
         xor EAX,EAX
         ret

 _D4temp3barFZAi:
         xor EAX,EAX
         xor EDX,EDX
         ret

 Bye,
 bearophile
That is a compiler bug isn't it ?
No. [] calls the hook for _d_arrayliteral, whose source is not known at compile time. Runtime functions cannot be inlined, which will be set in stone once the runtime is a dynamic library. -Steve
The compiler knows the result in advance; it doesn't need the source of _d_arrayliteral.
Mar 26 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Mar 2013 23:31:03 -0400, deadalnix <deadalnix gmail.com> wrote:

 On Wednesday, 27 March 2013 at 03:26:59 UTC, Steven Schveighoffer wrote:
 No.  [] calls the hook for _d_arrayliteral, whose source is not known  
 at compile time.  Runtime functions cannot be inlined, which will be  
 set in stone once the runtime is a dynamic library.
The compiler knows the result in advance; it doesn't need the source of _d_arrayliteral.
The value of the empty literal is an implementation detail, not defined by the compiler or spec as far as I know. -Steve
Mar 26 2013
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 03/27/2013 05:06 AM, Steven Schveighoffer wrote:
 On Tue, 26 Mar 2013 23:31:03 -0400, deadalnix <deadalnix gmail.com> wrote:

 On Wednesday, 27 March 2013 at 03:26:59 UTC, Steven Schveighoffer wrote:
 No.  [] calls the hook for _d_arrayliteral, whose source is not known
 at compile time.  Runtime functions cannot be inlined, which will be
 set in stone once the runtime is a dynamic library.
The compiler knows the result in advance; it doesn't need the source of _d_arrayliteral.
The value of the empty literal is an implementation detail, not defined by the compiler or spec as far as I know. -Steve
Therefore it is currently a valid optimization to replace it by null, no matter what the runtime does.
Mar 27 2013
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 27 Mar 2013 07:27:01 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 03/27/2013 05:06 AM, Steven Schveighoffer wrote:
 On Tue, 26 Mar 2013 23:31:03 -0400, deadalnix <deadalnix gmail.com>  
 wrote:

 On Wednesday, 27 March 2013 at 03:26:59 UTC, Steven Schveighoffer  
 wrote:
 No.  [] calls the hook for _d_arrayliteral, whose source is not known
 at compile time.  Runtime functions cannot be inlined, which will be
 set in stone once the runtime is a dynamic library.
The compiler knows the result in advance; it doesn't need the source of _d_arrayliteral.
The value of the empty literal is an implementation detail, not defined by the compiler or spec as far as I know. -Steve
Therefore it is currently a valid optimization to replace it by null, no matter what the runtime does.
This is true, but it doesn't make this issue a bug. It's just a missed optimization. -Steve
Mar 27 2013
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Mar 2013 18:56:07 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 03/26/2013 10:31 PM, Vladimir Panteleev wrote:
 On Tuesday, 26 March 2013 at 17:57:47 UTC, Steven Schveighoffer wrote:
 But it definitely is NOT a bug.  Any suggested change would be an
 enhancement request.
Why do you think it is not a bug? It is inconsistent with "", and what's the point of [] if you can just use "null"?
IOW, what is the point of "null" if you can just use [].
One could say why have null if you can just use T[].init. It's two names for the same thing, not uncommon in programming. In this case, however, I think handling or allowing [] is good for generic code. The question then becomes what should [] return? To have it return null seems acceptable to me -- the default empty array is null, and null behaves quite well as a zero-length array. The spec allows for it, and it's free to use null. -Steve
Mar 26 2013
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Mar 2013 13:32:02 -0400, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Tuesday, March 26, 2013 12:24:56 Vladimir Panteleev wrote:
 [] (the literal) has .ptr as null. That may or may not be a bug.
As I understand it, it's very much on purpose. It avoids a needless memory allocation.
That is correct. Since null and non-null-but-empty arrays are essentially equivalent, it would be wasteful to allocate a new block for 0 bytes. Same as if you did new T[0]. -Steve
Mar 26 2013
prev sibling parent "Daniel Murphy" <yebblies nospamgmail.com> writes:
"Steven Schveighoffer" <schveiguy yahoo.com> wrote in message 
news:op.wuibbpkbeav7ka stevens-macbook-pro.local...
 I would favor just changing the behavior.  The existing behavior is most 
 often a bug when it is used, because quite often arrays are null (that is 
 the default value), and quite often, people THINK they are checking if the 
 array is empty instead of if the pointer is null.  Because most empty 
 arrays are null, and null arrays have zero length, their tests do not 
 expose this bug.  It's actually very difficult to generate a non-null 
 empty array for testing. [] does not work, it returns a null array!

 Only in rare cases do people actually want to check for null, and in most 
 people's code it is an error when a non-null but empty array is checked 
 with if(arr).  If people are interested in the pointer, they usually check 
 if(arr.ptr) or if(arr !is null).

 I would posit that for every knowledgeable person who uses this "shortcut" 
 to check for null, 100 people use it expecting it to be measuring the 
 array length.

 If we go through a deprecation cycle, it makes people who do if(arr) for 
 length have to change their code to if(arr.length) then back to if(arr) 
 once the deprecation is over.  The switch to if(arr.length) will be 
 simple, the compiler will show all the places if(arr) is used, but the 
 switch back will be more difficult since if(arr.length) will not be 
 deprecated.
There is no need to ever switch the code back.
 What about a switch that identifies all places where if(arr) is done? 
 Then people who use if(arr) for testing if(arr.ptr) can find and fix their 
 cases.  Or even a separate tool to do that.

 -Steve
I think the deprecation process is a better option than a switch, because it forces code to be fixed and gives people time to do so. The less code we silently break the higher chance this has of Walter-approval.
Mar 26 2013
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-03-24 23:10, bearophile wrote:
 A recent discussion in D.learn reminds me of an enhancement request of
 mine that is sleeping in Bugzilla since years:

 http://d.puremagic.com/issues/show_bug.cgi?id=4733


 The problem is that in D dynamic arrays can be non-null even when they
 are empty:


 import std.stdio;
 int[] foo() {
      auto a = [1];
      return a[0..0];
 }
 void main() {
      auto data = foo();
      if (data)
          writeln("here");
 }


 This is dangerous, so in D the safe and idiomatic way to test for empty
 arrays is to use std.array.empty().

 So my proposal of Issue 4733 is to forbid (with the usual
 warning/deprecation intermediate steps) the use of dynamic arrays in a
 boolean context:


 void main() {
      auto a = [1];
      if (a) {} // error, forbidden.
 }


 So to test empty/null you have to use empty() or "is null":

 import std.array: empty;
 void main() {
      auto a = [1];
      if (a.empty) {} // OK
      if (a is null) {} // OK
 }
I just started to use this in one place :(

if (auto a = getArray()) { }

getArray returns null if no array exists.

-- 
/Jacob Carlborg
Mar 25 2013
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 25 Mar 2013 11:44:22 -0400, Jacob Carlborg <doob me.com> wrote:

 I just started to use this in one place :(

 if (auto a = getArray()) { }

 getArray returns null if no array exists.
That one is not as easy. It will have to be split into two lines, the declaration and the test. -Steve
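A rough sketch of what that split could look like. The body of getArray below is invented for the example (the original post only says it returns null when no array exists), and countElements is a hypothetical caller.

```d
// Hypothetical stand-in for the getArray from the post above:
// returns null when no array exists.
int[] getArray(bool present)
{
    return present ? [1, 2, 3] : null;
}

// Hypothetical caller showing the two-line form.
size_t countElements(bool present)
{
    auto a = getArray(present);   // declaration on its own line
    if (a !is null)               // explicit test for null
        return a.length;
    return 0;
}

void main()
{
    assert(countElements(true) == 3);
    assert(countElements(false) == 0);
}
```

The cost is that a now lives in the enclosing scope; wrapping the two lines in an extra { } block would restore the narrower scope of the original one-liner.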
Mar 25 2013
prev sibling next sibling parent reply "Phil Lavoie" <maidenphil hotmail.com> writes:
On Sunday, 24 March 2013 at 22:10:06 UTC, bearophile wrote:
 A recent discussion in D.learn reminds me of an enhancement 
 request of mine that is sleeping in Bugzilla since years:

 http://d.puremagic.com/issues/show_bug.cgi?id=4733


 The problem is that in D dynamic arrays can be non-null even 
 when they are empty:


 import std.stdio;
 int[] foo() {
     auto a = [1];
     return a[0..0];
 }
 void main() {
     auto data = foo();
     if (data)
         writeln("here");
 }


 This is dangerous, so in D the safe and idiomatic way to test 
 for empty arrays is to use std.array.empty().

 So my proposal of Issue 4733 is to forbid (with the usual 
 warning/deprecation intermediate steps) the use of dynamic 
 arrays in a boolean context:


 void main() {
     auto a = [1];
     if (a) {} // error, forbidden.
 }


 So to test empty/null you have to use empty() or "is null":

 import std.array: empty;
 void main() {
     auto a = [1];
     if (a.empty) {} // OK
     if (a is null) {} // OK
 }


 Bye,
 bearophile
Hi,

IMHO, somebody coming from a C/C++ background (like me) has no problem realizing that if( var ) means either "if not null" or "if not 0". There was talk about changing the behavior of if( arr ) to mean if( !arr.empty ), but I believe this is the worst thing to do, since it would introduce inconsistencies with ordinary pointers.

import core.stdc.stdlib : malloc;

int[] foo() {
    auto var = new int[ 0 ];
    return var;
}

int * bar() {
    auto var = cast( int * )malloc( 0 );
    return var;
}

void main() {
    if( foo() ) {
        //Would not pass, since foo is empty.
    }
    if( bar() ) {
        //Would pass, since bar is not null.
    }
}

I prefer to have code that explicitly states what is going on anyway:

if( arr !is null && !arr.empty ) { blablabla; }

Whenever somebody decides to use the abbreviated expression if(arr), I think the behavior should be that of the C language. I'm not sure it is wise just yet to establish that whenever somebody tests a slice to see if it's null, that somebody also means to test if it is empty. Interesting idea though.

Cheers!
Phil
Mar 25 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Phil Lavoie:

 IMHO, somebody coming from a C/C++ background (like me) has no 
 problem realizing that if( var ) means either if not null or if 
 not 0.
Some D programmers come from other languages, where if(arr) means if its len is different from zero. D dynamic arrays are 2 words, so you can have a length zero with a not-null pointer. So there is space for some semantic confusion.
 There was talk about changing the behavior of if( arr ) to mean 
 if( !arr.empty ) but I believe this is the worst thing to do, 
 since it would incorporate some inconsistencies with usual 
 pointers.
D dynamic arrays aren't pointers, they are composed of two words, so you can't assume the semantics is exactly the same. I think assuming if(arr) means if(!arr.empty) is still better than the current situation :-) Bye, bearophile
Mar 25 2013
parent reply "Phil Lavoie" <maidenphil hotmail.com> writes:
On Monday, 25 March 2013 at 16:14:18 UTC, bearophile wrote:
 Phil Lavoie:

 IMHO, somebody coming from a C/C++ background (like me) has no 
 problem realizing that if( var ) means either if not null or 
 if not 0.
Some D programmers come from other languages, where if(arr) means if its len is different from zero. D dynamic arrays are 2 words, so you can have a length zero with a not-null pointer. So there is space for some semantic confusion.
 There was talk about changing the behavior of if( arr ) to 
 mean if( !arr.empty ) but I believe this is the worst thing to 
 do, since it would incorporate some inconsistencies with usual 
 pointers.
D dynamic arrays aren't pointers, they are composed of two words, so you can't assume the semantics is exactly the same. I think assuming if(arr) means if(!arr.empty) is still better than the current situation :-) Bye, bearophile
Herro,

Of course you can argue that not every programmer comes from a C/C++ background, but, if I'm correct, D is aimed to be a C++ replacement, right? On the other hand, C is like the most used language in the universe (aliens included), so it makes sense to consider its expression semantics first. I am aware that those aren't the strongest arguments, but if we had to compare programmer backgrounds to make a decision regarding expression behaviors, I think C should win.

Obviously, slices are not single-word pointers, but they are pointers after all (or at the very least own one). Just saying that comparing a slice against its initial state or an invalid state is not the same as checking length, but I think we agree on that. Where we disagree is on the default behavior of if(arr). You think it should check for length to be more programmer friendly, or rather, to avoid unwanted errors (though if I was that programmer, I would not think it's friendly at all :( ) whereas I think it should act as if it was a pointer in C.

Since it is already the behavior of the language, on a scale of 10, how would you prioritize your suggestion :D? How much of a change do you think it would make?

I respect your opinions and suggestions, though I have never fallen into the trap you presented in your opening statement :), nor did I even think someone would.

I do believe that, in any case, this form is best:
if( arr !is null && !arr.empty )

Peace,
Phil
Mar 25 2013
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
25-Mar-2013 20:43, Phil Lavoie пишет:
 I do believe that, in any case, this form is best:
 if( arr !is null && !arr.empty )
Now write that one thousand times and count the typos.
 Peace,
 Phil
-- Dmitry Olshansky
Mar 25 2013
next sibling parent reply "Phil Lavoie" <maidenphil hotmail.com> writes:
On Monday, 25 March 2013 at 16:46:46 UTC, Dmitry Olshansky wrote:
 25-Mar-2013 20:43, Phil Lavoie пишет:
 I do believe that, in any case, this form is best:
 if( arr !is null && !arr.empty )
Now write that one thousand times and count the typos.
 Peace,
 Phil
A thousand? !arr.empty instead of arr.length? Was that your point?
Mar 25 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
25-Mar-2013 21:09, Phil Lavoie пишет:
 On Monday, 25 March 2013 at 16:46:46 UTC, Dmitry Olshansky wrote:
 25-Mar-2013 20:43, Phil Lavoie пишет:
 I do believe that, in any case, this form is best:
 if( arr !is null && !arr.empty )
Now write that one thousand times and count the typos.
 Peace,
 Phil
A thousand? !arr.empty instead of arr.length? Was that your point?
The short path had better be the correct path in most cases, and this form goes in exactly the opposite direction: treating if (arr) as if(arr.ptr). Not to mention the following "idiom":
if( arr !is null && !arr.empty )

-- 
Dmitry Olshansky
Mar 25 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Mar 25, 2013 at 08:46:42PM +0400, Dmitry Olshansky wrote:
 25-Mar-2013 20:43, Phil Lavoie пишет:
I do believe that, in any case, this form is best:
if( arr !is null && !arr.empty )
Now write that one thousand times and count the typos.
[...]
What's wrong with just writing:

if (arr.length > 0)

? .length will return 0 both when arr is null and when it's non-null but empty.

T

-- 
Heads I win, tails you lose.
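As a quick illustration of the length check, here is a small sketch. The template name hasElements is made up for the example; std.array.empty is the Phobos function mentioned at the start of the thread.

```d
import std.array : empty;

// Hypothetical helper: true only when the slice has at least one element.
bool hasElements(T)(const T[] arr)
{
    return arr.length > 0;
}

void main()
{
    int[] nullSlice = null;
    int[] backing = [1, 2];
    int[] emptySlice = backing[0 .. 0];   // non-null but empty

    // Null and non-null-but-empty slices give the same answer.
    assert(!hasElements(nullSlice));
    assert(!hasElements(emptySlice));
    assert(hasElements(backing));

    // std.array.empty agrees with the length check in every case.
    assert(nullSlice.empty);
    assert(emptySlice.empty);
    assert(!backing.empty);
}
```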
Mar 25 2013
parent reply "Phil Lavoie" <maidenphil hotmail.com> writes:
On Monday, 25 March 2013 at 17:14:47 UTC, H. S. Teoh wrote:
 On Mon, Mar 25, 2013 at 08:46:42PM +0400, Dmitry Olshansky 
 wrote:
 25-Mar-2013 20:43, Phil Lavoie пишет:
I do believe that, in any case, this form is best:
if( arr !is null && !arr.empty )
Now write that one thousand times and count the typos.
[...] What's wrong with just writing: if (arr.length > 0) ? .length will return 0 both when arr is null and when it's non-null but empty. T
Nothing is wrong with that apparently. I was not aware arr.length tolerated null slices. Does it keep its behavior in unsafe or system mode? Phil
Mar 25 2013
next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Monday, March 25, 2013 18:18:00 Phil Lavoie wrote:
 Nothing is wrong with that apparently. I was not aware arr.length
 tolerated null slices. Does it keep its behavior in unsafe or
 system mode?
If you haven't read this article on arrays, then you should:

http://dlang.org/d-array-article.html

An array in D is basically (though not exactly)

struct A(T)
{
    T* ptr;
    size_t length;
}

Almost nothing cares about if ptr is null. If you're checking length, then it just checks length (which is 0 if ptr is null). If you append to it or set length, then the runtime will look at ptr and allocate or reallocate it if it needs to (or just increase length if it can do so). == does something along the lines of

if(lhs.length != rhs.length)
    return false;
if(lhs.ptr is rhs.ptr)
    return true;
for(size_t i = 0; i < lhs.length; ++i)
{
    if(lhs[i] != rhs[i])
        return false;
}
return true;

So, it doesn't care one whit whether ptr is null or not. Almost nothing cares. About the only thing that cares is the is operator - i.e. arr is null.

However, the problem here is that cast(bool)arr returns whether arr.ptr is null rather than arr.length == 0, which is inconsistent with almost everything else that goes on with arrays, and so it's error-prone.

- Jonathan M Davis
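The comparison outlined above can be turned into a small runnable sketch. This is only an illustration of the idea, not the runtime's actual implementation; slicesEqual is a made-up name.

```d
// Hypothetical re-implementation of the comparison sketched above;
// the real == is provided by the compiler and runtime.
bool slicesEqual(T)(const T[] lhs, const T[] rhs)
{
    if (lhs.length != rhs.length)
        return false;
    if (lhs.ptr is rhs.ptr)
        return true;            // same memory, same length
    foreach (i; 0 .. lhs.length)
    {
        if (lhs[i] != rhs[i])
            return false;
    }
    return true;
}

void main()
{
    int[] nullSlice = null;
    int[] backing = [1, 2, 3];
    int[] emptySlice = backing[0 .. 0];

    // Length is checked first, so a null pointer never matters.
    assert(slicesEqual(nullSlice, emptySlice));
    assert(nullSlice == emptySlice);

    // Equal contents in different memory still compare equal.
    int[] copy = backing.dup;
    assert(slicesEqual(backing, copy));
    assert(backing == copy);
    assert(backing !is copy);

    assert(!slicesEqual(backing, backing[0 .. 2]));
}
```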
Mar 25 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Monday, March 25, 2013 13:32:36 Jonathan M Davis wrote:
 However, the problem here is that cast(bool)arr returns whether
 arr.ptr is null rather than arr.length == 0, which is inconsistent with
 almost everything else that goes on with arrays, and so it's error-prone.
Actually, that should be that returns whether arr !s null rather than arr.length != 0, but you get the idea. - Jonathan M Davis
Mar 25 2013
prev sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Phil Lavoie:

 I was not aware arr.length tolerated null slices. Does it keep 
 its behavior in unsafe or system mode?
That behaviour is always kept. Because the underlying data structure doesn't change. Bye, bearophile
Mar 25 2013
parent "Phil Lavoie" <maidenphil hotmail.com> writes:
On Monday, 25 March 2013 at 17:55:22 UTC, bearophile wrote:
 Phil Lavoie:

 I was not aware arr.length tolerated null slices. Does it 
 keep its behavior in unsafe or system mode?
That behaviour is always kept. Because the underlying data structure doesn't change. Bye, bearophile
Good to know, thanks. Phil
Mar 25 2013
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Phil Lavoie:

 You think it should check for length to be more programmer 
 friendly,
My proposal is to turn "if(dyn_arr)" into a syntax error. Bye, bearophile
Mar 25 2013
parent reply "Phil Lavoie" <maidenphil hotmail.com> writes:
On Monday, 25 March 2013 at 17:05:31 UTC, bearophile wrote:
 Phil Lavoie:

 You think it should check for length to be more programmer 
 friendly,
My proposal is to turn "if(dyn_arr)" into a syntax error. Bye, bearophile
Yeah, but I also read you were in favor of changing its behavior. I'd say that it would make more sense to remove it than to change its behavior. However, testing for null is kinda useful :) Phil
Mar 25 2013
next sibling parent "Phil Lavoie" <maidenphil hotmail.com> writes:
 However, testing for null is kinda useful :)
Scratch that, I thought you were implying something else :)
Mar 25 2013
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 03/25/2013 06:13 PM, Phil Lavoie wrote:
 ...

 However, testing for null is kinda useful :)
 ...
It is currently basically useless for array slices, because relying on it is brittle.
Mar 25 2013
parent reply "Phil Lavoie" <maidenphil hotmail.com> writes:
On Monday, 25 March 2013 at 17:25:13 UTC, Timon Gehr wrote:
 On 03/25/2013 06:13 PM, Phil Lavoie wrote:
 ...

 However, testing for null is kinda useful :)
 ...
It is currently basically useless for array slices, because relying on it is brittle.
Well, since they CAN be null, it is at least useful in contracts programming.

void foo( int[] zeSlice ) in {
     assert( zeSlice !is null, "passing null slice" );
} body {
}

Also, imagine, for some reason, you have that
int[][ string ] mapOfSlices;
...
//initialize all strings, but make sure their corresponding slices are null, because an empty slice has a different meaning.

auto aSlice = mapOfSlices.get( "toto", null );

Still comparing against null, since it has a different meaning, maybe null means not found and empty means found but without value.
Mar 25 2013
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Mar 25, 2013 at 06:34:23PM +0100, Phil Lavoie wrote:
[...]
 Still comparing against null, since it has a different meaning,
 maybe null means not found and empty means found but without value.
If you want to make this distinction, use std.typecons.Nullable. T -- Obviously, some things aren't very obvious.
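A rough sketch of that suggestion, assuming the "not found" versus "found but without value" case from the earlier post. The lookup function and the table contents are hypothetical; only Nullable, isNull and get come from std.typecons.

```d
import std.stdio;
import std.typecons : Nullable;

// Hypothetical lookup: an isNull result means "not found",
// while a found key may legitimately map to an empty slice.
Nullable!(int[]) lookup(int[][string] table, string key) {
     Nullable!(int[]) result;      // starts out in the isNull state
     if (auto p = key in table)
          result = *p;             // found, possibly empty
     return result;
}

void main() {
     int[] noValues;               // empty slice stored as a real entry
     int[][string] table;
     table["toto"] = noValues;     // present, but without values

     auto hit = lookup(table, "toto");
     auto miss = lookup(table, "tata");

     assert(!hit.isNull && hit.get.length == 0);
     assert(miss.isNull);
     writeln("toto found: ", !hit.isNull, ", tata found: ", !miss.isNull);
}
```

The found/not-found state lives in the Nullable wrapper here, so it does not depend on whether the slice's pointer happens to be null.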
Mar 25 2013
parent "Phil Lavoie" <maidenphil hotmail.com> writes:
On Monday, 25 March 2013 at 17:40:27 UTC, H. S. Teoh wrote:
 On Mon, Mar 25, 2013 at 06:34:23PM +0100, Phil Lavoie wrote:
 [...]
 Still comparing against null, since it has a different meaning,
 maybe null means not found and empty means found but without 
 value.
If you want to make this distinction, use std.typecons.Nullable. T
Well, Nullable just acts as if it could be null. Why would I use it when I've got a type that can be null, without overhead? Are you saying that they are planning to remove null for slices? Because if you're not, then I truly don't see what your point is.
Mar 25 2013
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Monday, March 25, 2013 18:34:23 Phil Lavoie wrote:
 On Monday, 25 March 2013 at 17:25:13 UTC, Timon Gehr wrote:
 On 03/25/2013 06:13 PM, Phil Lavoie wrote:
 ...
 
 However, testing for null is kinda useful :)
 ...
It is currently basically useless for array slices, because relying on it is brittle.
Well, since they CAN be null, it is at least useful in contracts programming.

void foo( int[] zeSlice ) in {
     assert( zeSlice !is null, "passing null slice" );
} body {
}
But why would you care? Almost nothing cares about the difference between a null array and an empty array, and unless a function is treating null as something explicitly different from empty (which is generally a bad idea), it really isn't going to care. All of the various array operations will treat them the same.
 Also, imagine, for some reason, you have that
 int[][ string ] mapOfSlices;
 ...
 //initialize all strings, but make sure their corresponding slices
 are null, because an empty slice has a different meaning.
 
 auto aSlice = mapOfSlices.get( "toto", null );
 
 Still comparing against null, since it has a different meaning,
 maybe null means not found and empty means found but without
 value.
In general, relying on a null array and an empty array having different meanings is just begging for trouble. About the only places that I would even consider doing so is a member variable which is explicitly set to null to indicate the lack of a value (and even then, using Nullable might be a good idea for clarity and to reduce the risk of bugs) or a function which returns null to indicate something different from a valid value (like not found). - Jonathan M Davis
Mar 25 2013
parent "Phil Lavoie" <maidenphil hotmail.com> writes:
On Monday, 25 March 2013 at 17:44:42 UTC, Jonathan M Davis wrote:
 On Monday, March 25, 2013 18:34:23 Phil Lavoie wrote:
 On Monday, 25 March 2013 at 17:25:13 UTC, Timon Gehr wrote:
 On 03/25/2013 06:13 PM, Phil Lavoie wrote:
 ...
 
 However, testing for null is kinda useful :)
 ...
It is currently basically useless for array slices, because relying on it is brittle.
Well, since they CAN be null, it is at least useful in contracts programming.

void foo( int[] zeSlice ) in {
     assert( zeSlice !is null, "passing null slice" );
} body {
}
But why would you care? Almost nothing cares about the difference between a null array and an empty array, and unless a function is treating null as something explicitly different from empty (which is generally a bad idea), it really isn't going to care. All of the various array operations will treat them the same.
I wrote it before reading your post on everything regarding length. I realize now it makes less sense, though there is still the possibility of having a special case for nulls, I mean, this is why Nullable exists in the first place, right?

All I'm saying is, there is no point for me to use Nullable for types that I know can be null unless somebody plans on removing null for that type. I'm not sure how much clarity doing var.isNull vs. var is null is going to add for an already nullable type?

Regarding the if(dynamic_array) issue, I am not in favor of keeping it. I just think that it currently does exactly what I expect from it (though I don't use it), and I'm not sure how much changing it would be helpful. I understand that you think it should test for length, I'm just saying this maybe ain't the predictable behavior for everybody.

Phil
Mar 25 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Monday, March 25, 2013 17:43:24 Phil Lavoie wrote:
 I do believe that, in any case, this form is best:
 if( arr !is null && !arr.empty )
That's utterly pointless. empty checks that length == 0, and length == 0 if the array is null. It can be useful to check whether an array is null with the is operator for cases where that's used to indicate that a result wasn't found or something like that, but very little cares about the difference between a null array and an empty one, and attempting to treat them as different tends to be very error-prone. I don't really like how null works with arrays, since it's generally treated as the same as an empty array (including for ==), but that's the way it works in D, and you just have to deal with it. And given how arrays in D work in general, having if(arr) check specifically for null rather than empty is definitely error-prone. - Jonathan M Davis
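A short sketch of the points above, assuming nothing beyond std.array.empty and the built-in array operations. The variable names are illustrative only.

```d
import std.array : empty;
import std.stdio;

void main() {
     auto backing = [1, 2, 3];
     int[] nullArr = null;
     int[] emptyArr = backing[0 .. 0];  // length 0, but pointer is non-null

     // length and empty treat both the same way.
     assert(nullArr.length == 0 && emptyArr.length == 0);
     assert(nullArr.empty && emptyArr.empty);

     // == compares contents, so null and empty compare equal.
     assert(nullArr == emptyArr);

     // Only the is operator tells them apart.
     assert(nullArr is null);
     assert(emptyArr !is null);

     writeln("all checks passed");
}
```

Since empty is already true for a null array, the extra "arr !is null" test in the quoted condition adds nothing.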
Mar 25 2013
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Mar 25, 2013 at 01:15:59PM -0400, Jonathan M Davis wrote:
 On Monday, March 25, 2013 17:43:24 Phil Lavoie wrote:
 I do believe that, in any case, this form is best:
 if( arr !is null && !arr.empty )
That's utterly pointless. empty checks that length == 0, and length == 0 if the array is null. It can be useful to check whether an array is null with the is operator for cases where that's used to indicate that a result wasn't found or something like that, but very little cares about the difference between a null array and an empty one, and attempting to treat them as different tends to be very error-prone. I don't really like how null works with arrays, since it's generally treated as the same as an empty array (including for ==), but that's the way it works in D, and you just have to deal with it. And given how arrays in D work in general, having if(arr) check specifically for null rather than empty is definitely error-prone.
[...]
I think this is all just a storm in a teacup. Just use .length:

import std.stdio;
void main() {
     int[] a;
     writeln(a is null);  // prints true
     writeln(a.length);   // prints 0

     a ~= 1;
     a.length = 0;
     writeln(a is null);  // prints false
     writeln(a.length);   // prints 0
}

So just use "if (a.length > 0)" and you're fine.

The distinction between null and non-null for arrays is, IMO, an implementation-specific detail that should not be relied upon. In my mind, it's the implementation that gets to decide when an array should be null and when it shouldn't. User code had better not be dependent on that kind of detail. If you *really* need to tell the difference, just use Nullable to wrap around the array.

T

-- 
English has the lovely word "defenestrate", meaning "to execute by throwing someone out a window", or more recently "to remove Windows from a computer and replace it with something useful". :-) -- John Cowan
Mar 25 2013
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 03/24/2013 11:10 PM, bearophile wrote:
 A recent discussion in D.learn reminds me of an enhancement request of
 mine that has been sleeping in Bugzilla for years:

 http://d.puremagic.com/issues/show_bug.cgi?id=4733


 The problem is that in D dynamic arrays can be non-null even when they
 are empty:


 import std.stdio;
 int[] foo() {
      auto a = [1];
      return a[0..0];
 }
 void main() {
      auto data = foo();
      if (data)
          writeln("here");
 }


 This is dangerous, so in D the safe and idiomatic way to test for empty
 arrays is to use std.array.empty().

 So my proposal of Issue 4733 is to forbid (with the usual
 warning/deprecation intermediate steps) the use of dynamic arrays in a
 boolean context:


 void main() {
      auto a = [1];
      if (a) {} // error, forbidden.
 }


 So to test empty/null you have to use empty() or "is null":

 import std.array: empty;
 void main() {
      auto a = [1];
      if (a.empty) {} // OK
      if (a is null) {} // OK
 }
 ...
Well, cast(bool)a currently checks a.ptr:

void main(){
     auto x = (cast(void*)null)[0..1];
     assert(x !is null);
     assert(x);
}

Also, IMO null arrays should either be removed or [] should be guaranteed to be non-null.

Maybe cast(bool)a simply shouldn't work. (though, personally, I'd lean towards checking length.)
Mar 25 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Timon Gehr:

 Also, IMO null arrays should either be removed or [] should be 
 guaranteed to be non-null.
This is a separate topic. Associative arrays share related problems. Bye, bearophile
Mar 25 2013
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 03/25/2013 06:11 PM, bearophile wrote:
 Timon Gehr:

 Also, IMO null arrays should either be removed or [] should be
 guaranteed to be non-null.
This is a separate topic.
Not at all. It is very closely related.
 ...
Mar 26 2013
prev sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Monday, March 25, 2013 10:29:46 H. S. Teoh wrote:
 On Mon, Mar 25, 2013 at 01:15:59PM -0400, Jonathan M Davis wrote:
 On Monday, March 25, 2013 17:43:24 Phil Lavoie wrote:
 I do believe that, in any case, this form is best:
 if( arr !is null && !arr.empty )
That's utterly pointless. empty checks that length == 0, and length == 0 if the array is null. It can be useful to check whether an array is null with the is operator for cases where that's used to indicate that a result wasn't found or something like that, but very little cares about the difference between a null array and an empty one, and attempting to treat them as different tends to be very error-prone. I don't really like how null works with arrays, since it's generally treated as the same as an empty array (including for ==), but that's the way it works in D, and you just have to deal with it. And given how arrays in D work in general, having if(arr) check specifically for null rather than empty is definitely error-prone.
[...]
I think this is all just a storm in a teacup. Just use .length:

import std.stdio;
void main() {
     int[] a;
     writeln(a is null);  // prints true
     writeln(a.length);   // prints 0

     a ~= 1;
     a.length = 0;
     writeln(a is null);  // prints false
     writeln(a.length);   // prints 0
}

So just use "if (a.length > 0)" and you're fine.

The distinction between null and non-null for arrays is, IMO, an implementation-specific detail that should not be relied upon.
You can rely on null being null, and you can rely on any array that you've explicitly set to null being null as long as no other mutating operations are used on it. That's guaranteed.
 In my
 mind, it's the implementation that gets to decide when an array should
 be null and when it shouldn't. User code had better not be dependent on
 that kind of detail. If you *really* need to tell the difference, just
 use Nullable to wrap around the array.
It can be valuable to have a function which returns an array specifically return null to indicate something, and I believe that Phobos does this in some places. But that's about the only place that it's safe to rely on an array being null, as it returned null explicitly. It's relying on an array being null when it was not explicitly set to null which doesn't work. - Jonathan M Davis
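A minimal sketch of that pattern, not taken from Phobos: afterMarker is a hypothetical helper that returns null explicitly for "not found" and a (possibly empty) slice of its input otherwise.

```d
import std.stdio;

// Hypothetical helper: returns the elements after the first marker,
// or an explicit null when the marker does not occur at all.
int[] afterMarker(int[] data, int marker) {
     foreach (i, x; data)
          if (x == marker)
               return data[i + 1 .. $];  // may be empty, still points into data
     return null;                        // explicit "not found"
}

void main() {
     auto data = [1, 2, 3];
     auto found = afterMarker(data, 3);     // marker is last: empty, non-null
     auto missing = afterMarker(data, 42);  // never found: explicit null

     assert(found !is null && found.length == 0);
     assert(missing is null && missing.length == 0);

     // Once the array is mutated, the null state is gone.
     missing ~= 1;
     assert(missing !is null);

     writeln("found.length: ", found.length, ", missing.length: ", missing.length);
}
```

The null check is only meaningful directly on the returned value; after an append or similar mutation it no longer says anything about the original result.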
Mar 25 2013