digitalmars.D.learn - The Nullity Of strings and Its Meaning

kdevel <kdevel vogtner.de> writes:

Yesterday I noticed that std.uri.decodeComponent does not 'preserve' the
nullity of its argument:

    1 void main ()
    2 {
    3    import std.uri;
    4    string s = null;
    5    assert (s is null);
    6    assert (s.decodeComponent);
    7 }

The assertion in line 6 fails. This failure gave rise to a more general
investigation on strings. After some research I found that one
"cannot implicitly convert expression (s) of type string to bool" as in

    void main ()
    {
       string s;
       bool b = s;
    }

Nonetheless, in certain boolean contexts strings convert to bool, as here:

    void main ()
    {
       import std.stdio;
       string s; // equivalent to s = null
       writeln (s ? true : false);
       s = "";
       writeln (s ? true : false);
    }

The code prints

    false
    true

to the console. This led me to the insight that in D there are two
distinct kinds of empty strings: Those having a ptr which is null and
the other. It seems that this ptr nullity not only determines whether
the string compares equal to null in an IdentityExpression [1] but also
the result of the above-mentioned conversion in the boolean context.

I wonder if this distinction is meaningful and---if not---why it is
exposed to the application programmer so prominently.

Then today I found this piece of code

    void main ()
    {
       string s = null;
       string t = "";
       assert (s is t);
    }

which, according to the wording in [1]

   "For static and dynamic arrays, identity is defined as referring to
    the same array elements and the same number of elements."

shall succeed but its assertion fails [2]. I anticipate the
implementation compares the ptrs even in the case of zero elements.

A last example of 'deviant behavior' I found is this:

    import std.stdio;
    import std.file;
    void main ()
    {
       string s = null;
       try
          mkdir (s);
       catch (Exception e)
          e.msg.writeln;

       s = "";
       try
          mkdir (s);
       catch (Exception e)
          e.msg.writeln;
    }

Using DMD v2.073.2 the first expression terminates the program with a
segmentation fault. With 2.074.1 the program prints

    : Bad address
    : No such file or directory

I find that a bit confusing.

[1] https://dlang.org/spec/expression.html#identity_expressions
[2] https://issues.dlang.org/show_bug.cgi?id=17623

Jul 08 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 08.07.2017 19:16, kdevel wrote:
 
 I wonder if this distinction is meaningful

Not nearly as much as it would need to be to justify the current 
behavior. It's mostly a historical accident.

 and---if not---why it is
 exposed to the application programmer so prominently.

I don't think there is a good reason except backwards-compatibility.
Also see: https://github.com/dlang/dmd/pull/4623
(This is the pull request that restored the bad behaviour after it had 
been fixed.)

Jul 08 2017

ag0aep6g <anonymous example.com> writes:

On 07/08/2017 07:16 PM, kdevel wrote:

 The assertion in line 6 fails. This failure gave rise to a more general
 investigation on strings. After some research I found that one
 "cannot implicitly convert expression (s) of type string to bool" as in

[...]
 Nonetheless in certain boolean contexts strings convert to bool as here:
 
     void main ()
     {
        import std.stdio;
        string s; // equivalent to s = null
        writeln (s ? true : false);
        s = "";
        writeln (s ? true : false);
     }

Yeah, that's considered "explicit". Also happens with `if (s)`.

 The code prints
 
     false
     true
 
 to the console. This led me to the insight that in D there are two
 distinct kinds of empty strings: Those having a ptr which is null and
 the other. It seems that this ptr nullity not only determines whether
 the string compares equal to null in an IdentityExpression [1] but also
 the result of the above mentioned conversion in the boolean context.

Yup. Though I'd say the distinction is null vs every other array, not 
null vs other empty arrays.

null is one specific array. It happens to be empty, but that doesn't 
really matter. `foo is null` compares with the null array. It doesn't 
check for emptiness. Conversion to bool also compares with null. The 
concept of emptiness is unrelated.

Maybe detecting empty arrays would be more useful. As far as I know, 
there's no killer argument either way. Changing it now would break code, 
of course.

Personally, I wouldn't mind if those conversions to bool just went away. 
It's not obvious what exactly is being checked, and it's not hard to be 
explicit about it with .ptr and/or .length. But as Timon notes, that has 
been attempted, and it broke code. So it was reverted, and that's that.

 I wonder if this distinction is meaningful and---if not---why it is
 exposed to the application programmer so prominently.

"Prominently"? It only shows up when you convert to bool. You only get 
surprised if you expect that to check for emptiness (or something else 
entirely). And you don't really have a reason to expect that. You can 
easily avoid the issue by being more explicit in your code (`arr.ptr is 
null`, `arr.length == 0`/`arr.empty`).
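
A minimal sketch of what "being explicit" can look like. The helper names `hasNullPtr` and `hasNoElements` are made up for this illustration and are not part of Phobos; the point is only that each check names the property it tests, which `if (arr)` does not.

```d
// Hypothetical helpers: each one names the property being tested.
bool hasNullPtr(const(char)[] s)    { return s.ptr is null; }
bool hasNoElements(const(char)[] s) { return s.length == 0; }

void main()
{
    // The null array: no storage and no elements.
    assert(hasNullPtr(null) && hasNoElements(null));

    // The empty literal: it has storage, but no elements.
    assert(!hasNullPtr("") && hasNoElements(""));

    // A non-empty string: storage and elements.
    assert(!hasNullPtr("abc") && !hasNoElements("abc"));
}
```

Which of the two checks is the right one depends on the question being asked; for "is there anything in it?" the length check (or `.empty`) is the one that does not depend on where the array came from.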

 Then today I found this piece of code
 
     void main ()
     {
        string s = null;
        string t = "";
        assert (s is t);
     }
 
 which, according to the wording in [1]
 
    "For static and dynamic arrays, identity is defined as referring to
     the same array elements and the same number of elements."
 
 shall succeed but its assertion fails [2]. I anticipate the
 implementation compares the ptrs even in the case of zero elements.

The spec isn't very clear there. What does "the same array elements" 
mean for empty arrays? Can two arrays refer to "the same array elements" 
but have different lengths? It seems like "referring to the same array 
elements" is supposed to mean "having the same value in .ptr" without 
mentioning .ptr.

The implementation obviously compares .ptr and .length.
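
A small sketch of that behaviour, using slices of one string so that the `.ptr` and `.length` values are easy to predict. It only demonstrates the observable results of `is`; it makes no claim about how the compiler implements the comparison internally.

```d
void main()
{
    string a = "hello";

    // Same .ptr and same .length: identical.
    assert(a[0 .. 2] is a[0 .. 2]);

    // Same .ptr but different .length: not identical.
    assert(a[0 .. 2] !is a[0 .. 3]);

    // Both empty, but the .ptr values differ: not identical.
    assert(a[0 .. 0] !is a[1 .. 1]);

    // An empty slice of existing data is not the null array either.
    string n = null;
    assert(n !is a[0 .. 0]);
    assert(n is null);
}
```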

 A last example of 'deviant behavior' I found is this:
 
     import std.stdio;
     import std.file;
     void main ()
     {
        string s = null;
        try
           mkdir (s);
        catch (Exception e)
           e.msg.writeln;

        s = "";
        try
           mkdir (s);
        catch (Exception e)
           e.msg.writeln;
     }
 
 Using DMD v2.073.2 the first expression terminates the program with a
 segmentation fault. With 2.074.1 the program prints
 
     : Bad address
     : No such file or directory
 
 I find that a bit confusing.

That looks like a bug/oddity in mkdir. null is as valid a string as "". 
It shouldn't give a worse exception message.

But the message for `""` isn't exactly good, either. Of course the 
directory doesn't exist, yet; I'm trying to create it!

Jul 08 2017

kdevel <kdevel vogtner.de> writes:

Just saw that my first example was wrong, it should read

    void main ()
    {
       import std.uri;
       string a = "";
       assert (a);
       auto s = a.decodeComponent;
       assert (s);
    }

The non-nullity was not preserved. Only the second assert fails.

On Saturday, 8 July 2017 at 18:39:47 UTC, ag0aep6g wrote:
 On 07/08/2017 07:16 PM, kdevel wrote:

 null is one specific array. It happens to be empty, but that 
 doesn't really matter. `foo is null` compares with the null 
 array. It doesn't check for emptiness. Conversion to bool also 
 compares with null. The concept of emptiness is unrelated.

But why? What is the intended use of converting a string (or any 
other dynamic array) to bool?

In Issue 17623 Vladimir pointed out that, in the case of strings, 
there may be a need to store an empty D string which is also a 
NUL-terminated C string. For that it would be sufficient if the 
ptr value were converted, to check whether there is a valid region 
of memory containing the NUL byte.

Moreover everything I've written about strings is also valid for 
e.g. dynamic arrays of doubles. Here there are also two different 
kinds of empty arrays which compare equal but are not identical. 
I see no purpose for that.

 I wonder if this distinction is meaningful and---if not---why 
 it is
 exposed to the application programmer so prominently.

 "Prominently"? It only shows up when you convert to bool.

The conversion to bool (in a bool context) is part of the 
interface of the type. The interface of a type *is* prominently 
exposed.

 You only get surprised if you expect that to check for 
 emptiness (or something else entirely).

As mentioned, I was surprised that the non-nullity did not pass 
through decodeComponent.

 The spec isn't very clear there. What does "the same array 
 elements" mean for empty arrays?

Mathematically that's easily answered: 
https://en.wikipedia.org/wiki/Universal_quantification#The_empty_set

(mkdir)
 Using DMD v2.073.2 the first expression terminates the program with a
 segmentation fault. With 2.074.1 the program prints
 
     : Bad address
     : No such file or directory
 
 I find that a bit confusing.

 That looks like a bug/oddity in mkdir. null is as valid a 
 string as "". It shouldn't give a worse exception message.

 But the message for `""` isn't exactly good, either. Of course 
 the directory doesn't exist, yet; I'm trying to create it!

I would expect the same error message (ENOENT) in both cases. The 
EFAULT in the first case occurs if you invoke POSIX mkdir with NULL 
as its first argument.

Jul 08 2017

ag0aep6g <anonymous example.com> writes:

On 07/09/2017 01:12 AM, kdevel wrote:
 On Saturday, 8 July 2017 at 18:39:47 UTC, ag0aep6g wrote:
 On 07/08/2017 07:16 PM, kdevel wrote:

 
 null is one specific array. It happens to be empty, but that doesn't 
 really matter. `foo is null` compares with the null array. It doesn't 
 check for emptiness. Conversion to bool also compares with null. The 
 concept of emptiness is unrelated.

 
 But why? What is the intended use of converting a string (or any other 
 dynamic array) to bool?

As I said: I wouldn't mind if it went away. I don't see a strong use 
case that justifies the non-obvious behavior of `if (arr)`. But 
apparently it is being used, and breaking code is a no-no.

As for how it's used, I'd start digging at the link Timon has posted.

 In Issue 17623 Vladimir pointed out, that in the case of strings there 
 may be a need to store an empty D-string which also is a NUL-terminated 
 C-String. It would be sufficient if the ptr-Value would convert for 
 checking if there is a valid part of memory containing the NUL byte.

But just looking at .ptr doesn't tell if there's a '\0'. You'd have to 
dereference the pointer too.

And that's not what Vladimir is getting at. Issue 17623 is about `arr1 
is arr2`, not about conversions to bool like `if (arr)`. It makes sense 
that `null !is ""`. They're not "the same". One place where the 
difference matters is when working with C strings.
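
A sketch of the C-string case, under two assumptions: that D string literals are followed by a NUL byte (so `"".ptr` points at a valid, empty C string), and that `std.string.toStringz` returns a NUL-terminated pointer for any D string, including a null one. The example does not call mkdir; it only looks at the pointers.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

void main()
{
    string n = null;
    string e = "";

    // The empty literal has storage: its .ptr is a valid, empty C string.
    assert(e.ptr !is null);
    assert(strlen(e.ptr) == 0);

    // The null string has no storage; its .ptr must not be handed to C.
    assert(n.ptr is null);

    // toStringz yields a usable C string for either one.
    assert(strlen(n.toStringz) == 0);
    assert(strlen(e.toStringz) == 0);
}
```

Seen from the D side, both `n` and `e` are valid strings of length 0; the difference only becomes visible once `.ptr` is passed across the C boundary.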

Issue 17623 is absolutely valid. But it's much more likely that the spec 
will be changed rather than the implementation.

 Moreover everything I've written about strings is also valid for e.g. 
 dynamic arrays of doubles. Here there are also two different kinds of 
 empty arrays which compare equal but are not identical. I see no purpose 
 for that.

So you'd make `arr1 is arr2` true when they're empty, ignoring a 
difference in pointers. Otherwise, it would still compare pointers. Right?

I don't think that's a good idea, simply because it's a special case.

I noticed that you haven't mentioned `==`. You're probably aware of it, 
but if not we might be talking past each other. So, just to be clear: 
You can also compare arrays with `==` which compares elements. `null == 
""` is true.

 You only get surprised if you expect that to check for emptiness (or 
 something else entirely).

 
 As mentioned I was surprised, that the non-nullity did not pass thru 
 decodeComponent.

decodeComponent doesn't seem to return the same (identical) string you 
pass it, most of the time. Try "foo":

----
void main()
{
     import std.uri;
     string a = "foo";
     auto s = a.decodeComponent;
     assert(s == a); /* passes */
     assert(s is a); /* fails */
}
----

decodeComponent simply gives no promise of preserving pointers. You also 
shouldn't rely on it returning null for a null input, even when it 
currently does that.
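
A sketch of checks that hold regardless of whether decodeComponent hands back null or a non-null empty string. It assumes nothing about the result beyond its contents.

```d
import std.range.primitives : empty;
import std.uri : decodeComponent;

void main()
{
    string a = "";
    auto s = a.decodeComponent;

    // Robust: these depend only on the contents of the result.
    assert(s.empty);
    assert(s == a);

    // Fragile: this depends on an undocumented implementation detail,
    // so it is left commented out.
    // assert(s !is null);
}
```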

 The spec isn't very clear there. What does "the same array elements" 
 mean for empty arrays?

 
 Mathematically that's easily answered: 
 https://en.wikipedia.org/wiki/Universal_quantification#The_empty_set

So "two empty arrays refer to the same elements" is true because 
everything said about the elements of empty arrays is true? Is "two 
empty arrays do *not* refer to the same elements" also true?

Jul 09 2017

kdevel <kdevel vogtner.de> writes:

On Sunday, 9 July 2017 at 10:32:23 UTC, ag0aep6g wrote:
 On 07/09/2017 01:12 AM, kdevel wrote:
 On Saturday, 8 July 2017 at 18:39:47 UTC, ag0aep6g wrote:
 On 07/08/2017 07:16 PM, kdevel wrote:



[...]

 Moreover everything I've written about strings is also valid 
 for e.g. dynamic arrays of doubles. Here there are also two 
 different kinds of empty arrays which compare equal but are 
 not identical. I see no purpose for that.

 So you'd make `arr1 is arr2` true when they're empty, ignoring 
 a difference in pointers. Otherwise, it would still compare 
 pointers. Right?

As a D novice I am not in a position to suggest changes to the 
language (yet).
I would appreciate documentation that accurately represents 
what is implemented.

 I don't think that's a good idea, simply because it's a special 
 case.

 I noticed that you haven't mentioned `==`. You're probably 
 aware of it, but if not we might be talking past each other. 
 So, just to be clear: You can also compare arrays with `==` 
 which compares elements. `null == ""` is true.

As mentioned in the subject, my posting is about the state of 
affairs wrt. the (non-)nullity of strings. In C/C++, once a char * 
variable has become non-NULL, 'it' never loses this property. In D 
this is not the case: the non-null value "" 'becomes' null in

    "".decodeComponent

 You only get surprised if you expect that to check for 
 emptiness (or something else entirely).

 
 As mentioned I was surprised, that the non-nullity did not 
 pass thru decodeComponent.

 decodeComponent doesn't seem to return the same (identical) 
 string you pass it, most of the time.

Sure. But I am writing about the string value which comprises the 
(non-)nullity of the string. This is not preserved.

[...]

 decodeComponent simply gives no promise of preserving pointers.

string is not a pointer but a type. To the user of string it is 
completely irrelevant, if the nullity of the string is 
implemented by referring to a pointer inside the implementation 
of string.

 You also shouldn't rely on it returning null for a null input, 
 even when it currently does that.

I assumed that a non-null string is returned for a non-null input.

 The spec isn't very clear there. What does "the same array 
 elements" mean for empty arrays?

 
 Mathematically that's easily answered: 
 https://en.wikipedia.org/wiki/Universal_quantification#The_empty_set

 So "two empty arrays refer to the same elements" is true 
 because everything said about the elements of empty arrays is 
 true? Is "two empty arrays do *not* refer to the same elements" 
 also true?

Yes. But that second proposition what not the one chosen in the 
documentation. It was not chosen because it does not extend to 
the nontrivial case where one has more than zero elements. ;-)

Stefan

Jul 09 2017

kdevel <kdevel vogtner.de> writes:

On Sunday, 9 July 2017 at 13:51:44 UTC, kdevel wrote:

 But that second proposition what not the one chosen in the 
 documentation.

Shall read: "But that second predicate was not the one chosen in 
the documentation."

Jul 09 2017

ag0aep6g <anonymous example.com> writes:

On 07/09/2017 03:51 PM, kdevel wrote:
 On Sunday, 9 July 2017 at 10:32:23 UTC, ag0aep6g wrote:

[...]
 As mentioned in the subject my posting is about the state of affairs 
 wrt. the (non-)nullity of strings. In C/C++ once a char * variable 
 became non-NULL 'it' never loses this property. In D this is not the 
 case: The non-null value ""
 'becomes' null in
 
     "".decodeComponent

Nullity of D strings is quite different from nullity of C strings. A 
null D string is a valid string with length 0. A null char* is not a 
proper C string. It doesn't have length 0. It has no length.

A C function can't return a null char* when it's supposed to return an 
empty string. But a D function can return a null char[] in that case.

[...]
 Sure. But I am writing about the string value which comprises the 
 (non-)nullity of the string. This is not preserved.

Just like other pointers are not preserved. In the .ptr field of a D 
array, a null pointer isn't special. Null arrays aren't special beyond 
having a unique name.

[...]
 string is not a pointer but a type. To the user of string it is 
 completely irrelevant, if the nullity of the string is implemented by 
 referring to a pointer inside the implementation of string.

string is a type that involves a pointer. The type is not opaque. The 
user can access the pointer.

A null array is not some magic (invalid) value. It's just the one 
that has a null .ptr and a zero .length.

I think that's widely known, but it might not actually be in the spec. 
At least, I can't find it. The page on arrays [1] just says that 
"`.init` returns `null`" and that "pointers are initialized to `null`", 
without saying what null means for arrays. On the `null` expression [2], 
the spec mentions a "null value" of arrays, but again doesn't say what 
that means.

 You also shouldn't rely on it returning null for a null input, even 
 when it currently does that.

 
 I assumed that a non-null string is returned for a non-null input.

As far as I see, you had no reason to assume that. If the spec or some 
other document misled you, it needs fixing.

[...]
 Yes. But that second proposition what not the one chosen in the 
 documentation. It was not chosen because it does not extend to the 
 nontrivial case where one has more than zero elements. ;-)

Or the spec's just poorly written there, and wasn't meant the way you've 
interpreted it.


[1] https://dlang.org/spec/arrays.html
[2] https://dlang.org/spec/expression.html#null

Jul 09 2017

kdevel <kdevel vogtner.de> writes:

On Sunday, 9 July 2017 at 15:10:56 UTC, ag0aep6g wrote:
 On 07/09/2017 03:51 PM, kdevel wrote:
 On Sunday, 9 July 2017 at 10:32:23 UTC, ag0aep6g wrote:


[...]

 A null char* is not a proper C string.

A C string is a sequence of storage units containing legitimate 
character values of which the last one is the NUL character.

 It doesn't have length 0. It has no length.

In C, a NULL pointer does not refer to anything, and in particular 
not to a C string.

 A C function can't return a null char* when it's supposed to 
 return an empty string.

That is true. And this is what misled me into thinking that D 
behaves the same.

 But a D function can return a null char[] in that case.

Yes, it can obviously return one of the two representations of 
the empty D string.

Stefan

Jul 09 2017

ag0aep6g <anonymous example.com> writes:

On Sunday, 9 July 2017 at 18:55:51 UTC, kdevel wrote:
 Yes, it can obviously return one of the two representations of 
 the empty D string.

The two being null and ""? There are more than those. One for 
every possible .ptr value.

Jul 09 2017

Jonathan M Davis via Digitalmars-d-learn writes:

On Sunday, July 9, 2017 1:51:44 PM MDT kdevel via Digitalmars-d-learn wrote:
 You also shouldn't rely on it returning null for a null input,
 even when it currently does that.

 I assumed that a non-null string is returned for a non-null input.

There are going to be functions that return null rather than an empty slice
of the original array. You really can't rely on getting an empty array
instead of a null one from a function unless the documentation tells you
that. For most purposes, there is no practical difference between a null
array and an empty array, so very little code is written which cares about
the difference. The only place where I would expect a function in a library
to distinguish is if its documentation says that it does (e.g. if returning
null means something specific or if it specifically says that the result is
a slice of the input).

In general, relying on whether a dynamic array is null or not outside of
code that you control or functions that are explicit about what they do with
null is risky business.

Sometimes, I wish that null were not treated as empty, and you were forced
to allocate a new array or somesuch rather than having null arrays just work
- then you could actually rely on stuff being null or not - but that would
also result in a lot more segfaults when people screwed up. The status quo
works surprisingly well overall. It just makes it dangerous to do much with
distinguishing null arrays from empty ones.

- Jonathan M Davis

Jul 09 2017

Jonathan M Davis via Digitalmars-d-learn writes:

On Saturday, July 8, 2017 5:16:51 PM MDT kdevel via Digitalmars-d-learn 
wrote:
 [...]

A dynamic array in D is essentially

struct DynamicArray(T)
{
    size_t length;
    T* ptr;
}

That's not _exactly_ what it is at the moment (it actually does stuff with
void* rather than templates unfortunately), but essentially, that's what it
is and what it behaves like.

In the case of dynamic arrays, null is a dynamic array whose ptr is null
and whose length is 0.

The empty property for arrays checks whether the length of the array is 0.
So, any array with a length of 0 (regardless of its ptr) is considered
empty.

The is expression checks for bitwise equality. So,

arr is null

checks for whether the array has a null ptr and a 0 length. In _most_
circumstances, that's equivalent to checking that the array's ptr is null,
but if you do something screwy with uninitialized memory, then you could end
up with a ptr value of null and a non-zero length, and

arr is null

would be false. The == expression, on the other hand, checks that the
elements are equal. So, it does something similar to

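// Illustrative name only; this is not the actual runtime function.
bool arraysEqual(T)(const T[] lhs, const T[] rhs)
{
    if(lhs.length != rhs.length)
        return false;
    for(size_t i = 0; i < lhs.length; ++i)
    {
        if(lhs.ptr[i] != rhs.ptr[i])
            return false;
    }
    return true;
}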

So, if the lengths are 0, no iterating happens, and the two arrays are
considered equal. This means that a null array is equal to any other empty
array, regardless of the value of ptr. It's also why I would consider

arr == null

to be a code smell. IMHO, if you want to check for empty, then you should
use the empty property or check length directly, since those are clear about
your intent, whereas with

arr == null

you always have the question of whether they should have used an is
expression or whether they were simply checking for an empty array.
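
The three spellings side by side, as a small sketch. Nothing here is specific to strings; the `double[]` part at the end shows the same results for another element type.

```d
import std.range.primitives : empty;

void main()
{
    string n = null;
    string e = "";

    assert(n == e);     // equal: both have length 0
    assert(n !is e);    // not identical: the ptr values differ

    assert(e == null);  // passes for any empty array, hence ambiguous
    assert(e !is null); // the is expression tells them apart
    assert(n.empty && e.empty); // states "no elements" directly

    // The same holds for other element types.
    double[] d = [1.0, 2.0];
    double[] z = d[0 .. 0]; // empty, with a non-null ptr
    double[] x = null;
    assert(x == z && x !is z);
}
```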

If you understand all of this, it is perfectly possible to write code which
treats null arrays as distinct from empty arrays. However, it's _very_ easy
to get into a situation where you have an empty array rather than a null
one. Pretty much as soon as you do anything to a null array other than pass
it around or compare it, trusting that it's still null can get error-prone.
And that's why a number of folks think that it's just plain error-prone to
try and treat null arrays as special - but some folks who understand the
issues continue to do so anyway, because they know enough to make it work
and consider the distinction valuable.

Personally, I think that it can make sense to have a function explicitly
return null to indicate something, but beyond that, I'd actually consider
using std.typecons.Nullable to make the whole thing clear, even if it is a
bit dumb to have to wrap a nullable type in a Nullable to treat it as null.
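
A minimal sketch of the Nullable approach. The function `lookup` and its keys are invented for this example; only `Nullable`, `nullable`, `isNull` and `get` come from std.typecons. The idea is that "absent" and "present but empty" are kept apart by the wrapper rather than by the array's ptr.

```d
import std.typecons : Nullable, nullable;

// Hypothetical lookup: unknown keys are absent, while a known key
// may carry an empty value.
Nullable!string lookup(string key)
{
    if (key == "name")
        return nullable("");     // present, but empty
    return Nullable!string.init; // absent
}

void main()
{
    auto present = lookup("name");
    assert(!present.isNull);
    assert(present.get == "");

    auto absent = lookup("missing");
    assert(absent.isNull);
}
```

With this, it no longer matters whether a library function in between turns "" into null or the other way round; the absent state survives either way.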

As for conversions to bool, not much implicitly converts to bool - dynamic
arrays included. However, conditional expressions in if statements, loops,
ternary expressions, and assertions actually insert an invisible, explicit
cast. So, even though the conversion _looks_ implicit, it's actually
explicit. So,

if(cond)
{
}

is actually

if(cast(bool)cond)
{
}

For user-defined types, that means that the way to affect how they're
treated in conditional expressions is to overload opCast for bool. For
built-in types, the result depends on how it was decided that casting
that type to bool would work. For pointers,

cast(bool)ptr

becomes

ptr !is null

which makes a lot of sense. Unfortunately, because dynamic arrays were just
pointers in C, D has historically treated dynamic arrays as pointers under
certain circumstances and implicitly converted them to the value of their ptr
property. Fortunately, in many cases, that has been fixed, and the compiler
has gotten stricter. Unfortunately, however, it is still the case that
casting a dynamic array to bool checks its ptr value for null. This works
fine if you know what you're doing but is frequently surprising to folks
and is arguably error-prone. It _was_ temporarily fixed at one point by
deprecating using arrays in conditional expressions, but some major D
contributors (Andrei included) who understood how to correctly treat null
dynamic arrays as special did not like the change, and it was reverted.
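
A sketch that puts the three cases next to each other. The struct `Counter` is invented for this example; for the array part, the ternary form from the start of the thread is used instead of an explicit cast.

```d
// Hypothetical user-defined type: "true" when its count is non-zero.
struct Counter
{
    int count;
    bool opCast(T : bool)() const { return count != 0; }
}

void main()
{
    // User-defined type: conditions go through opCast!bool.
    assert(!cast(bool) Counter(0));
    bool taken = false;
    if (Counter(3))
        taken = true;
    assert(taken);

    // Pointer: the condition is true when the pointer is not null.
    int v;
    int* p = &v;
    assert(p ? true : false);

    // Dynamic array: the condition reflects nullity, not emptiness.
    string n = null;
    string e = "";
    assert(!(n ? true : false));
    assert(e ? true : false); // empty, yet "true"
}
```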

So, basically, you should be _very_ wary of ever using a dynamic array in a
conditional expression directly. If you know what you're doing, it can be
done correctly, but it's error-prone, and it's arguably a code smell,
because folks reading your code don't necessarily know that you know what
you're doing well enough to get it right.

- Jonathan M Davis

Jul 08 2017

kdevel <kdevel vogtner.de> writes:

On Saturday, 8 July 2017 at 23:12:15 UTC, Jonathan M Davis wrote:
 On Saturday, July 8, 2017 5:16:51 PM MDT kdevel via 
 Digitalmars-d-learn wrote:

[...]

 IMHO, if you want to check for empty, then you should use the 
 empty property or check length directly, since those are clear 
 about your intent, whereas with

My starting point wasn't checking for emptiness but the question of 
whether I can use the two additional states (string var is null or 
!is null) of a string variable to indicate whether a value is absent.

 If you understand all of this, it is perfectly possible to 
 write code which treats null arrays as distinct from empty 
 arrays. However, it's _very_ easy to get into a situation where 
 you have an empty array rather than a null one.

My case was: I get a null one from

    "".decodeComponent

where I did not expect it. (cf. my corrected example in my post 
"13 hours ago", i.e. Saturday, 08 July 2017, 23:12:20 +00:00).

 Pretty much as soon as you do anything to a null array other 
 than pass it around or compare it, trusting that it's still 
 null can get error-prone.

It's the other way round. I was assuming that it is still not 
null (My example in my first post was wrong).

[...]

 Personally, I think that it can make sense to have a function 
 explicitly return null to indicate something, but beyond that, 
 I'd actually consider using std.typecons.Nullable to make the 
 whole thing clear, even if it is a bit dumb to have to wrap a 
 nullable type in a Nullable to treat it as null.

You hit the nail on the head.

Stefan

Jul 09 2017
