digitalmars.D.learn - "" gives an empty string, while "".idup gives null

simendsjo <simendsjo gmail.com> writes:

void main() {
     assert(is(typeof("") == typeof("".idup))); // both are immutable(char)[]

     assert(""      !is null);
     assert("".idup !is null); // fails - the idup result is null. Why?
}

Aug 03 2011

bearophile <bearophileHUGS lycos.com> writes:

I think someone has even suggested to statically forbid "is null" on strings :-)

Bye,
bearophile

Aug 03 2011

simendsjo <simendsjo gmail.com> writes:

How should I test for null if not with "is null"? There is a difference 
between null and empty, and avoiding this is not necessarily easy or 
even wanted.
I couldn't find anything in the specification stating this difference.
So... Is it a bug?

Aug 03 2011

Mike Parker <aldacron gmail.com> writes:

This is apparently a bug. Somehow, the idup is clobbering the pointer. 
You can see it more clearly here:

void main()
{
	assert("".ptr);
	
	auto s = "".idup;
	assert(s.ptr); // boom!
}

Aug 03 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

I don't know if it's a bug or not. The string _was_ duped. assert(s == "") 
passes. So, as far as equality goes, they're equal, and they don't point to 
the same memory. Now, you'd think that the new string would be just empty 
rather than null, but whether it's a bug or not depends exactly on what dup 
and idup are supposed to do with regards to null. It's probably just a side 
effect of how dup and idup are implemented rather than it being planned one way 
or the other. I don't know if it matters or not though. In general, I don't 
like the conflation of null and empty, but in this particular case, you _do_ 
get a string which is equal to the original and which doesn't point to the 
same memory. So, I don't know whether this should be considered a bug or not. 
It depends on what dup and idup are ultimately supposed to do.
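
A minimal sketch of the distinction above, assuming a current D compiler: `==` compares contents, while `is` compares pointer and length. The identity results are printed rather than asserted, because they depend on how the runtime handles idup on an empty array. The variable names are illustrative only.

```d
import std.stdio : writeln;

void main()
{
    string original = "";
    string copy = original.idup;

    // Value equality: both strings have zero elements, so they compare equal.
    assert(copy == original);
    assert(copy.length == 0);

    // Identity compares pointer and length, so it can disagree with `==`.
    // The outcome is implementation-dependent, so it is only printed here.
    writeln("copy is original: ", copy is original);
    writeln("copy is null:     ", copy is null);
}
```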

- Jonathan M Davis

Aug 03 2011

simendsjo <simendsjo gmail.com> writes:

I would think it's a bug, but strings don't quite behave like regular 
references anyway...
But why should dup/idup change the semantics of the array?

void main() {
     // A null string or empty string works as expected
     string s1;
     assert(s1           is  null);
     assert(s1.ptr       is  null);
     assert(s1           ==  ""); // We can check for empty even if it's null,
                                  // and it's equal to ""
     assert(s1.length    ==  0);  // ...and length even if it's null
     s1 = "";
     assert(s1           !is null);
     assert(s1.ptr       !is null);
     assert(s1.length    ==  0);
     assert(s1           ==  "");

     // the same applies to null mutable arrays
     char[] s2;
     assert(s2           is  null);
     assert(s2.ptr       is  null);
     assert(s2           ==  "");
     assert(s2.length    ==  0);
     // but with .dup/.idup things are different!
     s2 = "".dup;
     //assert(s2           !is null); // fails
     //assert(s2.ptr       !is null); // fails
     assert(s2.length    ==  0); // but... s2 is null..?
     assert(s2           ==  "");
     assert(s2           ==  s1);
}

Aug 03 2011

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

If you look at the spec ( http://d-programming-language.org/arrays.html ), it 
says:

dup: Create a dynamic array of the same size and copy the contents of the 
array into it.

idup: Create a dynamic array of the same size and copy the contents of the 
array into it. The copy is typed as being immutable. D 2.0 only


This is _exactly_ what dup and idup are doing. You get a new array with the 
exact same size and contents. null doesn't factor into it at all. So, per the 
spec, there's no bug here at all. dup and idup promise _nothing_ with regards 
to null.

It may be that it would be better if dup and idup returned an array which was 
null if the original was null, and that would also be within the spec, but 
what dup and idup do at the moment _does_ follow the spec.

So, feel free to file a bug report on it. Maybe it'll get changed, but the 
current behavior follows the spec. And given how arrays don't generally treat 
empty and null as being different, I wouldn't really expect an array to stay 
null if you do _anything_ to it other than simply pass it around or check its 
value. In this case, you're creating a new array, and D just doesn't generally 
care about null vs empty when it comes to arrays. I wouldn't argue that that's 
a good thing (because I don't really think that it is), but because of that, 
you can't really expect much to treat null and empty as being different. And 
in this particular case, it's not only debatable as to whether it matters, but 
the current behavior is completely within the spec.
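
As a rough sketch of what the quoted spec text does promise (same size, same contents, and nothing about null), the following checks only those properties. The variable names are illustrative.

```d
void main()
{
    char[] source = "abc".dup;
    char[] copy = source.dup;

    // Promised by the spec text: same size and same contents.
    assert(copy.length == source.length);
    assert(copy == source);

    // A non-empty copy has its own storage, so writes do not alias.
    assert(copy.ptr !is source.ptr);
    copy[0] = 'x';
    assert(source == "abc");

    // For an empty source, size and contents still match.
    // Whether the result is null is deliberately not checked here.
    char[] none;
    assert(none.dup.length == 0);
    assert(none.dup == none);

    // idup yields an immutable copy, i.e. a string for char data.
    auto frozen = source.idup;
    static assert(is(typeof(frozen) == string));
}
```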

- Jonathan M Davis

Aug 03 2011

simendsjo <simendsjo gmail.com> writes:

Schveighoffer also states it is as designed.
But it really doesn't behave as one (at least I) would expect.
So in essence (as bearophile says), "is null" should not be used on arrays.

I was bitten by a bug because of this, and used "" instead of "".idup to 
avoid it, but given that D doesn't distinguish between empty and null 
arrays, this doesn't feel very safe now.

In the code in question I have a lazily initialized string. The problem is 
that I need to see whether it has been initialized, but an empty string is 
also a valid value. Because I shouldn't check for null, I now have to add 
another field to the struct to track whether the array has been initialized. 
This feels like a really suboptimal solution.
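
For illustration, the extra-field workaround described above might look roughly like this. `LazyString`, `isSet`, `set` and `get` are hypothetical names, not taken from the code in question.

```d
struct LazyString
{
    private string value;
    private bool initialized; // tracked separately from null vs. empty

    bool isSet() const { return initialized; }

    void set(string v)
    {
        value = v;
        initialized = true;
    }

    string get() const
    {
        assert(initialized, "value has not been set");
        return value;
    }
}

void main()
{
    LazyString name;
    assert(!name.isSet());

    // Even if idup hands back a null array, the flag records the assignment.
    name.set("".idup);
    assert(name.isSet());
    assert(name.get() == "");
}
```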

Aug 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 03 Aug 2011 14:26:54 -0400, simendsjo <simendsjo gmail.com> wrote:

> Schveighoffer also states it is as designed.
> But it really doesn't behave as one (at least I) would expect.
> So in essence (as bearophile says), "is null" should not be used on
> arrays.
>
> I was bitten by a bug because of this, and used "" instead of "".idup to
> avoid this, but given D doesn't distinguish between empty and null
> arrays, this doesn't feel very safe now..

I would recommend against depending on the difference between null and  
not-null-but-empty arrays.  But in any case, "".idup is mainly pointless,  
there is never a need to idup a string, since it's already immutable (and  
therefore can be passed wherever you need it).

> In the code in question I have a lazy initialized string. The problem is
> that I would see if it has been initialized, but an empty string is also
> a valid value. Because I shouldn't check for null, I now have to add
> another field to the struct to see if the array has been initialized.
> This feels like a really suboptimal solution.

Where is it that you need to use idup?  I think you may be using that  
without need (or if you are using code that violates immutability, that  
code is incorrect), but I don't know what your code looks like so I might  
be wrong.

In any case, there may be a better way to do what you want, without the  
extra field.

At the very least, here is a function that can help you:

string myIdup(string s)
{
    // The "" literal has a non-null pointer, so the result is never null.
    return s.length == 0 ? "" : s.idup;
}

Note that this kind of thing *ONLY* works for strings, because string  
literals are not null.  For normal arrays, I wouldn't expect this to work.
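
A short usage sketch of the helper above, with a `string` return type assumed (the posted snippet omits it). It relies on the point made in this thread that a string literal is never null.

```d
string myIdup(string s)
{
    return s.length == 0 ? "" : s.idup;
}

void main()
{
    // Empty input comes back as the "" literal, which is not null.
    string e = myIdup("");
    assert(e !is null);
    assert(e.length == 0);

    // A null input is also normalized to the non-null "" literal.
    string fromNull = myIdup(null);
    assert(fromNull !is null);
    assert(fromNull.length == 0);

    // Non-empty input is copied as usual.
    string word = "abc";
    string copy = myIdup(word);
    assert(copy == word);
    assert(copy.ptr !is word.ptr);
}
```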

-Steve

Aug 03 2011

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

simendsjo wrote:
> In the code in question I have a lazy initialized string. The problem is
> that I would see if it has been initialized, but an empty string is also
> a valid value. Because I shouldn't check for null, I now have to add
> another field to the struct to see if the array has been initialized.
> This feels like a really suboptimal solution.

You can check for null. There _is_ a difference between null and empty. It's 
just that if you do anything which causes an array to allocate, there's no 
real guarantee whether it'll be null or empty if it has no elements in it. So, 
if you have

struct S
{
 string s = null;
}

You can rely on that being null until you set it. What you have to watch out 
for is whether it ends up being set to null when you set it (which is presumably 
where you ran into the problem). So, you don't _have_ to have a separate 
variable indicating whether the array has been initialized, but it's certainly 
less error prone if you do that.

The places where it generally makes the most sense to be able to distinguish 
between empty and null are with function parameters and return values. And 
there _are_ portions of Phobos which rely on that difference, but it's 
definitely true that you have to be careful when dealing with the difference 
between empty and null.
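
One possible way to keep null as the "not yet set" marker without a second field is to normalize empty values on assignment. This is only a sketch: the `isSet` and `set` members are hypothetical, and it depends on the `""` literal having a non-null pointer.

```d
struct S
{
    private string s = null; // null means "not set yet"

    bool isSet() const { return s !is null; }

    void set(string value)
    {
        // An empty value may arrive as null (for example from idup), so
        // store the "" literal instead; its pointer is never null.
        s = value.length == 0 ? "" : value;
    }
}

void main()
{
    S item;
    assert(!item.isSet());

    item.set("".idup);
    assert(item.isSet());

    item.set("text");
    assert(item.isSet());
}
```

Every write has to go through `set` for this to hold, which is why the separate flag remains the less error-prone choice.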

- Jonathan M Davis

Aug 03 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 03.08.2011 22:26, simendsjo wrote:
> In the code in question I have a lazy initialized string. The problem
> is that I would see if it has been initialized, but an empty string is
> also a valid value. Because I shouldn't check for null, I now have to
> add another field to the struct to see if the array has been
> initialized. This feels like a really suboptimal solution.

length works even for "null" arrays and returns 0. An even cleaner way is 
to use std.array.empty:
import std.array;
char[] abc = null;
assert(abc.empty);

So there are no uninitialized arrays, there are just different versions 
of empty slices.
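
A small sketch of that point: `length` and `empty` treat a null array and a non-null empty one the same way, and only `is null` tells them apart. The variable names are illustrative.

```d
import std.array : empty;

void main()
{
    char[] nullArray = null;
    string literal = "";

    // Both report zero elements.
    assert(nullArray.length == 0);
    assert(literal.length == 0);
    assert(nullArray.empty);
    assert(literal.empty);

    // They also compare equal by value...
    assert(nullArray == literal);

    // ...and differ only in identity.
    assert(nullArray is null);
    assert(literal !is null);
}
```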

-- 
Dmitry Olshansky

Aug 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 03 Aug 2011 06:35:08 -0400, simendsjo <simendsjo gmail.com> wrote:

> void main() {
>      assert(is(typeof("") == typeof("".idup))); // both is immutable(char)[]
>
>      assert(""      !is null);
>      assert("".idup !is null); // fails - s is null. Why?
> }

An empty string manifest constant (i.e. string literal) still must have a  
valid pointer, because it's mandated that the string have a zero byte  
appended to it.  This is so you can pass it to C functions which expect  
null-terminated strings.

So essentially, there is a '\0' in memory, and "" points to that character  
with a length of 0.

However, idup calls a runtime function which *purposely* asks to make a  
copy.  However, it's *NOT* required to copy the 'zero after the string'  
part.

The implementation, knowing that a null array is equivalent to an empty  
array, is going to return null to avoid the performance penalty of  
allocating a block that won't be used.  If you append, it will simply  
allocate a block as needed.

I see no reason the runtime should waste cycles or a perfectly good  
16-byte buffer to give you an empty array.

Definitely functions as designed, not a bug.  If you would like different  
behavior, you are going to have to have a really really good use case to  
get this changed.
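
The literal-terminator and allocate-on-append behaviour described above can be sketched as follows. It uses `strlen` from `core.stdc.string`. Only string literals carry the trailing zero byte, so this is not something to rely on for strings built at run time.

```d
import core.stdc.string : strlen;

void main()
{
    string literal = "";

    // Zero length, but a valid pointer to the terminating '\0'.
    assert(literal.length == 0);
    assert(literal.ptr !is null);
    assert(*literal.ptr == '\0');

    // That terminator is what makes literals usable from C functions.
    assert(strlen(literal.ptr) == 0);
    assert(strlen("hello") == 5);

    // A null array needs no block until something is appended to it.
    char[] buffer = null;
    buffer ~= 'x';
    assert(buffer.ptr !is null);
    assert(buffer == "x");
}
```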

-Steve

Aug 03 2011

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

Given that if you really wanted the duped string to be empty instead of null, 
it wouldn't be very hard to write a wrapper function for dup which did that, 
I'd be _very_ surprised if you could find a use case where dup should allocate 
for an empty string.

I don't generally like the fact that D tends to conflate null and empty, but 
you're creating a new array here. It's not at all surprising if it ends up 
null if it has no elements in it. In general though, you need to be fairly 
careful about where you rely on the difference between empty and null. If any 
kind of memory allocation occurs for an array and its length is 0, it's pretty 
much fair game as to whether it's empty or null.
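
Such a wrapper might look roughly like the sketch below. `dupNonNull` is a hypothetical name, and slicing a one-element block down to zero length is just one way to obtain a non-null empty array.

```d
char[] dupNonNull(const(char)[] src)
{
    if (src.length == 0)
    {
        // Allocate a tiny block and slice it to zero length, so the
        // result is empty but still has a non-null pointer.
        auto block = new char[](1);
        return block[0 .. 0];
    }
    return src.dup;
}

void main()
{
    char[] e = dupNonNull("");
    assert(e !is null);
    assert(e.length == 0);
    assert(e == "");

    char[] copy = dupNonNull("abc");
    assert(copy == "abc");
}
```

The allocation for the empty case is exactly the cost the runtime avoids by default, so a wrapper like this only makes sense where the null/empty distinction carries meaning.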

- Jonathan M Davis

Aug 03 2011
