digitalmars.D.bugs - [Issue 2934] New: "".dup does not return empty string

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=2934

           Summary: "".dup does not return empty string
           Product: D
           Version: unspecified
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: DMD
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: qian.xu funkwerk-itk.com


The following code will throw an exception: 
  char[] s;
  assert( s.dup  is null); // OK
  assert("".dup !is null); // FAILED

"".dup is expected to be an empty string as well.

Confirmed with dmd v1, gdc


--

May 04 2009

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=2934






Sorry, I should have posted the following code:

  char[] s;
  assert(s     is null);
  assert(s.dup is null);

  assert(""     !is null); // OK
  assert("".dup !is null); // FAILED

The last two lines do not behave consistently.
Either both should fail or both should pass.


--

May 04 2009

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=2934


schveiguy yahoo.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID





From posts in the newsgroup, I've determined that this bug is invalid:

1. Duplicating an empty array should always return a null array.  Otherwise,
you'd have to allocate space to store 0 data bytes in order for the result to
be non-null.

2. String literals have a null character implicitly appended to them by the
compiler.  This is done to ease calling C functions.  So a string literal's
pointer cannot be null, since it has to point to a static zero byte.

The spec identifies specifically item 2 here:
http://www.digitalmars.com/d/1.0/arrays.html#strings

see the section describing "C's printf and Strings"

I could not find a reference for item 1, but I remember reading something about
it.  Regardless of whether it is identified specifically in the spec or not, it
is not a bug, as the alternative would be to allocate blocks for 0-sized arrays.
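
The two items above can be observed directly. What follows is only a minimal sketch, assuming a D2 compiler (where string literals have type `string`; D1 code would use `char[]` instead). The helper name `hasNullPtr` is hypothetical and introduced purely for illustration. The results for the `.dup` cases are printed rather than asserted, since the spec does not pin that behavior down.

```d
import std.stdio : writefln;

// Hypothetical helper: true when the array's data pointer is null.
bool hasNullPtr(T)(T[] arr)
{
    return arr.ptr is null;
}

void main()
{
    string unset;        // default-initialized array: null pointer, length 0
    string literal = ""; // literal: length 0, pointer to a static zero byte

    writefln("unset:       null ptr = %s, length = %s", hasNullPtr(unset), unset.length);
    writefln("literal:     null ptr = %s, length = %s", hasNullPtr(literal), literal.length);

    // Item 1: inspect what duplicating each of them produces.
    auto unsetCopy = unset.dup;
    auto literalCopy = literal.dup;
    writefln("unset.dup:   null ptr = %s, length = %s", hasNullPtr(unsetCopy), unsetCopy.length);
    writefln("literal.dup: null ptr = %s, length = %s", hasNullPtr(literalCopy), literalCopy.length);
}
```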


--

May 04 2009

Derek Parnell <derek psych.ward> writes:

On Mon, 4 May 2009 17:44:56 +0000 (UTC), d-bugmail puremagic.com wrote:

 http://d.puremagic.com/issues/show_bug.cgi?id=2934
 
 schveiguy yahoo.com changed:
 
            What    |Removed                     |Added
 ----------------------------------------------------------------------------
              Status|NEW                         |RESOLVED
          Resolution|                            |INVALID
 

 From posts in the newsgroup, I've determined that this bug is invalid:
 
 1. Duplicating an empty array should always return a null array.  Otherwise,
 you'd have to allocate space to store 0 data bytes in order for the result to
 be non-null.
 
 2. String literals have a null character implicitly appended to them by the
 compiler.  This is done to ease calling C functions.  So a string literal's
 pointer cannot be null, since it has to point to a static zero byte.
 
 The spec identifies specifically item 2 here:
 http://www.digitalmars.com/d/1.0/arrays.html#strings
 
 see the section describing "C's printf and Strings"
 
 I could not find a reference for item 1, but I remember reading something about
 it.  Regardless of whether it is identified specifically in the spec or not, it
 is not a bug, as the alternative would be to allocate blocks for 0-sized arrays.

Huh??? Duplicating something should give one a duplicate.

I do not think that this is an invalid bug.

Ok, so duplicating an empty array causes memory to be allocated - so what!
I asked for a duplicate so give me a duplicate, please.

To me, the "no surprise" path is simple. Duplicating an empty array should
return an empty array. Duplicating a null array should return a null array.

Is that not intuitive?

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell

May 04 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 04 May 2009 16:56:49 -0400, Derek Parnell <derek psych.ward> wrote:

 Huh??? Duplicating something should give one a duplicate.

 I do not think that this is an invalid bug.

 Ok, so duplicating an empty array causes memory to be allocated - so what!
 I asked for a duplicate so give me a duplicate, please.

 To me, the "no surprise" path is simple. Duplicating an empty array should
 return an empty array. Duplicating a null array should return a null array.

 Is that not intuitive?

What's not intuitive is comparing an array (which is a struct) to null.

char[] arr1 = "";
char[] arr2 = null;

assert(arr1 == arr2); // OK
assert(arr1 == null); // FAIL

I'd say that comparing an array to null should always succeed if the array  
is empty, but I guess some people may use the fact that the pointer is not  
null in an empty array.  I definitely don't want the runtime to allocate  
blocks of data when requested to allocate 0 bytes.

In any case, this bug is not valid, because the compiler acts as specified  
by the spec.

As far as I can remember, I never compare arrays to null. I always check the
length instead, which is consistent for both null and empty arrays.
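
A minimal sketch of the length check described above, assuming a D2 compiler. The helper name `isEmpty` is hypothetical; the point is only that the length test gives the same answer whether or not the pointer is null.

```d
// Hypothetical helper: treats null arrays and empty arrays alike.
bool isEmpty(T)(T[] arr)
{
    return arr.length == 0;
}

void main()
{
    string unset;        // null array
    string literal = ""; // empty array with a non-null pointer

    assert(isEmpty(unset));
    assert(isEmpty(literal));
    assert(isEmpty(literal.dup)); // length is 0 regardless of what pointer dup returns
    assert(!isEmpty("abc"));
}
```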

-Steve

May 04 2009

Derek Parnell <derek psych.ward> writes:

On Mon, 04 May 2009 17:16:45 -0400, Steven Schveighoffer wrote:


 what's not intuitive is comparing an array (which is a struct) to null.

Hmmm ... interesting. I regard the array not as a struct but as a concept
implemented in D as a struct.
 
 char[] arr1 = "";
 char[] arr2 = null;
 
 assert(arr1 == arr2); // OK
 assert(arr1 == null); // FAIL
 
 I'd say that comparing an array to null should always succeed if the array  
 is empty, but I guess some people may use the fact that the pointer is not  
 null in an empty array.

Yes, some people rely on the distinction.

However, I think that this ought to be the case ...

 char[] arr1 = "";
 char[] arr2 = null;
 
 assert(arr1 == arr2); // FAIL
 assert(arr1 == null); // FAIL

 assert(arr2 == ""); // FAIL
 assert(arr2 == arr1); // FAIL

 assert(null == ""); // FAIL

Simply because an empty array is one with an allocation and a null array is
one without an allocation; therefore they are not the same thing. So the
'==' equality test should tell the coder that there are two different
beasties at play here.

I know that there is an "efficiency" aspect to this.

A "proper" test IMO is that an array is null if arr.ptr == null and
arr.length == 0, but I suspect that will be evil to the speed aficionados.


 I definitely don't want the runtime to allocate
 blocks of data when requested to allocate 0 bytes.

Then don't allocate zero bytes.
 
 In any case, this bug is not valid, because the compiler acts as specified  
 by the spec.

I'm having trouble locating the specification for this. 

 I never compare arrays to null if I can remember, I always check the  
 length instead, which is consistent for both null and empty arrays.

I do the same as you.

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell

May 04 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 04 May 2009 20:02:01 -0400, Derek Parnell <derek psych.ward> wrote:

 On Mon, 04 May 2009 17:16:45 -0400, Steven Schveighoffer wrote:


 what's not intuitive is comparing an array (which is a struct) to null.

 Hmmm ... interesting. I regard the array not as a struct but as a concept
 implemented in D as a struct.

Yes, but null is a pointer.  Can I make just any struct with a pointer,
and expect to be able to compare it to null (and have it direct that
comparison to the pointer)?

The distinction that an array is a struct and not a pointer or reference  
is one of the frequent causes of newbie frustration, because they just  
don't get it at first.  I know of no other language that implements arrays  
like this (where the length is local, but the data is shared).

It's also one of the gems of D if you learn to use it correctly.

 char[] arr1 = "";
 char[] arr2 = null;

 assert(arr1 == arr2); // OK
 assert(arr1 == null); // FAIL

 I'd say that comparing an array to null should always succeed if the array
 is empty, but I guess some people may use the fact that the pointer is not
 null in an empty array.

 Yes, some people rely on the distinction.

 However, I think that this ought to be the case ...

 char[] arr1 = "";
 char[] arr2 = null;

 assert(arr1 == arr2); // FAIL
 assert(arr1 == null); // FAIL

 assert(arr2 == ""); // FAIL
 assert(arr2 == arr1); // FAIL

 assert(null == ""); // FAIL

 Simply because an empty array is one with an allocation and a null array is
 one without an allocation therefore they are not the same thing. So the
 '==' equality test should tell the coder that there are two different
 beasties at play here.

I would be also fine with this, as it would discourage comparing to null.   
I'd also be fine with comparing an array to null being a syntax error.   
You can always do arr.ptr == null.

 I know that there is an "efficiency" aspect to this.

 A "proper" test IMO is that an array is null if arr.ptr == null and
 arr.length == 0, but I suspect that will be evil to the speed aficionados.

Such an array is an anomaly, and shouldn't ever occur, unless someone  
forces it by setting the ptr specifically.  I don't think it's worth the  
extra code to cover this very rare possibility.

 I definitely don't want the runtime to allocate
 blocks of data when requested to allocate 0 bytes.

 Then don't allocate zero bytes.

Sometimes, you don't know whether it's going to be zero bytes or not until  
runtime.  I don't want to have to check for zero-length arrays everywhere  
I dup, when the GC does it for me.

 In any case, this bug is not valid, because the compiler acts as specified
 by the spec.

 I'm having trouble locating the specification for this.

As far as the "" being not null, the spec does talk about it (although  
indirectly) as I cited in the original bug resolution.  As far as  
returning a null array when allocating zero bytes, there is nothing I  
could find in the spec, but this means it's up to the implementer.  So the  
implementation does not violate the spec, and it can be considered desired  
behavior, not an accident.

I'd be interested to know what Walter had in mind.

-Steve

May 05 2009

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=2934


qian.xu funkwerk-itk.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|INVALID                     |





 2. String literals have a null character implicitly appended to them by the
 compiler.  This is done to ease calling c functions.  So a string literal's
 pointer cannot be null, since it has to point to a static zero byte.

I fully agree with you. But before ".dup" is applied, a string variable has
three states (null, empty, or non-empty). After applying ".dup" to an empty
string, these may be reduced to two. This will break existing code that makes
defensive copies of strings.

An example is as follows:

  class test {
    private char[] val;
    char[] getVal() {
      return val.dup; // make a defensive copy to avoid unexpected change from outside
    }
    void setVal(char[] val) {
      this.val = val.dup;
    }
  }

  myTestObj.setVal("");
  char[] s = myTestObj.getVal;
  if (s is null) {
    // do task 1
  }
  else if (s == "") {
    // do task 2
  }
  else {
    // do task 3
  }

In this case, task 2 is expected to be performed. However, task 1 is performed
instead.
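
The three states can be made explicit with a small classifier. This is only a sketch, assuming a D2 compiler; `ArrayState` and `classify` are hypothetical names, not part of any library. The state of the duplicated empty string is printed rather than asserted, since that is exactly the behavior in question.

```d
import std.stdio : writeln;

// Hypothetical names for the three states a string variable can be in.
enum ArrayState { isNull, isEmpty, hasData }

ArrayState classify(T)(T[] arr)
{
    if (arr.ptr is null && arr.length == 0)
        return ArrayState.isNull;   // no allocation at all
    if (arr.length == 0)
        return ArrayState.isEmpty;  // valid pointer, zero elements
    return ArrayState.hasData;
}

void main()
{
    string literal = "";

    assert(classify(cast(string) null) == ArrayState.isNull);
    assert(classify(literal) == ArrayState.isEmpty);
    assert(classify("abc") == ArrayState.hasData);

    // Shows which state survives the defensive copy.
    writeln(classify(literal.dup));
}
```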


 Regardless of whether it is identified specifically in the spec or not, it is
 not a bug, as the alternative would be to allocate blocks for 0-sized arrays.

Do you mean that this is a feature request? I would regard the inconsistent
behavior of dup as a defect.


--

May 05 2009

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=2934






From that point of view, your request makes a lot more sense.

But there are two counter arguments:

1. Comparing an array to null has limited utility. I don't think it should be
in widespread use, as most of the time you only care whether the array is empty
or not.  There may be special cases, but in those cases, you can use arr.ptr ==
null.  It would have been much better if arr == null never compiled.

2. Duping an empty array has limited defensive utility.  You can just as easily
return the array itself.  If it weren't for the horrendous append behavior, it
would be a no-brainer:

T[] edup(T)(T[] arr)
{
   return arr.length == 0 ? arr : arr.dup;
}

usage:

return arr.edup();

Allocating data for duping an empty array is not an acceptable pessimization. 
However, I thought of another possible solution:  A dup of an empty, non-null
array can return a pointer into the read only data segment.  This would allow a
non-allocation on duping an empty array, would not return a pointer to null,
and would not accidentally overwrite the original array if appending is done.

So a fix can be done.


--

May 05 2009
