www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - std.string.chomp error

reply simendsjo <simen.endsjo pandavre.com> writes:
The documentation says
/*******************************************
  * Returns s[] sans trailing delimiter[], if any.
  * If delimiter[] is null, removes trailing CR, LF, or CRLF, if any.
  */

To adhere to the documentation, chomp must be changed from:

C[] chomp(C, C1)(C[] s, in C1[] delimiter)
{
     if (endsWith(s, delimiter))
     {
         return s[0 .. $ - delimiter.length];
     }
     return s;
}

to:

C[] chomp(C, C1)(C[] s, in C1[] delimiter)
{
     if (delimiter == null)
       return chomp(s);
     else if (endsWith(s, delimiter))
       return s[0 .. $ - delimiter.length];
     else
       return s;
}
Aug 09 2010
next sibling parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Mon, 09 Aug 2010 18:58:36 +0200, simendsjo wrote:

 The documentation says
 /*******************************************
   * Returns s[] sans trailing delimiter[], if any. * If delimiter[] is
   null, removes trailing CR, LF, or CRLF, if any. */
 
 To adhere to the documentation, chomp must be changed from:
 
 C[] chomp(C, C1)(C[] s, in C1[] delimiter) {
      if (endsWith(s, delimiter))
      {
          return s[0 .. $ - delimiter.length];
      }
      return s;
 }
 
 to:
 
 C[] chomp(C, C1)(C[] s, in C1[] delimiter) {
      if (delimiter == null)
        return chomp(s);
      else if (endsWith(s, delimiter))
        return s[0 .. $ - delimiter.length];
      else
        return s;
 }

Either that, or the documentation for it needs to be changed. Anyway, it would be great if you'd report this. http://d.puremagic.com/issues/ Thanks! -Lars
Aug 09 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Lars T. Kyllingstad:
 Either that, or the documentation for it needs to be changed.  Anyway, it 
 would be great if you'd report this.

A really basic unit testing is able to catch an error like this. This means Phobos needs more unit tests. Stuff that may be added to that unittest: import std.stdio, std.string; void main() { assert(chomp("hello") == "hello"); assert(chomp("hello\n") == "hello"); assert(chomp("hello\r\n") == "hello"); assert(chomp("hello", "") == "hello"); assert(chomp("hello\n", "") == "hello"); assert(chomp("hello\r\n", "") == "hello"); assert(chomp("hello\r\n", "") == "hello"); assert(chomp("helloxxx", "x") == "helloxx"); } But instead of: if (delimiter == null) Is better to write: if (delimiter.length == 0) Or: if (delimiter == "") See http://d.puremagic.com/issues/show_bug.cgi?id=3889 Bye, bearophile
Aug 09 2010
parent reply simendsjo <simen.endsjo pandavre.com> writes:
On 09.08.2010 19:34, bearophile wrote:
 Lars T. Kyllingstad:
 Either that, or the documentation for it needs to be changed.  Anyway, it
 would be great if you'd report this.

A really basic unit testing is able to catch an error like this. This means Phobos needs more unit tests. Stuff that may be added to that unittest: import std.stdio, std.string; void main() { assert(chomp("hello") == "hello"); assert(chomp("hello\n") == "hello"); assert(chomp("hello\r\n") == "hello"); assert(chomp("hello", "") == "hello"); assert(chomp("hello\n", "") == "hello"); assert(chomp("hello\r\n", "") == "hello"); assert(chomp("hello\r\n", "") == "hello"); assert(chomp("helloxxx", "x") == "helloxx"); } But instead of: if (delimiter == null) Is better to write: if (delimiter.length == 0) Or: if (delimiter == "") See http://d.puremagic.com/issues/show_bug.cgi?id=3889 Bye, bearophile

Ahem.. :) Yes, I did miss your answer! How I got fooled by the preview pane and never noticed the scrollbar. I cannot see how your other bug report relates to this though. For chomps part it's just an implementation vs. documentation issue.
Aug 09 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
simendsjo:
 Ahem.. :) Yes, I did miss your answer! How I got fooled by the preview 
 pane and never noticed the scrollbar.

No problem, it happens, don't worry.
 I cannot see how your other bug report relates to this though.

My other bug report is about this line in your code: if (delimiter == null) I don't like it :-) Bye, bearophile
Aug 09 2010
next sibling parent simendsjo <simen.endsjo pandavre.com> writes:
On 10.08.2010 02:09, Jonathan M Davis wrote:
 On Monday, August 09, 2010 16:59:07 bearophile wrote:
 simendsjo:
 Ahem.. :) Yes, I did miss your answer! How I got fooled by the preview
 pane and never noticed the scrollbar.

No problem, it happens, don't worry.
 I cannot see how your other bug report relates to this though.

My other bug report is about this line in your code: if (delimiter == null) I don't like it :-) Bye, bearophile

Why, because it should be if(delimiter is null) or just if(!delimiter) - Jonathan M Davis

Hehe.. You're a bit beyond my D level right now. At least I now know null == false and you can do reference is null :)
Aug 09 2010
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Jonathan M Davis:
 Why, because it should be
 
 if(delimiter is null)
 
 
 or just
 
 if(!delimiter)

if (delimiter.length == 0) Or if (!delimiter.length) Bye, bearophile
Aug 09 2010
next sibling parent simendsjo <simen.endsjo pandavre.com> writes:
On 10.08.2010 03:08, bearophile wrote:
 Jonathan M Davis:
 Why, because it should be

 if(delimiter is null)


 or just

 if(!delimiter)

if (delimiter.length == 0) Or if (!delimiter.length) Bye, bearophile

Isn't that very different things? You cannot use .length if delimiter is null.
Aug 09 2010
prev sibling parent Jonathan M Davis <jmdavisprog gmail.com> writes:
On Monday, August 09, 2010 18:12:11 simendsjo wrote:
 On 10.08.2010 03:08, bearophile wrote:
 Jonathan M Davis:
 Why, because it should be
 
 if(delimiter is null)
 
 
 or just
 
 if(!delimiter)

if (delimiter.length == 0) Or if (!delimiter.length) Bye, bearophile

Isn't that very different things? You cannot use .length if delimiter is null.

If that's what you're looking for, then the proper thing to do would be to import std.array and do this if(delimiter.empty) It wille handle both null and length == 0 cases. Not to mention, it's much more range-y that way. - Jonathan M Davis
Aug 09 2010
prev sibling parent Jonathan M Davis <jmdavisprog gmail.com> writes:
On Monday, August 09, 2010 16:59:07 bearophile wrote:
 simendsjo:
 Ahem.. :) Yes, I did miss your answer! How I got fooled by the preview
 pane and never noticed the scrollbar.

No problem, it happens, don't worry.
 I cannot see how your other bug report relates to this though.

My other bug report is about this line in your code: if (delimiter == null) I don't like it :-) Bye, bearophile

Why, because it should be if(delimiter is null) or just if(!delimiter) - Jonathan M Davis
Aug 09 2010
prev sibling parent reply simendsjo <simen.endsjo pandavre.com> writes:
On 09.08.2010 19:16, Lars T. Kyllingstad wrote:
 On Mon, 09 Aug 2010 18:58:36 +0200, simendsjo wrote:

 The documentation says
 /*******************************************
    * Returns s[] sans trailing delimiter[], if any. * If delimiter[] is
    null, removes trailing CR, LF, or CRLF, if any. */

 To adhere to the documentation, chomp must be changed from:

 C[] chomp(C, C1)(C[] s, in C1[] delimiter) {
       if (endsWith(s, delimiter))
       {
           return s[0 .. $ - delimiter.length];
       }
       return s;
 }

 to:

 C[] chomp(C, C1)(C[] s, in C1[] delimiter) {
       if (delimiter == null)
         return chomp(s);
       else if (endsWith(s, delimiter))
         return s[0 .. $ - delimiter.length];
       else
         return s;
 }

Either that, or the documentation for it needs to be changed. Anyway, it would be great if you'd report this. http://d.puremagic.com/issues/ Thanks! -Lars

Ok; http://d.puremagic.com/issues/show_bug.cgi?id=4608
Aug 09 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
simendsjo:
 Ok; http://d.puremagic.com/issues/show_bug.cgi?id=4608

You seem to have missed my answer :-) Bye, bearophile
Aug 09 2010
parent simendsjo <simen.endsjo pandavre.com> writes:
On 09.08.2010 23:58, bearophile wrote:
 simendsjo:
 Ok; http://d.puremagic.com/issues/show_bug.cgi?id=4608

You seem to have missed my answer :-) Bye, bearophile

No, but I don't know if it's the documentation or implementation that's correct.
Aug 09 2010
prev sibling next sibling parent reply Jonathan M Davis <jmdavisprog gmail.com> writes:
On Monday, August 09, 2010 17:09:03 simendsjo wrote:
 On 10.08.2010 02:09, Jonathan M Davis wrote:
 On Monday, August 09, 2010 16:59:07 bearophile wrote:
 simendsjo:
 Ahem.. :) Yes, I did miss your answer! How I got fooled by the preview
 pane and never noticed the scrollbar.

No problem, it happens, don't worry.
 I cannot see how your other bug report relates to this though.

My other bug report is about this line in your code: if (delimiter == null) I don't like it :-) Bye, bearophile

Why, because it should be if(delimiter is null) or just if(!delimiter) - Jonathan M Davis

Hehe.. You're a bit beyond my D level right now. At least I now know null == false and you can do reference is null :)

IIRC, you're not really supposed to do "delim == null" but rather us "delim is null" or shorten to to just "!delim". Why, I don't recall off the top of my head, but it might be because "delim == null" would be calling Object.opEquals(), and there's no need for that function call (though fortunately "delim == null" translates to Object.opEquals(delim, null) rather than delim.opEquals(null) which avoids issues where the lhs is null and causes it to go boom). In either case, for null checks, I'd suggest either just using the reference by itself or to explictly use "is null" if you want the extra clarity. - Jonathan M Davis
Aug 09 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Lars T. Kyllingstad:
 There, I don't agree with you.  Arrays are a sort of pseudo-reference 
 type, so I don't mind 'null' being a sort of pseudo-null in that 
 context.  Actually, I find it to be quite elegant.  It's a matter of 
 taste, I guess.

I suggest you to write down the things you don't like and the design you desire here. Bye, bearophile
Aug 10 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Lars T. Kyllingstad:
 I like how it is designed now, that was my point.

Sorry, I misread you.
Aug 10 2010
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Mon, 09 Aug 2010 17:35:56 -0700, Jonathan M Davis wrote:

 On Monday, August 09, 2010 17:09:03 simendsjo wrote:
 On 10.08.2010 02:09, Jonathan M Davis wrote:
 On Monday, August 09, 2010 16:59:07 bearophile wrote:
 simendsjo:
 Ahem.. :) Yes, I did miss your answer! How I got fooled by the
 preview pane and never noticed the scrollbar.

No problem, it happens, don't worry.
 I cannot see how your other bug report relates to this though.

My other bug report is about this line in your code: if (delimiter == null) I don't like it :-) Bye, bearophile

Why, because it should be if(delimiter is null) or just if(!delimiter) - Jonathan M Davis

Hehe.. You're a bit beyond my D level right now. At least I now know null == false and you can do reference is null :)

IIRC, you're not really supposed to do "delim == null" but rather us "delim is null" or shorten to to just "!delim". Why, I don't recall off the top of my head, but it might be because "delim == null" would be calling Object.opEquals(), and there's no need for that function call (though fortunately "delim == null" translates to Object.opEquals(delim, null) rather than delim.opEquals(null) which avoids issues where the lhs is null and causes it to go boom). In either case, for null checks, I'd suggest either just using the reference by itself or to explictly use "is null" if you want the extra clarity.

No, using 'is' won't work. Check this out: int[] a; assert (a == null); assert (a is null); a = new int[10]; a.length = 0; assert (a == null); assert (a !is null); The thing is, '==' tests whether two arrays are equal, that is, that they are equally long and that their elements are equal. Any empty array is equal to null -- in fact, in this context 'null' is just a way of denoting an empty array that doesn't point to any particular memory block (i.e. hasn't been initialised yet). // This is what '==' does bool mimicEquals(int[] a, int[] b) { if (a.length != b.length) return false; foreach (i; 0 .. a.length) if (a[i] != b[i]) return false; return true; } 'is', on the other hand, tests whether two arrays are identical, i.e. that they have the same length and *refer to the same piece of memory*. // This is (sort of) what 'is' does bool mimicIs(int[] a, int[] b) { return (a.ptr == b.ptr && a.length == b.length); } -Lars
Aug 10 2010
prev sibling next sibling parent Jonathan M Davis <jmdavisprog gmail.com> writes:
On Tuesday 10 August 2010 00:30:37 Lars T. Kyllingstad wrote:
 No, using 'is' won't work.  Check this out:
 
   int[] a;
   assert (a == null);
   assert (a is null);
 
   a = new int[10];
   a.length = 0;
   assert (a == null);
   assert (a !is null);
 
 The thing is, '==' tests whether two arrays are equal, that is, that they
 are equally long and that their elements are equal.  Any empty array is
 equal to null -- in fact, in this context 'null' is just a way of
 denoting an empty array that doesn't point to any particular memory block
 (i.e. hasn't been initialised yet).
 
   // This is what '==' does
   bool mimicEquals(int[] a, int[] b)
   {
       if (a.length != b.length) return false;
       foreach (i; 0 .. a.length) if (a[i] != b[i]) return false;
       return true;
   }
 
 'is', on the other hand, tests whether two arrays are identical, i.e.
 that they have the same length and *refer to the same piece of memory*.
 
   // This is (sort of) what 'is' does
   bool mimicIs(int[] a, int[] b)
   {
      return (a.ptr == b.ptr  &&  a.length == b.length);
   }
 
 -Lars

Actually, it looks to me that that's an argument for using is for checking for null rather than ==, since == isn't really going to tell you. The fact that == doesn't care about whether an array is null makes it not work for checking for whether an array is null. 1. As I understand it, using is instead of == is for all references, not just arrays and their bizarre pseudo-null state. Using is with a class will avoid calling opEquals() and does exactly what you want when checking whether a class reference is null. 2. For arrays, if you want to check whether it really is null, then you _must_ use is, because == obviously isn't going to tell you. It'll just lump empty arrays in with null ones. For instance, if you want to check that an array has never been initialized or that it has been set to null and never set to something else, then you need to use is. 3. On the other hand, if what you really care about is checking whether an array has any elements and you don't care about whether it's null or not, then the empty function/property would be the better way to go. It's quite explicit, and it's more generic, doing things the way that ranges are done. Personally, I think that the way that null is handled with arrays and associative arrays is a poor design choice (if they're null, they should be null until you assign to them with new rather than this whole null but not null nonsense), but we're stuck with it I guess. - Jonathan M Davis
Aug 10 2010
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Tue, 10 Aug 2010 01:48:17 -0700, Jonathan M Davis wrote:

 On Tuesday 10 August 2010 00:30:37 Lars T. Kyllingstad wrote:
 No, using 'is' won't work.  Check this out:
 
   int[] a;
   assert (a == null);
   assert (a is null);
 
   a = new int[10];
   a.length = 0;
   assert (a == null);
   assert (a !is null);
 
 The thing is, '==' tests whether two arrays are equal, that is, that
 they are equally long and that their elements are equal.  Any empty
 array is equal to null -- in fact, in this context 'null' is just a way
 of denoting an empty array that doesn't point to any particular memory
 block (i.e. hasn't been initialised yet).
 
   // This is what '==' does
   bool mimicEquals(int[] a, int[] b)
   {
       if (a.length != b.length) return false; foreach (i; 0 ..
       a.length) if (a[i] != b[i]) return false; return true;
   }
 
 'is', on the other hand, tests whether two arrays are identical, i.e.
 that they have the same length and *refer to the same piece of memory*.
 
   // This is (sort of) what 'is' does
   bool mimicIs(int[] a, int[] b)
   {
      return (a.ptr == b.ptr  &&  a.length == b.length);
   }
 
 -Lars

Actually, it looks to me that that's an argument for using is for checking for null rather than ==, since == isn't really going to tell you. The fact that == doesn't care about whether an array is null makes it not work for checking for whether an array is null.

I guess it depends on what behaviour you're after. In the present case, if you want chomp(a, null) and chomp(a, "") to do the same thing, then you should use '=='. If you want chomp(a, "") to simply do nothing, use 'is'. I just figured that the former was the desired behaviour here. If it isn't, I agree with you. :)
 1. As I understand it, using is instead of == is for all references, not
 just arrays and their bizarre pseudo-null state. Using is with a class
 will avoid calling opEquals() and does exactly what you want when
 checking whether a class reference is null.

Fun fact: Actually, 'is' works for any type. assert (1 is 1); As I've understood it, 'a is b' is true if the variables a and b contain the exact same bits. If a and b are value types, this must mean they have the same value, and if they are references (including arrays), it means they refer to the same data.
 2. For arrays, if you want to check whether it really is null, then you
 _must_ use is, because == obviously isn't going to tell you. It'll just
 lump empty arrays in with null ones. For instance, if you want to check
 that an array has never been initialized or that it has been set to null
 and never set to something else, then you need to use is.
 
 3. On the other hand, if what you really care about is checking whether
 an array has any elements and you don't care about whether it's null or
 not, then the empty function/property would be the better way to go.
 It's quite explicit, and it's more generic, doing things the way that
 ranges are done.

I totally agree with you. Lately, I have started using "empty" (as well as the other range primitives) for arrays myself. I just disagreed that 'is' would produce what I perceived to be the right behaviour for the function in question. But that perception may well be wrong. ;)
 Personally, I think that the way that null is handled with arrays and
 associative arrays is a poor design choice (if they're null, they should
 be null until you assign to them with new rather than this whole null
 but not null nonsense), but we're stuck with it I guess.

There, I don't agree with you. Arrays are a sort of pseudo-reference type, so I don't mind 'null' being a sort of pseudo-null in that context. Actually, I find it to be quite elegant. It's a matter of taste, I guess. -Lars
Aug 10 2010
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Tue, 10 Aug 2010 07:50:34 -0400, bearophile wrote:

 Lars T. Kyllingstad:
 There, I don't agree with you.  Arrays are a sort of pseudo-reference
 type, so I don't mind 'null' being a sort of pseudo-null in that
 context.  Actually, I find it to be quite elegant.  It's a matter of
 taste, I guess.

I suggest you to write down the things you don't like and the design you desire here.

?? I like how it is designed now, that was my point. -Lars
Aug 10 2010
prev sibling parent Jonathan M Davis <jmdavisprog gmail.com> writes:
On Tuesday 10 August 2010 02:34:33 Lars T. Kyllingstad wrote:
 I guess it depends on what behaviour you're after.  In the present case,
 if you want chomp(a, null) and chomp(a, "") to do the same thing, then
 you should use '=='.  If you want chomp(a, "") to simply do nothing, use
 'is'.  I just figured that the former was the desired behaviour here.  If
 it isn't, I agree with you. :)

I was really talking about the general case rather than chomp() in specific, but's fine with me if "" and null are treated the same here, I suppose. Still, I really don't like it when null and empty is treated the same way (in D or anything else). There's a difference between something not existing and it existing but not having anything in it. But, as I said, we're stuck with the way it is in D. - Jonathan M Davis
Aug 10 2010