
digitalmars.D.learn - Can't use std.algorithm.remove on a char[]?

TheGag96 <thegag96 gmail.com> writes:
I was just writing some code trying to remove a value from a 
character array, but the compiler complained "No overload matches 
for remove", and if I specifically say use std.algorithm.remove() 
the compiler doesn't think it fits any definition. For reference, 
this would be all I'm doing:

char[] thing = ['a', 'b', 'c'];
thing = thing.remove(1);

Is this a bug? std.algorithm claims remove() works on any forward 
range...
Apr 30 2016
ag0aep6g <anonymous example.com> writes:
On 30.04.2016 18:44, TheGag96 wrote:
 I was just writing some code trying to remove a value from a character
 array, but the compiler complained "No overload matches for remove", and
 if I specifically say use std.algorithm.remove() the compiler doesn't
 think it fits any definition. For reference, this would be all I'm doing:

 char[] thing = ['a', 'b', 'c'];
 thing = thing.remove(1);

 Is this a bug? std.algorithm claims remove() works on any forward range...
The documentation is wrong.

1) remove requires a bidirectional range. The constraints and parameter documentation correctly say so. char[] is a bidirectional range, though.

2) remove requires lvalue elements. char[] fails this, as the range primitives decode the chars on-the-fly to dchars.

Pull request to fix the documentation:
https://github.com/dlang/phobos/pull/4271

By the way, I think requiring lvalues is too restrictive. It should work with assignable elements. Also, it has apparently been missed that const/immutable can make non-assignable lvalues.

You can use std.utf.byCodeUnit to get a char range over a char[], but using it here is a bit awkward, because there's no (documented) way to get the array back from a byCodeUnit range:

----
import std.algorithm : remove;
import std.utf : byCodeUnit;
char[] thing = ['a', 'b', 'c'];
thing = thing[0 .. thing.byCodeUnit.remove(1).length];
----

You could also use ubyte[] instead of char[]:

----
import std.algorithm : remove;
ubyte[] thing = ['a', 'b', 'c'];
thing = thing.remove(1);
----
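
To make the two requirements above concrete, here is a minimal sketch that checks the relevant range traits at compile time. It only uses traits from std.range.primitives. It says nothing about how remove's constraints are written in any particular Phobos release, which is an assumption you would need to verify against your version.

```d
import std.range.primitives : ElementType, hasLvalueElements, isBidirectionalRange;

void main()
{
    // char[] satisfies the bidirectional range requirement...
    static assert(isBidirectionalRange!(char[]));

    // ...but as a range its elements are decoded dchar values, not lvalues.
    static assert(is(ElementType!(char[]) == dchar));
    static assert(!hasLvalueElements!(char[]));

    // ubyte[] hands out its elements by reference, so it qualifies.
    static assert(hasLvalueElements!(ubyte[]));
}
```

If these static asserts pass, the compiler has confirmed that char[] fails only the lvalue requirement, not the bidirectional one.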
Apr 30 2016
Jon D <jond noreply.com> writes:
On Saturday, 30 April 2016 at 18:32:32 UTC, ag0aep6g wrote:
 On 30.04.2016 18:44, TheGag96 wrote:
 I was just writing some code trying to remove a value from a 
 character
 array, but the compiler complained "No overload matches for 
 remove", and
 if I specifically say use std.algorithm.remove() the compiler 
 doesn't
 think it fits any definition. For reference, this would be all 
 I'm doing:

 char[] thing = ['a', 'b', 'c'];
 thing = thing.remove(1);

 Is this a bug? std.algorithm claims remove() works on any 
 forward range...
 The documentation is wrong.
 1) remove requires a bidirectional range. The constraints and parameter documentation correctly say so. char[] is a bidirectional range, though.
 2) remove requires lvalue elements. char[] fails this, as the range primitives decode the chars on-the-fly to dchars.
 Pull request to fix the documentation:
 https://github.com/dlang/phobos/pull/4271
 By the way, I think requiring lvalues is too restrictive. It should work with assignable elements. Also, it has apparently been missed that const/immutable can make non-assignable lvalues.
There's a ticket open related to the lvalue element requirement:
https://issues.dlang.org/show_bug.cgi?id=8930

Personally, I think this example is more compelling than the one in the ticket. It seems very reasonable to expect that std.algorithm.remove will work regardless of whether the elements are characters, integers, ubytes, etc.

If an initial step is to fix the documentation, it would be helpful to include specifically that it doesn't work with characters. It's not obvious that characters don't meet the requirement.

--Jon
Apr 30 2016
ag0aep6g <anonymous example.com> writes:
On 30.04.2016 21:08, Jon D wrote:
 If an initial step is to fix the documentation, it would be helpful to
 include specifically that it doesn't work with characters. It's not
 obvious that characters don't meet the requirement.
Characters are not the problem. remove works fine on a range of chars, when the elements are assignable lvalues.

char[] as a range has neither assignable elements, nor lvalue elements. That is, lines 3 and 4 here don't compile:

----
import std.range: front;
char[] a = ['f', 'o', 'o'];
a.front = 'g';
auto ptr = &a.front;
----
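
As a hedged illustration of the distinction drawn above: the restriction comes from the range primitives, not from the array itself. The sketch below checks the two traits and then shows that plain indexing, which bypasses front, still gives assignable lvalues.

```d
import std.range.primitives : front, hasAssignableElements, hasLvalueElements;

void main()
{
    char[] a = ['f', 'o', 'o'];

    // Viewed as a range, char[] offers neither assignable nor lvalue elements.
    static assert(!hasAssignableElements!(char[]));
    static assert(!hasLvalueElements!(char[]));

    // front returns a freshly decoded dchar by value.
    auto c = a.front;
    static assert(is(typeof(c) == dchar));

    // Indexing works on code units directly, so both of these compile.
    a[0] = 'g';
    char* ptr = &a[0];
    assert(a == "goo");
    assert(*ptr == 'g');
}
```

Indexing is only safe here because every element is a single-byte ASCII character. With multibyte text, writing to one code unit can leave an invalid UTF-8 sequence.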
Apr 30 2016
Jon D <jond noreply.com> writes:
On Saturday, 30 April 2016 at 19:21:30 UTC, ag0aep6g wrote:
 On 30.04.2016 21:08, Jon D wrote:
 If an initial step is to fix the documentation, it would be 
 helpful to
 include specifically that it doesn't work with characters. 
 It's not
 obvious that characters don't meet the requirement.
 Characters are not the problem. remove works fine on a range of chars, when the elements are assignable lvalues.
 char[] as a range has neither assignable elements, nor lvalue elements. That is, lines 3 and 4 here don't compile:
 ----
 import std.range: front;
 char[] a = ['f', 'o', 'o'];
 a.front = 'g';
 auto ptr = &a.front;
 ----
I didn't mean to suggest making the documentation technically incorrect. Just that it be helpful in important cases that won't necessarily be obvious. To me, char[] is an important case, one that's not made obvious by listing the hasLvalueElements constraint by itself.

--Jon
Apr 30 2016
ag0aep6g <anonymous example.com> writes:
On 30.04.2016 21:41, Jon D wrote:
 I didn't mean to suggest making the documentation technically incorrect.
 Just that it be helpful in important cases that won't necessarily be
 obvious. To me, char[] is an important case, one that's not made obvious
 by listing the hasLvalueElements constraint by itself.
Sure. I wouldn't object to having a little reminder there that char[] does not meet the requirements.
Apr 30 2016
TheGag96 <thegag96 gmail.com> writes:
On Saturday, 30 April 2016 at 19:21:30 UTC, ag0aep6g wrote:
 On 30.04.2016 21:08, Jon D wrote:
 If an initial step is to fix the documentation, it would be 
 helpful to
 include specifically that it doesn't work with characters. 
 It's not
 obvious that characters don't meet the requirement.
 Characters are not the problem. remove works fine on a range of chars, when the elements are assignable lvalues.
 char[] as a range has neither assignable elements, nor lvalue elements. That is, lines 3 and 4 here don't compile:
 ----
 import std.range: front;
 char[] a = ['f', 'o', 'o'];
 a.front = 'g';
 auto ptr = &a.front;
 ----
Why exactly is it like this? I would understand why strings (immutable character arrays) behave like this, but I feel like plain old character arrays should work the same as an array of ubytes when treated as a range... Or is there some other string-related behavior that would get broken by this?
Apr 30 2016
ag0aep6g <anonymous example.com> writes:
On 01.05.2016 07:29, TheGag96 wrote:
 Why exactly is it like this? I would understand why strings (immutable
 character arrays) behave like this, but I feel like plain old character
 arrays should work the same as an array of ubytes when treated as a
 range... Or is there some other string-related behavior that would get
 broken by this?
It's because of auto-decoding. char[] is an array of chars, but it's been made a range of dchars. Calling front on a char[] decodes up to four chars into one dchar.

Obviously you can't take the address of the dchar, because it's just a return value.

You can't assign through front, because the number of chars could be different from what's currently there. The whole array would have to be re-arranged then, which would be unexpectedly costly for the user.

The auto-decoding behavior was chosen to make dealing with char[] less bug-prone. With auto-decoding you don't have to worry about cutting a multibyte sequence in half (can still cut a grapheme cluster in half, though). However, as you see, it creates other headaches, and it's considered a mistake by many. Getting rid of it now would be a major breaking change. Could be worthwhile, though.
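
A small sketch of what auto-decoding means in practice, assuming a UTF-8 string containing one two-byte character (written as the escape \u00E9 so the example does not depend on the source file's encoding). The array length counts code units, while the range primitives walk code points.

```d
import std.range.primitives : front, popFront, walkLength;

void main()
{
    char[] s = "h\u00E9llo".dup;

    // The array holds 6 code units: U+00E9 takes two bytes in UTF-8.
    assert(s.length == 6);

    // The range view yields 5 decoded code points.
    assert(s.walkLength == 5);

    s.popFront();            // drops 'h', one code unit
    dchar c = s.front;       // decodes two code units into one dchar
    assert(c == '\u00E9');

    s.popFront();            // drops both code units of U+00E9 at once
    assert(s == "llo");
}
```

This is why front cannot hand back a reference: the dchar it returns does not exist anywhere in the array, and writing a code point of a different encoded width would require shifting the rest of the data.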
May 01 2016
TheGag96 <thegag96 gmail.com> writes:
On Sunday, 1 May 2016 at 09:11:22 UTC, ag0aep6g wrote:
 It's because of auto-decoding. char[] is an array of chars, but 
 it's been made a range of dchars. Calling front on a char[] 
 decodes up to four chars into one dchar.

 Obviously you can't take the address of the dchar, because it's 
 just a return value.

 You can't assign through front, because the number of chars 
 could be different from what's currently there. The whole array 
 would have to be re-arranged then, which would be unexpectedly 
 costly for the user.

 The auto-decoding behavior was chosen to make dealing with 
 char[] less bug-prone. With auto-decoding you don't have to 
 worry about cutting a multibyte sequence in half (can still cut 
 a grapheme cluster in half, though). However, as you see, it 
 creates other headaches, and it's considered a mistake by many. 
 Getting rid of it now would be a major breaking change. Could 
 be worthwhile, though.
That's a shame. I'd really like to be able to remove a character from a character array... I suppose I just have to get used to casting to ubyte[] or maybe just using dchar[] instead.

I'm honestly surprised I haven't encountered this behavior sooner. It's interesting - as I keep coding in D and learning about how amazing it is to use, I always find these weird quirks here and there to remind me that it isn't flawless... I guess it can't be perfect, haha.
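
For the dchar[] option mentioned above, a minimal sketch might look like the following. Because dchar[] is not auto-decoded, each element is a whole code point stored in place, so remove can operate on it directly. The sample text is an assumption chosen to include one non-ASCII character.

```d
import std.algorithm.mutation : remove;

void main()
{
    // Each element is one full code point; no decoding happens in the range primitives.
    dchar[] thing = "h\u00E9llo"d.dup;

    // Removes the element at index 1 (U+00E9) as a single unit.
    thing = thing.remove(1);
    assert(thing == "hllo"d);
}
```

The trade-off is four bytes per element, and a user-perceived character made of several code points (a grapheme cluster) still spans more than one element.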
May 01 2016
ag0aep6g <anonymous example.com> writes:
On 02.05.2016 01:51, TheGag96 wrote:
 That's a shame. I'd really like to be able to remove a character from a
 character array... I suppose I just have to get used to casting to
 ubyte[] or maybe just using dchar[] instead.
Instead of casting you can also use `representation`[1] and `assumeUTF`[2] to convert to ubyte[] and back:

----
import std.algorithm : remove;
import std.string : assumeUTF, representation;
char[] thing = ['a', 'b', 'c'];
thing = thing.representation.remove(1).assumeUTF;
----

[1] https://dlang.org/phobos/std_string.html#.representation
[2] https://dlang.org/phobos/std_string.html#.assumeUTF
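
If this pattern comes up often, it could be wrapped in a small function. The sketch below is only an illustration: `removeCodeUnit` is a hypothetical name, not something Phobos provides, and it removes a single UTF-8 code unit rather than a whole character.

```d
import std.algorithm.mutation : remove;
import std.string : assumeUTF, representation;

// Hypothetical helper (not part of Phobos): removes the code unit at `index`.
// The caller must make sure `index` does not point into a multibyte sequence.
char[] removeCodeUnit(char[] s, size_t index)
{
    // View the chars as ubytes, remove in place, then view the result as chars again.
    return s.representation.remove(index).assumeUTF;
}

void main()
{
    char[] thing = ['a', 'b', 'c'];
    thing = removeCodeUnit(thing, 1);
    assert(thing == "ac");
}
```

For ASCII-only data this is equivalent to the one-liner above. For text that may contain multibyte characters, removing one code unit can leave invalid UTF-8 behind, so the index has to be chosen with that in mind.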
May 01 2016