digitalmars.D - Re: eliminate junk from std.string?

Jerry Quinn <jlquinn optonline.net> Jan 11 2011

Jerry Quinn <jlquinn optonline.net> Jan 11 2011

spir <denis.spir gmail.com> Jan 12 2011

Jerry Quinn <jlquinn optonline.net> writes:

Andrei Alexandrescu Wrote:

 On 1/11/11 1:45 PM, Jerry Quinn wrote:
 Unclear if iswhite() refers to ASCII whitespace or Unicode.  If Unicode, which
version of the standard?


 Not sure.
 
 enum dchar LS = '\u2028';  /// UTF line separator
 enum dchar PS = '\u2029';  /// UTF paragraph separator
 
 bool iswhite(dchar c)
 {
      return c <= 0x7F
          ? indexOf(whitespace, c) != -1
          : (c == PS || c == LS);
 }
 
 Which version?


This looks pretty incomplete if the goal is to return true for any Unicode
whitespace character.  My comment was really that if we're going to offer
functions like this, their behavior needs to be more completely defined.
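To make "incomplete" concrete, here is a minimal sketch of a check covering the code points that carry the Unicode White_Space property (the list as of Unicode 6.3 and later). The name isUnicodeWhiteSpace is hypothetical and not part of Phobos; treat it as an illustration of the gap, not as a proposed implementation.

```d
/// Hypothetical helper (not a Phobos function): returns true for code
/// points with the Unicode White_Space property, assuming the list
/// from Unicode 6.3 or later.
bool isUnicodeWhiteSpace(dchar c) pure nothrow @nogc @safe
{
    return (c >= 0x0009 && c <= 0x000D)   // TAB, LF, VT, FF, CR
        || c == 0x0020                    // SPACE
        || c == 0x0085                    // NEXT LINE
        || c == 0x00A0                    // NO-BREAK SPACE
        || c == 0x1680                    // OGHAM SPACE MARK
        || (c >= 0x2000 && c <= 0x200A)   // EN QUAD .. HAIR SPACE
        || c == 0x2028                    // LINE SEPARATOR
        || c == 0x2029                    // PARAGRAPH SEPARATOR
        || c == 0x202F                    // NARROW NO-BREAK SPACE
        || c == 0x205F                    // MEDIUM MATHEMATICAL SPACE
        || c == 0x3000;                   // IDEOGRAPHIC SPACE
}

unittest
{
    assert(isUnicodeWhiteSpace(' '));
    assert(isUnicodeWhiteSpace('\u00A0')); // not matched by the quoted iswhite()
    assert(!isUnicodeWhiteSpace('a'));
}
```

The quoted iswhite() only accepts LS and PS above 0x7F, so characters such as U+00A0 or U+3000 fall through; whichever set is chosen, the docs should say which one and which Unicode version it tracks.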


 Same comment for icmp().  Also, in the Unicode standard, case folding can
depend on the specific language.


 That uses toUniLower. Not sure how that works.


And doesn't mention details about the Unicode standard version it implements.


 You've got chop() marked as deprecated.  Is popBack() going to make
 sense as something that removes a variable number of chars from a
 string in the CR-LF case?  That might be a bit too magical.


 Well I found little use for chop in e.g. Perl. People either use chomp 
 or want to remove the last character. I think chop is useless.


Agreed, chomp is more useful.  My question is whether popBack() should
automatically act like Perl's chomp() for strings.
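For reference, a small sketch of the difference in question, assuming a recent Phobos where popBack lives in std.range.primitives. The helper dropLastCodePoint is hypothetical and exists only to make the comparison testable.

```d
import std.range.primitives : popBack;
import std.string : chomp;

/// Hypothetical helper: removes exactly one code point from the end,
/// which is what popBack() does on a string.
string dropLastCodePoint(string s)
{
    if (s.length)
        s.popBack();
    return s;
}

unittest
{
    // popBack() removes only the '\n', leaving the '\r' behind.
    assert(dropLastCodePoint("hello\r\n") == "hello\r");

    // chomp() removes the whole trailing line terminator.
    assert("hello\r\n".chomp() == "hello");

    // chomp() leaves a string without a terminator untouched.
    assert("hello".chomp() == "hello");
}
```

With this behavior popBack() stays a plain "drop one element" primitive, and the CR-LF handling stays in chomp().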

 One set of functions I'd like to see are startsWith() and endsWith().  I find
them frequently useful in Java and an irritating lack in the C++ standard
library.


 Yah, those are in std.algorithm. Ideally we'd move everything that's 
 applicable beyond strings to std.algorithm.


Ah, missed those.
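A quick sketch of those two functions, assuming a recent Phobos where they live in std.algorithm.searching. The helper isDSource is hypothetical.

```d
import std.algorithm.searching : startsWith, endsWith;

/// Hypothetical helper: true if the file name ends in ".d".
bool isDSource(string fileName)
{
    return fileName.endsWith(".d");
}

unittest
{
    assert("std.string".startsWith("std."));
    assert(isDSource("unicodedata.d"));
    assert(!isDSource("unicodedata.c"));

    // Being range algorithms, they also work beyond strings.
    assert([1, 2, 3].startsWith([1, 2]));
}
```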

Jerry

Jan 11 2011

Jerry Quinn <jlquinn optonline.net> writes:

Jerry Quinn Wrote:


 Same comment for icmp().  Also, in the Unicode standard, case folding can
depend on the specific language.


 That uses toUniLower. Not sure how that works.


 And doesn't mention details about the Unicode standard version it implements.


Actually it does.  *munch* *munch* my words are delicious.

It would be good to have better docs on what icmp() does.  Also, it might make
sense to do icmp() using Unicode case folding and normalization rather than
simple lowercasing.  Thinking out loud here.
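Roughly what I have in mind, sketched against the std.uni module in current Phobos (icmp, normalize, NFC), which is an assumption relative to the library discussed in this thread. The name equalIgnoringCase is hypothetical. Normalizing to NFC before folding is a simplification of full canonical caseless matching, and no language-specific rules (e.g. Turkish dotless i) are applied.

```d
import std.uni : icmp, normalize, NFC;

/// Hypothetical helper: compares two strings for equality, ignoring
/// case, after normalizing both to NFC. A simplification, not a full
/// implementation of Unicode canonical caseless matching.
bool equalIgnoringCase(string a, string b)
{
    return icmp(normalize!NFC(a), normalize!NFC(b)) == 0;
}

unittest
{
    assert(equalIgnoringCase("Hello", "hELLO"));

    // Precomposed U+00C9 versus 'e' followed by combining acute U+0301.
    assert(equalIgnoringCase("\u00C9", "e\u0301"));

    assert(!equalIgnoringCase("hello", "help"));
}
```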

Jan 11 2011

spir <denis.spir gmail.com> writes:

On 01/12/2011 07:22 AM, Jerry Quinn wrote:
 Jerry Quinn Wrote:


 Same comment for icmp().  Also, in the Unicode standard, case folding can
depend on the specific language.


 That uses toUniLower. Not sure how that works.


 And doesn't mention details about the Unicode standard version it implements.


 Actually it does.  *munch* *munch* my words are delicious.

 It would be good to have better docs on what icmp() does.  Also, it might make
sense to do icmp() using unicode case folding and normalization rather than
simple lowercase.  Thinking out loud here.


You'll get this very soon. (see 
https://bitbucket.org/stephan/dunicode/src/bcf19471ebf9/unicodedata.d 
for details)


Denis
_________________
vita es estrany
spir.wikidot.com

Jan 12 2011
