digitalmars.D - Re: eliminate junk from std.string?

Jerry Quinn <jlquinn optonline.net> writes:
Andrei Alexandrescu Wrote:

 On 1/11/11 1:45 PM, Jerry Quinn wrote:
 Unclear if iswhite() refers to ASCII whitespace or Unicode.  If Unicode, which
version of the standard?

Not sure.

    enum dchar LS = '\u2028'; /// UTF line separator
    enum dchar PS = '\u2029'; /// UTF paragraph separator

    bool iswhite(dchar c)
    {
        return c <= 0x7F
            ? indexOf(whitespace, c) != -1
            : (c == PS || c == LS);
    }

Which version?

This looks pretty incomplete if the goal is to return true for any Unicode whitespace character. My comment was really that if we're going to offer things like this, they need to be more completely defined.
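To make the gap concrete, here is a minimal sketch. It assumes a present-day Phobos (which postdates this thread) where `std.uni.isWhite` is available; `quotedIsWhite` and `asciiWhitespace` are hypothetical names that re-create the quoted logic, with `asciiWhitespace` standing in for the `whitespace` constant the snippet refers to.

```d
import std.string : indexOf;
import std.uni : isWhite;

enum dchar LS = '\u2028'; // line separator
enum dchar PS = '\u2029'; // paragraph separator

// Assumption: stand-in for the `whitespace` constant used in the quoted code.
enum string asciiWhitespace = " \t\v\r\n\f";

// Re-creation of the quoted logic (hypothetical name).
bool quotedIsWhite(dchar c)
{
    return c <= 0x7F
        ? indexOf(asciiWhitespace, c) != -1
        : (c == PS || c == LS);
}

void main()
{
    // U+00A0 NO-BREAK SPACE, U+2003 EM SPACE, U+3000 IDEOGRAPHIC SPACE
    foreach (dchar c; "\u00A0\u2003\u3000"d)
    {
        assert(!quotedIsWhite(c)); // missed by the quoted logic
        assert(isWhite(c));        // recognized by std.uni.isWhite
    }

    // Both agree on ASCII space and on the two separators handled explicitly.
    assert(quotedIsWhite(' ') && isWhite(' '));
    assert(quotedIsWhite(LS) && isWhite(LS));
}
```

The three sample code points are space separators in the Unicode character database, which is the kind of case the quoted version does not cover.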
 Same comment for icmp().  Also, in the Unicode standard, case folding can
depend on the specific language.

That uses toUniLower. Not sure how that works.

And doesn't mention details about the Unicode standard version it implements.
 You've got chop() marked as deprecated.  Is popBack() going to make
 sense as something that removes a variable number of chars from a
 string in the CR-LF case?  That might be a bit too magical.

Well I found little use for chop in e.g. Perl. People either use chomp or want to remove the last character. I think chop is useless.

Agreed, chomp is more useful. My question is whether popBack() should automatically act like Perl's chomp() for strings or not.
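For reference, a small sketch of how the three operations differ on a CR-LF terminated line. It assumes a present-day Phobos rather than the 2011 library under discussion, so it shows how things behave there, not necessarily what was decided in this thread.

```d
import std.range.primitives : popBack;
import std.string : chomp, chop;

void main()
{
    string line = "hello\r\n";

    // chomp strips a trailing line terminator and nothing else.
    assert(chomp(line) == "hello");
    assert(chomp("hello") == "hello");

    // chop always drops the last character, but treats "\r\n" as one unit.
    assert(chop(line) == "hello");
    assert(chop("hello") == "hell");

    // popBack removes exactly one code point, so the '\r' survives.
    string s = line;
    s.popBack();
    assert(s == "hello\r");
}
```

In this sketch popBack() stays a plain range primitive, and the variable-length CR-LF handling lives only in chop() and chomp().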
 One set of functions I'd like to see are startsWith() and endsWith().  I find
them frequently useful in Java and an irritating lack in the C++ standard
library.

Yah, those are in std.algorithm. Ideally we'd move everything that's applicable beyond strings to std.algorithm.
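A short sketch of what that looks like in use, assuming a present-day Phobos where the functions are importable from `std.algorithm.searching`. Because they are range algorithms, the same calls work on non-string ranges too.

```d
import std.algorithm.searching : endsWith, startsWith;

void main()
{
    // Strings
    assert("std.string".startsWith("std."));
    assert("std.string".endsWith(".string"));
    assert(!"std.string".startsWith("core."));

    // Any suitable range, not just strings
    assert([1, 2, 3, 4].startsWith([1, 2]));
    assert([1, 2, 3, 4].endsWith([3, 4]));
}
```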

Ah, missed those.

Jerry
Jan 11 2011
Jerry Quinn <jlquinn optonline.net> writes:
Jerry Quinn Wrote:


 Same comment for icmp().  Also, in the Unicode standard, case folding can
depend on the specific language.

That uses toUniLower. Not sure how that works.

And doesn't mention details about the Unicode standard version it implements.

Actually it does. *munch* *munch* my words are delicious. It would be good to have better docs on what icmp() does. Also, it might make sense to do icmp() using Unicode case folding and normalization rather than simple lowercasing. Thinking out loud here.
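To illustrate the difference between lowercasing and case folding, here is a hedged sketch against a present-day `std.uni` (not the 2011 code discussed above). There, as far as I know, `icmp` follows full case folding, `sicmp` follows simple case folding, and neither normalizes its input.

```d
import std.uni : NFC, icmp, normalize, sicmp, toLower;

void main()
{
    // Lowercasing alone does not equate "ß" with "ss".
    assert(toLower("Straße") != toLower("STRASSE"));

    // Simple case folding maps one code point to one code point, so no match.
    assert(sicmp("Straße", "STRASSE") != 0);

    // Full case folding allows the one-to-many mapping ß -> ss.
    assert(icmp("Straße", "STRASSE") == 0);

    // Case folding is not normalization: "e" + COMBINING ACUTE ACCENT
    // differs from precomposed U+00E9 until both are normalized.
    string decomposed = "e\u0301";
    string precomposed = "\u00E9";
    assert(icmp(decomposed, precomposed) != 0);
    assert(normalize!NFC(decomposed) == precomposed);
}
```

A comparison that wants both properties would normalize first and then compare with folding; language-specific rules (the Turkish dotless i, for example) are a separate matter that this sketch does not address.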
Jan 11 2011
spir <denis.spir gmail.com> writes:
On 01/12/2011 07:22 AM, Jerry Quinn wrote:
 Jerry Quinn Wrote:


 Same comment for icmp().  Also, in the Unicode standard, case folding can
depend on the specific language.

That uses toUniLower. Not sure how that works.

And doesn't mention details about the Unicode standard version it implements.

Actually it does. *munch* *munch* my words are delicious. It would be good to have better docs on what icmp() does. Also, it might make sense to do icmp() using Unicode case folding and normalization rather than simple lowercasing. Thinking out loud here.

You'll get this very soon (see https://bitbucket.org/stephan/dunicode/src/bcf19471ebf9/unicodedata.d for details).

Denis
_________________
vita es estrany
spir.wikidot.com
Jan 12 2011