digitalmars.D - Should this work?

Manu <turkeyman gmail.com> writes:

This works fine:
  string x = find("Hello", 'H');

This doesn't:
  string y = find(retro("Hello"), 'H');
  > Error: cannot implicitly convert expression (find(retro("Hello"), 'H')) of type Result!() to string

Is that wrong? That seems to be how the docs suggest it should be used.

On a side note, am I the only one that finds std.algorithm/std.range/etc
for string processing really obtuse?
I can rarely understand the error messages, so to say it's better than STL is
optimistic.
Using std.algorithm and std.range to do string manipulation feels really
lame to me.
I hate looking through the docs of 3-4 modules to understand the complete
set of useful string operations (std.string, std.uni, std.algorithm,
std.range... at least).
I also find the names of the generic algorithms are often unrelated to the
name of the string operation.
My feeling is, everyone is always on about how cool D is at string, but
other than 'char[]', and the builtin slice operator, I feel really
unproductive whenever I do any heavy string manipulation in D.
I also hate that I need to import at least 4-5 modules to do anything
useful with strings... I feel my program bloating and cringe with every
gigantic import that sources exactly one symbol.

Jan 09 2014

"Tobias Pankrath" <tobias pankrath.net> writes:

On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:
 This works fine:
   string x = find("Hello", 'H');

 This doesn't:
   string y = find(retro("Hello"), 'H');
   > Error: cannot implicitly convert expression (find(retro("Hello"), 'H')) of type Result!() to string

 Is that wrong? That seems to be how the docs suggest it should 
 be used.

 On a side note, am I the only one that finds 
 std.algorithm/std.range/etc
 for string processing really obtuse?
 I can rarely understand the error messages, so to say it's better 
 than STL is
 optimistic.
 Using std.algorithm and std.range to do string manipulation 
 feels really
 lame to me.
 I hate looking through the docs of 3-4 modules to understand 
 the complete
 set of useful string operations (std.string, std.uni, 
 std.algorithm,
 std.range... at least).
 I also find the names of the generic algorithms are often 
 unrelated to the
 name of the string operation.
 My feeling is, everyone is always on about how cool D is at 
 string, but
 other than 'char[]', and the builtin slice operator, I feel 
 really
 unproductive whenever I do any heavy string manipulation in D.
 I also hate that I need to import at least 4-5 modules to do 
 anything
 useful with strings... I feel my program bloating and cringe 
 with every
 gigantic import that sources exactly one symbol.

std.algorithm.find returns the type it gets as input, so it's 
retro's return type and not string. I agree that it isn't always 
obvious which types are expected or returned in std.algorithm and 
especially std.container.
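
To make the point about return types concrete, here is a minimal sketch. It assumes a current Phobos in which the range returned by retro for a string exposes the wrapped string through a public `source` member; the searched-for character 'l' is chosen only for illustration.

```d
import std.algorithm : equal, find;
import std.range : retro;

void main()
{
    // With a string haystack, find hands back a string.
    auto forward = find("Hello", 'l');
    static assert(is(typeof(forward) == string));
    assert(forward == "llo");

    // With a retro-wrapped haystack, find hands back the wrapper type,
    // which does not implicitly convert to string.
    auto backward = find(retro("Hello"), 'l');
    static assert(is(typeof(backward) == typeof(retro("Hello"))));
    static assert(!is(typeof(backward) == string));

    // The result iterates the remaining characters in reverse order.
    assert(backward.equal("lleH"));

    // Assumption: `source` is the wrapped string, trimmed from the back.
    assert(backward.source == "Hell");
}
```

The static asserts show why the assignment to string in the original post is rejected: the second call never produces a string in the first place.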

Jan 09 2014

"Tobias Pankrath" <tobias pankrath.net> writes:

On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:
 Is that wrong? That seems to be how the docs suggest it should 
 be used.

--
string s = find(retro("Hello"), "H").source;
--
Is that working?

Jan 09 2014

Manu <turkeyman gmail.com> writes:

On 10 January 2014 00:19, Tobias Pankrath <tobias pankrath.net> wrote:

 On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:

 Is that wrong? That seems to be how the docs suggest it should be used.

 --
 string s = find(retro("Hello"), "H").source;
 --
 Is that working?

If I have to type that, I'm going to write my own string library...
There's no argument where that can be considered superior to:
strrchr("Hello", 'H');
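
For comparison, a strrchr-like helper can be sketched on top of std.string.lastIndexOf. The name `tailFromLast` is hypothetical and not part of Phobos; it returns the slice starting at the last occurrence of the character, or null when there is none.

```d
import std.string : lastIndexOf;

// Hypothetical helper, not part of Phobos: the slice of `s` that starts
// at the last occurrence of `c`, or null if `c` does not occur.
string tailFromLast(string s, dchar c)
{
    // lastIndexOf returns a code unit offset, or -1 when nothing is found.
    immutable i = s.lastIndexOf(c);
    return i < 0 ? null : s[i .. $];
}

void main()
{
    assert(tailFromLast("Hello", 'l') == "lo");
    assert(tailFromLast("Hello", 'H') == "Hello");
    assert(tailFromLast("Hello", 'x') is null);
}
```

Because the offset is in code units, it can be used for slicing without any further conversion.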

Jan 09 2014

Marco Leise <Marco.Leise gmx.de> writes:

Am Fri, 10 Jan 2014 00:33:28 +1000
schrieb Manu <turkeyman gmail.com>:

 On 10 January 2014 00:19, Tobias Pankrath <tobias pankrath.net> wrote:
 
 On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:

 Is that wrong? That seems to be how the docs suggest it should be used.

 --
 string s = find(retro("Hello"), "H").source;
 --
 Is that working?

 
 If I have to type that, I'm going to write my own string library...
 There's no argument where that can be considered superior to:
 strrchr("Hello", 'H');

 
If you do let me know, we can merge the efforts.
Coincidentally what I started uses your std.simd:
http://code.dlang.org/packages/fast
https://github.com/mleise/fast

I haven't pushed the latest changes which include updates to
the latest D versions and switching between lookup tables and
SSE3 for char in string search.

The idea is to build a collection of the fastest versions of
basic utility functions. No safety nets, no garbage collection.

-- 
Marco

Jan 09 2014

Manu <turkeyman gmail.com> writes:

On 10 January 2014 01:04, Marco Leise <Marco.Leise gmx.de> wrote:

 Am Fri, 10 Jan 2014 00:33:28 +1000
 schrieb Manu <turkeyman gmail.com>:

 On 10 January 2014 00:19, Tobias Pankrath <tobias pankrath.net> wrote:

 On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:

 Is that wrong? That seems to be how the docs suggest it should be used.

 --
 string s = find(retro("Hello"), "H").source;
 --
 Is that working?

 If I have to type that, I'm going to write my own string library...
 There's no argument where that can be considered superior to:
 strrchr("Hello", 'H');

 If you do let me know, we can merge the efforts.
 Coincidentally what I started uses your std.simd:
 http://code.dlang.org/packages/fast
 https://github.com/mleise/fast

 I haven't pushed the latest changes which include updates to
 the latest D versions and switching between lookup tables and
 SSE3 for char in string search.

 The idea is to build a collection of the fastest versions of
 basic utility functions. No safety nets, no garbage collection.

Awesome! Although it looks like you still have a lot of work ahead of you :)

Jan 09 2014

Marco Leise <Marco.Leise gmx.de> writes:

Am Fri, 10 Jan 2014 01:20:26 +1000
schrieb Manu <turkeyman gmail.com>:

 Awesome! Although it looks like you still have a lot of work ahead of you :)

 
So... when was std.simd going to be in Phobos again? :p

-- 
Marco

Jan 09 2014

Manu <turkeyman gmail.com> writes:

On 10 January 2014 01:56, Marco Leise <Marco.Leise gmx.de> wrote:

 Am Fri, 10 Jan 2014 01:20:26 +1000
 schrieb Manu <turkeyman gmail.com>:

 Awesome! Although it looks like you still have a lot of work ahead of you :)

 So... when was std.simd going to be in Phobos again? :p

When there are a zillion unit tests >_<
And I kinda wanna prove it is efficient on other architectures before it is
committed to the stone tablet that is phobos; that can never be changed
once committed.

Jan 09 2014

Marco Leise <Marco.Leise gmx.de> writes:

Am Fri, 10 Jan 2014 02:21:35 +1000
schrieb Manu <turkeyman gmail.com>:

 On 10 January 2014 01:56, Marco Leise <Marco.Leise gmx.de> wrote:
 
 Am Fri, 10 Jan 2014 01:20:26 +1000
 schrieb Manu <turkeyman gmail.com>:

 Awesome! Although it looks like you still have a lot of work ahead of you :)

 So... when was std.simd going to be in Phobos again? :p

 
 When there are a zillion unit tests >_<
 And I kinda wanna prove it is efficient on other architectures before it is
 committed to the stone tablet that is phobos; that can never be changed
 once committed.

 
I think Phobos should follow OpenGL in this regard and use a
prefix like `etc` for useful but not finalized modules, so
early adopters can try out new modules, compare them with any
existing API in Phobos where applicable (e.g. streams,
json, ...) and report any issues. I have a feeling that right
now most modules are tested by 2 people prior to the merge,
because they spent a life in obscurity.

-- 
Marco

Jan 09 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-09 17:35, Marco Leise wrote:

 I think Phobos should follow OpenGL in this regard and use a
 prefix like `etc` for useful but not finalized modules, so
 early adopters can try out new modules, compare them with any
 existing API in Phobos where applicable (e.g. streams,
 json, ...) and report any issues. I have a feeling that right
 now most modules are tested by 2 people prior to the merge,
 because they spent a life in obscurity.

That has been suggested before and the counter argument is that people 
will start using it and complain when it's changed, even if it's in an 
experimental package. Someone here said that the javax packages were 
originally experimental packages, so they continued to live in the javax 
namespace to avoid breaking changes.

-- 
/Jacob Carlborg

Jan 09 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Jan 09, 2014 at 09:19:40PM +0100, Jacob Carlborg wrote:
 On 2014-01-09 17:35, Marco Leise wrote:
 
I think Phobos should follow OpenGL in this regard and use a
prefix like `etc` for useful but not finalized modules, so
early adopters can try out new modules, compare them with any
existing API in Phobos where applicable (e.g. streams,
json, ...) and report any issues. I have a feeling that right
now most modules are tested by 2 people prior to the merge,
because they spent a life in obscurity.

 
 That has been suggested before and the counter argument is that
 people will start using it and complain when it's changed, even if it's
 in an experimental package. Someone here said that the javax packages
 were originally experimental packages, so they continued to live in
 the javax namespace to avoid breaking changes.

[...]

Maybe instead of calling it 'etc' we should outright call it
'experimental'. If you have code like:

	import experimental.myawesomemodule;
	...

I doubt you'd object very much when you have to rename it to:

	import std.myawesomemodule;
	...

since the word 'experimental' staring you in the face every time you
open up the file will be a constant nagging reminder that you're
depending on something unstable, giving you motivation to want to move
it to something stable as soon as you can.


T

-- 
"I speak better English than this villain Bush" -- Mohammed Saeed al-Sahaf,
Iraqi Minister of Information

Jan 09 2014

"Brad Anderson" <eco gnuk.net> writes:

On Thursday, 9 January 2014 at 20:40:30 UTC, H. S. Teoh wrote:
 On Thu, Jan 09, 2014 at 09:19:40PM +0100, Jacob Carlborg wrote:
 On 2014-01-09 17:35, Marco Leise wrote:
 
I think Phobos should follow OpenGL in this regard and use a
prefix like `etc` for useful but not finalized modules, so
early adopters can try out new modules, compare them with any
existing API in Phobos where applicable (e.g. streams,
json, ...) and report any issues. I have a feeling that right
now most modules are tested by 2 people prior to the merge,
because they spent a life in obscurity.

 
 That has been suggested before and the counter argument is that
 people will start using it and complain when it's changed, even if it's
 in an experimental package. Someone here said that the javax packages
 were originally experimental packages, so they continued to live in
 the javax namespace to avoid breaking changes.

 [...]

 Maybe instead of calling it 'etc' we should outright call it
 'experimental'. If you have code like:

 	import experimental.myawesomemodule;
 	...

 I doubt you'd object very much when you have to rename it to:

 	import std.myawesomemodule;
 	...

 since the word 'experimental' staring you in the face every 
 time you
 open up the file will be a constant nagging reminder that you're
 depending on something unstable, giving you motivation to want 
 to move
 it to something stable as soon as you can.


 T

I was of the opinion that phobos needed an experimental section 
for getting real world testing of proposed modules but these days 
I think we should just stick things up on dub (including modules 
proposed for inclusion in phobos).

Jan 09 2014

Marco Leise <Marco.Leise gmx.de> writes:

Am Thu, 09 Jan 2014 23:32:37 +0000
schrieb "Brad Anderson" <eco gnuk.net>:

 On Thursday, 9 January 2014 at 20:40:30 UTC, H. S. Teoh wrote:
 On Thu, Jan 09, 2014 at 09:19:40PM +0100, Jacob Carlborg wrote:
 On 2014-01-09 17:35, Marco Leise wrote:
 
I think Phobos should follow OpenGL in this regard and use a
prefix like `etc` for useful but not finalized modules, so
early adopters can try out new modules, compare them with any
existing API in Phobos where applicable (e.g. streams,
json, ...) and report any issues. I have a feeling that right
now most modules are tested by 2 people prior to the merge,
because they spent a life in obscurity.

 
 That has been suggested before and the counter argument is that
 people will start using it and complain when it's changed, even if it's
 in an experimental package. Someone here said that the javax packages
 were originally experimental packages, so they continued to live in
 the javax namespace to avoid breaking changes.

 [...]

 Maybe instead of calling it 'etc' we should outright call it
 'experimental'. If you have code like:

 	import experimental.myawesomemodule;
 	...

 I doubt you'd object very much when you have to rename it to:

 	import std.myawesomemodule;
 	...

 since the word 'experimental' staring you in the face every 
 time you
 open up the file will be a constant nagging reminder that you're
 depending on something unstable, giving you motivation to want 
 to move
 it to something stable as soon as you can.


 T

 
 I was of the opinion that phobos needed an experimental section 
 for getting real world testing of proposed modules but these days 
 I think we should just stick things up on dub (including modules 
 proposed for inclusion in phobos).

Dub is a nice extension to D, but it falls way too short for
what I expect package management to do on Linux:

o system wide installation of shared libraries
o keep one library per ABI, where ABI is a cross of:
  (x86,amd64) x (dmd,ldc,gdc) x (compiler version)
o therefore accept custom library installation paths
o remove packages that weren't explicitly requested and are
  not a dependency of something else

Actually I don't expect dub to do all that. It duplicates
parts of the existing package manager, which has to be used to
seamlessly integrate with the rest of the system anyway.
Dub does make building foreign packages a snap and that's what
it is great for, but soon people will expect complete
applications with their dependencies to be in the package
list. At least as soon as someone writes something _popular_
in D that uses more than just DMD. (This excludes vibe.d for
example, which is - I think - the most notable D product
outside of this community.)

-- 
Marco

Jan 10 2014

Manu <turkeyman gmail.com> writes:

On 10 January 2014 06:19, Jacob Carlborg <doob me.com> wrote:

 On 2014-01-09 17:35, Marco Leise wrote:

  I think Phobos should follow OpenGL in this regard and use a
 prefix like `etc` for useful but not finalized modules, so
 early adopters can try out new modules, compare them with any
 existing API in Phobos where applicable (e.g. streams,
 json, ...) and report any issues. I have a feeling that right
 now most modules are tested by 2 people prior to the merge,
 because they spent a life in obscurity.

 That has been suggested before and the counter argument is that people
 will start using it and complain when it's changed, even if it's in an
 experimental package.


I've heard that, and I think that's a lame argument. Would people rather
break people's code *who deliberately chose to use a beta feature, and
accept the contract while doing so (that it would later be moved to 'std'
proper)*, or consistently produce features that have very little proven
foundation in practical application? It takes year(/s) before enough people
can have had a crack at a new API in enough scenarios to reveal where it
went right, and where it went wrong.

In the case of std.simd, I'm not ever going to consider presenting it for
inclusion until such a time I'm absolutely happy with it (although in this
case, it's also just not finished ;), and since it's not readily available,
that really just relies on my using it in enough of my own projects that I
manage to satisfy myself... it makes no sense.

 Someone here said that the javax packages were originally experimental
 packages, so they continued to live in the javax namespace to avoid breaking
 changes.

 --
 /Jacob Carlborg

Jan 09 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 01:57, Manu wrote:

 I've heard that, and I think that's a lame argument. Would people rather
 break people's code *who deliberately chose to use a beta feature, and
 accept the contract while doing so (that it would later be moved to
 'std' proper)*, or consistently produce features that have very little
 proven foundation in practical application? It takes year(/s) before
 enough people can have had a crack at a new API in enough scenarios to
 reveal where it went right, and where it went wrong.

I think it's a good idea, others don't.

-- 
/Jacob Carlborg

Jan 09 2014

Marco Leise <Marco.Leise gmx.de> writes:

Am Fri, 10 Jan 2014 08:42:14 +0100
schrieb Jacob Carlborg <doob me.com>:

 On 2014-01-10 01:57, Manu wrote:
 
 I've heard that, and I think that's a lame argument. Would people rather
 break people's code *who deliberately chose to use a beta feature, and
 accept the contract while doing so (that it would later be moved to
 'std' proper)*, or consistently produce features that have very little
 proven foundation in practical application? It takes year(/s) before
 enough people can have had a crack at a new API in enough scenarios to
 reveal where it went right, and where it went wrong.

 
 I think it's a good idea, others don't.

When do we have a meeting of the elders to decide on this
matter?

-- 
Marco

Jan 10 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

Am 09.01.2014 15:07, schrieb Manu:
 On a side note, am I the only one that finds std.algorithm/std.range/etc
 for string processing really obtuse?
 I can rarely understand the error messages, so to say it's better than STL
 is optimistic.
 Using std.algorithm and std.range to do string manipulation feels really
 lame to me.
 I hate looking through the docs of 3-4 modules to understand the
 complete set of useful string operations (std.string, std.uni,
 std.algorithm, std.range... at least).
 I also find the names of the generic algorithms are often unrelated to
 the name of the string operation.
 My feeling is, everyone is always on about how cool D is at string, but
 other than 'char[]', and the builtin slice operator, I feel really
 unproductive whenever I do any heavy string manipulation in D.
 I also hate that I need to import at least 4-5 modules to do anything
 useful with strings... I feel my program bloating and cringe with every
 gigantic import that sources exactly one symbol.


I feel exactly the same. C# has way more utility functions that are
named in a way that actually helps you understand what they do.

The best example in D is the deprecation of indexOf. Now you have to call 
countUntil. But if I have to choose between the two names, indexOf 
actually tells me what it does, while countUntil does not. count until 
what? The confusion mostly comes from the condition which is a template 
argument with default value. Not to speak of the issues with UTF8 
characters, where countUntil does not actually give you a index into the 
array, but actually gives you the index of the character it found. So 
you can't use whatever comes out of countUntil for slicing.
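A minimal sketch of that difference, assuming a current Phobos where `std.string.indexOf` reports a code-unit offset and `std.algorithm.countUntil` counts decoded code points (the test string is made up for illustration):

```d
import std.algorithm : countUntil;
import std.string : indexOf;

void main()
{
    // U+00E4 (a with diaeresis) takes two UTF-8 code units, 'b' takes one.
    string s = "\u00E4b";

    // indexOf yields an offset into the underlying array, so it can be sliced.
    auto byteIndex = s.indexOf('b');
    assert(byteIndex == 2);
    assert(s[byteIndex .. $] == "b");

    // countUntil yields the number of code points stepped over instead.
    auto pointCount = s.countUntil('b');
    assert(pointCount == 1);

    // Slicing with that count cuts into the middle of the first character.
    assert(s[pointCount .. $] != "b");
}
```

For pure ASCII input the two numbers coincide, which is probably why the mismatch is easy to miss.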


-- 
Kind Regards
Benjamin Thaut

Jan 09 2014

"Kira Backes" <kira.backes nrwsoft.de> writes:

  On Thursday, 9 January 2014 at 14:25:20 UTC, Benjamin Thaut 
wrote:
 The best example in D is the deprecation of indexOf. Now you 
 have to call countUntil. But if I have to choose between the 
 two names, indexOf actually tells me what it does, while 
 countUntil does not. count until what?

std.algorithm.indexOf was deprecated, not std.string.indexOf, so 
you can still use it of course and it still gives you the byte 
(array-access) index of the supplied parameter. And countUntil 
counts elements until it finds the supplied parameter. I think 
this is logical and useful and easy to understand.

Jan 09 2014

"Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:

On Thursday, 9 January 2014 at 14:25:20 UTC, Benjamin Thaut wrote:

 that are named in a way that actually helps you understand what 
 they do.

Interesting, I've had the opposite experience. I keep trying to 

course ever more desired.



     if(string.IsNullOrEmpty(str))

vs

     if(str.empty)

keeps throwing me off.

Jan 09 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 02:04, Jesse Phillips wrote:

 Interesting, I've had the opposite experience. I keep trying to perform

 more desired.



      if(string.IsNullOrEmpty(str))

 vs

      if(str.empty)

 keeps throwing me off.

Or as in Ruby on Rails:

if str.blank?
end

"str" is considered blank if:

* it's nil (null)
* it's empty (its length is 0)
* it only contains whitespace

BTW, it works on all objects, not just strings. For arrays it will check 
the length as well, but for other objects it will just check for nil.

-- 
/Jacob Carlborg

Jan 09 2014

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:
 This works fine:
   string x = find("Hello", 'H');

 This doesn't:
   string y = find(retro("Hello"), 'H');
   > Error: cannot implicitly convert expression 
 (find(retro("Hello"), 'H'))
 of type Result!() to string

In order to return the result as a string it would require an 
allocation. You have to request that allocation (and associated 
eager evaluation) explicitly

string y = "Hello".retro.find('H').to!string;


However, I think to get the expected result from unicode you need

string y = "Hello".byGrapheme.retro.find('H').to!string;

but I might be wrong.
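To make the type issue concrete, here is a rough sketch. It searches for 'l' rather than 'H' so the results are easier to tell apart, and it goes through `std.array.array` before converting, which is just one possible way to materialise the lazy range:

```d
import std.algorithm : equal, find;
import std.array : array;
import std.conv : to;
import std.range : retro;

void main()
{
    // find hands back the range type it was given, so a string stays a string.
    string x = find("Hello", 'l');
    assert(x == "llo");

    // retro wraps the string in a lazy reversed view; find returns that wrapper.
    auto reversed = find(retro("Hello"), 'l');
    static assert(!is(typeof(reversed) : string));
    assert(reversed.equal("lleH"));

    // Materialising the view allocates: first a dchar[], then a UTF-8 string.
    string y = reversed.array.to!string;
    assert(y == "lleH");
}
```

Note that the result is the reversed tail "lleH", not a slice of the original string.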

Jan 09 2014

Manu <turkeyman gmail.com> writes:

On 10 January 2014 00:34, John Colvin <john.loughran.colvin gmail.com>wrote:

 On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:

 This works fine:
   string x = find("Hello", 'H');

 This doesn't:
   string y = find(retro("Hello"), 'H');
   > Error: cannot implicitly convert expression (find(retro("Hello"),
 'H'))
 of type Result!() to string

 In order to return the result as a string it would require an allocation.
 You have to request that allocation (and associated eager evaluation)
 explicitly

 string y = "Hello".retro.find('H').to!string;

Ah yes. Well I really just want the offset anyway...


However, I think to get the expected result from unicode you need
 string y = "Hello".byGrapheme.retro.find('H').to!string;

 but I might be wrong.

Bugger that. This is not an example of "D is good at strings!".

Jan 09 2014

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Thursday, 9 January 2014 at 15:14:04 UTC, Manu wrote:
 However, I think to get the expected result from unicode you 
 need
 string y = "Hello".byGrapheme.retro.find('H').to!string;

 but I might be wrong.

 Bugger that. This is not an example of "D is good at strings!".

Agreed. std.range and std.algorithm should be unicode correct 
with strings and leave the byte by byte access to ubyte arrays.

Jan 09 2014

"Dicebot" <public dicebot.lv> writes:

On Thursday, 9 January 2014 at 15:14:04 UTC, Manu wrote:
 However, I think to get the expected result from unicode you 
 need
 string y = "Hello".byGrapheme.retro.find('H').to!string;

 but I might be wrong.

 Bugger that. This is not an example of "D is good at strings!".

I have no idea how you are going to get the same functionality in C 
with strchr. This small line uses quite a lot of features to be 
reliably unicode-correct.

Jan 09 2014

Manu <turkeyman gmail.com> writes:

On 10 January 2014 02:05, Dicebot <public dicebot.lv> wrote:

 On Thursday, 9 January 2014 at 15:14:04 UTC, Manu wrote:

 However, I think to get the expected result from unicode you need

 string y = "Hello".byGrapheme.retro.find('H').to!string;

 but I might be wrong.

 Bugger that. This is not an example of "D is good at strings!".

 I have 0 ideas how are you going to get same functionality in C with
 strchr. This small line uses quite lot of features to be reliably
 unicode-correct.

It's nice that it's unicode correct, but it's not nice that you have to be
familiar with a massive amount of the standard library and you need to
search through 4-5 (huge! and often poorly documented) modules to find the
functions you need to perform _basic string operations_, like finding the
last instance of a character...
My standing opinion is that string manipulation in D is not nice, it is
possibly the most difficult and time consuming I have used in any language
ever. Am I alone?

Jan 09 2014

"Dicebot" <public dicebot.lv> writes:

On Thursday, 9 January 2014 at 16:22:08 UTC, Manu wrote:
 It's nice that it's unicode correct, but it's not nice that you 
 have to be
 familiar with a massive amount of the standard library and you 
 need to
 search through 4-5 (huge! and often poorly documented) modules 
 to find the
 functions you need to perform _basic string operations_, like 
 finding the
 last instance of a character...

That I do agree. One idea is that once everything is split into 
smaller packages we can start providing meta-packages that do 
public imports of small sets of commonly used functions.

Still once needed functions are found I do consider end result 
very robust for what it actually does and don't know any other 
language that does it better.

 My standing opinion is that string manipulation in D is not 
 nice, it is
 possibly the most difficult and time consuming I have used in 
 any language
 ever. Am I alone?

Unicode is the doom. If you only keep ASCII in mind your statement 
is indeed true and D stuff seems ridiculously complicated 
compared even to plain C. But it has also taught me that _every 
single_ program I have written before in other languages was 
broken in regards to Unicode handling. So, yes, it is quite 
difficult but it is the cost for doing what no one else does - 
being correct out of the box. Well, at least in most scenarios :)

Jan 09 2014

Manu <turkeyman gmail.com> writes:

On 10 January 2014 02:36, Dicebot <public dicebot.lv> wrote:

 On Thursday, 9 January 2014 at 16:22:08 UTC, Manu wrote:

 It's nice that it's unicode correct, but it's not nice that you have to be
 familiar with a massive amount of the standard library and you need to
 search through 4-5 (huge! and often poorly documented) modules to find the
 functions you need to perform _basic string operations_, like finding the
 last instance of a character...

 That I do agree. One idea is that once everything is split into smaller
 packages we can start providing meta-packages that do public imports of
 small sets of commonly used functions.

 Still once needed functions are found I do consider end result very robust
 for what it actually does and don't know any other language that does it
 better.


  My standing opinion is that string manipulation in D is not nice, it is
 possibly the most difficult and time consuming I have used in any language
 ever. Am I alone?

 Unicode is the doom. If you only keep ASCII in mind your statement is
 indeed true and D stuff seems ridiculously complicated compared even to
 plain C. But it has also taught me that _every single_ program I have
 written before in other languages was broken in regards to Unicode
 handling. So, yes, it is quite difficult but it is the cost for doing what
 no one else does - being correct out of the box. Well, at least in most
 scenarios :)

That's great and all, but it's no good if I have to pay for it (time and
money!) even when that's not a requirement. I'm dealing with ascii right
now.
At very least, there needs to be massive assistance. std.string should
probably offer a crap load of aliases and wrappers for common operations.
And I hate how std.algorithm looks in intellisense pop ups, you never have
any idea what types you're dealing with, everything is templates, many
levels deep.
And then it's riddled with these little wrappers around 'Impl' types, which
just adds more layers to the typing confusion.
I want string functions that deal with types like 'string', not
'Unqual!(ElementEncodingType!(ElementType!Range))[]'

Jan 09 2014

"Dicebot" <public dicebot.lv> writes:

On Thursday, 9 January 2014 at 17:21:11 UTC, Manu wrote:
 money!) even when that's not a requirement. I'm dealing with 
 ascii right
 now.

Then first (and mandatory) thing to do is stop using `string` 
type and switch to `ubyte[]` (or wrapper from std.ascii)

Second thing to do then will be to complain about lack of 
`ubyte[]` specializations/overloads for most string processing 
functions ;)
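A small sketch of what the byte-level route could look like, assuming `std.string.representation` to view the string as `immutable(ubyte)[]` without copying. The helper name `lastByteIndex` is made up for illustration and is not a Phobos function:

```d
import std.algorithm : countUntil;
import std.range : retro;
import std.string : representation;

/// Hypothetical helper: index of the last occurrence of an ASCII byte, or -1.
ptrdiff_t lastByteIndex(string s, char c)
{
    // View the string as raw bytes: no copy and no UTF decoding.
    immutable(ubyte)[] bytes = s.representation;

    // Walk the bytes back to front and count how many precede the match.
    auto fromEnd = bytes.retro.countUntil(cast(ubyte) c);
    return fromEnd < 0 ? -1 : cast(ptrdiff_t) bytes.length - 1 - fromEnd;
}

void main()
{
    assert(lastByteIndex("Hello", 'l') == 3);
    assert(lastByteIndex("Hello", 'z') == -1);
}
```

This is only meaningful when the needle is ASCII, since a byte search cannot match a multi-byte character.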

Jan 09 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-09 18:20, Manu wrote:

 That's great and all, but it's no good if I have to pay for it (time and
 money!) even when that's not a requirement. I'm dealing with ascii right
 now.

There are a couple of functions in std.ascii, but not what you needed here.

-- 
/Jacob Carlborg

Jan 09 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/9/14 8:21 AM, Manu wrote:
 My standing opinion is that string manipulation in D is not nice, it is
 possibly the most difficult and time consuming I have used in any
 language ever. Am I alone?

No, but probably in the minority.

The long and short of it is, you must get ranges in order to enjoy the 
power of D algorithms (as per http://goo.gl/dVprVT).

std.{algorithm,range} are commonly mentioned as an attractive asset of 
D, and those who get that style of doing things have no trouble applying 
such notions to a variety of data, notably including strings. So going 
with the attitude "I don't use, know, or care for phobos... I just want 
to do this pesky string thing!" is bound to create frustration.

I personally find strings very easy to deal with in D. They might be 
easier in Perl or sometimes Python, but at a steep efficiency cost.

Walter has recently written a non-trivial utility that beats the pants 
off (3x performance) the equivalent C program that has been highly 
scrutinized and honed for literally decades by dozens (hundreds?) of 
professionals. Walter's implementation uses ranges and algorithms (a 
few standard, many custom) through and through. If all goes well we'll 
open-source it. He himself is now a range/algorithm convert, even 
though he'd be the first to point out the no-nonsense nature of a function 
like strrchr. (And btw strrchr is after all a POS because it needs to 
scan the string left to right... so lastIndex is faster!)


Andrei

Jan 09 2014

Manu <turkeyman gmail.com> writes:

On 10 January 2014 15:48, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org
 wrote:

 On 1/9/14 8:21 AM, Manu wrote:

 My standing opinion is that string manipulation in D is not nice, it is
 possibly the most difficult and time consuming I have used in any
 language ever. Am I alone?

 No, but probably in the minority.

 The long and short of it is, you must get ranges in order to enjoy the
 power of D algorithms (as per http://goo.gl/dVprVT).

 std.{algorithm,range} are commonly mentioned as an attractive asset of D,
 and those who get that style of doing things have no trouble applying such
 notions to a variety of data, notably including strings. So going with the
 attitude "I don't use, know, or care for phobos... I just want to do this
 pesky string thing!" is bound to create frustration.

The thing is, that pesky string thing is usually a trivial detail in an
otherwise completely unrelated task. I'm not joking when I say I've had details
like formatting a useful error message take 90% of the time to complete
some totally unrelated task.
I guess I'm a little isolated from high level algorithms, because I spend
most of my time at the level of twiddling bits.

This is a key motivation for my kicking off this all-D game project, and
getting others involved. I need an excuse to push myself to have more
involvement with these types of things. Doing more high-level code than I
usually do will help, and having other D users also in the project will
keep me in check, and hopefully improve my D code a lot while at it ;)

I personally find strings very easy to deal with in D. They might be easier
 in Perl or sometimes Python, but at a steep efficiency cost.

 Walter has recently written a non-trivial utility that beats the pants off
 (3x performance) the equivalent C program that has been highly scrutinized
 and honed for literally decades by dozens (hundreds?) of professionals.
 Walter's implementations uses ranges and algorithms (a few standard, many
 custom) through and through. If all goes well we'll open-source it. He
 himself is now an range/algorithm convert, even though he'd be the first to
 point the no-nonsense nature of a function like strrchr. (And btw strrchr
 is after all a POS because it needs to scan the string left to right... so
 lastIndex is faster!)


How long did it take to get him there? I suspect he made the leap only when
a particular task that motivated him to do so came up. I suspect I'm likely
to follow that same pattern given the context; like him, I'm a somewhat
no-frills practicality-oriented programmer, and don't get too excited about
futuristic shiny things unless it's readily apparent they can make my
workload simpler and more efficient (although I would also require it not
sacrifice computation efficiency). But my point remains, as a trivial
ancillary detail - I'm not doing stuff with strings; I'm working on other
stuff that just _has_ some strings - it's not presented in a way that one
can just get the job done with low friction, and without at least tripling
the number of imports from the std library.

Jan 09 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 1/9/2014 10:37 PM, Manu wrote:
 How long did it take to get him there? I suspect he made the leap only when a
 particular task that motivated him to do so came up.

Pretty much true. And it was worth it :-)

Jan 22 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Jan 10, 2014 at 04:37:03PM +1000, Manu wrote:
 On 10 January 2014 15:48, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org
 wrote:

 
 On 1/9/14 8:21 AM, Manu wrote:

 My standing opinion is that string manipulation in D is not nice,
 it is possibly the most difficult and time consuming I have used in
 any language ever. Am I alone?

 No, but probably in the minority.

 The long and short of it is, you must get ranges in order to enjoy
 the power of D algorithms (as per http://goo.gl/dVprVT).

 std.{algorithm,range} are commonly mentioned as an attractive asset
 of D, and those who get that style of doing things have no trouble
 applying such notions to a variety of data, notably including
 strings. So going with the attitude "I don't use, know, or care for
 phobos... I just want to do this pesky string thing!" is bound to
 create frustration.

 
 The thing is, that pesky string thing is usually a trivial detail in
 an otherwise completely unrelated task. I'm not joking when I've had
 details like formatting a useful error message take 90% of the time to
 complete some totally unrelated task.

You have to be doing something wrong... formatting error messages is as
trivial as using std.string.format:

	if (argsAreBad(x,y,z))
		throw new Exception("Parameters x=%s y=%s z=%s are invalid!"
					.format(x,y,z));

I can't imagine what can be simpler than this. (Not to mention, %s in D
just means "string format of X", so the above code will actually work
for x, y, z of *any* type that has some kind of conversion to string.
Try this with C/C++, and you'll be segfaulting all day.)
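For illustration, a self-contained version of the same idea with made-up values of three different types (the variable names and values are assumptions, not from any real code):

```d
import std.string : format;

void main()
{
    int x = 1;
    double y = 2.5;
    string z = "abc";

    // %s picks the default string form of whatever type it is given.
    auto msg = "Parameters x=%s y=%s z=%s are invalid!".format(x, y, z);
    assert(msg == "Parameters x=1 y=2.5 z=abc are invalid!");

    // The same specifier also handles arrays.
    assert("%s".format([1, 2, 3]) == "[1, 2, 3]");
}
```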


 I guess I'm a little isolated from high level algorithms, because I
 spend most of my time at the level of twiddling bits.

That would explain your difficulty with Phobos algorithms. :)


 This is a key motivation for my kicking off this all-D game project,
 and getting others involved. I need excuse to push myself to have more
 involvement with these type of things. Doing more high-level code than
 I usually do will help, and having other D users also in the project
 will keep me in check, and hopefully improve my D code a lot while at
 it ;)

Well, maybe the reward of not having to grit your teeth every time you do
string manipulation in D will motivate you to learn how to use Phobos
effectively? :)


 I personally find strings very easy to deal with in D. They might be
 easier in Perl or sometimes Python, but at a steep efficiency cost.

 Walter has recently written a non-trivial utility that beats the
 pants off (3x performance) the equivalent C program that has been
 highly scrutinized and honed for literally decades by dozens
 (hundreds?) of professionals.  Walter's implementations uses ranges
 and algorithms (a few standard, many custom) through and through. If
 all goes well we'll open-source it. He himself is now an
 range/algorithm convert, even though he'd be the first to point the
 no-nonsense nature of a function like strrchr. (And btw strrchr is
 after all a POS because it needs to scan the string left to right...
 so lastIndex is faster!)

 
 How long did it take to get him there? I suspect he made the leap only
 when a particular task that motivated him to do so came up. I suspect
 I'm likely to follow that same pattern given the context; like him,
 I'm a somewhat no-frills practicality-oriented programmer, and don't
 get too excited about futuristic shiny things unless it's readily
 apparent they can make my workload simpler and more efficient
 (although I would also require it not sacrifice computation
 efficiency).

I'm not the kind to get excited about futuristic shiny things either...
I don't even use a GUI, for example! (Well, technically I do, since I'm
running on X11, but it's so bare bones to the point that my manager is
baffled how I could even begin to use such an interface. I barely ever
touch the mouse except when browsing, for one thing. Almost everything
is completely keyboard-driven.) And I'm also skeptical of new trendy
overhyped things that have people jumping on the bandwagon in droves --
and usually it turns out that it's just another ordinary idea blown out
of proportion by the PR machine.

Yet I had no trouble getting up to speed with Phobos algorithms.  I
*will* say there's a learning curve, though -- you need to understand
what ranges are and why they're the way they are, before you can fully
grok Phobos algorithms. Andrei's article "On Iteration" (linked from the
std.range docs) is almost a must-read. But IMO it's more than worth the
time to learn this. It will revolutionize the way you think about code.
;-)


 But my point remains, as a trivial ancillary detail - I'm not doing
 stuff with strings; I'm working on other stuff that just _has_ some
 strings - it's not presented in a way that one can just get the job
 done with low friction, and without at least tripling the number of
 imports from the std library.

But that's the thing, if you have some level of facility with ranges,
you could be using exactly the same algorithms for your other stuff as
you'd use for strings. That's much less mental overhead than having to
remember one set of API's for manipulating said other stuff, and a
different set of API's for manipulating strings.

The number of imports needed, though, is a different issue. That's
something that Phobos needs improvement in. At least the last time I
checked, the "Phobos philosophy", as stated on dlang.org, is that you
shouldn't need to import half the library just to do a single simple
operation like reading a file. Unfortunately, from what I can tell, that
philosophy hasn't really been carried through. Lazy imports, discussed
earlier this week, are a direction I'd like to see implemented some time
in the near future. Some of the code bloat just from importing a single
std module is a bit excessive, and bugs me quite a bit.

Nevertheless, I haven't experienced any "high friction" issues in
getting stuff done with strings. Once you learn where things are and
what is available, it's pretty straightforward to throw something
together. It does take a bit of time to learn this, but honestly, that's
not any more effort than learning C for the first time and learning what
strchr or memset means, and when to use strcat and when not to. In fact,
I'd argue that learning the C string functions is a lot more effort,
because they have so many pitfalls and gotchas that you must memorize
and constantly keep in mind, otherwise your program suddenly acquires
gratuitous segfaults, pointer bugs, and buffer overruns. IME, it takes
*more* effort to write string manipulation code in C, rather than less,
since so many more things can go wrong.


T

-- 
Turning your clock 15 minutes ahead won't cure lateness---you're just making
time go faster!

Jan 09 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 06:48, Andrei Alexandrescu wrote:
 On 1/9/14 8:21 AM, Manu wrote:
 My standing opinion is that string manipulation in D is not nice, it is
 possibly the most difficult and time consuming I have used in any
 language ever. Am I alone?

 No, but probably in the minority.

 The long and short of it is, you must get ranges in order to enjoy the
 power of D algorithms (as per http://goo.gl/dVprVT).

 std.{algorithm,range} are commonly mentioned as an attractive asset of
 D, and those who get that style of doing things have no trouble applying
 such notions to a variety of data, notably including strings. So going
 with the attitude "I don't use, know, or care for phobos... I just want
 to do this pesky string thing!" is bound to create frustration.

Even if you do get how ranges work it can be difficult to figure out 
where a function is located, in std.algorithm, std.string, std.array, 
std.uni or std.range. Like, "is this a string operation or a general 
container algorithm?". Why is there a std.string.indexOf function? Isn't 
that a general array operation or algorithm? Isn't 
std.string.(left|right)Justify a general operation as well?

-- 
/Jacob Carlborg

Jan 09 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/9/14 11:53 PM, Jacob Carlborg wrote:
 Even if you do get how ranges work it can be difficult to figure out
 where a function is located, in std.algorithm, std.string, std.array,
 std.uni or std.range. Like, "is this a string operation or a general
 container algorithm?". Why is there a std.string.indexOf function? Isn't
 that a general array operation or algorithm? Isn't
 std.string.(left|right)Justify a general operation as well?

That's a documentation issue. We've pursued generalization of string 
algorithms with good result. As such indexOf is susceptible to 
generalization. However, the justification functions are unlikely to be 
useful for other data types because most don't have a notion of "filler" 
object.

Andrei

Jan 10 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 09:29, Andrei Alexandrescu wrote:

 That's a documentation issue. We've pursued generalization of string
 algorithms with good result. As such indexOf is susceptible for
 generalization. However, the justification functions are unlikely to be
 useful for other data types because most don't have a notion of "filler"
 object.

They might not have a default "filler" object but you can pass the 
"filler" as an argument.

-- 
/Jacob Carlborg

Jan 10 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/10/14 12:49 AM, Jacob Carlborg wrote:
 On 2014-01-10 09:29, Andrei Alexandrescu wrote:

 That's a documentation issue. We've pursued generalization of string
 algorithms with good result. As such indexOf is susceptible for
 generalization. However, the justification functions are unlikely to be
 useful for other data types because most don't have a notion of "filler"
 object.

 They might not have a default "filler" object but you can pass the
 "filler" as an argument.

By that I was implying that the whole notion is not sensible for general 
types. Honest, I did consider generalizing everything in std.string, but 
the algorithms left in there made little sense for other types than strings.

Andrei

Jan 10 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 17:14, Andrei Alexandrescu wrote:

 By that I was implying that the whole notion is not sensible for general
 types. Honest, I did consider generalizing everything in std.string, but
 the algorithms left in there made little sense for other types than
 strings.

Fair enough, I just thought I had found a couple more that could be 
generalized.

-- 
/Jacob Carlborg

Jan 10 2014

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Thursday, 9 January 2014 at 14:34:43 UTC, John Colvin wrote:
 On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:
 This works fine:
  string x = find("Hello", 'H');

 This doesn't:
  string y = find(retro("Hello"), 'H');
  > Error: cannot implicitly convert expression 
 (find(retro("Hello"), 'H'))
 of type Result!() to string

 In order to return the result as a string it would require an 
 allocation. You have to request that allocation (and associated 
 eager evaluation) explicitly

 string y = "Hello".retro.find('H').to!string;


 However, I think to get the expected result from unicode you 
 need

 string y = "Hello".byGrapheme.retro.find('H').to!string;

 but I might be wrong.

Oh. I see you actually wanted strrchr behaviour. That's different.
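For the offset of the last occurrence, `std.string.lastIndexOf` is probably the closest match to strrchr. A minimal sketch:

```d
import std.string : lastIndexOf;

void main()
{
    string s = "Hello";

    // Returns a code-unit offset, or -1 when the character is absent.
    auto i = s.lastIndexOf('l');
    assert(i == 3);

    // The offset can be used directly to slice the original string.
    assert(s[i .. $] == "lo");
    assert(s.lastIndexOf('z') == -1);
}
```

Unlike the retro/find combination, the slice here refers to the original string, so no allocation is needed.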

Jan 09 2014

Marco Leise <Marco.Leise gmx.de> writes:

Am Thu, 09 Jan 2014 15:20:13 +0000
schrieb "John Colvin" <john.loughran.colvin gmail.com>:

 On Thursday, 9 January 2014 at 14:34:43 UTC, John Colvin wrote:
 On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:
 This works fine:
  string x = find("Hello", 'H');

 This doesn't:
  string y = find(retro("Hello"), 'H');
  > Error: cannot implicitly convert expression
 (find(retro("Hello"), 'H'))
 of type Result!() to string

 In order to return the result as a string it would require an
 allocation. You have to request that allocation (and associated
 eager evaluation) explicitly

 string y = "Hello".retro.find('H').to!string;


 However, I think to get the expected result from unicode you
 need

 string y = "Hello".byGrapheme.retro.find('H').to!string;

 but I might be wrong.

 Oh. I see you actually wanted strrchr behaviour. That's different.

The point about graphemes is good. D's functions still stop
mid-way. From UTF-8 you can iterate UTF-32 code points, but
grapheme clusters are the new characters. I.e. the basic need
to iterate Unicode _characters_ is not supported!
I cannot even come up with use cases for working with code
points and think they are a conceptual black hole. Something
carried over from a time when grapheme clusters didn't exist.

When you search for 'A', 'Ä' shows up when it is built from
an A and the "two dots" symbol. It also has the walk length 2.
This isn't an issue as long as we use strings from languages
that are traditionally well supported with single code-unit
characters.
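A quick sketch of the three ways to count such a string, assuming a Phobos version that ships `std.uni.byGrapheme`; the escape `\u0308` is the combining diaeresis (the "two dots"):

```d
import std.algorithm : canFind;
import std.range : walkLength;
import std.uni : byGrapheme;

void main()
{
    // 'A' followed by U+0308 COMBINING DIAERESIS, rendered as one character.
    string s = "A\u0308";

    assert(s.length == 3);                // UTF-8 code units
    assert(s.walkLength == 2);            // code points
    assert(s.byGrapheme.walkLength == 1); // user-perceived characters

    // A code-point search for the base letter matches inside the cluster.
    assert(s.canFind('A'));
}
```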

Basically the element type when iterating over a string would
have to be another string of arbitrary length, since you could
attach any number of combining diacritical symbols to a
letter. See?: e͜͟͡͞

-- 
Marco

Jan 09 2014

Jerry <jlquinn optonline.net> writes:

Marco Leise <Marco.Leise gmx.de> writes:

 Am Thu, 09 Jan 2014 15:20:13 +0000
 schrieb "John Colvin" <john.loughran.colvin gmail.com>:

 The point about graphemes is good. D's functions still stop
 mid-way. From UTF-8 you can iterate UTF-32 code points, but
 grapheme clusters are the new characters. I.e. the basic need
 to iterate Unicode _characters_ is not supported!
 I cannot even come up with use cases for working with code
 points and think they are a conceptual black hole. Something
 carried over from a time when grapheme clusters didn't exist.

Actually, you can do tons of NLP without grapheme clusters.  If you're
paranoid, you standardize on a specific Unicode normalization first.

You can probably get a bit better results by paying attention to
clusters, but I suspect it will be a marginal improvement.

That said, I do agree with the OP that the string API is currently more
complex to understand than I'd like.  However, it's significantly easier
to use than what's in standard C++ for anything beyond ascii.

Jerry

Jan 09 2014

Marco Leise <Marco.Leise gmx.de> writes:

Am Thu, 09 Jan 2014 15:51:36 -0500
schrieb Jerry <jlquinn optonline.net>:

 Marco Leise <Marco.Leise gmx.de> writes:

 Am Thu, 09 Jan 2014 15:20:13 +0000
 schrieb "John Colvin" <john.loughran.colvin gmail.com>:

 The point about graphemes is good. D's functions still stop
 mid-way. From UTF-8 you can iterate UTF-32 code points, but
 grapheme clusters are the new characters. I.e. the basic need
 to iterate Unicode _characters_ is not supported!
 I cannot even come up with use cases for working with code
 points and think they are a conceptual black hole. Something
 carried over from a time when grapheme clusters didn't exist.

 Actually, you can do tons of NLP without grapheme clusters.  If you're
 paranoid, you standardize on a specific Unicode normalization first.

 You can probably get a bit better results by paying attention to
 clusters, but I suspect it will be a marginal improvement.

 That said, I do agree with the OP that the string API is currently more
 complex to understand than I'd like.  However, it's significantly easier
 to use than what's in standard C++ for anything beyond ascii.

 Jerry

Sorry, I got confused with the Unicode definitions. I see now
that a grapheme cluster is e.g. \r\n. What I really meant is
that Phobos needs to support graphemes. But seeing that
monsters like this exist: n͠g, I don't even know if this is
one character or two, but right now Phobos sees it as three
characters.

-- 
Marco

Jan 10 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 17:01, Marco Leise wrote:

 Sorry, I got confused with the Unicode definitions. I see now
 that a grapheme cluster is e.g. \r\n. What I really meant is
 that Phobos needs to support graphemes. But seeing that
 monsters like this exist: n͠g, I don't even know if this is
 one character or two, but right now Phobos sees it as three
 characters.

Thunderbird sees that as two characters. Ruby sees it as three.

-- 
/Jacob Carlborg

Jan 10 2014

Marco Leise <Marco.Leise gmx.de> writes:

Am Fri, 10 Jan 2014 18:07:54 +0100
schrieb Jacob Carlborg <doob me.com>:

 On 2014-01-10 17:01, Marco Leise wrote:
 Sorry, I got confused with the Unicode definitions. I see now
 that a grapheme cluster is e.g. \r\n. What I really meant is
 that Phobos needs to support graphemes. But seeing that
 monsters like this exist: n͠g, I don't even know if this is
 one character or two, but right now Phobos sees it as three
 characters.

 Thunderbird sees that as two characters. Ruby sees it as three.

I think this is the (or one of the) official documents about
where a "user-perceived character" ends:

http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundary_Rules

According to this, the above n͠g is indeed defined as 2
characters. Ruby is just no better than Phobos :p
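
As a rough illustration of the three ways to count that string, here is a minimal sketch. It assumes a Phobos release whose std.uni provides byGrapheme; graphemeCount is a hypothetical helper name, not a library function.

```d
import std.range : walkLength;
import std.uni : byGrapheme;

/// Hypothetical helper: counts user-perceived characters (grapheme clusters).
size_t graphemeCount(string s)
{
    return s.byGrapheme.walkLength;
}

void main()
{
    // 'n' + U+0360 COMBINING DOUBLE TILDE + 'g'
    string s = "n\u0360g";

    assert(s.length == 4);          // UTF-8 code units
    assert(s.walkLength == 3);      // code points, which is what Phobos iterates by default
    assert(graphemeCount(s) == 2);  // grapheme clusters
}
```

The point of the sketch is only that the three counts differ; which one is "the length" depends on what the caller needs.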


»Grapheme cluster boundaries are important for collation,
 regular expressions, UI interactions (such as mouse selection,
 arrow key movement, backspacing), segmentation for vertical
 text, identification of boundaries for first-letter styling,
 and counting “character” positions within text.«

-- 
Marco

Jan 10 2014

"Regan Heath" <regan netmail.co.nz> writes:

On Thu, 09 Jan 2014 14:07:36 -0000, Manu <turkeyman gmail.com> wrote:
 This works fine:
   string x = find("Hello", 'H');

 This doesn't:
   string y = find(retro("Hello"), 'H');
   > Error: cannot implicitly convert expression (find(retro("Hello"), 'H')) of type Result!() to string

 Is that wrong? That seems to be how the docs suggest it should be used.

 On a side note, am I the only one that finds std.algorithm/std.range/etc
 for string processing really obtuse?
 I can rarely understand the error messages, so to say it's better than STL
 is optimistic.
 Using std.algorithm and std.range to do string manipulation feels really
 lame to me.
 I hate looking through the docs of 3-4 modules to understand the complete
 set of useful string operations (std.string, std.uni, std.algorithm,
 std.range... at least).
 I also find the names of the generic algorithms are often unrelated to  
 the name of the string operation.
 My feeling is, everyone is always on about how cool D is at string, but
 other than 'char[]', and the builtin slice operator, I feel really
 unproductive whenever I do any heavy string manipulation in D.
 I also hate that I need to import at least 4-5 modules to do anything
 useful with strings... I feel my program bloating and cringe with every
 gigantic import that sources exactly one symbol.

I feel exactly the same way.  I must admit I haven't done any serious D  
for a couple of years now, and the main reason is lack of free time, but  
the other is that each time I come back to try and do something I get  
weird arse error messages like the one you got above.
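
For reference, a minimal sketch of why that error appears and one way around it. reverseFind is a hypothetical helper name; the rest are existing Phobos functions as far as I know.

```d
import std.algorithm : find;
import std.array : array;
import std.conv : to;
import std.range : retro;
import std.string : lastIndexOf;

/// Hypothetical helper: searches from the end and returns the reversed tail
/// as a real string.
string reverseFind(string haystack, dchar needle)
{
    // find() returns the same type it was given. retro() yields a lazy
    // wrapper type, not a string, so the result is materialized explicitly.
    return find(retro(haystack), needle).array.to!string;
}

void main()
{
    // Searching a plain string gives back a string (a slice of the input).
    string x = find("Hello", 'l');
    assert(x == "llo");

    assert(reverseFind("Hello", 'l') == "lleH");

    // If only the offset is wanted, std.string has a direct function.
    assert("Hello".lastIndexOf('l') == 3);
}
```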

I realise that it is probably the way it is, to avoid bloating the  
language with several ways to do the same thing.  I agree with that  
position, however..  I don't think it's a bad thing (TM) to have a  
custom/specific set of operations for a given area which re-use more  
generic operations behind the scenes.

In other words, why can't we alias or wrap the generic routines in  
std.string such that the expected operations are easy to find and do  
exactly what you'd expect, for strings.

If someone is dealing with generic code where the ranges involved might be  
strings/arrays or might be something else of course they will call  
std.range functions, but if they are only dealing with strings there  
should be string specific functions for them to call - which may/may not  
use std.range or std.algorithm functions etc behind the scenes.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Jan 09 2014

"Regan Heath" <regan netmail.co.nz> writes:

On Thu, 09 Jan 2014 17:15:41 -0000, Regan Heath <regan netmail.co.nz>  
wrote:

 In other words, why can't we alias or wrap the generic routines in  
 std.string

What I meant here is why can't we alias or wrap the generic routines (from  
std.range, std.algo.. into aliases/functions) in std.string.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Jan 09 2014

Manu <turkeyman gmail.com> writes:

On 10 January 2014 03:17, Regan Heath <regan netmail.co.nz> wrote:

 On Thu, 09 Jan 2014 17:15:41 -0000, Regan Heath <regan netmail.co.nz>
 wrote:

  In other words, why can't we alias or wrap the generic routines in
 std.string

 What I meant here is why can't we alias or wrap the generic routines (from
 std.range, std.algo.. into aliases/functions) in std.string.


We can and should. Very liberally.
I'm still very concerned about the magnitude of bloat that gets pulled in
by any of these modules though. They're all intimately connected, none of
them seem to be able to exist without all of the others.
And there are some really huge template functions out there. Massive
functions, which take multiple template arguments (N^2 permutations), where
the template types might only affect one or two lines... they need to be
broken down into very small template functions, and a non-templated inner
function.
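
A minimal sketch of that split, for illustration only. indexOfImpl and indexOfThin are hypothetical names, not Phobos functions, and the search is a naive code-unit comparison chosen to keep the example short.

```d
// Non-templated worker: compiled once, whatever the caller passes in.
private ptrdiff_t indexOfImpl(const(char)[] haystack, const(char)[] needle)
{
    if (needle.length > haystack.length)
        return -1;
    foreach (i; 0 .. haystack.length - needle.length + 1)
    {
        if (haystack[i .. i + needle.length] == needle)
            return cast(ptrdiff_t) i;
    }
    return -1;
}

// Thin template shim: only adapts the argument types, then forwards.
ptrdiff_t indexOfThin(S1, S2)(S1 haystack, S2 needle)
if (is(S1 : const(char)[]) && is(S2 : const(char)[]))
{
    return indexOfImpl(haystack, needle);
}

void main()
{
    char[] buf = "hello".dup;
    assert(indexOfThin("hello", "ll") == 2);
    assert(indexOfThin(buf, "lo") == 3);
    assert(indexOfThin("hello", "xyz") == -1);
}
```

Each new combination of argument types instantiates only the one-line shim, while the loop itself exists once in the binary.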

Jan 09 2014

"Regan Heath" <regan netmail.co.nz> writes:

On Thu, 09 Jan 2014 17:25:13 -0000, Manu <turkeyman gmail.com> wrote:

 On 10 January 2014 03:17, Regan Heath <regan netmail.co.nz> wrote:

 On Thu, 09 Jan 2014 17:15:41 -0000, Regan Heath <regan netmail.co.nz>
 wrote:

  In other words, why can't we alias or wrap the generic routines in
 std.string

 What I meant here is why can't we alias or wrap the generic routines  
 (from
 std.range, std.algo.. into aliases/functions) in std.string.


 We can and should. Very liberally.
 I'm still very concerned about the magnitude of bloat that gets pulled in
 by any of these modules though. They're all intimately connected, none of
 them seem to be able to exist without all of the others.
 And there are some really huge template functions out there. Massive
 functions, which take multiple template arguments (N^2 permutations),  
 where
 the template types might only affect one or two lines... they need to be
 broken down into very small template functions, and a non-templated inner
 function.

We need, if one does not exist already, a dependency mapper tool.  One  
which would give some sort of graphical/hierarchical output of modules and  
their dependencies, ideally drilling right down to the functions, methods,  
variables etc being used.

Sounds fun, and there is a DMD frontend to build on right?  Anyone got the  
spare time?

Regan

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Jan 10 2014

"Craig Dillabaugh" <cdillaba cg.scs.carleton.ca> writes:

On Thursday, 9 January 2014 at 17:15:43 UTC, Regan Heath wrote:

clip

 In other words, why can't we alias or wrap the generic routines 
 in std.string such that the expected operations are easy to 
 find and do exactly what you'd expect, for strings.

 If someone is dealing with generic code where the ranges 
 involved might be strings/arrays or might be something else of 
 course they will call std.range functions, but if they are only 
 dealing with strings there should be string specific functions 
 for them to call - which may/may not use std.range or 
 std.algorithm functions etc behind the scenes.

 R

I think this would be a nice solution.  I only use D for string 
processing rarely and as a result I always struggle a bit, 
because I can never remember where to go to look for things.  
Happily, my most recent experience with it was fairly smooth.

A while ago I was trying to do something with splitter on a 
string and I ended up asking a question on D.learn.  I got into a 
very confusing debate because the person trying to help me 
thought I was using the splitter in std.array and I was using the 
one from another module (see the last few posts from here):

http://www.digitalmars.com/d/archives/digitalmars/D/learn/splitting_numbers_from_a_test_file_39448.html

It would be nice if std.string in D provided a nice, easy, string 
manipulation that swept most of the difficulties under the table, 
and provided links in the documentation to the functions they 
wrap for when people want to do more complex things.
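
To make the source of that confusion concrete, here is a small sketch of the two similarly named functions as I understand them: split in std.array builds an array eagerly, while splitter in std.algorithm returns a lazy range.

```d
import std.algorithm : equal, splitter;
import std.array : split;

void main()
{
    string line = "10,20,30";

    // std.array.split: eager, returns a real array of slices.
    string[] eager = line.split(",");
    assert(eager == ["10", "20", "30"]);

    // std.algorithm.splitter: lazy, returns a range type, not string[].
    auto lazyParts = line.splitter(",");
    static assert(!is(typeof(lazyParts) == string[]));
    assert(lazyParts.equal(["10", "20", "30"]));
}
```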

Jan 09 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 9 January 2014 at 18:57:26 UTC, Craig Dillabaugh 
wrote:
 A while ago I was trying to do something with splitter on a 
 string and I ended up asking a question on D.learn. [...]

 It would be nice if std.string in D provided a nice, easy, 
 string manipulation that swept most of the difficulties under 
 the table

http://dlang.org/phobos/std_array.html#split

Note that std.array is publicly imported from std.string so this 
works:

void main() {
         import std.string;
         auto parts = "hello".split("l");

         import std.stdio;
         writeln(parts);
}


 provided links in the documentation to the functions they wrap 
 for when people want to do more complex things.

Actually, when writing my D book, I decided to spend more time on 
the unicode stuff in strings than these basic operations, since I 
thought these were pretty straightforward.

But maybe the docs suck more than I thought. I learned most of D 
string stuff from Phobos1 which kept it all simple...

Jan 09 2014

"Craig Dillabaugh" <cdillaba cg.scs.carleton.ca> writes:

On Thursday, 9 January 2014 at 19:05:19 UTC, Adam D. Ruppe wrote:
 On Thursday, 9 January 2014 at 18:57:26 UTC, Craig Dillabaugh 
 wrote:
 A while ago I was trying to do something with splitter on a 
 string and I ended up asking a question on D.learn. [...]

 It would be nice if std.string in D provided a nice, easy, 
 string manipulation that swept most of the difficulties under 
 the table

 http://dlang.org/phobos/std_array.html#split

 Note that std.array is publicly imported from std.string so 
 this works:

 void main() {
         import std.string;
         auto parts = "hello".split("l");

         import std.stdio;
         writeln(parts);
 }


 provided links in the documentation to the functions they wrap 
 for when people want to do more complex things.

 Actually, when writing my D book, I decided to spend more time 
 on the unicode stuff in strings than these basic operations, 
 since I thought these were pretty straightforward.

 But maybe the docs suck more than I thought. I learned most of 
 D string stuff from Phobos1 which kept it all simple...

That's the thing.  In most cases the correct way to do something 
in D does end up being rather nice.  However, it's often a bit of 
a challenge finding that correct way!

When I had my troubles I expected to find the library solutions 
in std.string (remember I rarely use D's string processing 
utilities). It never really occurred to me that I might want to 
check std.array for the function I wanted. So what if std.array 
is imported when I import std.string? As a programmer I still had 
no idea 'split()' was there!

At the very least the documentation for std.string should say 
something along the lines of:

"The libraries std.unicode and std.array also include a number of 
functions that operate on strings, so if what you are looking for 
isn't here, try looking there."

Jan 09 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Jan 09, 2014 at 08:53:12PM +0000, Craig Dillabaugh wrote:
[...]
 That's the thing.  In most cases the correct way to do something in
 D does end up being rather nice.  However, it's often a bit of a
 challenge finding that correct way!

 When I had my troubles I expected to find the library solutions in
 std.string (remember I rarely use D's string processing utilities).
 It never really occurred to me that I might want to check std.array
 for the function I wanted. So what if std.array is imported when I
 import std.string? As a programmer I still had no idea 'split()' was
 there!
 
 At the very least the documentation for std.string should say
 something along the lines of:
 
 "The libraries std.unicode and std.array also include a number of
 functions that operate on strings, so if what you are looking for
 isn't here, try looking there."

Yeah, any public imports should be mentioned somewhere in the docs,
otherwise it's just random invisible magic as far as the end-user is
concerned ("Hmm, I imported std.string in one module, and array.front
works, but in this other module, array.front doesn't work! Why? Who
knows.");

Please submit a pull request to add that to the docs.


T

-- 
People walk. Computers run.

Jan 09 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 00:34, H. S. Teoh wrote:

 Yeah, any public imports should be mentioned somewhere in the docs,
 otherwise it's just random invisible magic as far as the end-user is
 concerned ("Hmm, I imported std.string in one module, and array.front
 works, but in this other module, array.front doesn't work! Why? Who
 knows.");

 Please submit a pull request to add that to the docs.

I agree, and it should be automatic.

-- 
/Jacob Carlborg

Jan 09 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 9 January 2014 at 20:53:13 UTC, Craig Dillabaugh 
wrote:
 That's the thing.  In most cases the correct way to do something 
 in D does end up being rather nice.  However, it's often a bit 
 of a challenge finding that correct way!

Yeah, and indeed it is a bit weird that so many of the functions 
moved from std.string to std.array. (Yet are still specialized on 
strings... I think they have to be in the same module just to be 
in the same overload set though.)

 "The libraries std.unicode and std.array also include a number 
 of functions that operate on strings, so if what you are 
 looking for isn't here, try looking there."

Aye, I think the documentation could use a few higher level 
overviews that bring the modules together.

Jan 09 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/9/14 3:39 PM, Adam D. Ruppe wrote:
 Aye, I think the documentation could use a few higher level overviews
 that bring the modules together.

PRP

Andrei

Jan 09 2014

Manu <turkeyman gmail.com> writes:

On 10 January 2014 06:53, Craig Dillabaugh <cdillaba cg.scs.carleton.ca> wrote:

 On Thursday, 9 January 2014 at 19:05:19 UTC, Adam D. Ruppe wrote:

 On Thursday, 9 January 2014 at 18:57:26 UTC, Craig Dillabaugh wrote:

 A while ago I was trying to do something with splitter on a string and I
 ended up asking a question on D.learn. [...]

 It would be nice if std.string in D provided a nice, easy, string
 manipulation that swept most of the difficulties under the table

 http://dlang.org/phobos/std_array.html#split

 Note that std.array is publicly imported from std.string so this works:

 void main() {
         import std.string;
         auto parts = "hello".split("l");

         import std.stdio;
         writeln(parts);
 }


  provided links in the documentation to the functions they wrap for when
 people want to do more complex things.

 Actually, when writing my D book, I decided to spend more time on the
 unicode stuff in strings than these basic operations, since I thought these
 were pretty straightforward.

 But maybe the docs suck more than I thought. I learned most of D string
 stuff from Phobos1 which kept it all simple...

 That's the thing.  In most cases the correct way to do something in D does
 end up being rather nice.  However, it's often a bit of a challenge finding
 that correct way!

 When I had my troubles I expected to find the library solutions in
 std.string (remember I rarely use D's string processing utilities). It
 never really occurred to me that I might want to check std.array for the
 function I wanted. So what if std.array is imported when I import
 std.string? As a programmer I still had no idea 'split()' was there!

 At the very least the documentation for std.string should say something
 along the lines of:

 "The libraries std.unicode and std.array also include a number of
 functions that operate on strings, so if what you are looking for isn't
 here, try looking there."

Or just alias the functions useful for string processing...

Jan 09 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 02:34, Manu wrote:

 Or just alias the functions useful for string processing...

I agree. It already has some aliases, converting to lower and uppercase.
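
A short sketch of what that looks like from the user's side, assuming std.string still re-exports the std.uni case functions as it does in the releases I have used:

```d
import std.string;  // no explicit import of std.uni

void main()
{
    // toUpper/toLower are defined in std.uni but are reachable through std.string.
    assert("Hello".toUpper == "HELLO");
    assert("Hello".toLower == "hello");

    // capitalize lives in std.string itself.
    assert(capitalize("hello") == "Hello");
}
```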

-- 
/Jacob Carlborg

Jan 09 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/9/14 11:56 PM, Jacob Carlborg wrote:
 On 2014-01-10 02:34, Manu wrote:

 Or just alias the functions useful for string processing...

 I agree. It already has some aliases, converting to lower and uppercase.

I wouldn't want to get to the point where many functions have 2-3 names.

Andrei

Jan 10 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 09:16, Andrei Alexandrescu wrote:

 I wouldn't want to get to the point where many functions have 2-3 names.

They're aliased in from std.uni, I think that's a different thing. It's 
not like Ruby which has both "collect" and "map", in the same place, 
meaning the same thing.

-- 
/Jacob Carlborg

Jan 10 2014

"Regan Heath" <regan netmail.co.nz> writes:

On Fri, 10 Jan 2014 08:16:53 -0000, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 1/9/14 11:56 PM, Jacob Carlborg wrote:
 On 2014-01-10 02:34, Manu wrote:

 Or just alias the functions useful for string processing...

 I agree. It already has some aliases, converting to lower and uppercase.

 I wouldn't want to get to the point where many functions have 2-3 names.

This is only a problem if they are all in the same sphere of concern.. by  
that I mean if you're looking for string functions and you find 2 names  
for the same function this would be wrong/confusing/pointless.  But, if  
you have one name in the string category and one in the range category and  
they were both the same function underneath I don't see this as the "same"  
problem, right?

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Jan 10 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/10/14 6:07 AM, Regan Heath wrote:
 On Fri, 10 Jan 2014 08:16:53 -0000, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 1/9/14 11:56 PM, Jacob Carlborg wrote:
 On 2014-01-10 02:34, Manu wrote:

 Or just alias the functions useful for string processing...

 I agree. It already has some aliases, converting to lower and uppercase.

 I wouldn't want to get to the point where many functions have 2-3 names.

 This is only a problem if they are all in the same sphere of concern..
 by that I mean if you're looking for string functions and you find 2
 names for the same function this would be wrong/confusing/pointless.
 But, if you have one name in the string category and one in the range
 category and they were both the same function underneath I don't see
 this as the "same" problem, right?

The way I see it one learns a name for an algorithm (low cognitive load) 
and then uses it everywhere. This is not Go.

Andrei

Jan 10 2014

"Regan Heath" <regan netmail.co.nz> writes:

On Fri, 10 Jan 2014 16:30:12 -0000, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 On 1/10/14 6:07 AM, Regan Heath wrote:
 On Fri, 10 Jan 2014 08:16:53 -0000, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 1/9/14 11:56 PM, Jacob Carlborg wrote:
 On 2014-01-10 02:34, Manu wrote:

 Or just alias the functions useful for string processing...

 I agree. It already has some aliases, converting to lower and uppercase.

 I wouldn't want to get to the point where many functions have 2-3 names.

 This is only a problem if they are all in the same sphere of concern..
 by that I mean if you're looking for string functions and you find 2
 names for the same function this would be wrong/confusing/pointless.
 But, if you have one name in the string category and one in the range
 category and they were both the same function underneath I don't see
 this as the "same" problem, right?

 The way I see it one learns a name for an algorithm (low cognitive load)
 and then uses it everywhere. This is not Go.

Sure.  But, let's take an example: std.algorithm.canFind is more or less
what you might call std.string.contains (which does not exist - instead
we'd use indexOf != -1.. I think).

What is the harm in having an alias in std.string called contains which
simply calls std.algorithm.canFind?

Sure, it opens the door to someone using both canFind and contains on
strings in their code.  So what?  Use of contains is more likely/intuitive
for string related code, but both are intelligible.  canFind will be more
likely in generic code, where you would think of that generic algorithm
name.

It seems to me that people think of algorithms by different names in
different contexts.  In the context of strings "contains" would make the
most intuitive sense to the most people.

Side-issue.. from std.algorithm:

bool canFind(alias pred = "a == b", R, E)(R haystack, E needle) if
(is(typeof(find!pred(haystack, needle))));
Returns true if and only if **value** can be found in range. Performs
Ο(needle.length) evaluations of pred.

What is **value**? Shouldn't that be needle?

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Jan 13 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-13 13:53, Regan Heath wrote:

 Sure.  But, let's take an example: std.algorithm.canFind is more or less
 what you might call std.string.contains (which does not exist - instead
 we'd use indexOf != -1.. I think).

I think "contains" is a way better name. That's what most other 
languages use, I think.
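
For the sake of discussion, the kind of string-specific wrapper being proposed could look roughly like this. contains here is hypothetical and written as a free function in user code; nothing like it is claimed to exist in std.string.

```d
import std.algorithm : canFind;

/// Hypothetical string-specific wrapper: substring search.
bool contains(const(char)[] haystack, const(char)[] needle)
{
    return haystack.canFind(needle);
}

/// Hypothetical overload: single-character search.
bool contains(const(char)[] haystack, dchar needle)
{
    return haystack.canFind(needle);
}

void main()
{
    assert("hello".contains("el"));
    assert("hello".contains('h'));
    assert(!"hello".contains("xyz"));
    assert(!"hello".contains('z'));
}
```

Both overloads just forward to canFind, so generic code and string code end up running the same algorithm under different names.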

-- 
/Jacob Carlborg

Jan 13 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/13/14 4:53 AM, Regan Heath wrote:
 On Fri, 10 Jan 2014 16:30:12 -0000, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 The way I see it one learns a name for an algorithm (low cognitive
 load) and then uses it everywhere. This is not Go.

 Sure.  But, let's take an example: std.algorithm.canFind is more or less
 what you might call std.string.contains (which does not exist - instead
 we'd use indexOf != -1.. I think).

Well there's no perfection in this world :o).

 What is the harm in having an alias in std.string called contains which
 simply calls std.algorithm.canFind?

I think you can answer that for yourself. Just take the approach to its 
logical conclusion.

 Sure, it opens the door to someone using both canFind and contains on
 strings in their code.  So what?  Use of contains is more
 likely/intuitive for string related code, but both are intelligible.
 canFind will be more likely in generic code, where you would think of
 that generic algorithm name.

 It seems to me that people think of algorithms by different names in
 different contexts.  In the context of strings "contains" would make the
 most intuitive sense to the most people.

I agree that good names are difficult to find. I think you'd have a hard 
time with a "the more the merrier" stance.

 Side-issue.. from std.algorithm:

 bool canFind(alias pred = "a == b", R, E)(R haystack, E needle) if
 (is(typeof(find!pred(haystack, needle))));
 Returns true if and only if **value** can be found in range. Performs
 Ο(needle.length) evaluations of pred.

 What is **value**? Shouldn't that be needle?

Please file a bug or pull request. Thanks!


Andrei

Jan 22 2014

"Regan Heath" <regan netmail.co.nz> writes:

On Wed, 22 Jan 2014 19:39:14 -0000, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 On 1/13/14 4:53 AM, Regan Heath wrote:
 On Fri, 10 Jan 2014 16:30:12 -0000, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 The way I see it one learns a name for an algorithm (low cognitive
 load) and then uses it everywhere. This is not Go.

 Sure.  But, let's take an example: std.algorithm.canFind is more or less
 what you might call std.string.contains (which does not exist - instead
 we'd use indexOf != -1.. I think).

 Well there's no perfection in this world :o).

 What is the harm in having an alias in std.string called contains which
 simply calls std.algorithm.canFind?

 I think you can answer that for yourself. Just take the approach to its
 logical conclusion.

You mean the best possible name in all contexts, yes!  :p

Seriously, I am not suggesting we do this for all functions all the time,
but just enough so that most users find what they expect to find and get
what they expect to get, where it doesn't break D's philosophy of not
doing inefficient things for the sake of being generic of course.

 Sure, it opens the door to someone using both canFind and contains on
 strings in their code.  So what?  Use of contains is more
 likely/intuitive for string related code, but both are intelligible.
 canFind will be more likely in generic code, where you would think of
 that generic algorithm name.

 It seems to me that people think of algorithms by different names in
 different contexts.  In the context of strings "contains" would make the
 most intuitive sense to the most people.

 I agree that good names are difficult to find. I think you'd have a hard
 time with a "the more the merrier" stance.

This.  Not my position.  Rather I am suggesting we identify individual
omissions (like std.string.contains) and add an alias.  So that people
don't have to struggle quite so much when switching to D.  The lower the
bar and all that..

 Side-issue.. from std.algorithm:

 bool canFind(alias pred = "a == b", R, E)(R haystack, E needle) if
 (is(typeof(find!pred(haystack, needle))));
 Returns true if and only if **value** can be found in range. Performs
 Ο(needle.length) evaluations of pred.

 What is **value**? Shouldn't that be needle?

 Please file a bug or pull request. Thanks!

Bug filed.

R

Jan 23 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/23/14 8:06 AM, Regan Heath wrote:
 This.  Not my position.  Rather I am suggesting we identify individual
 omissions (like std.string.contains) and add an alias.  So that people
 don't have to struggle quite so much when switching to D.  The lower the
 bar and all that..

Ionno. Just look at the current morass with 
https://github.com/D-Programming-Language/phobos/pull/1875. We have two 
names for the same function "canFind" and "any". Then we want to 
deprecate one, but look at how much impact it's having on Phobos alone. 
Are you sure you want to add a _third_?


Andrei

Jan 23 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-23 21:53, Andrei Alexandrescu wrote:

 Ionno. Just look at the current morass with
 https://github.com/D-Programming-Language/phobos/pull/1875. We have two
 names for the same function "canFind" and "any". Then we want to
 deprecate one, but look at how much impact it's having on Phobos alone.
 Are you sure you want to add a _third_?

Personally I would expect "any" to take a predicate and return "true" if 
it can find any matching element. If a predicate is not supplied it 
would behave as the opposite of "empty".

I would expect "contains" to take an element and check if it exists in 
the range.

I think "canFind" is just a weird name.
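
To illustrate the overlap under discussion, a small sketch using the std.algorithm functions as they exist today (my reading of the current API, not of the pull request itself):

```d
import std.algorithm : any, canFind;

void main()
{
    auto xs = [1, 2, 3, 4];

    // Element search: the "contains"-style use of canFind.
    assert(xs.canFind(3));
    assert(!xs.canFind(7));

    // Predicate search: here any and canFind do the same job,
    // which is the duplication Andrei points at.
    assert(xs.any!(x => x > 3));
    assert(xs.canFind!(x => x > 3));
    assert(!xs.any!(x => x > 10));
}
```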

-- 
/Jacob Carlborg

Jan 24 2014

"Andrea Fontana" <nospam example.com> writes:

On Friday, 24 January 2014 at 08:21:12 UTC, Jacob Carlborg wrote:
 Personally I would expect "any" to take a predicate and return 
 "true" if it can find any matching element. If a predicate is 
 not supplied it would behave as the opposite of "empty".

+1

 I would expect "contains" to take an element and check if it 
 exists in the range.

+1

 I think "canFind" is just a weird name.

+1

I also think that contains/canFind/etc ... should take advantage 
of sorted ranges to speed up their searches (now it seems they 
don't!)
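
For what it's worth, a sorted range can already be searched quickly if the sortedness is made explicit. A minimal sketch, assuming the SortedRange type returned by std.range.assumeSorted and its contains member, which is documented as a binary search:

```d
import std.range : assumeSorted;

void main()
{
    // assumeSorted wraps the array in a SortedRange without copying it.
    auto sorted = assumeSorted([1, 3, 5, 7, 9]);

    // contains on a SortedRange uses binary search instead of a linear scan.
    assert(sorted.contains(7));
    assert(!sorted.contains(4));
}
```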

Jan 24 2014

"Stanislav Blinov" <stanislav.blinov gmail.com> writes:

On Friday, 24 January 2014 at 08:21:12 UTC, Jacob Carlborg wrote:
 On 2014-01-23 21:53, Andrei Alexandrescu wrote:

 I would expect "contains" to take an element and check if it 
 exists in the range.

 I think "canFind" is just a weird name.

I agree on the latter point. As for "contains"... Well, if we 
address the terminology, we should consider that ranges are not 
really containers, therefore "contains" would be slightly 
incorrect. Perhaps "encounters" or "isWithin"? :)

On a serious note though, "contains" is leagues ahead of 
"canFind".

Jan 24 2014

"Regan Heath" <regan netmail.co.nz> writes:

On Fri, 24 Jan 2014 08:36:07 -0000, Stanislav Blinov  
<stanislav.blinov gmail.com> wrote:

 On Friday, 24 January 2014 at 08:21:12 UTC, Jacob Carlborg wrote:
 On 2014-01-23 21:53, Andrei Alexandrescu wrote:

 I would expect "contains" to take an element and check if it exists in  
 the range.

 I think "canFind" is just a weird name.

 I agree on the latter point. As for "contains"... Well, if we address  
 the terminology, we should consider that ranges are not really  
 containers, therefore "contains" would be slightly incorrect. Perhaps  
 "encounters" or "isWithin"? :)

This is the complete opposite of the point I was trying to make :p

I don't want a generic name/function, or a range-specific name/function:
we already have the generic one, and probably a range one. I want a
string-specific one - in this particular example - with a name people will
expect to find (pun intended) when doing string manipulation.

 On a serious note though, "contains" is leagues ahead of "canFind".

I think it depends on the context.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Jan 24 2014

"Regan Heath" <regan netmail.co.nz> writes:

On Fri, 24 Jan 2014 08:21:12 -0000, Jacob Carlborg <doob me.com> wrote:

 On 2014-01-23 21:53, Andrei Alexandrescu wrote:

 Ionno. Just look at the current morass with
 https://github.com/D-Programming-Language/phobos/pull/1875. We have two
 names for the same function "canFind" and "any". Then we want to
 deprecate one, but look at how much impact it's having on Phobos alone.
 Are you sure you want to add a _third_?

 Personally I would expect "any" to take a predicate and return "true" if  
 it can find any matching element. If a predicate is not supplied it  
 would behave as the opposite of "empty".



 I would expect "contains" to take an element and check if it exists in  
 the range.

Except in the case of string, where we also want an overload taking more  
than a single element aka a substring.

 I think "canFind" is just a weird name.

Me too, but it makes sense as a "generic" name I think.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Jan 24 2014

Manu <turkeyman gmail.com> writes:

On 24 January 2014 21:49, Regan Heath <regan netmail.co.nz> wrote:

 On Fri, 24 Jan 2014 08:21:12 -0000, Jacob Carlborg <doob me.com> wrote:

 I would expect "contains" to take an element and check if it exists in the
 range.

 Except in the case of string, where we also want an overload taking more
 than a single element aka a substring.


A great example of when the string function should not be conflated with
the general function.

Jan 25 2014

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Saturday, 25 January 2014 at 09:18:24 UTC, Manu wrote:
 On 24 January 2014 21:49, Regan Heath <regan netmail.co.nz> 
 wrote:

 On Fri, 24 Jan 2014 08:21:12 -0000, Jacob Carlborg 
 <doob me.com> wrote:

 I would expect "contains" to take an element and check if it 
 exists in the
 range.

 Except in the case of string, where we also want an overload 
 taking more
 than a single element aka a substring.


 A great example of when the string function should not be 
 conflated with
 the general function.

You always want the overload.

If this works:

     contains("hello", "el");

then this should work:

     contains([1, 2, 3, 4, 5], [2, 3]);

Special cases are pure evil. There's nothing special about 
strings in this case.
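
For reference, a short sketch showing that the existing generic name already treats both cases alike. It uses canFind, since no "contains" exists in Phobos; the calls below are otherwise the ones from the post.

```d
import std.algorithm : canFind;

void main()
{
    // Substring search in a string.
    assert(canFind("hello", "el"));

    // The same call shape on an array of ints.
    assert(canFind([1, 2, 3, 4, 5], [2, 3]));

    // The needle must appear as a contiguous run in the same order,
    // so this is a subsequence test rather than a set-membership test.
    assert(!canFind([1, 2, 3, 4, 5], [3, 2]));
}
```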

Jan 25 2014

"Jacob Carlborg" <doob me.com> writes:

On Saturday, 25 January 2014 at 10:15:30 UTC, Peter Alexander 
wrote:

 You always want the overload.

 If this works:

     contains("hello", "el");

 then this should work:

     contains([1, 2, 3, 4, 5], [2, 3]);

 Special cases are pure evil. There's nothing special about 
 strings in this case.

I agree. Since strings are just a kind of array, it would be 
stupid to not allow the above.

--
/Jacob Carlborg

Jan 25 2014

"Ola Fosheim Grøstad" writes:

On Saturday, 25 January 2014 at 10:15:30 UTC, Peter Alexander 
wrote:
 If this works:

     contains("hello", "el");

 then this should work:

     contains([1, 2, 3, 4, 5], [2, 3]);

 Special cases are pure evil. There's nothing special about 
 strings in this case.

I don't disagree, but naming and intuitive semantics should match 
up. In this case they do not. "contains" signifies set 
membership.

hasSequence/findSequence would be more appropriate

Jan 25 2014

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Saturday, 25 January 2014 at 11:43:03 UTC, Ola Fosheim Grøstad 
wrote:
 On Saturday, 25 January 2014 at 10:15:30 UTC, Peter Alexander 
 wrote:
 If this works:

    contains("hello", "el");

 then this should work:

    contains([1, 2, 3, 4, 5], [2, 3]);

 Special cases are pure evil. There's nothing special about 
 strings in this case.

 I don't disagree, but naming and intuitive semantics should 
 match up. In this case it does not. "contains" signifies  set 
 membership.

 hasSequence/findSequence would be more appropriate

100% agree. The key thing is that it should be consistent between 
strings and other range types.

Jan 25 2014

"Ola Fosheim Grøstad" writes:

On Saturday, 25 January 2014 at 14:23:48 UTC, Peter Alexander 
wrote:
 100% agree. The key thing is that it should be consistent 
 between strings and other range types.

Indeed. It is better to have to look up the name in the beginning.

Also, a good IDE will give you a list of alternatives and it is 
important to keep this list as short as possible. Ideally there 
should be no more than 10 functions for any type in order to 
maximize the benefit of using an IDE. So few functions, but with 
very descriptive names make me more efficient (I don't have to 
look it up in the documentation).

Basically, it is better to have a small core that can be used 
with lambdas for the specifics. I notice when I do Python that I 
don't use all the special functions. I use the generic ones with 
lambdas instead.

Jan 25 2014

Marco Leise <Marco.Leise gmx.de> writes:

On Sat, 25 Jan 2014 14:36:52 +0000, "Ola Fosheim Grøstad"
<ola.fosheim.grostad+dlang gmail.com> wrote:

 On Saturday, 25 January 2014 at 14:23:48 UTC, Peter Alexander
 wrote:
 100% agree. The key thing is that it should be consistent
 between strings and other range types.

 Indeed. It is better to have to look up the name in the beginning.

If the name works well for strings, I agree. But otherwise I
prefer established function names from popular languages like
Python, Pascal/Delphi or Java.

 Also, a good IDE will give you a list of alternatives and it is
 important to keep this list as short as possible. Ideally there
 should be no more than 10 functions for any type in order to
 maximize the benefit of using an IDE. So few functions, but with
 very descriptive names make me more efficient (I don't have to
 look it up in the documentation).

There are 360 completions for a string already in Mono-D with
these imports:

import std.algorithm;
import std.array;
import std.conv;
import std.range;
import std.stdio;
import std.string;
import std.traits;

Don't bother with removing two or three function names for
string overloads. That's optimizing in the wrong area. :)

-- 
Marco

Jan 26 2014

Manu <turkeyman gmail.com> writes:

On 25 January 2014 20:15, Peter Alexander <peter.alexander.au gmail.com> wrote:

 On Saturday, 25 January 2014 at 09:18:24 UTC, Manu wrote:

 On 24 January 2014 21:49, Regan Heath <regan netmail.co.nz> wrote:

  On Fri, 24 Jan 2014 08:21:12 -0000, Jacob Carlborg <doob me.com> wrote:
  I would expect "contains" to take a element and check if it exists in
 the
 range.

 Except in the case of string, where we also want an overload taking more
 than a single element aka a substring.


 A great example of when the string function should not be conflated with
 the general function.

 You always want the overload.

 If this works:

     contains("hello", "el");

 then this should work:

     contains([1, 2, 3, 4, 5], [2, 3]);

 Special cases are pure evil. There's nothing special about strings in this
 case.

Does that work in all cases of strings wrt utf encodings? String
normalisation? What about when char and wchar strings are compared? (should
that be handled transparently?)
Strings are special, they're almost always special, and rarely conflate
with generalisations well.
Strings almost always expose special cases that aren't useful or relevant
for any other context.

Jan 25 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/25/14 6:07 PM, Manu wrote:
 On 25 January 2014 20:15, Peter Alexander <peter.alexander.au gmail.com
 <mailto:peter.alexander.au gmail.com>> wrote:
     If this works:

          contains("hello", "el");

     then this should work:

          contains([1, 2, 3, 4, 5], [2, 3]);

     Special cases are pure evil. There's nothing special about strings
     in this case.


 Does that work in all cases of strings wrt utf encodings?

Yes.

 String normalisation?

Not automatically, but you could do things such as 
find(haystack.byGrapheme, needle.byGrapheme).

 What about when char and wchar strings are compared?

Yes.

 (should that be handled transparently?)

It is. D's stdlib is one of very few languages that can do this out of 
the box without converting one to the other.

 Strings are special, they're almost always special, and rarely conflate
 with generalisations well.

There is considerable evidence that the above is wrong.

 Strings almost always expose special cases that aren't useful or
 relevant for any other context.

The special cases I found almost always involve whitespace (there is no 
obvious generalization of the notion). Other than that, UTF strings are 
nothing else but boring variable-length encodings.
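
A small sketch of the two answers above, under the assumption that std.uni.normalize (default form NFC) is the normalisation step one would reach for; the sample strings are made up. Searching a UTF-16 haystack with a UTF-8 needle needs no conversion, while composed and decomposed spellings only match after explicit normalisation.

```d
import std.algorithm : canFind;
import std.uni : normalize;

void main()
{
    wstring haystack = "na\u00EFve caf\u00E9"w; // UTF-16 code units
    string needle = "caf\u00E9";                // UTF-8 code units

    // Mixed encodings are compared code point by code point.
    assert(haystack.canFind(needle));

    // Normalisation is not automatic: 'e' + combining acute (U+0301)
    // is a different code point sequence from precomposed U+00E9.
    string decomposed = "cafe\u0301";
    assert(!decomposed.canFind("\u00E9"));

    // After normalising to NFC (the default form) the search succeeds.
    assert(normalize(decomposed).canFind("\u00E9"));
}
```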


Andrei

Jan 25 2014

"Regan Heath" <regan netmail.co.nz> writes:

On Sat, 25 Jan 2014 10:15:28 -0000, Peter Alexander  
<peter.alexander.au gmail.com> wrote:
 Special cases are pure evil. There's nothing special about strings in  
 this case.

This is a tangent to my suggestion.

I am arguing for domain specific language (aliases) where sensible, not  
domain specific functions.  If canFind can already handle all the  
desirable string cases, perfect, but let's alias it in std.string as  
"contains" so that people find what they expect to find first time and  
don't get frustrated looking for the correct generic name for the  
functionality they want.
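
To make the proposal concrete, here is a sketch of what such an alias could look like in user code. The name "contains" is hypothetical: Phobos does not define it in std.string, and this only illustrates the idea of a domain-specific name forwarding to the generic function.

```d
import std.algorithm : canFind;

// Hypothetical alias, declared locally; not part of Phobos.
alias contains = canFind;

void main()
{
    auto str = "hello world";

    // Reads the way someone coming from another language would expect.
    assert(str.contains("hello"));
    assert(!str.contains("goodbye"));

    // Because it is only an alias, it works on any range canFind accepts.
    assert([1, 2, 3].contains(2));
}
```

Since an alias adds no new behaviour, both names stay in sync automatically, which is also why the objection raised in the replies is about naming overhead rather than functionality.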

There are likely other cases where we already have all the functionality  
in a nice generic function, but people struggle to find it because it has  
a suitably generic name.

I just want us to lower the bar for beginners coming from other languages  


R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Jan 27 2014

"Dicebot" <public dicebot.lv> writes:

On Monday, 27 January 2014 at 14:27:42 UTC, Regan Heath wrote:
 On Sat, 25 Jan 2014 10:15:28 -0000, Peter Alexander 
 <peter.alexander.au gmail.com> wrote:
 Special cases are pure evil. There's nothing special about 
 strings in this case.

 This is a tangent to my suggestion.

 I am arguing for domain specific language (aliases) where 
 sensible, not domain specific functions.  If canFind can 
 already handle all the desirable string cases, perfect, but 
 lets alias it in std.string as "contains" so that people find 
 what they expect to find first time and don't get frustrated 
 looking for the correct generic name for the functionality they 
 want.

 There are likely other cases where we already have all the 
 functionality in a nice generic function, but people struggle 
 to find it because it has a suitably generic name.

 I just want us to lower the bar for beginners coming from other 


 R

I think that is a small short-term learning advantage but huge 
long-term damage for code readability. Now you suddenly need to 
not only remember what Phobos can do but also all defined aliases 
for that stuff.

What could have been awesome is to be able to define such aliases 
via DDOC so that IDEs can understand them and list them in 
auto-completion, while still putting the "real" name in source code. 
It would have solved the discoverability issue without harming naming 
consistency.

Jan 27 2014

"Regan Heath" <regan netmail.co.nz> writes:

On Mon, 27 Jan 2014 14:34:30 -0000, Dicebot <public dicebot.lv> wrote:

 On Monday, 27 January 2014 at 14:27:42 UTC, Regan Heath wrote:
 On Sat, 25 Jan 2014 10:15:28 -0000, Peter Alexander  
 <peter.alexander.au gmail.com> wrote:
 Special cases are pure evil. There's nothing special about strings in  
 this case.

 This is a tangent to my suggestion.

 I am arguing for domain specific language (aliases) where sensible, not  
 domain specific functions.  If canFind can already handle all the  
 desirable string cases, perfect, but lets alias it in std.string as  
 "contains" so that people find what they expect to find first time and  
 don't get frustrated looking for the correct generic name for the  
 functionality they want.

 There are likely other cases where we already have all the  
 functionality in a nice generic function, but people struggle to find  
 it because it has a suitably generic name.

 I just want us to lower the bar for beginners coming from other  


 R

 I think that is a small short-term learning advantage but huge long-term  
 damage for code readability. Now you suddenly need to not only remember  
 what Phobos can do but also all defined aliases for that stuff.

No, you really don't.

If you're writing string code you will intuitively reach for "substring",  
"contains", etc because you already know these terms and what behaviour to  
expect from them.  In a generic context, or a range context you will reach  
for different generic or range type names.

Likewise when reading code you will read "contains" and immediately know  
what it does, you don't need to remember that it's also called canFind ..  
why would you care?

Even *if* you decided to compare some string code with some generic code,  
and the two were actually doing the "same" thing with different calls, you  
wouldn't have any trouble at all in understanding each and then realising  
they do the same thing.

 What could have been awesome is to be able to define such aliases via  
 DDOC so that IDE's can understand them and list in auto-completion,  
 while still putting "real" name in source code. It would have solved  
 discoverability issue without harming naming consistency.

I think I would dislike this.. not sure.  Do our docs have "synonyms" in  
function descriptions.. then at least google would find "contains" on the  
page next to canFind and you would have an answer.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Jan 28 2014

"Dicebot" <public dicebot.lv> writes:

On Tuesday, 28 January 2014 at 11:26:39 UTC, Regan Heath wrote:
 No, you really don't.

 If you're writing string code you will intuitively reach for 
 "substring", "contains", etc because you already know these 
 terms and what behaviour to expect from them.  In a generic 
 context, or a range context you will reach for different 
 generic or range type names.

Trusting intuition is not acceptable. I will go and check in docs 
in most cases if I have not encountered it before. Check each time 
for every new alias. I'd hate to have this overhead. Right now 
all I need to do is to stop thinking about strings as strings - 
easy and fast.

 What could have been awesome is to be able to define such 
 aliases via DDOC so that IDE's can understand them and list in 
 auto-completion, while still putting "real" name in source 
 code. It would have solved discoverability issue without 
 harming naming consistency.

 I think I would dislike this.. not sure.  Do our docs have 
 "synonyms" in function descriptions.. then at least google 
 would find "contains" on the page next to canFind and you would 
 have an answer.

They don't have it right now and I propose to introduce it for 
this very reason.

Jan 29 2014

"Regan Heath" <regan netmail.co.nz> writes:

On Wed, 29 Jan 2014 09:52:01 -0000, Dicebot <public dicebot.lv> wrote:

 On Tuesday, 28 January 2014 at 11:26:39 UTC, Regan Heath wrote:
 No, you really don't.

 If you're writing string code you will intuitively reach for  
 "substring", "contains", etc because you already know these terms and  
 what behaviour to expect from them.  In a generic context, or a range  
 context you will reach for different generic or range type names.

 Trusting intuition is not acceptable.

Sure it is, if we're talking about making life easier for beginners and  
making things more "obvious" in general.  Of course, not everyone has the  
same idea of obvious, but there is enough overlap and we would *only*  
define aliases for that overlap.  In short, if people expect it to be  
there, let's make sure it's there.

 I will go and check in docs in most case if I have not encountered it  
 before. Check each time for every new aliases. I'd hate to have this  
 overhead.

Huh?  Assuming you have a decent editor checking the docs should be as  
simple as pressing F1 on the unknown function.  And, that's only assuming  
it's not immediately obvious what it's doing.  Are you telling me, that  
you would be confused by seeing...

if (str.contains("hello"))

I seriously doubt that, and that's all I'm suggesting, adding aliases for  
things which are obvious, things which any beginner will expect to be  
there, and currently aren't there.

I am *not* suggesting we add every obscure name for every single function,  
that would be complete nonsense.  Let's not get confused about the scope of  
what I'm suggesting, I am suggesting a very limited number of new aliases,  
and only for cases where there is a clear obvious expected name which we  
currently lack.

 Right now all I need to do is to stop thinking about strings as strings  
 - easy and fast.

Sure, once you learn all the generic terms for things.  I *still* have  
trouble finding the LINQ function I need when I want to do something in  
the LINQ generic style .. and I've been using LINQ for at least a year  
now.  The issue is that the generic name just does not naturally occur to  
me in certain contexts, like strings.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Jan 29 2014

"Dicebot" <public dicebot.lv> writes:

On Wednesday, 29 January 2014 at 10:13:44 UTC, Regan Heath wrote:
 I will go and check in docs in most case if I have not 
 encountered it before. Check each time for every new aliases. 
 I'd hate to have this overhead.

 Huh?  Assuming you have a decent editor checking the docs 
 should be as simple as pressing F1 on the unknown function.

It requires your mental context switching anyway.

 And, that's only assuming it's not immediately obvious what 
 it's doing.  Are you telling me, that you would be confused by 
 seeing...

 if (str.contains("hello"))

I won't be confused but I also won't be sure. For example, it may 
return boolean or inclusion count. `str` can be a string or an array 
of strings. With uniform range-based algorithms I can always 
expect consistent interpretation (or rant about inconsistent 
naming :)

 I seriously doubt that, and that's all I'm suggesting, adding 
 aliases for things which are obvious, things which any beginner 
 will expect to be there, and currently aren't there.

I don't buy into appealing to imaginary "any beginner" which has 
expectations identical to other "any beginner". My observations 
show quite the contrary - that those expectations are actually 
often different and incompatible and best way for a language is 
to force beginners to switch to expectations of the language.

Jan 29 2014

"Regan Heath" <regan netmail.co.nz> writes:

On Wed, 29 Jan 2014 10:42:08 -0000, Dicebot <public dicebot.lv> wrote:

 On Wednesday, 29 January 2014 at 10:13:44 UTC, Regan Heath wrote:
 I will go and check in docs in most case if I have not encountered it  
 before. Check each time for every new aliases. I'd hate to have this  
 overhead.

 Huh?  Assuming you have a decent editor checking the docs should be as  
 simple as pressing F1 on the unknown function.

 It requires your mental context switching anyway.

 And, that's only assuming it's not immediately obvious what it's  
 doing.  Are you telling me, that you would be confused by seeing...

 if (str.contains("hello"))

 I won't be confused but I won't also be sure. For example, it may return  
 boolean or inclusion count. `str` can be string of array of strings.  
 With uniform ranges-based algorithms I can always expect consistent  
 interpretation (or rant about inconsistent naming :)

 I seriously doubt that, and that's all I'm suggesting, adding aliases  
 for things which are obvious, things which any beginner will expect to  
 be there, and currently aren't there.

 I don't buy into appealing to imaginary "any beginner" which has  
 expectations identical to other "any beginner". My observations show  
 quite the contrary - that those expectations are actually often  
 different and incompatible and best way for a language is to force  
 beginners to switch to expectations of the language.

*shrug* agree to disagree on all points.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Jan 29 2014

"Dicebot" <public dicebot.lv> writes:

On Wednesday, 29 January 2014 at 15:57:30 UTC, Regan Heath wrote:
 *shrug* agree to disagree on all points.

 R

Peace!

Jan 29 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/27/14 6:27 AM, Regan Heath wrote:
 On Sat, 25 Jan 2014 10:15:28 -0000, Peter Alexander
 <peter.alexander.au gmail.com> wrote:
 Special cases are pure evil. There's nothing special about strings in
 this case.

 This is a tangent to my suggestion.

 I am arguing for domain specific language (aliases) where sensible, not
 domain specific functions.  If canFind can already handle all the
 desirable string cases, perfect, but lets alias it in std.string as
 "contains" so that people find what they expect to find first time and
 don't get frustrated looking for the correct generic name for the
 functionality they want.

 There are likely other cases where we already have all the functionality
 in a nice generic function, but people struggle to find it because it
 has a suitably generic name.

 I just want us to lower the bar for beginners coming from other


I just don't think this scales, though I understand it can sound 
reasonable before it being tried.

Walter doesn't like writing libraries so when he first defined Phobos' 
string support he simply took the string functions in Python and Ruby 
and implemented them. That didn't work well at all, in spite of the 
functions having the same names and semantics.


Andrei

Jan 27 2014

"Regan Heath" <regan netmail.co.nz> writes:

On Mon, 27 Jan 2014 16:19:54 -0000, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 1/27/14 6:27 AM, Regan Heath wrote:
 On Sat, 25 Jan 2014 10:15:28 -0000, Peter Alexander
 <peter.alexander.au gmail.com> wrote:
 Special cases are pure evil. There's nothing special about strings in
 this case.

 This is a tangent to my suggestion.

 I am arguing for domain specific language (aliases) where sensible, not
 domain specific functions.  If canFind can already handle all the
 desirable string cases, perfect, but lets alias it in std.string as
 "contains" so that people find what they expect to find first time and
 don't get frustrated looking for the correct generic name for the
 functionality they want.

 There are likely other cases where we already have all the functionality
 in a nice generic function, but people struggle to find it because it
 has a suitably generic name.

 I just want us to lower the bar for beginners coming from other


 I just don't think this scales, though I understand it can sound  
 reasonable before it being tried.

 Walter doesn't like writing libraries so when he first defined Phobos'  
 string support he simply took the string functions in Python and Ruby  
 and implemented them. That didn't work well at all, in spite of the  
 functions having the same names and semantics.

What specifically didn't work?  All I can recall are UTF and slicing  
issues, some of which remain with us today.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Jan 28 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/28/14 3:28 AM, Regan Heath wrote:
 On Mon, 27 Jan 2014 16:19:54 -0000, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Walter doesn't like writing libraries so when he first defined Phobos'
 string support he simply took the string functions in Python and Ruby
 and implemented them. That didn't work well at all, in spite of the
 functions having the same names and semantics.

 What specifically didn't work?  All I can recall are UTF and slicing
 issues, some of which remain with us today.

Problem is what we had was a crappy strings API because it used none of 
D's inherent advantages. What we have now is much better.

Andrei

Jan 28 2014

"Regan Heath" <regan netmail.co.nz> writes:

On Wed, 29 Jan 2014 06:49:30 -0000, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 1/28/14 3:28 AM, Regan Heath wrote:
 On Mon, 27 Jan 2014 16:19:54 -0000, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Walter doesn't like writing libraries so when he first defined Phobos'
 string support he simply took the string functions in Python and Ruby
 and implemented them. That didn't work well at all, in spite of the
 functions having the same names and semantics.

 What specifically didn't work?  All I can recall are UTF and slicing
 issues, some of which remain with us today.

 Problem is what we had was a crappy strings API because it used none of  
 D's inherent advantages. What we have now is much better.

Sure, but it would be better still if the commonly expected names for  
routines were present.. is all I'm saying.  I am certainly not suggesting  
we go back to a bad API, I am just saying there are some functions people  
expect to see, and they're not there, and that is frustrating; perhaps  
enough to put someone off.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Jan 29 2014

"Jakob Ovrum" <jakobovrum gmail.com> writes:

On Saturday, 25 January 2014 at 09:18:24 UTC, Manu wrote:
 On 24 January 2014 21:49, Regan Heath <regan netmail.co.nz> 
 wrote:

 On Fri, 24 Jan 2014 08:21:12 -0000, Jacob Carlborg 
 <doob me.com> wrote:

 I would expect "contains" to take a element and check if it 
 exists in the
 range.

 Except in the case of string, where we also want an overload 
 taking more
 than a single element aka a substring.


 A great example of when the string function should not be 
 conflated with
 the general function.

Both `find` and `canFind` support subrange search, and that works 
with any range, not just substrings.
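
A brief sketch of the subrange search mentioned here, using find rather than canFind so that the returned remainder is visible. The inputs are illustrative only.

```d
import std.algorithm : find;

void main()
{
    // find returns the haystack advanced to the first match.
    assert(find("hello", "el") == "ello");

    // The same overload works on an array of ints.
    assert(find([1, 2, 3, 4, 5], [2, 3]) == [2, 3, 4, 5]);

    // When nothing matches, the result is an empty range.
    assert(find([1, 2, 3], [9]).length == 0);
}
```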

Jan 25 2014

"Regan Heath" <regan netmail.co.nz> writes:

On Thu, 23 Jan 2014 20:53:01 -0000, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 1/23/14 8:06 AM, Regan Heath wrote:
 This.  Not my position.  Rather I am suggesting we identify individual
 omissions (like std.string.contains) and add an alias.  So that people
 don't have to struggle quite so much when switching to D.  The lower the
 bar and all that..

 Ionno. Just look at the current morass with  
 https://github.com/D-Programming-Language/phobos/pull/1875. We have two  
 names for the same function "canFind" and "any". Then we want to  
 deprecate one, but look at how much impact it's having on Phobos alone.  
 Are you sure you want to add a _third_?

Not *quite* the same.  Any is/was in the same module as canFind and for  
use in the exact same context.  A string specific "contains" would only be  
used in the context of string parsing.  If contains existed in std.string  
then it would be unusual for anyone to use canFind on a string (in a  
string only context).

That's what I'm suggesting, not adding more generic aliases/names for  
existing functions (as Any was) but adding specific names in specific  
contexts for otherwise generic functions with odd generic names, like  
canFind.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Jan 24 2014

"Ola Fosheim Grøstad" writes:

On Monday, 13 January 2014 at 12:53:08 UTC, Regan Heath wrote:
 or less what you might call std.string.contains (which does not 
 exist - instead we'd use indexOf != -1.. I think).

Just a side track:

What I dislike about return values as error-indicators is that 
they are arbitrary so you have to memorize "-1", "0", null, 
throws…

I think it is often useful to have a user-supplied default and 
sensible naming like having functions that allow testing for 
"0","false","null" as failure ending with "OK" in their name. And 
functions that throw ought to have some kind of assertive name 
like "validate" or a name that explicitly hints at exceptions.

"-1" is really a horrible error value since it fails the "boolean 
test", and e.g. if you want the non-query part of a URL, you 
want string length to be the "not found value" when searching for 
"?", not -1:

"http://server.com/page"
"http://server.com/page?query=xyz"

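One way to sketch the "length as the not-found value" idea on top of the existing std.string.indexOf. The helper name indexOfOrEnd is hypothetical and not part of Phobos; it simply maps the -1 result to the string length so the slice works in both cases.

```d
import std.string : indexOf;

// Hypothetical helper: like indexOf, but yields s.length when c is absent.
size_t indexOfOrEnd(string s, char c)
{
    immutable idx = s.indexOf(c);
    return idx < 0 ? s.length : cast(size_t) idx;
}

void main()
{
    auto plain = "http://server.com/page";
    auto query = "http://server.com/page?query=xyz";

    // The same slice expression handles both URLs without a special case.
    assert(plain[0 .. plain.indexOfOrEnd('?')] == "http://server.com/page");
    assert(query[0 .. query.indexOfOrEnd('?')] == "http://server.com/page");
}
```
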
Jan 22 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/9/14 12:53 PM, Craig Dillabaugh wrote:
 At the very least the documentation for std.string should say something
 along the lines of:

 "The libraries std.unicode and std.array also include a number of
 functions that operate on strings, so if what you are looking for isn't
 here, try looking there."

Pull request please.

Andrei

Jan 09 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:
   string y = find(retro("Hello"), 'H');

import std.string;
auto idx = lastIndexOf("Hello", 'H');

Wow, that's unbelievable difficult. D sucks.

Jan 09 2014

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Thursday, 9 January 2014 at 17:39:00 UTC, Adam D. Ruppe wrote:
 On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:
  string y = find(retro("Hello"), 'H');

 import std.string;
 auto idx = lastIndexOf("Hello", 'H');

 Wow, that's unbelievable difficult. D sucks.

How on earth did I miss that...

Jan 09 2014

Manu <turkeyman gmail.com> writes:

On 10 January 2014 03:40, John Colvin <john.loughran.colvin gmail.com> wrote:

 On Thursday, 9 January 2014 at 17:39:00 UTC, Adam D. Ruppe wrote:

 On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:

  string y = find(retro("Hello"), 'H');

 import std.string;
 auto idx = lastIndexOf("Hello", 'H');

 Wow, that's unbelievable difficult. D sucks.

 How on earth did I miss that...

I have to wonder the same thing.
It's just not anything like anything I've ever called it before I guess.
I guess I started with find, and then it refers you to retro if you want to
reverse find, and of course, by this time I'm nowhere near std.string
anymore. Hard to find something if you're not even looking in the same file
:/
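
A short sketch contrasting the two routes described here. It assumes the usual Phobos behaviour that find returns the same type it was given, so searching a retro range yields a reversed range rather than a string; lastIndexOf gives the offset directly. The searched characters are chosen for illustration.

```d
import std.algorithm : equal, find;
import std.range : retro;
import std.string : lastIndexOf;

void main()
{
    // The result is a reversed range, so it is bound with auto
    // instead of being assigned to a string.
    auto tail = find(retro("Hello"), 'l');
    assert(tail.equal("lleH"));

    // lastIndexOf returns an offset into the original string, or -1.
    assert(lastIndexOf("Hello", 'l') == 3);
    assert(lastIndexOf("Hello", 'H') == 0);
    assert(lastIndexOf("Hello", 'z') == -1);
}
```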

Jan 09 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/9/14 4:34 PM, Manu wrote:
 On 10 January 2014 03:40, John Colvin <john.loughran.colvin gmail.com
 <mailto:john.loughran.colvin gmail.com>> wrote:

     On Thursday, 9 January 2014 at 17:39:00 UTC, Adam D. Ruppe wrote:

         On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:

               string y = find(retro("Hello"), 'H');


         import std.string;
         auto idx = lastIndexOf("Hello", 'H');

         Wow, that's unbelievable difficult. D sucks.


     How on earth did I miss that...


 I have to wonder the same thing.
 It's just not anything like anything I've ever called it before I guess.
 I guess I started with find, and then it refers you to retro if you want
 to reverse find, and of course, by this time I'm nowhere near std.string
 anymore. Hard to find something if you're not even looking in the same
 file :/

Probably an xref of indexOf/lastIndexOf in find would be useful. PRP

Andrei

Jan 09 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 01:34, Manu wrote:

 I have to wonder the same thing.
 It's just not anything like anything I've ever called it before I guess.
 I guess I started with find, and then it refers you to retro if you want
 to reverse find, and of course, by this time I'm nowhere near std.string
 anymore. Hard to find something if you're not even looking in the same
 file :/

But "strchr" is a good name? If I wanted the index of a character in a 
string I would most likely look for something called "indexOf", or 
"index". In Python it's called "index" (and "find"). In PHP it's called 
"strrpos" and in C++ it's called "find". I think we're in pretty good 
shape here with D.

-- 
/Jacob Carlborg

Jan 10 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 09-Jan-2014 21:38, Adam D. Ruppe wrote:
 On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:
   string y = find(retro("Hello"), 'H');

 import std.string;
 auto idx = lastIndexOf("Hello", 'H');

 Wow, that's unbelievable difficult. D sucks.

+1 LOL

-- 
Dmitry Olshansky

Jan 09 2014

"Dicebot" <public dicebot.lv> writes:

On Thursday, 9 January 2014 at 17:39:00 UTC, Adam D. Ruppe wrote:
 On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:
  string y = find(retro("Hello"), 'H');

 import std.string;
 auto idx = lastIndexOf("Hello", 'H');

 Wow, that's unbelievable difficult. D sucks.

It is not the same thing as sample with byGrapheme though.

Jan 09 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 9 January 2014 at 17:54:05 UTC, Dicebot wrote:
 It is not the same thing as sample with byGrapheme though.

Right, but it works for ascii (and others) and shows std.string 
isn't as weak as being said in this thread.

Jan 09 2014

Manu <turkeyman gmail.com> writes:

On 10 January 2014 04:00, Adam D. Ruppe <destructionator gmail.com> wrote:

 On Thursday, 9 January 2014 at 17:54:05 UTC, Dicebot wrote:

 It is not the same thing as sample with byGrapheme though.

 Right, but it works for ascii (and others) and shows std.string isn't as
 weak as being said in this thread.

So is it 'correct'? The docs don't really say what it does. Is 'index' in
bytes, in codepoints, or in graphemes? Looks like bytes, but then it talks
about std.utf.UTFException, so maybe codepoints?
Being correct is constantly being thrown around as the 'value' in why
everything's so fucking hard... if this function isn't 'correct', then we
have a disparity.

I also don't think it excuses any of my other points. There shouldn't be
4-5+ modules where you have to look whenever you want to find string
related stuff.
In this case, my explicit example is just the straw that broke the camel's
back. My experience still stands; every time I try to do any serious string
work, I waste far more time than I care to, and I HATE doing it. Makes me
feel dirty and I don't enjoy my programming time (which I usually do
enjoy). In my experience, if you're not enjoying programming, something
went wrong.

The D docs are pretty terrible, they don't do much to help you find what
you're looking for.
You have a massive block of function names at the top of the page, you have
to carefully scan through one by one, hoping that it's named something
obvious that will stand out to you, and in the event it doesn't have a
helper function, you need to work out the proper sequence of
algorithm/range/whatever operations to do what you want (and then repeat
the process finding the small parts you need across a bunch of modules).

Blah! </endrant>

Jan 09 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Friday, 10 January 2014 at 00:56:36 UTC, Manu wrote:
 So is it 'correct'?

Yes, with the caveat that it might find a surrogate pair (like H 
followed by an accent code point). That's what byGrapheme is 
about: combining those pairs.

But meh, do you really care about that?

indexOf does correctly handle the UTF formats and returns an 
index suitable for slicing (or -1).

auto idx = "cool".indexOf("o");
if(idx == -1)
   throw new Exception("not found");

auto before = "cool"[0 .. idx];
auto after = "cool"[idx + 1 .. $];


Code like that will always yield valid UTF strings. Again, it 
*might* break up a pair of code points, but it *will* correctly 
handle multi-byte code points... so probably good enough for 99% 
of use cases.

 Looks like bytes, but then it talks

It is bytes on string, and wchars on wstring; it is whatever unit 
is correct for slicing the type you pass it.
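
As a minimal sketch of the point about units (the sample string is an assumption chosen for illustration; `indexOf` here is `std.string.indexOf`), the same search gives a different index on `string` and `wstring`, and each index slices its own string correctly:

```d
import std.string : indexOf;

void main()
{
    // "h\u00E9llo" is h, é (U+00E9), l, l, o.
    // é takes two code units in UTF-8 and one in UTF-16.
    string  s = "h\u00E9llo";
    wstring w = "h\u00E9llo"w;

    auto si = s.indexOf('l'); // index in UTF-8 code units (bytes)
    auto wi = w.indexOf('l'); // index in UTF-16 code units

    assert(si == 3);
    assert(wi == 2);

    // Each index is valid for slicing the string it came from.
    assert(s[0 .. si] == "h\u00E9");
    assert(s[si .. $] == "llo");
    assert(w[wi .. $] == "llo"w);
}
```

The index is therefore not a character count; it is only meaningful as a slice position into the original string.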

 The D docs are pretty terrible, they don't do much to help you 
 find what you're looking for.

I mostly agree (and this is partially why I started writing 
http://dpldocs.info/ but I never finished that so it isn't much 
better). I don't notice it so much because I already know where 
to look for most things but regardless I agree it is a pain for 
anything new.

Jan 09 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

BTW, I'll say it again: it was a *lot* easier to get started with 
this back in the phobos1 days, where std.string WAS the one-stop 
location for string stuff.

At the least, we should get the docs to point people in the right 
place, but I think we should also do more conceptual overview 
pages that talk about cross-module things.

Jan 09 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Jan 10, 2014 at 01:18:01AM +0000, Adam D. Ruppe wrote:
 BTW, I'll say it again: it was a *lot* easier to get started with
 this back in the phobos1 days, where std.string WAS the one-stop
 location for string stuff.

I thought it still is? Except that a lot of it is now implicit via
public import from std.array and std.algorithm and wherever else. (But I
wouldn't know, though, I wasn't around in the D1 days.)


 At the least, we should get the docs to point people in the right
 place,

Yeah, I think all public imports should at least get a mention in the
ddoc header so that people know what's *actually* getting imported, not
just what the docs say are in the module.


 but I think we should also do more conceptual overview pages that talk
 about cross-module things.

+1. Currently Phobos has way too many modules under std, and unless
you're already familiar with where things are, you wouldn't even know
where to start looking when searching for new functionality.


T

-- 
He who is everywhere is nowhere.

Jan 09 2014

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Friday, 10 January 2014 at 01:26:50 UTC, H. S. Teoh wrote:
 I thought it still is?

Yeah, mostly, though sometimes the disambiguation leaks the other 
details (for example replace() sometimes has a name conflict, so 
you need to explicitly import it or use a full name to 
disambiguate).

But this is primarily a documentation problem rather than a code 
one.


Some code differences from the old days:

* before: converting to and from string was in std.string. 
Functions like toInt, toString, etc. Nowadays, this is all done 
with std.conv.to. The new way is way cool, but a newbie's first 
place to look might be for std.string.toString rather than 
std.conv.to!string.

* before: some char type stuff was in std.string (and the rest in 
std.ctype IIRC). Now, it is in std.ascii and std.uni.

* before: the signatures were char[] foo(char[]). Nowadays, it is 
S foo(S)(S s) if(isSomeString!S)... so much wordier! Better 
functionality, but omg it can be a pain to read and surely 
intimidating for newbs.
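
For the first bullet, a short sketch of what the `std.conv.to` style looks like in practice (the values are arbitrary examples, not taken from the thread):

```d
import std.conv : ConvException, to;
import std.exception : assertThrown;

void main()
{
    // String to number and back, all through the one template.
    assert(to!int("42") == 42);
    assert(to!double("3.5") == 3.5);
    assert(to!string(42) == "42");

    // UFCS form, which reads more like the old member-style helpers.
    int n = 7;
    assert(n.to!string == "7");

    // Malformed input throws instead of silently returning 0.
    assertThrown!ConvException(to!int("forty-two"));
}
```

A newcomer looking for `toInt` or `toString` in std.string would not find any of this without being pointed to std.conv.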


I think things are generally improved as for functionality and 
consistency, but the docs are more debatable.

Jan 09 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Jan 10, 2014 at 01:34:46AM +0000, Adam D. Ruppe wrote:
[...]
 Some code differences from the old days:
 
 * before: converting to and from string was in std.string. Functions
 like toInt, toString, etc. Nowadays, this is all done with
 std.conv.to. The new way is way cool, but a newbie's first place to
 look might be for std.string.toString rather than std.conv.to!string.

Right, so it should be mentioned in std.string.

But probably your idea of more concept-oriented overview pages is
better. It doesn't seem like the right solution to just insert
hyperlinks to std.conv in every other Phobos module.


 * before: some char type stuff was in std.string (and the rest in
 std.ctype IIRC). Now, it is in std.ascii and std.uni.

Yeah, this is one of the things I found annoying. Sure I understand why
std.ascii needs to be different from std.uni, but then you have stuff
split across std.string, std.ascii, std.uni, and std.utf -- what's the
diff between std.utf and std.uni?! (Yes I know what the diff is, the
point is that it looks silly to a newcomer.)


 * before: the signatures were char[] foo(char[]). Nowadays, it is S
 foo(S)(S s) if(isSomeString!S)... so much wordier! Better
 functionality, but omg it can be a pain to read and surely
 intimidating for newbs.

Sig constraints seriously need to be formatted differently from the way
they are right now, which is an unreadable blob of obtuse text. Take
std.algorithm.makeIndex, for example. How do you even *read* that
mess??! It's 6 lines of dense, *bolded* text (on my browser anyway,
YMMV), and it's not even clear that it's actually two overloads. I have
trouble telling what exactly it returns, and where its parameter lists
start and end. Nor what the sig constraints actually mean.

Actually, this particular case seems to be a prime example of the sig
constraint vs. static if idea I had in another post (i.e., sig
constraints should only define the scope of the overload, and type
requirements on arguments within that scope should be inside static ifs
in the body of the function / template). From what I can see, makeIndex
really should be in a *single* template, probably with no sig
constraints (or only very simple ones), and everything else should be
inside the template body as static if blocks. Whatever is unclear from
the outer sig constraints should be explained in the text of the ddoc.
Users shouldn't be expected to be able to parse sig constraints that are
really Phobos internal implementation details.


 I think things are generally improved as for functionality and
 consistency, but the docs are more debatable.

I agree, functionality is more unified and consistent, but the docs are
very newbie-unfriendly.


T

-- 
Why can't you just be a nonconformist like everyone else? -- YHL

Jan 09 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/9/14 6:00 PM, H. S. Teoh wrote:
 On Fri, Jan 10, 2014 at 01:34:46AM +0000, Adam D. Ruppe wrote:
 [...]
 Some code differences from the old days:

 * before: converting to and from string was in std.string. Functions
 like toInt, toString, etc. Nowadays, this is all done with
 std.conv.to. The new way is way cool, but a newbie's first place to
 look might be for std.string.toString rather than std.conv.to!string.

 Right, so it should be mentioned in std.string.

 But probably your idea of more concept-oriented overview pages is
 better. It doesn't seem like the right solution to just insert
 hyperlinks to std.conv in every other Phobos module.

A tutorial on string manipulation in D would be awesome.

Andrei

Jan 09 2014

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Friday, 10 January 2014 at 05:28:24 UTC, Andrei Alexandrescu 
wrote:
 On 1/9/14 6:00 PM, H. S. Teoh wrote:
 On Fri, Jan 10, 2014 at 01:34:46AM +0000, Adam D. Ruppe wrote:
 [...]
 Some code differences from the old days:

 * before: converting to and from string was in std.string. 
 Functions
 like toInt, toString, etc. Nowadays, this is all done with
 std.conv.to. The new way is way cool, but a newbie's first 
 place to
 look might be for std.string.toString rather than 
 std.conv.to!string.

 Right, so it should be mentioned in std.string.

 But probably your idea of more concept-oriented overview pages 
 is
 better. It doesn't seem like the right solution to just insert
 hyperlinks to std.conv in every other Phobos module.

 A tutorial on string manipulation in D would be awesome.

 Andrei

That would be a very useful asset.

Jan 10 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 02:34, Adam D. Ruppe wrote:

 Some code differences from the old days:

 * before: converting to and from string was in std.string. Functions
 like toInt, toString, etc. Nowadays, this is all done with std.conv.to.
 The new way is way cool, but a newbie's first place to look might be for
 std.string.toString rather than std.conv.to!string.

I think it would be good to still have a few aliases, like toString and toInt.

 * before: some char type stuff was in std.string (and the rest in
 std.ctype IIRC). Now, it is in std.ascii and std.uni.

std.uni was available in D1 as well.

-- 
/Jacob Carlborg

Jan 10 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 02:18, Adam D. Ruppe wrote:
 BTW, I'll say it again: it was a *lot* easier to get started with this
 back in the phobos1 days, where std.string WAS the one-stop location for
 string stuff.

There was std.uni back in the D1 days as well ;)

-- 
/Jacob Carlborg

Jan 10 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

10-Jan-2014 05:16, Adam D. Ruppe wrote:
 On Friday, 10 January 2014 at 00:56:36 UTC, Manu wrote:
 So is it 'correct'?

 Yes, with the caveat that it might find a surrogate pair (like H
 followed by an accent code point). That's what byGrapheme is about:
 combining those pairs.

Not at all. Take time to read the Unicode standard.
Surrogate pairs are a part of UTF-16 encoding and little else.


-- 
Dmitry Olshansky

Jan 09 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

10-Jan-2014 11:49, Dmitry Olshansky wrote:
 10-Jan-2014 05:16, Adam D. Ruppe wrote:
 On Friday, 10 January 2014 at 00:56:36 UTC, Manu wrote:
 So is it 'correct'?

 Yes, with the caveat that it might find a surrogate pair (like H
 followed by an accent code point). That's what byGrapheme is about:
 combining those pairs.

 Not at all. Take time to read the Unicode standard.
 Surrogate pairs are a part of UTF-16 encoding and little else.

To clarify: a grapheme cluster is not a pair, nor is it a surrogate pair, 
but H with an accent is a grapheme cluster ;)
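
A small sketch of the distinction, assuming `std.uni.byGrapheme` and `std.range.walkLength` from current Phobos (the sample strings are illustrative, not from the thread):

```d
import std.range : walkLength;
import std.uni : byGrapheme;

void main()
{
    // 'H' followed by U+0301 COMBINING ACUTE ACCENT:
    // two code points that form a single grapheme cluster.
    string s = "H\u0301";
    assert(s.length == 3);                // UTF-8 code units
    assert(s.walkLength == 2);            // code points
    assert(s.byGrapheme.walkLength == 1); // grapheme clusters

    // A surrogate pair is a UTF-16 encoding detail: one code point
    // above U+FFFF stored as two 16-bit code units.
    wstring w = "\U0001F600"w;
    assert(w.length == 2);     // two UTF-16 code units
    assert(w.walkLength == 1); // still a single code point
}
```

So a search that works per code point can split the H from its accent, but it never splits a surrogate pair.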

-- 
Dmitry Olshansky

Jan 09 2014

"Brad Anderson" <eco gnuk.net> writes:

On Friday, 10 January 2014 at 00:56:36 UTC, Manu wrote:
 The D docs are pretty terrible, they don't do much to help you 
 find what
 you're looking for.
 You have a massive block of function names at the top of the 
 page, you have
 to carefully scan through one by one, hoping that it's named 
 something
 obvious that will stand out to you, and in the event it doesn't 
 have a
 helper function, you need to work out the proper sequence of
 algorithm/range/whatever operations to do what you want (and 
 then repeat
 the process finding the small parts you need across a bunch of 
 modules).

DDox improves on this a bit by giving a table with brief
descriptions right up top:
http://vibed.org/temp/dlang.org/library/std/string.html

Still plenty left to do though.

 Blah! </endrant>

Jan 09 2014

Manu <turkeyman gmail.com> writes:

On 10 January 2014 12:40, Brad Anderson <eco gnuk.net> wrote:

 On Friday, 10 January 2014 at 00:56:36 UTC, Manu wrote:

 The D docs are pretty terrible, they don't do much to help you find what
 you're looking for.
 You have a massive block of function names at the top of the page, you
 have

 to carefully scan through one by one, hoping that it's named something
 obvious that will stand out to you, and in the event it doesn't have a
 helper function, you need to work out the proper sequence of
 algorithm/range/whatever operations to do what you want (and then repeat
 the process finding the small parts you need across a bunch of modules).

 DDox improves on this a bit by giving a table with brief
 descriptions right up top:
 http://vibed.org/temp/dlang.org/library/std/string.html

 Still plenty left to do though.

I prefer this immeasurably.

Jan 09 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 03:40, Brad Anderson wrote:

 DDox improves on this a bit by giving a table with brief
 descriptions right up top:
 http://vibed.org/temp/dlang.org/library/std/string.html

What's the hold up of making the official documentation use DDox?

-- 
/Jacob Carlborg

Jan 10 2014

"Kira Backes" <kira.backes nrwsoft.de> writes:

On Friday, 10 January 2014 at 08:15:19 UTC, Jacob Carlborg wrote:
 What's the hold up of making the official documentation use 
 DDox?

I’m also interested in this, since the current documentation is 
not beginner-friendly due to missing overview and this hurts D.

Jan 10 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

10-Jan-2014 12:15, Jacob Carlborg wrote:
 On 2014-01-10 03:40, Brad Anderson wrote:

 DDox improves on this a bit by giving a table with brief
 descriptions right up top:
 http://vibed.org/temp/dlang.org/library/std/string.html

 What's the hold up of making the official documentation use DDox?

Seconded.

-- 
Dmitry Olshansky

Jan 10 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/10/14 12:17 AM, Dmitry Olshansky wrote:
 10-Jan-2014 12:15, Jacob Carlborg wrote:
 On 2014-01-10 03:40, Brad Anderson wrote:

 DDox improves on this a bit by giving a table with brief
 descriptions right up top:
 http://vibed.org/temp/dlang.org/library/std/string.html

 What's the hold up of making the official documentation use DDox?

 Seconded.

Let's set to switch to ddox with 2.065.

Andrei

Jan 10 2014

"Dicebot" <public dicebot.lv> writes:

On Friday, 10 January 2014 at 08:31:28 UTC, Andrei Alexandrescu 
wrote:
 On 1/10/14 12:17 AM, Dmitry Olshansky wrote:
 10-Jan-2014 12:15, Jacob Carlborg wrote:
 On 2014-01-10 03:40, Brad Anderson wrote:

 DDox improves on this a bit by giving a table with brief
 descriptions right up top:
 http://vibed.org/temp/dlang.org/library/std/string.html

 What's the hold up of making the official documentation use 
 DDox?

 Seconded.

 Let's set to switch to ddox with 2.065.

 Andrei

Let's not put too much stress on the release/deployment process (which 
changing the documentation engine will require) and make at least one 
"simple" release just to get everyone familiar with it.

Jan 10 2014

"Brad Anderson" <eco gnuk.net> writes:

On Friday, 10 January 2014 at 08:39:13 UTC, Dicebot wrote:
 On Friday, 10 January 2014 at 08:31:28 UTC, Andrei Alexandrescu 
 wrote:
 On 1/10/14 12:17 AM, Dmitry Olshansky wrote:
 10-Jan-2014 12:15, Jacob Carlborg wrote:
 On 2014-01-10 03:40, Brad Anderson wrote:

 DDox improves on this a bit by giving a table with brief
 descriptions right up top:
 http://vibed.org/temp/dlang.org/library/std/string.html

 What's the hold up of making the official documentation use 
 DDox?

 Seconded.

 Let's set to switch to ddox with 2.065.

 Andrei

 Let's not put too much stress on the release/deployment process 
 (which changing the documentation engine will require) and make at 
 least one "simple" release just to get everyone familiar with 
 it.

Updating the website is almost strictly Andrei's domain so he 
should be able to do it independently of the release process 
(though integrating updating the website with the release process 
should probably happen at some point).  ddox was merged with the 
tools repo 6 months ago and dlang.org 3 months ago so as far as I 
know all that's left is for Andrei to generate the pages and 
upload them as 2.065 is completed.

Jan 10 2014

"Dicebot" <public dicebot.lv> writes:

On Friday, 10 January 2014 at 16:54:30 UTC, Brad Anderson wrote:
 Updating the website is almost strictly Andrei's domain so he 
 should be able to do it independently of the release process 
 (though integrating updating the website with the release 
 process should probably happen at some point).  ddox was merged 
 with the tools repo 6 months ago and dlang.org 3 months ago so 
 as far as I know all that's left is for Andrei to generate the 
 pages and upload them as 2.065 is completed.

Andrew should have access too as any release-related updates are 
supposed to be moved into domain of release manager.

Jan 12 2014

Jerry <jlquinn optonline.net> writes:

"Brad Anderson" <eco gnuk.net> writes:

 On Friday, 10 January 2014 at 00:56:36 UTC, Manu wrote:
 The D docs are pretty terrible, they don't do much to help you find what
 you're looking for.
 You have a massive block of function names at the top of the page, you have
 to carefully scan through one by one, hoping that it's named something
 obvious that will stand out to you, and in the event it doesn't have a
 helper function, you need to work out the proper sequence of
 algorithm/range/whatever operations to do what you want (and then repeat
 the process finding the small parts you need across a bunch of modules).

 DDox improves on this a bit by giving a table with brief
 descriptions right up top:
 http://vibed.org/temp/dlang.org/library/std/string.html

 Still plenty left to do though.

This looks much nicer as a summary.  I would personally prefer to have
the details all on the same page below, rather than having to jump to a
new page for each different function.

Still, thumbs up!

Jerry

Jan 17 2014

"Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:

On Friday, 10 January 2014 at 00:56:36 UTC, Manu wrote:
 The D docs are pretty terrible, they don't do much to help you 
 find what
 you're looking for.
 You have a massive block of function names at the top of the 
 page, you have
 to carefully scan through one by one, hoping that it's named 
 something
 obvious that will stand out to you, and in the event it doesn't 
 have a
 helper function, you need to work out the proper sequence of
 algorithm/range/whatever operations to do what you want (and 
 then repeat
 the process finding the small parts you need across a bunch of 
 modules).

I find this to be true in other languages, except the "block of 
function names."


Google it and find some StackOverflow page with the answer.

In Java, I Google it and find a Java API page (this was mostly 
before StackOverflow took over).

In D, I have a general idea of where I need to be. Maybe there 
are a couple of modules to look at. Searching isn't as effective, 
there just aren't enough arbitrary tutorials on how to do the 
most basic of things to be able to find those basic things.


need isn't fun. But it can be a little better if you know which 
class you need.

Jan 09 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 03:40, Jesse Phillips wrote:


 isn't fun. But it can be a little better if you know which class you need.

It's easier in a more object oriented language. It's most likely that 


-- 
/Jacob Carlborg

Jan 10 2014

"Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:

On Friday, 10 January 2014 at 00:56:36 UTC, Manu wrote:
 On 10 January 2014 04:00, Adam D. Ruppe 
 <destructionator gmail.com> wrote:

 On Thursday, 9 January 2014 at 17:54:05 UTC, Dicebot wrote:

 It is not the same thing as sample with byGrapheme though.

 Right, but it works for ascii (and others) and shows 
 std.string isn't as
 weak as being said in this thread.

 So is it 'correct'?

It is interesting that you ask this about the D code but not about 
the C function you're trying to mimic, which is not correct.

Jan 09 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Jan 10, 2014 at 10:56:27AM +1000, Manu wrote:
[...]
 The D docs are pretty terrible, they don't do much to help you find
 what you're looking for.
 You have a massive block of function names at the top of the page,

Yeah, that blob of links is useless unless you already knew what you
were looking for (kinda defeats the purpose).

The hand-classified table of functions in std.algorithm and std.range is
more useful, IMO. At least it lets you use divide-and-conquer to zoom
down to your area of interest, whereas the order of links in the blob of
links has no relation whatsoever to the functionality provided.

The order of docs for each symbol also follows the order in the source
code, which may not necessarily follow a logical order. This makes
browsing the docs difficult -- one minute it's describing find()
overloads, next minute it's talking about set unions, then after that
it's back to findAfter(), then it jumps to remove(), etc.. Try finding
what you want when the docs are 50 pages of this random jumping around.
All the more this makes a hand-classified table of symbols
indispensable.


 you have to carefully scan through one by one, hoping that it's named
 something obvious that will stand out to you, and in the event it
 doesn't have a helper function, you need to work out the proper
 sequence of algorithm/range/whatever operations to do what you want
 (and then repeat the process finding the small parts you need across a
 bunch of modules).

[...]

I will say, though, that taking the time to learn where things are in
std.algorithm, std.range, std.string, and std.array helps immensely in
knowing where to look for things in the future. This doesn't excuse the
poor state of the docs, of course, nor the non-intuitive placement of
some of the functions in Phobos, but you're likely to feel far less
frustrated if you took the time to familiarize yourself with where
things are. :)

I usually don't have too much trouble finding what I need when it comes
to string manipulation. But then again, when I fail to find something
within 15 seconds of looking at the obvious places, I just import
std.regex and proceed to crush the proverbial ant with the proverbial
nuclear warhead. :-P


T

-- 
It's bad luck to be superstitious. -- YHL

Jan 09 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 02:13, H. S. Teoh wrote:

 The hand-classified table of functions in std.algorithm and std.range is
 more useful, IMO. At least it lets you use divide-and-conquer to zoom
 down to your area of interest, whereas the order of links in the blob of
 links has no relation whatsoever to the functionality provided.

I'm convinced that both of the tables in the std.algorithm 
documentation can be generated automatically with a bit of help from the 
compiler. Add a macro, $(CATEGORY), that the compiler knows about. The 
compiler will then generate the first table by using the symbol (which it 
already knows about) and the $(CATEGORY) macro. The second table can be 
generated in a similar way: just take the summary (first paragraph) of 
the documentation of the symbol.

 The order of docs for each symbol also follows the order in the source
 code, which may not necessarily follow a logical order. This makes
 browsing the docs difficult -- one minute it's describing find()
 overloads, next minute it's talking about set unions, then after that
 it's back to findAfter(), then it jumps to remove(), etc.. Try finding
 what you want when the docs are 50 pages of this random jumping around.
 All the more this makes a hand-classified table of symbols
 indispensable.

I would say that is poorly organized code. Although, if you do have a 
$(CATEGORY) macro, as described above, it might be a good idea to group 
the rest of the documentation after this as well.

-- 
/Jacob Carlborg

Jan 10 2014

"Brad Anderson" <eco gnuk.net> writes:

On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:
 This works fine:
   string x = find("Hello", 'H');

 This doesn't:
   string y = find(retro("Hello"), 'H');
   > Error: cannot implicitly convert expression 
 (find(retro("Hello"), 'H'))
 of type Result!() to string

 Is that wrong? That seems to be how the docs suggest it should 
 be used.

 On a side note, am I the only one that finds 
 std.algorithm/std.range/etc
 for string processing really obtuse?
 I can rarely understand the error messages, so say it's better 
 than STL is
 optimistic.

I absolutely hate the "does not match any template declaration" 
error. It's extremely unhelpful for figuring out what you need to 
do and anytime I try to do something fun with ranges I can expect 
to see it a dozen times.
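
For reference, a minimal sketch of why the quoted `retro` call fails and one way to get a usable position instead (the needle `'l'` is an assumption for illustration; the thread used `'H'`):

```d
import std.algorithm : equal, find;
import std.range : retro;
import std.string : lastIndexOf;

void main()
{
    // find returns the same type it was given, so a string in
    // means a string out.
    string x = find("Hello", 'l');
    assert(x == "llo");

    // retro wraps the string in a range type, and find hands that
    // wrapper type back; it does not convert implicitly to string.
    auto r = find(retro("Hello"), 'l');
    static assert(!is(typeof(r) == string));
    assert(equal(r, "lleH"));

    // lastIndexOf gives an offset that slices the original string.
    auto idx = lastIndexOf("Hello", 'l');
    assert(idx == 3);
    assert("Hello"[idx .. $] == "lo");
}
```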

 Using std.algorithm and std.range to do string manipulation 
 feels really
 lame to me.
 I hate looking through the docs of 3-4 modules to understand 
 the complete
 set of useful string operations (std.string, std.uni, 
 std.algorithm,
 std.range... at least).

I've finally started to get the hang of what stuff is in what 
module but it's taken me a couple years.  Things like File being 
in std.stdio instead of the more intuitive std.file are confusing 
enough but with strings you end up having to look in std.string, 
std.array, std.algorithm, std.range, std.format, and std.uni (and 
there are probably more than that).

 I also find the names of the generic algorithms are often 
 unrelated to the
 name of the string operation.
 My feeling is, everyone is always on about how cool D is at 
 string, but
 other than 'char[]', and the builtin slice operator, I feel 
 really
 unproductive whenever I do any heavy string manipulation in D.

I actually feel a lot more productive in D than in C++ with 
strings. Boost's string algorithms library helps fill the gap 
(and at least you only have one place to look for documentation 
when you are using it) but overall I prefer my experience working 
in D with pseudo-member chains.

Jan 09 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Jan 09, 2014 at 06:25:33PM +0000, Brad Anderson wrote:
 On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:

[...]
 On a side note, am I the only one that finds
 std.algorithm/std.range/etc for string processing really obtuse?  I
 can rarely understand the error messages, so say it's better than STL
 is optimistic.

 
 I absolutely hate the "does not match any template declaration"
 error. It's extremely unhelpful for figuring out what you need to do
 and anytime I try to do something fun with ranges I can expect to
 see it a dozen times.

Yeah, that error drives me up the wall too. I often get screenfuls of
errors, dumping 25 or so overloads of some obscure Phobos internal
function (like toImpl) as though an end-user would understand any of it.
You have to parse all the sig constraints (and boy some of them are
obscure), *understand* what they mean (which requires understanding how
Phobos works internally), and *then* try to figure out, by elimination,
which is the one that you intended to match, and why your code failed to
match it.

I'm almost tempted to say that using sig constraints to differentiate
between template overloads is a bad idea. Instead, consider this
alternative implementation of toImpl:

	template toImpl(S,T)
		// N.B.: no sig constraints here
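	{
		static if (... /* sig constraint conditions for overload #1 */)
		{
			S toImpl(T t)
			{
				// implementation here
			}
		}
		else static if (... /* sig constraint conditions for overload #2 */)
		{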
			S toImpl(T t)
			{
				// implementation here
			}
		}
		...
		else // N.B.: user-readable error message
		{
			static assert(0, "Unable to convert " ~
				T.stringof ~ " to " ~ S.stringof);
		}
	}

By putting all overloads inside a single template, we can give a useful
default message when no overloads match.
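
A compilable sketch of that single-template idea, using a hypothetical `convertTo` rather than the real Phobos `toImpl` (the two branches and their conditions are assumptions chosen only to keep the example small):

```d
import std.conv : to;
import std.traits : isIntegral, isSomeString;

// Hypothetical converter, not Phobos' toImpl: one template,
// overloads selected by static if, readable error at the end.
template convertTo(S, T)
{
    static if (isSomeString!S && isIntegral!T)
    {
        // integer -> string
        S convertTo(T t) { return to!S(t); }
    }
    else static if (isIntegral!S && isSomeString!T)
    {
        // string -> integer
        S convertTo(T t) { return to!S(t); }
    }
    else
    {
        static assert(0, "Unable to convert " ~
            T.stringof ~ " to " ~ S.stringof);
    }
}

void main()
{
    assert(convertTo!(string, int)(42) == "42");
    assert(convertTo!(int, string)("42") == 42);

    // An unsupported pair is rejected at compile time by the
    // static assert, not by a list of failed candidates.
    static assert(!__traits(compiles, convertTo!(int, double)(1.5)));
}
```

One trade-off of this form is that both type arguments have to be spelled out at the call site, since the function is nested inside the template.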

Alternatively, maybe sig constraints can have an additional string
parameter that specifies a message that explains why that particular
overload was rejected. These messages are not displayed if at least one
overload matches; only if no overload matches, they will be displayed
(so that the user can at least see why each of the overloads didn't
match).


[...]
 I also find the names of the generic algorithms are often unrelated
 to the name of the string operation.  My feeling is, everyone is
 always on about how cool D is at string, but other than 'char[]', and
 the builtin slice operator, I feel really unproductive whenever I do
 any heavy string manipulation in D.


Really?? I find myself much more productive, because I only have to
learn one set of generic algorithms, and I can use them not just for
strings but for all sorts of other stuff that implement the range API.
Whereas in languages like C, sure you get familiar with string-specific
functions, but then when you need a similar-operating function for an
array of ints, you have to name it something else, and then basically
the same algorithm reimplemented for linked lists, called by yet another
name, etc.. Added together, it's many times more mental load than just
learning a single set of generic algorithms that work on (almost)
everything.

The composability of generic algorithms also allow me to think on a more
abstract level -- instead of thinking about manipulating individual
chars, I can figure out OK, if I split the string by "," then I can
filter for the strings I'm looking for, then join them back again with
another delimiter. Since the same set of algorithms work with other
ranges too, I can apply exactly the same thought process for working
with arrays, linked lists, and other containers, without having to
remember 5 different names of essentially the same algorithm but applied
to 5 different types.
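
The split/filter/join pipeline described above might look like this; the input data and the predicate are made up for illustration:

```d
import std.algorithm : filter, splitter, startsWith;
import std.array : array, join;

void main()
{
    // Split on ",", keep the entries of interest, join with ";".
    auto picked = "apple,banana,avocado,cherry"
        .splitter(",")
        .filter!(s => s.startsWith("a"))
        .join(";");
    assert(picked == "apple;avocado");

    // The same filter works unchanged on an array of ints.
    auto evens = [1, 2, 3, 4, 5, 6]
        .filter!(n => n % 2 == 0)
        .array;
    assert(evens == [2, 4, 6]);
}
```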


 I actually feel a lot more productive in D than in C++ with strings.
 Boost's string algorithms library helps fill the gap (and at least
 you only have one place to look for documentation when you are using
 it) but overall I prefer my experience working in D with
 pseudo-member chains.

I found that what I got out of taking the time to learn std.algorithm
and std.range was worth far more than the effort invested.


T

-- 
Claiming that your operating system is the best in the world because more
people use it is like saying McDonalds makes the best food in the world. --
Carl B. Constantine

Jan 09 2014

"Brad Anderson" <eco gnuk.net> writes:

On Thursday, 9 January 2014 at 20:40:33 UTC, H. S. Teoh wrote:
 On Thu, Jan 09, 2014 at 06:25:33PM +0000, Brad Anderson wrote:
 On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:

 [...]
 On a side note, am I the only one that finds
 std.algorithm/std.range/etc for string processing really 
 obtuse?  I
 can rarely understand the error messages, so say it's better 
 than STL
 is optimistic.

 
 I absolutely hate the "does not match any template declaration"
 error. It's extremely unhelpful for figuring out what you need 
 to do
 and anytime I try to do something fun with ranges I can expect 
 to
 see it a dozen times.

 Yeah, that error drives me up the wall too. I often get 
 screenfuls of
 errors, dumping 25 or so overloads of some obscure Phobos 
 internal
 function (like toImpl) as though an end-user would understand 
 any of it.
 You have to parse all the sig constraints (and boy some of them 
 are
 obscure), *understand* what they mean (which requires 
 understanding how
 Phobos works internally), and *then* try to figure out, by 
 elimination,
 which is the one that you intended to match, and why your code 
 failed to
 match it.

 I'm almost tempted to say that using sig constraints to 
 differentiate
 between template overloads is a bad idea. Instead, consider this
 alternative implementation of toImpl:

 	template toImpl(S,T)
 		// N.B.: no sig constraints here
 	{
 		static if (... /* sig constraint conditions for overload #1 */)
 		{
 			S toImpl(T t)
 			{
 				// implementation here
 			}
 		}
 		else static if (... /* sig constraint conditions for overload #2 */)
 		{
 			S toImpl(T t)
 			{
 				// implementation here
 			}
 		}
 		...
 		else // N.B.: user-readable error message
 		{
 			static assert(0, "Unable to convert " ~
 				T.stringof ~ " to " ~ S.stringof);
 		}
 	}

 By putting all overloads inside a single template, we can give 
 a useful
 default message when no overloads match.

Interesting and there is a lot of flexibility there. It does make 
the functions a lot more verbose though for something that is 
really the compiler's job (clearly describing errors).

 Alternatively, maybe sig constraints can have an additional 
 string
 parameter that specifies a message that explains why that 
 particular
 overload was rejected. These messages are not displayed if at 
 least one
 overload matches; only if no overload matches, they will be 
 displayed
 (so that the user can at least see why each of the overloads 
 didn't
 match).

Each constraint would have a string? I think that would help for 
some of the more obscure constraints that aren't wrapped up in an 
eponymous template helper but I don't think it'd help with the 
problem generally because the problem is identifying which exact 
constraint failed.

Example:

     void main()
     {
       import std.algorithm, std.range;
       struct A { }
       auto a = recurrence!"n"(0).take(5).find(A());
     }

This is the error message you get:

---
/d14/f101.d(5): Error: template std.algorithm.find does not match 
any function template declaration. Candidates are:
/opt/compilers/dmd2/include/std/algorithm.d(3650):        
std.algorithm.find(alias pred = "a == b", R, E)(R haystack, E 
needle) if (isInputRange!R && 
is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
/opt/compilers/dmd2/include/std/algorithm.d(3713):        
std.algorithm.find(alias pred = "a == b", R1, R2)(R1 haystack, R2 
needle) if (isForwardRange!R1 && isForwardRange!R2 && 
is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool) 
&& !isRandomAccessRange!R1)
/opt/compilers/dmd2/include/std/algorithm.d(3749):        
std.algorithm.find(alias pred = "a == b", R1, R2)(R1 haystack, R2 
needle) if (isRandomAccessRange!R1 && isBidirectionalRange!R2 && 
is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool))
/opt/compilers/dmd2/include/std/algorithm.d(3821):        
std.algorithm.find(alias pred = "a == b", R1, R2)(R1 haystack, R2 
needle) if (isRandomAccessRange!R1 && isForwardRange!R2 && 
!isBidirectionalRange!R2 && 
is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool))
/opt/compilers/dmd2/include/std/algorithm.d(4053):        
std.algorithm.find(alias pred = "a == b", Range, Ranges...)(Range 
haystack, Ranges needles) if (Ranges.length > 1 && 
is(typeof(startsWith!pred(haystack, needles))))
---

Where do you even begin with that flood of information? To fix it 
all you really want to see is which constraint you didn't 
satisfy. An error message like this would help greatly:

---
/d539/f571.d(5): Error: template std.algorithm.find call fails 
all constraints. Candidates are:
/opt/compilers/dmd2/include/std/algorithm.d:
   (3650) find(alias pred = "a == b", R, E)(R haystack, E needle):
               isInputRange!R
            && is(typeof(binaryFun!pred(haystack.front, needle)) : 
bool) <- FAILS
   (3713) find(alias pred = "a == b", R1, R2)(R1 haystack, R2 
needle):
               isForwardRange!R1
            && isForwardRange!R2 <- FAILS
            && is(typeof(binaryFun!pred(haystack.front, 
needle.front)) : bool)
            && !isRandomAccessRange!R1
   (3749) find(alias pred = "a == b", R1, R2)(R1 haystack, R2 
needle):
               isRandomAccessRange!R1 <- FAILS
            && isBidirectionalRange!R2
            && is(typeof(binaryFun!pred(haystack.front, 
needle.front)) : bool)
   (3821) find(alias pred = "a == b", R1, R2)(R1 haystack, R2 
needle)
               isRandomAccessRange!R1 <- FAILS
            && isForwardRange!R2
            && !isBidirectionalRange!R2
            && is(typeof(binaryFun!pred(haystack.front, 
needle.front)) : bool)
   (4053) find(alias pred = "a == b", Range, Ranges...)(Range 
haystack, Ranges needles)
               Ranges.length > 1 <-- FAILS
            && is(typeof(startsWith!pred(haystack, needles)))

---

The NG line limit will probably mangle that and I'm assuming 
constraints are short-circuited. The exact appearance isn't as 
important as just pointing out the failing constraints as 
strongly as you can.
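
Until the compiler can point at the failing constraint, one workaround is to evaluate the constraints by hand with static assert. This is only a sketch for the example above: checkConstraints is a made-up name, and the checks are copied from the first two find overloads listed in the error message.

```d
import std.algorithm;
import std.functional : binaryFun;
import std.range;

struct A { }

// Hypothetical helper: re-checks by hand the constraints that the
// compiler only dumps wholesale.
bool checkConstraints()
{
    auto r = recurrence!"n"(0).take(5);
    alias R = typeof(r);

    // Overload at (3650): the haystack is fine ...
    static assert(isInputRange!R);
    // ... but an int element cannot be compared with an A.
    static assert(!is(typeof(binaryFun!"a == b"(r.front, A())) : bool));

    // Overload at (3713): A is not a range at all.
    static assert(!isForwardRange!A);

    // Searching for an int satisfies the first overload.
    static assert(is(typeof(r.find(3))));
    return true;
}

void main()
{
    assert(checkConstraints());
}
```

Each static assert that compiles confirms one line of the hand analysis, so the constraint you actually tripped over is the one you have to negate to get the file to build.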

 [...]
I also find the names of the generic algorithms are often 
unrelated
to the name of the string operation.  My feeling is, everyone 
is
always on about how cool D is at string, but other than 
'char[]', and
the builtin slice operator, I feel really unproductive 
whenever I do
any heavy string manipulation in D.


 Really?? I find myself much more productive, because I only 
 have to
 learn one set of generic algorithms, and I can use them not 
 just for
 strings but for all sorts of other stuff that implement the 
 range API.
 Whereas in languages like C, sure you get familiar with 
 string-specific
 functions, but then when you need a similar-operating function 
 for an
 array of ints, you have to name it something else, and then 
 basically
 the same algorithm reimplemented for linked lists, called by 
 yet another
 name, etc.. Added together, it's many times more mental load 
 than just
 learning a single set of generic algorithms that work on 
 (almost)
 everything.

 The composability of generic algorithms also allow me to think 
 on a more
 abstract level -- instead of thinking about manipulating 
 individual
 chars, I can figure out OK, if I split the string by "," then I 
 can
 filter for the strings I'm looking for, then join them back 
 again with
 another delimiter. Since the same set of algorithms work with 
 other
 ranges too, I can apply exactly the same thought process for 
 working
 with arrays, linked lists, and other containers, without having 
 to
 remember 5 different names of essentially the same algorithm 
 but applied
 to 5 different types.
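
 A minimal sketch of the split/filter/join chain described above, assuming current Phobos module names. keepPrefixed and evens are made-up helper names; the point is only that the same filter call serves both a string pipeline and an int[].

```d
import std.algorithm : equal, filter, splitter, startsWith;
import std.array : array, join;

// Hypothetical helper: keep the comma-separated fields that start with
// `prefix` and glue them back together with ';'.
auto keepPrefixed(string line, string prefix)
{
    return line.splitter(",")
               .filter!(w => w.startsWith(prefix))
               .join(";");
}

// The same filter algorithm applied to an array of ints.
int[] evens(int[] xs)
{
    return xs.filter!(n => n % 2 == 0).array;
}

void main()
{
    assert(keepPrefixed("apple,banana,avocado,cherry", "a") == "apple;avocado");
    assert(evens([1, 2, 3, 4, 5, 6]).equal([2, 4, 6]));
}
```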


 I actually feel a lot more productive in D than in C++ with 
 strings.
 Boost's string algorithms library helps fill the gap (and at 
 least
 you only have one place to look for documentation when you are 
 using
 it) but overall I prefer my experience working in D with
 pseudo-member chains.

 I found that what I got out of taking the time to learn 
 std.algorithm
 and std.range was worth far more than the effort invested.

Agreed. Except for some hiccups and those terrible error messages 
I find std.algorithm and std.range to be a work of genius. I envy 
them every day while I'm stuck using C++ at work.

 T

Jan 09 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Jan 09, 2014 at 11:28:07PM +0000, Brad Anderson wrote:
 On Thursday, 9 January 2014 at 20:40:33 UTC, H. S. Teoh wrote:
On Thu, Jan 09, 2014 at 06:25:33PM +0000, Brad Anderson wrote:
On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:

[...]
On a side note, am I the only one that finds
std.algorithm/std.range/etc for string processing really obtuse?  I
can rarely understand the error messages, so to say it's better than
STL is optimistic.

I absolutely hate the "does not match any template declaration"
error. It's extremely unhelpful for figuring out what you need to do
and anytime I try to do something fun with ranges I can expect to
see it a dozen times.

Yeah, that error drives me up the wall too. I often get screenfuls of
errors, dumping 25 or so overloads of some obscure Phobos internal
function (like toImpl) as though an end-user would understand any of
it.  You have to parse all the sig constraints (and boy some of them
are obscure), *understand* what they mean (which requires
understanding how Phobos works internally), and *then* try to figure
out, by elimination, which is the one that you intended to match, and
why your code failed to match it.

I'm almost tempted to say that using sig constraints to differentiate
between template overloads is a bad idea. Instead, consider this
alternative implementation of toImpl:

	template toImpl(S,T)
		// N.B.: no sig constraints here
	{
		static if (... /* sig constraint conditions for overload #1 */)
		{
			S toImpl(T t)
			{
				// implementation here
			}
		}
		else static if (... /* sig constraint conditions for overload #2 */)
		{
			S toImpl(T t)
			{
				// implementation here
			}
		}
		...
		else // N.B.: user-readable error message
		{
			static assert(0, "Unable to convert " ~
				T.stringof ~ " to " ~ S.stringof);
		}
	}

By putting all overloads inside a single template, we can give a
useful default message when no overloads match.

 
 Interesting and there is a lot of flexibility there. It does make
 the functions a lot more verbose though for something that is really
 the compiler's job (clearly describing errors).

The way I see it, is that any sig constraints should go on the outer
template, and should define the *logical* scope of all overloads
encompassed therein. E.g., if you have a set of overloads for sqrt, say,
then the outer template would have a sig constraint that matches any
number-like type. Within the template, each individual overload would
handle various concrete types, and the static assert at the end is
essentially saying "in theory your arguments should match *something* in
this template, but currently your particular combination of types isn't
implemented by any overload".

Or, put another way, the outer template represents the "logical"
meta-function that does some given task (e.g., sqrt computes the square
root of *something*), whereas the inner overloads provide the actual set
of available implementations that implement that meta-function (computes
the square root of an int, computes the square root of a float, etc.).
My hypothesis is that you get the wall-of-template-errors problem when
there's a logical meta-function (or a small number of them) that, for
implementational reasons, consists of a large set of overloads. By
treating the logical meta-function as an actual entity (the outer
template), we can give a unified error message of failure to implement
the meta-function for the requested types, rather than many error
messages for each of the many overloads, most of which are irrelevant to
the user.
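
To make the sqrt example concrete, here is a rough sketch. mySqrt is a made-up name, not anything in Phobos, and the choice of which types get an implementation is arbitrary. The outer constraint states the logical scope (any numeric type); the static if chain lists what is actually implemented.

```d
import std.math : sqrt;
import std.traits : isFloatingPoint, isIntegral, isNumeric;

// Hypothetical meta-function: "square root of something number-like".
auto mySqrt(T)(T x)
    if (isNumeric!T) // logical scope of the whole overload set
{
    static if (isFloatingPoint!T)
    {
        return sqrt(x);
    }
    else static if (isIntegral!T && T.sizeof <= 4)
    {
        // Small integers convert to double without loss.
        return sqrt(cast(double) x);
    }
    else
    {
        // In scope, but nobody has implemented it (e.g. long).
        static assert(0, "mySqrt is not implemented for " ~ T.stringof);
    }
}

void main()
{
    assert(mySqrt(4.0) == 2.0);
    assert(mySqrt(9) == 3.0);
    // A string is outside the logical scope, so the constraint rejects it.
    static assert(!__traits(compiles, mySqrt("text")));
}
```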


Alternatively, maybe sig constraints can have an additional string
parameter that specifies a message that explains why that particular
overload was rejected. These messages are not displayed if at least
one overload matches; only if no overload matches, they will be
displayed (so that the user can at least see why each of the
overloads didn't match).

 
 Each constraint would have a string? I think that would help for
 some of the more obscure constraints that aren't wrapped up in an
 eponymous template helper but I don't think it'd help with the
 problem generally because the problem is identifying which exact
 constraint failed.

True.


 Example:
 
     void main()
     {
       import std.algorithm, std.range;
       struct A { }
       auto a = recurrence!"n"(0).take(5).find(A());
     }
 
 This is the error message you get:
 
 ---
 /d14/f101.d(5): Error: template std.algorithm.find does not match
 any function template declaration. Candidates are:
 /opt/compilers/dmd2/include/std/algorithm.d(3650):
 std.algorithm.find(alias pred = "a == b", R, E)(R haystack, E
 needle) if (isInputRange!R &&
 is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
 /opt/compilers/dmd2/include/std/algorithm.d(3713):
 std.algorithm.find(alias pred = "a == b", R1, R2)(R1 haystack, R2
 needle) if (isForwardRange!R1 && isForwardRange!R2 &&
 is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool) &&
 !isRandomAccessRange!R1)
 /opt/compilers/dmd2/include/std/algorithm.d(3749):
 std.algorithm.find(alias pred = "a == b", R1, R2)(R1 haystack, R2
 needle) if (isRandomAccessRange!R1 && isBidirectionalRange!R2 &&
 is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool))
 /opt/compilers/dmd2/include/std/algorithm.d(3821):
 std.algorithm.find(alias pred = "a == b", R1, R2)(R1 haystack, R2
 needle) if (isRandomAccessRange!R1 && isForwardRange!R2 &&
 !isBidirectionalRange!R2 && is(typeof(binaryFun!pred(haystack.front,
 needle.front)) : bool))
 /opt/compilers/dmd2/include/std/algorithm.d(4053):
 std.algorithm.find(alias pred = "a == b", Range, Ranges...)(Range
 haystack, Ranges needles) if (Ranges.length > 1 &&
 is(typeof(startsWith!pred(haystack, needles))))
 ---
 
 Where do you even begin with that flood of information? To fix it
 all you really want to see is which constraint you didn't satisfy.
 An error message like this would help greatly:
 
 ---
 /d539/f571.d(5): Error: template std.algorithm.find call fails all
 constraints. Candidates are:
 /opt/compilers/dmd2/include/std/algorithm.d:
   (3650) find(alias pred = "a == b", R, E)(R haystack, E needle):
               isInputRange!R
            && is(typeof(binaryFun!pred(haystack.front, needle)) :
 bool) <- FAILS
   (3713) find(alias pred = "a == b", R1, R2)(R1 haystack, R2
 needle):
               isForwardRange!R1
            && isForwardRange!R2 <- FAILS
            && is(typeof(binaryFun!pred(haystack.front,
 needle.front)) : bool)
            && !isRandomAccessRange!R1
   (3749) find(alias pred = "a == b", R1, R2)(R1 haystack, R2
 needle):
               isRandomAccessRange!R1 <- FAILS
            && isBidirectionalRange!R2
            && is(typeof(binaryFun!pred(haystack.front,
 needle.front)) : bool)
   (3821) find(alias pred = "a == b", R1, R2)(R1 haystack, R2 needle)
               isRandomAccessRange!R1 <- FAILS
            && isForwardRange!R2
            && !isBidirectionalRange!R2
            && is(typeof(binaryFun!pred(haystack.front,
 needle.front)) : bool)
   (4053) find(alias pred = "a == b", Range, Ranges...)(Range
 haystack, Ranges needles)
               Ranges.length > 1 <-- FAILS
            && is(typeof(startsWith!pred(haystack, needles)))

But still, this will dump out a whole bunch of overloads that aren't
necessarily interesting to the user. I mean, if I want to search for an
int in an int[], but accidentally passed a string instead of an int,
then I'm really only interested in seeing how the overload that handles
int[] searching failed to match my string argument; I don't care about
why the sig constraints failed for the overload that handles linked
lists, for example.

Perhaps a better solution lies in distinguishing the logical scope of
the function, vs. requirements on its argument types within that scope.
For example, the find() overload that searches T[] for some T, has T[]
as its scope, whereas within this scope, it imposes certain requirements
on the needle U (e.g., U must be comparable with an element of T). This
suggests that it should be implemented like this:

	auto find(R,S)(R haystack, S needle)
		if (is(R == T[], T)) // <-- defines the scope of this function
	{
		// The static ifs below define the type requirements within this
		// function's scope. isComparable stands for a trait template
		// that is not spelled out here.
		static if (isComparable!(S, ElementType!R))
		{
			// implementation here
		}
		else static if (isComparable!(ElementType!S, ElementType!R)) // <-- ditto
		{
			// implementation here
		}
		else
			static assert(0, "Don't know how to search for "
				~ S.stringof ~ " in " ~ R.stringof);
	}

Then when you try to search for a string in an int[], for example, it
will first match this overload of find(), then fail the static if
conditions because the needle you passed in doesn't match the type
requirements for searching an int[].

Note that I've grouped at least two of the current find() overloads
under a single overload above -- because they are just two
implementations for handling two cases within the same scope: searching
an array. The fact that array-searching is implemented by two distinct
algorithms is irrelevant to the user, and so it makes sense to "hide"
them inside a single function's body.

So to summarize:
(1) use sig constraints to define the scope of an overload; and
(2) use static if inside the function body (or template body) to enforce
type requirements within that scope.

This solves the problem of needing the compiler to somehow read your
mind and figure out exactly which of the 56 overloads of find() you
intended to match but failed to.
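
A runnable toy version of points (1) and (2), under a few assumptions: findIn and isComparable are made-up names (Phobos has no isComparable trait), and only the element-search case is implemented.

```d
import std.range : ElementType;

// Hypothetical trait: can an A be compared with a B using ==?
enum isComparable(A, B) = is(typeof(A.init == B.init) : bool);

auto findIn(R, S)(R haystack, S needle)
    if (is(R == T[], T)) // (1) scope: any array
{
    static if (isComparable!(ElementType!R, S)) // (2) requirement in scope
    {
        foreach (i, e; haystack)
            if (e == needle)
                return haystack[i .. $];
        return haystack[$ .. $]; // not found: empty tail
    }
    else
        static assert(0, "Don't know how to search for "
            ~ S.stringof ~ " in " ~ R.stringof);
}

void main()
{
    int[] data = [1, 2, 3, 4];
    assert(findIn(data, 3) == [3, 4]);
    assert(findIn(data, 9).length == 0);
    // findIn(data, "three") is rejected at compile time with the message
    // "Don't know how to search for string in int[]".
}
```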


T

-- 
The only difference between male factor and malefactor is just a little
emptiness inside.

Jan 09 2014

"Brad Anderson" <eco gnuk.net> writes:

On Friday, 10 January 2014 at 00:52:27 UTC, H. S. Teoh wrote:
 <snip>

 So to summarize:
 (1) use sig constraints to define the scope of an overload; and
 (2) use static if inside the function body (or template body) 
 to enforce
 type requirements within that scope.

 This solves the problem of needing the compiler to somehow read 
 your
 mind and figure out exactly which of the 56 overloads of find() 
 you
 intended to match but failed to.


 T

Ok, you've convinced me. I still think highlighting which 
constraints failed should happen but for well implemented modules 
like those in the standard library your approach offers even more 
helpful and tight error messages.

Jan 09 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 01/10/2014 02:19 AM, Brad Anderson wrote:
 On Friday, 10 January 2014 at 00:52:27 UTC, H. S. Teoh wrote:
 <snip>

 So to summarize:
 (1) use sig constraints to define the scope of an overload; and
 (2) use static if inside the function body (or template body) to enforce
 type requirements within that scope.

 This solves the problem of needing the compiler to somehow read your
 mind and figure out exactly which of the 56 overloads of find() you
 intended to match but failed to.


 T

 Ok, you've convinced me. I still think highlighting which constraints
 failed should happen but for well implemented modules like those in the
 standard library your approach offers even more helpful and tight error
 messages.

static assert is not a good way to implement custom error messages 
because it also changes the behaviour of the declaration.
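
One way to see the difference, as a sketch with made-up names (viaConstraint, viaAssert): a template that uses static assert still accepts every type during overload resolution, so it competes with other overloads instead of stepping aside.

```d
import std.traits : isIntegral;

// Constraint style: each overload declines what it cannot handle.
string viaConstraint(T)(T x) if (isIntegral!T) { return "integral"; }
string viaConstraint(T)(T x) if (is(T == string)) { return "string"; }

// static assert style: this template matches every T ...
string viaAssert(T)(T x)
{
    static if (isIntegral!T)
        return "integral";
    else
        static assert(0, "viaAssert: unsupported type " ~ T.stringof);
}

// ... so an additional overload for string cannot take over.
string viaAssert(T)(T x) if (is(T == string)) { return "string"; }

void main()
{
    assert(viaConstraint(1) == "integral");
    assert(viaConstraint("hi") == "string");
    assert(viaAssert(1) == "integral");
    // The string call does not compile: both viaAssert templates match.
    static assert(!__traits(compiles, viaAssert("hi")));
}
```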

Jan 09 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Jan 10, 2014 at 04:03:53AM +0100, Timon Gehr wrote:
 On 01/10/2014 02:19 AM, Brad Anderson wrote:
On Friday, 10 January 2014 at 00:52:27 UTC, H. S. Teoh wrote:
<snip>

So to summarize:
(1) use sig constraints to define the scope of an overload; and
(2) use static if inside the function body (or template body) to
enforce type requirements within that scope.

This solves the problem of needing the compiler to somehow read your
mind and figure out exactly which of the 56 overloads of find() you
intended to match but failed to.


T

Ok, you've convinced me. I still think highlighting which constraints
failed should happen but for well implemented modules like those in
the standard library your approach offers even more helpful and tight
error messages.

 
 static assert is not a good way to implement custom error messages
 because it also changes the behaviour of the declaration.

It's not just about custom error messages; it's about picking up a
particular template signature even when you don't have an implementation
for it, because logically speaking, your set of overloads *should* pick
up all such instantiations to begin with. This is why I differentiated
between the scope of a template, vs. the actual available overloads.

With sig constraints, you're declining instantiations that don't satisfy
certain conditions; I'm arguing that sometimes you *want* to accept
instantiations that you currently don't implement (yet), because it
falls under the logical scope of what you intend to handle; you just
haven't gotten around to actually implementing it yet.


T

-- 
Stop staring at me like that! It's offens... no, you'll hurt your eyes!

Jan 10 2014

"QAston" <qaston gmail.com> writes:

Your proposal is awesome, this should be in phobos style guide 
imo.

Jan 12 2014

"Jakob Ovrum" <jakobovrum gmail.com> writes:

On Saturday, 11 January 2014 at 05:45:41 UTC, H. S. Teoh wrote:
 It's not just about custom error messages; it's about picking 
 up a
 particular template signature even when you don't have an 
 implementation
 for it, because logically speaking, your set of overloads 
 *should* pick
 up all such instantiations to begin with. This is why I 
 differentiated
 between the scope of a template, vs. the actual available 
 overloads.

 With sig constraints, you're declining instantiations that 
 don't satisfy
 certain conditions; I'm arguing that sometimes you *want* to 
 accept
 instantiations that you currently don't implement (yet), 
 because it
 falls under the logical scope of what you intend to handle; you 
 just
 haven't gotten around to actually implementing it yet.

There is nothing stopping you from writing constraints that 
accept currently unimplemented instantiations. There is no 
difference here with the two approaches.

Jan 13 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-09 21:27, H. S. Teoh wrote:

 Yeah, that error drives me up the wall too. I often get screenfuls of
 errors, dumping 25 or so overloads of some obscure Phobos internal
 function (like toImpl) as though an end-user would understand any of it.
 You have to parse all the sig constraints (and boy some of them are
 obscure), *understand* what they mean (which requires understanding how
 Phobos works internally), and *then* try to figure out, by elimination,
 which is the one that you intended to match, and why your code failed to
 match it.

 I'm almost tempted to say that using sig constraints to differentiate
 between template overloads is a bad idea. Instead, consider this
 alternative implementation of toImpl:

 	template toImpl(S,T)
 		// N.B.: no sig constraints here
 	{
 		static if (... /* sig constraint conditions for overload #1 */)
 		{
 			S toImpl(T t)
 			{
 				// implementation here
 			}
 		}
 		else static if (... /* sig constraint conditions for overload #2 */)
 		{
 			S toImpl(T t)
 			{
 				// implementation here
 			}
 		}
 		...
 		else // N.B.: user-readable error message
 		{
 			static assert(0, "Unable to convert " ~
 				T.stringof ~ " to " ~ S.stringof);
 		}
 	}

 By putting all overloads inside a single template, we can give a useful
 default message when no overloads match.

If I recall correctly, Andrei has mentioned that something like the 
above doesn't work so well with __traits(compiles).

-- 
/Jacob Carlborg

Jan 10 2014

Manu <turkeyman gmail.com> writes:

On 10 January 2014 06:27, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:

 On Thu, Jan 09, 2014 at 06:25:33PM +0000, Brad Anderson wrote:
 On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:

 [...]
On a side note, am I the only one that finds
std.algorithm/std.range/etc for string processing really obtuse?  I
can rarely understand the error messages, so to say it's better than STL
is optimistic.

 I absolutely hate the "does not match any template declaration"
 error. It's extremely unhelpful for figuring out what you need to do
 and anytime I try to do something fun with ranges I can expect to
 see it a dozen times.

 Yeah, that error drives me up the wall too. I often get screenfuls of
 errors, dumping 25 or so overloads of some obscure Phobos internal
 function (like toImpl) as though an end-user would understand any of it.
 You have to parse all the sig constraints (and boy some of them are
 obscure), *understand* what they mean (which requires understanding how
 Phobos works internally), and *then* try to figure out, by elimination,
 which is the one that you intended to match, and why your code failed to
 match it.

 I'm almost tempted to say that using sig constraints to differentiate
 between template overloads is a bad idea. Instead, consider this
 alternative implementation of toImpl:

         template toImpl(S,T)
                 // N.B.: no sig constraints here
         {
                 static if (... /* sig constraint conditions for overload #1 */)
                 {
                         S toImpl(T t)
                         {
                                 // implementation here
                         }
                 }
                 else static if (... /* sig constraint conditions for overload #2 */)
                 {
                         S toImpl(T t)
                         {
                                 // implementation here
                         }
                 }
                 ...
                 else // N.B.: user-readable error message
                 {
                         static assert(0, "Unable to convert " ~
                                 T.stringof ~ " to " ~ S.stringof);
                 }
         }

 By putting all overloads inside a single template, we can give a useful
 default message when no overloads match.

*THIS* .. I've always thought that, and intuitively written my D code that
way. Funnily, I was always concerned I was being unidiomatic doing so,
since the 'std' code is rarely written like that.


Alternatively, maybe sig constraints can have an additional string
 parameter that specifies a message that explains why that particular
 overload was rejected. These messages are not displayed if at least one
 overload matches; only if no overload matches, they will be displayed
 (so that the user can at least see why each of the overloads didn't
 match).


 [...]
I also find the names of the generic algorithms are often unrelated
to the name of the string operation.  My feeling is, everyone is
always on about how cool D is at string, but other than 'char[]', and
the builtin slice operator, I feel really unproductive whenever I do
any heavy string manipulation in D.


 Really?? I find myself much more productive, because I only have to
 learn one set of generic algorithms, and I can use them not just for
 strings but for all sorts of other stuff that implement the range API.

That sounds good in theory, but if any time you try and actually use D's
generic algorithms you end up with many of the kind of errors you refer to
in your prior paragraph, then that basically undermines the whole
experience.
I don't like wasting my time, and I don't like pushing my way through
learning something that I feel is obtuse to begin with, so I usually take a
side path and work around it (most things can be done easily with a couple
of nested foreach-es). So, perhaps embarrassingly, despite my 3+ years
spent hanging around here, part of the problem is that I barely know/use
phobos. Call me lazy, but I don't think it's an unrealistic experience for
any end-user. If it saves me time/headache (and bloat) not using it, why
would I?
** Yes, it's the 'standard' library, and I like that concept in essence,
and feel like I should make use of it on principle... but it's like, you
need to already know phobos intimately to think it's awesome, which creates
a weird barrier to entry. And the docs don't help a lot.

Whereas in languages like C, sure you get familiar with string-specific
 functions, but then when you need a similar-operating function for an
 array of ints, you have to name it something else, and then basically
 the same algorithm reimplemented for linked lists, called by yet another
 name, etc.. Added together, it's many times more mental load than just
 learning a single set of generic algorithms that work on (almost)
 everything.

 The composability of generic algorithms also allow me to think on a more
 abstract level -- instead of thinking about manipulating individual
 chars, I can figure out OK, if I split the string by "," then I can
 filter for the strings I'm looking for, then join them back again with
 another delimiter. Since the same set of algorithms work with other
 ranges too, I can apply exactly the same thought process for working
 with arrays, linked lists, and other containers, without having to
 remember 5 different names of essentially the same algorithm but applied
 to 5 different types.

See, I get that idea about composability. Maybe it's just baggage from C,
but I just don't think that way. Maybe that's a large part of why I always
go wrong with phobos.
I would never think of doing something fundamental like string processing
with a sequence of generic algorithms. I'd freak out about the relatively
unknown performance characteristics.
Algorithms are usually a lot simpler when performed on strings of bytes
than when performed on strings of objects with any imaginable copying
mechanisms and allocation patterns.
Unless I wrote something myself, I can never have faith that the sort of
concessions required to make it generic also make it fast in the case it
happens to be performed in a byte array.

There's an argument that you can specialise for string types, which is true
within single functions, but if you're 'composing' a function with generic
parts, then you can't specialise for strings anymore... There's no way to
specialise a call to a.b.c() as a compound operation.

Like I say, it's probably psychological baggage, but I tend to
unconsciously dismiss/reject that sort of thing without a second thought...
or maybe experience learned me my lesson (*cough* STL).


 I actually feel a lot more productive in D than in C++ with strings.
 Boost's string algorithms library helps fill the gap (and at least
 you only have one place to look for documentation when you are using
 it) but overall I prefer my experience working in D with
 pseudo-member chains.

 I found that what I got out of taking the time to learn std.algorithm
 and std.range was worth far more than the effort invested.

Perhaps you're right. But I think there's ***HUGE*** room for improvement.
The key in your sentence is, it shouldn't require 'effort'; if it's not
intuitive to programmers with decades of experience, then there are
probably some fundamental design (or documentation/accessibility)
deficiencies that need to be prioritised. How is any junior programmer
meant to take to D?

Jan 09 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Jan 10, 2014 at 11:33:35AM +1000, Manu wrote:
 On 10 January 2014 06:27, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:
 
 On Thu, Jan 09, 2014 at 06:25:33PM +0000, Brad Anderson wrote:
 On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:



[...]
I also find the names of the generic algorithms are often
unrelated to the name of the string operation.  My feeling is,
everyone is always on about how cool D is at string, but other
than 'char[]', and the builtin slice operator, I feel really
unproductive whenever I do any heavy string manipulation in D.


 Really?? I find myself much more productive, because I only have to
 learn one set of generic algorithms, and I can use them not just for
 strings but for all sorts of other stuff that implement the range
 API.

 
 That sounds good in theory, but if any time you try and actually use
 D's generic algorithms you end up with many of the kind of errors you
 refer to in your prior paragraph, then that basically undermines the
 whole experience.

Really? I only encounter those kinds of errors once in a while. They
*are* extremely annoying when they happen, but on the whole, they're
relatively rare. You must be doing something wrong if you're seeing them
all the time.


 I don't like wasting my time, and I don't like pushing my way through
 learning something that I feel is obtuse to begin with, so I usually
 take a side path and work around it (most things can be done easily
 with a couple of nested foreach-es). So, perhaps embarrassingly,
 despite my 3+ years spent hanging around here, part of the problem is
 that I barely know/use phobos. Call me lazy, but I don't think it's an
 unrealistic experience for any end-user. If it saves me time/headache
 (and bloat) not using it, why would I?

 ** Yes, it's the 'standard' library, and I like that concept in
 essence, and feel like I should make use of it on principle... but
 it's like, you need to already know phobos intimately to think it's
 awesome, which creates a weird barrier to entry. And the docs don't
 help a lot.

I think you're tainted by your experience with C. :-) Using Phobos
effectively requires that you take the time to understand and use
ranges; or, as somebody else said, stick with std.string. But if that
doesn't do what you need, then you need to ... er, understand and use
ranges. :-P  Expecting to use things the same way as in C is probably
the root cause for your frustrations.


 Whereas in languages like C, sure you get familiar with
 string-specific functions, but then when you need a
 similar-operating function for an array of ints, you have to name it
 something else, and then basically the same algorithm reimplemented
 for linked lists, called by yet another name, etc.. Added together,
 it's many times more mental load than just learning a single set of
 generic algorithms that work on (almost) everything.

 The composability of generic algorithms also allows me to think on a
 more abstract level -- instead of thinking about manipulating
 individual chars, I can figure out OK, if I split the string by ","
 then I can filter for the strings I'm looking for, then join them
 back again with another delimiter. Since the same set of algorithms
 works with other ranges too, I can apply exactly the same thought
 process for working with arrays, linked lists, and other containers,
 without having to remember 5 different names of essentially the same
 algorithm but applied to 5 different types.

 
 See, I get that idea about composability. Maybe it's just baggage from
 C, but I just don't think that way. Maybe that's a large part of why I
 always go wrong with phobos.

Yes, the baggage is slowing you down. Cast it overboard and lighten the
boat, man. ;-)


 I would never think of doing something fundamental like string
 processing with a sequence of generic algorithms. I'd freak out about
 the relatively unknown performance characteristics.

I think your caution is misplaced. Things like std.algorithm.find are
actually quite efficient -- don't be misled by the verbose layers of
template abstractions surrounding the code; for the common cases, it
translates to a simple loop. And recently, certain cases even translate
straight to C's strchr / memchr, and so are on par with C.
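
As a minimal sketch of that point (the helper names tailOfString and
tailOfInts are made up for illustration, and a reasonably current Phobos is
assumed), the same find call works on a string and on an int array, and in
both cases simply hands back the remaining slice:

```d
import std.algorithm : find;

// Hypothetical helpers: the same generic find, applied to two element types.
string tailOfString(string s, char c)
{
    // For a string haystack, find returns the remaining slice,
    // so the result is still a string.
    return find(s, c);
}

int[] tailOfInts(int[] a, int x)
{
    return find(a, x);
}

void main()
{
    assert(tailOfString("Hello", 'l') == "llo");
    assert(tailOfInts([1, 2, 3, 4], 3) == [3, 4]);
    assert(tailOfString("Hello", 'z') == ""); // not found: empty range
}
```

Whether a particular call ends up as a plain loop or a memchr-style search
depends on the Phobos version and the argument types, so treat this as an
illustration of usage rather than a performance claim.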


 Algorithms are usually a lot simpler when performed on strings of
 bytes than when they are performed on strings of objects with any
 imaginable copying mechanisms and allocation patterns.

Phobos also has lots of template specializations that take advantage of
strings and arrays.


 Unless I wrote something myself, I can never have faith that the sort
 of concessions required to make it generic also make it fast in the
 case it happens to be performed in a byte array.

Well, if you're going to insist on NIH syndrome, then you might as well
write your own standard library instead of fighting with Phobos. :)


 There's an argument that you can specialise for string types, which is
 true within single functions, but if you're 'composing' a function
 with generic parts, then you can't specialise for strings anymore...
 There's no way to specialise a call to a.b.c() as a compound
 operation.

And how exactly does the C compiler specialize strchr(strcat(a,b),c) as
a single compound operation?

If you want a single-pass compound operation on a string, you'd have to
write it out manually in C... and in D, you could write it out manually
too, just use a for loop over the string -- same effort, same
performance. Or you could save yourself the trouble and compose two
algorithms from std.algorithm, the result of which is *also* single-pass
(because ranges are lazy). Sure you can object that there's overhead
introduced by using ranges, but since .front translates to just *ptr and
.popFront translates to just ++ptr, the only overhead is just a few
function calls if the compiler doesn't inline them. Which, for functions
that small, it probably does.
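
A rough sketch of such a composition, assuming a made-up helper name
(keepFields) and a made-up input format: the chain below builds a lazy
pipeline, and the string is only walked when to!string consumes the result.

```d
import std.algorithm : filter, joiner, splitter, startsWith;
import std.conv : to;

// Hypothetical helper: keep the comma-separated fields that start with
// `prefix` and join them back together with ';'.
string keepFields(string line, string prefix)
{
    // Nothing is computed here yet; each stage wraps the previous range.
    auto pipeline = line
        .splitter(',')
        .filter!(field => field.startsWith(prefix))
        .joiner(";");

    // Walking the pipeline once produces the final string.
    return pipeline.to!string;
}

void main()
{
    assert(keepFields("apple,banana,avocado,cherry", "a") == "apple;avocado");
}
```

The only allocation in this sketch is the final string (plus the closure for
the lambda); the intermediate stages do not build temporary arrays.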


 Like I say, it's probably psychological baggage, but I tend to
 unconsciously dismiss/reject that sort of thing without a second
 thought...  or maybe experience learned me my lesson (*cough* STL).

OK, let's get one thing straight here. Comparing Phobos to STL is truly
unfair. I spent almost 2 decades writing C++, and wrote code both using
STL and without (from when STL didn't exist yet), and IME, Phobos's
range algorithms are *orders* of magnitude better than STL in terms of
usability. At least. In STL, you have to always manage pointer pairs,
which become a massive pain when you need to pass multiple pairs around
(very error-prone, transpose one argument, and you have a nice segfault
or memory corruption bug).  Then you have stupid verbose syntax like:

	// You can't even write the for-loop conditions in a single
	// line!
	for (std::vector<MyType<Blah> >::iterator it =
		myContainer.begin();
		it != myContainer.end();
		it++)
	{
		// What's with this (*smartPtr)->x nonsense everywhere?
		doSomething((*(*it)->impl)->myDataField);

		// What, I can't even write a simple X != Y if-condition
		// in a single line?! Not to mention the silly
		// redundancy of having to write out the entire chain of
		// dereferences to exactly the same object twice.
		if (find((*(*it)->impl)->mySubContainer, key) ==
			(*(*it)->impl)->mySubContainer.end())
		{
			// How I long for D's .init!
			std::vector<MyType<Blah> >::iterator empty;
			return empty;
		}
	}

Whereas in D:

	foreach (item; myContainer) {
		doSomething(item.impl.myDataField);
		if (!item.impl.mySubContainer.canFind(key))
			return ElementType!MyContainer.init;
	}

There's no comparison, I tell you. No comparison at all.


 I actually feel a lot more productive in D than in C++ with
 strings.  Boost's string algorithms library helps fill the gap
 (and at least you only have one place to look for documentation
 when you are using it) but overall I prefer my experience working
 in D with pseudo-member chains.

 I found that what I got out of taking the time to learn
 std.algorithm and std.range was worth far more than the effort
 invested.

 
 Perhaps you're right. But I think there's ***HUGE*** room for
 improvement.  The key in your sentence is, it shouldn't require
 'effort'; if it's not intuitive to programmers with decades of
 experience, then there are probably some fundamental design (or
 documentation/accessibility) deficiencies that need to be
 prioritised. How is any junior programmer meant to take to D?

No offense, but IME, junior programmers tend to pick up these things
much faster than experienced programmers with lots of baggage from other
languages, precisely because they don't have all that baggage to slow
them down. Old habits die hard, as they say.

That's not to say that the D docs don't need improvement, of course. But
given all your objections about Phobos algorithms despite having barely
*used* Phobos, I think the source of your difficulty lies more in the
baggage than in the documentation. :)


T

-- 
Give me some fresh salted fish, please.

Jan 09 2014

"Atila Neves" <atila.neves gmail.com> writes:

I agree that std.algorithm is better than <algorithm>, but let's 
not pretend that C++11 never happened (that happens from time to 
time on this forum). The modern C++ version isn't _that_ 
different:

     for(auto& blah: myContainer) { //for-loop condition on one line
         doSomething(blah->impl->myDataField);
         if(find(blah->impl->mySubContainer.begin(),
                 blah->impl->mySubContainer.end(), key) ==
            blah->impl->mySubContainer.end()) {
             //decltype is way shorter than std::vector<MyType<Blah>>
             //and change-resistant
             return decltype(myContainer)::iterator{};
         }
     }

Again, I think that std.algorithm is better and that passing a 
pair of iterators to everything when 99.9% of the time they'll be 
begin() and end() anyway is a massive PITA. I'm a D convert. 
Nobody here makes a point of posting D1 code and IMHO there's 
also no point in posting C++98 / C++2003 code.

Atila

Jan 10 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Jan 10, 2014 at 07:32:23PM +0000, Atila Neves wrote:
 I agree that std.algorithm is better than <algorithm>, but let's not
 pretend that C++11 never happened (that happens from time to time on
 this forum). The modern C++ version isn't _that_ different:
 
     for(auto& blah: myContainer) { //for-loop condition on one line
         doSomething(blah->impl->myDataField);
         if(find(blah->impl->mySubContainer.begin(),
 blah->impl->mySubContainer.end(), key) ==
 blah->impl->mySubContainer.end()) {
             //decltype is way shorter than std::vector<MyType<Blah>>
             //and change-resistant
             return decltype(myContainer)::iterator{};
         }
      }
 
 Again, I think that std.algorithm is better and that passing a pair
 of iterators to everything when 99.9% of the time they'll be begin()
 and end() anyway is a massive PITA. I'm a D convert. Nobody here
 makes a point of posting D1 code and IMHO there's also no point in
 posting C++98 / C++2003 code.

You're right, my C++ is outdated. I'm not exactly motivated to keep up
with the latest version of C++, though, since D is far better, and my
day job is primarily with C, and what C++ code we have is still in the
dark ages of C++2003 (or perhaps *shudder* even C++98), and is unlikely
to be upgraded to C++11 anytime in the foreseeable future.


T
-- 
Heuristics are bug-ridden by definition. If they didn't have bugs, they'd be
algorithms.

Jan 10 2014

"Craig Dillabaugh" <cdillaba cg.scs.carleton.ca> writes:

On Friday, 10 January 2014 at 19:32:24 UTC, Atila Neves wrote:
 I agree that std.algorithm is better than <algorithm>, but 
 let's not pretend that C++11 never happened (that happens from 
 time to time on this forum). The modern C++ version isn't 
 _that_ different:

      for(auto& blah: myContainer) { //for-loop condition on one line
          doSomething(blah->impl->myDataField);
          if(find(blah->impl->mySubContainer.begin(),
                  blah->impl->mySubContainer.end(), key) ==
             blah->impl->mySubContainer.end()) {
              //decltype is way shorter than std::vector<MyType<Blah>>
              //and change-resistant
              return decltype(myContainer)::iterator{};
          }
      }

 Again, I think that std.algorithm is better and that passing a 
 pair of iterators to everything when 99.9% of the time they'll 
 be begin() and end() anyway is a massive PITA. I'm a D convert. 
 Nobody here makes a point of posting D1 code and IMHO there's 
 also no point in posting C++98 / C++2003 code.

 Atila

In our company we have people working with Visual Studio 2005, so 
when I am working on common code I still have to avoid any new 
C++ features! I am 'really' trying to get them to upgrade!

Craig

Jan 10 2014

Manu <turkeyman gmail.com> writes:

On 10 January 2014 12:48, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:

 On Fri, Jan 10, 2014 at 11:33:35AM +1000, Manu wrote:
 On 10 January 2014 06:27, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:

 On Thu, Jan 09, 2014 at 06:25:33PM +0000, Brad Anderson wrote:
 On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:



 [...]
 I also find the names of the generic algorithms are often
 unrelated to the name of the string operation.  My feeling is,
 everyone is always on about how cool D is at string, but other
 than 'char[]', and the builtin slice operator, I feel really
 unproductive whenever I do any heavy string manipulation in D.


 Really?? I find myself much more productive, because I only have to
 learn one set of generic algorithms, and I can use them not just for
 strings but for all sorts of other stuff that implement the range
 API.

 That sounds good in theory, but if any time you try and actually use
 D's generic algorithms you end up with many of the kind of errors you
 refer to in your prior paragraph, then that basically undermines the
 whole experience.

 Really? I only encounter those kinds of errors once in a while. They
 *are* extremely annoying when they happen, but on the whole, they're
 relatively rare. You must be doing something wrong if you're seeing them
 all the time.

I think not really knowing quite what you need to do in advance elevates
the probability of doing something wrong ;)
The quality of these range error messages needs to be improved somehow if
basic string operations are supposed to use them comfortably.


 I don't like wasting my time, and I don't like pushing my way through
 learning something that I feel is obtuse to begin with, so I usually
 take a side path and work around it (most things can be done easily
 with a couple of nested foreach-es). So, perhaps embarrassingly,
 despite my 3+ years spent hanging around here, part of the problem is
 that I barely know/use phobos. Call me lazy, but I don't think it's an
 unrealistic experience for any end-user. If it saves me time/headache
 (and bloat) not using it, why would I?

 ** Yes, it's the 'standard' library, and I like that concept in
 essence, and feel like I should make use of it on principle... but
 it's like, you need to already know phobos intimately to think it's
 awesome, which creates a weird barrier to entry. And the docs don't
 help a lot.

 I think you're tainted by your experience with C. :-) Using Phobos
 effectively requires that you take the time to understand and use
 ranges; or, as somebody else said, stick with std.string. But if that
 doesn't do what you need, then you need to ... er, understand and use
 ranges. :-P  Expecting to use things the same way as in C is probably
 the root cause for your frustrations.

I don't agree that something like ranges shouldn't be more or less
intuitive. C doesn't have ranges, so I don't think I'm really transposing C
baggage when considering how to debug my mistakes in range based code in
this case.
Like most things, once you know your way around it, it's fine, but are there
opportunities (mostly in trivial things like better naming
conventions/standards and improved error messages) to make it a whole lot
more intuitive?


 Whereas in languages like C, sure you get familiar with
 string-specific functions, but then when you need a
 similar-operating function for an array of ints, you have to name it
 something else, and then basically the same algorithm reimplemented
 for linked lists, called by yet another name, etc.. Added together,
 it's many times more mental load than just learning a single set of
 generic algorithms that work on (almost) everything.

 The composability of generic algorithms also allows me to think on a
 more abstract level -- instead of thinking about manipulating
 individual chars, I can figure out OK, if I split the string by ","
 then I can filter for the strings I'm looking for, then join them
 back again with another delimiter. Since the same set of algorithms
 works with other ranges too, I can apply exactly the same thought
 process for working with arrays, linked lists, and other containers,
 without having to remember 5 different names of essentially the same
 algorithm but applied to 5 different types.

 See, I get that idea about composability. Maybe it's just baggage from
 C, but I just don't think that way. Maybe that's a large part of why I
 always go wrong with phobos.

 Yes, the baggage is slowing you down. Cast it overboard and lighten the
 boat, man. ;-)


 I would never think of doing something fundamental like string
 processing with a sequence of generic algorithms. I'd freak out about
 the relatively unknown performance characteristics.

 I think your caution is misplaced. Things like std.algorithm.find are
 actually quite efficient -- don't be misled by the verbose layers of
 template abstractions surrounding the code; for the common cases, it
 translates to a simple loop. And recently, certain cases even translate
 straight to C's strchr / memchr, and so are on par with C.

Surely it can't do that if the operation requires any composition? How do
you specialise a composed sequence of operations?

 Algorithms are usually a lot simpler when performed on strings of
 bytes than when they are performed on strings of objects with any
 imaginable copying mechanisms and allocation patterns.

 Phobos also has lots of template specializations that take advantage of
 strings and arrays.

Again, I'm talking WRT composition specifically here.


 Unless I wrote something myself, I can never have faith that the sort
 of concessions required to make it generic also make it fast in the
 case it happens to be performed in a byte array.

 Well, if you're going to insist on NIH syndrome, then you might as well
 write your own standard library instead of fighting with Phobos. :)


 There's an argument that you can specialise for string types, which is
 true within single functions, but if you're 'composing' a function
 with generic parts, then you can't specialise for strings anymore...
 There's no way to specialise a call to a.b.c() as a compound
 operation.

 And how exactly does the C compiler specialize strchr(strcat(a,b),c) as
 a single compound operation?

That's equally a composed statement. It's the same as the concern I raise.
I was referring to cases where D requires a composed statement as opposed to
cases where other languages may have some explicit function that does a
single complex thing.

And I'm not talking about specifics, I was illustrating the nature of my
psychological baggage :) .. I have an unreasonable distrust towards
requiring composed statements to do very simple things.
It's not a specific criticism, it's a comment.


 If you want a single-pass compound operation on a string, you'd have to
 write it out manually in C... and in D, you could write it out manually
 too, just use a for loop over the string -- same effort, same
 performance. Or you could save yourself the trouble and compose two
 algorithms from std.algorithm, the result of which is *also* single-pass
 (because ranges are lazy). Sure you can object that there's overhead
 introduced by using ranges, but since .front translates to just *ptr and
 .popFront translates to just ++ptr, the only overhead is just a few
 function calls if the compiler doesn't inline them. Which, for functions
 that small, it probably does.

Surely it can't be *ptr and ++ptr as you say, otherwise none of it would be
unicode safe...?


 Like I say, it's probably psychological baggage, but I tend to
 unconsciously dismiss/reject that sort of thing without a second
 thought...  or maybe experience learned me my lesson (*cough* STL).

 OK, let's get one thing straight here. Comparing Phobos to STL is truly
 unfair. I spent almost 2 decades writing C++, and wrote code both using
 STL and without (from when STL didn't exist yet), and IME, Phobos's
 range algorithms are *orders* of magnitude better than STL in terms of
 usability. At least. In STL, you have to always manage pointer pairs,
 which become a massive pain when you need to pass multiple pairs around
 (very error-prone, transpose one argument, and you have a nice segfault
 or memory corruption bug).  Then you have stupid verbose syntax like:

         // You can't even write the for-loop conditions in a single
         // line!
         for (std::vector<MyType<Blah> >::iterator it =
                 myContainer.begin();
                 it != myContainer.end();
                 it++)
         {
                 // What's with this (*smartPtr)->x nonsense everywhere?
                 doSomething((*(*it)->impl)->myDataField);

                 // What, I can't even write a simple X != Y if-condition
                 // in a single line?! Not to mention the silly
                 // redundancy of having to write out the entire chain of
                 // dereferences to exactly the same object twice.
                 if (find((*(*it)->impl)->mySubContainer, key) ==
                         (*(*it)->impl)->mySubContainer.end())
                 {
                         // How I long for D's .init!
                         std::vector<MyType<Blah> >::iterator empty;
                         return empty;
                 }
         }

 Whereas in D:

         foreach (item; myContainer) {
                 doSomething(item.impl.myDataField);
                 if (!item.impl.mySubContainer.canFind(key))
                         return ElementType!MyContainer.init;
         }

 There's no comparison, I tell you. No comparison at all.

Yes, I'm aware that it's syntactically superior, but the quality of the
error messages isn't much better than STL.
I also find things easier to find and/or more logically named (probably
biased from past exposure, I know) in the STL than in phobos.


 I actually feel a lot more productive in D than in C++ with
 strings.  Boost's string algorithms library helps fill the gap
 (and at least you only have one place to look for documentation
 when you are using it) but overall I prefer my experience working
 in D with pseudo-member chains.

 I found that what I got out of taking the time to learn
 std.algorithm and std.range was worth far more than the effort
 invested.

 Perhaps you're right. But I think there's ***HUGE*** room for
 improvement.  The key in your sentence is, it shouldn't require
 'effort'; if it's not intuitive to programmers with decades of
 experience, then there are probably some fundamental design (or
 documentation/accessibility) deficiencies that need to be
 prioritised. How is any junior programmer meant to take to D?

 No offense, but IME, junior programmers tend to pick up these things
 much faster than experienced programmers with lots of baggage from other
 languages, precisely because they don't have all that baggage to slow
 them down. Old habits die hard, as they say.

Maybe you're right, but I can't imagine many juniors that would be capable
of tracking down what went wrong when they inevitably made a mistake and
get met with weird errors relating to ranges and template constraints and
all that good stuff... Maybe they'd be doing it differently in the first
place though? Who knows.


 That's not to say that the D docs don't need improvement, of course. But
 given all your objections about Phobos algorithms despite having barely
 *used* Phobos, I think the source of your difficulty lies more in the
 baggage than in the documentation. :)

I already said that myself. But I'd like to think the experience could be
smoother, more helpful, and more intuitive. I don't think you can say it's
perfect, or even particularly 'good'. It's acceptable, it does seem to
work, but it's not an easy learning curve, and it's hard to take in small
steps, or to absorb via osmosis.
Every time I try and repeat something that 'I kinda remember seeing a few
months ago' and 'it was kinda like this...', it takes me AGES to get right.
Always finicky little details that take the most time, and I often find the
phobos source code more helpful than the docs, which isn't a good sign.

That's my general point. I think there's a lot of room for case study, and
improvement.

Jan 09 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 07:18, Manu wrote:

 Surely it can't be *ptr and ++ptr as you say, otherwise none of it would
 be unicode safe...?

For UTF-8 strings it's an extra if-statement:

immutable c = str[0];
if(c < 0x80)
{
     //ptr is used to avoid unnecessary bounds checking.
     str = str.ptr[1 .. str.length];
}
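
To see the effect from the outside, here is a small sketch (the helper name
bytesConsumedByPopFront is hypothetical, and Phobos' default auto-decoding of
narrow strings is assumed): popFront advances by one byte for ASCII and by
the whole encoded sequence otherwise.

```d
import std.range : front, popFront;
import std.utf : stride;

// Hypothetical helper: how many UTF-8 code units does one popFront consume?
size_t bytesConsumedByPopFront(string s)
{
    immutable before = s.length;
    s.popFront();
    return before - s.length;
}

void main()
{
    // ASCII fast path: one byte.
    assert(bytesConsumedByPopFront("a!") == 1);

    // U+00E9 is encoded as two bytes in UTF-8.
    string s = "\u00E9!";
    assert(s.length == 3);            // length counts code units
    assert(s.front == '\u00E9');      // front decodes a whole code point
    assert(stride(s, 0) == 2);        // width of the first code point
    assert(bytesConsumedByPopFront(s) == 2);
}
```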

-- 
/Jacob Carlborg

Jan 10 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 1/9/2014 10:18 PM, Manu wrote:
 Always
 finicky little details that take the most time, and I often find the phobos
 source code more helpful than the docs, which isn't a good sign.

 That's my general point. I think there's a lot of room for case study, and
 improvement.

You're right, and I see the same thing when I use ranges.

The only way to tackle it is when running into things, on a case by case basis, 
submit improvement proposals.

Jan 26 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-09 15:07, Manu wrote:
 This works fine:
    string x = find("Hello", 'H');

 This doesn't:
    string y = find(retro("Hello"), 'H');
    > Error: cannot implicitly convert expression (find(retro("Hello"),
 'H')) of type Result!() to string

 Is that wrong? That seems to be how the docs suggest it should be used.

As others have said, the problem is that "find" returns a range, which is
not implicitly convertible to "string". The main reason is to avoid 
temporary allocations when chaining algorithms.

If it was the other way around you would probably be complaining it 
wasn't efficient enough ;)
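
A minimal sketch of two ways around the error, assuming the caller either
wants the reversed remainder as a real string or only cares about the offset
(the helper name reversedTail is made up for illustration):

```d
import std.algorithm : find;
import std.conv : to;
import std.range : retro;
import std.string : lastIndexOf;

// Hypothetical helper: search from the back and materialise the lazy
// result as a string. This is where the allocation happens, explicitly.
string reversedTail(string s, dchar c)
{
    auto r = find(retro(s), c);   // lazy range, not a string
    return r.to!string;
}

void main()
{
    assert(reversedTail("Hello", 'l') == "lleH");

    // If only the position is needed, std.string has a direct function.
    assert(lastIndexOf("Hello", 'l') == 3);
    assert("Hello"[0 .. lastIndexOf("Hello", 'l') + 1] == "Hell");
}
```

The explicit conversion makes the cost visible: chaining further algorithms
onto the range stays allocation-free until a string is actually requested.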

 On a side note, am I the only one that finds std.algorithm/std.range/etc
 for string processing really obtuse?
 I can rarely understand the error messages, so to say it's better than STL
 is optimistic.
 Using std.algorithm and std.range to do string manipulation feels really
 lame to me.
 I hate looking through the docs of 3-4 modules to understand the
 complete set of useful string operations (std.string, std.uni,
 std.algorithm, std.range... at least).

You forgot std.array ;)

 I also find the names of the generic algorithms are often unrelated to
 the name of the string operation.
 My feeling is, everyone is always on about how cool D is at string, but
 other than 'char[]', and the builtin slice operator, I feel really
 unproductive whenever I do any heavy string manipulation in D.

You have built-in appending, concatenation, using strings in switch 
statements and so on.
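
For instance, a small sketch of those built-in features (the function name
describe and the sample strings are made up for illustration):

```d
// Hypothetical example of the built-in string features listed above.
string describe(string name)
{
    string greeting = "Hello, " ~ name;   // concatenation
    greeting ~= "!";                      // appending

    switch (name)                         // switch directly on a string
    {
        case "world":
            return greeting ~ " (the classic)";
        default:
            return greeting;
    }
}

void main()
{
    assert(describe("world") == "Hello, world! (the classic)");
    assert(describe("D") == "Hello, D!");
}
```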

 I also hate that I need to import at least 4-5 modules to do anything
 useful with strings... I feel my program bloating and cringe with every
 gigantic import that sources exactly one symbol.

I agree with you. I have built up a small library throughout the years
that basically allows me to only import a single module to do most 
string operations I need.

You probably don't like it but you could have a look at Tango as well. 
It contains two useful modules (for this case). One for handling 
arbitrary array operators and one for string operations.

tango.core.Array
tango.text.Util

https://github.com/SiegeLord/Tango-D2
http://siegelord.github.io/Tango-D2/

-- 
/Jacob Carlborg

Jan 09 2014

Manu <turkeyman gmail.com> writes:

On 10 January 2014 06:34, Jacob Carlborg <doob me.com> wrote:

 On 2014-01-09 15:07, Manu wrote:

 This works fine:
    string x = find("Hello", 'H');

 This doesn't:
    string y = find(retro("Hello"), 'H');
    > Error: cannot implicitly convert expression (find(retro("Hello"),
 'H')) of type Result!() to string

 Is that wrong? That seems to be how the docs suggest it should be used.

 As others have said, the problem is that "find" returns a range, which is not
 implicitly convertible to "string". The main reason is to avoid temporary
 allocations when chaining algorithms.

 If it was the other way around you would probably be complaining it wasn't
 efficient enough ;)

Then there's probably a fundamental problem somewhere, and it should be
re-thought at a lower level.
Perhaps even something super simple like a can't-go-wrong naming
convention, that makes it REALLY plain when string related functions are
dealing with bytes, codepoints, or graphemes?
It would seem that a lot of the confusion and complexity surrounding
strings is because it tries to be 'correct' (and varying levels of correct
in different circumstances), but there are no clear relationships between
different functions that deal with these different versions of
'correct'-ness.
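
To make the three levels concrete, here is a rough sketch assuming a current
Phobos; the wrapper names codeUnits, codePoints and graphemes are invented
for illustration and are not an existing naming convention:

```d
import std.range : walkLength;
import std.uni : byGrapheme;

// Hypothetical wrappers naming the level each count operates on.
size_t codeUnits(string s)  { return s.length; }                // UTF-8 bytes
size_t codePoints(string s) { return s.walkLength; }            // decoded dchars
size_t graphemes(string s)  { return s.byGrapheme.walkLength; } // user-perceived characters

void main()
{
    // 'e' followed by U+0301 (combining acute accent).
    string s = "e\u0301";
    assert(codeUnits(s) == 3);   // 1 byte for 'e' + 2 bytes for U+0301
    assert(codePoints(s) == 2);
    assert(graphemes(s) == 1);
}
```

The same string gives three different "lengths", which is the kind of
distinction a clear naming scheme would have to surface.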


 On a side note, am I the only one that finds std.algorithm/std.range/etc
 for string processing really obtuse?
 I can rarely understand the error messages, so to say it's better than STL
 is optimistic.
 Using std.algorithm and std.range to do string manipulation feels really
 lame to me.
 I hate looking through the docs of 3-4 modules to understand the
 complete set of useful string operations (std.string, std.uni,
 std.algorithm, std.range... at least).

 You forgot std.array ;)


I did! And there are probably others too.
You can't do anything without std.typecons either. Although not directly
related, it always seems to be there alongside.

 I also find the names of the generic algorithms are often unrelated to
 the name of the string operation.
 My feeling is, everyone is always on about how cool D is at string, but
 other than 'char[]', and the builtin slice operator, I feel really
 unproductive whenever I do any heavy string manipulation in D.

 You have built-in appending, concatenation, using strings in switch
 statements and so on.


Correct, those things are good. That is where 'D is awesome at strings'
ends though, in my opinion.

 I also hate that I need to import at least 4-5 modules to do anything
 useful with strings... I feel my program bloating and cringe with every
 gigantic import that sources exactly one symbol.

 I agree with you. I have built up a small library throughout the years
 that basically allows me to only import a single module to do most string
 operations I need.

I suspect your effort is not uncommon. Is this not clear evidence of a
critical problem?

 You probably don't like it but you could have a look at Tango as well. It
 contains two useful modules (for this case). One for handling arbitrary
 array operators and one for string operations.

 tango.core.Array
 tango.text.Util

 https://github.com/SiegeLord/Tango-D2
 http://siegelord.github.io/Tango-D2/


Yeah... I want less libraries, not more :/

Jan 09 2014

Jacob Carlborg <doob me.com> writes:

On 2014-01-10 02:06, Manu wrote:

 Then there's probably a fundamental problem somewhere, and it should be
 re-thought at a lower level.
 Perhaps even something super simple like a can't-go-wrong naming
 convention, that makes it REALLY plain when string-related functions are
 dealing with bytes, codepoints, or graphemes?

Isn't it with conventions that everything _can_ go wrong?

 It would seem to me that a lot of the confusion and complexity
 surrounding strings is because it tries to be 'correct' (and varying
 levels of correct in different circumstances), but there are no clear
 relationships between different functions that deal with these different
 versions of 'correct'-ness.

I think the confusion comes from the fact that strings are just plain arrays, which 
are also containers. If there's a function that works on containers it 
will work on arrays and strings as well. Because of that it's put in a 
general module for containers, in this case std.algorithm. Functions 
that work on arrays will also work on strings and they're put in the 
most general location they fit, std.array. Functions that only work 
on strings are put in std.string. Then we of course have some other 
modules, like std.uni and std.utf making it a bit more complicated.
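
As a rough illustration of that split, here is a minimal sketch that uses one
or two functions from each of the three modules on the same string. The helper
name parseKeyValue and the sample input are made up for the example; the
library calls are the ordinary Phobos ones.

```d
import std.algorithm : canFind;      // generic: works on any range
import std.array : replace, split;   // works on arrays, strings included
import std.string : indexOf, strip;  // string-specific

// Hypothetical helper: splits "key = value" into its two parts.
string[] parseKeyValue(string line)
{
    return split(strip(line), " = ");
}

void main()
{
    string line = "  key = value  ";

    string trimmed = strip(line);                         // std.string
    assert(trimmed == "key = value");

    assert(canFind(trimmed, '='));                        // std.algorithm
    assert(indexOf(trimmed, '=') == 4);                   // std.string
    assert(replace(trimmed, " = ", ":") == "key:value");  // std.array
    assert(parseKeyValue(line) == ["key", "value"]);      // std.array + std.string
}
```

Four imports from three modules for a five-line task is more or less the
complaint above in miniature.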

 I suspect your effort is not uncommon. Is this not clear evidence of a
 critical problem?

Probably. I find that to be a problem in most standard libraries. They 
have very general functionality but very few convenience functions, so 
common tasks require calling two or three functions and perhaps creating 
an object.

-- 
/Jacob Carlborg

Jan 10 2014

"Jakob Ovrum" <jakobovrum gmail.com> writes:

On Thursday, 9 January 2014 at 14:08:02 UTC, Manu wrote:
 [snip]

Using std.algorithm or std.range requires learning about ranges. 
You shouldn't be surprised that string handling with ranges works 
differently from specialized string handling functions, which is 
the norm in most languages. For anyone with even a cursory 
knowledge of ranges and range algorithms, it's no surprise when 
the result of a range composition is not of string type even when 
the input is a string.
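
To make that concrete for the example from the original post, here is a small
sketch. The helper findReversed is hypothetical (it is not part of Phobos); it
only shows one way to turn the lazy result back into a string, at the cost of
an allocation.

```d
import std.algorithm : equal, find;
import std.array : array;
import std.conv : to;
import std.range : retro;
import std.string : lastIndexOf;

// Hypothetical helper: reverse search that hands back a real string.
// It allocates, because the lazy retro/find result must be materialized.
string findReversed(string haystack, dchar needle)
{
    auto lazyResult = find(retro(haystack), needle);
    // The composed result is a range type, not string.
    static assert(!is(typeof(lazyResult) == string));
    return lazyResult.array.to!string;
}

void main()
{
    // On a plain string, find returns a slice of the same type.
    string x = find("Hello", 'l');
    assert(x == "llo");

    // Composed with retro, the result is a lazy range of dchar.
    assert(equal(find(retro("Hello"), 'l'), "lleH"));

    // Materializing it gives a string again.
    assert(findReversed("Hello", 'l') == "lleH");

    // If only the offset is wanted, std.string provides it directly.
    assert(lastIndexOf("Hello", 'l') == 3);
}
```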

If you don't want to learn about ranges, use std.string. If 
std.string is not sufficient, then you should consider learning 
about ranges, which means accepting that yes, things will be 
different. Learning about ranges and how to use them for string 
manipulation is not the easiest thing right now due to a dearth 
of learning material, but that's not a problem with ranges. 
Compiler error messages are indeed part of the problem, but they 
are a WIP. 2.065 contains an incremental improvement to error 
messages on failure of overload resolution (Thanks Kenji).

About Unicode, the unit that the language promotes and the 
standard library embraces is `dchar`, the Unicode code point. The 
choice of not using graphemes is a compromise between correctness 
and performance. That means that the onus is still on the user to 
cover the last mile of correctness, so the user is not exempt 
from having to learn at least the basics of Unicode in order to 
write Unicode-correct code in D. However, this is a surprisingly 
reasonable compromise: as long as all inputs are normalized to 
the same format (which may require std.uni.normalize if the 
source of the input does not guarantee a particular format), then 
outside of contrived examples it's very hard to break grapheme 
clusters by using range-based code, even though they are ranges 
of code points. Explicit handling of graphemes is typically only 
needed for very specific domains, like if you're writing a text 
rendering library or a text input box etc. Thus typical 
range-based string manipulation tends to be correct even for 
multi-code-point graphemes, without the author having to 
consciously handle it.
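
A minimal sketch of the normalization point, assuming the input may arrive in
either composed or decomposed form. The helper containsNormalized is a
hypothetical name; std.uni.normalize defaults to NFC.

```d
import std.algorithm : canFind;
import std.uni : normalize;

// Hypothetical helper: brings both sides to the same normalization form
// (NFC, the default of std.uni.normalize) before searching.
bool containsNormalized(string haystack, string needle)
{
    return canFind(normalize(haystack), normalize(needle));
}

void main()
{
    string composed = "caf\u00E9";     // 'é' as a single code point
    string decomposed = "cafe\u0301";  // 'e' followed by a combining acute

    // Same text to a reader, different code point sequences.
    assert(composed != decomposed);

    // A code point search misses the precomposed form in decomposed input...
    assert(!canFind(decomposed, "\u00E9"));

    // ...but succeeds once both sides are normalized the same way.
    assert(containsNormalized(decomposed, "\u00E9"));
    assert(normalize(decomposed) == composed);
}
```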

2.065 has std.uni.byGrapheme/byCodePoint for range-based grapheme 
manipulation. However, there is a performance cost involved so I 
recommend against using it dogmatically. The result of 
`byGrapheme` is not bidirectional yet - someone needs to take the 
time to implement `decodeGraphemeBack` and/or 
`graphemeStrideBack` first.
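
For reference, a short sketch of how the three levels differ on one string,
assuming a Phobos that ships std.uni.byGrapheme. The helper countGraphemes is
a made-up name for the example.

```d
import std.range : walkLength;
import std.uni : byGrapheme;

// Hypothetical helper: counts user-perceived characters.
size_t countGraphemes(string s)
{
    return s.byGrapheme.walkLength;
}

void main()
{
    // 'e' followed by a combining diaeresis forms a single grapheme.
    string s = "noe\u0308l";

    assert(s.length == 6);           // UTF-8 code units (bytes)
    assert(s.walkLength == 5);       // code points (dchar), as ranges see it
    assert(countGraphemes(s) == 4);  // graphemes
}
```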

Jan 09 2014
