digitalmars.D - improving the join function

Andrei Alexandrescu (32/32) Oct 11 2010 I'm looking at http://d.puremagic.com/issues/show_bug.cgi?id=3313 and

bearophile (5/13) Oct 11 2010 Too much over-generalization is bad, and not just for D newbies. So std....

bearophile (3/4) Oct 11 2010 If you don't like that name collision, the std.algorithm one may be name...

Andrei Alexandrescu (5/7) Oct 11 2010 This is not a matter of name collision. The new join should be a

Daniel Gibson (12/32) Oct 11 2010 I like that idea.

Simen kjaeraas (11/19) Oct 11 2010 Once again I see the combinatorial range in the background. Man, why doe...

Philippe Sigaud (15/32) Oct 11 2010 h

Daniel Gibson (14/48) Oct 11 2010 zip doesn't work here because it doesn't create a combinatorical/cartesi...

Simen kjaeraas (9/17) Oct 12 2010 Natural join could easily be done in D for ranges of structs or classes.

Daniel Gibson (12/35) Oct 11 2010 Yes, but if:

bearophile (4/8) Oct 11 2010 I like the std.string.join() function, in Python I use the str.join() me...

Daniel Gibson (5/16) Oct 11 2010 Then the name in python sucks as well :P
Eric Poggel (3/11) Oct 11 2010 Most other languages call their equivalent function join. Renaming it

Andrei Alexandrescu (5/10) Oct 11 2010 I think union() is a worse name than join(). The discussion was to

Daniel Gibson (16/28) Oct 11 2010 Okay, union does kind of suck, because it implies set semantics (and thu...

Jonathan M Davis (12/46) Oct 11 2010 Really. It's not that hard to have a function with a name that means dif...

Daniel Gibson (6/52) Oct 11 2010 It's not about database code (and not primarily about strings or std.str...

Andrei Alexandrescu (6/24) Oct 11 2010 I for one would expect join() in its relational sense to work on things

Daniel Gibson (6/35) Oct 11 2010 Of course indexes would speed things up, but as mentioned before join() ...

Andrei Alexandrescu (5/13) Oct 12 2010 From http://www.hookedonlinq.com/JoinOperator.ashx (see the "loop

Robert Jacques (7/37) Oct 11 2010 Regarding the bike shed,

dolive (4/40) Oct 11 2010 Yes, reference should learn java naming philosophy.
Norbert Nemec (3/8) Oct 12 2010 I agree - what is currently offered by join() could simply be achieved

Philippe Sigaud (15/29) Oct 11 2010 I like this and I've nothing against this signature, but I'm probably

Andrei Alexandrescu (4/18) Oct 12 2010 You must mean 4am :o). The abstraction you talk about is already

Daniel Gibson (5/50) Oct 11 2010 Btw: Is "join" not just a (rather trivial) generalization of reduce?

Daniel Gibson (2/7) Oct 11 2010 Not generalization, I meant specialization. (I should probably go to bed...

Robert Jacques (5/17) Oct 11 2010 Well, except for the N memory allocations. Also, for generic ranges you'...

Andrei Alexandrescu (6/11) Oct 12 2010 It is, but things are a bit messed up by empty ranges.

Pelle (6/38) Oct 11 2010 I think the function signature should be more of isInputRange!R1 &&

Andrei Alexandrescu (7/12) Oct 12 2010 Correct. I figured out my mistake when I started playing with an

Justin Johansson (9/41) Oct 12 2010 Yes, "do the darn string version already and cut all that crap".

Justin Johansson (3/11) Oct 12 2010 I think I meant the "ubiquitous join" rather than the "colloquial

Andrei Alexandrescu (10/25) Oct 12 2010 By both I understand "join as in Python". Right?

Justin Johansson (11/37) Oct 12 2010 Yes, I agree from a range idiom point of view.

Steven Schveighoffer (25/55) Oct 13 2010 This doesn't quite work if T is not a value type (actually, I think it

Andrei Alexandrescu (9/30) Oct 13 2010 Yah, I had a similar idea:

Steven Schveighoffer (10/19) Oct 13 2010 Even though I consider myself a reasonable parser of function templates,...

Juanjo Alvarez (12/19) Oct 13 2010 templates,

Juanjo Alvarez (4/5) Oct 13 2010 Truncated sentence, I wanted to say that adding more asserts would

Gerrit Wichert (15/18) Oct 14 2010 I think the best way to explain the usage of a feature are *working*

Steven Schveighoffer (5/23) Oct 14 2010 I really *really* like this idea. Documentation examples are almost as ...

Andrei Alexandrescu (11/37) Oct 14 2010 I've asked Walter many times for

Jonathan M Davis (8/52) Oct 14 2010 It would certainly be an improvement to have somethnig like. In the date...
Brad Roberts (2/5) Oct 14 2010 Doesn't need to be walter to implement it...

Andrei Alexandrescu (3/8) Oct 14 2010 Yup, could be Don :o).

Lars T. Kyllingstad (4/33) Oct 14 2010 I agree, this would be awesome. Keeping doc examples in sync with the

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

I'm looking at http://d.puremagic.com/issues/show_bug.cgi?id=3313 and 
that got me looking at std.string.join, which currently has the sig:

string join(in string[] words, string sep);

A narrow fix:

Char[] join(Char)(in Char[][] words, in Char[] sep)
if (isSomeChar!Char);

I think it's reasonable to assume that people would want to join things 
that aren't necessarily arrays of characters, so T could be pretty much 
any type. An obvious step towards generalization is:

T[] join(T)(in T[][] items, T[] sep);

But join doesn't really need random access for words - really, an input 
range should suffice. So a generally useful join, almost worth putting 
in std.algorithm, would be:

ElementType!R1[] join(R1, R2)(R1 items, R2 sep)
if (isInputRange!R1 && isForwardRange!R2
     && is(ElementType!R2 : ElementType!R1));

Notice how the separator must be a forward range because it gets spanned 
multiple times, whereas the items need only be an input range as they 
are spanned once. This is at the same time a very general and very 
precise interface.
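
A minimal sketch of how such a join could look, not the Phobos implementation. It assumes each item is itself a range (so the result holds the items' elements), and the name joinSketch is made up for illustration. The separator is re-walked through save, which is why it has to be a forward range, while the items are consumed exactly once.

```d
import std.range.primitives : ElementType, empty, front, popFront, save,
    isForwardRange, isInputRange;

// Hypothetical sketch: joins a range of ranges, inserting `sep` between items.
// Assumption: every item is itself an input range.
ElementType!(ElementType!R1)[] joinSketch(R1, R2)(R1 items, R2 sep)
if (isInputRange!R1 && isInputRange!(ElementType!R1) && isForwardRange!R2
    && is(ElementType!R2 : ElementType!(ElementType!R1)))
{
    typeof(return) result;
    bool first = true;
    for (; !items.empty; items.popFront())
    {
        // The separator is spanned once per gap, hence the saved copy.
        if (!first)
            for (auto s = sep.save; !s.empty; s.popFront())
                result ~= s.front;
        first = false;
        // Each item is spanned exactly once.
        for (auto item = items.front; !item.empty; item.popFront())
            result ~= item.front;
    }
    return result;
}

unittest
{
    assert(joinSketch([[1, 2], [3], [4, 5]], [0]) == [1, 2, 0, 3, 0, 4, 5]);
}
```

The sketch is only exercised on arrays of int here; string handling would need separate care.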

One thing is still bothering me: the array output type. Why would the 
"default" output range be an array? What can be done to make join() at 
the same time a general function and also one that works for strings the 
way the old join did? For example, if I want to join things into an 
already-existing buffer, or if I want to write them straight to a file, 
there's no way to do so without having an array allocation in the loop. 
I have a couple of ideas but I wouldn't want to bias yours.
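
One possible direction, sketched under assumptions: let the caller pass an output range and write into it with put, so the function itself never allocates an array. The name joinInto and its parameter order are hypothetical; appender is used only as an example sink.

```d
import std.array : appender;
import std.range.primitives : empty, front, popFront, put, save;

// Hypothetical sketch: writes the joined elements into any output range.
void joinInto(Sink, R1, R2)(ref Sink sink, R1 items, R2 sep)
{
    bool first = true;
    for (; !items.empty; items.popFront())
    {
        // Emit the separator before every item except the first.
        if (!first)
            for (auto s = sep.save; !s.empty; s.popFront())
                put(sink, s.front);
        first = false;
        for (auto item = items.front; !item.empty; item.popFront())
            put(sink, item.front);
    }
}

unittest
{
    auto buf = appender!(int[])();
    joinInto(buf, [[1, 2], [3]], [0]);
    assert(buf.data == [1, 2, 0, 3]);
}
```

An array-returning join could then be a thin wrapper that creates an appender, calls the sink version and returns the data, which would keep existing string calls working.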

I also have a question for people who dislike Phobos. Was there a point 
in the changes of signature above where you threw up your hands thinking, 
"do the darn string version already and cut all that crap!"?


Thanks,

Andrei

Oct 11 2010

bearophile <bearophileHUGS lycos.com> writes:

Andrei:

 One thing is still bothering me: the array output type. Why would the 
 "default" output range be an array?

The chain() function that returns a range is already present.


 What can be done to make join() at 
 the same time a general function and also one that works for strings the 
 way the old join did?

 I also have a question from people who dislike Phobos. Was there a point 
 in the changes of signature above where you threw your hands thinking, 
 "do the darn string version already and cut all that crap!"?

Too much over-generalization is bad, and not just for D newbies. So std.string
may contain wrappers specialized for strings. You may implement a generic
std.algorithm.join, and then implement the std.string.join that uses just
strings (the second argument may be a single char too) and calls
std.algorithm.join for its implementation.

Bye,
bearophile

Oct 11 2010

bearophile <bearophileHUGS lycos.com> writes:

 You may implement a generic std.algorithm.join, and then implement the
std.string.join that uses just strings (the second argument may be a single
char too) and calls std.algorithm.join for its implementation.

If you don't like that name collision, the std.algorithm one may be named
joinRange or something else.

Bye,
bearophile

Oct 11 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/11/2010 08:02 PM, bearophile wrote:
 You may implement a generic std.algorithm.join, and then implement the
std.string.join that uses just strings (the second argument may be a single
char too) and calls std.algorithm.join for its implementation.

 If you don't like that name collision, the std.algorithm one may be named
joinRange or something else.

This is not a matter of name collision. The new join should be a 
backward-compatible generalization of the existing one, so it should 
just work for existing calls.

Andrei

Oct 11 2010

Daniel Gibson <metalcaedes gmail.com> writes:

bearophile schrieb:
 Andrei:
 
 One thing is still bothering me: the array output type. Why would the 
 "default" output range be an array?

 
 The chain() function that returns a range is already present.
 
 
 What can be done to make join() at 
 the same time a general function and also one that works for strings the 
 way the old join did?

 
 I also have a question from people who dislike Phobos. Was there a point 
 in the changes of signature above where you threw your hands thinking, 
 "do the darn string version already and cut all that crap!"?

 
 Too much over-generalization is bad, and not just for D newbies. So std.string
may contain wrappers specialized for strings. You may implement a generic
std.algorithm.join, and then implement the std.string.join that uses just
strings (the second argument may be a single char too) and calls
std.algorithm.join for its implementation.
 
 Bye,
 bearophile

I like that idea.

I don't like the name "join" - especially for general ranges.
When I hear join I think of database like joins. These may not be horribly
interesting for strings 
but certainly are for general ranges (*).
union() or concat() would be better names for doing what std.string.join does.

(*) Something like
Range!(Tuple!(T1, T2)) join(T1, T2)(Range!(T1) r1, Range!(T2) r2,
BinaryPredicate!(T1, T2) joinPred)
just pseudo-code, I'm not really familiar with D2 and std.algorithm.
The idea is you have a Range r1 with elements of type T1, a Range r2 with
elements of type T2 and a predicate that gets a T1 value and a T2 value and
returns true if they match, in which case a Tuple with those two values is
part of the Range that is returned.
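
The pseudo-code above could be made concrete roughly like this. It is an eager nested-loop sketch over arrays rather than a lazy range, and the name thetaJoin is made up so it does not clash with the existing join.

```d
import std.typecons : Tuple, tuple;

// Hypothetical sketch: pairs every element of r1 with every element of r2
// and keeps the pairs for which the predicate holds (quadratic time).
Tuple!(T1, T2)[] thetaJoin(alias pred, T1, T2)(T1[] r1, T2[] r2)
{
    Tuple!(T1, T2)[] result;
    foreach (a; r1)
        foreach (b; r2)
            if (pred(a, b))
                result ~= tuple(a, b);
    return result;
}

unittest
{
    auto pairs = thetaJoin!((a, b) => a * 2 == b)([1, 2, 3], [2, 4, 5]);
    assert(pairs == [tuple(1, 2), tuple(2, 4)]);
}
```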

Oct 11 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Daniel Gibson <metalcaedes gmail.com> wrote:

 (*) Something like
 Range!(Tuple!(T1, T2)) join(T1, T2)(Range!(T1) r1, Range!(T2) r2,  
 BinaryPredicate!(T1, T2) joinPred)
 just pseudo-code, I'm not really familiar with D2 and std.algorithm.
 The idea is you have a Range r1 with elements of type T1, a Range r1  
 with elements of type T2 and a predicate that gets a T1 value and a T2  
 value and returns bool if they match and in that case a Tuple with those  
 two values is part of the Range that is returned.

Once again I see the combinatorial range in the background. Man, why does
this have to be so hard?

That is, your join could be implemented as follows, given the
combinatorial product range combine:


auto join( alias fun, R... )( R ranges ) if ( allSatisfy!( isForwardRange, R ) ) {
     return filter!fun( combine( ranges ) );
}
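
For readers on a current Phobos, the same idea can be sketched with std.algorithm.setops.cartesianProduct standing in for the hypothetical combine. The name filterJoin is an assumption made here for illustration, and the predicate receives one tuple per combination.

```d
import std.algorithm.iteration : filter;
import std.algorithm.searching : canFind;
import std.algorithm.setops : cartesianProduct;
import std.array : array;
import std.meta : allSatisfy;
import std.range.primitives : isForwardRange;
import std.typecons : tuple;

// Hypothetical sketch: a lazy predicate join as a filter over all combinations.
auto filterJoin(alias fun, R...)(R ranges)
if (allSatisfy!(isForwardRange, R))
{
    return filter!fun(cartesianProduct(ranges));
}

unittest
{
    auto matches = filterJoin!(t => t[0] + 1 == t[1])([1, 2, 3], [2, 3, 9]).array;
    assert(matches.length == 2);
    assert(matches.canFind(tuple(1, 2)) && matches.canFind(tuple(2, 3)));
}
```

The check deliberately avoids relying on the order in which combinations are produced.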

-- 
Simen

Oct 11 2010

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Tue, Oct 12, 2010 at 03:28, Simen kjaeraas <simen.kjaras gmail.com> wrote:
 Daniel Gibson <metalcaedes gmail.com> wrote:

 (*) Something like
 Range!(Tuple!(T1, T2)) join(T1, T2)(Range!(T1) r1, Range!(T2) r2,
 BinaryPredicate!(T1, T2) joinPred)
 just pseudo-code, I'm not really familiar with D2 and std.algorithm.
 The idea is you have a Range r1 with elements of type T1, a Range r1 with
 elements of type T2 and a predicate that gets a T1 value and a T2 value and
 returns bool if they match and in that case a Tuple with those two values is
 part of the Range that is returned.

 Once again I see the combinatorial range in the background. Man, why does
 this have to be so hard?

 That is, your join could be implemented as follows, given the
 combinatorial product range combine:


 auto join( alias fun, R... )( R ranges ) if ( allSatisfy!( isForwardRange, R
 ) ) {
    return filter!fun( combine( ranges );
 }

And IIRC, there is a difference between outer join, inner join and
some other versions.
So

filter!fun(zip(ranges))

(that is, filtering in parallel) is also a possibility. I should read
up again on DB joins.
There is also the need for creating a range of ranges on this one
(aka, tensor product, but that scares people when I say that)
Anyway, that's derailing the thread, so I'll stop now.

Oct 11 2010

Daniel Gibson <metalcaedes gmail.com> writes:

Philippe Sigaud schrieb:
 On Tue, Oct 12, 2010 at 03:28, Simen kjaeraas <simen.kjaras gmail.com> wrote:
 Daniel Gibson <metalcaedes gmail.com> wrote:

 (*) Something like
 Range!(Tuple!(T1, T2)) join(T1, T2)(Range!(T1) r1, Range!(T2) r2,
 BinaryPredicate!(T1, T2) joinPred)
 just pseudo-code, I'm not really familiar with D2 and std.algorithm.
 The idea is you have a Range r1 with elements of type T1, a Range r1 with
 elements of type T2 and a predicate that gets a T1 value and a T2 value and
 returns bool if they match and in that case a Tuple with those two values is
 part of the Range that is returned.

 Once again I see the combinatorial range in the background. Man, why does
 this have to be so hard?

 That is, your join could be implemented as follows, given the
 combinatorial product range combine:


 auto join( alias fun, R... )( R ranges ) if ( allSatisfy!( isForwardRange, R
 ) ) {
    return filter!fun( combine( ranges );
 }

 
 And IIRC, there is a difference between outer join, inner join and
 some other versions.
 So
 
 filter!fun(zip(ranges))
 
 (that is, filtering in parallel) is also a possibilty. I should read
 some again on DB joints.
 There is also the need for creating a range of ranges on this one
 (aka, tensor product, but that scares people when I say that)
 Anyway, that's derailing the thread, so I'll stop now.

zip doesn't work here because it doesn't create a combinatorial/Cartesian
product[1], which (logically) is the foundation of a join[2]. It just combines
the first element of range one with the first element of range two, ... the
i-th element of range one with the i-th element of range two, etc.

Inner join is the "normal" join. Outer join means that if a to-be-joined
element has no "partner" in the other set (range), it's included in the output
anyway with the partner having a NULL value. (This can be done for either the
first, the second or both partners.)
Natural join is like an inner join, but has no explicit predicate, the implicit
predicate being that (in database tables) columns with equal names have to
contain equal values. So natural joins are rather uninteresting for ranges I
guess.


[1] http://en.wikipedia.org/wiki/Cartesian_product // I called this cross
product before, but "cross product" seems to be normally used for something else
[2] http://en.wikipedia.org/wiki/Join_%28relational_algebra%29#Joins_and_join-like_operators
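
To make the zip versus Cartesian product distinction concrete, here is a small sketch using today's std.range.zip and std.algorithm.setops.cartesianProduct. The helper names zipPairs and crossPairs are made up for the example.

```d
import std.algorithm.setops : cartesianProduct;
import std.array : array;
import std.range : zip;
import std.typecons : tuple;

// Hypothetical helpers, for illustration only.
// zip pairs elements position by position: n pairs for two ranges of length n.
auto zipPairs(int[] a, int[] b)
{
    return zip(a, b).array;
}

// The Cartesian product pairs every element with every other: n * m pairs.
auto crossPairs(int[] a, int[] b)
{
    return cartesianProduct(a, b).array;
}

unittest
{
    assert(zipPairs([1, 2], [10, 20]) == [tuple(1, 10), tuple(2, 20)]);
    assert(crossPairs([1, 2], [10, 20]).length == 4);
}
```

A join predicate applied to the zipped pairs would only ever compare elements at the same position, which is why zip cannot serve as the basis of a join.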

Oct 11 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Daniel Gibson <metalcaedes gmail.com> wrote:

 inner join is the "normal" join, outer join means that, if a  
 to-be-joined element has no "partner" in the other set (range), it's  
 included in the output anyway with the partner having a NULL value.  
 (This can be done for either the first, the second or both partners).



 natural join is like an inner join, but has no explicit predicate, the  
 implicit predicate being that (in database tables) columns with equal  
 names have to contain equal values. So natural joins are rather  
 uninteresting for ranges I guess.

Natural join could easily be done in D for ranges of structs or classes.
(not sure how it would cope with polymorphism, though)

It's trivial to automatically generate a predicate that uses
__traits( allMembers ) to check that all fields with the same name have
the same value (and even to statically decline natural join on
types with eponymous fields of incompatible types).
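
A rough sketch of such a generated predicate, assuming plain structs. It uses std.traits.FieldNameTuple instead of __traits( allMembers ) to restrict the check to data fields; the name naturalMatch and the two example structs are hypothetical.

```d
import std.traits : FieldNameTuple;

// Hypothetical example types sharing the field name `deptId`.
struct Employee { int deptId; string name; }
struct Department { int deptId; string title; }

// Hypothetical sketch: true if all fields with the same name hold equal values.
// Same-named fields of incomparable types make the != fail to compile,
// which statically declines the natural join.
bool naturalMatch(A, B)(A a, B b)
{
    bool allEqual = true;
    static foreach (name; FieldNameTuple!A)
    {
        static if (__traits(hasMember, B, name))
        {
            if (__traits(getMember, a, name) != __traits(getMember, b, name))
                allEqual = false;
        }
    }
    return allEqual;
}

unittest
{
    assert(naturalMatch(Employee(7, "Ann"), Department(7, "R&D")));
    assert(!naturalMatch(Employee(7, "Ann"), Department(8, "Ops")));
}
```

If the two types share no field names, the predicate is always true and the join degenerates into the full product.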

-- 
Simen

Oct 12 2010

Daniel Gibson <metalcaedes gmail.com> writes:

Simen kjaeraas schrieb:
 Daniel Gibson <metalcaedes gmail.com> wrote:
 
 (*) Something like
 Range!(Tuple!(T1, T2)) join(T1, T2)(Range!(T1) r1, Range!(T2) r2, 
 BinaryPredicate!(T1, T2) joinPred)
 just pseudo-code, I'm not really familiar with D2 and std.algorithm.
 The idea is you have a Range r1 with elements of type T1, a Range r1 
 with elements of type T2 and a predicate that gets a T1 value and a T2 
 value and returns bool if they match and in that case a Tuple with 
 those two values is part of the Range that is returned.

 
 Once again I see the combinatorial range in the background. Man, why does
 this have to be so hard?
 
 That is, your join could be implemented as follows, given the
 combinatorial product range combine:
 
 
 auto join( alias fun, R... )( R ranges ) if ( allSatisfy!( 
 isForwardRange, R ) ) {
     return filter!fun( combine( ranges );
 }
 

Yes, but if:
* at least the second input range is a sorted random access range, the join
could be calculated a lot cheaper, especially in the (common) case that the
predicate checks for equality (=> binary search)
* both ranges are sorted and the predicate checks for equality, the join can
even be done in linear time (instead of quadratic, as when using a cross
product/combinatorial product)

However, for generic cases combine() would certainly be very helpful (on the
other hand, if there were a proper join() you could get combine() by just using
a predicate that is always true).
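
The sorted case could be sketched as a merge join. This is only an illustration under assumptions: both inputs are arrays sorted ascending, the predicate is plain equality, and the name mergeJoin is made up. It needs one pass over both inputs plus the work of emitting the matches, so it is linear when keys are unique.

```d
import std.typecons : Tuple, tuple;

// Hypothetical sketch: equi-join of two arrays that are sorted ascending.
Tuple!(T, T)[] mergeJoin(T)(T[] left, T[] right)
{
    Tuple!(T, T)[] result;
    size_t i = 0, j = 0;
    while (i < left.length && j < right.length)
    {
        if (left[i] < right[j])
            ++i;
        else if (right[j] < left[i])
            ++j;
        else
        {
            // Pair left[i] with the whole run of equal elements on the right.
            for (size_t k = j; k < right.length && right[k] == left[i]; ++k)
                result ~= tuple(left[i], right[k]);
            // j stays put: the next left element may match the same run.
            ++i;
        }
    }
    return result;
}

unittest
{
    assert(mergeJoin([1, 4, 7], [4, 7, 9]) == [tuple(4, 4), tuple(7, 7)]);
}
```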


But right now the point is: join() does something completely different and
should be renamed (or deprecated in std.string and replaced by union() - a real
join isn't needed in std.string anyway, but when join() is deprecated in
std.string you can implement a real join in std.algorithm without causing too
much confusion).

Oct 11 2010

bearophile <bearophileHUGS lycos.com> writes:

Daniel Gibson:

 But right now the point is: join() does something completely different and
should be renamed (or 
 deprecated in std.string and replaced by union() - a real join isn't needed in
std.string anyway, 
 but when join() is deprecated in std.string you can implement a real join in
std.algorithm without 
 causing too much confusion).

I like the std.string.join() function, in Python I use the str.join() method
often... :-)

Bye,
bearophile

Oct 11 2010

Daniel Gibson <metalcaedes gmail.com> writes:

bearophile schrieb:
 Daniel Gibson:
 
 But right now the point is: join() does something completely different and
should be renamed (or 
 deprecated in std.string and replaced by union() - a real join isn't needed in
std.string anyway, 
 but when join() is deprecated in std.string you can implement a real join in
std.algorithm without 
 causing too much confusion).

 
 I like the std.string.join() function, in Python I use the str.join() method
often... :-)
 
 Bye,
 bearophile

Then the name in python sucks as well :P

IMHO when using the word "join" in a programming context - especially when
dealing with (kinds of) iterators - it should mean the relational
algebra/database join and not some kind of concatenation.

But maybe I just had too many database lectures at university ;-)

Oct 11 2010

Eric Poggel <dnewsgroup2 yage3d.net> writes:

On 10/11/2010 10:08 PM, bearophile wrote:
 Daniel Gibson:

 But right now the point is: join() does something completely different and
should be renamed (or
 deprecated in std.string and replaced by union() - a real join isn't needed in
std.string anyway,
 but when join() is deprecated in std.string you can implement a real join in
std.algorithm without
 causing too much confusion).

 I like the std.string.join() function, in Python I use the str.join() method
often... :-)

 Bye,
 bearophile

Most other languages call their equivalent function join.  Renaming it 
would be confusing.

Oct 11 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/11/2010 08:57 PM, Daniel Gibson wrote:
 But right now the point is: join() does something completely different
 and should be renamed (or deprecated in std.string and replaced by
 union() - a real join isn't needed in std.string anyway, but when join()
 is deprecated in std.string you can implement a real join in
 std.algorithm without causing too much confusion).

I think union() is a worse name than join(). The discussion was to 
generalize within reason std.string.join, which is present under that 
name and with that functionality in many other languages and libraries.

Andrei

Oct 11 2010

Daniel Gibson <metalcaedes gmail.com> writes:

Andrei Alexandrescu schrieb:
 On 10/11/2010 08:57 PM, Daniel Gibson wrote:
 But right now the point is: join() does something completely different
 and should be renamed (or deprecated in std.string and replaced by
 union() - a real join isn't needed in std.string anyway, but when join()
 is deprecated in std.string you can implement a real join in
 std.algorithm without causing too much confusion).

 
 I think union() is a worse name than join(). The discussion was to 
 generalize within reason std.string.join, which is present under that 
 name and with that functionality in many other languages and libraries.
 
 Andrei

Okay, union does kind of suck, because it implies set semantics (and thus no
ordering).

What about concat()?
It seems like join() is expected to work this way for strings... but as a
generic algorithm working on kind-of-cursors?
std.algorithm already has some operations that are also in the relational
algebra (setDifference, setIntersection, setUnion, Filter, even Group (like in
group by) etc.), so adding a join (as in relational algebra join)
implementation would only make sense - but how are you gonna name that thing if
join() is already taken for some kind of "concatenation with additional
separator"?
Sure, "setJoin" would be available, but having both join and setJoin doing
completely different things would be confusing.

What about something like
char[] concat(char[][] words, char[] sep="") // or sep=null
in the string case and something equivalent in the ranges case?

Cheers,
- Daniel

Oct 11 2010

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday 11 October 2010 20:34:41 Daniel Gibson wrote:
 Andrei Alexandrescu schrieb:
 On 10/11/2010 08:57 PM, Daniel Gibson wrote:
 But right now the point is: join() does something completely different
 and should be renamed (or deprecated in std.string and replaced by
 union() - a real join isn't needed in std.string anyway, but when join()
 is deprecated in std.string you can implement a real join in
 std.algorithm without causing too much confusion).

 
 I think union() is a worse name than join(). The discussion was to
 generalize within reason std.string.join, which is present under that
 name and with that functionality in many other languages and libraries.
 
 Andrei

 
 Okay, union does kind of suck, because it implies set semantics (and thus
 no ordering).
 
 What about concat()?
 It seems like join() is expected to work this way for strings.. but as a
 generic algorithm working on kind-of-cursors?
 std.algorithm already has some operations that are also in the relational
 algebra (setDifference, setIntersection, setUnion, Filter, even Group
 (like in group by) etc), adding a join (as in relational algebra join)
 implementation would only make sense - but how are you gonna name that
 thing if join() is already taken for some kind of "concatenation with
 additional seperator"? Sure, "setJoin" would be available, but having both
 join and setJoin doing completely different things would be confusing.
 
 What about something like
 char[] concat(char[][] words, char[] sep="") // or sep=null
 in the string case and something equivalent in the ranges case?
 
 Cheers,
 - Daniel

Really. It's not that hard to have a function with a name that means different
stuff in different contexts. join is an excellent name for what join() does.
Yes, there are joins in databases which are different, but so what? Nothing in
std.algorithm has anything to do with databases. We may end up with a module
that does, and maybe it'll have a join() function too, but that doesn't mean
that std.algorithm can't have one. As others have pointed out, there are other
languages which have a join() function which does essentially the same thing as
the one in std.string. I say leave it as join(). It's a fine name, doesn't
conflict with anything, and doesn't preclude the name being used in database
code later.

- Jonathan M Davis

Oct 11 2010

Daniel Gibson <metalcaedes gmail.com> writes:

Jonathan M Davis schrieb:
 On Monday 11 October 2010 20:34:41 Daniel Gibson wrote:
 Andrei Alexandrescu schrieb:
 On 10/11/2010 08:57 PM, Daniel Gibson wrote:
 But right now the point is: join() does something completely different
 and should be renamed (or deprecated in std.string and replaced by
 union() - a real join isn't needed in std.string anyway, but when join()
 is deprecated in std.string you can implement a real join in
 std.algorithm without causing too much confusion).

 I think union() is a worse name than join(). The discussion was to
 generalize within reason std.string.join, which is present under that
 name and with that functionality in many other languages and libraries.

 Andrei

 Okay, union does kind of suck, because it implies set semantics (and thus
 no ordering).

 What about concat()?
 It seems like join() is expected to work this way for strings.. but as a
 generic algorithm working on kind-of-cursors?
 std.algorithm already has some operations that are also in the relational
 algebra (setDifference, setIntersection, setUnion, Filter, even Group
 (like in group by) etc), adding a join (as in relational algebra join)
 implementation would only make sense - but how are you gonna name that
 thing if join() is already taken for some kind of "concatenation with
 additional seperator"? Sure, "setJoin" would be available, but having both
 join and setJoin doing completely different things would be confusing.

 What about something like
 char[] concat(char[][] words, char[] sep="") // or sep=null
 in the string case and something equivalent in the ranges case?

 Cheers,
 - Daniel

 
 Really. It's not that hard to have a function with a name that means different 
 stuff in different contexts. join is an excellent name for what join() does.
Yes, 
 there are joins in database which are different, but so what? Nothing in 
 std.algorithm has anything to do with databases. We may end up with a module 
 that does, and maybe it'll have a join() function too, but that doesn't mean 
 that std.algorithm can't have one. As others have pointed out, there are other 
 languages which have a join() function which does essentially the same thing
as 
 the one in std.string. I say leave it as join(). It's a fine name, doesn't 
 conflict with anything, and doesn't preclude the name being used in database
code 
 later.
 
 - Jonathan M Davis

It's not about database code (and not primarily about strings or std.string),
it's about std.algorithm code.
It makes perfect sense to use database-like operations on
arrays/containers/iterators (and thus ranges), see LINQ[1].


[1] http://en.wikipedia.org/wiki/LINQ

Oct 11 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/11/2010 10:34 PM, Daniel Gibson wrote:
 Andrei Alexandrescu schrieb:
 On 10/11/2010 08:57 PM, Daniel Gibson wrote:
 But right now the point is: join() does something completely different
 and should be renamed (or deprecated in std.string and replaced by
 union() - a real join isn't needed in std.string anyway, but when join()
 is deprecated in std.string you can implement a real join in
 std.algorithm without causing too much confusion).

 I think union() is a worse name than join(). The discussion was to
 generalize within reason std.string.join, which is present under that
 name and with that functionality in many other languages and libraries.

 Andrei

 Okay, union does kind of suck, because it implies set semantics (and
 thus no ordering).

 What about concat()?
 It seems like join() is expected to work this way for strings.. but as a
 generic algorithm working on kind-of-cursors?

I for one would expect join() in its relational sense to work on things 
quite a bit more structured than just ranges (there's need for indexes 
etc). Therefore, if relational join() will be introduced later, 
overloading will disambiguate it. There's no reason to worry.

Andrei

Oct 11 2010

Daniel Gibson <metalcaedes gmail.com> writes:

Andrei Alexandrescu schrieb:
 On 10/11/2010 10:34 PM, Daniel Gibson wrote:
 Andrei Alexandrescu schrieb:
 On 10/11/2010 08:57 PM, Daniel Gibson wrote:
 But right now the point is: join() does something completely different
 and should be renamed (or deprecated in std.string and replaced by
 union() - a real join isn't needed in std.string anyway, but when 
 join()
 is deprecated in std.string you can implement a real join in
 std.algorithm without causing too much confusion).

 I think union() is a worse name than join(). The discussion was to
 generalize within reason std.string.join, which is present under that
 name and with that functionality in many other languages and libraries.

 Andrei

 Okay, union does kind of suck, because it implies set semantics (and
 thus no ordering).

 What about concat()?
 It seems like join() is expected to work this way for strings.. but as a
 generic algorithm working on kind-of-cursors?

 
 I for one would expect join() in its relational sense to work on things 
 quite a bit more structured than just ranges (there's need for indexes 
 etc). Therefore, if relational join() will be introduced later, 
 overloading will disambiguate it. There's no reason to worry.
 
 Andrei

Of course indexes would speed things up, but as mentioned before join() would
work ok on almost(*) all ranges (with O(n^2) complexity) and a lot better on
std.range.SortedRange.
Because the user would provide a predicate (that should use the same comparator
that was used to sort the range), no additional structure (metadata like that
needed for natural join) would be needed.

(*) the inner range needs to be a ForwardRange so it can be traversed multiple
times

Oct 11 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/11/10 23:00 CDT, Daniel Gibson wrote:
 Of course indexes would speed things up, but as mentioned before join()
 would work ok on almost(*) all ranges (with O(n^2) complexity) and a lot
 better on std.range.SortedRange.
 Because the user would provide a predicate (that should use the same
 comparator that was used to sort the range) no additional structure
 (metadata like needed for natural join) would be needed.

 (*) the inner range needs to be a FordwardRange so it can be traversed
 multiple times

 From http://www.hookedonlinq.com/JoinOperator.ashx (see the "loop 
count" section), the way it works is not O(n*n); an index is created 
automatically.
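
To illustrate what an automatically built index buys, here is a hash-join sketch in D. It is not how LINQ is implemented; it only assumes an equality join on keys extracted by two caller-supplied functions, and the name hashJoin is hypothetical. The right side is indexed once in an associative array, so each left element costs one lookup instead of a full scan.

```d
import std.typecons : Tuple, tuple;

// Hypothetical sketch: equality join through an associative-array index.
Tuple!(L, R)[] hashJoin(alias leftKey, alias rightKey, L, R)(L[] left, R[] right)
{
    alias K = typeof(rightKey(right[0]));

    // Build the index: key -> all right elements carrying that key.
    R[][K] index;
    foreach (r; right)
        index[rightKey(r)] ~= r;

    // Probe the index once per left element.
    Tuple!(L, R)[] result;
    foreach (l; left)
        if (auto bucket = leftKey(l) in index)
            foreach (r; *bucket)
                result ~= tuple(l, r);
    return result;
}

unittest
{
    auto pairs = hashJoin!(a => a, b => b / 10)([1, 2, 3], [10, 20, 21, 40]);
    assert(pairs == [tuple(1, 10), tuple(2, 20), tuple(2, 21)]);
}
```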

Andrei

Oct 12 2010

"Robert Jacques" <sandford jhu.edu> writes:

On Mon, 11 Oct 2010 23:34:41 -0400, Daniel Gibson <metalcaedes gmail.com>  
wrote:

 Andrei Alexandrescu schrieb:
 On 10/11/2010 08:57 PM, Daniel Gibson wrote:
 But right now the point is: join() does something completely different
 and should be renamed (or deprecated in std.string and replaced by
 union() - a real join isn't needed in std.string anyway, but when  
 join()
 is deprecated in std.string you can implement a real join in
 std.algorithm without causing too much confusion).

  I think union() is a worse name than join(). The discussion was to  
 generalize within reason std.string.join, which is present under that  
 name and with that functionality in many other languages and libraries.
  Andrei

 Okay, union does kind of suck, because it implies set semantics (and  
 thus no ordering).

 What about concat()?
 It seems like join() is expected to work this way for strings.. but as a  
 generic algorithm working on kind-of-cursors?
 std.algorithm already has some operations that are also in the  
 relational algebra (setDifference, setIntersection, setUnion, Filter,  
 even Group (like in group by) etc), adding a join (as in relational  
 algebra join) implementation would only make sense - but how are you  
 gonna name that thing if join() is already taken for some kind of  
 "concatenation with additional separator"?
 Sure, "setJoin" would be available, but having both join and setJoin  
 doing completely different things would be confusing.

 What about something like
 char[] concat(char[][] words, char[] sep="") // or sep=null
 in the string case and something equivalent in the ranges case?

 Cheers,
 - Daniel

Regarding the bike shed,
Well, std.range already has transversal( range_of_ranges , Nth) and  
frontTransversal(range_of_ranges). So there is some opportunity for both a  
transverse all elements, i.e. transversal( range_of_ranges ), and  
interleaved elements, i.e. transversal( range_of_ranges, separator ).
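
For reference, a small sketch of what the two existing functions do, assuming the std.range versions as I understand them. The "all elements" and "with separator" variants suggested above are not shown because they do not exist under those signatures.

```d
import std.algorithm : equal;
import std.range : frontTransversal, transversal;

unittest
{
    int[][] rows = [[1, 2, 3], [4, 5, 6]];

    // transversal picks the element at a fixed index from every sub-range.
    assert(equal(transversal(rows, 1), [2, 5]));

    // frontTransversal picks the first element of every sub-range.
    assert(equal(frontTransversal(rows), [1, 4]));
}
```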

Oct 11 2010

dolive <dolive89 sina.com> writes:

Daniel Gibson Wrote:

 bearophile schrieb:
 Andrei:
 
 One thing is still bothering me: the array output type. Why would the 
 "default" output range be an array?

 
 The chain() function that returns a range is already present.
 
 
 What can be done to make join() at 
 the same time a general function and also one that works for strings the 
 way the old join did?

 
 I also have a question from people who dislike Phobos. Was there a point 
 in the changes of signature above where you threw your hands thinking, 
 "do the darn string version already and cut all that crap!"?

 
 Too much over-generalization is bad, and not just for D newbies. So std.string
may contain wrappers specialized for strings. You may implement a generic
std.algorithm.join, and then implement the std.string.join that uses just
strings (the second argument may be a single char too) and calls
std.algorithm.join for its implementation.
 
 Bye,
 bearophile

 
 I like that idea.
 
 I don't like the name "join" - especially for general ranges.
 When I hear join I think of database like joins. These may not be horribly
interesting for strings 
 but certainly are for general ranges (*).
 union() or concat() would be better names for doing what std.string.join does.
 
 (*) Something like
 Range!(Tuple!(T1, T2)) join(T1, T2)(Range!(T1) r1, Range!(T2) r2,
BinaryPredicate!(T1, T2) joinPred)
 just pseudo-code, I'm not really familiar with D2 and std.algorithm.
 The idea is you have a Range r1 with elements of type T1, a Range r2 with
 elements of type T2 and a predicate that gets a T1 value and a T2 value and
 returns true if they match; in that case a Tuple with those two values is
 part of the Range that is returned.
 

Yes, the reference should learn from Java's naming philosophy, so that
ordinary programmers in non-English-speaking countries can easily use it. Not
every programmer is a master.

thanks

Oct 11 2010

Norbert Nemec <Norbert Nemec-online.de> writes:

On 10/12/2010 03:09 AM, Daniel Gibson wrote:
 I don't like the name "join" - especially for general ranges.
 When I hear join I think of database like joins. These may not be
 horribly interesting for strings but certainly are for general ranges (*).
 union() or concat() would be better names for doing what std.string.join
 does.

I agree - what is currently offered by join() could simply be achieved 
by an optional argument to concat()

Oct 12 2010

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Tue, Oct 12, 2010 at 02:33, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 ElementType!R1[] join(R1, R2)(R1 items, R2 sep)
 if (isInputRange!R1 && isForwardRange!R2
     && is(ElementType!R2 : ElementType!R1);

 Notice how the separator must be a forward range because it gets spanned
 multiple times, whereas the items need only be an input range as they are
 spanned once. This is at the same time a very general and very precise
 interface.

I like this and I've nothing against this signature, but I'm probably
biased. When I look at this, I don't even look for the function name:
the constraints (i.e., the interface) are what catch my eye.



 One thing is still bothering me: the array output type. Why would the
 "default" output range be an array? What can be done to make join() at the
 same time a general function and also one that works for strings the way the
 old join did? For example, if I want to join things into an already-existing
 buffer, or if I want to write them straight to a file, there's no way to do
 so without having an array allocation in the loop. I have a couple of ideas
 but I wouldn't want to bias yours.

Left to my own devices, I'd make that a lazy Join struct range: an input range
that delivers R1 elements one by one, interspersed with R2 elements.
Hmm, now that I think a bit more, I was taking them both (or at least
R1) to be ranges of ranges:  join(["the","quick","red","fox"], " ").
Man, it's 4 pm now, I'll stop.

Oct 11 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/11/10 21:05 CDT, Philippe Sigaud wrote:
 On Tue, Oct 12, 2010 at 02:33, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org>  wrote:
 One thing is still bothering me: the array output type. Why would the
 "default" output range be an array? What can be done to make join() at the
 same time a general function and also one that works for strings the way the
 old join did? For example, if I want to join things into an already-existing
 buffer, or if I want to write them straight to a file, there's no way to do
 so without having an array allocation in the loop. I have a couple of ideas
 but I wouldn't want to bias yours.

 Left to my own devices, I'd make that a lazy Join struct range: an input range
 that delivers R1 elements one by one, interspersed with R2 elements.
 Hmm, now that I think a bit more, I was taking them both (or at least
 R1) to be ranges of ranges:  join(["the","quick","red","fox"], " ").
 Man, it's 4 pm now, I'll stop.

You must mean 4am :o). The abstraction you talk about is already 
implemented in std.algorithm.joiner(). Here I'm discussing eager join.

Andrei

Oct 12 2010

Daniel Gibson <metalcaedes gmail.com> writes:

Andrei Alexandrescu schrieb:
 I'm looking at http://d.puremagic.com/issues/show_bug.cgi?id=3313 and 
 that got me looking at std.string.join, which currently has the sig:
 
 string join(in string[] words, string sep);
 
 A narrow fix:
 
 Char[] join(Char)(in Char[][] words, in Char[] sep)
 if (isSomeChar!Char);
 
 I think it's reasonable to assume that people would want to join things 
 that aren't necessarily arrays of characters, so T could be pretty much 
 any type. An obvious step towards generalization is:
 
 T[] join(T)(in T[][] items, T[] sep);
 
 But join doesn't really need random access for words - really, an input 
 range should suffice. So a generally useful join, almost worth putting 
 in std.algorithm, would be:
 
 ElementType!R1[] join(R1, R2)(R1 items, R2 sep)
 if (isInputRange!R1 && isForwardRange!R2
     && is(ElementType!R2 : ElementType!R1);
 
 Notice how the separator must be a forward range because it gets spanned 
 multiple times, whereas the items need only be an input range as they 
 are spanned once. This is at the same time a very general and very 
 precise interface.
 
 One thing is still bothering me: the array output type. Why would the 
 "default" output range be an array? What can be done to make join() at 
 the same time a general function and also one that works for strings the 
 way the old join did? For example, if I want to join things into an 
 already-existing buffer, or if I want to write them straight to a file, 
 there's no way to do so without having an array allocation in the loop. 
 I have a couple of ideas but I wouldn't want to bias yours.
 
 I also have a question from people who dislike Phobos. Was there a point 
 in the changes of signature above where you threw your hands thinking, 
 "do the darn string version already and cut all that crap!"?
 
 
 Thanks,
 
 Andrei

Btw: Is "join" not just a (rather trivial) generalization of reduce?

auto inRange = ...; // range of char[]
char[] sep = " ";
auto joined = reduce!( (char[] res, char[] x) {return res~sep~x;}) (inRange);

Oct 11 2010

Daniel Gibson <metalcaedes gmail.com> writes:

On Tue, Oct 12, 2010 at 6:37 AM, Daniel Gibson <metalcaedes gmail.com> wrote:
 Btw: Is "join" not just a (rather trivial) generalization of reduce?

 auto inRange = ...; // range of char[]
 char[] sep = " ";
 auto joined = reduce!( (char[] res, char[] x) {return res~sep~x;})
 (inRange);

Not generalization, I meant specialization. (I should probably go to bed.)

Oct 11 2010

"Robert Jacques" <sandford jhu.edu> writes:

On Tue, 12 Oct 2010 00:47:33 -0400, Daniel Gibson <metalcaedes gmail.com>  
wrote:

 On Tue, Oct 12, 2010 at 6:37 AM, Daniel Gibson <metalcaedes gmail.com>  
 wrote:
 Btw: Is "join" not just a (rather trivial) generalization of reduce?

 auto inRange = ...; // range of char[]
 char[] sep = " ";
 auto joined = reduce!( (char[] res, char[] x) {return res~sep~x;})
 (inRange);

 Not generalization, I meant specialization. (I should probably go to  
 bed.)

Well, except for the N memory allocations. Also, for generic ranges you'd  
also want to use chain and not "~", but chain won't compose properly in a  
reduce.

Oct 11 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/11/10 23:37 CDT, Daniel Gibson wrote:
 Btw: Is "join" not just a (rather trivial) generalization of reduce?

 auto inRange = ...; // range of char[]
 char[] sep = " ";
 auto joined = reduce!( (char[] res, char[] x) {return res~sep~x;})
 (inRange);

It is, but things are a bit messed up by empty ranges.

auto joined = inRange.empty
   ? ""
   : reduce!( (char[] res, char[] x) {return res~sep~x;})(inRange);
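
A self-contained version of the same idea, as a sketch only: `reduceJoin` is a hypothetical helper name, and it uses `string` rather than `char[]` to keep the example short. The empty check matters because, as far as I know, a seedless `reduce` cannot produce a result for an empty range.

```d
import std.algorithm : reduce;
import std.range : empty;

// reduceJoin is a hypothetical name used only for this illustration.
string reduceJoin(string[] words, string sep)
{
    // A seedless reduce takes its first value from the range,
    // so the empty case has to be handled separately.
    if (words.empty)
        return "";
    return reduce!((string res, string x) => res ~ sep ~ x)(words);
}

unittest
{
    assert(reduceJoin(["a", "b", "c"], ", ") == "a, b, c");
    assert(reduceJoin(["x"], "-") == "x");
    assert(reduceJoin([], "-") == "");
}
```

Each step concatenates into a fresh array, which is the repeated-allocation cost raised elsewhere in the thread.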


Andrei

Oct 12 2010

Pelle <pelle.mansson gmail.com> writes:

On 10/12/2010 02:33 AM, Andrei Alexandrescu wrote:
 I'm looking at http://d.puremagic.com/issues/show_bug.cgi?id=3313 and
 that got me looking at std.string.join, which currently has the sig:

 string join(in string[] words, string sep);

 A narrow fix:

 Char[] join(Char)(in Char[][] words, in Char[] sep)
 if (isSomeChar!Char);

 I think it's reasonable to assume that people would want to join things
 that aren't necessarily arrays of characters, so T could be pretty much
 any type. An obvious step towards generalization is:

 T[] join(T)(in T[][] items, T[] sep);

 But join doesn't really need random access for words - really, an input
 range should suffice. So a generally useful join, almost worth putting
 in std.algorithm, would be:

 ElementType!R1[] join(R1, R2)(R1 items, R2 sep)
 if (isInputRange!R1 && isForwardRange!R2
 && is(ElementType!R2 : ElementType!R1);

 Notice how the separator must be a forward range because it gets spanned
 multiple times, whereas the items need only be an input range as they
 are spanned once. This is at the same time a very general and very
 precise interface.

 One thing is still bothering me: the array output type. Why would the
 "default" output range be an array? What can be done to make join() at
 the same time a general function and also one that works for strings the
 way the old join did? For example, if I want to join things into an
 already-existing buffer, or if I want to write them straight to a file,
 there's no way to do so without having an array allocation in the loop.
 I have a couple of ideas but I wouldn't want to bias yours.

 I also have a question from people who dislike Phobos. Was there a point
 in the changes of signature above where you threw your hands thinking,
 "do the darn string version already and cut all that crap!"?


 Thanks,

 Andrei

I think the function signature should be more of isInputRange!R1 && 
isInputRange!(ElementType!R1), same with the is(). As the first one 
should be a range of ranges.

I think this should be a lazy range of ElementType!(ElementType!R1), or 
perhaps the common type. No reason to be overly eager :-)

Oct 11 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/12/2010 01:25 AM, Pelle wrote:
 I think the function signature should be more of isInputRange!R1 &&
 isInputRange!(ElementType!R1), same with the is(). As the first one
 should be a range of ranges.

Correct. I figured out my mistake when I started playing with an 
implementation.

 I think this should be a lazy range of ElementType!(ElementType!R1), or
 perhaps the common type. No reason to be overly eager :-)

That's already present, see std.algorithm.joiner(). The problem with 
joiner() is that it's rather slow - there are a few tests for each 
element iterated. An eager join() is still necessary.
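
A short sketch contrasting the two, assuming current Phobos, where the lazy version is `std.algorithm.joiner` and the eager one lives in `std.array` as `join` (the module placement is today's, not necessarily what it was when this was written):

```d
import std.algorithm : equal, joiner;
import std.array : join;

unittest
{
    auto words = ["the", "quick", "red", "fox"];

    // Lazy: nothing is allocated; elements are produced on demand.
    auto lazyResult = joiner(words, " ");
    assert(equal(lazyResult, "the quick red fox"));

    // Eager: the whole result is built as one array.
    string eagerResult = join(words, " ");
    assert(eagerResult == "the quick red fox");
}
```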


Andrei

Oct 12 2010

Justin Johansson <no spam.com> writes:

On 12/10/2010 11:33 AM, Andrei Alexandrescu wrote:
 I'm looking at http://d.puremagic.com/issues/show_bug.cgi?id=3313 and
 that got me looking at std.string.join, which currently has the sig:

 string join(in string[] words, string sep);

 A narrow fix:

 Char[] join(Char)(in Char[][] words, in Char[] sep)
 if (isSomeChar!Char);

 I think it's reasonable to assume that people would want to join things
 that aren't necessarily arrays of characters, so T could be pretty much
 any type. An obvious step towards generalization is:

 T[] join(T)(in T[][] items, T[] sep);

 But join doesn't really need random access for words - really, an input
 range should suffice. So a generally useful join, almost worth putting
 in std.algorithm, would be:

 ElementType!R1[] join(R1, R2)(R1 items, R2 sep)
 if (isInputRange!R1 && isForwardRange!R2
 && is(ElementType!R2 : ElementType!R1);

 Notice how the separator must be a forward range because it gets spanned
 multiple times, whereas the items need only be an input range as they
 are spanned once. This is at the same time a very general and very
 precise interface.

 One thing is still bothering me: the array output type. Why would the
 "default" output range be an array? What can be done to make join() at
 the same time a general function and also one that works for strings the
 way the old join did? For example, if I want to join things into an
 already-existing buffer, or if I want to write them straight to a file,
 there's no way to do so without having an array allocation in the loop.
 I have a couple of ideas but I wouldn't want to bias yours.

 I also have a question from people who dislike Phobos. Was there a point
 in the changes of signature above where you threw your hands thinking,
 "do the darn string version already and cut all that crap!"?


 Thanks,

 Andrei

Yes, "do the darn string version already and cut all that crap".

This is probably the thing to do to make for familiarity among
library users [of other languages].

However, if you have an urge to back-end the implementation
of the colloquial "join" by your ideas, do not give up your
dream.  So long as it is implemented as your private dream
no one will notice and you will remain internally satisfied. :-)

- JJ

Oct 12 2010

Justin Johansson <no spam.com> writes:

On 13/10/2010 1:28 AM, Justin Johansson wrote:
 Yes, "do the darn string version already and cut all that crap".

 This is probably the thing to do to make for familiarity among
 library users [of other languages].

 However, if you have an urge to back-end the implementation
 of the colloquial "join" by your ideas, do not give up your
 dream. So long as it is implemented as your private dream
 no one will notice and you will remain internally satisfied. :-)

 - JJ

I think I meant the "ubiquitous join" rather than the "colloquial
join".

Oct 12 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/12/10 9:35 CDT, Justin Johansson wrote:
 On 13/10/2010 1:28 AM, Justin Johansson wrote:
 Yes, "do the darn string version already and cut all that crap".

 This is probably the thing to do to make for familiarity among
 library users [of other languages].

 However, if you have an urge to back-end the implementation
 of the colloquial "join" by your ideas, do not give up your
 dream. So long as it is implemented as your private dream
 no one will notice and you will remain internally satisfied. :-)

 - JJ

 I think I meant the "ubiquitous join" rather than the "colloquial
 join".

By both I understand "join as in Python". Right?

Question is, where to stop?

1. string only (i.e. leave as is)

2. const(char)[] only (to allow joining char[] values)

3. various width of char, i.e. why shouldn't you join an array of wstring?

 From 3, the incremental effort to generalize to any type is virtually 
nonexistent, and the effort to generalize to ranges instead of arrays is 
minor. To me these are positives.
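
As a sketch of what that generalization looks like from the caller's side, assuming the `join` that current Phobos provides in `std.array` (which may differ in detail from what was being designed here):

```d
import std.array : join;

unittest
{
    // Plain strings, as in most other languages.
    assert(join(["a", "b"], ", ") == "a, b");

    // Wider character types work the same way.
    assert(join(["a"w, "b"w], ", "w) == "a, b"w);

    // Elements need not be characters at all.
    assert(join([[1, 2], [3]], [0]) == [1, 2, 0, 3]);
}
```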


Andrei

Oct 12 2010

Justin Johansson <no spam.com> writes:

On 13/10/2010 2:02 AM, Andrei Alexandrescu wrote:
 On 10/12/10 9:35 CDT, Justin Johansson wrote:
 On 13/10/2010 1:28 AM, Justin Johansson wrote:
 Yes, "do the darn string version already and cut all that crap".

 This is probably the thing to do to make for familiarity among
 library users [of other languages].

 However, if you have an urge to back-end the implementation
 of the colloquial "join" by your ideas, do not give up your
 dream. So long as it is implemented as your private dream
 no one will notice and you will remain internally satisfied. :-)

 - JJ

 I think I meant the "ubiquitous join" rather than the "colloquial
 join".

 By both I understand "join as in Python". Right?

 Question is, where to stop?

 1. string only (i.e. leave as is)

 2. const(char)[] only (to allow joining char[] values)

 3. various width of char, i.e. why shouldn't you join an array of wstring?

  From 3, the incremental effort to generalize to any type is virtually
 nonexistent, and the effort to generalize to ranges instead of arrays is
 minor. To me these are positives.


 Andrei

Yes, I agree from a range idiom point of view.

Now, while understanding that D people don't care much for the
XPath 2.0 type system, and not myself caring much for the back-end
implementation, my XPath-ish function signature for this join() function
to preserve the generality that you suggest would be

item() join( things as item()*, separator as item()* );

Of course I'm anticipating an understanding of the above XPath 2.0
function signature syntax, and even then, I suspect my proposed
signature to be too liberal.

Regards, Justin

Oct 12 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 11 Oct 2010 20:33:27 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 I'm looking at http://d.puremagic.com/issues/show_bug.cgi?id=3313 and  
 that got me looking at std.string.join, which currently has the sig:

 string join(in string[] words, string sep);

 A narrow fix:

 Char[] join(Char)(in Char[][] words, in Char[] sep)
 if (isSomeChar!Char);

 I think it's reasonable to assume that people would want to join things  
 that aren't necessarily arrays of characters, so T could be pretty much  
 any type. An obvious step towards generalization is:

 T[] join(T)(in T[][] items, T[] sep);

This doesn't quite work if T is not a value type (actually, I think it  
does, but only because there are bugs in the compiler).

 But join doesn't really need random access for words - really, an input  
 range should suffice. So a generally useful join, almost worth putting  
 in std.algorithm, would be:

 ElementType!R1[] join(R1, R2)(R1 items, R2 sep)
 if (isInputRange!R1 && isForwardRange!R2
      && is(ElementType!R2 : ElementType!R1);

 Notice how the separator must be a forward range because it gets spanned  
 multiple times, whereas the items need only be an input range as they  
 are spanned once. This is at the same time a very general and very  
 precise interface.

I think this is fine.  Note that this does not take into account the  
constancy of items, meaning it is legal for this function to mess with the  
original data in items.

Not that I think it's a bad thing, but it does lose some guarantees as  
compared to the original join.  inout can't be used here because it  
doesn't work as a template parameter.

 One thing is still bothering me: the array output type. Why would the  
 "default" output range be an array? What can be done to make join() at  
 the same time a general function and also one that works for strings the  
 way the old join did? For example, if I want to join things into an  
 already-existing buffer, or if I want to write them straight to a file,  
 there's no way to do so without having an array allocation in the loop.  
 I have a couple of ideas but I wouldn't want to bias yours.

Well, one could have a version of join that takes an output range.  It  
would have to return the output range instead of the *result* of the  
output range.  And in that case, the standard join which returns an array  
can be implemented:

ElementType!R1[] join(R1, R2)(R1 items, R2 sep) ...
{
    // forward to the output-range overload, then extract the array
    return join(items, sep, Appender!(ElementType!R1)()).data;
}

 I also have a question from people who dislike Phobos. Was there a point  
 in the changes of signature above where you threw your hands thinking,  
 "do the darn string version already and cut all that crap!"?

It's not a problem with phobos, it's a problem with documentation.  There  
is a fundamental issue with documenting complex templates which makes  
function signatures very difficult to understand.  The doc generator can  
and should simplify things, and I think at some point we should address  
this.  In other words, it should be transformed into a form that's easy to  
see that it's the same as string[] join(string[][], string[]).

-Steve

Oct 13 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/13/10 14:03 CDT, Steven Schveighoffer wrote:
 On Mon, 11 Oct 2010 20:33:27 -0400, Andrei Alexandrescu
 T[] join(T)(in T[][] items, T[] sep);

 This doesn't quite work if T is not a value type (actually, I think it
 does, but only because there are bugs in the compiler).

My focus in this discussion is not the const aspect, but point taken.

 Well, one could have a version of join that takes an output range. It
 would have to return the output range instead of the *result* of the
 output range. And in that case, the standard join which returns an array
 can be implemented:

 ElementType!R1[] join(R1, R2)(R1 items, R2 sep) ...
 {
 return join(items, sep, Appender!(ElementType!R1)()).data;
 }

Yah, I had a similar idea:

void join(In1, In2, Out)(In1 items, In2 sep, Out target);

as an overload.
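
A minimal sketch of such an overload, under a few assumptions: the name `joinTo` is hypothetical, the target is taken by `ref` so the caller sees what was written, and the items are treated as a range of ranges as corrected earlier in the thread.

```d
import std.array : appender;
import std.range : ElementType, isForwardRange, isInputRange, put, save;

// joinTo is a hypothetical name; it writes into any output range
// that accepts the inner ranges' elements.
void joinTo(RoR, Sep, Out)(RoR items, Sep sep, ref Out target)
    if (isInputRange!RoR && isInputRange!(ElementType!RoR)
        && isForwardRange!Sep)
{
    bool first = true;
    foreach (item; items)
    {
        if (!first)
            put(target, sep.save);  // the separator is traversed repeatedly
        first = false;
        put(target, item);
    }
}

unittest
{
    auto buffer = appender!(int[])();
    joinTo([[1, 2], [3], [4, 5]], [0], buffer);
    assert(buffer.data == [1, 2, 0, 3, 0, 4, 5]);
}
```

The array-returning `join` could then be a thin wrapper that creates an appender, calls this, and returns the appender's data.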

 I also have a question from people who dislike Phobos. Was there a
 point in the changes of signature above where you threw your hands
 thinking, "do the darn string version already and cut all that crap!"?

 It's not a problem with phobos, it's a problem with documentation. There
 is a fundamental issue with documenting complex templates which makes
 function signatures very difficult to understand. The doc generator can
 and should simplify things, and I think at some point we should address
 this. In other words, it should be transformed into a form that's easy
 to see that it's the same as string[] join(string[][], string[]).

Good point. On the other hand, an overly simplified documentation might 
hinder a good deal of legit uses for advanced users. I wonder how to 
please everyone.


Andrei

Oct 13 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 13 Oct 2010 16:07:46 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 It's not a problem with phobos, it's a problem with documentation. There
 is a fundamental issue with documenting complex templates which makes
 function signatures very difficult to understand. The doc generator can
 and should simplify things, and I think at some point we should address
 this. In other words, it should be transformed into a form that's easy
 to see that it's the same as string[] join(string[][], string[]).

 Good point. On the other hand, an overly simplified documentation might  
 hinder a good deal of legit uses for advanced users. I wonder how to  
 please everyone.

Even though I consider myself a reasonable parser of function templates,  
sometimes in std.algorithm, I'll stare at a function signature for about  
10 minutes trying to figure out whether I can do what I want, give up and  
finally just try to compile it.

I think what might help is spelling out the constraints somehow and  
especially explaining how the alias parameters work.  They are some sort  
of black magic I don't always understand :)

-Steve

Oct 13 2010

Juanjo Alvarez <fake fakeemail.com> writes:

On Wed, 13 Oct 2010 16:42:35 -0400, "Steven Schveighoffer" 
<schveiguy yahoo.com> wrote:
 Even though I consider myself a reasonable parser of function templates,
 sometimes in std.algorithm, I'll stare at a function signature for about
 10 minutes trying to figure out whether I can do what I want, give up and
 finally just try to compile it.

 I think what might help is spelling out the constraints somehow and
 especially explaining how the alias parameters work.  They are some sort
 of black magic I don't always understand :)

Glad to see I'm not the only one :) The asserts help a lot there; I 
understood that module better looking at them than with the 
signatures. Adding more (or just adding some where they're missing). 

Template constraints are something that could definitely kill more 
trees in future editions of TDPL.

Oct 13 2010

Juanjo Alvarez <fake fakeemail.com> writes:

On Thu, 14 Oct 2010 01:30:42 +0200, Juanjo Alvarez 
<fake fakeemail.com> wrote:
 signatures. Adding more (or just adding some where they're missing).

Truncated sentence, I wanted to say that adding more asserts would 
not hurt.

Oct 13 2010

Gerrit Wichert <gwichert yahoo.com> writes:

Am 13.10.2010 22:07, schrieb Andrei Alexandrescu:
 Good point. On the other hand, an overly simplified documentation
 might hinder a good deal of legit uses for advanced users. I wonder
 how to please everyone.

I think the best way to explain the usage of a feature are *working*
code-examples.
Maybe it's possible to have a special unit-test block named such as
'example'.
The compiler can completely ignore such sections or just syntax check
them, or ... .

For doc generation they are just taken as they are and put into (or
linked to) the documentation.

It may be even possible for the doc generator to compile and run these
samples, so they become some kind of unit test and their possible output
can be part of the documentation.

Just an idea that comes to my mind

Gerrit

Oct 14 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 14 Oct 2010 06:53:35 -0400, Gerrit Wichert <gwichert yahoo.com>  
wrote:

 Am 13.10.2010 22:07, schrieb Andrei Alexandrescu:
 Good point. On the other hand, an overly simplified documentation
 might hinder a good deal of legit uses for advanced users. I wonder
 how to please everyone.

 I think the best way to explain the usage of a feature are *working*
 code-examples.
 Maybe it's possible to have a special unit-test block named such as
 'example'.
 The compiler can completely ignore such sections or just syntax check
 them, or ... .

 For doc generation they are just taken as they are and put into (or
 linked to) the documentation.

 It may be even possible for the doc generator to compile and run these
 samples, so they become some kind of unit test and their possible output
 can be part of the documentation.

 Just an idea that comes to my mind

I really *really* like this idea.  Documentation examples are almost as  
important as unit tests.  Can you start a new thread on this?

-Steve

Oct 14 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/14/10 8:44 CDT, Steven Schveighoffer wrote:
 On Thu, 14 Oct 2010 06:53:35 -0400, Gerrit Wichert <gwichert yahoo.com>
 wrote:

 Am 13.10.2010 22:07, schrieb Andrei Alexandrescu:
 Good point. On the other hand, an overly simplified documentation
 might hinder a good deal of legit uses for advanced users. I wonder
 how to please everyone.

 I think the best way to explain the usage of a feature are *working*
 code-examples.
 Maybe it's possible to have a special unit-test block named such as
 'example'.
 The compiler can completely ignore such sections or just syntax check
 them, or ... .

 For doc generation they are just taken as they are and put into (or
 linked to) the documentation.

 It may be even possible for the doc generator to compile and run these
 samples, so they become some kind of unit test and their possible output
 can be part of the documentation.

 Just an idea that comes to my mind

 I really *really* like this idea. Documentation examples are almost as
 important as unit tests. Can you start a new thread on this?

 -Steve

I've asked Walter many times for

/// Example
unittest
{
     ...
}

such that the code of the unittest will also appear as a documentation 
example. It would be a huge improvement in both test coverage and 
documentation, but it never made it to the top of the list.
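
A sketch of how this could look in practice. As far as I know, later compiler releases did adopt a form of this, where a unittest block carrying a doc comment and placed directly after a documented declaration is emitted as that declaration's example. `joinWords` is a hypothetical function used only to have something to document.

```d
/// Joins `words`, placing `sep` between consecutive elements.
string joinWords(string[] words, string sep)
{
    import std.array : join;
    return join(words, sep);
}

/// The body of this block doubles as the documentation example.
unittest
{
    assert(joinWords(["a", "b"], "-") == "a-b");
    assert(joinWords([], "-") == "");
}
```

Because the example is compiled and run with the other unit tests, it cannot silently drift out of sync with the function it documents.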


Andrei

Oct 14 2010

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday, October 14, 2010 07:53:24 Andrei Alexandrescu wrote:
 On 10/14/10 8:44 CDT, Steven Schveighoffer wrote:
 On Thu, 14 Oct 2010 06:53:35 -0400, Gerrit Wichert <gwichert yahoo.com>
 
 wrote:
 Am 13.10.2010 22:07, schrieb Andrei Alexandrescu:
 Good point. On the other hand, an overly simplified documentation
 might hinder a good deal of legit uses for advanced users. I wonder
 how to please everyone.

 
 I think the best way to explain the usage of a feature are *working*
 code-examples.
 Maybe it's possible to have a special unit-test block named such as
 'example'.
 The compiler can completely ignore such sections or just syntax check
 them, or ... .
 
 For doc generation they are just taken as they are and put into (or
 linked to) the documentation.
 
 It may be even possible for the doc generator to compile and run these
 samples, so they become some kind of unit test and their possible output
 can be part of the documentation.
 
 Just an idea that comes to my mind

 
 I really *really* like this idea. Documentation examples are almost as
 important as unit tests. Can you start a new thread on this?
 
 -Steve

 
 I've asked Walter many times for
 
 /// Example
 unittest
 {
      ...
 }
 
 such that the code of the unittest will also appear as a documentation
 example. It would be a huge improvement in both test coverage and
 documentation, but it never made it to the top of the list.
 
 
 Andrei

It would certainly be an improvement to have something like that. In the datetime
code that I've been working on, I've specifically been putting the examples for a
function in the unittest block for that function, but then you have to remember
to put them there and keep them in sync. It would be far less error prone if there
was a way to automatically create examples from unit tests or to create unit tests
from examples.

- Jonathan M Davis

Oct 14 2010

Brad Roberts <braddr puremagic.com> writes:

On 10/14/2010 7:53 AM, Andrei Alexandrescu wrote:
 
 I've asked Walter many times for
 

Doesn't need to be Walter to implement it...

Oct 14 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/14/10 21:03 CDT, Brad Roberts wrote:
 On 10/14/2010 7:53 AM, Andrei Alexandrescu wrote:
 I've asked Walter many times for

 Doesn't need to be walter to implement it...

Yup, could be Don :o).

Andrei

Oct 14 2010

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Thu, 14 Oct 2010 09:44:19 -0400, Steven Schveighoffer wrote:

 On Thu, 14 Oct 2010 06:53:35 -0400, Gerrit Wichert <gwichert yahoo.com>
 wrote:
 
 Am 13.10.2010 22:07, schrieb Andrei Alexandrescu:
 Good point. On the other hand, an overly simplified documentation
 might hinder a good deal of legit uses for advanced users. I wonder
 how to please everyone.

 I think the best way to explain the usage of a feature are *working*
 code-examples.
 Maybe it's possible to have a special unit-test block named such as
 'example'.
 The compiler can completely ignore such sections or just syntax check
 them, or ... .

 For doc generation they are just taken as they are and put into (or
 linked to) the documentation.

 It may be even possible for the doc generator to compile and run these
 samples, so they become some kind of unit test and their possible
 output can be part of the documentation.

 Just an idea that comes to my mind

 
 I really *really* like this idea.  Documentation examples are almost as
 important as unit tests.  Can you start a new thread on this?
 
 -Steve


I agree, this would be awesome.  Keeping doc examples in sync with the 
unittests is a pain.

-Lars

Oct 14 2010
