digitalmars.D - improving the join function

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I'm looking at http://d.puremagic.com/issues/show_bug.cgi?id=3313 and 
that got me looking at std.string.join, which currently has the sig:

string join(in string[] words, string sep);

A narrow fix:

Char[] join(Char)(in Char[][] words, in Char[] sep)
if (isSomeChar!Char);

I think it's reasonable to assume that people would want to join things 
that aren't necessarily arrays of characters, so T could be pretty much 
any type. An obvious step towards generalization is:

T[] join(T)(in T[][] items, T[] sep);

But join doesn't really need random access for words - really, an input 
range should suffice. So a generally useful join, almost worth putting 
in std.algorithm, would be:

ElementType!R1[] join(R1, R2)(R1 items, R2 sep)
if (isInputRange!R1 && isForwardRange!R2
     && is(ElementType!R2 : ElementType!R1));

Notice how the separator must be a forward range because it gets spanned 
multiple times, whereas the items need only be an input range as they 
are spanned once. This is at the same time a very general and very 
precise interface.
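
A minimal sketch of an eager join with those constraints, for illustration only. The name eagerJoin is hypothetical and this is not the Phobos implementation. It reads the items as a range of ranges, so the result element type is spelled ElementType!(ElementType!RoR); that reading is an assumption and differs from the signature as posted.

```d
import std.array : appender;
import std.range.primitives : ElementType, isForwardRange, isInputRange, save;

// eagerJoin is a hypothetical name used only for this sketch.
ElementType!(ElementType!RoR)[] eagerJoin(RoR, Sep)(RoR items, Sep sep)
if (isInputRange!RoR && isInputRange!(ElementType!RoR)
    && isForwardRange!Sep
    && is(ElementType!Sep : ElementType!(ElementType!RoR)))
{
    alias E = ElementType!(ElementType!RoR);
    auto result = appender!(E[])();
    bool first = true;
    foreach (item; items)              // items are traversed exactly once
    {
        if (!first)
            foreach (E e; sep.save)    // separator is re-traversed per gap
                result.put(e);
        first = false;
        foreach (E e; item)
            result.put(e);
    }
    return result.data;
}

unittest
{
    assert(eagerJoin([[1, 2], [3], [4, 5]], [0]) == [1, 2, 0, 3, 0, 4, 5]);
}
```

The separator is copied with save before each traversal, which is why it has to be a forward range, while the items are consumed in a single pass.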

One thing is still bothering me: the array output type. Why would the 
"default" output range be an array? What can be done to make join() at 
the same time a general function and also one that works for strings the 
way the old join did? For example, if I want to join things into an 
already-existing buffer, or if I want to write them straight to a file, 
there's no way to do so without having an array allocation in the loop. 
I have a couple of ideas but I wouldn't want to bias yours.
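
One possible shape for the "already-existing buffer" case, sketched under assumptions: the function name joinInto and its parameter order are hypothetical, and the sink is any output range accepted by std.range.primitives.put. This is not one of the ideas alluded to above, just an illustration of the problem statement.

```d
import std.array : appender;
import std.range.primitives : isForwardRange, isInputRange, put, save;

// joinInto is a hypothetical name; it writes into a caller-supplied sink
// instead of allocating and returning a new array.
void joinInto(Sink, RoR, Sep)(ref Sink sink, RoR items, Sep sep)
if (isInputRange!RoR && isForwardRange!Sep)
{
    bool first = true;
    foreach (item; items)
    {
        if (!first)
            foreach (e; sep.save)
                put(sink, e);          // separator between consecutive items
        first = false;
        foreach (e; item)
            put(sink, e);
    }
}

unittest
{
    auto buffer = appender!(int[])();
    buffer.put(9);                     // the buffer already holds data
    joinInto(buffer, [[1], [2, 3]], [0]);
    assert(buffer.data == [9, 1, 0, 2, 3]);
}
```

The array-returning join could then be a thin wrapper that creates an appender, calls the sink version, and returns the data.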

I also have a question for people who dislike Phobos. Was there a point 
in the changes of signature above where you threw up your hands thinking, 
"do the darn string version already and cut all that crap!"?


Thanks,

Andrei
Oct 11 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei:

 One thing is still bothering me: the array output type. Why would the 
 "default" output range be an array?

The chain() function that returns a range is already present.
 What can be done to make join() at 
 the same time a general function and also one that works for strings the 
 way the old join did?

 I also have a question from people who dislike Phobos. Was there a point 
 in the changes of signature above where you threw your hands thinking, 
 "do the darn string version already and cut all that crap!"?

Too much over-generalization is bad, and not just for D newbies. So std.string may contain wrappers specialized for strings. You may implement a generic std.algorithm.join, and then implement the std.string.join that uses just strings (the second argument may be a single char too) and calls std.algorithm.join for its implementation. Bye, bearophile
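
A rough sketch of the wrapper idea, assuming a current Phobos where the generic function is available as std.array.join. The name joinWords is hypothetical and only stands in for a string-only front end.

```d
import std.array : join;   // generic join from Phobos

// joinWords is a hypothetical string-only wrapper over the generic join.
string joinWords(string[] words, string sep)
{
    return join(words, sep);
}

// Convenience overload: the separator may be a single char.
string joinWords(string[] words, char sep)
{
    return joinWords(words, [sep].idup);
}

unittest
{
    assert(joinWords(["a", "b", "c"], ", ") == "a, b, c");
    assert(joinWords(["a", "b"], '-') == "a-b");
}
```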
Oct 11 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
 You may implement a generic std.algorithm.join, and then implement the
std.string.join that uses just strings (the second argument may be a single
char too) and calls std.algorithm.join for its implementation.

If you don't like that name collision, the std.algorithm one may be named joinRange or something else. Bye, bearophile
Oct 11 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/11/2010 08:02 PM, bearophile wrote:
 You may implement a generic std.algorithm.join, and then implement the
std.string.join that uses just strings (the second argument may be a single
char too) and calls std.algorithm.join for its implementation.

If you don't like that name collision, the std.algorithm one may be named joinRange or something else.

This is not a matter of name collision. The new join should be a backward-compatible generalization of the existing one, so it should just work for existing calls. Andrei
Oct 11 2010
prev sibling next sibling parent reply Daniel Gibson <metalcaedes gmail.com> writes:
bearophile schrieb:
 Andrei:
 
 One thing is still bothering me: the array output type. Why would the 
 "default" output range be an array?

The chain() function that returns a range is already present.
 What can be done to make join() at 
 the same time a general function and also one that works for strings the 
 way the old join did?

 I also have a question from people who dislike Phobos. Was there a point 
 in the changes of signature above where you threw your hands thinking, 
 "do the darn string version already and cut all that crap!"?

Too much over-generalization is bad, and not just for D newbies. So std.string may contain wrappers specialized for strings. You may implement a generic std.algorithm.join, and then implement the std.string.join that uses just strings (the second argument may be a single char too) and calls std.algorithm.join for its implementation. Bye, bearophile

I like that idea.

I don't like the name "join" - especially for general ranges. When I hear join I think of database-like joins. These may not be horribly interesting for strings but certainly are for general ranges (*). union() or concat() would be better names for doing what std.string.join does.

(*) Something like
Range!(Tuple!(T1, T2)) join(T1, T2)(Range!(T1) r1, Range!(T2) r2, BinaryPredicate!(T1, T2) joinPred)
just pseudo-code, I'm not really familiar with D2 and std.algorithm. The idea is you have a Range r1 with elements of type T1, a Range r2 with elements of type T2, and a predicate that gets a T1 value and a T2 value and returns true if they match; in that case a Tuple with those two values is part of the Range that is returned.
Oct 11 2010
next sibling parent dolive <dolive89 sina.com> writes:
Daniel Gibson Wrote:

 bearophile schrieb:
 Andrei:
 
 One thing is still bothering me: the array output type. Why would the 
 "default" output range be an array?

The chain() function that returns a range is already present.
 What can be done to make join() at 
 the same time a general function and also one that works for strings the 
 way the old join did?

 I also have a question from people who dislike Phobos. Was there a point 
 in the changes of signature above where you threw your hands thinking, 
 "do the darn string version already and cut all that crap!"?

Too much over-generalization is bad, and not just for D newbies. So std.string may contain wrappers specialized for strings. You may implement a generic std.algorithm.join, and then implement the std.string.join that uses just strings (the second argument may be a single char too) and calls std.algorithm.join for its implementation. Bye, bearophile

I like that idea. I don't like the name "join" - especially for general ranges. When I hear join I think of database like joins. These may not be horribly interesting for strings but certainly are for general ranges (*). union() or concat() would be better names for doing what std.string.join does. (*) Something like Range!(Tuple!(T1, T2)) join(T1, T2)(Range!(T1) r1, Range!(T2) r2, BinaryPredicate!(T1, T2) joinPred) just pseudo-code, I'm not really familiar with D2 and std.algorithm. The idea is you have a Range r1 with elements of type T1, a Range r1 with elements of type T2 and a predicate that gets a T1 value and a T2 value and returns bool if they match and in that case a Tuple with those two values is part of the Range that is returned.

For non-English-speaking countries, ordinary programmers can easily use it; not every programmer is a master. Thanks
Oct 11 2010
prev sibling next sibling parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Simen kjaeraas schrieb:
 Daniel Gibson <metalcaedes gmail.com> wrote:
 
 (*) Something like
 Range!(Tuple!(T1, T2)) join(T1, T2)(Range!(T1) r1, Range!(T2) r2, 
 BinaryPredicate!(T1, T2) joinPred)
 just pseudo-code, I'm not really familiar with D2 and std.algorithm.
 The idea is you have a Range r1 with elements of type T1, a Range r1 
 with elements of type T2 and a predicate that gets a T1 value and a T2 
 value and returns bool if they match and in that case a Tuple with 
 those two values is part of the Range that is returned.

Once again I see the combinatorial range in the background. Man, why does this have to be so hard? That is, your join could be implemented as follows, given the combinatorial product range combine: auto join( alias fun, R... )( R ranges ) if ( allSatisfy!( isForwardRange, R ) ) { return filter!fun( combine( ranges ); }

Yes, but if:
* at least the second input range is a sorted random access range, join could be calculated a lot cheaper, especially in the (common) case that the predicate checks for equality (=> binary search)
* both ranges are sorted and the predicate checks for equality, the join can even be done in linear time (instead of quadratic as when using a cross product/combinatorial product)

However, for generic cases combine() would certainly be very helpful (on the other hand, if there were a proper join() you could get combine() by just using a predicate that is always true).

But right now the point is: join() does something completely different and should be renamed (or deprecated in std.string and replaced by union() - a real join isn't needed in std.string anyway, but when join() is deprecated in std.string you can implement a real join in std.algorithm without causing too much confusion).
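
The case of two sorted ranges joined on equality can be sketched as a merge join. This is a minimal illustration on arrays; the name mergeJoin is hypothetical, and both inputs are assumed to be sorted ascending. With duplicate keys the running time is linear in the input plus the output size.

```d
import std.typecons : Tuple, tuple;

// mergeJoin is a hypothetical helper; both arrays must be sorted ascending.
Tuple!(T, T)[] mergeJoin(T)(T[] a, T[] b)
{
    Tuple!(T, T)[] result;
    size_t i = 0, j = 0;
    while (i < a.length && j < b.length)
    {
        if (a[i] < b[j])
            ++i;
        else if (b[j] < a[i])
            ++j;
        else
        {
            // a[i] == b[j]: pair a[i] with the whole run of equal values in b
            for (size_t k = j; k < b.length && !(a[i] < b[k]); ++k)
                result ~= tuple(a[i], b[k]);
            ++i;   // j stays put so an equal a[i + 1] sees the same run
        }
    }
    return result;
}

unittest
{
    assert(mergeJoin([1, 3, 5], [2, 3, 4, 5]) == [tuple(3, 3), tuple(5, 5)]);
}
```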
Oct 11 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Daniel Gibson:

 But right now the point is: join() does something completely different and
should be renamed (or 
 deprecated in std.string and replaced by union() - a real join isn't needed in
std.string anyway, 
 but when join() is deprecated in std.string you can implement a real join in
std.algorithm without 
 causing too much confusion).

I like the std.string.join() function, in Python I use the str.join() method often... :-) Bye, bearophile
Oct 11 2010
next sibling parent Daniel Gibson <metalcaedes gmail.com> writes:
bearophile schrieb:
 Daniel Gibson:
 
 But right now the point is: join() does something completely different and
should be renamed (or 
 deprecated in std.string and replaced by union() - a real join isn't needed in
std.string anyway, 
 but when join() is deprecated in std.string you can implement a real join in
std.algorithm without 
 causing too much confusion).

I like the std.string.join() function, in Python I use the str.join() method often... :-) Bye, bearophile

Then the name in python sucks as well :P IMHO when using the word "join" in a programming context - especially when dealing with (kinds of) iterators, it should mean the relational algebra/database join and not some kind of concatenation. But maybe I just had too many database lectures at university ;-)
Oct 11 2010
prev sibling parent Eric Poggel <dnewsgroup2 yage3d.net> writes:
On 10/11/2010 10:08 PM, bearophile wrote:
 Daniel Gibson:

 But right now the point is: join() does something completely different and
should be renamed (or
 deprecated in std.string and replaced by union() - a real join isn't needed in
std.string anyway,
 but when join() is deprecated in std.string you can implement a real join in
std.algorithm without
 causing too much confusion).

I like the std.string.join() function, in Python I use the str.join() method often... :-) Bye, bearophile

Most other languages call their equivalent function join. Renaming it would be confusing.
Oct 11 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/11/2010 08:57 PM, Daniel Gibson wrote:
 But right now the point is: join() does something completely different
 and should be renamed (or deprecated in std.string and replaced by
 union() - a real join isn't needed in std.string anyway, but when join()
 is deprecated in std.string you can implement a real join in
 std.algorithm without causing too much confusion).

I think union() is a worse name than join(). The discussion was to generalize within reason std.string.join, which is present under that name and with that functionality in many other languages and libraries. Andrei
Oct 11 2010
next sibling parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Andrei Alexandrescu schrieb:
 On 10/11/2010 08:57 PM, Daniel Gibson wrote:
 But right now the point is: join() does something completely different
 and should be renamed (or deprecated in std.string and replaced by
 union() - a real join isn't needed in std.string anyway, but when join()
 is deprecated in std.string you can implement a real join in
 std.algorithm without causing too much confusion).

I think union() is a worse name than join(). The discussion was to generalize within reason std.string.join, which is present under that name and with that functionality in many other languages and libraries. Andrei

Okay, union does kind of suck, because it implies set semantics (and thus no ordering). What about concat()?

It seems like join() is expected to work this way for strings... but as a generic algorithm working on kind-of-cursors? std.algorithm already has some operations that are also in the relational algebra (setDifference, setIntersection, setUnion, Filter, even Group (like in group by) etc.), so adding a join (as in relational algebra join) implementation would only make sense - but how are you going to name that thing if join() is already taken for some kind of "concatenation with additional separator"? Sure, "setJoin" would be available, but having both join and setJoin doing completely different things would be confusing.

What about something like
char[] concat(char[][] words, char[] sep="") // or sep=null
in the string case and something equivalent in the ranges case?

Cheers,
- Daniel
Oct 11 2010
next sibling parent Daniel Gibson <metalcaedes gmail.com> writes:
Jonathan M Davis schrieb:
 On Monday 11 October 2010 20:34:41 Daniel Gibson wrote:
 Andrei Alexandrescu schrieb:
 On 10/11/2010 08:57 PM, Daniel Gibson wrote:
 But right now the point is: join() does something completely different
 and should be renamed (or deprecated in std.string and replaced by
 union() - a real join isn't needed in std.string anyway, but when join()
 is deprecated in std.string you can implement a real join in
 std.algorithm without causing too much confusion).

I think union() is a worse name than join(). The discussion was to generalize within reason std.string.join, which is present under that name and with that functionality in many other languages and libraries. Andrei

Okay, union does kind of suck, because it implies set semantics (and thus no ordering). What about concat()? It seems like join() is expected to work this way for strings.. but as a generic algorithm working on kind-of-cursors? std.algorithm already has some operations that are also in the relational algebra (setDifference, setIntersection, setUnion, Filter, even Group (like in group by) etc), adding a join (as in relational algebra join) implementation would only make sense - but how are you gonna name that thing if join() is already taken for some kind of "concatenation with additional separator"? Sure, "setJoin" would be available, but having both join and setJoin doing completely different things would be confusing. What about something like char[] concat(char[][] words, char[] sep="") // or sep=null in the string case and something equivalent in the ranges case? Cheers, - Daniel

Really. It's not that hard to have a function with a name that means different stuff in different contexts. join is an excellent name for what join() does. Yes, there are joins in database which are different, but so what? Nothing in std.algorithm has anything to do with databases. We may end up with a module that does, and maybe it'll have a join() function too, but that doesn't mean that std.algorithm can't have one. As others have pointed out, there are other languages which have a join() function which does essentially the same thing as the one in std.string. I say leave it as join(). It's a fine name, doesn't conflict with anything, and doesn't preclude the name being used in database code later. - Jonathan M Davis

It's not about database code (and not primarily about strings or std.string), it's about std.algorithm code. It makes perfect sense to use database-like operations on arrays/containers/iterators (and thus ranges), see LINQ[1]. [1] http://en.wikipedia.org/wiki/LINQ
Oct 11 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/11/2010 10:34 PM, Daniel Gibson wrote:
 Andrei Alexandrescu schrieb:
 On 10/11/2010 08:57 PM, Daniel Gibson wrote:
 But right now the point is: join() does something completely different
 and should be renamed (or deprecated in std.string and replaced by
 union() - a real join isn't needed in std.string anyway, but when join()
 is deprecated in std.string you can implement a real join in
 std.algorithm without causing too much confusion).

I think union() is a worse name than join(). The discussion was to generalize within reason std.string.join, which is present under that name and with that functionality in many other languages and libraries. Andrei

Okay, union does kind of suck, because it implies set semantics (and thus no ordering). What about concat()? It seems like join() is expected to work this way for strings.. but as a generic algorithm working on kind-of-cursors?

I for one would expect join() in its relational sense to work on things quite a bit more structured than just ranges (there's need for indexes etc). Therefore, if relational join() will be introduced later, overloading will disambiguate it. There's no reason to worry. Andrei
Oct 11 2010
parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Andrei Alexandrescu schrieb:
 On 10/11/2010 10:34 PM, Daniel Gibson wrote:
 Andrei Alexandrescu schrieb:
 On 10/11/2010 08:57 PM, Daniel Gibson wrote:
 But right now the point is: join() does something completely different
 and should be renamed (or deprecated in std.string and replaced by
 union() - a real join isn't needed in std.string anyway, but when 
 join()
 is deprecated in std.string you can implement a real join in
 std.algorithm without causing too much confusion).

I think union() is a worse name than join(). The discussion was to generalize within reason std.string.join, which is present under that name and with that functionality in many other languages and libraries. Andrei

Okay, union does kind of suck, because it implies set semantics (and thus no ordering). What about concat()? It seems like join() is expected to work this way for strings.. but as a generic algorithm working on kind-of-cursors?

I for one would expect join() in its relational sense to work on things quite a bit more structured than just ranges (there's need for indexes etc). Therefore, if relational join() will be introduced later, overloading will disambiguate it. There's no reason to worry. Andrei

Of course indexes would speed things up, but as mentioned before, join() would work OK on almost(*) all ranges (with O(n^2) complexity) and a lot better on std.range.SortedRange. Because the user would provide a predicate (that should use the same comparator that was used to sort the range), no additional structure (metadata like that needed for a natural join) would be needed.

(*) the inner range needs to be a ForwardRange so it can be traversed multiple times
Oct 11 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/11/10 23:00 CDT, Daniel Gibson wrote:
 Of course indexes would speed things up, but as mentioned before join()
 would work ok on almost(*) all ranges (with O(n^2) complexity) and a lot
 better on std.range.SortedRange.
 Because the user would provide a predicate (that should use the same
 comparator that was used to sort the range) no additional structure
 (metadata like needed for natural join) would be needed.

 (*) the inner range needs to be a ForwardRange so it can be traversed
 multiple times

From http://www.hookedonlinq.com/JoinOperator.ashx (see the "loop count" section), the way it works is not O(n*n); an index is created automatically. Andrei
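
To illustrate how an automatically built index avoids the O(n*n) loop, here is a minimal hash-join sketch. It is not LINQ's implementation; the name hashJoin and the key-extractor parameters are hypothetical. The right-hand side is indexed once in an associative array, then each left-hand element does one lookup.

```d
import std.typecons : Tuple, tuple;

// hashJoin is a hypothetical helper: keyA and keyB extract the join key
// from elements of the left and right arrays respectively.
Tuple!(A, B)[] hashJoin(alias keyA, alias keyB, A, B)(A[] left, B[] right)
{
    alias K = typeof(keyB(right[0]));
    B[][K] index;                       // key -> all right elements with that key
    foreach (b; right)
        index[keyB(b)] ~= b;

    Tuple!(A, B)[] result;
    foreach (a; left)
        if (auto matches = keyA(a) in index)   // one lookup per left element
            foreach (b; *matches)
                result ~= tuple(a, b);
    return result;
}

unittest
{
    // Join on the last decimal digit.
    auto rows = hashJoin!(x => x % 10, x => x % 10)([11, 22, 33], [1, 2, 12]);
    assert(rows == [tuple(11, 1), tuple(22, 2), tuple(22, 12)]);
}
```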
Oct 12 2010
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday 11 October 2010 20:34:41 Daniel Gibson wrote:
 Andrei Alexandrescu schrieb:
 On 10/11/2010 08:57 PM, Daniel Gibson wrote:
 But right now the point is: join() does something completely different
 and should be renamed (or deprecated in std.string and replaced by
 union() - a real join isn't needed in std.string anyway, but when join()
 is deprecated in std.string you can implement a real join in
 std.algorithm without causing too much confusion).

I think union() is a worse name than join(). The discussion was to generalize within reason std.string.join, which is present under that name and with that functionality in many other languages and libraries. Andrei

Okay, union does kind of suck, because it implies set semantics (and thus no ordering). What about concat()? It seems like join() is expected to work this way for strings.. but as a generic algorithm working on kind-of-cursors? std.algorithm already has some operations that are also in the relational algebra (setDifference, setIntersection, setUnion, Filter, even Group (like in group by) etc), adding a join (as in relational algebra join) implementation would only make sense - but how are you gonna name that thing if join() is already taken for some kind of "concatenation with additional seperator"? Sure, "setJoin" would be available, but having both join and setJoin doing completely different things would be confusing. What about something like char[] concat(char[][] words, char[] sep="") // or sep=null in the string case and something equivalent in the ranges case? Cheers, - Daniel

Really. It's not that hard to have a function with a name that means different stuff in different contexts. join is an excellent name for what join() does. Yes, there are joins in database which are different, but so what? Nothing in std.algorithm has anything to do with databases. We may end up with a module that does, and maybe it'll have a join() function too, but that doesn't mean that std.algorithm can't have one. As others have pointed out, there are other languages which have a join() function which does essentially the same thing as the one in std.string. I say leave it as join(). It's a fine name, doesn't conflict with anything, and doesn't preclude the name being used in database code later. - Jonathan M Davis
Oct 11 2010
prev sibling next sibling parent Daniel Gibson <metalcaedes gmail.com> writes:
Philippe Sigaud schrieb:
 On Tue, Oct 12, 2010 at 03:28, Simen kjaeraas <simen.kjaras gmail.com> wrote:
 Daniel Gibson <metalcaedes gmail.com> wrote:

 (*) Something like
 Range!(Tuple!(T1, T2)) join(T1, T2)(Range!(T1) r1, Range!(T2) r2,
 BinaryPredicate!(T1, T2) joinPred)
 just pseudo-code, I'm not really familiar with D2 and std.algorithm.
 The idea is you have a Range r1 with elements of type T1, a Range r1 with
 elements of type T2 and a predicate that gets a T1 value and a T2 value and
 returns bool if they match and in that case a Tuple with those two values is
 part of the Range that is returned.

Once again I see the combinatorial range in the background. Man, why does this have to be so hard? That is, your join could be implemented as follows, given the combinatorial product range combine: auto join( alias fun, R... )( R ranges ) if ( allSatisfy!( isForwardRange, R ) ) { return filter!fun( combine( ranges ) ); }

And IIRC, there is a difference between outer join, inner join and some other versions. So filter!fun(zip(ranges)) (that is, filtering in parallel) is also a possibilty. I should read some again on DB joints. There is also the need for creating a range of ranges on this one (aka, tensor product, but that scares people when I say that) Anyway, that's derailing the thread, so I'll stop now.

zip doesn't work here because it doesn't create a combinatorial/Cartesian product[1] that (logically) is the foundation of a join[2], but just combines the first element of range one with the first element of range two, ... the i-th element of range one with the i-th element of range two, etc.

Inner join is the "normal" join. Outer join means that, if a to-be-joined element has no "partner" in the other set (range), it's included in the output anyway with the partner having a NULL value. (This can be done for either the first, the second or both partners.)

Natural join is like an inner join, but has no explicit predicate, the implicit predicate being that (in database tables) columns with equal names have to contain equal values. So natural joins are rather uninteresting for ranges I guess.

[1] http://en.wikipedia.org/wiki/Cartesian_product // I called this cross product before, but "cross product" seems to be normally used for something else
[2] http://en.wikipedia.org/wiki/Join_%28relational_algebra%29#Joins_and_join-like_operators
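
The outer-join behaviour described above can be sketched with a nested loop. This is only an illustration: the name leftOuterJoin is hypothetical, and std.typecons.Nullable stands in for the database NULL. Only the "first partner" (left) variant is shown.

```d
import std.typecons : Nullable, Tuple, nullable, tuple;

// leftOuterJoin is a hypothetical helper: every left element appears in the
// output; when no right element matches, its partner is a null Nullable.
Tuple!(A, Nullable!B)[] leftOuterJoin(alias pred, A, B)(A[] left, B[] right)
{
    Tuple!(A, Nullable!B)[] result;
    foreach (a; left)
    {
        bool matched = false;
        foreach (b; right)
        {
            if (pred(a, b))
            {
                result ~= tuple(a, nullable(b));
                matched = true;
            }
        }
        if (!matched)
            result ~= tuple(a, Nullable!B.init);   // unmatched: NULL partner
    }
    return result;
}

unittest
{
    auto rows = leftOuterJoin!((a, b) => a == b)([1, 2, 3], [2, 3, 3]);
    assert(rows.length == 4);
    assert(rows[0][0] == 1 && rows[0][1].isNull);
    assert(rows[1][1].get == 2);
}
```

Dropping the unmatched branch gives the inner join.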
Oct 11 2010
prev sibling parent Norbert Nemec <Norbert Nemec-online.de> writes:
On 10/12/2010 03:09 AM, Daniel Gibson wrote:
 I don't like the name "join" - especially for general ranges.
 When I hear join I think of database like joins. These may not be
 horribly interesting for strings but certainly are for general ranges (*).
 union() or concat() would be better names for doing what std.string.join
 does.

I agree - what is currently offered by join() could simply be achieved by an optional argument to concat()
Oct 12 2010
prev sibling next sibling parent "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Daniel Gibson <metalcaedes gmail.com> wrote:

 (*) Something like
 Range!(Tuple!(T1, T2)) join(T1, T2)(Range!(T1) r1, Range!(T2) r2,  
 BinaryPredicate!(T1, T2) joinPred)
 just pseudo-code, I'm not really familiar with D2 and std.algorithm.
 The idea is you have a Range r1 with elements of type T1, a Range r1  
 with elements of type T2 and a predicate that gets a T1 value and a T2  
 value and returns bool if they match and in that case a Tuple with those  
 two values is part of the Range that is returned.

Once again I see the combinatorial range in the background. Man, why does this have to be so hard?

That is, your join could be implemented as follows, given the combinatorial product range combine:

auto join( alias fun, R... )( R ranges ) if ( allSatisfy!( isForwardRange, R ) ) {
    return filter!fun( combine( ranges ) );
}

-- Simen
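
For readers on a current Phobos, roughly the same idea can be tried with std.algorithm.setops.cartesianProduct playing the role of the hypothetical combine. The sketch below is restricted to two ranges; the name predicateJoin is an assumption, and the order in which pairs are produced is not relied upon.

```d
import std.algorithm.iteration : filter;
import std.algorithm.searching : canFind;
import std.algorithm.setops : cartesianProduct;
import std.range.primitives : isForwardRange, walkLength;
import std.typecons : tuple;

// predicateJoin is a hypothetical name: a lazy join that keeps the pairs
// (a, b) from the Cartesian product for which pred(a, b) holds.
auto predicateJoin(alias pred, R1, R2)(R1 r1, R2 r2)
if (isForwardRange!R1 && isForwardRange!R2)
{
    return cartesianProduct(r1, r2).filter!(t => pred(t[0], t[1]));
}

unittest
{
    auto pairs = predicateJoin!((a, b) => a + 1 == b)([1, 2, 3], [2, 4]);
    assert(pairs.walkLength == 2);
    assert(pairs.canFind(tuple(1, 2)));
    assert(pairs.canFind(tuple(3, 4)));
}
```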
Oct 11 2010
prev sibling next sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Tue, Oct 12, 2010 at 03:28, Simen kjaeraas <simen.kjaras gmail.com> wrote:
 Daniel Gibson <metalcaedes gmail.com> wrote:

 (*) Something like
 Range!(Tuple!(T1, T2)) join(T1, T2)(Range!(T1) r1, Range!(T2) r2,
 BinaryPredicate!(T1, T2) joinPred)
 just pseudo-code, I'm not really familiar with D2 and std.algorithm.
 The idea is you have a Range r1 with elements of type T1, a Range r2 with
 elements of type T2 and a predicate that gets a T1 value and a T2 value and
 returns bool if they match and in that case a Tuple with those two values is
 part of the Range that is returned.

Once again I see the combinatorial range in the background. Man, why does this have to be so hard? That is, your join could be implemented as follows, given the combinatorial product range combine:
 auto join( alias fun, R... )( R ranges ) if ( allSatisfy!( isForwardRange, R ) ) {
    return filter!fun( combine( ranges ) );
 }

And IIRC, there is a difference between outer join, inner join and some other versions. So filter!fun(zip(ranges)) (that is, filtering in parallel) is also a possibility. I should read up again on DB joins. There is also the need for creating a range of ranges on this one (aka tensor product, but that scares people when I say that). Anyway, that's derailing the thread, so I'll stop now.
Oct 11 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Mon, 11 Oct 2010 23:34:41 -0400, Daniel Gibson <metalcaedes gmail.com>  
wrote:

 Andrei Alexandrescu schrieb:
 On 10/11/2010 08:57 PM, Daniel Gibson wrote:
 But right now the point is: join() does something completely different
 and should be renamed (or deprecated in std.string and replaced by
 union() - a real join isn't needed in std.string anyway, but when  
 join()
 is deprecated in std.string you can implement a real join in
 std.algorithm without causing too much confusion).

I think union() is a worse name than join(). The discussion was to generalize within reason std.string.join, which is present under that name and with that functionality in many other languages and libraries. Andrei

Okay, union does kind of suck, because it implies set semantics (and thus no ordering). What about concat()? It seems like join() is expected to work this way for strings.. but as a generic algorithm working on kind-of-cursors? std.algorithm already has some operations that are also in the relational algebra (setDifference, setIntersection, setUnion, Filter, even Group (like in group by) etc), adding a join (as in relational algebra join) implementation would only make sense - but how are you gonna name that thing if join() is already taken for some kind of "concatenation with additional seperator"? Sure, "setJoin" would be available, but having both join and setJoin doing completely different things would be confusing. What about something like char[] concat(char[][] words, char[] sep="") // or sep=null in the string case and something equivalent in the ranges case? Cheers, - Daniel

Regarding the bike shed: well, std.range already has transversal(range_of_ranges, Nth) and frontTransversal(range_of_ranges). So there is some opportunity for both a traverse-all-elements version, i.e. transversal(range_of_ranges), and an interleaved-elements version, i.e. transversal(range_of_ranges, separator).
Oct 11 2010
prev sibling parent "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Daniel Gibson <metalcaedes gmail.com> wrote:

 inner join is the "normal" join, outer join means that, if a  
 to-be-joined element has no "partner" in the other set (range), it's  
 included in the output anyway with the partner having a NULL value.  
 (This can be done for either the first, the second or both partners).

 natural join is like an inner join, but has no explicit predicate, the  
 implicit predicate being that (in database tables) columns with equal  
 names have to contain equal values. So natural joins are rather  
 uninteresting for ranges I guess.

Natural join could easily be done in D for ranges of structs or classes. (not sure how it would cope with polymorphism, though) It's trivial to automatically generate a predicate that uses __traits( allMembers ) to check that all fields with the same name have the same value (and even to statically decline natural join on types with eponymous fields of incompatible types). -- Simen
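
A small sketch of such a generated predicate, under assumptions: it uses std.traits.FieldNameTuple rather than __traits(allMembers) so that only data fields are visited, and the struct types Person and Order plus the name naturalMatch are made up for the example. Fields that share a name but cannot be compared fail at compile time, which gives the static rejection mentioned above.

```d
import std.traits : FieldNameTuple;

// Hypothetical example types sharing the field name "id".
struct Person { int id; string name; }
struct Order  { int id; double total; }

// naturalMatch is a hypothetical predicate: true when every field name
// present in both A and B holds equal values in a and b.
bool naturalMatch(A, B)(A a, B b)
{
    static foreach (name; FieldNameTuple!A)
    {
        static if (__traits(hasMember, B, name))
        {
            if (__traits(getMember, a, name) != __traits(getMember, b, name))
                return false;
        }
    }
    return true;
}

unittest
{
    assert(naturalMatch(Person(1, "Ann"), Order(1, 9.5)));
    assert(!naturalMatch(Person(1, "Ann"), Order(2, 9.5)));
}
```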
Oct 12 2010
prev sibling next sibling parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Tue, Oct 12, 2010 at 02:33, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 ElementType!R1[] join(R1, R2)(R1 items, R2 sep)
 if (isInputRange!R1 && isForwardRange!R2
     && is(ElementType!R2 : ElementType!R1));

 Notice how the separator must be a forward range because it gets spanned
 multiple times, whereas the items need only be an input range as they are
 spanned once. This is at the same time a very general and very precise
 interface.

I like this and I've nothing against this signature, but I'm probably biased. When I look at this, I don't even look for the function name: the constraints (i.e., the interface) are what catch my eye.
 One thing is still bothering me: the array output type. Why would the
 "default" output range be an array? What can be done to make join() at the
 same time a general function and also one that works for strings the way the
 old join did? For example, if I want to join things into an already-existing
 buffer, or if I want to write them straight to a file, there's no way to do
 so without having an array allocation in the loop. I have a couple of ideas
 but I wouldn't want to bias yours.

Left to my own devices, I'd make that a lazy Join struct range: an input range that delivers R1 elements one by one, interspersed with R2 elements. Hmm, now that I think a bit more, I was taking them both (or at least R1) to be ranges of ranges: join(["the","quick","red","fox"], " "). Man, it's 4 pm now, I'll stop.
Oct 11 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/11/10 21:05 CDT, Philippe Sigaud wrote:
 On Tue, Oct 12, 2010 at 02:33, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org>  wrote:
 One thing is still bothering me: the array output type. Why would the
 "default" output range be an array? What can be done to make join() at the
 same time a general function and also one that works for strings the way the
 old join did? For example, if I want to join things into an already-existing
 buffer, or if I want to write them straight to a file, there's no way to do
 so without having an array allocation in the loop. I have a couple of ideas
 but I wouldn't want to bias yours.

Left to my own devices, I'd make that a lazy Join struct range: an input range that delivers R1 elements one by one, interspersed with R2 elements. Hmm, now that I think a bit more, I was taking them both (or at least R1) to be ranges of ranges: join(["the","quick","red","fox"], " "). Man, it's 4 pm now, I'll stop.

You must mean 4am :o). The abstraction you talk about is already implemented in std.algorithm.joiner(). Here I'm discussing eager join. Andrei
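To make the lazy/eager distinction concrete, here is a small sketch. It assumes a present-day Phobos, where the lazy `joiner` is importable from `std.algorithm` and the eager `join` from `std.array`; the module layout at the time of this thread may have differed.

```d
import std.algorithm : equal, joiner;
import std.array : join;

void main()
{
    auto words = ["the", "quick", "red", "fox"];

    // Lazy: joiner returns a range that produces elements on demand.
    auto lazyResult = joiner(words, " ");
    assert(equal(lazyResult, "the quick red fox"));

    // Eager: join builds and returns the whole result in one go.
    string eagerResult = join(words, " ");
    assert(eagerResult == "the quick red fox");
}
```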
Oct 12 2010
prev sibling next sibling parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Andrei Alexandrescu schrieb:
 I'm looking at http://d.puremagic.com/issues/show_bug.cgi?id=3313 and 
 that got me looking at std.string.join, which currently has the sig:
 
 string join(in string[] words, string sep);
 
 A narrow fix:
 
 Char[] join(Char)(in Char[][] words, in Char[] sep)
 if (isSomeChar!Char);
 
 I think it's reasonable to assume that people would want to join things 
 that aren't necessarily arrays of characters, so T could be pretty much 
 any type. An obvious step towards generalization is:
 
 T[] join(T)(in T[][] items, T[] sep);
 
 But join doesn't really need random access for words - really, an input 
 range should suffice. So a generally useful join, almost worth putting 
 in std.algorithm, would be:
 
 ElementType!R1[] join(R1, R2)(R1 items, R2 sep)
 if (isInputRange!R1 && isForwardRange!R2
     && is(ElementType!R2 : ElementType!R1);
 
 Notice how the separator must be a forward range because it gets spanned 
 multiple times, whereas the items need only be an input range as they 
 are spanned once. This is at the same time a very general and very 
 precise interface.
 
 One thing is still bothering me: the array output type. Why would the 
 "default" output range be an array? What can be done to make join() at 
 the same time a general function and also one that works for strings the 
 way the old join did? For example, if I want to join things into an 
 already-existing buffer, or if I want to write them straight to a file, 
 there's no way to do so without having an array allocation in the loop. 
 I have a couple of ideas but I wouldn't want to bias yours.
 
 I also have a question from people who dislike Phobos. Was there a point 
 in the changes of signature above where you threw your hands thinking, 
 "do the darn string version already and cut all that crap!"?
 
 
 Thanks,
 
 Andrei

Btw: Is "join" not just a (rather trivial) generalization of reduce?

auto inRange = ...; // range of char[]
char[] sep = " ";
auto joined = reduce!( (char[] res, char[] x) {return res~sep~x;})
(inRange);
Oct 11 2010
next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Tue, 12 Oct 2010 00:47:33 -0400, Daniel Gibson <metalcaedes gmail.com>  
wrote:

 On Tue, Oct 12, 2010 at 6:37 AM, Daniel Gibson <metalcaedes gmail.com>  
 wrote:
 Btw: Is "join" not just a (rather trivial) generalization of reduce?

 auto inRange = ...; // range of char[]
 char[] sep = " ";
 auto joined = reduce!( (char[] res, char[] x) {return res~sep~x;})
 (inRange);

Not generalization, I meant specialization. (I should probably go to bed.)

Well, except for the N memory allocations. Also, for generic ranges you'd want to use chain and not "~", but chain won't compose properly in a reduce.
Oct 11 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/11/10 23:37 CDT, Daniel Gibson wrote:
 Btw: Is "join" not just a (rather trivial) generalization of reduce?

 auto inRange = ...; // range of char[]
 char[] sep = " ";
 auto joined = reduce!( (char[] res, char[] x) {return res~sep~x;})
 (inRange);

It is, but things are a bit messed up by empty ranges.

auto joined = inRange.empty
    ? ""
    : reduce!( (char[] res, char[] x) {return res~sep~x;})(inRange);

Andrei
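The empty-range caveat can be sketched as a small helper. This is only an illustration: `joinWithReduce` is a hypothetical name, it is restricted to `string[]` for brevity, and it relies on seedless `std.algorithm.reduce` refusing an empty range, which is why the guard comes first. Every step still allocates a fresh string, so it also shows the allocation cost raised elsewhere in the thread.

```d
import std.algorithm : reduce;

/// Joins `items` with `sep` using reduce; hypothetical helper for illustration.
string joinWithReduce(string[] items, string sep)
{
    // Seedless reduce needs at least one element, so handle "no items" up front.
    if (items.length == 0)
        return "";
    // Each step concatenates accumulator, separator, and next item.
    return reduce!((a, b) => a ~ sep ~ b)(items);
}

void main()
{
    assert(joinWithReduce(["the", "quick", "fox"], " ") == "the quick fox");
    assert(joinWithReduce([], " ") == "");
}
```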
Oct 12 2010
prev sibling next sibling parent Daniel Gibson <metalcaedes gmail.com> writes:
On Tue, Oct 12, 2010 at 6:37 AM, Daniel Gibson <metalcaedes gmail.com> wrote:
 Btw: Is "join" not just a (rather trivial) generalization of reduce?

 auto inRange = ...; // range of char[]
 char[] sep = " ";
 auto joined = reduce!( (char[] res, char[] x) {return res~sep~x;})
 (inRange);

Not generalization, I meant specialization. (I should probably go to bed.)
Oct 11 2010
prev sibling next sibling parent reply Pelle <pelle.mansson gmail.com> writes:
On 10/12/2010 02:33 AM, Andrei Alexandrescu wrote:
 I'm looking at http://d.puremagic.com/issues/show_bug.cgi?id=3313 and
 that got me looking at std.string.join, which currently has the sig:

 string join(in string[] words, string sep);

 A narrow fix:

 Char[] join(Char)(in Char[][] words, in Char[] sep)
 if (isSomeChar!Char);

 I think it's reasonable to assume that people would want to join things
 that aren't necessarily arrays of characters, so T could be pretty much
 any type. An obvious step towards generalization is:

 T[] join(T)(in T[][] items, T[] sep);

 But join doesn't really need random access for words - really, an input
 range should suffice. So a generally useful join, almost worth putting
 in std.algorithm, would be:

 ElementType!R1[] join(R1, R2)(R1 items, R2 sep)
 if (isInputRange!R1 && isForwardRange!R2
 && is(ElementType!R2 : ElementType!R1);

 Notice how the separator must be a forward range because it gets spanned
 multiple times, whereas the items need only be an input range as they
 are spanned once. This is at the same time a very general and very
 precise interface.

 One thing is still bothering me: the array output type. Why would the
 "default" output range be an array? What can be done to make join() at
 the same time a general function and also one that works for strings the
 way the old join did? For example, if I want to join things into an
 already-existing buffer, or if I want to write them straight to a file,
 there's no way to do so without having an array allocation in the loop.
 I have a couple of ideas but I wouldn't want to bias yours.

 I also have a question from people who dislike Phobos. Was there a point
 in the changes of signature above where you threw your hands thinking,
 "do the darn string version already and cut all that crap!"?


 Thanks,

 Andrei

I think the function signature should be more of isInputRange!R1 && isInputRange!(ElementType!R1), same with the is(). As the first one should be a range of ranges. I think this should be a lazy range of ElementType!(ElementType!R1), or perhaps the common type. No reason to be overly eager :-)
Oct 11 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/12/2010 01:25 AM, Pelle wrote:
 I think the function signature should be more of isInputRange!R1 &&
 isInputRange!(ElementType!R1), same with the is(). As the first one
 should be a range of ranges.

Correct. I figured out my mistake when I started playing with an implementation.
 I think this should be a lazy range of ElementType!(ElementType!R1), or
 perhaps the common type. No reason to be overly eager :-)

That's already present, see std.algorithm.joiner(). The problem with joiner() is that it's rather slow - there are a few tests for each element iterated. An eager join() is still necessary. Andrei
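A rough sketch of what the corrected, eager signature could look like for a range of ranges. `eagerJoin` is a hypothetical name chosen to avoid clashing with Phobos, and the example sticks to `int` elements on purpose: narrow strings would need extra care because their range element type is `dchar`.

```d
import std.array : appender;
import std.range.primitives : ElementType, isForwardRange, isInputRange, save;

/// Eager join over a range of ranges; a sketch, not the Phobos implementation.
ElementType!(ElementType!RoR)[] eagerJoin(RoR, Sep)(RoR items, Sep sep)
if (isInputRange!RoR && isInputRange!(ElementType!RoR) && isForwardRange!Sep
    && is(ElementType!Sep : ElementType!(ElementType!RoR)))
{
    auto result = appender!(ElementType!(ElementType!RoR)[])();
    bool first = true;
    foreach (item; items)
    {
        // The separator is spanned once per gap, hence the forward-range requirement.
        if (!first)
            foreach (e; sep.save) result.put(e);
        first = false;
        foreach (e; item) result.put(e);
    }
    return result.data;
}

void main()
{
    assert(eagerJoin([[1, 2], [3], [4, 5]], [0]) == [1, 2, 0, 3, 0, 4, 5]);
}
```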
Oct 12 2010
prev sibling next sibling parent reply Justin Johansson <no spam.com> writes:
On 12/10/2010 11:33 AM, Andrei Alexandrescu wrote:
 I'm looking at http://d.puremagic.com/issues/show_bug.cgi?id=3313 and
 that got me looking at std.string.join, which currently has the sig:

 string join(in string[] words, string sep);

 A narrow fix:

 Char[] join(Char)(in Char[][] words, in Char[] sep)
 if (isSomeChar!Char);

 I think it's reasonable to assume that people would want to join things
 that aren't necessarily arrays of characters, so T could be pretty much
 any type. An obvious step towards generalization is:

 T[] join(T)(in T[][] items, T[] sep);

 But join doesn't really need random access for words - really, an input
 range should suffice. So a generally useful join, almost worth putting
 in std.algorithm, would be:

 ElementType!R1[] join(R1, R2)(R1 items, R2 sep)
 if (isInputRange!R1 && isForwardRange!R2
 && is(ElementType!R2 : ElementType!R1);

 Notice how the separator must be a forward range because it gets spanned
 multiple times, whereas the items need only be an input range as they
 are spanned once. This is at the same time a very general and very
 precise interface.

 One thing is still bothering me: the array output type. Why would the
 "default" output range be an array? What can be done to make join() at
 the same time a general function and also one that works for strings the
 way the old join did? For example, if I want to join things into an
 already-existing buffer, or if I want to write them straight to a file,
 there's no way to do so without having an array allocation in the loop.
 I have a couple of ideas but I wouldn't want to bias yours.

 I also have a question from people who dislike Phobos. Was there a point
 in the changes of signature above where you threw your hands thinking,
 "do the darn string version already and cut all that crap!"?


 Thanks,

 Andrei

Yes, "do the darn string version already and cut all that crap".

This is probably the thing to do to make for familiarity among library users [of other languages].

However, if you have an urge to back-end the implementation of the colloquial "join" by your ideas, do not give up your dream. So long as it is implemented as your private dream no one will notice and you will remain internally satisfied. :-)

- JJ
Oct 12 2010
parent reply Justin Johansson <no spam.com> writes:
On 13/10/2010 1:28 AM, Justin Johansson wrote:
 Yes, "do the darn string version already and cut all that crap".

 This is probably the thing to do to make for familiarity among
 library users [of other languages].

 However, if you have an urge to back-end the implementation
 of the colloquial "join" by your ideas, do not give up your
 dream. So long as it is implemented as your private dream
 no one will notice and you will remain internally satisfied. :-)

 - JJ

I think I meant the "ubiquitous join" rather than the "colloquial join".
Oct 12 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/12/10 9:35 CDT, Justin Johansson wrote:
 On 13/10/2010 1:28 AM, Justin Johansson wrote:
 Yes, "do the darn string version already and cut all that crap".

 This is probably the thing to do to make for familiarity among
 library users [of other languages].

 However, if you have an urge to back-end the implementation
 of the colloquial "join" by your ideas, do not give up your
 dream. So long as it is implemented as your private dream
 no one will notice and you will remain internally satisfied. :-)

 - JJ

I think I meant the "ubiquitous join" rather than the "colloquial join".

By both I understand "join as in Python". Right? Question is, where to stop?

1. string only (i.e. leave as is)
2. const(char)[] only (to allow joining char[] values)
3. various width of char, i.e. why shouldn't you join an array of wstring?

From 3, the incremental effort to generalize to any type is virtually nonexistent, and the effort to generalize to ranges instead of arrays is minor. To me these are positives.

Andrei
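As an illustration of how little the call site changes across those steps, here is a sketch that assumes a present-day Phobos in which `std.array.join` accepts any of these element types:

```d
import std.array : join;

void main()
{
    // Same call shape for each character width.
    assert(join(["a", "b"], ", ") == "a, b");
    assert(join(["a"w, "b"w], ", "w) == "a, b"w);
    assert(join(["a"d, "b"d], ", "d) == "a, b"d);

    // Non-character element types work the same way.
    assert(join([[1, 2], [3]], [0]) == [1, 2, 0, 3]);
}
```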
Oct 12 2010
parent Justin Johansson <no spam.com> writes:
On 13/10/2010 2:02 AM, Andrei Alexandrescu wrote:
 On 10/12/10 9:35 CDT, Justin Johansson wrote:
 On 13/10/2010 1:28 AM, Justin Johansson wrote:
 Yes, "do the darn string version already and cut all that crap".

 This is probably the thing to do to make for familiarity among
 library users [of other languages].

 However, if you have an urge to back-end the implementation
 of the colloquial "join" by your ideas, do not give up your
 dream. So long as it is implemented as your private dream
 no one will notice and you will remain internally satisfied. :-)

 - JJ

I think I meant the "ubiquitous join" rather than the "colloquial join".

By both I understand "join as in Python". Right? Question is, where to stop?

1. string only (i.e. leave as is)
2. const(char)[] only (to allow joining char[] values)
3. various width of char, i.e. why shouldn't you join an array of wstring?

From 3, the incremental effort to generalize to any type is virtually nonexistent, and the effort to generalize to ranges instead of arrays is minor. To me these are positives.

Andrei

Yes, I agree from a range idiom point of view.

Now, while understanding that D people don't care much for the XPath 2.0 type system, and not myself caring much for the back-end implementation, my XPath-ish function signature for this join() function to preserve the generality that you suggest would be

item() join( things as item()*, separator as item()* );

Of course I'm anticipating an understanding of the above XPath 2.0 function signature syntax, and even then, I suspect my proposed signature to be too liberal.

Regards,
Justin
Oct 12 2010
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 11 Oct 2010 20:33:27 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 I'm looking at http://d.puremagic.com/issues/show_bug.cgi?id=3313 and  
 that got me looking at std.string.join, which currently has the sig:

 string join(in string[] words, string sep);

 A narrow fix:

 Char[] join(Char)(in Char[][] words, in Char[] sep)
 if (isSomeChar!Char);

 I think it's reasonable to assume that people would want to join things  
 that aren't necessarily arrays of characters, so T could be pretty much  
 any type. An obvious step towards generalization is:

 T[] join(T)(in T[][] items, T[] sep);

This doesn't quite work if T is not a value type (actually, I think it does, but only because there are bugs in the compiler).
 But join doesn't really need random access for words - really, an input  
 range should suffice. So a generally useful join, almost worth putting  
 in std.algorithm, would be:

 ElementType!R1[] join(R1, R2)(R1 items, R2 sep)
 if (isInputRange!R1 && isForwardRange!R2
      && is(ElementType!R2 : ElementType!R1);

 Notice how the separator must be a forward range because it gets spanned  
 multiple times, whereas the items need only be an input range as they  
 are spanned once. This is at the same time a very general and very  
 precise interface.

I think this is fine. Note that this does not take into account the constancy of items, meaning it is legal for this function to mess with the original data in items. Not that I think it's a bad thing, but it does lose some guarantees as compared to the original join. inout can't be used here because it doesn't work as a template parameter.
 One thing is still bothering me: the array output type. Why would the  
 "default" output range be an array? What can be done to make join() at  
 the same time a general function and also one that works for strings the  
 way the old join did? For example, if I want to join things into an  
 already-existing buffer, or if I want to write them straight to a file,  
 there's no way to do so without having an array allocation in the loop.  
 I have a couple of ideas but I wouldn't want to bias yours.

Well, one could have a version of join that takes an output range. It would have to return the output range instead of the *result* of the output range. And in that case, the standard join which returns an array can be implemented:

ElementType!R1[] join(R1 items, R2 sep) ...
{
    return join(R1, R2, Appender!(ElementType!R1)).data;
}
 I also have a question from people who dislike Phobos. Was there a point  
 in the changes of signature above where you threw your hands thinking,  
 "do the darn string version already and cut all that crap!"?

It's not a problem with phobos, it's a problem with documentation. There is a fundamental issue with documenting complex templates which makes function signatures very difficult to understand. The doc generator can and should simplify things, and I think at some point we should address this. In other words, it should be transformed into a form that's easy to see that it's the same as string[] join(string[][], string[]). -Steve
Oct 13 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/13/10 14:03 CDT, Steven Schveighoffer wrote:
 On Mon, 11 Oct 2010 20:33:27 -0400, Andrei Alexandrescu
 T[] join(T)(in T[][] items, T[] sep);

This doesn't quite work if T is not a value type (actually, I think it does, but only because there are bugs in the compiler).

My focus in this discussion is not the const aspect, but point taken.
 Well, one could have a version of join that takes an output range. It
 would have to return the output range instead of the *result* of the
 output range. And in that case, the standard join which returns an array
 can be implemented:

 ElementType!R1[] join(R1 items, R2 sep) ...
 {
 return join(R1, R2, Appender!(ElementType!R1)).data;
 }

Yah, I had a similar idea:

void join(In1, In2, Out)(In1 items, In2 sep, Out target);

as an overload.
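A sketch of how such an output-range overload and the array-returning version might fit together. The names `joinInto` and `joinToArray` are hypothetical, template constraints are omitted for brevity, and `target` is taken by `ref` here (an assumption, not part of the signature above) so that the caller sees what was written.

```d
import std.array : appender;
import std.range.primitives : put, save;

/// Writes the joined elements into `target`; hypothetical output-range overload.
void joinInto(In1, In2, Out)(In1 items, In2 sep, ref Out target)
{
    bool first = true;
    foreach (item; items)
    {
        // Emit the separator before every item except the first.
        if (!first)
            foreach (e; sep.save) put(target, e);
        first = false;
        foreach (e; item) put(target, e);
    }
}

/// Array-returning convenience built on top of the output-range version.
T[] joinToArray(T)(T[][] items, T[] sep)
{
    auto app = appender!(T[])();
    joinInto(items, sep, app);
    return app.data;
}

void main()
{
    assert(joinToArray([[1, 2], [3]], [0]) == [1, 2, 0, 3]);
}
```

Any other output range (a preallocated buffer wrapper, a file writer) could be passed to `joinInto` in place of the appender, which is the flexibility the thread is asking for.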
 I also have a question from people who dislike Phobos. Was there a
 point in the changes of signature above where you threw your hands
 thinking, "do the darn string version already and cut all that crap!"?

It's not a problem with phobos, it's a problem with documentation. There is a fundamental issue with documenting complex templates which makes function signatures very difficult to understand. The doc generator can and should simplify things, and I think at some point we should address this. In other words, it should be transformed into a form that's easy to see that it's the same as string[] join(string[][], string[]).

Good point. On the other hand, an overly simplified documentation might hinder a good deal of legit uses for advanced users. I wonder how to please everyone. Andrei
Oct 13 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/14/10 8:44 CDT, Steven Schveighoffer wrote:
 On Thu, 14 Oct 2010 06:53:35 -0400, Gerrit Wichert <gwichert yahoo.com>
 wrote:

 Am 13.10.2010 22:07, schrieb Andrei Alexandrescu:
 Good point. On the other hand, an overly simplified documentation
 might hinder a good deal of legit uses for advanced users. I wonder
 how to please everyone.

I think the best way to explain the usage of a feature are *working* code-examples. Maybe it's possible to have a special unit-test block named such as 'example'. The compiler can completely ignore such sections or just syntax check them, or ... . For doc generation they are just taken as they are and put into (or linked to) the documentation. It may be even possible for the doc generator to compile and run these samples, so they become some kind of unit test and their possible output can be part of the documentation. Just an idea that comes to my mind

I really *really* like this idea. Documentation examples are almost as important as unit tests. Can you start a new thread on this? -Steve

I've asked Walter many times for

/// Example
unittest
{
    ...
}

such that the code of the unittest will also appear as a documentation example. It would be a huge improvement in both test coverage and documentation, but it never made it to the top of the list.

Andrei
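For illustration, this is roughly how the idea looks in later D compilers, which support documented unittests: a `unittest` block that carries a doc comment and directly follows a documented declaration is emitted by Ddoc as an example for that declaration. The function `add` is a hypothetical placeholder.

```d
/// Returns the sum of `a` and `b`.
int add(int a, int b)
{
    return a + b;
}

/// This block is both run as a test and shown as an example in the docs.
unittest
{
    assert(add(2, 3) == 5);
}

void main()
{
}
```

Building with `dmd -unittest -D` runs the test and generates the documentation from the same source, so the example cannot drift out of sync with the code.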
Oct 14 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/14/10 21:03 CDT, Brad Roberts wrote:
 On 10/14/2010 7:53 AM, Andrei Alexandrescu wrote:
 I've asked Walter many times for

Doesn't need to be Walter to implement it...

Yup, could be Don :o). Andrei
Oct 14 2010
prev sibling parent Brad Roberts <braddr puremagic.com> writes:
On 10/14/2010 7:53 AM, Andrei Alexandrescu wrote:
 
 I've asked Walter many times for
 

Doesn't need to be Walter to implement it...
Oct 14 2010
prev sibling parent Gerrit Wichert <gwichert yahoo.com> writes:
Am 13.10.2010 22:07, schrieb Andrei Alexandrescu:
 Good point. On the other hand, an overly simplified documentation
 might hinder a good deal of legit uses for advanced users. I wonder
 how to please everyone.

I think the best way to explain the usage of a feature are *working* code-examples. Maybe it's possible to have a special unit-test block named such as 'example'. The compiler can completely ignore such sections or just syntax check them, or ... . For doc generation they are just taken as they are and put into (or linked to) the documentation. It may be even possible for the doc generator to compile and run these samples, so they become some kind of unit test and their possible output can be part of the documentation. Just an idea that comes to my mind

Gerrit
Oct 14 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 13 Oct 2010 16:07:46 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 It's not a problem with phobos, it's a problem with documentation. There
 is a fundamental issue with documenting complex templates which makes
 function signatures very difficult to understand. The doc generator can
 and should simplify things, and I think at some point we should address
 this. In other words, it should be transformed into a form that's easy
 to see that it's the same as string[] join(string[][], string[]).

Good point. On the other hand, an overly simplified documentation might hinder a good deal of legit uses for advanced users. I wonder how to please everyone.

Even though I consider myself a reasonable parser of function templates, sometimes in std.algorithm, I'll stare at a function signature for about 10 minutes trying to figure out whether I can do what I want, give up and finally just try to compile it. I think what might help is spelling out the constraints somehow and especially explaining how the alias parameters work. They are some sort of black magic I don't always understand :) -Steve
Oct 13 2010
prev sibling next sibling parent Juanjo Alvarez <fake fakeemail.com> writes:
On Wed, 13 Oct 2010 16:42:35 -0400, "Steven Schveighoffer" 
<schveiguy yahoo.com> wrote:
 Even though I consider myself a reasonable parser of function templates,
 sometimes in std.algorithm, I'll stare at a function signature for about
 10 minutes trying to figure out whether I can do what I want, give up and
 finally just try to compile it.

 I think what might help is spelling out the constraints somehow and
 especially explaining how the alias parameters work.  They are some sort
 of black magic I don't always understand :)

Glad to see I'm not the only one :) The asserts help a lot there; I understood that module better looking at them than with the signatures. Adding more (or just adding some where they're missing). Template constraints are something that could definitely kill more trees in future editions of TDPL.
Oct 13 2010
prev sibling next sibling parent Juanjo Alvarez <fake fakeemail.com> writes:
On Thu, 14 Oct 2010 01:30:42 +0200, Juanjo Alvarez 
<fake fakeemail.com> wrote:
 signatures. Adding more (or just adding some where they're missing).

Truncated sentence, I wanted to say that adding more asserts would not hurt.
Oct 13 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 14 Oct 2010 06:53:35 -0400, Gerrit Wichert <gwichert yahoo.com>  
wrote:

 Am 13.10.2010 22:07, schrieb Andrei Alexandrescu:
 Good point. On the other hand, an overly simplified documentation
 might hinder a good deal of legit uses for advanced users. I wonder
 how to please everyone.

I think the best way to explain the usage of a feature are *working* code-examples. Maybe it's possible to have a special unit-test block named such as 'example'. The compiler can completely ignore such sections or just syntax check them, or ... . For doc generation they are just taken as they are and put into (or linked to) the documentation. It may be even possible for the doc generator to compile and run these samples, so they become some kind of unit test and their possible output can be part of the documentation. Just an idea that comes to my mind

I really *really* like this idea. Documentation examples are almost as important as unit tests. Can you start a new thread on this? -Steve
Oct 14 2010
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, October 14, 2010 07:53:24 Andrei Alexandrescu wrote:
 On 10/14/10 8:44 CDT, Steven Schveighoffer wrote:
 On Thu, 14 Oct 2010 06:53:35 -0400, Gerrit Wichert <gwichert yahoo.com>
 
 wrote:
 Am 13.10.2010 22:07, schrieb Andrei Alexandrescu:
 Good point. On the other hand, an overly simplified documentation
 might hinder a good deal of legit uses for advanced users. I wonder
 how to please everyone.

I think the best way to explain the usage of a feature are *working* code-examples. Maybe it's possible to have a special unit-test block named such as 'example'. The compiler can completely ignore such sections or just syntax check them, or ... . For doc generation they are just taken as they are and put into (or linked to) the documentation. It may be even possible for the doc generator to compile and run these samples, so they become some kind of unit test and their possible output can be part of the documentation. Just an idea that comes to my mind

I really *really* like this idea. Documentation examples are almost as important as unit tests. Can you start a new thread on this? -Steve

I've asked Walter many times for

/// Example
unittest
{
    ...
}

such that the code of the unittest will also appear as a documentation example. It would be a huge improvement in both test coverage and documentation, but it never made it to the top of the list.

Andrei

It would certainly be an improvement to have something like that. In the datetime code that I've been working on, I've specifically been putting the examples for a function in the unittest block for that function, but then you have to remember to put it there and keep them in sync. It would be far less error prone if there was a way to automatically create examples from unit tests or to create unit tests from examples.

- Jonathan M Davis
Oct 14 2010
prev sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Thu, 14 Oct 2010 09:44:19 -0400, Steven Schveighoffer wrote:

 On Thu, 14 Oct 2010 06:53:35 -0400, Gerrit Wichert <gwichert yahoo.com>
 wrote:
 
 Am 13.10.2010 22:07, schrieb Andrei Alexandrescu:
 Good point. On the other hand, an overly simplified documentation
 might hinder a good deal of legit uses for advanced users. I wonder
 how to please everyone.

I think the best way to explain the usage of a feature are *working* code-examples. Maybe it's possible to have a special unit-test block named such as 'example'. The compiler can completely ignore such sections or just syntax check them, or ... . For doc generation they are just taken as they are and put into (or linked to) the documentation. It may be even possible for the doc generator to compile and run these samples, so they become some kind of unit test and their possible output can be part of the documentation. Just an idea that comes to my mind

I really *really* like this idea. Documentation examples are almost as important as unit tests. Can you start a new thread on this? -Steve

I agree, this would be awesome. Keeping doc examples in sync with the unittests is a pain. -Lars
Oct 14 2010