digitalmars.D - Another day in the ordeal of cartesianProduct

H. S. Teoh (55/55) Oct 26 2012 http://d.puremagic.com/issues/show_bug.cgi?id=8900

Peter Alexander (32/42) Oct 27 2012 Unit tests are not the solution to this. There are too many range

bearophile (14/20) Oct 27 2012 It's not an easy problem. To solve it the Rust language uses

Peter Alexander (10/23) Oct 27 2012 Yeah, it's certainly not going to be easy. It's unfortunate that

Andrei Alexandrescu (5/13) Oct 27 2012 It's not clear to me at all. The mechanism works very well and is more

H. S. Teoh (41/56) Oct 27 2012 [...]

xenon325 (18/55) Oct 29 2012 We've had a short discussion with Jacob Carlborg about almost

Peter Alexander (34/42) Oct 28 2012 I'm not sure I can agree it works well.

monarch_dodra (11/13) Oct 28 2012 Actually, before 880, you had an infinite loop, as "size_t.max"

monarch_dodra (7/8) Oct 28 2012 Yeah, sorry. I was saying that by just doing unit tests that do

Andrei Alexandrescu (16/53) Oct 29 2012 This proves nothing at all. So this has to do with invoking walkLength

Peter Alexander (34/80) Oct 29 2012 You're misunderstanding. walkLength used to allow infinite

Andrei Alexandrescu (20/91) Oct 29 2012 Yes, but what gets ignored here is that typeclasses have a large

bearophile (15/19) Oct 29 2012 Such costs are an interesting discussion topic :-)

Andrei Alexandrescu (6/18) Oct 29 2012 For those who wouldn't know how to search the Net, these indeed are

bearophile (27/32) Oct 29 2012 People are often a bit lazy in clicking on links present in

Peter Alexander (3/4) Oct 29 2012 Ok. There is no point continuing this discussion then.

Andrei Alexandrescu (4/21) Oct 27 2012 No, this has nothing to do with the problem at hand.
Johannes Pfau (20/35) Oct 27 2012 Yes and no. We can't test all possible combinations, but we should try

Andrei Alexandrescu (11/29) Oct 27 2012 Often when there are many issues caused by a specific artifact, there
monarch_dodra (4/5) Oct 28 2012 The best we can get are users like you that take the time to
Don Clugston (7/60) Oct 29 2012 I think that unit tests aren't very effective without code coverage.

Peter Alexander (15/21) Oct 29 2012 That's a good step forward, but I don't think it solves (what

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

http://d.puremagic.com/issues/show_bug.cgi?id=8900

:-(

(The code there is called cartesianProd but it's the reduced code, so it
doesn't really compute the cartesian product. But that's where it's
from.)

So far, the outstanding blockers for cartesianProduct are:
1) Compiler bug which causes unittest failure:

	std/range.d(4629): Error: variable lower used before set
	std/range.d(4630): Error: variable upper used before set

(Jonathan had a pull request with a Phobos workaround for this, which I
_think_ is already merged, but the autotester is still failing at this
point. :-/)

2) Issue 8542 (crosstalk between template instantiations)

3) And now, issue 8900 (zip fails to compile with repeat(char[]))

So there's still no joy for cartesianProduct. :-(

I'm getting a bit frustrated with the Phobos bugs related to ranges and
std.algorithm. I think we need to increase the number of unittests. And
by that I mean, GREATLY increase the number of unittests. Most of the
current tests are merely sanity tests for the most common usage
patterns, most basic types, or tests added as a result of fixed bugs.

This is inadequate.

We need to actively unittest corner cases, rare combinations, unusual
usages, etc.. Torture test various combinations of range constructs.
Algorithms. Nested range constructs. Nested algorithms. Deliberately
cook up nasty tests that try their best to break the code by using
unusual parameters, unusual range-like objects, strange data, etc.. Go
beyond the simple cases to test non-trivial things.  We need unittests
that pass unusual structs and objects into the range constructs and
algorithms, and make sure they actually work as we have been _assuming_
they should.

I have a feeling there are a LOT of bugs lurking in there behind
overlooked corner cases, off by 1 errors, and other such careless slips,
as well as code that only works for basic types like arrays, which
starts breaking when you hand it something non-trivial.  All these
issues must be weeded out and prevented from slipping back in.

Here's a start:

- Create a set of structs/classes (inside a version(unittest) block)
  that are input, forward, bidirectional, output, etc., ranges, that are
  NOT merely arrays.

- There should be some easy way, perhaps using std.random, of creating
  non-trivial instances of these things.  These should be put in a
  separate place, perhaps outside the std/ subdirectory, where they can
  be imported into unittest blocks by std.range, std.algorithm, whatever
  else that needs extensive testing.

- Use these ranges as input for testing range constructs and algorithms.

- For best results, use a compile-time loop to loop over a given
  combination of these range types, and run them through the same set of
  tests. This will improve the currently spotty test coverage.  Perhaps
  provide some templated functions that, given a set of range types
  (from the above structs/classes) and a set of functions, run through
  all combinations of them to make sure they all work. (We run unittests
  separately anyway, we aren't afraid of long-running tests.)


T

-- 
"How are you doing?" "Doing what?"

Oct 26 2012

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Friday, 26 October 2012 at 22:43:44 UTC, H. S. Teoh wrote:
 I'm getting a bit frustrated with the Phobos bugs related to 
 ranges and
 std.algorithm. I think we need to increase the number of 
 unittests. And
 by that I mean, GREATLY increase the number of unittests. Most 
 of the
 current tests are merely sanity tests for the most common usage
 patterns, most basic types, or tests added as a result of fixed 
 bugs.

 This is inadequate.

Unit tests are not the solution to this. There are too many range 
combinations and types to effectively unit test everything. What 
we need is proper semantic support for ranges (and other 
conceptual classes of types). For example, this should not 
compile:

void foo(Range)(Range r) if (isForwardRange!Range)
{
     r.popBack();
}

But it will, because of the way templates work in D -- nothing is 
checked until you try to use it.

Imagine I could compile non-template code like this:

void foo(int x)
{
     writeln(x.name);
}

Imagine you had to try to call it before the static type error 
was caught. This is not a way to write robust code, especially 
with the combinatorial explosion of range types that exist.

Template constraints are nice, but not enough. We really need 
something *like* the proposed C++ concepts, or just something 
analogous to interfaces. I should be able to write something like 
this:

void foo(ForwardRange Range)(Range r)
{
     r.popBack();
}

And *immediately* get a type error without having to instantiate 
it with a variety of different range types. This is the only way 
I can see these kinds of problems going away. Unit testing does 
not scale with exponential use cases.

Oct 27 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Peter Alexander:

 void foo(ForwardRange Range)(Range r)
 {
     r.popBack();
 }

 And *immediately* get a type error without having to 
 instantiate it with a variety of different range types.

It's not an easy problem. To solve it the Rust language uses 
typeclasses adapted from Haskell. But unlike the usual Haskell 
compilers Rust doesn't use global type inferencing and it keeps 
the C++-style monomorphization, so at run-time its generic 
programming is as efficient as C++ generic programming.

I also hope Rust designers will take a look at this (Efficient 
Dynamic Dispatch without Virtual Function Tables. The SmallEiffel 
Compiler), an alternative to virtual tables. Maybe the D 
front-end could do the same on request with a compilation switch 
(with no changes in D language):

http://smarteiffel.loria.fr/papers/oopsla97.pdf

Bye,
bearophile

Oct 27 2012

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Saturday, 27 October 2012 at 11:41:07 UTC, bearophile wrote:
 Peter Alexander:

 void foo(ForwardRange Range)(Range r)
 {
    r.popBack();
 }

 And *immediately* get a type error without having to 
 instantiate it with a variety of different range types.

 It's not an easy problem. To solve it the Rust language uses 
 typeclasses adapted from Haskell. But unlike the usual Haskell 
 compilers Rust doesn't use global type inferencing and it keeps 
 the C++-style monomorphization, so at run-time its generic 
 programming is as efficient as C++ generic programming.

Yeah, it's certainly not going to be easy. It's unfortunate that 
D adopted the whole C++ style "glorified macros" approach to 
templates -- it makes it very difficult to reason about (or 
automate reasoning about) the semantics of your code.

Retrofitting some sort of structure to templates will be a 
Herculean task, but I think it has to happen. It is clear to me 
that the development process we use now (write the template, try 
a few instantiations, pray) is unsustainable beyond simple 
templates.

Oct 27 2012

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/27/12 8:23 AM, Peter Alexander wrote:
 Yeah, it's certainly not going to be easy. It's unfortunate that D
 adopted the whole C++ style "glorified macros" approach to templates --
 it makes it very difficult to reason about (or automate reasoning about)
 the semantics of your code.

I think D has reached a sweet spot with restricted templates.

 Retrofitting some sort of structure to templates will be a Herculean
 task, but I think it has to happen. It is clear to me that the
 development process we use now (write the template, try a few
 instantiations, pray) is unsustainable beyond simple templates.

It's not clear to me at all. The mechanism works very well and is more 
expressive than alternatives used by other languages.


Andrei

Oct 27 2012

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sat, Oct 27, 2012 at 09:06:11AM -0400, Andrei Alexandrescu wrote:
 On 10/27/12 8:23 AM, Peter Alexander wrote:
Yeah, it's certainly not going to be easy. It's unfortunate that D
adopted the whole C++ style "glorified macros" approach to templates
-- it makes it very difficult to reason about (or automate reasoning
about) the semantics of your code.

 
 I think D has reached a sweet spot with restricted templates.
 
Retrofitting some sort of structure to templates will be a Herculean
task, but I think it has to happen. It is clear to me that the
development process we use now (write the template, try a few
instantiations, pray) is unsustainable beyond simple templates.

 
 It's not clear to me at all. The mechanism works very well and is more
 expressive than alternatives used by other languages.

[...]

I think the complaint is that the template constraints enforce ducktype
requirements on the *caller*, but not on the template code itself.
That's what Peter's example was about: the signature constraint declares
that the template takes an input range, yet the template body makes use
of non-input range properties of the argument.

I think there is some merit to being able to declare concepts; for
example:

	// This concept matches any type that has fields that satisfy
	// what's specified inside the body.
	concept InputRange(T) {
		bool empty;	// this matches  property bool empty() too
		T front;
		void popFront();
	}

	auto myRangeBasedFunc(InputRange r) {
		r.popBack();	// compile-time error: InputRange does
				// not define .popBack
	}

This way, *both* the user of the template and the template writer are
kept "honest" (i.e., they both have to conform to the requirements of an
input range). The compiler can thus statically check the template for
correctness *before* it ever gets instantiated.

Not only so, the compiler will be able to generate a meaningful error
message when the templates fail to match -- it can tell the user "the
template didn't match 'cos struct X that you tried to pass to it isn't
an InputRange, nor a ForwardRange, ... etc.".

Currently, if the template writer screws up, the compiler doesn't know
any better, and only when the user actually tries to pass something that
the template body can't handle, does the error manifest itself. Which
seems a bit late -- it should be the template writer who first notices
the problem (like if the compiler can statically verify the template
body and complain loudly before the code ever gets to the user). Plus,
the error message is usually a long cryptic sequence of instantiation
failures containing all sorts of internal Phobos types that the user has
no idea about. The compiler can hardly do much better, because it
doesn't have enough information to make a meaningful message.


T

-- 
Старый друг лучше новых двух.

Oct 27 2012

"xenon325" <1 a.net> writes:

On Saturday, 27 October 2012 at 16:19:43 UTC, H. S. Teoh wrote:
 [snip]
 I think there is some merit to being able to declare concepts; 
 for
 example:

 	// This concept matches any type that has fields that satisfy
 	// what's specified inside the body.
 	concept InputRange(T) {
 		bool empty;	// this matches  property bool empty() too
 		T front;
 		void popFront();
 	}

 	auto myRangeBasedFunc(InputRange r) {
 		r.popBack();	// compile-time error: InputRange does
 				// not define .popBack
 	}

 This way, *both* the user of the template and the template 
 writer are
 kept "honest" (i.e., they both have to conform to the 
 requirements of an
 input range). The compiler can thus statically check the 
 template for
 correctness *before* it ever gets instantiated.

 Not only so, the compiler will be able to generate a meaningful 
 error
 message when the templates fail to match -- it can tell the 
 user "the
 template didn't match 'cos struct X that you tried to pass to 
 it isn't
 an InputRange, nor a ForwardRange, ... etc.".

We've had a short discussion with Jacob Carlborg about almost 
exactly this syntax 
(http://forum.dlang.org/post/jukabm$1btd$1 digitalmars.com) in 
the context of better compiler messages.

I will quote my concerns:

=====

 void foo (InputRange range);

1. How would you signify `foo` is generic ?
    For complier it's probably possible - by type of `InputRange`,
         but that will be one more special case
    What about user ?

2. How would you put 2 constraints on `range` ?

3. Also you forgot to address:

 template isInputRange(R)
 {
      enum bool isInputRange = is(typeof(
      (inout int _dummy=0)
      {
          R r = void;  /// how would you express this
                       /// b.t.w. ?


    range.d comment on this line is "can define a range object".

===== end quote

However I do agree it would be nice if compiler verifies template 
body against constraints. IMNA compiler-writer, but I wonder if 
it's possible with current syntax.

Oct 29 2012

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Saturday, 27 October 2012 at 13:06:09 UTC, Andrei Alexandrescu 
wrote:
 On 10/27/12 8:23 AM, Peter Alexander wrote:
 Retrofitting some sort of structure to templates will be a 
 Herculean
 task, but I think it has to happen. It is clear to me that the
 development process we use now (write the template, try a few
 instantiations, pray) is unsustainable beyond simple templates.

 It's not clear to me at all. The mechanism works very well and 
 is more expressive than alternatives used by other languages.

I'm not sure I can agree it works well.

For example, here's what happened with bug 8900 mentioned in the 
OP:

std.range.zip creates a Zip object, which has a Tuple member. 
Tuple has a toString function, which calls formatElement, which 
calls formatValue, which calls formatRange, which (when there's a 
range of characters) has a code path for right-aligning the 
range. To right-align the range it needs to call walkLength.

The problem arises when you zip an infinite range of characters 
e.g. repeat('a').

Before pull request 880, walkLength accepted infinite ranges and 
just returned size_t.max. Pull request 880 added a constraint to 
walkLength to stop it accepting infinite ranges. Suddenly, you 
cannot zip a range of infinite chars because of a seemingly 
unrelated change.

This would have been caught if there was a unit test, but there 
wasn't, and as Dijkstra says, "testing can be a very effective 
way to show the presence of bugs, but it is hopelessly inadequate 
for showing their absence." There are probably other places that 
are broken, and many changes in the future will just introduce 
more bugs without tests.

Maybe we have different standards for "working well", but to me 
at least, this isn't what "working well" looks like.

Working well in this case would look like this:

- The person that put together pull request 880 would add the 
template constraint to walkLength.
- On the next compile he would get this error: "formatRange 
potentially calls walkLength with an infinite range." (or 
something along those lines).
- The person fixes formatRange, and all is well.

No need for unit tests, it's all caught as soon as possible 
without need for instantiation.

Oct 28 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Sunday, 28 October 2012 at 12:28:23 UTC, Peter Alexander wrote:
 Before pull request 880, walkLength accepted infinite ranges 
 and just returned size_t.max.

Actually, before 880, you had an infinite loop, as "size_t.max" 
was a magic number that meant "walk to end"...

----

I've found recently trying to do coverage for std.container and 
others, that an "efficient" method of unittesting is to just try 
to instanciate the template in as many forms as possible. While 
you aren't actually verifying any run-time behavior, you ARE 
finding a lot of compile errors.

I haven't commited any of them, but I have been able to detect 
and fix quite some errors. They are [keyboar died]

Oct 28 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Sunday, 28 October 2012 at 14:11:41 UTC, monarch_dodra wrote:
 ...[keyboar died]

Yeah, sorry. I was saying that by just doing unit tests that do
nothing more than declare the templates, you can get some very
good compile coverage.

It is not optimal, but it gives a good "bang for the buck", in
terms of invested effort (very little, just typing the types, but 
not testing anything).

Oct 28 2012

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/28/12 8:28 AM, Peter Alexander wrote:
 On Saturday, 27 October 2012 at 13:06:09 UTC, Andrei Alexandrescu wrote:
 On 10/27/12 8:23 AM, Peter Alexander wrote:
 Retrofitting some sort of structure to templates will be a Herculean
 task, but I think it has to happen. It is clear to me that the
 development process we use now (write the template, try a few
 instantiations, pray) is unsustainable beyond simple templates.

 It's not clear to me at all. The mechanism works very well and is more
 expressive than alternatives used by other languages.

 I'm not sure I can agree it works well.

 For example, here's what happened with bug 8900 mentioned in the OP:

 std.range.zip creates a Zip object, which has a Tuple member. Tuple has
 a toString function, which calls formatElement, which calls formatValue,
 which calls formatRange, which (when there's a range of characters) has
 a code path for right-aligning the range. To right-align the range it
 needs to call walkLength.

 The problem arises when you zip an infinite range of characters e.g.
 repeat('a').

This proves nothing at all. So this has to do with invoking walkLength 
against an infinite range. At the time I wrote walkLength, infinite 
ranges were an experimental notion that I was ready to remove if there 
wasn't enough practical support for it. So I didn't even think of the 
connection, which means the restriction wouldn't have likely made it 
into the definition of walkLength regardless of the formalism used.

 Before pull request 880, walkLength accepted infinite ranges and just
 returned size_t.max. Pull request 880 added a constraint to walkLength
 to stop it accepting infinite ranges. Suddenly, you cannot zip a range
 of infinite chars because of a seemingly unrelated change.

The connection is obvious and is independent qualitatively of other 
cases of "if you change A and B uses it, B may change in behavior too". 
It's a pattern old as dust in programming.

Anyway, I'm not sure whether this is clear as day: expressing 
constraints as Booleans or "C++ concepts" style or Gangnam style doesn't 
influence this case in the least.

 This would have been caught if there was a unit test, but there wasn't,
 and as Dijkstra says, "testing can be a very effective way to show the
 presence of bugs, but it is hopelessly inadequate for showing their
 absence." There are probably other places that are broken, and many
 changes in the future will just introduce more bugs without tests.

 Maybe we have different standards for "working well", but to me at
 least, this isn't what "working well" looks like.

 Working well in this case would look like this:

 - The person that put together pull request 880 would add the template
 constraint to walkLength.
 - On the next compile he would get this error: "formatRange potentially
 calls walkLength with an infinite range." (or something along those lines).
 - The person fixes formatRange, and all is well.

 No need for unit tests, it's all caught as soon as possible without need
 for instantiation.

But this works today and has nothing to do with "retrofitting structure 
to templates". Nothing. Nothing.


Andrei

Oct 29 2012

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Monday, 29 October 2012 at 15:48:11 UTC, Andrei Alexandrescu 
wrote:
 On 10/28/12 8:28 AM, Peter Alexander wrote:
 For example, here's what happened with bug 8900 mentioned in 
 the OP:

 std.range.zip creates a Zip object, which has a Tuple member. 
 Tuple has
 a toString function, which calls formatElement, which calls 
 formatValue,
 which calls formatRange, which (when there's a range of 
 characters) has
 a code path for right-aligning the range. To right-align the 
 range it
 needs to call walkLength.

 The problem arises when you zip an infinite range of 
 characters e.g.
 repeat('a').

 This proves nothing at all. So this has to do with invoking 
 walkLength against an infinite range. At the time I wrote 
 walkLength, infinite ranges were an experimental notion that I 
 was ready to remove if there wasn't enough practical support 
 for it. So I didn't even think of the connection, which means 
 the restriction wouldn't have likely made it into the 
 definition of walkLength regardless of the formalism used.

You're misunderstanding. walkLength used to allow infinite 
ranges. Recently, a commit added a constraint to walkLength to 
disallow infinite ranges. After this commit, all the unit tests 
still passed, but at least one bug was introduced (bug 8900).

That's the problem: a change occurred that introduced a bug, but 
the type system failed to catch it before the change was 
committed. Something like typeclasses would have caught the bug 
before commit and without unit tests.


 The connection is obvious and is independent qualitatively of 
 other cases of "if you change A and B uses it, B may change in 
 behavior too". It's a pattern old as dust in programming.

 Anyway, I'm not sure whether this is clear as day: expressing 
 constraints as Booleans or "C++ concepts" style or Gangnam 
 style doesn't influence this case in the least.

If I change A and B uses it, I expect B to give an error or at 
least a warning at compile time where possible. This doesn't 
happen. With template constraints, you don't get an error until 
you try to instantiate the template. This is too late in my 
opinion.

I would like this to give an error:

void foo(R)(R r) if (isForwardRange!R) { r.popBack(); }

It doesn't, not until you try to use it at least, and even then 
it only gives you an error if you try it with a non-bidirectional 
forward range. If this did give an error, bug 8900 (any many 
others) would never have happened.

The problem with constraints vs. something like typeclasses or 
C++ concepts is that constraint predicates are not possible to 
enforce pre-instantiation. They have too much freedom of 
expression.


 Working well in this case would look like this:

 - The person that put together pull request 880 would add the 
 template
 constraint to walkLength.
 - On the next compile he would get this error: "formatRange 
 potentially
 calls walkLength with an infinite range." (or something along 
 those lines).
 - The person fixes formatRange, and all is well.

 No need for unit tests, it's all caught as soon as possible 
 without need
 for instantiation.

 But this works today and has nothing to do with "retrofitting 
 structure to templates". Nothing. Nothing.

It doesn't work today.

This isn't a fabricated example. This happened. walkLength 
changed its constraint, everything still compiled, and all the 
unit tests passed. There was no error, no hint that things were 
broken, nothing. Problems only started to arise when the poor OP 
tried to implement cartesianProduct.

This should never have happened. Typeclasses or C++ concepts 
wouldn't have allowed it to happen. This is the kind of structure 
that templates need.

Oct 29 2012

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/29/12 12:21 PM, Peter Alexander wrote:
 On Monday, 29 October 2012 at 15:48:11 UTC, Andrei Alexandrescu wrote:
 On 10/28/12 8:28 AM, Peter Alexander wrote:
 For example, here's what happened with bug 8900 mentioned in the OP:

 std.range.zip creates a Zip object, which has a Tuple member. Tuple has
 a toString function, which calls formatElement, which calls formatValue,
 which calls formatRange, which (when there's a range of characters) has
 a code path for right-aligning the range. To right-align the range it
 needs to call walkLength.

 The problem arises when you zip an infinite range of characters e.g.
 repeat('a').

 This proves nothing at all. So this has to do with invoking walkLength
 against an infinite range. At the time I wrote walkLength, infinite
 ranges were an experimental notion that I was ready to remove if there
 wasn't enough practical support for it. So I didn't even think of the
 connection, which means the restriction wouldn't have likely made it
 into the definition of walkLength regardless of the formalism used.

 You're misunderstanding. walkLength used to allow infinite ranges.
 Recently, a commit added a constraint to walkLength to disallow infinite
 ranges. After this commit, all the unit tests still passed, but at least
 one bug was introduced (bug 8900).

I thought I understood the matter rather well.

 That's the problem: a change occurred that introduced a bug, but the
 type system failed to catch it before the change was committed.
 Something like typeclasses would have caught the bug before commit and
 without unit tests.

Yes, but what gets ignored here is that typeclasses have a large 
cognitive cost to everyone involved. I think typeclasses generally don't 
pull their weight. Besides I think template constraints are more 
powerful because they operate on arbitrary Boolean expressions instead 
of types.

 The connection is obvious and is independent qualitatively of other
 cases of "if you change A and B uses it, B may change in behavior
 too". It's a pattern old as dust in programming.

 Anyway, I'm not sure whether this is clear as day: expressing
 constraints as Booleans or "C++ concepts" style or Gangnam style
 doesn't influence this case in the least.

 If I change A and B uses it, I expect B to give an error or at least a
 warning at compile time where possible. This doesn't happen. With
 template constraints, you don't get an error until you try to
 instantiate the template.

That's also at compile time, just a tad later.

 This is too late in my opinion.

I think there's a marked difference between compile-time and run-time. 
Instantiation time does not make a big enough difference to bring a big 
gun in the mix.

 I would like this to give an error:

 void foo(R)(R r) if (isForwardRange!R) { r.popBack(); }

 It doesn't, not until you try to use it at least, and even then it only
 gives you an error if you try it with a non-bidirectional forward range.

So them a unittest with a minimal mock forward range should be created. 
I understand your concern, but please understand that typeclasses are 
too big a weight for what they do.

 If this did give an error, bug 8900 (any many others) would never have
 happened.

How many others? (Honest question.)

 The problem with constraints vs. something like typeclasses or C++
 concepts is that constraint predicates are not possible to enforce
 pre-instantiation. They have too much freedom of expression.

Freedom of expression is also a strength. (The problem with C++ concepts 
is that they almost sunk C++.)

 Working well in this case would look like this:

 - The person that put together pull request 880 would add the template
 constraint to walkLength.
 - On the next compile he would get this error: "formatRange potentially
 calls walkLength with an infinite range." (or something along those
 lines).
 - The person fixes formatRange, and all is well.

 No need for unit tests, it's all caught as soon as possible without need
 for instantiation.

 But this works today and has nothing to do with "retrofitting
 structure to templates". Nothing. Nothing.

 It doesn't work today.

 This isn't a fabricated example.

It's just blown out of proportion.

 This happened. walkLength changed its
 constraint, everything still compiled, and all the unit tests passed.
 There was no error, no hint that things were broken, nothing. Problems
 only started to arise when the poor OP tried to implement cartesianProduct.

 This should never have happened. Typeclasses or C++ concepts wouldn't
 have allowed it to happen. This is the kind of structure that templates
 need.

We will not add C++ concepts or typeclasses to D.


Andrei

Oct 29 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:

 Yes, but what gets ignored here is that typeclasses have a 
 large cognitive cost to everyone involved.

Such costs are an interesting discussion topic :-)

To put people up to speed a bit:
http://en.wikipedia.org/wiki/Type_class

A bit of explanations regarding Rust ones:
https://air.mozilla.org/rust-typeclasses/

Deeper info from one of the original designers:
http://homepages.inf.ed.ac.uk/wadler/topics/type-classes.html


I think typeclasses generally don't pull their weight.<

Rust and Haskell designers think otherwise, it seems.


 We will not add C++ concepts or typeclasses to D.

I agree that maybe now it's too much late to add them to D2. But 
we are free to discuss about the topic. I didn't know much about 
typeclasses years ago when people were designing D2 :-( I have 
studied them only recently while learning Haskell.

Bye,
bearophile

Oct 29 2012

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/29/12 2:16 PM, bearophile wrote:
 Andrei Alexandrescu:

 Yes, but what gets ignored here is that typeclasses have a large
 cognitive cost to everyone involved.

 Such costs are an interesting discussion topic :-)

 To put people up to speed a bit:
 http://en.wikipedia.org/wiki/Type_class

 A bit of explanations regarding Rust ones:
 https://air.mozilla.org/rust-typeclasses/

 Deeper info from one of the original designers:
 http://homepages.inf.ed.ac.uk/wadler/topics/type-classes.html

For those who wouldn't know how to search the Net, these indeed are 
quite appropriate.

 I think typeclasses generally don't pull their weight.<

 Rust and Haskell designers think otherwise, it seems.

It's a matter of what priorities the language has and what other 
features are available overlapping with the projected usefulness.


Andrei

Oct 29 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:

 For those who wouldn't know how to search the Net, these indeed 
 are quite appropriate.

People are often a bit lazy in clicking on links present in 
newsgroup messages, and they are even more lazy in searching 
things. So I've seen numerous times that if you give them links 
they will be more willing to read. This is especially true about 
topics they know nothing about, because they have a higher 
"cognitive impedance".

Google gives many results about this topic, but not all of them 
are good. That link from mozilla is a little video that doesn't 
tell a lot. The papers from Wadler were the ones Haskell 
typeclases come from, they are harder, but they are more 
fundamental. But starting from those papers is hard, so better to 
start from some examples and simpler explanations.

In the Haskell wiki there are more practical texts on this topic, 
and comparisons:
http://www.haskell.org/tutorial/classes.html
http://www.haskell.org/haskellwiki/OOP_vs_type_classes

Plus an online free chapter of what maybe is the best book about 
Haskell:
http://book.realworldhaskell.org/read/using-typeclasses.html


 It's a matter of what priorities the language has and what 
 other features are available overlapping with the projected 
 usefulness.

I agree. On the other hand the main point of this thread is that 
someone perceives as not good enough or not sufficient the 
current features of D. Even if their perception is wrong (and you 
are an expert on this field, so we trust your words), I think the 
topic is worth discussing.

Bye,
bearophile

Oct 29 2012

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Monday, 29 October 2012 at 17:56:26 UTC, Andrei Alexandrescu 
wrote:
 We will not add C++ concepts or typeclasses to D.

Ok. There is no point continuing this discussion then.

Oct 29 2012

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/27/12 6:20 AM, Peter Alexander wrote:
 On Friday, 26 October 2012 at 22:43:44 UTC, H. S. Teoh wrote:
 I'm getting a bit frustrated with the Phobos bugs related to ranges and
 std.algorithm. I think we need to increase the number of unittests. And
 by that I mean, GREATLY increase the number of unittests. Most of the
 current tests are merely sanity tests for the most common usage
 patterns, most basic types, or tests added as a result of fixed bugs.

 This is inadequate.

 Unit tests are not the solution to this. There are too many range
 combinations and types to effectively unit test everything.

Agreed.

 What we need
 is proper semantic support for ranges (and other conceptual classes of
 types). For example, this should not compile:

 void foo(Range)(Range r) if (isForwardRange!Range)
 {
 r.popBack();
 }

No, this has nothing to do with the problem at hand.


Andrei

Oct 27 2012

Johannes Pfau <nospam example.com> writes:

Am Sat, 27 Oct 2012 12:20:51 +0200
schrieb "Peter Alexander" <peter.alexander.au gmail.com>:

 On Friday, 26 October 2012 at 22:43:44 UTC, H. S. Teoh wrote:
 I'm getting a bit frustrated with the Phobos bugs related to 
 ranges and
 std.algorithm. I think we need to increase the number of 
 unittests. And
 by that I mean, GREATLY increase the number of unittests. Most 
 of the
 current tests are merely sanity tests for the most common usage
 patterns, most basic types, or tests added as a result of fixed 
 bugs.

 This is inadequate.

 
 Unit tests are not the solution to this. There are too many range 
 combinations and types to effectively unit test everything.


Yes and no. We can't test all possible combinations, but we should try
to at least test all code paths:
------------------
void test(R)(R range)
{
    static if(isInputRange!R)
        r.popFront();
    else if(isForwardRange!R)
        r.saev();
}

unittest
{
    //Should test at least with InputRange & ForwardRange
}
-------------------

We sometimes even have templates that are not used in phobos at all
(and don't have unittests). Those are really evil as they can break
silently if some typo is introduced.

Oct 27 2012

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/26/12 6:45 PM, H. S. Teoh wrote:
 http://d.puremagic.com/issues/show_bug.cgi?id=8900

 :-(

 (The code there is called cartesianProd but it's the reduced code, so it
 doesn't really compute the cartesian product. But that's where it's
 from.)

 So far, the outstanding blockers for cartesianProduct are:
 1) Compiler bug which causes unittest failure:

 	std/range.d(4629): Error: variable lower used before set
 	std/range.d(4630): Error: variable upper used before set

 (Jonathan had a pull request with a Phobos workaround for this, which I
 _think_ is already merged, but the autotester is still failing at this
 point. :-/)

 2) Issue 8542 (crosstalk between template instantiations)

 3) And now, issue 8900 (zip fails to compile with repeat(char[]))

 So there's still no joy for cartesianProduct. :-(

Often when there are many issues caused by a specific artifact, there 
are only a couple of compiler bugs manifesting themselves in various 
unpredictable ways.

 I'm getting a bit frustrated with the Phobos bugs related to ranges and
 std.algorithm. I think we need to increase the number of unittests. And
 by that I mean, GREATLY increase the number of unittests.

I don't think that's the best way. Walter and I have had a lasting 
disagreement about this as he firmly believes unittests are the way to 
make sure the compiler works well. In my opinion the psychological 
effect has been negative because this outlook has caused 
incompletely-defined features to get implemented (in a latent 
expectation that unittests will cause any issues to be detected).


Andrei

Oct 27 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Friday, 26 October 2012 at 22:43:44 UTC, H. S. Teoh wrote:
 This is inadequate.

The best we can get are users like you that take the time to 
report ;)

Thanks.

Oct 28 2012

Don Clugston <dac nospam.com> writes:

On 27/10/12 00:45, H. S. Teoh wrote:
 http://d.puremagic.com/issues/show_bug.cgi?id=8900

 :-(

 (The code there is called cartesianProd but it's the reduced code, so it
 doesn't really compute the cartesian product. But that's where it's
 from.)

 So far, the outstanding blockers for cartesianProduct are:
 1) Compiler bug which causes unittest failure:

 	std/range.d(4629): Error: variable lower used before set
 	std/range.d(4630): Error: variable upper used before set

 (Jonathan had a pull request with a Phobos workaround for this, which I
 _think_ is already merged, but the autotester is still failing at this
 point. :-/)

 2) Issue 8542 (crosstalk between template instantiations)

 3) And now, issue 8900 (zip fails to compile with repeat(char[]))

 So there's still no joy for cartesianProduct. :-(

 I'm getting a bit frustrated with the Phobos bugs related to ranges and
 std.algorithm. I think we need to increase the number of unittests. And
 by that I mean, GREATLY increase the number of unittests. Most of the
 current tests are merely sanity tests for the most common usage
 patterns, most basic types, or tests added as a result of fixed bugs.

 This is inadequate.

 We need to actively unittest corner cases, rare combinations, unusual
 usages, etc.. Torture test various combinations of range constructs.
 Algorithms. Nested range constructs. Nested algorithms. Deliberately
 cook up nasty tests that try their best to break the code by using
 unusual parameters, unusual range-like objects, strange data, etc.. Go
 beyond the simple cases to test non-trivial things.  We need unittests
 that pass unusual structs and objects into the range constructs and
 algorithms, and make sure they actually work as we have been _assuming_
 they should.

 I have a feeling there are a LOT of bugs lurking in there behind
 overlooked corner cases, off by 1 errors, and other such careless slips,
 as well as code that only works for basic types like arrays, which
 starts breaking when you hand it something non-trivial.  All these
 issues must be weeded out and prevented from slipping back in.

 Here's a start:

 - Create a set of structs/classes (inside a version(unittest) block)
    that are input, forward, bidirectional, output, etc., ranges, that are
    NOT merely arrays.

 - There should be some easy way, perhaps using std.random, of creating
    non-trivial instances of these things.  These should be put in a
    separate place, perhaps outside the std/ subdirectory, where they can
    be imported into unittest blocks by std.range, std.algorithm, whatever
    else that needs extensive testing.

 - Use these ranges as input for testing range constructs and algorithms.

 - For best results, use a compile-time loop to loop over a given
    combination of these range types, and run them through the same set of
    tests. This will improve the currently spotty test coverage.  Perhaps
    provide some templated functions that, given a set of range types
    (from the above structs/classes) and a set of functions, run through
    all combinations of them to make sure they all work. (We run unittests
    separately anyway, we aren't afraid of long-running tests.)


 T


I think that unit tests aren't very effective without code coverage.

One fairly non-disruptive thing we could do: implement code coverage for 
templates. Currently, templates get no code coverage numbers.
We could do a code-coverage equivalent for templates: which lines 
actually got instantiated?
I bet this would show _huge_ gaps in the existing test suite.

Oct 29 2012

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Monday, 29 October 2012 at 14:47:37 UTC, Don Clugston wrote:
 One fairly non-disruptive thing we could do: implement code 
 coverage for templates. Currently, templates get no code 
 coverage numbers.
 We could do a code-coverage equivalent for templates: which 
 lines actually got instantiated?
 I bet this would show _huge_ gaps in the existing test suite.

That's a good step forward, but I don't think it solves (what 
appears to me to be) the most common issue: incorrect/missing 
template constraints.

auto average(R)(R r) if (isForwardRange!R)
{
     return reduce!("a + b")(r) / walkLength(r);
}

(This code can have full coverage for some ranges, but also needs 
to also check !isInfinite!R, and probably other things).

These are difficult because every time a constraint changes, you 
need to go round and update the constraints of all calling 
functions. It's like when you change the types of a function 
argument, or return type; except that with template constraints, 
nothing is checked until instantiation.

Oct 29 2012

D Programming

C/C++ Programming

Other

digitalmars.D - Another day in the ordeal of cartesianProduct