digitalmars.D - Serious Problems with the Test Suite


Walter Bright <newshound2 digitalmars.com> writes:

A good test suite should:

1. verify that things that are supposed to work do work

2. when things don't verify, point to where the problem is

The D test suite fails miserably at point 2. The only bright spot is the
autotester, where when one of the tests fails it's quick to find the problem
source.

But I cringe every time something else fails, because then I know I'm in for 
hours or even DAYS trying to figure out what and where things went wrong.

For example,

https://github.com/dlang/dmd/pull/11287

has several failures. All of which come with USELESS log files. I have no idea 
what went wrong. Some principles for log files:

1. If the log file says ERROR, it should be an ERROR, i.e. the test should fail.
I'm often confronted with log files that list multiple ERRORs, but never mind,
those errors don't need to pass. All benign ERROR messages, all deprecation
messages, all warning messages need to be fixed, so that when the log file says
ERROR that's why the test failed.

2. The ERROR that causes the test to fail should be the LAST line in the log
file, not 300 lines back.

3. Log files need to contain comment text at each step to SAY WHAT THEY ARE DOING.

4. Makefiles should NEVER, EVER be run in "quiet" mode, for the simple reason 
that one has no idea what it was trying to do when it failed.

5. Test files must either include a URL to the bugzilla issue they fix or have 
some clue in the comments what they are doing.

6. Running tests multi-process makes them go faster, but since the log files 
randomly interleave the output from them, it makes it impossible to figure out 
where the failure is.

7. Any test that fails because of a network error, or other environmental error 
unrelated to what is being tested, should automatically sleep for a minute or 
ten, then try again.

8. Any timeout terminations MUST say which test timed out.

9. Tests should not be Rube Goldberg Machines with layers and layers of
complexity before the actual test is even run. Tests should be a THIN layer
over the test.

10. Many tests are UTTERLY UNDOCUMENTED. For example,

https://github.com/dlang/dmd/tree/master/test/unit

What is that? What does it do? Is it one test or many tests? Let's look at:

https://github.com/dlang/dmd/blob/master/test/unit/frontend.d

Not a SINGLE COMMENT in it. What it is, what it does, etc., is all left to the 
imagination. This is completely unacceptable for production code, it is also 
unacceptable for any code accepted into the D repository.

11. Every time we run into "oh, that's just a heisenbug, try re-running the 
test" that is a BUG in the test suite and needs to be fixed. Those are gigantic 
time and resource wasting problems.
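
Point 7 above (sleep, then retry on environmental errors) can be sketched roughly as follows. This is only an illustration in Python, not code from the D test suite: `EnvironmentalError` and `run_with_retry` are hypothetical names, and it assumes the runner can tell an environmental failure (network, disk) apart from a real test failure.

```python
import time


class EnvironmentalError(Exception):
    """Hypothetical marker for failures unrelated to the code under test."""


def run_with_retry(test, attempts=3, delay_seconds=60.0,
                   sleep=time.sleep, log=print):
    """Run test(); retry only on EnvironmentalError, sleeping in between.

    Any other exception is a real failure and propagates immediately.
    """
    for attempt in range(1, attempts + 1):
        try:
            return test()
        except EnvironmentalError as exc:
            # Say what happened and what the runner will do next.
            log(f"attempt {attempt}/{attempts}: environmental error: {exc}")
            if attempt == attempts:
                raise
            log(f"sleeping {delay_seconds}s before retrying")
            sleep(delay_seconds)
```

The `sleep` and `log` parameters exist only so the wait and the messages can be replaced when exercising the helper; a real runner would likely keep the defaults.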

Jun 17 2020

Avrina <avrina12309412342 gmail.com> writes:

On Wednesday, 17 June 2020 at 23:59:52 UTC, Walter Bright wrote:
 11. Every time we run into "oh, that's just a heisenbug, try 
 re-running the test" that is a BUG in the test suite and needs 
 to be fixed. Those are gigantic time and resource wasting 
 problems.

I've run into these problems with, for example, optlink. When 
trying to get optlink removed, you prevent it. These heisenbugs 
exist because, a lot of the time, you aren't willing to chop off 
dead weight.

Jun 17 2020

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Jun 18, 2020 at 01:59:39AM +0000, Avrina via Digitalmars-d wrote:
 On Wednesday, 17 June 2020 at 23:59:52 UTC, Walter Bright wrote:
 11. Every time we run into "oh, that's just a heisenbug, try
 re-running the test" that is a BUG in the test suite and needs to be
 fixed. Those are gigantic time and resource wasting problems.

 
 I've run into these problems with, for example, optlink. When trying
 to get optlink removed, you prevent it. These heisenbugs exist
 because, a lot of the time, you aren't willing to chop off dead
 weight.

Whoa, holey miss the point batman!  Optlink may have its own share of
issues, but the problem here isn't with this or that piece of software,
it's with the structure of the testsuite.

Tests that are non-deterministic or depend on external state, strictly
speaking, shouldn't be in the test suite. This includes tests that
involve downloading some remote resource over the network, tests that
assume things about the host OS and filesystem, etc.  There are a
couple of these in the test suite, and they put you at the mercy of
external state which is beyond your control. (I remember one time there
was a heisenbug that had to do with random number generators, meaning,
its probability of arbitrary, totally coincidental failure was non-zero.
Sigh.)
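
One common way to take the randomness out of such a test is to give it its own generator with a fixed seed and to put the seed in the failure message. A minimal sketch in Python, assuming a hypothetical function under test `shuffled_copy`; none of these names come from the D test suite:

```python
import random


def shuffled_copy(items, rng):
    """Hypothetical function under test: return a shuffled copy of items."""
    out = list(items)
    rng.shuffle(out)
    return out


def test_shuffle_is_permutation(seed=12345):
    # A private, seeded generator: the same seed always gives the same
    # sequence, so a failure can be reproduced instead of re-run away.
    rng = random.Random(seed)
    data = list(range(20))
    result = shuffled_copy(data, rng)
    assert sorted(result) == data, f"seed={seed}: result is not a permutation"
    return result
```

Because the seed appears in the assertion message, a failing run names the exact input that triggered it.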

These tests ought to be removed, or at least disabled in CI.  Any time
you depend on external state, it really does not belong in the test
suite, or at least, it does not belong in the autotester, because it
just leads to tons of wasted time trying to track down exactly what it
is that failed, which most of the time isn't even relevant to the PR
you're trying to push through.


T

-- 
MASM = Mana Ada Sistem, Man!

Jun 17 2020

Avrina <avrina12309412342 gmail.com> writes:

On Thursday, 18 June 2020 at 02:34:42 UTC, H. S. Teoh wrote:
 On Thu, Jun 18, 2020 at 01:59:39AM +0000, Avrina via 
 Digitalmars-d wrote:
 On Wednesday, 17 June 2020 at 23:59:52 UTC, Walter Bright 
 wrote:
 11. Every time we run into "oh, that's just a heisenbug, try 
 re-running the test" that is a BUG in the test suite and 
 needs to be fixed. Those are gigantic time and resource 
 wasting problems.

 
 I've run into these problems with, for example, optlink. When 
 trying to get optlink removed, you prevent it. These 
 heisenbugs exist because, a lot of the time, you aren't 
 willing to chop off dead weight.

 Whoa, holey miss the point batman!  Optlink may have its own 
 share of issues, but the problem here isn't with this or that 
 piece of software, it's with the structure of the testsuite.

There are issues with optlink. I've seen them manifest in the
testsuite, and just running the test again "fixes" it. It's not the
only problem where this has occurred.

I'm sure there are more problems with the test suite, and it is
rather messy and has grown slow. I was replying specifically to
the point about "heisenbugs", some of which are of Walter's own
creation due to his refusal to accept change.

Jun 18 2020

Walter Bright <newshound2 digitalmars.com> writes:

On 6/18/2020 7:38 AM, Avrina wrote:
 There are issues with optlink. I've seen them manifest in the testsuite, and just
 running the test again "fixes" it. It's not the only problem where this has
 occurred.

I've run those tests more than anyone, and have not seen an optlink heisenbug.

Jun 18 2020

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Jun 18, 2020 at 02:40:33PM -0700, Walter Bright via Digitalmars-d wrote:
 On 6/18/2020 7:38 AM, Avrina wrote:
 There are issues with optlink. I've seen them manifest in the testsuite,
 and just running the test again "fixes" it. It's not the only problem
 where this has occurred.

 
 I've run those tests more than anyone, and have not seen an optlink
 heisenbug.

I think it's because Walter uses advanced quantum technology that can
directly handle quantum-superimposed computation states [1], so none of
these heisenbugs affect him. ;-)

[1] https://forum.dlang.org/post/mailman.3657.1591403118.31109.digitalmars-d puremagic.com


T

-- 
English is useful because it is a mess. Since English is a mess, it maps well
onto the problem space, which is also a mess, which we call reality. Similarly,
Perl was designed to be a mess, though in the nicest of all possible ways. --
Larry Wall

Jun 18 2020

Walter Bright <newshound2 digitalmars.com> writes:

On 6/18/2020 3:20 PM, H. S. Teoh wrote:
 On Thu, Jun 18, 2020 at 02:40:33PM -0700, Walter Bright via Digitalmars-d wrote:
 I've run those tests more than anyone, and have not seen an optlink
 heisenbug.

 
 I think it's because Walter uses advanced quantum technology that can
 directly handle quantum-superimposed computation states [1], so none of
 these heisenbugs affect him. ;-)
 
 [1] https://forum.dlang.org/post/mailman.3657.1591403118.31109.digitalmars-d puremagic.com

That's not an optlink issue.

Jun 18 2020

Mathias LANG <geod24 gmail.com> writes:

On Friday, 19 June 2020 at 00:54:15 UTC, Walter Bright wrote:
 On 6/18/2020 3:20 PM, H. S. Teoh wrote:
 On Thu, Jun 18, 2020 at 02:40:33PM -0700, Walter Bright via 
 Digitalmars-d wrote:
 I've run those tests more than anyone, and have not seen an 
 optlink
 heisenbug.

 
 I think it's because Walter uses advanced quantum technology 
 that can
 directly handle quantum-superimposed computation states [1], 
 so none of
 these heisenbugs affect him. ;-)
 
 [1] 
 https://forum.dlang.org/post/mailman.3657.1591403118.31109.digitalmars-d puremagic.com

 That's not an optlink issue.

Starting a new thread so as not to derail the original topic, which
contained valid points.

Optlink has been a pain for everyone on x86 Windows for a while.
I personally use Linux and Mac OSX, but tried doing some work on
Windows recently and the first thing I got was a linker crash.

There have been active steps taken to limit its use / reduce the 
exposure of new users to it, among them:
- Dub defaults to mscoff since v1.15.0, and that has drastically 
improved the UX for new users. See 
https://github.com/dlang/dub/pull/1661 for the many reasons this 
was done.
- Vibe.d recently dropped support for it because it was
causing crashes / timeouts:
https://github.com/vibe-d/vibe.d/pull/2445
- This was tried in DMD, and you obviously shut it down: 
https://github.com/dlang/dmd/pull/8347 . I will just quote the 
last post by Manu here: "I don't have the energy to pursue this. 
I do think it's important though."

And yes, they are documented, advertised, and have been advocated
for years, yet you refused to listen to the feedback countless
users have given.

Jun 18 2020

Stefan Koch <uplink.coder googlemail.com> writes:

On Wednesday, 17 June 2020 at 23:59:52 UTC, Walter Bright wrote:
 A good test suite should:

 1. verify that things that are supposed to work do work

 [...]

Most of those could be fixed with an improved test runner, for
example if we did a timeout per test.

Another obvious improvement would be printing only the tests
which failed.
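
To make those two suggestions concrete, here is a rough sketch of such a runner in Python. It is an assumption-laden illustration, not the existing dmd runner: `run_one`, `run_suite`, and the `tests` mapping of test name to command line are hypothetical. Each test gets its own timeout, output is captured per test so parallel runs cannot interleave, only failing tests are printed, and the ERROR line naming the test comes last in its log.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def _with_newline(text):
    """Make sure captured output ends with a newline before appending to it."""
    return text if not text or text.endswith("\n") else text + "\n"


def run_one(name, command, timeout_seconds):
    """Run one test command; return (name, passed, captured log)."""
    try:
        proc = subprocess.run(command, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT, text=True,
                              timeout=timeout_seconds)
    except subprocess.TimeoutExpired as exc:
        partial = exc.stdout or ""
        if isinstance(partial, bytes):  # partial output may arrive as bytes
            partial = partial.decode(errors="replace")
        return name, False, (_with_newline(partial) +
                             f"ERROR: test '{name}' timed out after {timeout_seconds}s\n")
    log = _with_newline(proc.stdout)
    if proc.returncode != 0:
        return name, False, (log +
                             f"ERROR: test '{name}' exited with status {proc.returncode}\n")
    return name, True, log


def run_suite(tests, timeout_seconds=10.0, jobs=4, out=sys.stdout):
    """Run tests (dict: name -> command list) in parallel; return failed names."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        results = list(pool.map(
            lambda item: run_one(item[0], item[1], timeout_seconds),
            tests.items()))
    failed = []
    for name, passed, log in results:
        if not passed:
            failed.append(name)
            out.write(f"--- {name} ---\n{log}")  # one contiguous block per test
    out.write(f"{len(results) - len(failed)} passed, {len(failed)} failed\n")
    return failed
```

A call such as `run_suite({"hello": [sys.executable, "-c", "print('hi')"]})` would print only the summary line, since the one test passes.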

As for the missing comments, I think that's a plus.
When introducing a change in how dmd interprets D's semantics, 
one should be forced to scratch their head.

Jun 17 2020

Walter Bright <newshound2 digitalmars.com> writes:

I've added a new keyword TestSuite and here are the current test suite bugs
that I found:

https://issues.dlang.org/buglist.cgi?keywords=TestSuite&list_id=231900

Jun 18 2020
