digitalmars.D - Extending unittests [proposal] [Proof Of Concept]

reply Johannes Pfau <nospam example.com> writes:
Current situation:
The compiler combines all unittests of a module into one huge function.
If a unittest in a module fails, the rest won't be executed. The
runtime (which is responsible for calling that per module unittest
method) must always run all unittests of a module.

Goal:
The runtime / test runner can decide for every test if it wants to
continue testing or abort. It should also be possible to run single
tests and skip some tests. As a secondary goal the runtime should
receive the filename and line number of the unittest declaration.

Proposal:
Introduce a new 'MInewunitTest' ModuleInfo flag in object.d and in the
compiler. If MInewunitTest is present, the moduleinfo does not contain
a unittester function. Instead it contains an array (slice) of UnitTest
structs. So the new module property looks like this:
----
 @property UnitTest[] unitTest() nothrow pure;
----

the UnitTest struct looks like this:
----
struct UnitTest
{
   string name; //Not used yet
   string fileName;
   uint line;
   void function() testFunc;
}
----

The compiler generates a static array of all UnitTest objects for every
module and sets the UnitTest[] slice in the moduleinfo to point to this
static array. As the compiler already contains individual functions for
every unittest, this isn't too difficult.
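
A minimal sketch of how a test runner might consume such an array, assuming the struct layout shown above. `TestResult`, `runAll` and the two sample test functions are hypothetical names used only for illustration; they are not part of the proposal or the POC, and the array is built by hand here as a stand-in for the one the compiler would generate.

```d
import std.stdio : writefln;

// Mirrors the struct from the proposal.
struct UnitTest
{
    string name;              // not used yet in the proposal
    string fileName;
    uint line;
    void function() testFunc;
}

// Hypothetical result record, not part of the proposal.
struct TestResult
{
    string fileName;
    uint line;
    bool passed;
    string message;
}

// Runs every test, continues after a failure, and returns one result per test.
TestResult[] runAll(UnitTest[] tests)
{
    TestResult[] results;
    foreach (t; tests)
    {
        try
        {
            t.testFunc();
            results ~= TestResult(t.fileName, t.line, true, null);
        }
        catch (Throwable e) // covers AssertError as well as Exception
        {
            results ~= TestResult(t.fileName, t.line, false, e.msg);
        }
    }
    return results;
}

// Hypothetical sample tests.
void passingTest() { assert(1 + 1 == 2); }
void failingTest() { throw new Exception("deliberate failure"); }

void main()
{
    // Hand-built stand-in for the per-module static array.
    auto tests = [
        UnitTest(null, "example.d", 10, &passingTest),
        UnitTest(null, "example.d", 20, &failingTest),
    ];
    foreach (r; runAll(tests))
        writefln("%s:%s\t%s", r.fileName, r.line, r.passed ? "SUCCESS" : "FAILURE");
}
```

Because the runner owns the loop, it can just as well skip entries, stop at the first failure, or run a single entry selected by file name and line.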


Proof of Concept:
I haven't done any dmd hacking before so this might be terrible code,
but it is working as expected and can be used as a guide on how to
implement this:
https://github.com/jpf91/druntime/compare/newUnittest
https://github.com/jpf91/dmd/compare/newUnittest

In this POC the MInewunitTest flag is not present yet, the new method
is always used. Also the implementation in druntime is only minimally
changed. The compiler changes allow an advanced testrunner to do a lot
more:

* Be a GUI tool / use colored output / ...
* Allow to run single, specific tests, skip tests, ...
* Execute tests in a different process, communicate with IPC. This way
  we can deal with segmentation faults in unit tests.

Sample output:
Testing generated/linux/debug/32/unittest/std/array
std/array.d:86          SUCCESS
std/array.d:145         SUCCESS
std/array.d:183         SUCCESS
std/array.d:200         SUCCESS
std/array.d:231         SUCCESS
std/array.d:252         SUCCESS
std/array.d:317         SUCCESS

The perfect solution:
Would allow user defined attributes on tests, so you could name them,
assign categories, etc. But till we have those user defined attributes,
this seems to be a good solution.
Sep 20 2012
next sibling parent reply "Jesse Phillips" <Jessekphillips+D gmail.com> writes:
On Thursday, 20 September 2012 at 16:52:40 UTC, Johannes Pfau 
wrote:

 Sample output:
 Testing generated/linux/debug/32/unittest/std/array
 std/array.d:86          SUCCESS
 std/array.d:145         SUCCESS
 std/array.d:183         SUCCESS
 std/array.d:200         SUCCESS
 std/array.d:231         SUCCESS
 std/array.d:252         SUCCESS
 std/array.d:317         SUCCESS

 The perfect solution:
 Would allow user defined attributes on tests, so you could name 
 them,
 assign categories, etc. But till we have those user defined 
 attributes,
 this seems to be a good solution.

I didn't read everything in your post; where does the FAILURE show up? If it is intermixed with the SUCCESS lines, then I could see that as a problem. While I can't say I've hated or liked the lack of output for unittest success, I believe my feeling would be the same with this.
Sep 20 2012
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-09-20 19:37, Johannes Pfau wrote:

 That's just an example output. We could leave the druntime
 test runner as is and don't change the output at all. We could only
 print the failure messages. Or we could collect all failures and print
 them at the end. All that can easily be changed in druntime (and
 I'd argue we should enhance the druntime interface, so everyone could
 implement a custom test runner), but we need the compiler changes to
 allow this.

It's already possible, just set a unit test runner using Runtime.moduleUnitTester. -- /Jacob Carlborg
Sep 20 2012
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-09-20 21:11, Johannes Pfau wrote:

 Oh right, I thought that interface was more restrictive. So the only
 changes necessary in druntime are to adapt to the new compiler
 interface.

 The new dmd code is still necessary, as it allows to access
 all unittests of a module individually. The current code only
 provides one function for all unittests in a module.

Yes, exactly. There was some additional data, like file and module name, in your compiler changes as well? -- /Jacob Carlborg
Sep 21 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-09-21 14:19, Jens Mueller wrote:

 Why do you need filename and line information of a unittest. If a
 unittest fails you'll get the relevant information. Why do you want the
 information when a unittest succeeded? I only care about failed
 unittests. A count of the number of executed unittests and total number
 is enough, I think.

But others might care about other things. It doesn't hurt if the information is available. There might be use cases where one would want to display all tests regardless of whether they failed or not. -- /Jacob Carlborg
Sep 21 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-09-21 16:37, Jens Mueller wrote:

 If there are use cases I agree. I do not know one.
 The question whether there are *tools* that report in case of success is
 easier to verify. Do you know any tool that does reporting in case
 success? I think gtest does not do it. I'm not sure about JUnit.
 But of course if a unittest has additional information and that is
 already implemented or easy to implement fine with me. My point is more
 that for the common cases you do not need this. Maybe in most. Maybe in
 all.

Test::Unit, the default testing framework for Ruby on Rails prints a dot for each successful test. -- /Jacob Carlborg
Sep 21 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-09-21 22:57, Jens Mueller wrote:

 Test::Unit, the default testing framework for Ruby on Rails prints a
 dot for each successful test.

That is fine. But you don't need the name of the unittest then.

I'm just saying that different testing libraries do things differently. There are probably testing libraries out there that do print the name of the test. -- /Jacob Carlborg
Sep 22 2012
prev sibling parent Jens Mueller <jens.k.mueller gmx.de> writes:
Tobias Pankrath wrote:
 
I'm actually kinda surprised the feedback on this is rather
negative. I
thought running unit tests individually and printing
line/file/name was
requested quite often?

I want to have this. My workflow is: Run all tests (run all). If some fail, see if there might be a common reason (so don't stop). Then run the unit tests that will most likely tell you what's wrong in a debugger (run one test individually).

Though dtest is in an early state, you can do:
$ ./dtest --abort=no
This runs all unittests and reports each failure, i.e. it continues in case of a failure instead of aborting. Then run:
$ ./dtest --abort=no --break=both
to turn all failures into breakpoints. It is true that you cannot pick an individual unittest here. You can continue in the debugger, though this may have its problems. But running them individually may have problems too if the unittests are not written to be executed independently.

Jens
Sep 21 2012
prev sibling next sibling parent Jens Mueller <jens.k.mueller gmx.de> writes:
Johannes Pfau wrote:
 Am Fri, 21 Sep 2012 16:37:37 +0200
 schrieb Jens Mueller <jens.k.mueller gmx.de>:
 
 Jacob Carlborg wrote:
 On 2012-09-21 14:19, Jens Mueller wrote:
 
Why do you need filename and line information of a unittest. If a
unittest fails you'll get the relevant information. Why do you



With the recent name mangling change it's possible to get the unittest line if a test fails, but only if you have working backtraces. That might not be true for other compilers / non x86 architectures. To get the filename you have to demangle the unittest function name (IIRC core.demangle can't even demangle that name right now) and this only gives you the module name (which you could also get using moduleinfo though)

I'm saying I do not care which unittests succeeded. All I need to know is that all unittests I ran were executed successfully.
 It's also useful for disabled tests, so you can actually look them up.

That may be useful. So you would report which tests were disabled instead of just reporting that 2 tests were disabled.
want the information when a unittest succeeded? I only care about
failed unittests. A count of the number of executed unittests and
total number is enough, I think.



The posted example shows everything that can be done, even if it might not make sense. However, printing successful tests also has use cases:
1: It shows the progress of unit testing. (Not so important)
2: If code crashes and doesn't produce a backtrace, you still know which test crashed, as the file name and line number are printed before running the test. (This might sound improbable, but try porting gdc to a new architecture. I don't want to waste any more time commenting out unit tests to find the failing one in a file with dozens of tests on an ARM machine that takes ages to run the tests.)

Why don't you get a report when the program crashes?
 Another use case is printing all unittests in a library. Or a gui app
 displaying all unittests, allowing to only run single unittests, etc.

Listing on a unittest level and selecting may be useful.
 But others might care about other things. It doesn't hurt if the
 information is available. There might be use cases when one would
 want to display all tests regardless of if they failed or not.

If there are use cases I agree. I do not know one. The question whether there are *tools* that report in case of success is easier to verify. Do you know any tool that does reporting in case success? I think gtest does not do it. I'm not sure about JUnit.

I don't know those tools, but I guess they have some sort of progress indicator?

They have them at test case level. I'm not sure whether there is a strict relation between unittest and test case for D. The module level may be enough.
 But I remember some .NET unit test GUIs that showed a green button for
 successful tests. But it's been years since I've done anything in .NET.
 
 But of course if a unittest has additional information and that is
 already implemented or easy to implement fine with me. My point is
 more that for the common cases you do not need this. Maybe in most.
 Maybe in all.

You usually don't have to print successful tests (although sometimes I wasn't sure if tests actually ran), but as you can't know at compile time which tests fail, you either have this information for all tests or for none.

But you could just count the number and report it. If it says "testing std.algorithm with 134 of 134 unittests" you know all have been executed. It is true that it won't tell you which unittests were disabled. But that is easy to find out.
 The main reason _I_ want this is for gdc: We currently don't run the
 unit tests on gdc at all. I know they won't pass on ARM. But the unit
 tests error out on the first failing test. Often that error is a
 difficult to fix backend bug, and lots of simpler library bugs are
 hidden because the other tests aren't executed.

But this is a different problem. You want to keep executing on failure. You don't need a unittest name for this. Maybe you are saying that skipping a failing unittest is better and that disabling them in the source using @disable is tedious.
 I'm actually kinda surprised the feedback on this is rather negative. I
 thought running unit tests individually and printing line/file/name was
 requested quite often?

Running unittests individually is very useful. But I'm not so sure about the latter. I think controlling how the unittests are executed is important; reporting or listing single unittests less so. But I won't object when you add this feature if you believe it will be used. I'm just saying I have less use for it. And if the change is simple, it should be unlikely to introduce any bugs.

Jens
Sep 21 2012
prev sibling parent Jens Mueller <jens.k.mueller gmx.de> writes:
Jacob Carlborg wrote:
 On 2012-09-21 16:37, Jens Mueller wrote:
 
If there are use cases I agree. I do not know one.
The question whether there are *tools* that report in case of success is
easier to verify. Do you know any tool that does reporting in case
success? I think gtest does not do it. I'm not sure about JUnit.
But of course if a unittest has additional information and that is
already implemented or easy to implement fine with me. My point is more
that for the common cases you do not need this. Maybe in most. Maybe in
all.

Test::Unit, the default testing framework for Ruby on Rails prints a dot for each successful test.

That is fine. But you don't need the name of the unittest then. Jens
Sep 21 2012
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Thu, 20 Sep 2012 19:27:00 +0200
schrieb "Jesse Phillips" <Jessekphillips+D gmail.com>:


 
 I didn't read everything in your post, where does the FAILURE 
 show up. If it is intermixed with the SUCCESS, then I could see 
 that as a problem.
 
 While I can't say I've hated/liked the lack of output for 
 unittest success, I believe my feeling would be the same with 
 this.

That's just an example output. We could leave the druntime test runner as is and not change the output at all. We could only print the failure messages. Or we could collect all failures and print them at the end. All that can easily be changed in druntime (and I'd argue we should enhance the druntime interface, so everyone could implement a custom test runner), but we need the compiler changes to allow this.

In the end, you have an array of UnitTests (ordered as they appear in the source file). A UnitTest has a filename, a line number and a function member (the actual unittest function). What you do with this is completely up to you.
Sep 20 2012
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Johannes Pfau:

 The perfect solution:
 Would allow user defined attributes on tests, so you could name 
 them,
 assign categories, etc. But till we have those user defined 
 attributes,
 this seems to be a good solution.

We have @disable, maybe it's usable for unittests too :-) Bye, bearophile
Sep 20 2012
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 20-Sep-12 22:18, bearophile wrote:
 Johannes Pfau:

 The perfect solution:
 Would allow user defined attributes on tests, so you could name them,
 assign categories, etc. But till we have those user defined attributes,
 this seems to be a good solution.

We have @disable, maybe it's usable for unittests too :-)

-- Dmitry Olshansky
Sep 20 2012
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Thu, 20 Sep 2012 20:51:47 +0200
schrieb Jacob Carlborg <doob me.com>:

 On 2012-09-20 19:37, Johannes Pfau wrote:
 
 That's just an example output. We could leave the druntime
 test runner as is and don't change the output at all. We could only
 print the failure messages. Or we could collect all failures and
 print them at the end. All that can easily be changed in druntime
 (and I'd argue we should enhance the druntime interface, so
 everyone could implement a custom test runner), but we need the
 compiler changes to allow this.

It's already possible, just set a unit test runner using Runtime.moduleUnitTester.

Oh right, I thought that interface was more restrictive. So the only changes necessary in druntime are to adapt to the new compiler interface. The new dmd code is still necessary, as it allows accessing all unittests of a module individually. The current code only provides one function for all unittests in a module.
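
For reference, a minimal sketch of what the existing hook allows, assuming a program compiled with -unittest. `runModules` is a hypothetical name; `Runtime.moduleUnitTester`, iteration over `ModuleInfo` and `ModuleInfo.unitTest` are the existing druntime interface. Note that `unitTest` yields only one function per module, which is exactly the limitation the compiler changes address.

```d
import core.runtime : Runtime;
import std.stdio : writefln;

// Hypothetical custom runner: keeps going after a module's tests fail.
bool runModules()
{
    size_t failed;
    foreach (m; ModuleInfo)
    {
        if (m is null)
            continue;
        auto test = m.unitTest; // one function for all unittest blocks of the module
        if (test is null)
            continue;           // module has no unittests (or no -unittest build)
        try
        {
            test();
            writefln("PASS %s", m.name);
        }
        catch (Throwable e)
        {
            writefln("FAIL %s: %s", m.name, e.msg);
            ++failed;
        }
    }
    return failed == 0; // true means every module passed
}

shared static this()
{
    // Replace the default runner before the runtime executes the tests.
    Runtime.moduleUnitTester = &runModules;
}

unittest
{
    assert(1 + 1 == 2);
}

void main() {}
```

With this in place a failing module no longer stops the other modules, but a failing unittest block still hides the blocks after it within the same module.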
Sep 20 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, September 20, 2012 18:53:38 Johannes Pfau wrote:
 Proposal:

In general, I'm all for instrumenting druntime such that unit testing tools could run unit tests individually and present their output in a customized manner, just so long as it doesn't really change how unit tests work now as far as compiling with -unittest and running your program goes. The only change that might be desirable would be making it so that after a unittest block fails, subsequent unittest blocks within its module are still run.

I _would_ point out that running any unittest blocks in a function without running every other unittest block before them is error prone (running further unittest blocks after a failure is risky enough). Last time I used JUnit, even though it claimed to run unit tests individually and tell you the result, it didn't really. It might not have run all of the unit tests after the one you asked for, but it still ran all of the ones before it. Not doing so would be a problem in any situation where one unit test affects state that further unit tests use (much as that's arguably bad practice).

Regardless, I confess that I don't care too much about the details of how this sort of thing is done so long as it doesn't really change how they work from the standpoint of compiling with -unittest and running your executable. The _one_ feature that I really wish we had was the ability to name unit tests, since then you'd get a decent name in stack traces, but the fact that the pull request which makes it so that unittest block functions are named after their line number has finally been merged in makes that less of an issue.

- Jonathan M Davis
Sep 20 2012
prev sibling next sibling parent Jens Mueller <jens.k.mueller gmx.de> writes:
Johannes Pfau wrote:
 Current situation:
 The compiler combines all unittests of a module into one huge function.
 If a unittest in a module fails, the rest won't be executed. The
 runtime (which is responsible for calling that per module unittest
 method) must always run all unittests of a module.
 
 Goal:
 The runtime / test runner can decide for every test if it wants to
 continue testing or abort. It should also be possible to run single
 tests and skip some tests. As a secondary goal the runtime should
 receive the filename and line number of the unittest declaration.
 
 Proposal:
 Introduce a new 'MInewunitTest' ModuleInfo flag in object.d and in the
 compiler. If MInewunitTest is present, the moduleinfo does not contain
 a unittester function. Instead it contains an array (slice) of UnitTest
 structs. So the new module property looks like this:
 ----
  property UnitTest[] unitTest() nothrow pure;
 ----
 
 the UnitTest struct looks like this:
 ----
 struct UnitTest
 {
    string name; //Not used yet
    string fileName;
    uint line;
    void function() testFunc;
 }
 ----
 
 The compiler generates a static array of all UnitTest objects for every
 module and sets the UnitTest[] slice in the moduleinfo to point to this
 static array. As the compiler already contains individual functions for
 every unittest, this isn't too difficult.
 
 
 Proof of Concept:
 I haven't done any dmd hacking before so this might be terrible code,
 but it is working as expected and can be used as a guide on how to
 implement this:
 https://github.com/jpf91/druntime/compare/newUnittest
 https://github.com/jpf91/dmd/compare/newUnittest
 
 In this POC the MInewunitTest flag is not present yet, the new method
 is always used. Also the implementation in druntime is only minimally
 changed. The compiler changes allow an advanced testrunner to do a lot
 more:
 
 * Be a GUI tool / use colored output / ...
 * Allow to run single, specific tests, skip tests, ...
 * Execute tests in a different process, communicate with IPC. This way
   we can deal with segmentation faults in unit tests.

Very recently I have polished a tool I wrote called dtest.
http://jkm.github.com/dtest/dtest.html
The single thing I want to support but failed to implement is calling individual unittests. I looked into it. I thought I could find a way to inspect the assembly with some C library, but I couldn't make it work. Currently each module has a __modtest which calls the unittests.

I haven't looked into segmentation faults, but I think you can already handle them. You just need to provide your own segmentation fault handler. I should add this to dtest. dtest also lets you continue executing the tests if an assertion fails, and it can turn failures into breakpoints. When you use GNU ld you can even continue and break on any thrown Throwable.

In summary, I think everything can be done already, but not on an individual unittest level. I also think that this is important, and this restriction alone is enough to merge your pull request after a review. But the changes should be backward compatible. I think there is no need to make the runtime more complex. Just let it execute the single function __modtest as before, but add the array of unittests. I'd be happy to extend dtest to use this array because I found no other solution.
 Sample output:
 Testing generated/linux/debug/32/unittest/std/array
 std/array.d:86          SUCCESS
 std/array.d:145         SUCCESS
 std/array.d:183         SUCCESS
 std/array.d:200         SUCCESS
 std/array.d:231         SUCCESS
 std/array.d:252         SUCCESS
 std/array.d:317         SUCCESS

See https://buildhive.cloudbees.com/job/jkm/job/dtest/16/console for dtest's output.

$ ./dtest --output=xml
Testing 1 modules: ["dtest_unittest"]
====== Run 1 of 1 ======
PASS dtest_unittest
========================
All modules passed: ["dtest_unittest"]

This also generates a JUnit/GTest-compatible XML report. Executing ./failing gives more interesting output:

$ ./failing --abort=asserts
Testing 3 modules: ["exception", "fail", "pass"]
====== Run 1 of 1 ======
FAIL exception
object.Exception tests/exception.d(3): first exception
object.Exception tests/exception.d(4): second exception
FAIL fail
core.exception.AssertError fail(5): unittest failure
PASS pass
========================
Failed modules (2 of 3): ["exception", "fail"]

I also found some inconsistency in the output when asserts have no message. It would be nice if that could be fixed too.
http://d.puremagic.com/issues/show_bug.cgi?id=8652
 The perfect solution:
 Would allow user defined attributes on tests, so you could name them,
 assign categories, etc. But till we have those user defined attributes,
 this seems to be a good solution.

This is orthogonal to your proposal. You just want that every unittest is exposed as a function. How to define attributes for functions is a different story. Jens
Sep 20 2012
prev sibling next sibling parent Jens Mueller <jens.k.mueller gmx.de> writes:
Jonathan M Davis wrote:
 On Thursday, September 20, 2012 18:53:38 Johannes Pfau wrote:
 Proposal:

In general, I'm all for instrumenting druntime such that unit testing tools could run unit tests individually and present their output in a customized manner, just so long as it doesn't really change how unit tests work now as far as compiling with -unittest and running your program goes. The only change that might be desirable would be making it so that after a unittest block fails, subsequent unittest blocks within its module are still run.

I _would_ point out that running any unittest blocks in a function without running every other unittest block before them is error prone (running further unittest blocks after a failure is risky enough). Last time I used JUnit, even though it claimed to run unit tests individually and tell you the result, it didn't really. It might not have run all of the unit tests after the one you asked for, but it still ran all of the ones before it. Not doing so would be a problem in any situation where one unit test affects state that further unit tests use (much as that's arguably bad practice).

You say that JUnit silently runs all unittests before the first specified one, don't you? If that is done silently that's indeed strange. When to abort the execution of a unittest or all unittests of a module is indeed a delicate question. But even though there is a strong default to abort in case of any failure I see merit of allowing the user to change this behavior on demand.
 Regardless, I confess that I don't care too much about the details of how this 
 sort of thing is done so long as it doesn't really change how they work from 
 the standpoint of compiling with -unittest and running your executable. The 
 _one_ feature that I really wish we had was the ability to name unit tests, 
 since then you'd get a decent name in stack traces, but the fact that the pull 
 request which makes it so that unittest block functions are named after their 
 line number has finally been merged in makes that less of an issue.

When has this been merged? It must have been after v2.060 was released. Because I noticed some number at the end of the unittest function names. But it was not the line number. Jens
Sep 20 2012
prev sibling next sibling parent "Tobias Pankrath" <tobias pankrath.net> writes:
On Thursday, 20 September 2012 at 16:52:40 UTC, Johannes Pfau 
wrote:
 snip

It should be possible to generate test cases programmatically [at compile time]. For instance, if I have a program that reads files in format A and produces B (e.g. a compiler), it should be possible to have a folder with both inputs and results and generate a test case for every possible input file (instead of one big test case covering all input files).
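
A rough sketch of the idea using `static foreach`, which generates one unittest block per entry of a compile-time table. `translate`, `Case` and `cases` are hypothetical names for illustration; the table is written inline here, and building it from files in a folder is not attempted in this sketch (string imports with `import("...")` and the -J switch could supply file contents, but the list of files would still have to come from somewhere).

```d
import std.uni : toUpper;

// Hypothetical stand-in for a program that turns format A into format B.
string translate(string input)
{
    return input.toUpper;
}

// One input together with its expected result.
struct Case
{
    string input;
    string expected;
}

// Hypothetical compile-time table of test cases.
enum Case[] cases = [
    Case("abc", "ABC"),
    Case("d lang", "D LANG"),
];

// Expands at compile time into one separate unittest block per table entry.
static foreach (c; cases)
{
    unittest
    {
        assert(translate(c.input) == c.expected);
    }
}

void main() {}
```

With per-unittest access as proposed in this thread, each generated block would then show up as its own entry instead of being folded into one big test.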
Sep 20 2012
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, September 20, 2012 22:55:23 Jens Mueller wrote:
 You say that JUnit silently runs all unittests before the first
 specified one, don't you?

Yes. At least, that was its behavior the last time that I used it (which was admittedly a few years ago).
 If that is done silently that's indeed strange.

It could have been a quirk of their implementation, but I expect that it's to avoid issues where a unit test relies on previous unit tests in the same file. If your unit testing functions (or unittest blocks in the case of D) have _any_ dependencies on external state, then skipping any of them affects the ones that you don't skip, possibly changing the result of the unit test (be it to success or failure).

Running more unittest blocks after a failure is similarly flawed, but at least in that case, you know that you had a failure earlier in the module, which should then tell you that you may not be able to trust further tests (but if you still run them, it's at least then potentially possible to fix further failures at the same time - particularly if your tests don't rely on external state). So, while not necessarily a great idea, it's not as bad to run subsequent unittest blocks after a failure (especially if programmers are doing what they're supposed to and making their unit tests independent).

However, what's truly insane IMHO is continuing to run a unittest block after it's already had a failure in it. Unless you have exceedingly simplistic unit tests, the failures after the first one mean pretty much _nothing_ and simply clutter the results.
 When has this been merged? It must have been after v2.060 was released.
 Because I noticed some number at the end of the unittest function names.
 But it was not the line number.

A couple of weeks ago IIRC. I'm pretty sure that it was after 2.060 was released. - Jonathan M Davis
Sep 20 2012
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-09-20 23:14, Jonathan M Davis wrote:

 Running more unittest blocks after a failure is similarly flawed, but at least
 in that case, you know that had a failure earlier in the module, which should
 then tell you that you may not be able to trust further tests (but if you
 still run them, it's at least then potentially possible to fix further failures
 at the same time - particularly if your tests don't rely on external state).
 So, while not necessarily a great idea, it's not as bad to run subsequent
 unittest blocks after a failure (especially if programmers are doing what
 they're supposed to and making their unit tests independent).

I don't agree. I think that unittest blocks designed so that they depend on other unittest blocks are equally flawed. There's a reason most testing frameworks have "setup" and "teardown" functions that are called before and after each test. With these functions you can restore the environment to a known state and have the tests keep running. On the other hand, if there's a failure in a test, continuing to run that test would be quite bad. -- /Jacob Carlborg
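
To illustrate the setup/teardown idea, here is a minimal sketch of a runner that resets shared state around every test. All names (`FixtureTest`, `setup`, `teardown`, `runWithFixtures`, `sharedState` and the sample tests) are hypothetical; D's built-in unittest blocks have no such hooks, so this assumes tests are available as individual functions.

```d
// Hypothetical test record for this sketch.
struct FixtureTest
{
    string name;
    void function() testFunc;
}

int sharedState; // external state that tests might accidentally depend on

void setup()    { sharedState = 0; } // bring the environment to a known state
void teardown() { sharedState = 0; } // clean up whatever the test left behind

// Each sample test passes only if it starts from the known state.
void incrementsOnce()  { ++sharedState;     assert(sharedState == 1); }
void incrementsTwice() { sharedState += 2;  assert(sharedState == 2); }

// Runs each test between setup() and teardown(); returns the number of failures.
size_t runWithFixtures(FixtureTest[] tests)
{
    size_t failures;
    foreach (t; tests)
    {
        setup();
        scope (exit) teardown(); // runs at the end of each iteration, even after a throw
        try
        {
            t.testFunc();
        }
        catch (Throwable)
        {
            ++failures; // the failed test is abandoned, the next one still runs
        }
    }
    return failures;
}

void main()
{
    auto tests = [
        FixtureTest("once", &incrementsOnce),
        FixtureTest("twice", &incrementsTwice),
    ];
    assert(runWithFixtures(tests) == 0);
}
```

Since every test starts from the same state, the order of the tests or skipping one of them does not change the outcome of the others.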
Sep 21 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-09-21 17:32, Johannes Pfau wrote:

 Well, I think we should just leave the basic unittest runner in
 druntime unchanged. There are unittests in phobos which depend on that
 behavior.

Yeah, this was more a philosophical discussion.
 Other projects can use a custom test runner like Jens Mueller's dtest.

 Ignoring assert in a test is not supported by this proposal. It would
 need much more work and it's probably not a good idea anyway.

There's core.exception.setAssertHandler, I hope your changes are compatible with that.
 But when porting gdc it's quite annoying if a unit test fails because
 of a compiler (codegen) error and you can't see the result of the
 remaining unit tests. If unit tests are not independent, this could
 cause some false positives, or crash in the worst case. But as long as
 this is not the default in druntime I see no reason why we should
 explicitly prevent it.
 Again, the default unit test runner in druntime hasn't changed _at all_.
 This just provides additional possibilities for test runners.

With core.exception.setAssertHandler and core.runtime.Runtime.moduleUnitTester I think I have everything I need to run the unit tests the way I want. -- /Jacob Carlborg
Sep 21 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-09-21 20:01, Johannes Pfau wrote:

 I didn't think of setAssertHandler. My changes are perfectly compatible
 with it.
 IIRC setAssertHandler has the small downside that it's used for all
 asserts, not only those used in unit tests? I'm not sure if that's a
 drawback or actually useful.

That's no problem, there's a predefined version, "unittest", when you pass the -unittest flag to the compiler:
version (unittest) setAssertHandler(myUnitTestSpecificAssertHandler);
-- /Jacob Carlborg
Sep 21 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-09-21 23:11, Jens Mueller wrote:

 But if you have an assert in some algorithm to ensure some invariant or
 in a contract it will be handled by myUnitTestSpecificAssertHandler.
 But I think that is not a drawback. Don't you want to know whenever an
 assert is violated?

Oh, you mean like that. Sure, but that will only show up as a failed test. For example, in the Ruby world there are two different testing frameworks: Rspec and test-unit. Rspec makes no distinction between a thrown exception and a failed test (assert). Test-unit, on the other hand, does distinguish between these scenarios. I'm leaning more towards the Rspec way of handling this. -- /Jacob Carlborg
Sep 22 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-09-22 19:43, Jens Mueller wrote:

 What does it mean to make no distinction in RSpec?
 Both should be reported. In D you just see either an AssertError or
 SomeException.

Test-unit would report something like this:
5 tests, 2 failures, 1 error
Failures would be asserts that triggered, errors would be thrown exceptions. Rspec on the other hand would report:
5 examples, 3 failures
It doesn't care if a failure is due to a thrown exception or a failed assert. Both of these also show a form of stack trace if an exception has been thrown. -- /Jacob Carlborg
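
Translated to D, the two reporting styles could be sketched roughly like this, assuming tests are available as individual functions. `Summary`, `run` and the sample tests are hypothetical names; the sample "assert failure" throws `core.exception.AssertError` directly so that the sketch behaves the same regardless of compiler switches.

```d
import core.exception : AssertError;
import std.stdio : writefln;

// Hypothetical tally of a test run.
struct Summary
{
    size_t tests;
    size_t failures; // asserts that triggered
    size_t errors;   // anything else that was thrown
}

// Runs all tests and classifies what each one threw, if anything.
Summary run(void function()[] tests)
{
    Summary s;
    foreach (t; tests)
    {
        ++s.tests;
        try
        {
            t();
        }
        catch (AssertError)
        {
            ++s.failures;
        }
        catch (Throwable)
        {
            ++s.errors;
        }
    }
    return s;
}

// Hypothetical sample tests.
void passes() {}
void failsAssert() { throw new AssertError("expected 1, got 2"); }
void throwsException() { throw new Exception("file not found"); }

void main()
{
    auto s = run([&passes, &failsAssert, &throwsException]);
    // test-unit style: failures and errors are counted separately
    writefln("%s tests, %s failures, %s errors", s.tests, s.failures, s.errors);
    // Rspec style: both kinds are simply failures
    writefln("%s examples, %s failures", s.tests, s.failures + s.errors);
}
```

The distinction only costs one extra catch clause, so a runner could offer either style.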
Sep 23 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-09-23 16:57, Jens Mueller wrote:

 How is the stack trace provided? Do you get a stack trace for each
 failure/error? Because that may clutter up the output. Maybe they stop
 at a predefined stack trace length.

Rspec with an exception thrown: http://pastebin.com/wVeQEBUh
With full backtrace: http://pastebin.com/Qb7XrXM2
The same for test-unit: http://pastebin.com/5wAG7mHS

-- /Jacob Carlborg
Sep 24 2012
prev sibling parent Jens Mueller <jens.k.mueller gmx.de> writes:
Jacob Carlborg wrote:
 On 2012-09-22 19:43, Jens Mueller wrote:
 
What does it mean to make no distinction in RSpec?
Both should be reported. In D you just see either an AssertError or
SomeException.

Test-unit would report something like this:

5 tests, 2 failures, 1 error

Failures would be asserts that triggered, errors would be thrown exceptions. Rspec on the other hand would report:

5 examples, 3 failures

It doesn't care if a failure is due to a thrown exception or a failed assert.

I see. Thanks for giving this example.
 Both of these also show a form of stack trace if an exception has
 been thrown.

How is the stack trace provided? Do you get a stack trace for each failure/error? Because that may clutter up the output. Maybe they stop at a predefined stack trace length.

Jens
Sep 23 2012
prev sibling next sibling parent Jens Mueller <jens.k.mueller gmx.de> writes:
Jonathan M Davis wrote:
 On Thursday, September 20, 2012 22:55:23 Jens Mueller wrote:
 You say that JUnit silently runs all unittests before the first
 specified one, don't you?

Yes. At least, that was its behavior the last time that I used it (which was admittedly a few years ago).
 If that is done silently that's indeed strange.

It could have been a quirk of their implementation, but I expect that it's to avoid issues where a unit test relies on previous unit tests in the same file. If your unit testing functions (or unittest blocks in the case of D) have _any_ dependencies on external state, then skipping any of them affects the ones that you don't skip, possibly changing the result of the unit test (be it to success or failure).

Running more unittest blocks after a failure is similarly flawed, but at least in that case, you know that you had a failure earlier in the module, which should then tell you that you may not be able to trust further tests (but if you still run them, it's at least then potentially possible to fix further failures at the same time - particularly if your tests don't rely on external state). So, while not necessarily a great idea, it's not as bad to run subsequent unittest blocks after a failure (especially if programmers are doing what they're supposed to and making their unit tests independent).

However, what's truly insane IMHO is continuing to run a unittest block after it's already had a failure in it. Unless you have exceedingly simplistic unit tests, the failures after the first one mean pretty much _nothing_ and simply clutter the results.

I sometimes have unittests like

assert(testProperty1());
assert(testProperty2());
assert(testProperty3());

And in these cases it would be useful to get all of the assertion failures. But you are quite right that this should be used with great care, knowing what you are doing. You may even lose sight of the actual problem among so many subsequent failures.
 When has this been merged? It must have been after v2.060 was released.
 Because I noticed some number at the end of the unittest function names.
 But it was not the line number.

A couple of weeks ago IIRC. I'm pretty sure that it was after 2.060 was released.

I just checked. It was merged in on Wed Sep 5 19:46:50 2012 -0700 (commit d3669f79813) and v2.060 was released 2nd of August. Meaning I could try calling these functions myself now that I know their names. Jens
Sep 20 2012
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Thu, 20 Sep 2012 22:41:23 +0400
schrieb Dmitry Olshansky <dmitry.olsh gmail.com>:

 On 20-Sep-12 22:18, bearophile wrote:
 Johannes Pfau:

 The perfect solution:
 Would allow user defined attributes on tests, so you could name
 them, assign categories, etc. But till we have those user defined
 attributes, this seems to be a good solution.

We have @disable, maybe it's usable for unittests too :-)


Actually @disable is better. version(none) completely ignores the test. But @disable could set a disabled bool in the UnitTest struct (and set the function pointer to null). This way you can easily get all disabled unittests:

./unittest --show-disabled
=========================== core.thread ============================
src/core/thread.d:1761
...

This can be implemented with 3 lines of additional code. The real question is if it's ok to reuse @disable for this.
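As a rough sketch of what a runner could do with such a flag: the struct below copies the proposed UnitTest layout and adds a `disabled` field, which is an assumption here and not part of the struct as proposed. `disabledLocations` is a hypothetical helper that collects the "file:line" strings a `--show-disabled` listing would print.

```d
import std.format : format;

/// The proposed struct, plus a `disabled` flag (an assumption, not in the proposal).
struct UnitTest
{
    string name;     // not used yet
    string fileName;
    uint line;
    void function() testFunc; // would be null for a disabled test
    bool disabled;   // hypothetical field set by the compiler
}

/// Returns "file:line" for every disabled test, in declaration order.
string[] disabledLocations(UnitTest[] tests)
{
    string[] locations;
    foreach (test; tests)
    {
        if (test.disabled)
            locations ~= format("%s:%s", test.fileName, test.line);
    }
    return locations;
}
```

Because the location is stored even when the function pointer is null, a disabled test stays visible to tooling, which version(none) cannot offer.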
Sep 21 2012
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Fri, 21 Sep 2012 11:11:49 +0200
schrieb Jacob Carlborg <doob me.com>:

 On 2012-09-20 21:11, Johannes Pfau wrote:
 
 Oh right, I thought that interface was more restrictive. So the only
 changes necessary in druntime are to adapt to the new compiler
 interface.

 The new dmd code is still necessary, as it allows to access
 all unittests of a module individually. The current code only
 provides one function for all unittests in a module.

Yes, exactly. There was some additional data, like file and module name, as well in your compiler changes?

The modulename can already be obtained from the moduleinfo. My proposal adds fileName and line information for every unittest. It's also prepared for unittest names: The name information is passed to the runtime, but currently it's always an empty string.

It could also allow marking unittests as @disable, if we wanted that.

Here's the dmd pull request:
https://github.com/D-Programming-Language/dmd/pull/1131

For user code, druntime changes are more interesting:
https://github.com/D-Programming-Language/druntime/issues/308

Custom test runner: http://dpaste.dzfl.pl/046ed6fb
Sample unittests: http://dpaste.dzfl.pl/517b1088
Output: http://dpaste.dzfl.pl/2780939b
Sep 21 2012
prev sibling next sibling parent Jens Mueller <jens.k.mueller gmx.de> writes:
Johannes Pfau wrote:
 Am Fri, 21 Sep 2012 11:11:49 +0200
 schrieb Jacob Carlborg <doob me.com>:
 
 On 2012-09-20 21:11, Johannes Pfau wrote:
 
 Oh right, I thought that interface was more restrictive. So the only
 changes necessary in druntime are to adapt to the new compiler
 interface.

 The new dmd code is still necessary, as it allows to access
 all unittests of a module individually. The current code only
 provides one function for all unittests in a module.

Yes, exactly. There was some additional data, like file and module name, as well in your compiler changes?

The modulename can already be obtained from the moduleinfo. My proposal adds fileName and line information for every unittest. It's also prepared for unittest names: The name information is passed to the runtime, but currently it's always an empty string.

Why do you need filename and line information of a unittest? If a unittest fails you'll get the relevant information. Why do you want the information when a unittest succeeded? I only care about failed unittests. A count of the number of executed unittests and total number is enough, I think.

Jens
Sep 21 2012
prev sibling next sibling parent Jens Mueller <jens.k.mueller gmx.de> writes:
Jacob Carlborg wrote:
 On 2012-09-21 14:19, Jens Mueller wrote:
 
Why do you need filename and line information of a unittest. If a
unittest fails you'll get the relevant information. Why do you want the
information when a unittest succeeded? I only care about failed
unittests. A count of the number of executed unittests and total number
is enough, I think.

But others might care about other things. It doesn't hurt if the information is available. There might be use cases when one would want to display all tests regardless of if they failed or not.

If there are use cases I agree. I do not know one. The question whether there are *tools* that report in case of success is easier to verify. Do you know any tool that does reporting in case of success? I think gtest does not do it. I'm not sure about JUnit.

But of course if a unittest has additional information and that is already implemented or easy to implement, that's fine with me. My point is more that for the common cases you do not need this. Maybe in most. Maybe in all.

Jens
Sep 21 2012
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Fri, 21 Sep 2012 11:25:10 +0200
schrieb Jacob Carlborg <doob me.com>:

 On 2012-09-20 23:14, Jonathan M Davis wrote:
 
 Running more unittest blocks after a failure is similarly flawed,
 but at least in that case, you know that had a failure earlier in
 the module, which should then tell you that you may not be able to
 trust further tests (but if you still run them, it's at least then
 potentially possible to fix further failures at the same time -
 particularly if your tests don't rely on external state). So, while
 not necessarily a great idea, it's not as bad to run subsequent
 unittest blocks after a failure (especially if programmers are
 doing what they're supposed to and making their unit tests
 independent).

I don't agree. I think that unittest blocks designed to depend on other unittest blocks are equally flawed. There's a reason that most testing frameworks have "setup" and "teardown" functions that are called before and after each test. With these functions you can restore the environment to a known state and have the tests keep running. On the other hand, if there's a failure in a test, continuing to run that test would be quite bad.

Well, I think we should just leave the basic unittest runner in druntime unchanged. There are unittests in phobos which depend on that behavior.

Other projects can use a custom test runner like Jens Mueller's dtest.

Ignoring assert in a test is not supported by this proposal. It would need much more work and it's probably not a good idea anyway.

But when porting gdc it's quite annoying if a unit test fails because of a compiler (codegen) error and you can't see the result of the remaining unit tests. If unit tests are not independent, this could cause some false positives, or crash in the worst case. But as long as this is not the default in druntime I see no reason why we should explicitly prevent it.

Again, the default unit test runner in druntime hasn't changed _at all_. This just provides additional possibilities for test runners.
Sep 21 2012
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Fri, 21 Sep 2012 16:37:37 +0200
schrieb Jens Mueller <jens.k.mueller gmx.de>:

 Jacob Carlborg wrote:
 On 2012-09-21 14:19, Jens Mueller wrote:
 
Why do you need filename and line information of a unittest. If a
unittest fails you'll get the relevant information.

With the recent name mangling change it's possible to get the unittest line if a test fails, but only if you have working backtraces. That might not be true for other compilers / non x86 architectures. To get the filename you have to demangle the unittest function name (IIRC core.demangle can't even demangle that name right now) and this only gives you the module name (which you could also get using moduleinfo though). It's also useful for disabled tests, so you can actually look them up.

Why do you want the information when a unittest succeeded? I only care about
failed unittests. A count of the number of executed unittests and
total number is enough, I think.



The posted example shows everything that can be done, even if it might not make sense. However, printing successful tests also has use cases:

1: It shows the progress of unit testing. (Not so important)
2: If code crashes and doesn't produce a backtrace, you still know which test crashed, as the file name and line number are printed before running the test. (This might sound improbable, but try porting gdc to a new architecture. I don't want to waste any more time commenting out unit tests to find the failing one in a file with dozens of tests on an ARM machine that takes ages to run the tests.)

Another use case is printing all unittests in a library. Or a GUI app displaying all unittests, allowing you to run only single unittests, etc.

Of course names are better than filename+line. But names need a change in the language, and filename+line is a useful identifier as long as we don't have names.
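The "print the location before running the test" idea can be sketched as below. This is only an illustration under assumptions, not the runner from the linked paste: `runVerbose`, the `report` callback and the two sample tests are hypothetical names, and the struct simply repeats the layout from the proposal.

```d
import std.format : format;

/// Layout of the struct as given in the proposal.
struct UnitTest
{
    string name;     // not used yet
    string fileName;
    uint line;
    void function() testFunc;
}

/// Reports each test's location *before* calling it, so that a hard crash
/// without a backtrace still leaves the culprit as the last reported line.
/// Keeps going after a failing test and returns the number of failed tests.
size_t runVerbose(UnitTest[] tests, void delegate(string) report)
{
    size_t failed;
    foreach (test; tests)
    {
        report(format("%s:%s", test.fileName, test.line));
        try
        {
            test.testFunc();
        }
        catch (Throwable t)
        {
            // Catching Throwable (which includes AssertError) is normally
            // discouraged; a test runner is one of the few places it is used.
            ++failed;
            report("  FAILED: " ~ t.msg);
        }
    }
    return failed;
}

// Hypothetical tests, for illustration only.
void goodTest() {}
void badTest() { throw new Exception("boom"); }
```

A console runner would pass something like a delegate that writes to stdout and flushes; a GUI could pass a delegate that updates a list view instead.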
 
 But others might care about other things. It doesn't hurt if the
 information is available. There might be use cases when one would
 want to display all tests regardless of if they failed or not.

If there are use cases I agree. I do not know one. The question whether there are *tools* that report in case of success is easier to verify. Do you know any tool that does reporting in case of success? I think gtest does not do it. I'm not sure about JUnit.

I don't know those tools, but I guess they have some sort of progress indicator? But I remember some .NET unit test GUIs that showed a green button for successful tests. But it's been years since I've done anything in .NET.
 But of course if a unittest has additional information and that is
 already implemented or easy to implement fine with me. My point is
 more that for the common cases you do not need this. Maybe in most.
 Maybe in all.

You usually don't have to print successful tests (although sometimes I wasn't sure if tests actually ran), but as you can't know at compile time which tests fail, you either have this information for all tests or for none.

The main reason _I_ want this is for gdc: We currently don't run the unit tests on gdc at all. I know they won't pass on ARM. But the unit tests error out on the first failing test. Often that error is a difficult-to-fix backend bug, and lots of simpler library bugs are hidden because the other tests aren't executed.

I'm actually kinda surprised the feedback on this is rather negative. I thought running unit tests individually and printing line/file/name was requested quite often?
Sep 21 2012
prev sibling next sibling parent "Tobias Pankrath" <tobias pankrath.net> writes:
 I'm actually kinda surprised the feedback on this is rather 
 negative. I
 thought running unit tests individually and printing 
 line/file/name was
 requested quite often?

I want to have this. My workflow is: Run all tests (run all). If some fail, see if there might be a common reason (so don't stop). Then run the unit tests that will most likely tell you what's wrong in a debugger (run one test individually).
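The "run one test individually" step of this workflow only needs a way to pick a test out of the per-module array. A small sketch, assuming the UnitTest layout from the proposal; `selectByLocation` is a hypothetical helper, and file name plus line stands in for a test name since names are not available yet.

```d
/// Layout of the struct as given in the proposal.
struct UnitTest
{
    string name;     // not used yet
    string fileName;
    uint line;
    void function() testFunc;
}

/// Returns the tests declared at the given file and line (normally zero or one),
/// so that just those can be re-run, for example under a debugger.
UnitTest[] selectByLocation(UnitTest[] tests, string fileName, uint line)
{
    UnitTest[] selected;
    foreach (test; tests)
    {
        if (test.fileName == fileName && test.line == line)
            selected ~= test;
    }
    return selected;
}
```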
Sep 21 2012
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Fri, 21 Sep 2012 19:15:13 +0200
schrieb Jacob Carlborg <doob me.com>:

 On 2012-09-21 17:32, Johannes Pfau wrote:
 
 Well, I think we should just leave the basic unittest runner in
 druntime unchanged. There are unittests in phobos which depend on
 that behavior.

Yeah, this was more a philosophical discussion.
 Other projects can use a custom test runner like Jens Mueller's
 dtest.

 Ignoring assert in a test is not supported by this proposal. It
 would need much more work and it's probably not a good idea anyway.

There's core.exception.setAssertHandler, I hope your changes are compatible with that.

Oh, I totally forgot about that. So the compiler is already calling into druntime on assert statements, so forget what I just said; setAssertHandler should work just fine.
 
 But when porting gdc it's quite annoying if a unit test fails
 because of a compiler (codegen) error and you can't see the result
 of the remaining unit tests. If unit tests are not independent,
 this could cause some false positives, or crash in the worst case.
 But as long as this is not the default in druntime I see no reason
 why we should explicitly prevent it.
 Again, the default unit test runner in druntime hasn't changed _at
 all_. This just provides additional possibilities for test runners.

With core.exception.setAssertHandler and core.runtime.Runtime.moduleUnitTester, I think that's the only thing I need to run the unit tests the way I want.

I didn't think of setAssertHandler. My changes are perfectly compatible with it. IIRC setAssertHandler has the small downside that it's used for all asserts, not only those used in unit tests? I'm not sure if that's a drawback or actually useful.
Sep 21 2012
prev sibling next sibling parent "David Piepgrass" <qwertie256 gmail.com> writes:
 However, what's truly insane IMHO is continuing to run a 
 unittest block after
 it's already had a failure in it. Unless you have exceedingly 
 simplistic unit
 tests, the failures after the first one mean pretty much 
 _nothing_ and simply
 clutter the results.

I disagree. Not only are my unit tests independent (so of course the test runner should keep running tests after one fails) but often I do want to keep running after a failure. I like the BOOST unit test library's approach, which has two types of "assert": BOOST_CHECK and BOOST_REQUIRE. After a BOOST_CHECK fails, the test keeps running, but BOOST_REQUIRE throws an exception to stop the test. When testing a series of inputs in a loop, it is useful (for debugging) to see the complete set of which ones succeed and which ones fail. For this feature (continuation) to be really useful though, it needs to be able to output context information on failure (e.g. "during iteration 13 of input group B").
Sep 21 2012
prev sibling next sibling parent Jens Mueller <jens.k.mueller gmx.de> writes:
Jacob Carlborg wrote:
 On 2012-09-21 20:01, Johannes Pfau wrote:
 
I didn't think of setAssertHandler. My changes are perfectly compatible
with it.
IIRC setAssertHandler has the small downside that it's used for all
asserts, not only those used in unit tests? I'm not sure if that's a
drawback or actually useful.

That's no problem, there's a predefined version, "unittest", when you pass the -unittest flag to the compiler: version (unittest) setAssertHandler(myUnitTestSpecificAssertHandler);

But if you have an assert in some algorithm to ensure some invariant or in a contract it will be handled by myUnitTestSpecificAssertHandler. But I think that is not a drawback. Don't you want to know whenever an assert is violated?

Jens
Sep 21 2012
prev sibling next sibling parent Jens Mueller <jens.k.mueller gmx.de> writes:
David Piepgrass wrote:
However, what's truly insane IMHO is continuing to run a unittest
block after
it's already had a failure in it. Unless you have exceedingly
simplistic unit
tests, the failures after the first one mean pretty much _nothing_
and simply
clutter the results.

I disagree. Not only are my unit tests independent (so of course the test runner should keep running tests after one fails) but often I do want to keep running after a failure. I like the BOOST unit test library's approach, which has two types of "assert": BOOST_CHECK and BOOST_REQUIRE. After a BOOST_CHECK fails, the test keeps running, but BOOST_REQUIRE throws an exception to stop the test. When testing a series of inputs in a loop, it is useful (for debugging) to see the complete set of which ones succeed and which ones fail. For this feature (continuation) to be really useful though, it needs to be able to output context information on failure (e.g. "during iteration 13 of input group B").

This leads us to the distinction of exceptions and errors. It is safe to catch exceptions but less so for errors. At least it is far more dangerous and less advised to continue execution but should not be prohibited I think. Jens
Sep 21 2012
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Fri, 21 Sep 2012 23:15:33 +0200
schrieb Jens Mueller <jens.k.mueller gmx.de>:

 I like the BOOST unit test library's approach, which has two types
 of "assert": BOOST_CHECK and BOOST_REQUIRE. After a BOOST_CHECK
 fails, the test keeps running, but BOOST_REQUIRE throws an exception
 to stop the test. When testing a series of inputs in a loop, it is
 useful (for debugging) to see the complete set of which ones succeed
 and which ones fail. For this feature (continuation) to be really
 useful though, it needs to be able to output context information on
 failure (e.g. "during iteration 13 of input group B").

This leads us to the distinction of exceptions and errors. It is safe to catch exceptions but less so for errors. At least it is far more dangerous and less advised to continue execution but should not be prohibited I think. Jens

I don't know, I'd expect Exceptions, Errors and assert to always exit the current scope, so I'd expect this to always work:

unittest
{
    throw new Exception("fail");
    assert(1 == 2);
    auto a = *cast(int*)null; //should never be executed
}

I think a proper unit test framework should supply an additional check method:

unittest
{
    check(a == 1); //always continue execution
    check(b == 2); //always continue execution
    check(c == 3); //always continue execution
}

This can be implemented in library code: Have a module level array of failed checks, then let check append to that array. Analyze the result after every unit test run and empty the array. Then call the next test and so on.
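A minimal sketch of that library-side `check`, following the description above: a module-level array, `check` appending to it, and the runner reading and emptying it around each test. `FailedCheck`, `failedChecks`, `runAndCollect` and `sampleTest` are hypothetical names chosen for this illustration. The default `__FILE__` / `__LINE__` arguments record where the failing check was written.

```d
/// One recorded failure of a non-fatal check.
struct FailedCheck
{
    string file;
    size_t line;
}

/// Module-level list of checks that failed in the test currently running.
FailedCheck[] failedChecks;

/// Records a failure instead of throwing, so the test keeps running.
void check(bool condition, string file = __FILE__, size_t line = __LINE__)
{
    if (!condition)
        failedChecks ~= FailedCheck(file, line);
}

/// Runs one test, then returns the checks that failed during it and
/// empties the module-level array before the next test is called.
FailedCheck[] runAndCollect(void function() test)
{
    failedChecks = null;
    test();
    auto result = failedChecks;
    failedChecks = null;
    return result;
}

// Hypothetical test, for illustration only.
void sampleTest()
{
    int a = 1, b = 0, c = 0;
    check(a == 1); // passes
    check(b == 2); // fails, execution continues
    check(c == 3); // fails as well, proving the previous failure did not stop the test
}
```

A plain assert inside the same test still exits the scope as usual, so `check` adds a non-fatal variant without changing what assert means.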
Sep 22 2012
prev sibling parent Jens Mueller <jens.k.mueller gmx.de> writes:
Johannes Pfau wrote:
 Am Fri, 21 Sep 2012 23:15:33 +0200
 schrieb Jens Mueller <jens.k.mueller gmx.de>:
 
 I like the BOOST unit test library's approach, which has two types
 of "assert": BOOST_CHECK and BOOST_REQUIRE. After a BOOST_CHECK
 fails, the test keeps running, but BOOST_REQUIRE throws an exception
 to stop the test. When testing a series of inputs in a loop, it is
 useful (for debugging) to see the complete set of which ones succeed
 and which ones fail. For this feature (continuation) to be really
 useful though, it needs to be able to output context information on
 failure (e.g. "during iteration 13 of input group B").

This leads us to the distinction of exceptions and errors. It is safe to catch exceptions but less so for errors. At least it is far more dangerous and less advised to continue execution but should not be prohibited I think. Jens

I don't know, I'd expect Exceptions, Errors and assert to always exit the current scope, so I'd expect this to always work:

unittest
{
    throw new Exception("fail");
    assert(1 == 2);
    auto a = *cast(int*)null; //should never be executed
}

That should definitely be the default. I think dmd also warns about such code.
 I think a proper unit test framework should supply an additional check
 method:
 unittest
 {
     check(a == 1); //always continue execution
     check(b == 2); //always continue execution
     check(c == 3); //always continue execution
 }
 
 This can be implemented in library code: Have a module level array of
 failed checks, then let check append to that array. Analyze the result
 after every unit test run and empty the array. Then call the next test
 and so on.

gtest also has ASSERT_* and EXPECT_*. I rarely used EXPECT_*. I think it complicates matters for little reason, because in D you then have assert, enforce, and check. I would prefer if assert and enforce were the only ones. Though semantics for enforce and check are a little different. An exception is like a check. It may fail and can be handled.

Jens
Sep 22 2012