
digitalmars.D - RTest, a random testing framework

reply Fawzi Mohamed <fmohamed mac.com> writes:
= RTest
== RTest a random testing framework

I wrote a framework to quickly write tests that check properties/functions 
using randomly generated data or all combinations of some values (full 
coverage).
This was inspired by Haskell's QuickCheck, but the result is quite different.

the code is at
	http://github.com/fawzi/rtest/tree/master

The idea is to be able to write tests as quickly and as painlessly as possible.
Typical use is as follows:
{{{
    import frm.rtest.RTest;

    private mixin testInit!() autoInitTst;

    void myTests(){
        // define a collection for my tests
        TestCollection myTests=new TestCollection("myTests",__LINE__,__FILE__);

        // define a test
        autoInitTst.testTrue("testName",functionToTest,__LINE__,__FILE__);
        // for example
        autoInitTst.testTrue("(2*x)%2==0",
            (int x){ return ((2*x)%2==0); },
            __LINE__,__FILE__);

        // run the tests
        myTests.runTests();
    }
}}}
If everything goes well not much should happen, because by default the 
printer does not write successes.
You can change the default controller as follows:
{{{
    SingleRTest.defaultTestController=new TextController(
        TextController.OnFailure.StopTest,
        TextController.PrintLevel.AllShort,Stdout);
}}}
and it should write out something like
{{{
    test`testName`          failures-passes/totalTests(totalCombinatorialRuns)
}}}
e.g.:
{{{
    test`assert(x*x<100)`                 0-100/100(100)
    test`assert(x*x<100)`                 0- 56/100(100)
}}}
If one wants to run three times as many tests:
{{{
    myTests.runTests(3);
}}}
If a test fails then it will print out something like this:
{{{
    test`(2*x)%4==0 || (2*x)%4==2` failed (returned false instead of true)
    arg0: -802454419

    To reproduce:
     initial rng state: 
CMWC000000003ade6df6_00000020_595a6207_2a7a7b53_e59a5471_492be655_75b9b464_f45bb6b8_c5af6b1d_1eb47eb9_ff49627d_fe4cecb1_fa196181_ab208cf5_cc398818_d75acbbc_92212c68_ceaff756_c47bf07b_c11af291_c1b66dc4_ac48aabe_462ec397_21bf4b7a_803338ab_c214db41_dc162ebe_41a762a8_7b914689_ba74dba0_d0e7fa35_7fb2df5a_3beb71fb_6dcee941_0000001f_2a9f30df_00000000_00000000
 

    counter: [0]
    ERROR test `(2*x)%4==0 || (2*x)%4==2` from `test.d:35` FAILED!!
    -----------------------------------------------------------
    test`(2*x)%4==0 || (2*x)%4==2`   1-  0/  1(  1)
}}}
From it you can see the arguments that made the test fail.
If you want to re-run exactly that case you can append .runTests(1,seed,counter), e.g.:
{{{
autoInitTst.testTrue("(2*x)%4==0 || (2*x)%4==2 (should fail)",
    (int x){ return ((2*x)%4==0 || (2*x)%4==2); },
    __LINE__,__FILE__).runTests(1,
    "CMWC000000003ade6df6_00000020_595a6207_2a7a7b53_e59a5471_492be655_75b9b464_f45bb6b8_c5af6b1d_1eb47eb9_ff49627d_fe4cecb1_fa196181_ab208cf5_cc398818_d75acbbc_92212c68_ceaff756_c47bf07b_c11af291_c1b66dc4_ac48aabe_462ec397_21bf4b7a_803338ab_c214db41_dc162ebe_41a762a8_7b914689_ba74dba0_d0e7fa35_7fb2df5a_3beb71fb_6dcee941_0000001f_2a9f30df_00000000_00000000",[0]);
}}}
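The reproduce-from-seed mechanism above boils down to seeding a deterministic PRNG: replaying the saved state regenerates the exact inputs that failed. Here is a minimal sketch of that idea in Python rather than D (function and variable names are made up for illustration, not RTest's API):

```python
import random

def run_test(prop, n_runs, seed):
    """Run `prop` against `n_runs` random ints drawn from a PRNG seeded
    with `seed`; reseeding replays the exact same input sequence."""
    rng = random.Random(seed)
    for _ in range(n_runs):
        x = rng.randint(-2**31, 2**31 - 1)
        if not prop(x):
            return False, x   # report the failing argument, like arg0 above
    return True, None

# A deliberately failing property, like the (2*x)%4 example above.
ok, arg = run_test(lambda x: (2 * x) % 4 == 0, 100, seed=12345)
# Re-running with the saved seed regenerates the identical failing input.
ok2, arg2 = run_test(lambda x: (2 * x) % 4 == 0, 100, seed=12345)
```

RTest's saved state is the full CMWC generator state rather than a small integer seed, but the replay principle is the same.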

If the default generator is not good enough you can create tests that use 
a custom generator like this:
{{{
    private mixin testInit!(manualInit,checkInit) customTst;
}}}
in manualInit you have the following variables:
  arg0, arg1, ... : the first, second, ... argument, which you can initialize
  arg0_i, arg1_i, ... : index variables for combinatorial (exhaustive) coverage;
    if you use one you probably also want to initialize the next variable
  arg0_max, arg1_max, ... : can be set to an integer giving the maximum value
    of arg0_i+1, arg1_i+1, ...; giving it a value activates the combinatorial
    machinery and does not set test.hasRandom to true for this variable
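The index/max variables above drive an exhaustive cross product over the argument values, and the `counter` printed in a failure report is just the run's position in that product. Sketched in Python with made-up values (not RTest code):

```python
from itertools import product

# Hypothetical setup: two arguments with arg0_max = 3 and arg1_max = 2.
# Combinatorial coverage runs the test once per element of the cross
# product; a counter like [arg0_i, arg1_i] identifies each run for
# reproduction, which is what the `counter: [0]` line above indexes.
arg0_values = [0, 1, 2]        # selected by arg0_i, so arg0_max = 3
arg1_values = ["a", "b"]       # selected by arg1_i, so arg1_max = 2

runs = [([i, j], (arg0_values[i], arg1_values[j]))
        for i, j in product(range(3), range(2))]
```

Here `len(runs)` is the totalCombinatorialRuns figure shown in the summary line.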
If an argument is not defined the default generation procedure
{{{
    Rand r=...;
    argI=generateRandom!(typeof(argI))(r);
}}}
is used.
checkInit can be used when the generation of random configurations is mostly
good, but might produce some configurations that should be skipped. In
checkInit one should set the boolean variable "acceptable" to false if the
configuration should be skipped.

For example:
{{{
    private mixin testInit!("arg0=r.uniformR(10);") smallIntTst;
}}}
then gets used as follows:
{{{
    smallIntTst.testTrue("x*x<100",(int x){ return (x*x<100); },
        __LINE__,__FILE__).runTests();
}}}
By the way, this is also a faster way to set up a test: as you can see,
you don't need to define a collection (though it is probably a good idea
to define one).
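The checkInit mechanism described above is essentially rejection sampling: keep drawing configurations until one is acceptable. A small Python sketch of the idea (the function names here are made up for illustration, not part of RTest):

```python
import random

def generate_acceptable(rng, generate, acceptable, max_tries=1000):
    """Draw candidates until `acceptable` passes, mirroring checkInit's
    skip-the-configuration-when-acceptable-is-false behaviour."""
    for _ in range(max_tries):
        candidate = generate(rng)
        if acceptable(candidate):
            return candidate
    raise RuntimeError("no acceptable configuration found")

rng = random.Random(7)
# e.g. generate small ints but skip zero (say, a divisor must be non-zero)
value = generate_acceptable(rng, lambda r: r.randint(-5, 5),
                            lambda x: x != 0)
```

This works well when acceptable configurations are common; when they are rare, a custom generator that builds valid inputs directly is the better tool.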

enjoy

Fawzi Mohamed
Jul 21 2008
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Fawzi Mohamed:
 = RTest
 == RTest a random testing framework

This code:

    template isFloat(T){
        static if(is(T==float)||is(T==double)||is(T==real)){
            const bool isFloat=true;
        } else {
            const bool isFloat=false;
        }
    }

Can be written as:

    template isFloat(T){
        const bool isFloat = is(T == float) || is(T == double) || is(T == real);
    }

Or using a nicer template:

    template isFloat(T){
        const bool isFloat = IsType!(T, float, double, real);
    }

Bye,
bearophile
Jul 21 2008
prev sibling next sibling parent reply BCS <ao pathlink.com> writes:
Reply to Fawzi,

I'm not quite following that; it seems to randomly select test values to 
supply to a function.
While I like that idea, the implementation doesn't appeal to me (I have never 
liked string mixins if anything else is usable). I had an idea a while ago 
that might make for a better interface and might even allow better application 
of constraints:

double x,y;
Assert(1/x=pow(x,-1)).
    Where(x).NotZero.
    TestRandom();

Assert(x<y).
    Where(x).LessThan(y).
    Where(x).InRange(-10,10).
    TestEdges(1000); //< 1000 test points

Assert's arg would be lazy bool and the delegate would be stored; the Where 
would use ref so it can pick up a pointer to x. Some sort of internal magic 
would keep track of what constraints apply to what and let the Test* functions 
intelligently search the envelope.
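To make the proposal concrete, here is a rough Python sketch of such a fluent builder (all names are hypothetical; no ref/lazy magic, just explicit lambdas over a variable environment):

```python
import random

class Where:
    """Collects constraints on one named variable for the test builder."""
    def __init__(self, test, name):
        self.test, self.name = test, name
    def in_range(self, lo, hi):
        # the range clause drives the generator for this variable
        self.test.gens[self.name] = lambda rng: rng.uniform(lo, hi)
        return self.test
    def not_zero(self):
        # a pure predicate: only used to filter generated cases
        self.test.checks.append(lambda env, n=self.name: env[n] != 0)
        return self.test

class Assert:
    def __init__(self, prop):
        self.prop, self.gens, self.checks = prop, {}, []
    def where(self, name):
        return Where(self, name)
    def test_random(self, n=100, seed=0):
        rng = random.Random(seed)
        for _ in range(n):
            env = {k: g(rng) for k, g in self.gens.items()}
            if all(c(env) for c in self.checks) and not self.prop(env):
                return env                  # a failing assignment
        return None                         # no counterexample found

# 1/x == pow(x,-1) away from zero, compared with a relative tolerance
failure = (Assert(lambda e: abs(1 / e["x"] - e["x"] ** -1)
                  <= 1e-12 * abs(1 / e["x"]))
           .where("x").in_range(-10, 10)
           .where("x").not_zero()
           .test_random())
```

In D the Where clauses would capture the variables by ref instead of passing an environment dictionary around, but the chaining shape is the same.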
Jul 21 2008
parent reply Fawzi Mohamed <fmohamed mac.com> writes:
On 2008-07-22 00:12:51 +0200, BCS <ao pathlink.com> said:

 Reply to Fawzi,
 
 I'm not quite following that, it seems to randomly select test values 
 to supply to a function.

Hi BCS,
that is exactly the idea: test a function with random values as input, so that many test cases can be generated with little effort on your side (I dislike writing test cases very much :).

These random values do not have to be basic types; they can be structures, classes, typedefs. Actually, if one is using too many custom generators, one should probably define one's own type and write T randomGenerate(T:MyType) for it.
 While I like that Idea, the implementation doesn't appeal to me (I have 
 never liked string mixins if anything else is usable).

I agree that string mixins are a feature that should be avoided if possible, but I didn't see how to get a similar level of effort without them. I see them as the equivalent of dynamically typed languages vs. statically typed ones: you lose many checks, but you can gain simplicity and expressiveness. There are many discussions about it, but it seems clear to me that each choice has its own pros and cons, and depending on the problem one might be more or less suited.
  I had an idea a while ago to might make for a better interface and 
 might even allow better application of constraints:
 
 double x,y;
 Assert(1/x=pow(x,-1)).
     Where(x).NotZero.
     TestRandom();
 
 Assert(x<y).
     Where(x).LessThan(y).
     Where(x).InRange(-10,10).
     TestEdges(1000); //< 1000 test points
 
 Assert's arg would be lazy bool and the delegate would be storeed, the 
 Where would use ref so it can pick up a pointer to x. Some sort of 
 internal magic would keep track of what constraints apply to what and 
 let the Test* functions inelegantly search the envelope.

I like your proposal, but I fail to see how it can scale to more complex cases. Indeed my examples were very simple, but a "real" case looks like this:

You have a function that solves linear systems of equations, and you want to test it. The matrix should be square, and the b vector should have the same size as the matrix dimension. So either you define an ad-hoc structure, or you write a custom generator for it (it is quite unlikely that the constraints are satisfied just by chance, and you would spend all your time waiting for a valid test case). Then (if detA>0) you can check that the solution really solves the system of equations with a small residual error. Your test can fail in many ways, also due to the internal checks of the equation solver, and you always want a nice report that lets you reproduce the problem.

Another typical use case is when you have a slow reference implementation for something and a fast one, and you want to be sure they are the same.

I think that my approach works well in those cases, and I don't see how your "magic" could work, but I would like to be shown wrong :)

Fawzi
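The linear-solver scenario can be sketched in Python (a stand-in solver and a made-up generator, not RTest code): the custom generator enforces the shape and non-singularity constraints up front, and the property checks the residual.

```python
import random

def solve(A, b):
    """The 'function under test': Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]       # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def random_system(rng, n):
    """Custom generator: a square, diagonally dominant A (so far from
    singular) and a b of matching length; constraints that a blind random
    generator would almost never satisfy by chance."""
    A = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    for i in range(n):
        A[i][i] += n                   # enforce diagonal dominance
    b = [rng.uniform(-1, 1) for _ in range(n)]
    return A, b

rng = random.Random(42)
max_residual = 0.0
for _ in range(50):
    A, b = random_system(rng, 3)
    x = solve(A, b)
    for i in range(3):
        r = abs(sum(A[i][j] * x[j] for j in range(3)) - b[i])
        max_residual = max(max_residual, r)
```

The residual check is the property; the generator, not a filter, is what guarantees every case is a valid one.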
Jul 22 2008
parent reply BCS <ao pathlink.com> writes:
Reply to Fawzi,

 On 2008-07-22 00:12:51 +0200, BCS <ao pathlink.com> said:
 
 Reply to Fawzi,
 
 I'm not quite following that, it seems to randomly select test values
 to supply to a function.
 

 that is exactly the idea: test a function with random values as input, so 
 that many test cases can be generated with little effort [...]
 While I like that Idea, the implementation doesn't appeal to me (I
 have never liked string mixins if anything else is usable).
 

 I agree the string mixins are a feature that should be avoided if 
 possible [...]
 I had an idea a while ago to might make for a better interface and
 might even allow better application of constraints:
 
 double x,y;
 Assert(1/x=pow(x,-1)).
 Where(x).NotZero.
 TestRandom();
 Assert(x<y).
 Where(x).LessThan(y).
 Where(x).InRange(-10,10).
 TestEdges(1000); //< 1000 test points
 Assert's arg would be lazy bool and the delegate would be storeed,
 the Where would use ref so it can pick up a pointer to x. Some sort
 of internal magic would keep track of what constraints apply to what
 and let the Test* functions inelegantly search the envelope.
 

 I like your proposal but I fail to see how it can scale to more complex 
 cases. [...] I think that my approach works well in those cases, and I 
 don't see how your "magic" could work, but I would like to be shown wrong :)

The way the magic above would work is that the Where clauses are not just tested, they are used to define the test envelope. E.g. the clause Where(x).InRange(-10,10) would define a random number generator that sets x to a random number in the given range; then the Where(x).LessThan(y) clause would be evaluated and a random number generated for y that is always in the specified range (in this case [-inf, x]). Thus in the given cases, ALL test cases would be valid.

Other 'Where' predicates could be defined that don't drive the generator but are only checked to see if the test case is valid (like the det()!=0 example you gave), or even only used to determine the expected result so that correct failure cases could be checked as well. Other variants of the where clause could be used, like:

With(arg).InRange(0,8).Eval(matrix.SquareArray(arg)). // set arg in [0,8] then eval the expression

The overloading needed to mix ref and lazy parameters with constant arguments would take some work, but I think it is doable. There would also need to be quite a bit of magic in the final test function to find test cases within the provided constraints, but for many cases this could be made a lot better than guess and check. I would love to take a crack at the problem but I have about 4 different projects in the queue already.
 Fawzi
 

p.s. typo: Assert(1/x=pow(x,-1)). -> Assert(1/x==pow(x,-1)).
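The envelope-driving idea described above, where each constraint narrows the generator instead of filtering its output, might look like this Python sketch (hypothetical names, not an implementation of the proposal):

```python
import random

def gen_case(rng):
    """Constraint-driven generation: the InRange clause drives x's generator,
    then the ordering clause drives y's (via a strictly positive offset), so
    every generated case satisfies x < y and nothing has to be rejected."""
    x = rng.uniform(-10, 10)              # Where(x).InRange(-10, 10)
    y = x + rng.uniform(0.001, 100.0)     # derived from the LessThan clause
    return x, y

rng = random.Random(1)
cases = [gen_case(rng) for _ in range(1000)]
valid = sum(1 for x, y in cases if x < y)
```

Compare with generate-then-reject: here the constraint solver has done its work at generation time, so the validity rate is 100% by construction.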
Jul 22 2008
parent reply Fawzi Mohamed <fmohamed mac.com> writes:
On 2008-07-22 19:22:19 +0200, BCS <ao pathlink.com> said:

 Reply to Fawzi,
 
 On 2008-07-22 00:12:51 +0200, BCS <ao pathlink.com> said:
 
 Reply to Fawzi,
 
 [...]
 I had an idea a while ago to might make for a better interface and
 might even allow better application of constraints:
 
 double x,y;
 Assert(1/x=pow(x,-1)).
 Where(x).NotZero.
 TestRandom();
 Assert(x<y).
 Where(x).LessThan(y).
 Where(x).InRange(-10,10).
 TestEdges(1000); //< 1000 test points
 Assert's arg would be lazy bool and the delegate would be storeed,
 the Where would use ref so it can pick up a pointer to x. Some sort
 of internal magic would keep track of what constraints apply to what
 and let the Test* functions inelegantly search the envelope.
 

complex cases. [...]

 The way the magic above would work is that the Where clauses are not just 
 tested, they are used to define the test envelope. [...]

Ok, I see how this could work: you have one Where clause for each generator, and that is used to pick up the type and the address of the variable and store them in some structure (like a Variant). Then subsequent messages would set the generator, and maybe constraints.

It could be done, and would be an interesting and challenging project... Would it be easier to use or simpler to implement than string mixins? I don't know, probably not, but still an interesting approach.

thanks for the feedback BCS!

Fawzi
 p.s. typo: Assert(1/x=pow(x,-1)).  -> Assert(1/x==pow(x,-1)).

and the second where should be Where(y) :)
Jul 22 2008
parent reply BCS <ao pathlink.com> writes:
Reply to Fawzi,


 ok I see how this could work, you have one where clause for each
 generator, and that is used to pick up the type and the address of the
 variable and store them is some structure (like a Variant).
 Then subsequent messages would set the generator, and maybe
 constraints.
 It could be done, and would be an interesting and challenging
 project...
 Would it be easier to use or simpler to implement than string mixins?
 I don't know, probably not, still an interesting approach.
 

Easier to use, yes. For one, it would syntax-highlight correctly! Also the parse errors would get better messages. Easier to write? Probably not, but it might not be harder either.
 thanks for the feedback BCS!
 
 Fawzi
 
 p.s. typo: Assert(1/x=pow(x,-1)).  -> Assert(1/x==pow(x,-1)).
 


There was a reason I did it that way... darn, I forget what. But you are correct, that line is wrong (though switching to y might not be the solution).
Jul 22 2008
parent reply Fawzi Mohamed <fmohamed mac.com> writes:
On 2008-07-23 00:54:49 +0200, BCS <ao pathlink.com> said:

 Reply to Fawzi,
 
 
 ok I see how this could work, you have one where clause for each
 generator, and that is used to pick up the type and the address of the
 variable and store them is some structure (like a Variant).
 Then subsequent messages would set the generator, and maybe
 constraints.
 It could be done, and would be an interesting and challenging
 project...
 Would it be easier to use or simpler to implement than string mixins?
 I don't know, probably not, still an interesting approach.
 

Easier to use, yes. For one, it would syntax highlight correctly!

fair enough
  Also the parse errors get better messages.

well I spent some effort in making that better: if there is a syntax error it catches it and writes you a message saying there is an error, showing your arguments and the core part of the generated mixin. Not perfect, but much better than the default behavior.
  Easier to write? probably not, but it might not be harder either.

True; in my case expression mixins (a D 2.0 feature) would have made the interface and usage a little bit better, but well... Fawzi
Jul 23 2008
next sibling parent Fawzi Mohamed <fmohamed mac.com> writes:
On 2008-07-23 09:20:47 +0200, Fawzi Mohamed <fmohamed mac.com> said:

 [...]

and I forgot to say that my hope is that the most common case will be using the default generator for each type (so instantiating the template with no arguments). QuickCheck enforces this: if you want another generator you need to define a typedef. I leave more freedom; one can (and should) do it if one thinks that generator is going to be used often.
 

Jul 23 2008
prev sibling parent reply BCS <ao pathlink.com> writes:
Reply to Fawzi,

 On 2008-07-23 00:54:49 +0200, BCS <ao pathlink.com> said:
 
 Also the parse errors get better messages.
 

 well I spent some effort in making that better [...]

The better thing about non-string-mixin code is that the error happens at the point of the error; there is no way to tell where the string is defined and generate an error there.
Jul 23 2008
next sibling parent Fawzi Mohamed <fmohamed mac.com> writes:
On 2008-07-23 18:48:50 +0200, BCS <ao pathlink.com> said:

 Reply to Fawzi,
 
 On 2008-07-23 00:54:49 +0200, BCS <ao pathlink.com> said:
 
 Also the parse errors get better messages.
 

 well I spent some effort in making that better [...]
 
 the better thing about non string mixin code is that the error happens at 
 the point of the error; there is no way to tell where the string is defined 
 and generate an error there.

Well, I could get rid of the mixin for the manual init and force people to use typedefs... mmhh, first I will try to use it a little, then I will re-evaluate the decision. For the exclusion of bad cases I think that the mixin is still the best solution. Well, actually an expression mixin would be better, but in D 1.0...
Jul 23 2008
prev sibling parent reply Don <nospam nospam.com.au> writes:
BCS wrote:
 Reply to Fawzi,
 
 On 2008-07-23 00:54:49 +0200, BCS <ao pathlink.com> said:

 Also the parse errors get better messages.

 [...]
 
 the better thing about non string mixin code is that the error happens at 
 the point of the error; there is no way to tell where the string is defined 
 and generate an error there.

Yes there is! If you detect an error, instead of returning your mixed-in string, you return `static assert(0, "Found an error in your code");` and then the error points at the line of the user's code. That's actually much better than you can do with templates.
Jul 24 2008
parent BCS <ao pathlink.com> writes:
Reply to don,

 BCS wrote:
 
 Reply to Fawzi,
 
 On 2008-07-23 00:54:49 +0200, BCS <ao pathlink.com> said:
 
 Also the parse errors get better messages.
 

 [...]
 
 If you detect an error, instead of returning your mixed-in string, you 
 return `static assert(0, "Found an error in your code");` and then the 
 error points at the line of the user's code. That's actually much better 
 than you can do with templates.

option 1:

    int Foo(char[] foo)() { return mixin(foo); }
    int Bob()() { Foo!("blab")(); return 0; }
    void main() { Bob!()(); }

errors:

    t2.d(1): Error: undefined identifier blab // the relevant error message
    t2.d(2): template instance t2.Foo!("blab") error instantiating // the relevant line#
    t2.d(3): template instance t2.Bob!() error instantiating

note that there could be an arbitrary number of errors between lines 1 & 2 and between 2 & 3 depending on how much processing is done.

option 2:

    int Foo2(T)(lazy T t) { return t(); }
    int Bob2()() { Foo2!(int)(blab); return 0; }
    void main() { Bob2!()(); }

errors:

    t2.d(2): Error: undefined identifier blab // the relevant one
    t2.d(3): template instance t2.Bob2!() error instantiating

I think having the same error message give me both the line number of the actual error and what's wrong is a lot better than splitting them across who knows how much space. And to boot, it's at the top of the list.
Jul 24 2008
prev sibling next sibling parent JAnderson <ask me.com> writes:
Fawzi Mohamed wrote:
 = RTest
 == RTest a random testing framework
 
 [...]

Nice! Although I'm not exactly sure what the process is above with your code.

I've often thought about writing a tool that automatically creates unit tests. Something that you could give an object, and it would test every function in many random ways and sequences. It would validate based on the outputs of the functions. So more or less it would be a test to make sure that the function's behavior doesn't change, rather than a check to make sure it works in the first place. However you could look at the results that were generated and touch it up a little to give it better coverage.

-Joel
Jul 22 2008
prev sibling parent reply "Bruce Adams" <tortoise_74 yeah.who.co.uk> writes:
On Mon, 21 Jul 2008 22:30:58 +0100, Fawzi Mohamed <fmohamed mac.com> wrote:

 = RTest
 == RTest a random testing framework

 I wrote a framework to quickly write tests that check property/functions  
 using randomly generated data or all combinations of some values (full  
 coverage).
 This was inspired by Haskell's Quickcheck, but the result is quite  
 different.

Test cases need to be deterministic and repeatable, otherwise you don't have much chance of tracking down problems when your tests fail. That said, automatically generating test cases from a deterministic pseudo-random number generator might still have its uses. It might be an idea to add something that checks you have sufficient coverage of the ranges/domains involved, statistically speaking.

Regards,

Bruce.
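The statistical coverage check suggested here could be as simple as binning the generated values and flagging empty bins. A Python sketch of the idea (hypothetical helper, not part of RTest):

```python
import random

def coverage_histogram(rng, n_samples, lo, hi, n_bins):
    """Bin generated values over [lo, hi); an empty bin suggests the
    generator is not covering part of the domain."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for _ in range(n_samples):
        v = rng.uniform(lo, hi)
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return counts

counts = coverage_histogram(random.Random(3), 10000, -10.0, 10.0, 20)
```

A chi-squared test against the intended distribution would be the more rigorous version of the same check.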
Jul 22 2008
next sibling parent reply BCS <ao pathlink.com> writes:
Reply to Bruce,

 On Mon, 21 Jul 2008 22:30:58 +0100, Fawzi Mohamed <fmohamed mac.com>
 wrote:
 
 = RTest
 == RTest a random testing framework
 I wrote a framework to quickly write tests that check
 property/functions
 using randomly generated data or all combinations of some values
 (full
 coverage).
 This was inspired by Haskell's Quickcheck, but the result is quite
 different.

 Test cases need to be deterministic and repeatable otherwise you don't have 
 much chance of tracking down problems when your tests fail. [...] It might 
 be an idea to add something that checks you have sufficient coverage of the 
 range/domains involved statistically speaking.

I think there was something in the OP about dumping the entropy needed to reproduce failed test cases. As to checking the domain, my proposal might be of use there, as it could explicitly define the edges (where most of the interesting stuff happens) and then concentrate checks there.
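Concentrating checks on the edges, as proposed, might look like this Python sketch (a hypothetical helper, not RTest's API): the known edge values are emitted first, then random interior points fill out the budget.

```python
import random

def edge_biased_ints(rng, lo, hi, n):
    """Yield the edges of [lo, hi] first (where most of the interesting
    stuff happens), then random interior points."""
    edges = [lo, lo + 1, -1, 0, 1, hi - 1, hi]
    for e in edges:
        if lo <= e <= hi:
            yield e
    for _ in range(n - len(edges)):
        yield rng.randint(lo, hi)

values = list(edge_biased_ints(random.Random(9), -(2**31), 2**31 - 1, 50))
```

This is the "90% of time in the 3% of the domain" idea in miniature: the edges are always exercised, regardless of how few runs the budget allows.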
Jul 22 2008
parent reply dsimcha <dsimcha yahoo.com> writes:
I disagree.  Random testing can be a great way to find subtle bugs in relatively
complex algorithms that have a simpler but less efficient equivalent.  For
example, let's say you're trying to write a super-efficient implementation of a
hash table with lots of little speed hacks that could hide subtle bugs in
something that's only called a relatively small percentage of the time to begin
with, like collision resolution.  Then, let's say that this bug only shows up
under some relatively specific combination of inputs.  An easy way to be
reasonably sure that you don't have these kinds of subtle bugs would be to also
implement an associative array as a linear search just for testing.  This is
trivial to implement, so unlike your uber-optimized hash table, if it looks
right
it probably is.  In any event, it's even less likely to be wrong in the same way
as your hash table.  Then generate a ton of random data and put it in both your
hash table and your linear search and make sure it all reads back properly.  If
the bug is subtle enough, or if you don't think of it, it may just be near
impossible to manually generate enough test cases to find it.
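dsimcha's slow-reference strategy can be sketched in Python (hypothetical code, not RTest or a real hash table library): a trivially correct linear-search map double-checks a "clever" open-addressing hash table against a pile of seeded random data.

```python
import random

class LinearMap:
    """Trivially correct reference: O(n) lookup over a list of pairs."""
    def __init__(self):
        self.pairs = []
    def put(self, k, v):
        for i, (k2, _) in enumerate(self.pairs):
            if k2 == k:
                self.pairs[i] = (k, v)
                return
        self.pairs.append((k, v))
    def get(self, k):
        for k2, v in self.pairs:
            if k2 == k:
                return v
        return None

class OpenAddressMap:
    """The 'uber-optimized' implementation under test: linear probing."""
    def __init__(self, cap=8):
        self.keys, self.vals, self.used = [None] * cap, [None] * cap, 0
    def _slot(self, k):
        i = hash(k) % len(self.keys)
        while self.keys[i] is not None and self.keys[i] != k:
            i = (i + 1) % len(self.keys)    # collision resolution lives here
        return i
    def put(self, k, v):
        if (self.used + 1) * 2 > len(self.keys):   # grow at 50% load
            old = list(zip(self.keys, self.vals))
            self.keys = [None] * (2 * len(self.keys))
            self.vals = [None] * len(self.keys)
            self.used = 0
            for k2, v2 in old:
                if k2 is not None:
                    self.put(k2, v2)
        i = self._slot(k)
        if self.keys[i] is None:
            self.used += 1
        self.keys[i], self.vals[i] = k, v
    def get(self, k):
        i = self._slot(k)
        return self.vals[i] if self.keys[i] == k else None

rng = random.Random(1234)        # fixed seed keeps any failure reproducible
ref, fast = LinearMap(), OpenAddressMap()
for _ in range(2000):
    k, v = rng.randint(0, 300), rng.randint(0, 10**9)   # small key range
    ref.put(k, v)                # forces plenty of collisions
    fast.put(k, v)

# read everything back through both implementations and compare
mismatches = [k for k in range(301) if ref.get(k) != fast.get(k)]
```

The narrow key range deliberately hammers the collision-resolution path, which is the code a handful of hand-written cases would be least likely to exercise.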
Jul 22 2008
next sibling parent reply BCS <ao pathlink.com> writes:
Reply to dsimcha,

 I disagree. 

I'm not sure you do, as I'm not sure what you are disagreeing with. All I was saying is that most (not all) errors are edge cases, so spend more time (but not all of it) plugging away there. If 90% of the errors can be found in 3% of the domain, I'd rather spend 90% of my time in that 3%. Aside from that, I have no issues with your assertions.
 Random testing can be a great way to find subtle bugs in
 relatively complex algorithms that have a simpler but less efficient
 equivalent.  For example, let's say you're trying to write a
 super-efficient implementation of a hash table with lots of little
 speed hacks that could hide subtle bugs in something that's only
 called a relatively small percentage of the time to begin with, like
 collision resolution.  Then, let's say that this bug only shows up
 under some relatively specific combination of inputs.  An easy way to
 be reasonably sure that you don't have these kinds of subtle bugs
 would be to also implement an associative array as a linear search
 just for testing.  This is trivial to implement, so unlike your
 uber-optimized hash table, if it looks right it probably is.  In any
 event, it's even less likely to be wrong in the same way as your hash
 table.  Then generate a ton of random data and put it in both your
 hash table and your linear search and make sure it all reads back
 properly.  If the bug is subtle enough, or if you don't think of it,
 it may just be near impossible to manually generate enough test cases
 to find it.
 

Jul 22 2008
parent reply Fawzi Mohamed <fmohamed mac.com> writes:
On 2008-07-23 06:43:53 +0200, Jesse Phillips <jessekphillips gmail.com> said:

 [...]
 I agree with Bruce that test cases need to be deterministic. The reason
 for this is that in order to debug one must be able to reproduce the
 problem at hand; if a random value causes an assert to fail,
 you will not be able to track down where the problem lies. Such a system
 is just as bad as running your application and a crash occurring. You
 have successfully produced the random data set needed to create a crash,
 but no way of tracking it down.
 
 The only way a random test case could be of use is if the random value is
 captured and reported at crash time. This would allow it to be analyzed
 and be added as a static test case to prevent future regressions. I have
 not read the suggested code to see if this is the case, but the adding of
 the test case as an unchanging value is vital to the assurance of bug
 free code.

And this is exactly what my framework does. It prints the arguments it generated for the function (often that is enough to understand what is wrong) *and* it prints the initial RNG state and counter number you need to reproduce exactly that run; as BCS noted (and as I had written in the initial post), you just need to append a .runTests(1,seed,counter) to the test to do it.

The counter number is used to get full coverage by performing all combinations of discrete sets. For example, if you know that 0 and 1 will be corner cases for the first argument, and 2, 4 and 8 for the second, you can easily define a generator that does all possible combinations of them. You can also mix combinatorial arguments and random ones. I did this (which, for example, Quickcheck cannot do easily) because while random coverage is good, if you have few cases that you want to check, the probability that at least one will be missed is greater than what one expects, so having both is (I think) a good idea.

Fawzi
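The seed+counter replay scheme described above can be illustrated in Python (a loose sketch, not RTest's actual implementation; `arg_tuples` and its parameters are invented): the counter enumerates every combination of the discrete sets while a seeded RNG fills in the remaining arguments, so a failing (seed, counter) pair identifies the exact run to replay.

```python
import random

def arg_tuples(seed, discrete_sets, n_random, total_runs):
    """Yield (counter, args) pairs mixing combinatorial and random arguments.

    Hypothetical sketch: the counter is decoded into one index per
    discrete set (full coverage of all combinations), while the seeded
    RNG supplies the purely random arguments.
    """
    n_combos = 1
    for s in discrete_sets:
        n_combos *= len(s)
    rng = random.Random(seed)
    for run in range(total_runs):
        counter = run % n_combos        # walk all combinations in order
        combo, c = [], counter
        for s in discrete_sets:
            c, i = divmod(c, len(s))    # peel off one index per set
            combo.append(s[i])
        randoms = [rng.randint(-2**31, 2**31 - 1) for _ in range(n_random)]
        yield counter, tuple(combo) + tuple(randoms)

# corner cases 0,1 for the first argument and 2,4,8 for the second
# (the example from the post), plus one fully random third argument:
runs = list(arg_tuples(seed=7, discrete_sets=[[0, 1], [2, 4, 8]],
                       n_random=1, total_runs=6))
combos_seen = {args[:2] for _, args in runs}
```

Because the RNG is seeded, re-running with the same (seed, counter) regenerates the identical argument tuple, which is what makes a reported failure replayable.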
Jul 23 2008
parent JAnderson <ask me.com> writes:
Fawzi Mohamed wrote:
 On 2008-07-23 06:43:53 +0200, Jesse Phillips <jessekphillips gmail.com> 
 said:
 
 [...]
 I agree with Bruce that test cases need to be deterministic. The reason
 for this is that in order to debug one must be able to reproduce the
 problem at hand; if a random value causes an assert to fail,
 you will not be able to track down where the problem lies. Such a system
 is just as bad as running your application and a crash occurring. You
 have successfully produced the random data set needed to create a crash,
 but no way of tracking it down.

 The only way a random test case could be of use is if the random value is
 captured and reported at crash time. This would allow it to be analyzed
 and be added as a static test case to prevent future regressions. I have
 not read the suggested code to see if this is the case, but the adding of
 the test case as an unchanging value is vital to the assurance of bug
 free code.

And this is exactly what my framework does. It prints the arguments it generated for the function (often that is enough to understand what is wrong) *and* it prints the initial RNG state and counter number you need to reproduce exactly that run; as BCS noted (and as I had written in the initial post), you just need to append a .runTests(1,seed,counter) to the test to do it.

The counter number is used to get full coverage by performing all combinations of discrete sets. For example, if you know that 0 and 1 will be corner cases for the first argument, and 2, 4 and 8 for the second, you can easily define a generator that does all possible combinations of them. You can also mix combinatorial arguments and random ones. I did this (which, for example, Quickcheck cannot do easily) because while random coverage is good, if you have few cases that you want to check, the probability that at least one will be missed is greater than what one expects, so having both is (I think) a good idea.

Fawzi

At work I use UnitTest++. That allows me to run the program through a debugger when I need to. Often it's simply enough to know that you changed something that broke the unittest; however, perhaps you could print out the entire unit test when it fails. Then people could simply run that piece of code through a debugger.

Another thought I've had that would be cool for these sorts of unit tests is something that monitors/records your code and automatically generates unit tests on the macro scale, i.e. it would write out unit-test files which would essentially be recordings of what your app did. I imagine you could come up with some sort of template to do this. I.e. I have:

    foo(5);
    foo2(20);

Now I want to record foo and foo2 as well, so I change the definition to:

    Record(foo)(5);   //Run foo and record it
    Record(foo2)(10); //Run foo2 and record it

    //Output from application run
    foo(5);
    foo2(10);

Of course this would also be useful for playing back code in smoke tests. It could be made more advanced, like being able to change the script to wait for a certain function to return a certain result, etc...

-Joel
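The Record(foo)(5) idea above could be prototyped as a wrapper that runs the function and logs each call as replayable source text (an entirely hypothetical Python sketch; `record`, `foo` and `foo2` are invented for illustration):

```python
recorded = []   # the transcript of calls, one line of replayable code each

def record(fn):
    """Hypothetical Record() wrapper: call fn and log the call as source."""
    def wrapper(*args):
        recorded.append(f"{fn.__name__}({', '.join(map(repr, args))})")
        return fn(*args)
    return wrapper

def foo(x):
    return x * 2

def foo2(x):
    return x + 1

foo_r, foo2_r = record(foo), record(foo2)   # wrap the calls to record them
foo_r(5)
foo2_r(10)
script = "\n".join(recorded)   # a replayable recording of what the app did
```

Dumping `script` to a file would give the "output from application run" shown above, which could then be replayed as a smoke test.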
Jul 23 2008
prev sibling parent Jesse Phillips <jessekphillips gmail.com> writes:
On Tue, 22 Jul 2008 21:13:31 +0000, BCS wrote:

 Reply to dsimcha,
 
 I disagree.

I'm not sure you do, as I'm not sure what you are disagreeing with. All I was saying is that most (not all) errors are edge cases, so spend more time (but not all of it) plugging away there. If 90% of the errors can be found in 3% of the domain, I'd rather spend 90% of my time in that 3%. Aside from that, I have no issues with your assertions.
 Random testing can be a great way to find subtle bugs in relatively
 complex algorithms that have a simpler but less efficient equivalent. 
 For example, let's say you're trying to write a super-efficient
 implementation of a hash table with lots of little speed hacks that
 could hide subtle bugs in something that's only called a relatively
 small percentage of the time to begin with, like collision resolution. 
 Then, let's say that this bug only shows up under some relatively
 specific combination of inputs.  An easy way to be reasonably sure that
 you don't have these kinds of subtle bugs would be to also implement an
 associative array as a linear search just for testing.  This is trivial
 to implement, so unlike your uber-optimized hash table, if it looks
 right it probably is.  In any event, it's even less likely to be wrong
 in the same way as your hash table.  Then generate a ton of random data
 and put it in both your hash table and your linear search and make sure
 it all reads back properly.  If the bug is subtle enough, or if you
 don't think of it, it may just be near impossible to manually generate
 enough test cases to find it.


His reply got misplaced; it was meant to go to Bruce Adams's post. The rest of my reply is to dsimcha.

I agree with Bruce that test cases need to be deterministic. The reason for this is that in order to debug, one must be able to reproduce the problem at hand; if a random value causes an assert to fail, you will not be able to track down where the problem lies. Such a system is just as bad as running your application and having it crash: you have successfully produced the random data set needed to create the crash, but no way of tracking it down.

The only way a random test case could be of use is if the random value is captured and reported at crash time. This would allow it to be analyzed and added as a static test case to prevent future regressions. I have not read the suggested code to see if this is the case, but adding the test case as an unchanging value is vital to the assurance of bug-free code.
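The capture-and-promote workflow described above can be sketched in Python (hypothetical code, not RTest; the `is_leap_*` functions and the deliberate bug are invented): the random driver reports the seed and the failing input, which is then pinned as a static regression case.

```python
import random

def is_leap_buggy(year):
    # deliberately naive: misses the century rule (1900 is NOT a leap year)
    return year % 4 == 0

def is_leap_ref(year):
    # trivially correct reference implementation
    return (year % 4 == 0 and year % 100 != 0) or year % 400 == 0

def random_test(seed, runs=1000):
    """On failure, capture (seed, input) so the case can be replayed
    and promoted to an unchanging regression test."""
    rng = random.Random(seed)
    for _ in range(runs):
        # bias half the draws toward the century edge cases
        year = rng.choice([rng.randint(1, 3000),
                           rng.randrange(100, 3001, 100)])
        if is_leap_buggy(year) != is_leap_ref(year):
            return seed, year          # captured and reported at crash time
    return None

failure = random_test(seed=2008)
# promote the captured input to a static test case against regressions:
static_cases = [failure[1]] if failure else []
```

Rerunning `random_test` with the reported seed reproduces the failure exactly, and `static_cases` is the "unchanging value" that guards against the bug coming back.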
Jul 22 2008
prev sibling parent "Bruce Adams" <tortoise_74 yeah.who.co.uk> writes:
On Tue, 22 Jul 2008 22:01:37 +0100, dsimcha <dsimcha yahoo.com> wrote:

 I disagree.  Random testing can be a great way to find subtle bugs in  
 relatively
 complex algorithms that have a simpler but less efficient equivalent.   
 For
 example, let's say you're trying to write a super-efficient  
 implementation of a
 hash table with lots of little speed hacks that could hide subtle bugs in
 something that's only called a relatively small percentage of the time  
 to begin
 with, like collision resolution.  Then, let's say that this bug only  
 shows up
 under some relatively specific combination of inputs.  An easy way to be
 reasonably sure that you don't have these kinds of subtle bugs would be  
 to also
 implement an associative array as a linear search just for testing.   
 This is
 trivial to implement, so unlike your uber-optimized hash table, if it  
 looks right
 it probably is.  In any event, it's even less likely to be wrong in the  
 same way
 as your hash table.  Then generate a ton of random data and put it in  
 both your
 hash table and your linear search and make sure it all reads back  
 properly.  If
 the bug is subtle enough, or if you don't think of it, it may just be  
 near
 impossible to manually generate enough test cases to find it.

I agree with the strategy of using a slow version to test a fast version of an algorithm; I often use it myself. I would still be less keen on throwing random numbers at it. Rather, I would try to write interfaces that expose the bit where you're being clever, in this case maybe the collision resolution dohickey. I try to test things at the lowest level possible first: really test the units, and then the other unit tests become more like integration tests. They are mainly there to check that the logic of calling the simpler cases is correct.
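The "expose the clever bit" advice above, in a minimal Python sketch (hypothetical names, not from any real hash table): pull the collision-resolution probe order out behind its own function so it can be unit-tested directly, without building a whole table.

```python
def probe_sequence(h, cap, n):
    """The 'clever bit' behind its own interface: the linear-probing
    visit order for a hash value h in a table of capacity cap
    (hypothetical; a real table would call this on every collision)."""
    return [(h + i) % cap for i in range(n)]

# unit-test the collision logic in isolation: starting from slot 6 of an
# 8-slot table, the probe must wrap around and visit every slot once
seq = probe_sequence(h=6, cap=8, n=8)
```

Testing this tiny function directly checks the wrap-around logic at the lowest level; the table-level tests then only need to confirm that it is being called correctly.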
Jul 24 2008