digitalmars.D - Writing a unit test for a singleton implementation

Idan Arye (34/34) May 15 2013 I'm making an idioms

Diggory (8/42) May 15 2013 There's no real way to reliably test race conditions. One thing

Idan Arye (50/104) May 15 2013 OK, I used `core.sync.barrier` to make all threads access the

Sebastian Graf (12/46) May 15 2013 Also note that unit tests are supposed to run fast (definitely <

Dmitry Olshansky (14/38) May 15 2013 Sleep in concurrency code that aims to synchronize something is always

Idan Arye (24/33) May 15 2013 Don't say "always". Only a Sith deals in absolutes. Sure,

Jonathan M Davis (3/8) May 16 2013 Of course, that in and of itself is an absolute... :)

1100110 (2/12) May 16 2013 He's a witch!

Dmitry Olshansky (46/72) May 16 2013 Well I embraced the power of the dark side long ago ;)

Idan Arye (53/89) May 16 2013 That's why I added the "(almost)". A race condition happens when

Dmitry Olshansky (37/110) May 17 2013 A terminological moment - a race condition is not that fact "bad

Idan Arye (12/14) May 17 2013 Here: https://gist.github.com/someboddy/5601276
Diggory (10/10) May 17 2013 I think we all understand it's not going to catch every race

Idan Arye (36/46) May 17 2013 Exactly!

Idan Arye (5/8) May 15 2013 One of the primary goals of the Phobos unit tests is to easily

deadalnix (5/6) May 18 2013 1/ You'll get nowhere with sleep.

"Idan Arye" <GenericNPC gmail.com> writes:

I'm making an idioms 
library(http://forum.dlang.org/thread/fofbrlqruxbevnxchxdp forum.dlang.org) 
to add to Phobos, and one of the idioms is the singleton design 
pattern. Since I'm gonna send it to Phobos, it needs to have a 
unit test. Here is my unit test:

     static class Foo
     {
         mixin LowLockSingleton;

         this()
         {
             Thread.sleep(dur!"msecs"(500));
         }
     }

     Foo[10] foos;

     foreach(i; parallel(iota(foos.length)))
     {
         foos[i] = Foo.instance;
     }

     foreach(i; 1 .. foos.length)
     {
         assert(foos[0] == foos[i]);
     }


This unit test works - it doesn't fail, but if I remove the 
`synchronized` from my singleton implementation it does fail.

Now, this is my concern: I'm doing here a 500 millisecond sleep 
in the constructor, and this sleep is required to guarantee a 
race condition. But I'm not sure about two things:

  - Is it enough? If a slow or busy computer runs this test, the 
500ms sleep of the first iteration might be over before the 
second iteration even starts!

  - Is it too much? Phobos has many unit tests, and they need to 
be run many times by many machines - is it really OK to add a 
500ms delay for a single item's implementation?


Your opinion?

May 15 2013

"Diggory" <diggsey googlemail.com> writes:

On Wednesday, 15 May 2013 at 18:28:53 UTC, Idan Arye wrote:
 I'm making an idioms 
 library(http://forum.dlang.org/thread/fofbrlqruxbevnxchxdp forum.dlang.org) 
 to add to Phobos, and one of the idioms is the singleton design 
 pattern. Since I'm gonna send it to Phobos, it needs to have a 
 unit test. Here is my unit test:

     static class Foo
     {
         mixin LowLockSingleton;

         this()
         {
             Thread.sleep(dur!"msecs"(500));
         }
     }

     Foo[10] foos;

     foreach(i; parallel(iota(foos.length)))
     {
         foos[i] = Foo.instance;
     }

     foreach(i; 1 .. foos.length)
     {
         assert(foos[0] == foos[i]);
     }


 This unit test works - it doesn't fail, but if I remove the 
 `synchronized` from my singleton implementation it does fail.

 Now, this is my concern: I'm doing here a 500 millisecond sleep 
 in the constructor, and this sleep is required to guarantee a 
 race condition. But I'm not sure about two things:

  - Is it enough? If a slow or busy computer runs this test, the 
 500ms sleep of the first iteration might be over before the 
 second iteration even starts!

  - Is it too much? Phobos has many unit tests, and they need to 
 be run many times by many machines - is it really OK to add a 
 500ms delay for a single item's implementation?


 Your opinion?

There's no real way to reliably test race conditions. One thing 
you could do is get a bunch of threads ready and waiting to 
access "Foo.instance" and then notify them all at once, that way 
you can do away with "sleep" which is not great to have in a unit 
test anyway. Repeat this a few times and it should be fairly 
reliable, plus it will usually be much faster because you don't 
have to sleep.

May 15 2013

"Idan Arye" <GenericNPC gmail.com> writes:

On Wednesday, 15 May 2013 at 19:17:23 UTC, Diggory wrote:
 On Wednesday, 15 May 2013 at 18:28:53 UTC, Idan Arye wrote:
 I'm making an idioms 
 library(http://forum.dlang.org/thread/fofbrlqruxbevnxchxdp forum.dlang.org) 
 to add to Phobos, and one of the idioms is the singleton 
 design pattern. Since I'm gonna send it to Phobos, it needs to 
 have a unit test. Here is my unit test:

    static class Foo
    {
        mixin LowLockSingleton;

        this()
        {
            Thread.sleep(dur!"msecs"(500));
        }
    }

    Foo[10] foos;

    foreach(i; parallel(iota(foos.length)))
    {
        foos[i] = Foo.instance;
    }

    foreach(i; 1 .. foos.length)
    {
        assert(foos[0] == foos[i]);
    }


 This unit test works - it doesn't fail, but if I remove the 
 `synchronized` from my singleton implementation it does fail.

 Now, this is my concern: I'm doing here a 500 millisecond 
 sleep in the constructor, and this sleep is required to 
 guarantee a race condition. But I'm not sure about two things:

 - Is it enough? If a slow or busy computer runs this test, the 
 500ms sleep of the first iteration might be over before the 
 second iteration even starts!

 - Is it too much? Phobos has many unit tests, and they need to 
 be run many times by many machines - is it really OK to add a 
 500ms delay for a single item's implementation?


 Your opinion?

 There's no real way to reliably test race conditions. One thing 
 you could do is get a bunch of threads ready and waiting to 
 access "Foo.instance" and then notify them all at once, that 
 way you can do away with "sleep" which is not great to have in 
 a unit test anyway. Repeat this a few times and it should be 
 fairly reliable, plus it will usually be much faster because 
 you don't have to sleep.


OK, I used `core.sync.barrier` to make all threads access the 
singleton together:

     static class Foo
     {
         mixin LowLockSingleton;

         private this()
         {
             Thread.sleep(dur!"msecs"(0));
         }
     }

     Foo[10] foos;
     Thread[foos.length] threads;
     Barrier barrier = new Barrier(foos.length);

     class FooInitializer : Thread
     {
         ulong index;

         this(ulong index)
         {
             super(&run);
             this.index = index;
         }

         void run()
         {
             barrier.wait();
             foos[index] = Foo.instance;
         }
     }

     foreach(i; 0 .. foos.length)
     {
         threads[i] = new FooInitializer(i);
         threads[i].start();
     }

     foreach(thread; threads)
     {
         thread.join();
     }

     foreach(i; 1 .. foos.length)
     {
         assert(foos[0] == foos[i]);
     }

This gives me 100% accuracy. Your idea of holding all threads 
together did the trick - if I comment out the call to 
`barrier.wait()` I get a 50% accuracy, which ofcourse is not as 
nearly as good as 100%.

The sleeping, however, is still required. If I remove it I also 
get 50% accuracy. I tried to replace it with `Thread.yield()` and 
that gave me 92% accuracy - which is far better than 50% but not 
as good as 100%. A sleep of 0 seconds is not that bad a price to 
pay for that 100% accuracy.

May 15 2013

"Sebastian Graf" <SebastianGraf t-online.de> writes:

On Wednesday, 15 May 2013 at 18:28:53 UTC, Idan Arye wrote:
 I'm making an idioms 
 library(http://forum.dlang.org/thread/fofbrlqruxbevnxchxdp forum.dlang.org) 
 to add to Phobos, and one of the idioms is the singleton design 
 pattern. Since I'm gonna send it to Phobos, it needs to have a 
 unit test. Here is my unit test:

     static class Foo
     {
         mixin LowLockSingleton;

         this()
         {
             Thread.sleep(dur!"msecs"(500));
         }
     }

     Foo[10] foos;

     foreach(i; parallel(iota(foos.length)))
     {
         foos[i] = Foo.instance;
     }

     foreach(i; 1 .. foos.length)
     {
         assert(foos[0] == foos[i]);
     }


 This unit test works - it doesn't fail, but if I remove the 
 `synchronized` from my singleton implementation it does fail.

 Now, this is my concern: I'm doing here a 500 millisecond sleep 
 in the constructor, and this sleep is required to guarantee a 
 race condition. But I'm not sure about two things:

  - Is it enough? If a slow or busy computer runs this test, the 
 500ms sleep of the first iteration might be over before the 
 second iteration even starts!

  - Is it too much? Phobos has many unit tests, and they need to 
 be run many times by many machines - is it really OK to add a 
 500ms delay for a single item's implementation?


 Your opinion?

Also note that unit tests are supposed to run fast (definitely < 
50 ms), because running 500-5000 of them should be relatively 
cheap. Apart from that, races aren't something to test for in a 
unit test. It may be more a kind of integration test, though the 
sporadic nature of races makes them candidates for "One of those 
occasionally failing bad guys hamstringing development... Let's 
just ignore him."-type thinking.
I would create a seperate Test suite for race tests and execute 
them multiple times if need be... But there are a million better 
sources on the internet than me on that topic.

tldr; don't test for races in unit tests.

May 15 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

16-May-2013 01:09, Sebastian Graf пишет:
 On Wednesday, 15 May 2013 at 18:28:53 UTC, Idan Arye wrote:
 I'm making an idioms
 library(http://forum.dlang.org/thread/fofbrlqruxbevnxchxdp forum.dlang.org)
 to add to Phobos, and one of the idioms is the singleton design
 pattern. Since I'm gonna send it to Phobos, it needs to have a unit
 test. Here is my unit test:


[snip]

 This unit test works - it doesn't fail, but if I remove the
 `synchronized` from my singleton implementation it does fail.

 Now, this is my concern: I'm doing here a 500 millisecond sleep in the
 constructor, and this sleep is required to guarantee a race condition.
 But I'm not sure about two things:

  - Is it enough? If a slow or busy computer runs this test, the 500ms
 sleep of the first iteration might be over before the second iteration
 even starts!


Sleep in concurrency code that aims to synchronize something is always 
bad idea (=BUG).

Sleep is a tool used to give up thread execution resources to avoid 
spinning on something wasting cycles. Use semaphore or analogous 
primitives to coordinate, see other posts.

  - Is it too much? Phobos has many unit tests, and they need to be run
 many times by many machines - is it really OK to add a 500ms delay for
 a single item's implementation?


No.

 Your opinion?

 tldr; don't test for races in unit tests.

One word - it's a stress test (smoke test). It should be a kind of loop 
that executes for as long as you can allow it. Days later if it didn't 
fail usually you either decide that it should be "good enough" or keep 
running them.

-- 
Dmitry Olshansky

May 15 2013

"Idan Arye" <GenericNPC gmail.com> writes:

On Wednesday, 15 May 2013 at 21:52:40 UTC, Dmitry Olshansky wrote:
 Sleep in concurrency code that aims to synchronize something is 
 always bad idea (=BUG).

Don't say "always". Only a Sith deals in absolutes. Sure, 
`sleep()` in concurrent code is a bad idea because it can cause 
race conditions. But I *want* to (almost)cause a race condition 
here - that's the whole point of this unit test, to guarantee 
that the singleton implementation is immune to race conditions.

Crashing your car into a brick wall is usually a bad idea, but 
not always. When they want to test the car's safety systems, they 
crash it into a wall on purpose. The fact that driving safety 
rules save lives does not mean the crash test team has to stick 
to those rules - that would defeat the purpose of the crash test.

 Sleep is a tool used to give up thread execution resources to 
 avoid spinning on something wasting cycles. Use semaphore or 
 analogous primitives to coordinate, see other posts.

In my second version of the unit test(after reading Diggory's 
comment) I used a `Barrier` to coordinate, and it gave me 50% 
accuracy - That is, when I took the synchronization out of the 
implementation the unit test failed 50% of the times. I still 
need the `sleep()` to make sure the threads are getting mixed.

 One word - it's a stress test (smoke test). It should be a kind 
 of loop that executes for as long as you can allow it. Days 
 later if it didn't fail usually you either decide that it 
 should be "good enough" or keep running them.

Why in the world would I want to make a unit test for days, 
waiting for a race condition that might not happen, when I can 
easily force a race condition?

If we return to the car's crash test example - when a crash test 
team wants to test a car, do they give alcohol to a team member 
and make him drive around while following him with measure 
devices? Ofcourse not - they simply drive the car directly to the 
brick wall.

May 15 2013

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Thursday, May 16, 2013 00:21:58 Idan Arye wrote:
 On Wednesday, 15 May 2013 at 21:52:40 UTC, Dmitry Olshansky wrote:
 Sleep in concurrency code that aims to synchronize something is
 always bad idea (=BUG).

 
 Don't say "always". Only a Sith deals in absolutes.

Of course, that in and of itself is an absolute... :)

- Jonathan M Davis

May 16 2013

1100110 <0b1100110 gmail.com> writes:

On 05/16/2013 12:33 PM, Jonathan M Davis wrote:
 On Thursday, May 16, 2013 00:21:58 Idan Arye wrote:
 On Wednesday, 15 May 2013 at 21:52:40 UTC, Dmitry Olshansky wrote:
 Sleep in concurrency code that aims to synchronize something is
 always bad idea (=3DBUG).

 Don't say "always". Only a Sith deals in absolutes.

=20
 Of course, that in and of itself is an absolute... :)
=20
 - Jonathan M Davis

He's a witch!

May 16 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

16-May-2013 02:21, Idan Arye пишет:
 On Wednesday, 15 May 2013 at 21:52:40 UTC, Dmitry Olshansky wrote:
 Sleep in concurrency code that aims to synchronize something is always
 bad idea (=BUG).

 Don't say "always". Only a Sith deals in absolutes. Sure, `sleep()` in
 concurrent code is a bad idea because it can cause race conditions.

Well I embraced the power of the dark side long ago ;)

And I can confidently say that sleep doesn't cause anything reliably at 
all. Try measuring time with it (e.g. in ms range) and you'll see it :)

 But
 I *want* to (almost)cause a race condition here

You can't "cause" a race condition - it's there by definition of code.
What you can cause is a data corruption that happened with one bad 
interleave of reads/writes because it was possible.

- that's the whole point
 of this unit test, to guarantee that the singleton implementation is
 immune to race conditions.

I can't see how suspending all threads by 500ms + arbitrary amount of 
time of on the order of one context switch (or more depends on how 
system is busy) will do that. Sleep basically says wake me up no sooner 
then 500ms (i.e. put back into active threads queue), it doesn't 
guarantee any kind of timing or priority.

Even with proper timings launching a bunch of threads at almost the same 
time still doesn't guarantee it. You are far better off with solid 
formal prof in this case (that is there).

What you might be seeing is that if all of threads on idle system 
preform context switch and then switch back after sleeping X ms they 
might collide.
 Crashing your car into a brick wall is usually a bad idea, but not
 always.

I'm not seeing this test do anything useful anyway - you know there is a 
race condition and then you try to test to see if it works as just as if 
there is none.

It's more like unscrewing a bunch of bolts and seeing if the car manages 
it to the other town. It might or not depending on how the driver rides 
it and on a ton of other chances - truth be told it's blind luck almost. 
Even if you unscrew the same bolts exactly the same way the 2nd car will 
not ride exactly as long a distance as 1st one.


 Sleep is a tool used to give up thread execution resources to avoid
 spinning on something wasting cycles. Use semaphore or analogous
 primitives to coordinate, see other posts.

 In my second version of the unit test(after reading Diggory's comment) I
 used a `Barrier` to coordinate, and it gave me 50% accuracy - That is,
 when I took the synchronization out of the implementation the unit test
 failed 50% of the times. I still need the `sleep()` to make sure the
 threads are getting mixed.

And if your PC was compressing video or serving some HTTP you'll have 
even less no matter how you try to run them I guess...

But if you like it I think Thread.yield will work just as well, it will 
cause threads to do the context switch.

 One word - it's a stress test (smoke test). It should be a kind of
 loop that executes for as long as you can allow it. Days later if it
 didn't fail usually you either decide that it should be "good enough"
 or keep running them.

 Why in the world would I want to make a unit test for days, waiting for
 a race condition that might not happen, when I can easily force a race
 condition?

Sleep doesn't guarantee it. It causes context switch though and that 
might be what you want. Maybe creating a bunch of threads as suspended 
and then waking them all up could get close to that.

Another problem is about expecting something definite out of race 
condition. Yes, here you are getting away with single atomic read/write 
of pointer simply because of x86 architecture.

Technically what you'll can see with a race condition is undefined (not 
speaking of this simple example on x86 that IMO is pointless anyway).
Thus trying to catch it in more complex situation will require more then 
just putting a sleep(xyz) before a function call.

Imagine trying to test a lock-free collection ? You still need to lauch 
many threads and somehow try to schedule them funky. Even then I don't 
see how 1 single-shot can be reliable there.

-- 
Dmitry Olshansky

May 16 2013

"Idan Arye" <GenericNPC gmail.com> writes:

On Thursday, 16 May 2013 at 21:10:52 UTC, Dmitry Olshansky wrote:
 16-May-2013 02:21, Idan Arye пишет:
 But
 I *want* to (almost)cause a race condition here

 You can't "cause" a race condition - it's there by definition 
 of code.
 What you can cause is a data corruption that happened with one 
 bad interleave of reads/writes because it was possible.

That's why I added the "(almost)". A race condition happens when 
one thread reads a variable and writes it back based on the old 
value, and between that read and write another thread writes to 
that variable. By adding a `sleep()` between that read and write, 
I can conjure my own race condition. Ofcourse, the race condition 
will never happen, because the singleton's implementation uses 
synchronization to prevent it - but that's the whole point of my 
unit test, to check that the implementation is doing the 
synchronization correctly.

 I'm not seeing this test do anything useful anyway - you know 
 there is a race condition and then you try to test to see if it 
 works as just as if there is none.

 It's more like unscrewing a bunch of bolts and seeing if the 
 car manages it to the other town. It might or not depending on 
 how the driver rides it and on a ton of other chances - truth 
 be told it's blind luck almost. Even if you unscrew the same 
 bolts exactly the same way the 2nd car will not ride exactly as 
 long a distance as 1st one.

But if I have a system in my car that allows it to keep traveling 
even after unscrewing some bolts, and I want to test it, the way 
to do it is to unscrew those bolts and try to drive it to the 
other town. If I can't get to the other town, that means that 
system failed.

That's what I'm doing here - I have a system that prevents a race 
condition in the singleton's instantiation, and in order to test 
that system I try to force a race condition in the singleton's 
instantiation.

 And if your PC was compressing video or serving some HTTP 
 you'll have even less no matter how you try to run them I 
 guess...

OK, I just tested it by playing for video files at once, and the 
accuracy dropped from 100% to around 96%.

Still, the point of unit tests is to make sure that code changes 
do not break old code. Most of them are super trivial, because 
many of the bugs they prevent are also super trivial - the kind 
of bugs that make you want to hit yourself for being so stupid 
you didn't notice them before. If the library or system have a 
good set of unit tests, a programmer that runs them can catch 
those bugs happening in trivial examples right after they are 
introduced.

So, if a Phobos\druntime\dmd developer somehow manages to change 
something that breaks my singleton implementation, the next time 
they run the unit tests there is over 90% chance that unit test 
will catch the bug even if they put a heavy burden on their 
machine - and if they don't put such a heavy burden while 
developing, those chances climb very near to 100%.

So yea, my unit test is not 100% accurate in worst case scenarios 
- but it still does it's job.

 But if you like it I think Thread.yield will work just as well, 
 it will cause threads to do the context switch.

I tested both. `Thread.yield()` gives around 92% accuracy, while 
`Thread.sleep(dur!"msecs"(0))` gives 100% accuracy(when I don't 
play 4 video files at once). I have no idea why this happens, but 
it happens.

 Sleep doesn't guarantee it. It causes context switch though and 
 that might be what you want. Maybe creating a bunch of threads 
 as suspended and then waking them all up could get close to 
 that.

That's what I did in the second version - I suspended all the 
threads using `core.sync.barrier`, and the barrier took care to 
resume them all at once. This allowed me to use a 0ms sleep - but 
the call to `sleep()` is still required.

 Another problem is about expecting something definite out of 
 race condition. Yes, here you are getting away with single 
 atomic read/write of pointer simply because of x86 architecture.

How is it an atomic read/write if I call `sleep()` *between* the 
read and the write?

 Technically what you'll can see with a race condition is 
 undefined (not speaking of this simple example on x86 that IMO 
 is pointless anyway).
 Thus trying to catch it in more complex situation will require 
 more then just putting a sleep(xyz) before a function call.

 Imagine trying to test a lock-free collection ? You still need 
 to lauch many threads and somehow try to schedule them funky. 
 Even then I don't see how 1 single-shot can be reliable there.

True - my testing method is not possible with lock-free 
collections, since I can't put a `sleep()` call inside an atomic 
operation. But I *can* put a sleep call inside a `synchronized` 
block(or more precisely, inside a constructor that is being 
invoked inside a synchronized block), so my testing method does 
work for my case.

May 16 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

17-May-2013 04:57, Idan Arye пишет:
 On Thursday, 16 May 2013 at 21:10:52 UTC, Dmitry Olshansky wrote:
 16-May-2013 02:21, Idan Arye пишет:
 But
 I *want* to (almost)cause a race condition here

 You can't "cause" a race condition - it's there by definition of code.
 What you can cause is a data corruption that happened with one bad
 interleave of reads/writes because it was possible.

 That's why I added the "(almost)". A race condition happens when one
 thread reads a variable and writes it back based on the old value, and
 between that read and write another thread writes to that variable. By
 adding a `sleep()

` between that read and write, I can conjure my own
 race condition.

A terminological moment - a race condition is not that fact "bad 
interleaving of reads-writes did happen".

Race condition is totally a static property of a code (or a program), 
whereby at least 2 threads may simultaneously access memory location and 
there is no ordering of operations insured (or happens before relations 
established). It's then said that the program has race condition, 
sometimes calling it potential if there is no proof that 2 threads can 
access it simultaneously.

 I'm not seeing this test do anything useful anyway - you know there is
 a race condition and then you try to test to see if it works as just
 as if there is none.

 It's more like unscrewing a bunch of bolts and seeing if the car
 manages it to the other town. It might or not depending on how the
 driver rides it and on a ton of other chances - truth be told it's
 blind luck almost. Even if you unscrew the same bolts exactly the same
 way the 2nd car will not ride exactly as long a distance as 1st one.

 But if I have a system in my car that allows it to keep traveling even
 after unscrewing some bolts,

The trick here is that race condition is not guaranteed to fail in a 
specific way.

 OK, I just tested it by playing for video files at once, and the
 accuracy dropped from 100% to around 96%.

 Still, the point of unit tests is to make sure that code changes do not
 break old code. Most of them are super trivial, because many of the bugs
 they prevent are also super trivial - the kind of bugs that make you
 want to hit yourself for being so stupid you didn't notice them before.
 If the library or system have a good set of unit tests, a programmer
 that runs them can catch those bugs happening in trivial examples right
 after they are introduced.

 So, if a Phobos\druntime\dmd developer somehow manages to change
 something that breaks my singleton implementation, the next time they
 run the unit tests there is over 90% chance that unit test will catch
 the bug even if they put a heavy burden on their machine - and if they
 don't put such a heavy burden while developing, those chances climb very
 near to 100%.

 So yea, my unit test is not 100% accurate in worst case scenarios - but
 it still does it's job.

Chance == smoke test, so I think we finally in agreement :)
Then by definition running it multiple times you'll have (1 - p)^^n 
chance for bugs escaping.
In general since you can't know probability you don't know how long 
should you run it. Hence my reference to the running it long enough.

 But if you like it I think Thread.yield will work just as well, it
 will cause threads to do the context switch.

 I tested both. `Thread.yield()` gives around 92% accuracy, while
 `Thread.sleep(dur!"msecs"(0))` gives 100% accuracy(when I don't play 4
 video files at once). I have no idea why this happens, but it happens.

 Sleep doesn't guarantee it. It causes context switch though and that
 might be what you want. Maybe creating a bunch of threads as suspended
 and then waking them all up could get close to that.

 That's what I did in the second version - I suspended all the threads
 using `core.sync.barrier`, and the barrier took care to resume them all
 at once. This allowed me to use a 0ms sleep - but the call to `sleep()`
 is still required.

To ensure context switch. My guess is that yield didn't work because 
then it could be the case that there is no extra work in the system. In 
which case the OS can stay on the same thread (and avoid switching) 
after yield.

 Another problem is about expecting something definite out of race
 condition. Yes, here you are getting away with single atomic
 read/write of pointer simply because of x86 architecture.

 How is it an atomic read/write

Each operations is atomic.

 if I call `sleep()` *between* the read
 and the write?

I'm not seeing it. I think I've misread the code anyway.

If you test TLS low-lock singleton

vs

if(instance_ != null) //this is critical point, how do you test it?
	synchronize(mut){
		if(instance_ != null)
			instance_ = ...;
	}
return instance_;

Or the one that isn't locked at all?

Can you show the code actually - bogus singleton and correct one?

 Technically what you'll can see with a race condition is undefined
 (not speaking of this simple example on x86 that IMO is pointless
 anyway).
 Thus trying to catch it in more complex situation will require more
 then just putting a sleep(xyz) before a function call.

 Imagine trying to test a lock-free collection ? You still need to
 launch many threads and somehow try to schedule them funky. Even then I
 don't see how 1 single-shot can be reliable there.

 True - my testing method is not possible with lock-free collections,
 since I can't put a `sleep()` call inside an atomic operation. But I
 *can* put a sleep call inside a `synchronized` block(or more precisely,
 inside a constructor that is being invoked inside a synchronized block),
 so my testing method does work for my case.

Which is not what I thought you are trying to test. In this trivial case 
I'd avoid such tests completely YMMV.

-- 
Dmitry Olshansky

May 17 2013

"Idan Arye" <GenericNPC gmail.com> writes:

On Friday, 17 May 2013 at 10:16:21 UTC, Dmitry Olshansky wrote:
 Can you show the code actually - bogus singleton and correct 
 one?

Here: https://gist.github.com/someboddy/5601276
I thinned down the implementation to a minimum working example.

`enum BREAK_IMPLEMENTATION` at the top of the file decides 
between using the broken implementation and the correct one. I 
put the unit test under `main()` so you can compile it without 
the -unittest flag.

The reason that `Foo` is declared at global scope is a bug in dmd 
that was fixed for the next release(can't find it at Bugzilla, 
but it's fixed in dmd master). If I declare it at the unittest's 
or in main's scope, the constructor won't be invoked and 
`sleep()` won't be called.

May 17 2013

"Diggory" <diggsey googlemail.com> writes:

I think we all understand it's not going to catch every race 
condition, or even a given race condition every time, ie. it 
can't prove that the code is correct.

What it can do however is prove that code is incorrect. If the 
unit test fails there is definitely something wrong with the 
code, therefore it's a useful test to have.

The most obvious case is if there is a change to the singleton 
code which accidentally creates a race condition. Any chance of 
detecting that in a unit test is better than no chance of 
detecting it.

May 17 2013

"Idan Arye" <GenericNPC gmail.com> writes:

On Friday, 17 May 2013 at 21:06:30 UTC, Diggory wrote:
 I think we all understand it's not going to catch every race 
 condition, or even a given race condition every time, ie. it 
 can't prove that the code is correct.

 What it can do however is prove that code is incorrect. If the 
 unit test fails there is definitely something wrong with the 
 code, therefore it's a useful test to have.

 The most obvious case is if there is a change to the singleton 
 code which accidentally creates a race condition. Any chance of 
 detecting that in a unit test is better than no chance of 
 detecting it.

Exactly!

Anyways, I've updated the gist with an even better solution, that 
gave me 100% accuracy even when playing 4 video files.

I added a static shared counter that counts how many threads have 
passed the barrier - every thread has to increment it immediately 
after it passes the barrier. The constructor, instead of a single 
call to `sleep()`, calls `sleep()` until the counter reaches the 
number of threads. So, this is what happens:

1) 10 threads are started, each stops at the barrier waiting for 
the others.
2) Once all the threads have reached the barrier, threads are 
starting to get released(from the barrier, not from memory).
3) Each thread released from the barrier increments the counter.
4) At least one thread passes the singleton checks, enters the 
constructor and goes into a loop where it calls `sleep()` all of 
the threads got a chance to run again.
6) Once all the threads got that chance, all the threads stuck in 
the constructor get released from the loop.

Now, theoretically this unit test is not 100% accurate. If all 
the threads except one get switched out between step 3 and step 4 
- that is, after they increment the counter but before they pass 
the singleton checks - then by the time those threads will run 
again, the one other thread gets to initiate the singleton and 
write the __gshared instance field - which means the other 
threads won't pass the singleton checks and won't create more 
instances, and the unit test won't fail even if the 
implementation is broken.

Now, while theoretically possible, this scenario is far fetched 
in reality. The window where a thread needs to be switched is 
very short, and placed immediately after it gets switched in. The 
amount of work the thread does in that window after it got 
switched in is smaller(or at least not much larger) than the work 
needed for the context switch, so no sane operation system should 
switch it out in that window - let alone switch out nine such 
threads.

May 17 2013

"Idan Arye" <GenericNPC gmail.com> writes:

On Wednesday, 15 May 2013 at 21:09:55 UTC, Sebastian Graf wrote:
 I would create a seperate Test suite for race tests and execute 
 them multiple times if need be... But there are a million 
 better sources on the internet than me on that topic.

One of the primary goals of the Phobos unit tests is to easily 
catch a change that causes problems with the usage of some Phobos 
component. If I create a second test, it won't be run by the 
autotester nor by people making changes to dmd\druntime\Phobos

May 15 2013

"deadalnix" <deadalnix gmail.com> writes:

On Wednesday, 15 May 2013 at 18:28:53 UTC, Idan Arye wrote:
 Your opinion?

1/ You'll get nowhere with sleep.
2/ Both singleton and race condition are notoriously difficult to 
test. For the first one, it is considered good practice to avoid 
it altogether, for the second one, formal proof is prefered.

May 18 2013

D Programming

C/C++ Programming

Other

digitalmars.D - Writing a unit test for a singleton implementation