digitalmars.D - Set-up timeouts on thread-related unittests

"Iain Buclaw" <ibuclaw gdcproject.org> writes:
Hi,

I've been seeing a problem on the Debian X32 build system where the 
unittest process just hangs and requires manual intervention: the 
poor maintainer has to kill the process by hand before the build 
fails due to inactivity.

I haven't yet managed to reduce the problem (it only happens on a 
native X32 system, not when running X32 under native x86_64), 
but I thought it would be a good idea to suggest that any 
thread-related tests be handled safely by self-terminating after 
a period of waiting.

Thoughts from the phobos maintainers?

Regards
Iain
Jun 20 2014
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 20 Jun 2014 03:13:23 -0400, Iain Buclaw <ibuclaw gdcproject.org>  
wrote:

 Hi,

 I've been seeing a problem on the Debian X32 build system where unittest  
 process just hangs, and require manual intervention by the poor  
 maintainer to kill the process manually before the build fails due to  
 inactivity.

 Haven't yet managed to reduce the problem (it only happens on a native  
 X32 system, but not when running X32 under native x86_64), but thought  
 it would be a good idea to suggest that any thread related tests should  
 be safely handled by self terminating after a period of waiting.

 Thoughts from the phobos maintainers?

This could probably be implemented quite simply in druntime. I'd be hesitant to make it default, but it would be nice to tag unit tests as having a maximum timeout. Yet another case for using attributes on unit tests and RTInfo for modules...

-Steve
Jun 20 2014
Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 20 Jun 2014 16:00, "Steven Schveighoffer via Digitalmars-d"
<digitalmars-d puremagic.com> wrote:
 This could probably be implemented quite simply in druntime. I'd be
 hesitant to make it default, but it would be nice to tag unit tests as
 having a maximum timeout. Yet another case for using attributes on unit
 tests and RTInfo for modules...

I don't see a problem using it as default for these.

1) I assume there is already a timeout for the TCP tests.

2) If the test runs a short-lived function (i.e. one that increments some
global value) in 100 parallel threads, the maintainer of the module who
wrote that test should be able to safely assume that it won't take more
than 60 seconds to execute.
Jun 20 2014
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 20 Jun 2014 13:13:30 -0400, Iain Buclaw via Digitalmars-d  
<digitalmars-d puremagic.com> wrote:

 On 20 Jun 2014 16:00, "Steven Schveighoffer via Digitalmars-d"
 <digitalmars-d puremagic.com> wrote:
 This could probably be implemented quite simply in druntime. I'd be
 hesitant to make it default, but it would be nice to tag unit tests as
 having a maximum timeout. Yet another case for using attributes on unit
 tests and RTInfo for modules...

 I don't see a problem using it as default for these.

No, I mean that druntime would run all unit tests with the expectation that each unit test should time out after N seconds. I think it's much cleaner and less error-prone to implement the timeout in the unittest runner than in the unit test itself.

But of course there might be exceptions; we can't put those restrictions on all code. A nice feature would be a default timeout of, say, 1 second, with users able to specify an alternate or infinite timeout via a UDA.

-Steve
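
As a rough illustration of the scheme described in this post (a default per-test timeout that an individual test can override or disable), here is a minimal sketch. It is written in Python rather than D purely to show the mechanism. It is not druntime code, and the names `timeout`, `DEFAULT_TIMEOUT` and `run_tests` are hypothetical. A decorator stands in for the UDA, and a timed-out test is only reported as such, not forcibly stopped.

```python
import threading

DEFAULT_TIMEOUT = 1.0  # seconds; the default value floated in the post


def timeout(seconds):
    """Tag a test with its own limit; None means no limit (stand-in for a UDA)."""
    def tag(test):
        test.timeout = seconds
        return test
    return tag


def run_tests(tests):
    """Run each test in its own thread; map test name to 'pass', 'fail' or 'timeout'."""
    results = {}
    for test in tests:
        outcome = {}

        def body(test=test, outcome=outcome):
            try:
                test()
                outcome["status"] = "pass"
            except Exception:
                outcome["status"] = "fail"

        # Daemon thread: a hung test cannot keep the process alive at exit.
        worker = threading.Thread(target=body, daemon=True)
        worker.start()
        # join(None) waits forever, which gives the "infinite timeout" case.
        worker.join(getattr(test, "timeout", DEFAULT_TIMEOUT))
        if worker.is_alive():
            results[test.__name__] = "timeout"
        else:
            results[test.__name__] = outcome.get("status", "fail")
    return results
```

An untagged test gets the 1 second default, `@timeout(60)` raises the limit for a slow test, and `@timeout(None)` opts out entirely.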
Jun 20 2014
"Sean Kelly" <sean invisibleduck.org> writes:
On Friday, 20 June 2014 at 18:24:21 UTC, Steven Schveighoffer
wrote:
 No, I mean that druntime would run all unit tests with an 
 expectation that each unit test should time out after N seconds.
I'd be more inclined to have the test runner kill the process if it takes more than N seconds to complete and call that a test failure. Figuring out something reasonable to do in the unit tester within Druntime would be difficult.
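
For illustration only, the "outer runner kills the process" approach can be sketched as below. This is a Python sketch under stated assumptions, not anything from the actual auto-tester: `run_with_timeout` is a hypothetical helper, and the child command in the demo is a placeholder standing in for a real unittest binary.

```python
import subprocess
import sys


def run_with_timeout(command, limit_seconds):
    """Run a test process; report failure if it exits non-zero or overruns the limit."""
    try:
        finished = subprocess.run(command, timeout=limit_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before re-raising TimeoutExpired,
        # so a hung test process does not linger.
        return False
    return finished.returncode == 0


if __name__ == "__main__":
    # Placeholder child that sleeps past the limit, standing in for a hung test run.
    hung = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(hung, 1))
```

The test program itself needs no changes with this design; the cost is that a timeout says only that the whole run hung, not which test was responsible.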
Jun 20 2014
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 20 Jun 2014 14:30:50 -0400, Sean Kelly <sean invisibleduck.org>  
wrote:

 On Friday, 20 June 2014 at 18:24:21 UTC, Steven Schveighoffer
 wrote:
 No, I mean that druntime would run all unit tests with an expectation  
 that each unit test should time out after N seconds.
 I'd be more inclined to have the test runner kill the process if it takes more than N seconds to complete and call that a test failure. Figuring out something reasonable to do in the unit tester within Druntime would be difficult.

Timing individual tests is more likely to be accurate than timing the whole set of unit tests. A slow machine could easily double or triple the time the whole thing takes, and it would be difficult to pinpoint a reasonable time that all machines would accept. But a single unit test block should be really quick; I think 1 second is long enough to say it has failed in the vast majority of cases.

Of course, if they all take nearly 1 second, the total time will be huge. But we are not testing speed, we are testing for infinite loops and hangs.

The trick is passing information about specific unit tests to the runtime; we don't have any mechanism to do that. UDAs would be perfect.

I don't think it would be that difficult. You just need a separate thread (it can be written in D but must only use the C runtime) that exits the process if it doesn't get pinged properly. Then the test runner has to ping the thread between each test.

-Steve
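
The watchdog idea in this post (a separate thread that exits the process unless the runner pings it between tests) can be sketched roughly as follows. This is a Python illustration of the mechanism only, not druntime code. The names `Watchdog` and `run_all` are hypothetical, and the `on_timeout` hook is an assumption added so the behaviour can be exercised without actually killing the process.

```python
import os
import threading


class Watchdog:
    """Call on_timeout (default: exit the process) if no ping arrives within `limit` seconds."""

    def __init__(self, limit, on_timeout=lambda: os._exit(1)):
        self._limit = limit
        self._on_timeout = on_timeout
        self._pinged = threading.Event()
        self._stopped = False
        self._thread = threading.Thread(target=self._watch, daemon=True)
        self._thread.start()

    def _watch(self):
        # wait() returns False only when no ping arrived within the limit.
        while self._pinged.wait(self._limit):
            self._pinged.clear()
            if self._stopped:
                return
        self._on_timeout()

    def ping(self):
        """Restart the countdown; the runner calls this after each test."""
        self._pinged.set()

    def stop(self):
        """Shut the watchdog down once all tests have finished."""
        self._stopped = True
        self._pinged.set()


def run_all(tests, limit=1.0, on_timeout=lambda: os._exit(1)):
    dog = Watchdog(limit, on_timeout)
    for test in tests:
        test()
        dog.ping()  # each completed test resets the countdown
    dog.stop()
```

Because the countdown restarts after every test, the limit applies to each test individually rather than to the whole run, which is the property argued for in this post.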
Jun 20 2014
"David Nadlinger" <code klickverbot.at> writes:
On Friday, 20 June 2014 at 19:44:18 UTC, Steven Schveighoffer 
wrote:
 Timing individual tests is more likely to be accurate than 
 timing the whole set of unit tests. A slow machine could easily 
 double or triple the time the whole thing takes, and it would 
 be difficult to pinpoint a reasonable time that all machines 
 would accept.
That's true if you expect the timeout to be hit as part of regular testing. If it's only to keep the auto tester from hanging, just setting a one-minute global timeout per test case (or something like that) should be fine. Sure, the auto-tester throughput would suffer somewhat as long as the build is broken, but…

David
Jun 20 2014
"Sean Kelly" <sean invisibleduck.org> writes:
On Friday, 20 June 2014 at 07:13:24 UTC, Iain Buclaw wrote:
 Hi,

 I've been seeing a problem on the Debian X32 build system where 
 unittest process just hangs, and require manual intervention by 
 the poor maintainer to kill the process manually before the 
 build fails due to inactivity.

 Haven't yet managed to reduce the problem (it only happens on a 
 native X32 system, but not when running X32 under native 
 x86_64), but thought it would be a good idea to suggest that 
 any thread related tests should be safely handled by self 
 terminating after a period of waiting.

 Thoughts from the phobos maintainers?
I'm surprised that there are thread-related tests that deadlock. All the ones I wrote time out for exactly this reason. Of course, getting the timings right can be a pain, so there's no perfect solution.
Jun 20 2014
Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 20 June 2014 19:08, Sean Kelly via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On Friday, 20 June 2014 at 07:13:24 UTC, Iain Buclaw wrote:
 Hi,

 I've been seeing a problem on the Debian X32 build system where unittest
 process just hangs, and require manual intervention by the poor maintainer
 to kill the process manually before the build fails due to inactivity.

 Haven't yet managed to reduce the problem (it only happens on a native X32
 system, but not when running X32 under native x86_64), but thought it would
 be a good idea to suggest that any thread related tests should be safely
 handled by self terminating after a period of waiting.

 Thoughts from the phobos maintainers?
 I'm surprised that there are thread-related tests that deadlock. All the ones I wrote time out for exactly this reason. Of course, getting the timings right can be a pain, so there's no perfect solution.

In my experience, deadlocks in the unittest program have been caused by problems with either the core.thread or the std.parallelism tests. I have yet to narrow it down, though, so it's just a stab in the dark as to what the problem may be.
Jun 20 2014
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/20/14, 12:13 AM, Iain Buclaw wrote:
 Hi,

 I've been seeing a problem on the Debian X32 build system where unittest
 process just hangs, and require manual intervention by the poor
 maintainer to kill the process manually before the build fails due to
 inactivity.

 Haven't yet managed to reduce the problem (it only happens on a native
 X32 system, but not when running X32 under native x86_64), but thought
 it would be a good idea to suggest that any thread related tests should
 be safely handled by self terminating after a period of waiting.

 Thoughts from the phobos maintainers?

 Regards
 Iain

I'd be okay with timeboxing unittests, assuming of course that such termination counts as a fail.

-- Andrei
Jun 21 2014